Sumesh Chandran

Remove duplicates from List

I am trying to remove duplicates from a list. The list has duplicates in the city column and the province column.
This is what I tried, based on what was suggested in this post: https://developer.salesforce.com/forums/?id=906F00000008y3SIAQ

It doesn't work for me:
global void execute(Database.BatchableContext BC, List<MDU_Squad_Data_min__c> cities) {  
        Set<MDU_Squad_Data_min__c> citySet = new Set<MDU_Squad_Data_min__c>();
        citySet.addAll(cities);	
        List<MDU_Squad_Data_min__c> noDuplicateCityList = new List<MDU_Squad_Data_min__c>(citySet);
}
Please advise!
 
All Answers

Sumesh Chandran
I also tried this:
for (MDU_Squad_Data_min__c c : cities)
{
    sumchans__City_Master__c city = new sumchans__City_Master__c();
    city.Name = c.CITY_NAME__c;
    city.sumchans__PROVINCE__c = c.PROVINCE_CODE__c;
    if (!cityList.contains(city)) { cityList.add(city); }
}
The duplicates have been reduced, but some still remain. Here is a snapshot of the results:
User-added image
Maharajan C
Hi Sumesh,

Try the below code and let me know.
 
List<MDU_Squad_Data_min__c> noDuplicateCityList = new List<MDU_Squad_Data_min__c>();
Map<String, MDU_Squad_Data_min__c> cityProvMap = new Map<String, MDU_Squad_Data_min__c>();
for (MDU_Squad_Data_min__c c : cities)
{
    // Later records with the same city name overwrite earlier ones.
    cityProvMap.put(c.CITY_NAME__c, c);
}

if (!cityProvMap.isEmpty())
    noDuplicateCityList = cityProvMap.values();

Thanks,
Maharajan.C 
Sumesh Chandran
Thanks @Maharajan, that didn't help. I still get the same list as attached above.
Maharajan C
Try this one:
 
List<sumchans__City_Master__c> noDuplicateCityList = new List<sumchans__City_Master__c>();
Map<String, sumchans__City_Master__c> cityProvMap = new Map<String, sumchans__City_Master__c>();
for (MDU_Squad_Data_min__c c : cities)
{
    sumchans__City_Master__c city = new sumchans__City_Master__c();
    city.Name = c.CITY_NAME__c;
    city.sumchans__PROVINCE__c = c.PROVINCE_CODE__c;
    String str = city.Name;
    // Strip whitespace from the key so names that differ only in spacing collapse together.
    cityProvMap.put(str.deleteWhitespace(), city);
}
noDuplicateCityList = cityProvMap.values();

Thanks,
Maharajan.C
Sumesh Chandran
Didn't help, it is still the same result set.
Christan G 4
I think I know why your list still contained duplicates even after you put the records into a set. Some other field on those records differs between them, which causes the set to treat each one as unique and keep all of them. This may be confusing, so I have provided an example below:

Let's say we create 3 Account records that all have the same name and annual revenue but different industries:

Account acc1 = new Account(Name= 'Test Account', AnnualRevenue = 500, Industry = 'Apparel');
Account acc2 = new Account(Name= 'Test Account', AnnualRevenue = 500, Industry = 'Banking');
Account acc3 = new Account(Name= 'Test Account', AnnualRevenue = 500, Industry = 'Agriculture');

Now let's add these to a List called accList:

List <Account> accList = new List <Account>();

accList.add(acc1); accList.add(acc2); accList.add(acc3);

Now let's put this list into a set called accSet by passing it as an argument:

Set <Account> accSet = new Set <Account>(accList);

Now if I were to run System.debug(accSet); you would notice that the output contains the same records as the list. As mentioned before, even though each record has the same Name and AnnualRevenue, another field doesn't match, which causes the set to consider each of them unique.
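
To make the behaviour above concrete, here is a minimal sketch for Anonymous Apex. It assumes only the standard Account object; the fourth record, acc4, is a hypothetical addition that copies acc1 field for field. A Set of sObjects compares records by their field values, so acc4 should collapse into acc1 while the other three stay distinct. Records returned by a SOQL query also carry their Id, which differs on every row, so a Set built directly from queried records will usually remove nothing.

```apex
// Three records that differ only in Industry, plus an exact copy of the first.
Account acc1 = new Account(Name = 'Test Account', AnnualRevenue = 500, Industry = 'Apparel');
Account acc2 = new Account(Name = 'Test Account', AnnualRevenue = 500, Industry = 'Banking');
Account acc3 = new Account(Name = 'Test Account', AnnualRevenue = 500, Industry = 'Agriculture');
Account acc4 = new Account(Name = 'Test Account', AnnualRevenue = 500, Industry = 'Apparel'); // hypothetical copy of acc1

Set<Account> accSet = new Set<Account>{ acc1, acc2, acc3, acc4 };

// acc4 matches acc1 on every populated field, so the set keeps a single copy of it.
System.assertEquals(3, accSet.size());
```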

Possible Solutions:

Possible Solution #1: Reduce the number of fields in your query to only those that are relevant. For example, assuming you are using Database.QueryLocator in your start method, select only the Name and sumchans_Province__c fields in your query. This should resolve the issue.

Possible Solution #2: If the other fields in your query are required, then copy and paste the code I have written below into the curly brackets of the execute method.

Overall, this was quite fun to troubleshoot and I hope my solution helps! If you have any questions, feel free to leave a comment or message me.

Code Starts:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

List <MDU_Squad_Data_min__c> noDuplicateCityList = new List <MDU_Squad_Data_min__c>();

Map <String, MDU_Squad_Data_min__c> mduMap = new Map <String, MDU_Squad_Data_min__c>();

for (Integer i = 0; i < cities.size(); i++) {
    // I assume 'Name' is what stores the city name. If not, please change 'Name' to the correct API name.
    mduMap.put(cities[i].Name, cities[i]);
}

Set <String> mduSet = new Set <String>(mduMap.keyset());

List <String> mduSetList = new List <String>(mduSet);

for (Integer i = 0; i < mduSetList.size(); i++) {
    noDuplicateCityList.add(mduMap.get(mduSetList[i]));
}

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Code Ends!
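
Since the original question mentions duplicates in both the city and the province columns, a map keyed on the two values together may be closer to what is needed than a key on the name alone. The following is only a sketch for Anonymous Apex: the standard Account fields BillingCity and BillingState are stand-ins for the city and province fields, and the sample data is made up for illustration. Like the code above, it removes duplicates only within the single list it is given.

```apex
// Stand-in data: BillingCity and BillingState play the role of the city and province fields.
List<Account> cities = new List<Account>{
    new Account(Name = 'a', BillingCity = 'Calgary',  BillingState = 'AB'),
    new Account(Name = 'b', BillingCity = 'calgary ', BillingState = 'AB'),
    new Account(Name = 'c', BillingCity = 'Victoria', BillingState = 'BC')
};

Map<String, Account> byCityAndProvince = new Map<String, Account>();
for (Account c : cities) {
    // Normalise case and surrounding whitespace so near-identical values share one key.
    String cityPart = c.BillingCity == null ? '' : c.BillingCity.trim().toLowerCase();
    String provincePart = c.BillingState == null ? '' : c.BillingState.trim().toLowerCase();
    String key = cityPart + '|' + provincePart;
    if (!byCityAndProvince.containsKey(key)) {
        byCityAndProvince.put(key, c); // keep the first record seen for each key
    }
}

List<Account> noDuplicateCityList = byCityAndProvince.values();
System.assertEquals(2, noDuplicateCityList.size());
```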
Sumesh Chandran
Hello Christan, that was super detailed. I really appreciate the time you took.

I tried your code, but it still doesn't get rid of the duplicates.

As for Solution #1, this is what I have had in my start method since the beginning. I was not pulling in any other fields.
return Database.getQueryLocator('SELECT CITY_NAME__c,PROVINCE_CODE__c FROM MDU_Squad_Data_min__c');

I am also attaching the whole code here, in case you can spot something I am doing wrong.
 
global class cityMduMaster implements Database.Batchable<sObject> {   
    // Start Method
    global Database.QueryLocator start(Database.BatchableContext BC) {
        return Database.getQueryLocator('SELECT CITY_NAME__c,PROVINCE_CODE__c FROM MDU_Squad_Data_min__c');
    }    
    // Execute method    
    global void execute(Database.BatchableContext BC, List<MDU_Squad_Data_min__c> cities) {  
        List<sumchans__City_Master__c> cityList = new List<sumchans__City_Master__c>();
        
        for(MDU_Squad_Data_min__c c: cities)
        {
            sumchans__City_Master__c city = new sumchans__City_Master__c();
            city.Name = c.CITY_NAME__c;
            city.sumchans__PROVINCE__c = c.PROVINCE_CODE__c; 
            cityList.add(city);                             
        }
        ///////////////////////////   -- This is the code you provided.     
        List <sumchans__City_Master__c> noDuplicateCityList = new List <sumchans__City_Master__c>();
        Map <String, sumchans__City_Master__c> mduMap = new Map <String, sumchans__City_Master__c>();
        for (integer i = 0; i < cityList.size(); i++) {   
            mduMap.put(cityList[i].Name, cityList[i]);
        }
        Set <String> mduSet = new Set <String>(mduMap.keyset());
        List <String> mduSetList = new List <String>(mduSet);
        for (integer i = 0; i < mduSetList.size(); i++) {   
            noDuplicateCityList.add(mduMap.get(mduSetList[i]));   
        }        
        ///////////////////////////    
        Database.SaveResult[] MySaveResult = Database.Insert(noDuplicateCityList, false);
        for (Database.SaveResult sr : MySaveResult) {
            if (!sr.isSuccess()) {
                // Operation failed, so get all errors                
                for(Database.Error err : sr.getErrors()) {
                    System.debug('The following error has occurred.');                    
                    System.debug(err.getStatusCode() + ': ' + err.getMessage());
                    System.debug('Fields that affected this error: ' + err.getFields());
                }
            }
        }
    }      
    // Finish Method    
    global void finish(Database.BatchableContext BC) {
        
    }    
}

Thanks again!
Christan G 4
Anytime! I am surprised that the code I provided didn't work, since it filters the data several times before anything is added to the noDuplicate list. Unfortunately, I couldn't find anything wrong with the code after reviewing it. I think adding debug statements would be a good way to track what is happening and pinpoint the issue. If possible, can you put System.debug(mduSetList[i]); between code lines 25 and 26 and take a screenshot of what is shown in the debug log? I am curious to see whether duplicates of the city name still exist within the set.

Also, I am curious what the result would be if we filtered the cities list first, before creating the sumchans__City_Master__c objects from it. I don't think it will make much of a difference, but it is worth a try.
Sumesh Chandran
Hello Christan, that is another problem: I am hardly getting any System.debug messages, which makes this very hard to debug. I added the System.debug statement as you said, and it only printed two cities.

User-added image
Sumesh Chandran
Is there anybody here who could help me out with this? I have been stuck on this for the past 2 days and am not able to move forward.
Sumesh Chandran
I tried with a limited amount of data, just 2 cities, and I still got duplicates. When I cut it down to an even shorter list, I was able to get rid of the duplicates. Could it be something to do with the data?
Christan G 4
Hmm, very interesting. I am starting to wonder if this could be a bug in Salesforce. Okay, let's add more debug statements to the code to figure out what is going on. Between code lines 14 and 15, add System.debug(city); and take a screenshot of the output. Then, between code lines 22 and 23, add System.debug(mduMap.keyset()); and take a screenshot of that. I am curious how different the outputs will be. After that, I would like you to apply the code I have written to the List<MDU_Squad_Data_min__c> cities first and then use the new noDuplicateCity list to create the sumchans__City_Master__c records, just to see if that changes anything.
Sumesh Chandran
This is what I have found out so far. I created 4 custom objects, each holding a single city with its duplicates. City A had 121 records, which came out as 1 when I ran the above code, and City B had 170 records, which also came out as 1. City C had 222 records but came out as 2, and City D had 406 records but came out as 3. I am wondering if this has anything to do with the batch processing. I have heard that a batch class processes the data in chunks of 200, so perhaps the variables are re-initialized after the first batch of 200 records.
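
The counts reported above are consistent with that suspicion. As a quick check, assuming the default batch size of 200 records per execute call, dividing each record count by 200 and rounding up gives 1, 1, 2 and 3, which matches the number of rows that survived for each city. The snippet below is a sketch for Anonymous Apex; the map and variable names are made up for illustration.

```apex
// Record counts per city, taken from the post above.
Map<String, Integer> recordsPerCity = new Map<String, Integer>{
    'City A' => 121,
    'City B' => 170,
    'City C' => 222,
    'City D' => 406
};
Integer batchSize = 200; // assumed default scope for Database.executeBatch

for (String cityName : recordsPerCity.keySet()) {
    // Integer division rounds down, so adding batchSize - 1 first yields the ceiling.
    Integer chunks = (recordsPerCity.get(cityName) + batchSize - 1) / batchSize;
    System.debug(cityName + ': ' + chunks + ' execute call(s)');
}
```

Each execute call starts with fresh local variables, so a map built inside execute can only remove duplicates within its own chunk.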
Sumesh Chandran
This is all that was needed. I am getting the expected results now. I just had to set the second argument to a value higher than the total number of records.
cityMduMaster c = new cityMduMaster();
database.executeBatch(c,5000000);
This was selected as the best answer
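
A note on the accepted fix: as far as the documentation describes it, the scope argument of Database.executeBatch is capped at 2,000 records when the start method returns a QueryLocator, and larger values are split into chunks of up to 2,000. The approach above would therefore likely start producing duplicates again once the source object holds more than 2,000 records. One possible alternative, sketched below, is to mark the class as Database.Stateful so that a set of already-seen keys survives from one execute call to the next. The class name cityMduMasterStateful is hypothetical, the object and field names are taken from this thread, and the sketch has not been run against the poster's org.

```apex
// Hypothetical variant of cityMduMaster; object and field names are taken from the thread.
global class cityMduMasterStateful implements Database.Batchable<sObject>, Database.Stateful {
    // Database.Stateful preserves this instance variable across execute calls.
    private Set<String> seenKeys = new Set<String>();

    global Database.QueryLocator start(Database.BatchableContext BC) {
        return Database.getQueryLocator(
            'SELECT CITY_NAME__c, PROVINCE_CODE__c FROM MDU_Squad_Data_min__c');
    }

    global void execute(Database.BatchableContext BC, List<MDU_Squad_Data_min__c> cities) {
        List<sumchans__City_Master__c> toInsert = new List<sumchans__City_Master__c>();
        for (MDU_Squad_Data_min__c c : cities) {
            String key = c.CITY_NAME__c + '|' + c.PROVINCE_CODE__c;
            // Set.add returns false when the key is already present,
            // so only the first record for each city/province pair is kept.
            if (seenKeys.add(key)) {
                toInsert.add(new sumchans__City_Master__c(
                    Name = c.CITY_NAME__c,
                    sumchans__PROVINCE__c = c.PROVINCE_CODE__c));
            }
        }
        // allOrNone = false, as in the original class: failed rows do not abort the rest.
        Database.insert(toInsert, false);
    }

    global void finish(Database.BatchableContext BC) {}
}
```

It could be started with Database.executeBatch(new cityMduMasterStateful()); using the default batch size. The set of keys is carried between chunks, so for a very large number of distinct cities its memory footprint would need watching.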