function readOnly(count){ }
Starting November 20, the site will be set to read-only. On December 4, 2023,
forum discussions will move to the Trailblazer Community.
+ Start a Discussion
anchoviesanchovies 

Trigger for catching dupes and Lists indexes in SOQL

Hi all,

I'm currently working on a trigger for preventing duplicate Leads from being inserted during the mass data upload. The trigger should treat two leads equal if their Company and Email fields are equal. Since the trigger would fire during the mass data upload (custom), I had to minimize the number of executed SOQL queries in order to cope with SF limits.

Here's how the code looks like atm:

trigger MyTrigger on Lead (before insert) {

Map<String, Lead> leadMap = new Map<String, Lead>();

List<String> fieldNames = new List {'Email', 'Company'}; // it will actually be obtained from somewhere else, but I'm hardcoding it atm

String leadQueryString = '';
String leadDupeError = 'A Lead with the same {0} already exists: {1}';
String fieldSet = '';
Integer i = 0;

List<Set<String>> leadValues = new List<Set<String>>(); // referencing a set by its index in the list doesn't work in dynamic query

for (String fieldName : fieldNames)
{
	Set<String> thisFieldValues = new Set<String>();
	
	for (Lead lead : System.Trigger.new)
	{
		if(String.isNotBlank((String)lead.get(fieldName))) thisFieldValues.add((String)lead.get(fieldName));
	}
	// keeping a corresponding set of values for this field
	leadValues.add(thisFieldValues);
	
	// constructing my query filters
	leadQueryString += (leadQueryString == '' ? '' : ' AND ') + fieldName + ' IN :leadValues[' + i + ']'; // this index doesn't work	
	fieldSet += (fieldSet == '' ? '' : ', ') + fieldName;
	
	i++;
}

List<Lead> possibleDupes = Database.query('SELECT Id, ' + fieldSet + ' FROM Lead WHERE ' + leadQueryString);

List<Lead> dupes = new List<Lead>();

for (Lead lead : System.Trigger.new)
{
	for (Lead pd : possibleDupes)
	{
		boolean match = true;
		
		for(String fieldName : fieldNames)
		{
			if (((String)lead.get(fieldName)) != (String)pd.get(fieldName)) {
				match = false;
				break;
			}
		}
		if (match) {
			lead.addError(String.format (leadDupeError, new List<String> { fieldSet, pd.Id})); // this will prevent the insert
			dupes.add(lead);
			break;
		}
	}
}

}

 Actually, referencing leadValues[i] in query doesn't work :( It gives me the following error:  System.QueryException: unexpected token: '['
To get it work, I had to remove the [] clauses from the query and add two additional lists: 

leadValues0 = leadValues[0]
leadValues1 = leadValues[1]

This way it does work, but it's not dynamic (I want the number of filter fields be more flexible).

So, the question is: is there a way to bypass this error without having to keep the exact number of field values sets?

I would also appreciate any other suggestions on improving my trigger :)

Best Answer chosen by Admin (Salesforce Developers) 
sfdcfoxsfdcfox

Here's my version:

 

trigger preventDupes on Lead (before insert) {
    Map<String, Map<String, Lead>> leads = new Map<String, Map<String, Lead>>();
    for(Lead record: Trigger.new)
        leads.put(record.company, new map<string, lead>());
    for(lead record: trigger.new)
        if(leads.get(record.company).get(record.email) != null)
            record.addError('Duplicate lead in transaction.');
        else 
            leads.get(record.company).put(record.email, record);
    String[] filters = new String[0];
    for(string company: leads.keyset()) {
        String[] entries = new String[0];
        for(String email: leads.get(company).keySet()) {
            entries.add(email == null? 'null': '\''+string.escapesinglequotes(email)+'\'');
        }
        String tempCompany = company == null? 'null': '\''+string.escapesinglequotes(company)+'\'';
        filters.add(String.format('(Company = {0} AND Email IN ({1}))', new String[] { tempCompany, String.join(entries,',') }));
    }
    // Query looks like "select id, email, company from lead where (Company = null AND Email IN (null,'john.doe@contoso.com')) OR (Company = 'Anywhere' AND Email IN ('Wonderland@happyplace.com'))"
    for(Lead record:Database.query(String.format('select id, email, company from lead where {0}', new string[] { string.join(filters, ' OR ') }))) {
        leads.get(record.company).get(record.email).addError('Duplicate lead in database.');
    }
}

Making this dynamic (e.g. using any two or more arbitrary fields) greatly increases the complexity of the query in question, but it would be possible. Note that this system tries to use the higher cardinality (e.g. company name is more likely duplicated than email is) to produce the most efficient query possible; any lead that appears in the query is automatically a duplicate of an incoming lead because of the selectivity of the query.

 

Edit: Missing parens in query.

 

Edit: To make this dynamic, you could replace the static fields with variable fields. Keep in mind that your implementation will need to determine the appropriate type of map to create, which can be tricky at best. You'll probably need to make use of a Map<Object, Object>, where the value could itself be another Map<Object, Object> or a final value.

All Answers

sfdcfoxsfdcfox

Here's my version:

 

trigger preventDupes on Lead (before insert) {
    Map<String, Map<String, Lead>> leads = new Map<String, Map<String, Lead>>();
    for(Lead record: Trigger.new)
        leads.put(record.company, new map<string, lead>());
    for(lead record: trigger.new)
        if(leads.get(record.company).get(record.email) != null)
            record.addError('Duplicate lead in transaction.');
        else 
            leads.get(record.company).put(record.email, record);
    String[] filters = new String[0];
    for(string company: leads.keyset()) {
        String[] entries = new String[0];
        for(String email: leads.get(company).keySet()) {
            entries.add(email == null? 'null': '\''+string.escapesinglequotes(email)+'\'');
        }
        String tempCompany = company == null? 'null': '\''+string.escapesinglequotes(company)+'\'';
        filters.add(String.format('(Company = {0} AND Email IN ({1}))', new String[] { tempCompany, String.join(entries,',') }));
    }
    // Query looks like "select id, email, company from lead where (Company = null AND Email IN (null,'john.doe@contoso.com')) OR (Company = 'Anywhere' AND Email IN ('Wonderland@happyplace.com'))"
    for(Lead record:Database.query(String.format('select id, email, company from lead where {0}', new string[] { string.join(filters, ' OR ') }))) {
        leads.get(record.company).get(record.email).addError('Duplicate lead in database.');
    }
}

Making this dynamic (e.g. using any two or more arbitrary fields) greatly increases the complexity of the query in question, but it would be possible. Note that this system tries to use the higher cardinality (e.g. company name is more likely duplicated than email is) to produce the most efficient query possible; any lead that appears in the query is automatically a duplicate of an incoming lead because of the selectivity of the query.

 

Edit: Missing parens in query.

 

Edit: To make this dynamic, you could replace the static fields with variable fields. Keep in mind that your implementation will need to determine the appropriate type of map to create, which can be tricky at best. You'll probably need to make use of a Map<Object, Object>, where the value could itself be another Map<Object, Object> or a final value.

This was selected as the best answer
anchoviesanchovies

Thank you, sfdcfox, making query more efficient is a very good point! I'll try your solution and will let you know.

I've also tried a different approach (that isn't dynamic):
1. create a custom formula field, e.g. 'Company  &  Email' (concatenation of the fields you want to filter on)

2. search for duplicates by this field in trigger

sfdcfoxsfdcfox
That may work for small databases, except formulas cause a full table scan, which may run into non-selectivity errors when more than 100,000 rows of data exist. You would do well do use dynamic queries for that purpose.
georggeorg

Why can't you create an external id field and populate with CompanyName+email, this will automatically prevent the users creating the duplicate record.

 

Thanks,

George

Visit My blog here

sfdcfoxsfdcfox
External IDs wouldn't prevent it, but unique IDs would. But that might be less desirable than having control over it in some cases (like, if you just wanted to record that there was a duplicate, for example).
anchoviesanchovies

Hi georg, thanks for your reply. Unfortunately, formula fields don't have 'unique' and 'external ID' settings.

anchoviesanchovies

Hmmm, that's a good point as well, I guess I'll have to find out possible data volumes from our customer. Would 10-30 thousands of records be a problem?

 

anchoviesanchovies

Thank you SO MUCH for all your help! I love your solution, it's very elegant :)