N Ahmed

Optimal way to upsert bulk data based on dynamic deduplication rules

I have a set of records in a spreadsheet, each with its own fields. For example, a record might be a contact with an email address, company name, city, phone number, and employee count. I want to bulk upsert this data into Salesforce. The tricky part is deciding what counts as a new record versus an existing record. I do not know the object IDs for the data in the spreadsheet: some records might already exist in Salesforce and some might not. Here are two scenarios:

1) If a record in the spreadsheet has the same email address as a record in Salesforce, then they are considered the same and it would be an update. If no record is found with the same email address, then it should create a new record.
2) If a record in the spreadsheet has the same company name and city as a record in Salesforce, then they are considered the same and it would be an update. Unlike before, the email address is irrelevant. If no matching record is found, then it should create a new record.

The rules might vary, and there will be more use cases like these. Ideally, matching would be case-insensitive.
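
To make the two scenarios concrete, here is a minimal Python sketch of the matching logic, not a Salesforce API call. It assumes the existing records have already been fetched into plain dictionaries; the field names `Email`, `Company`, `City`, and `Id` and the helpers `match_key` and `split_upsert` are hypothetical names chosen for illustration.

```python
from typing import Dict, Iterable, List, Optional, Sequence, Tuple

Record = Dict[str, str]


def match_key(record: Record, fields: Sequence[str]) -> Optional[Tuple[str, ...]]:
    """Build a case-insensitive key from the given fields.

    Returns None if any field is missing or blank, so that incomplete
    records never match each other by accident.
    """
    parts = []
    for field in fields:
        value = (record.get(field) or "").strip().casefold()
        if not value:
            return None
        parts.append(value)
    return tuple(parts)


def split_upsert(
    incoming: Iterable[Record],
    existing: Iterable[Record],
    fields: Sequence[str],
) -> Tuple[List[Record], List[Record]]:
    """Split incoming records into (updates, creates) under one matching rule."""
    # Index existing records by their key; the first record seen wins.
    index: Dict[Tuple[str, ...], str] = {}
    for rec in existing:
        key = match_key(rec, fields)
        if key is not None:
            index.setdefault(key, rec["Id"])

    updates: List[Record] = []
    creates: List[Record] = []
    for rec in incoming:
        key = match_key(rec, fields)
        if key is not None and key in index:
            # Attach the matched Id so the record can be sent as an update.
            updates.append({**rec, "Id": index[key]})
        else:
            creates.append(rec)
    return updates, creates


existing = [{"Id": "001", "Email": "Ann@Example.com", "Company": "Acme", "City": "Paris"}]
incoming = [
    {"Email": "ann@example.com", "Company": "Other", "City": "Rome"},
    {"Email": "bob@example.com", "Company": "ACME", "City": "paris"},
]

# Scenario 1: match on email only.
by_email = split_upsert(incoming, existing, ["Email"])
# Scenario 2: match on company name and city; email is ignored.
by_company_city = split_upsert(incoming, existing, ["Company", "City"])
```

With this sample data, the first incoming record is an update under scenario 1 and a create under scenario 2, while the second is the reverse. The rule is just a list of field names, so a new use case only changes the `fields` argument.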

As far as I can tell, an upsert request is based on an object ID. Since there might be thousands of requests per day, using individual API requests might be a problem, and a batch job is ideal.

The Question: Is there some way to send a batch upsert job with thousands of records and specify the rules for what's considered an update (i.e., these fields match) vs. what's considered a new record, so that the batch job figures it out and does what's needed? Additionally, is this possible to do via a REST API?

Rahma__c
Hi N Ahmed,
If you are inserting these contacts into your own org, you can use a before insert trigger to decide whether or not to insert a contact when it is a duplicate.

Regards
Rahma__c