dlucia21

What is the best practice for refreshing 50,000+ records?

We have a data file that contains over 50,000 records that we get from a partner. The file may grow as records are added and shrink as records are deleted. They are advising us to delete and re-insert all 50,000+ records every night.

The problem with that is the recycle bin gets overloaded with all the deleted records.

Is there a way to delete records permanently without them reaching the recycle bin?

Would it be better for them to assign a unique id to each record so that we could perform an update and insert instead?

What should I do?
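(On the permanent-delete question: the SOAP API does expose an emptyRecycleBin call that purges already-deleted records from the recycle bin. A minimal sketch with the Java Partner API client, purely illustrative; the credentials, ids, and batching here are assumptions, not anything stated in this thread:)

import com.sforce.soap.partner.Connector;
import com.sforce.soap.partner.DeleteResult;
import com.sforce.soap.partner.EmptyRecycleBinResult;
import com.sforce.soap.partner.PartnerConnection;
import com.sforce.ws.ConnectionException;
import com.sforce.ws.ConnectorConfig;

public class NightlyPurge {
    public static void main(String[] args) throws ConnectionException {
        // Log in as an API-enabled integration user (placeholder credentials).
        ConnectorConfig config = new ConnectorConfig();
        config.setUsername("integration@example.com");
        config.setPassword("passwordPlusSecurityToken");
        PartnerConnection conn = Connector.newConnection(config);

        // Ids of the records being removed tonight, queried beforehand.
        // The SOAP API takes at most 200 ids per call, so batch accordingly.
        String[] ids = { "001x000000000AAAAA" /* ... */ };

        // Step 1: delete -- at this point the records sit in the recycle bin.
        DeleteResult[] deleted = conn.delete(ids);
        for (DeleteResult d : deleted) {
            if (!d.isSuccess()) {
                System.err.println("Delete failed for " + d.getId());
            }
        }

        // Step 2: purge the same ids so the nightly job doesn't flood the bin.
        EmptyRecycleBinResult[] purged = conn.emptyRecycleBin(ids);
        for (EmptyRecycleBinResult r : purged) {
            if (!r.isSuccess()) {
                System.err.println("Could not purge " + r.getId());
            }
        }
    }
}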
DevAngel
Yeah, I think using upsert would be a better approach.  This does require another id (an external id field), but it sounds like only a smaller portion of the records, on the order of 10 to 30 percent, actually changes each night.  Upserting the whole file will update the existing records and create any new ones.
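(A rough sketch of what that nightly upsert could look like with the Java Partner API client; Partner_Detail__c, External_Key__c, and Amount__c are made-up object and field names, and the connection setup is assumed boilerplate rather than anything from this thread:)

import com.sforce.soap.partner.Connector;
import com.sforce.soap.partner.PartnerConnection;
import com.sforce.soap.partner.UpsertResult;
import com.sforce.soap.partner.sobject.SObject;
import com.sforce.ws.ConnectionException;
import com.sforce.ws.ConnectorConfig;

public class NightlyUpsert {
    public static void main(String[] args) throws ConnectionException {
        ConnectorConfig config = new ConnectorConfig();
        config.setUsername("integration@example.com");       // placeholder credentials
        config.setPassword("passwordPlusSecurityToken");
        PartnerConnection conn = Connector.newConnection(config);

        // Build one SObject per row in the partner file.
        SObject row = new SObject();
        row.setType("Partner_Detail__c");
        row.setField("External_Key__c", "ACCT-001-LINE-07"); // the unique key from the file
        row.setField("Amount__c", 125.50);

        // Upsert matches on the external id field: existing rows are updated and
        // unmatched keys become new records. Send the rows in batches, since the
        // SOAP API caps each call at 200 records.
        UpsertResult[] results = conn.upsert("External_Key__c", new SObject[] { row });
        for (UpsertResult r : results) {
            if (!r.isSuccess()) {
                System.err.println("Upsert failed: " + r.getErrors()[0].getMessage());
            }
        }
    }
}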

Your partner should be reasonable and send you only the changes, additions, and deletions.


Cheers
dlucia21
That's what I thought.  The file they provide us contains the changes, additions, and deletes, but only as part of the full data set, which is why they recommend deleting and inserting the records nightly.
 
We would rather do an upsert, but the detail records that they provide us are not unique.  They give us a file that contains a header row, and the rows that follow it are detail records for an account.  So, one account can have anywhere from 1 to n detail records.  Each header row contains a unique id, but each detail record does not.
 
So, we are kind of in a battle between what their other customers do (which is a nightly refresh) and what we want to do (which is an upsert).
 
What would SF recommend we do in this type of situation?
 
Thank you.
benjasik
Add a unique key and do the upsert.  It will also scale better as your data set grows.
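(Since the detail rows don't carry an id of their own, one way to get that unique key is to derive it from the header's unique id plus something that identifies the line within the header. A small illustrative sketch; the field names and file layout are assumptions:)

// A hypothetical detail row parsed from the partner file.
record DetailRow(String headerId, String productCode, double amount) {}

public class CompositeKey {
    // Derive the external id from the header id plus a field that is stable
    // for each line within that header (productCode is an assumption here).
    static String externalKey(DetailRow row) {
        return row.headerId() + ":" + row.productCode();
    }

    public static void main(String[] args) {
        DetailRow row = new DetailRow("HDR-000123", "SKU-9987", 125.50);
        // This value would be written to the external id field before the upsert.
        System.out.println(externalKey(row)); // prints HDR-000123:SKU-9987
    }
}

Whatever combination you pick needs to be stable from one nightly file to the next; otherwise the upsert will create duplicates instead of updating the existing records.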