Paul Dyson.ax812

Using the Bulk API to delete records?

All the examples of using the Bulk API show how to insert records (including the sample code). This works perfectly, but we're also interested in using the API to delete all records of a particular type: basically, we need to delete and recreate some reference data on a daily or weekly basis.

 

 

  • Do you need to provide a CSV file if you want to delete all records?
  • If you do, what format should the CSV be in (is just the Id needed), and are there any hints on how to get the Ids of 100,000 records without hitting limits?
Thanks in advance for any help!

 

All Answers

dkador

We currently make it difficult to do what you want to do because it's so resource intensive.  If you have to, you can query for the entire table, then build up a CSV of only one column (Id) with all the rows you want to delete.  I'd recommend using hard delete if you have to do this.
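The Bulk API's delete (or hardDelete) batch accepts a CSV whose only column is Id. A minimal sketch of building that payload, assuming the record Ids have already been queried (the Ids below are made up):

```python
import csv
import io

def build_delete_csv(record_ids):
    """Build the one-column CSV ("Id" header, one record Id per row)
    that a Bulk API delete or hardDelete batch expects."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["Id"])
    for record_id in record_ids:
        writer.writerow([record_id])
    return buf.getvalue()

# Example with three hypothetical 18-character record Ids.
payload = build_delete_csv([
    "a0B5e00000AbcDeEAF",
    "a0B5e00000AbcDfEAF",
    "a0B5e00000AbcDgEAF",
])
```

The resulting string is what you would upload as a batch to a job whose operation is `delete` or `hardDelete`.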

 

But perhaps a better question would be why you need to do this?  You can't do some sort of data merging to detect which rows need to be updated/deleted in SFDC and which need to be inserted from your external system of record?

Paul Dyson.ax812

We're trying to insert what is effectively read-only financial 'reference data' which comes from an external source. The external source doesn't provide change sets, so the simplest thing is for us to clear out all the data and reload. We're talking about ~500,000 records, so we can't determine the change set within Salesforce, nor can we easily query all the record Ids.

dkador

The external source doesn't have a last_updated or system_modstamp column you can use to detect changes?

 

Why is it hard to query 500,000 records?  It will take a while to issue the query/queryMores to get your data set, but that's not hard.  But, like I said before, it's a bit expensive.
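The query/queryMore loop is just cursor-style pagination. A sketch of the control flow, with the SOAP client stubbed out as a hypothetical `client` object (in practice the calls come from a Partner or Enterprise API client):

```python
def fetch_all_ids(client, soql="SELECT Id FROM Reference_Data__c"):
    """Drain a query cursor: one query() call, then queryMore()
    until the server reports the result set is done."""
    result = client.query(soql)
    ids = [rec["Id"] for rec in result["records"]]
    while not result["done"]:
        result = client.queryMore(result["queryLocator"])
        ids.extend(rec["Id"] for rec in result["records"])
    return ids

# A fake two-page client, just to show the loop terminates correctly.
class FakeClient:
    def query(self, soql):
        return {"records": [{"Id": "001A"}, {"Id": "001B"}],
                "done": False, "queryLocator": "cursor-1"}

    def queryMore(self, locator):
        return {"records": [{"Id": "001C"}],
                "done": True, "queryLocator": None}

all_ids = fetch_all_ids(FakeClient())
```

The object name `Reference_Data__c` is an assumption; substitute whatever object holds the reference data.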

 

If this is a custom object, your other option is just to delete the object entirely and then re-create it with the metadata API.  We'll drop your rows automatically then (well, sort of - the old object will be soft-deleted for a while).
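For the delete-and-recreate route, the metadata API drives deletion with a destructiveChanges.xml deployed alongside a package.xml. A sketch for a hypothetical custom object named Reference_Data__c (the object name is an assumption):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Package xmlns="http://soap.sforce.com/2006/04/metadata">
    <types>
        <members>Reference_Data__c</members>
        <name>CustomObject</name>
    </types>
</Package>
```

Note this only works for objects your org owns; it is not an option for objects installed as part of a managed package.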

This was selected as the best answer by Admin (Salesforce Developers)
Paul Dyson.ax812

No, the external system (not one we have any ownership of or influence over) doesn't record modification times or provide any information from which we could determine a change set, short of doing a full diff between the new data set and the previous one.

 

You're right ... it's not hard to query 500,000 records, just time-consuming and a bit of a roundabout process when we know exactly what we want to delete: everything. The object is part of a managed package, so we can't drop it; the query-all-rows / create-CSV / bulk-delete route looks like the only option available to us.

 

Thanks for your help.

dkador

In this case it might make sense to do that full diff, with an option to fall back to a complete teardown / reload of the data.  That might end up being faster in terms of wall-clock time.

 

Obviously it makes things more complex, though.  Good luck with whatever approach you take.
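A sketch of the full-diff idea: keyed on whatever unique external identifier the feed carries (the field names here are hypothetical), compare the new feed against the previous load to get the insert, update, and delete sets.

```python
def diff_datasets(old, new, key="External_Key__c"):
    """Partition records into inserts / updates / deletes by a unique key.

    old, new: lists of dicts representing the previous and current loads.
    Returns (to_insert, to_update, to_delete), where to_delete holds the
    stale records from the previous load.
    """
    old_by_key = {rec[key]: rec for rec in old}
    new_by_key = {rec[key]: rec for rec in new}

    to_insert = [rec for k, rec in new_by_key.items() if k not in old_by_key]
    to_update = [rec for k, rec in new_by_key.items()
                 if k in old_by_key and rec != old_by_key[k]]
    to_delete = [rec for k, rec in old_by_key.items() if k not in new_by_key]
    return to_insert, to_update, to_delete

old = [{"External_Key__c": "A", "Amount__c": 1},
       {"External_Key__c": "B", "Amount__c": 2}]
new = [{"External_Key__c": "B", "Amount__c": 3},
       {"External_Key__c": "C", "Amount__c": 4}]
ins, upd, dele = diff_datasets(old, new)
```

Each of the three sets then maps directly to a Bulk API insert, update, or delete job, which can be much cheaper than reloading 500,000 rows when only a fraction change.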

Saurav Singh 7
Facing the same issue. Did anyone manage to find a solution?