CRMfusion - GW

Things I've learned about the BULK API Query

Here is a list of the things I’ve discovered over the last week while using the query functionality of the BULK API.  It is not a complete list of issues, but it should help any developers who are getting started and running into problems:


-          It DOES NOT support relationship queries.  You can only query one table at a time, and a query such as SELECT Id, FirstName, Account.Name FROM Contact will cause the job to fail.

-          It DOES NOT support gzip compressed data.  If you send a contentType of ZIP_CSV your job will fail.  The lack of compression also makes the returned results larger and slower to download.

-          The return of a “Bulk Query Results Response” is not as indicated in the docs (if you are consuming it from .NET).  If you have generated an object from the XSD you will find that the return is a list of <results> inside a <results-list> and NOT a batchQueryResult as expected.

-          Several objects CANNOT be queried (even though the metadata indicates that they are queryable).  These include AccountPartner, ApexTestResult, AssignmentRule, CampaignShare, CaseStatus, ContactShare, ContractStatus, DatedConversionRate, EventAttendee, FiscalYearSettings, ForecastShare, LeadShare, LeadStatus, OpportunityPartner, OpportunityStage, PartnerRole, Period, ProcessInstance, ProcessInstanceStep, SolutionStatus, StaticResource, TaskPriority, TaskStatus, UserLicense, UserPreference (I didn’t try the Chatter fields but I’m sure the feeds won’t work).

-          It is SLOWER than using the SOAP API (generally).  In most of my tests the SOAP API beat the BULK API by a significant margin.  This is most likely due to the lack of compression and the newness of the API.  It does appear to speed up dramatically during off-peak hours.
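
One way around the missing relationship queries is to run two single-object queries and join the CSV results yourself. The following is only a rough sketch of that idea in Python, not something from the BULK API itself: the function name join_contacts_to_accounts, the sample rows, and the choice of AccountId as the join key are my own assumptions for illustration.

```python
import csv
import io


def join_contacts_to_accounts(contacts_csv, accounts_csv):
    """Attach the account name to each contact row by matching AccountId to Account Id."""
    # Build a lookup table: Account Id -> Account Name.
    account_names = {
        row["Id"]: row["Name"]
        for row in csv.DictReader(io.StringIO(accounts_csv))
    }
    joined = []
    for row in csv.DictReader(io.StringIO(contacts_csv)):
        # Contacts with no account (or an unmatched one) get an empty name.
        row["Account.Name"] = account_names.get(row["AccountId"], "")
        joined.append(row)
    return joined


# Made-up sample data standing in for the CSV results of two separate queries:
#   SELECT Id, FirstName, AccountId FROM Contact
#   SELECT Id, Name FROM Account
contacts_csv = '"Id","FirstName","AccountId"\n"003A","Ann","001X"\n"003B","Bob",""\n'
accounts_csv = '"Id","Name"\n"001X","Acme"\n'

rows = join_contacts_to_accounts(contacts_csv, accounts_csv)
for row in rows:
    print(row["Id"], row["FirstName"], row["Account.Name"])
```

Holding the account lookup in memory is fine for modest volumes. With millions of accounts you would probably want to load both result sets into a local database and join there instead.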

The big advantage of the BULK API vs the SOAP API is that you don’t run into the query timeouts that occur when you are accessing large volumes of data.  If you try to pull down a large number of fields on 3 million contacts during the day using the SOAP API you will most likely get a timeout from sfdc.  The BULK API will keep retrying (up to 10 times) to get the data, and your application is not sitting waiting for a response.
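
Because the query runs asynchronously, the client side ends up as a polling loop followed by a read of the result list. Here is a minimal sketch in Python under some assumptions of mine: get_state is a hypothetical callable you would implement to fetch the batch state over HTTP, the state names are the ones I believe the API reports, and the sample XML and result id are made up. The parser deliberately ignores element names, since (as noted above) the response did not match what the docs led me to expect.

```python
import time
import xml.etree.ElementTree as ET


def wait_for_batch(get_state, poll_seconds=10.0, max_polls=360, sleep=time.sleep):
    """Poll until the batch reaches a terminal state and return that state."""
    # Assumed terminal states; anything else is treated as "still running".
    terminal = {"Completed", "Failed", "NotProcessed"}
    for _ in range(max_polls):
        state = get_state()
        if state in terminal:
            return state
        sleep(poll_seconds)
    raise TimeoutError("batch did not finish within the polling budget")


def result_ids(result_list_xml):
    """Return the text of every child of the root element, whatever its tag name."""
    root = ET.fromstring(result_list_xml)
    return [child.text.strip() for child in root if child.text and child.text.strip()]


# Demo with canned states instead of real HTTP calls, and no actual sleeping.
states = iter(["Queued", "InProgress", "Completed"])
final_state = wait_for_batch(lambda: next(states), sleep=lambda seconds: None)

# Made-up response body shaped like the list described above.
sample_xml = (
    '<results-list xmlns="http://www.force.com/2009/06/asyncapi/dataload">'
    "<results>752000000000001</results>"
    "</results-list>"
)
ids = result_ids(sample_xml)
print(final_state, ids)
```

Each id returned this way would then be used in a further request to download the actual CSV data for that result.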