Jerry Swartz

Inexplicable loss of batches during Apex Batch Execution

I have a set of 4119 records that I want to process in batches.  I developed a batch class and executed it with a batch size of 15 and later with a batch size of 10.

During execution, I monitor progress on the Apex Jobs page.  With a batch size of 10, Total Batches initially displays as 411.  For the first several updates, Total Batches stays at 411 while Batches Processed increases.  However, at the 10th batch processed, Total Batches shrinks to 401.  After a few more, it shrinks to 391, and so on.  For example:

Some iterations as viewed in Apex Jobs:

Total Batches/Batches Processed
411/0
411/9
401/10
391/15
391/16
381/18
381/19
381/20
371/21
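
To make the pattern in these readings easier to see, here is a minimal Python sketch that lists each point where Total Batches decreased and by how much. The readings are copied from the list above; the function name `total_drops` is only illustrative and is not part of any Salesforce API.

```python
# Readings copied from the Apex Jobs page: (total batches, batches processed).
readings = [(411, 0), (411, 9), (401, 10), (391, 15), (391, 16),
            (381, 18), (381, 19), (381, 20), (371, 21)]

def total_drops(readings):
    """Return (processed_at_drop, drop_size) for each decrease in Total Batches."""
    drops = []
    # Compare each reading with the one that follows it.
    for (prev_total, _), (total, processed) in zip(readings, readings[1:]):
        if total < prev_total:
            drops.append((processed, prev_total - total))
    return drops

drops = total_drops(readings)
for processed, size in drops:
    print(f"Total Batches fell by {size} when Batches Processed was {processed}")
```

In these samples every decrease is exactly 10, the same number as the batch size used for this run. The sketch only describes the readings; it does not explain why the total changes.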

Eventually, the batch "finishes", but only a portion of the records have been processed.  It seems as though batches are lost somewhere along the way.  I don't see any failures in my debug logs, and the Apex job shows no failures.  A run that started with 411 Total Batches might end at 93 Total Batches/93 Batches Processed.  Based on my particular logic and processing of the original 4119 records, I can tell that not all of them have been touched.

Does anyone have an explanation for this, or suggestions for how to troubleshoot what's going on?  I reduced my batch size from 15 to 10 thinking something might be hitting a limit, but as noted, I never see evidence of this.
 
Rafael Suarez 13
Hey Jerry,
I've seen this, but it has always come hand in hand with a "CPU Timeout" or "Maximum SOQL query run time exceeded" error.  Are you sure no governor limits are being hit?  Can you post the governor limit summary entries from the debug log for the end of the batch?

Regards
Rafa
Jerry Swartz
I think you're asking for the final log upon execution of the finish method, is that right?  Here it is for one of my runs.  Embedded in that EMAIL_QUEUE entry is a summary of the job and though it is hard to read in that HTML form, there were 147 batches at the start and there were 81 batches completed with 0 errors, leaving 66 batches unaccounted for.

As far as the governor limits are concerned, I have no evidence I'm running into any limit.  I inspect every log generated during the course of execution and none signal a governor limit was hit -- if that was the reason, I'd feel better seeing some evidence.  If I purposely alter the batch logic to err and/or exceed limits, I can observe the result via the failed batch count or in logs.  Neither seems to be the case here though.

38.0 APEX_CODE,DEBUG;APEX_PROFILING,DEBUG;CALLOUT,DEBUG;DB,INFO;SYSTEM,DEBUG;VALIDATION,DEBUG;VISUALFORCE,INFO;WAVE,INFO;WORKFLOW,INFO
16:06:43.0 (84737)|USER_INFO|[EXTERNAL]|005K00000046Xt8|<obscured login>|Pacific Standard Time|GMT-08:00
16:06:43.0 (146454)|EXECUTION_STARTED
16:06:43.0 (158420)|CODE_UNIT_STARTED|[EXTERNAL]|01pK0000000IlXA|MatchBatchManager
16:06:43.0 (11203839)|DML_BEGIN|[118]|Op:Update|Type:Matcher_List__c|Rows:1
16:06:43.0 (44336871)|DML_END|[118]
16:06:43.0 (44806946)|SOQL_EXECUTE_BEGIN|[120]|Aggregations:0|SELECT Id, Status, ExtendedStatus, TotalJobItems, JobItemsProcessed, NumberOfErrors, CompletedDate FROM AsyncApexJob WHERE Id = :tmpVar1
16:06:43.0 (47115290)|SOQL_EXECUTE_END|[120]|Rows:1
16:06:43.0 (244819076)|EMAIL_QUEUE|[729]|subject: Matching completed for TheBigMarketingList2016, bccSender: false, saveAsActivity: true, useSignature: true, toAddresses: [jswartz@acadia-pharm.com], whatId: a3nK0000000JOwM, htmlBody: <table><tr><td>Name</td><td>Value</td></tr><tr><td>Id</td><td>a3nK0000000JOwM</td></tr><tr><td>Status</td><td>Matched</td></tr><tr><td>Rows Matched</td><td>4,119</td></tr><tr><td>Creator</td><td>Jerry Swartz</td></tr><tr><td>Job Id</td><td>707K000000m4o7lIAA</td></tr><tr><td>Job Status</td><td>Completed</td></tr><tr><td>Job Status Detail</td><td></td></tr><tr><td>Completed</td><td>2017-01-11 16:06:43</td></tr><tr><td>Batch Count</td><td>147</td></tr><tr><td>Batch Completions</td><td>81</td></tr><tr><td>Batch Errors</td><td>0</td></tr></table><p>Click here to view the matched list:  https://cs9.salesforce.com/a3nK0000000JOwM</p>, 
16:06:43.262 (262313354)|CUMULATIVE_LIMIT_USAGE
16:06:43.262 (262313354)|LIMIT_USAGE_FOR_NS|(default)|
  Number of SOQL queries: 0 out of 200
  Number of query rows: 0 out of 50000
  Number of SOSL queries: 0 out of 20
  Number of DML statements: 0 out of 150
  Number of DML rows: 0 out of 10000
  Maximum CPU time: 0 out of 60000
  Maximum heap size: 0 out of 12000000
  Number of callouts: 0 out of 0
  Number of Email Invocations: 0 out of 10
  Number of future calls: 0 out of 0
  Number of queueable jobs added to the queue: 0 out of 1
  Number of Mobile Apex push calls: 0 out of 10

16:06:43.262 (262313354)|CUMULATIVE_LIMIT_USAGE_END

16:06:43.0 (262361445)|CODE_UNIT_FINISHED|MatchBatchManager
16:06:43.0 (263935022)|EXECUTION_FINISHED
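
Because the job summary is hard to read inside the EMAIL_QUEUE entry, here is a minimal Python sketch that pulls the batch counts out of the HTML table and computes how many batches are unaccounted for. The `html_body` string is an excerpt of the htmlBody shown in the log above; the function names are illustrative, and the sketch assumes every row is a simple two-cell `<td>` pair as in that log.

```python
import re

# Excerpt of the htmlBody from the EMAIL_QUEUE log entry above.
html_body = (
    "<tr><td>Batch Count</td><td>147</td></tr>"
    "<tr><td>Batch Completions</td><td>81</td></tr>"
    "<tr><td>Batch Errors</td><td>0</td></tr>"
)

def parse_summary(html):
    """Return {label: value} for each two-cell table row."""
    return dict(re.findall(r"<td>([^<]*)</td><td>([^<]*)</td>", html))

def unaccounted_batches(summary):
    """Batches reported at the start minus batches reported as completed."""
    total = int(summary["Batch Count"])
    completed = int(summary["Batch Completions"])
    return total - completed

summary = parse_summary(html_body)
missing = unaccounted_batches(summary)
errors = int(summary["Batch Errors"])
print(f"{missing} batches unaccounted for, {errors} reported errors")
```

For this run the result matches the figures quoted in the post: 147 batches at the start, 81 completed, 0 errors, and 66 unaccounted for.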