MJ09

Unable to write to any of the ACS stores in the alloted time

I have a batch job that runs in several production orgs. It has run fine, twice a day, for several weeks now. However, in a few of the orgs in which it runs, I've started getting the error "Unable to write to any of the ACS stores in the alloted time." What does this error mean, why am I getting it, and what can I do about it?

 

Thanks--

pmorelli

What instance is your org on?

MJ09

I've gotten the error message in 3 of the dozen or so orgs in which this batch job runs. (Each org has a different number of records to process.) The batch job is scheduled to run daily (sometimes twice daily, at 12-hour intervals).

 

Org 1 is on na1. The problem happened on 9/11, but not before or since.

Org 2 is on na7. The problem happened on 9/14, but not before or since.

Org 3 is on na7. The problem happened on 9/14, but not before or since.

 

 

MJ09

I can't tell whether this is related or not, but in yet another org that runs the same batch job code, I've gotten the error "Unable to access query cursor data; too many cursors are in use" twice now. The batch's start() method calls Database.getQueryLocator(), and the execute() method has 2 queries, neither of which is in a loop. There's nothing I can see in the code that should have caused this problem, and there were no other batch jobs running at the times that the error occurred.
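
For readers following along, a minimal sketch of the job shape described above might look like this. The class name, the objects queried, and the SOQL text are all hypothetical stand-ins, since the actual job was not posted; only the structure (a QueryLocator in start(), two queries in execute(), neither in a loop) comes from the description.

```apex
// Hypothetical class and object names; only the overall shape comes from the post.
global class ExampleDailyBatch implements Database.Batchable<SObject> {

    // start() returns a QueryLocator, which holds one server-side query cursor.
    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Account');
    }

    // execute() runs once per chunk of records; both queries sit outside any loop.
    global void execute(Database.BatchableContext bc, List<SObject> scope) {
        Set<Id> accountIds = new Map<Id, SObject>(scope).keySet();

        List<Contact> contacts =
            [SELECT Id, AccountId FROM Contact WHERE AccountId IN :accountIds];
        List<Opportunity> opps =
            [SELECT Id, AccountId FROM Opportunity WHERE AccountId IN :accountIds];

        // Per-record processing would go here, with no SOQL inside the loop.
    }

    global void finish(Database.BatchableContext bc) {
        // Intentionally empty in this sketch.
    }
}
```

A job shaped like this issues two queries per execute() call plus the one locator in start(), which fits the poster's reading that the code itself does not explain the cursor error.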

 

This error happened on na7 on 9/10 and 9/15, but the batch job has run twice a day without this error before 9/10, and from 9/11 through 9/14.

 

The Apex Jobs log records for the failures show Total Batches = 0, with Batches Processed and Failures both = 1. When the job runs successfully, there are upwards of 1,150 Total Batches, which leads me to think that the error happened in the start() method, when I try to create the QueryLocator.
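
The same figures can also be read from the AsyncApexJob object rather than the Apex Jobs page. The following anonymous Apex is a rough sketch, assuming that the page's Total Batches, Batches Processed, and Failures columns correspond to the TotalJobItems, JobItemsProcessed, and NumberOfErrors fields; the class name is hypothetical, and the "failed before execute" flag is only the heuristic suggested above, not a documented rule.

```apex
// Anonymous Apex sketch: list recent Batch Apex runs for one class.
// 'ExampleDailyBatch' is a hypothetical class name.
List<AsyncApexJob> jobs = [
    SELECT Id, Status, ExtendedStatus, TotalJobItems, JobItemsProcessed,
           NumberOfErrors, CreatedDate, CompletedDate
    FROM AsyncApexJob
    WHERE JobType = 'BatchApex'
      AND ApexClass.Name = 'ExampleDailyBatch'
    ORDER BY CreatedDate DESC
    LIMIT 20
];

for (AsyncApexJob job : jobs) {
    // Heuristic from the post: no batches were ever created, yet an error was recorded,
    // which suggests the failure happened before any execute() call.
    Boolean failedBeforeExecute = job.TotalJobItems == 0 && job.NumberOfErrors > 0;

    System.debug(job.CreatedDate + ' ' + job.Status
        + ' total=' + job.TotalJobItems
        + ' processed=' + job.JobItemsProcessed
        + ' errors=' + job.NumberOfErrors
        + ' failedBeforeExecute=' + failedBeforeExecute
        + ' message=' + job.ExtendedStatus);
}
```

Running this in each affected org would give a dated list of failures to attach to a support case.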

 

Could this be related to the ACS problem?

schulttj

I don't have an answer, but we received the ACS error once on na5 in the early morning hours of 9-19.

pmorelli

Most of these issues should be resolved now; please post here if you notice them again. Apologies for the burp; it had to do with fragmentation in one of our caching layers.

MJ09

This problem just resurfaced. We're seeing it again in one of the orgs where it occurred before.

Aiden Byrne

I also received this error from a customer organization today.

MJ09

This isn't my first go-round with this problem, and it's not my only problem with Batch Apex jobs. If you're having problems getting Batch Apex to behave properly, please see http://boards.developerforce.com/t5/Apex-Code-Development/Batch-Apex-problems/td-p/242867 where I'm hoping we can at least build a community of people who are having similar problems with Batch Apex, and maybe get some attention paid to the problems.

patrick.bulacz

One of our clients (a current ISV Partner) is having a similar issue across multiple instances and multiple servers.

 

I will update with specifics for each server and the times of the failures. The errors are coming through in the following format, and they all seem related to the SFDC-side caching issue described by a few people across the boards.

 

Developer script exception from ORGNAME : 'JOBCLASSNAME' : Unable to retrieve file from ACS, transient error?: batch/01p900000004fxY/005XXXXXXXXXXX/707N000000005eQ/30

Force.com Sandbox
 
Apex script unhandled exception by user/organization: 005XXXXXXXXXXX/00DXXXXXXXXXX
Source organization: 00DXXXXXXXXX (null)
Failed to process batch for class 'JOBCLASSNAME' for job id '707N000000005eQ'.  Salesforce System Error: 1169690028-32 (1846831103) (1846831103)

MJ09

I've been working with Tier 3 Support on a different Batch Apex issue, and they say that the ACS issue is a known problem that they're working on. When it happened months ago, Support said it was due to a fragmentation issue. More recently, Tier 3 says it's due to heavy load on the server. The two (fragmentation and heavy load) could easily be the same issue. With Spring 11, the wording of the error message seems to have changed, but it seems to be the same issue.

 

They seem to think that, recently anyway, it's limited to na7, so if you're seeing it on other Salesforce instances, they'd probably want to know about it.

 

This latest round has been happening since at least early February. Very frustrating -- I have Batch Apex jobs scheduled to run once or twice daily in several orgs, and because of this issue, those jobs are unreliable. I'd love to see Support post a message here giving more details and a plan for resolving the problem.

pmorelli

We've rewritten the batch apex implementation to increase reliability and address these issues. We're currently finishing up our load testing, and will be staggering it out now that we've released everywhere.

 

If you'd like to test it out earlier, to see if it improves your reliability, please send me a private message and I'll put you in touch with the Apex PM.

 

We're investigating the recent batch apex failures on na7 (and any others) to see if we have a regression, or the release borked something.

patrick.bulacz

MJ09 wrote:

They seem to think that, recently anyway, it's limited to na7, so if you're seeing it on other Salesforce instances, they'd probably want to know about it.


My client is experiencing it across all AP1 instances of their install base (ISV Partner).

 

The current running CASE is #04779985

pmorelli

I can confirm that the transient error was a regression in Batch Apex in the latest release. We introduced a race condition that deletes files out from underneath the jobs as they process.

 

We're currently testing a fix, and will try to get it out ASAP.

pinkelk

We encountered this problem last night on my client's Production system, on na7. The result was that a BI feed (an external Java program) that was supposed to pick up the results of the batch process failed, which has made the client very concerned about the reliability of scheduled Apex batch processes in general. SFDC Support: please contact me directly to advise me how to explain this to our mutually important client. Alan Davies +1 805 273 5130. adavies@astadia.com

MJ09

I really want to be able to rely on the Batch Apex platform, but I've also had cause for concern. It's been very frustrating, to say the least.

 

You might find this interesting:  http://boards.developerforce.com/t5/Apex-Code-Development/Need-to-talk-with-Batch-Apex-Product-Manager/td-p/264363

 

On the bright side, Taggart Matthiesen (who replied to the above-referenced post) has been really responsive. I do have some hope that, with his help, we'll be able to get to the bottom of some of the Batch Apex issues.

 

I see your request for Support to contact you, but I suspect you'll do a lot better logging into the Partner Portal (I'm sure Astadia has a login) and submitting a case there.

 

apexsutherland

We've experienced this issue several times within the past few weeks in a number of our clients' orgs with our AppExchange package. This morning we had a rash of this error across 4 customer orgs. I don't know what instances all these orgs are on (thanks to the lack of information in the exception email), but all the exception emails were delivered between 7 and 9 AM.

 

Any word on a root cause or resolution?