• Jacob Killian

I am getting a ‘Regex too complicated’ error (shown below) when loading data into our org using the following process:

 

1) an email service to receive the CSV data,

2) an Apex class to split and validate the CSV data, and then

3) a set of @future calls to upsert the data.

 

The same data loads fine in smaller volumes, but fails beyond a certain threshold. We can get back under that threshold either by reducing the number of rows or by truncating the widest columns to 3000 characters (a small number of columns contain 10,000 characters of text). When we do either or both of these, in any combination, to reduce the file size, the problem goes away. It’s not a specific badly formatted row either, because removing rows in various combinations always makes the problem go away.

 

So we don’t believe it is actually a regex problem, because the regular expression is just finding delimiters to split up a comma-separated file/string - i.e. it's very simple.

 

This is why we think there's an undocumented storage or capacity limit somewhere within the Apex processing that is being exceeded - one that doesn't have a governor limit associated with it, or indeed an accurate error message. We think the error message is misleading - i.e. it has nothing to do with complicated regex - and is a symptom of another issue.

 

This error has occurred in code that has been stable to date, and has only appeared since the size of the file we're uploading grew beyond about 4600-4800KB, which seems to be the threshold beyond which the problem occurs. There seem to be some undocumented limits on the volume of data that can be processed using the solution architecture we've designed.

 

We want to be able to code around this problem, but unless we know exactly what the error is, any changes we make to our code may not actually fix the problem and result in wasted effort. So I don't want to start changing this until I know exactly which part of the solution needs to be changed!

 

I’ve raised this with Salesforce as a potential bug or to see if they could clarify any undocumented limits on processing large volume datasets using the process we’ve designed, but they seem to have decided it’s a developer issue so won’t help.

 

The error message is below:

 

Apex script unhandled exception by user/organization: 

Failed to invoke future method 'public static void PrepareCSV(String, String, String, Integer, Boolean)'

caused by: System.Exception: Regex too complicated

Class.futureClassToProcess.GetList: line 98, column 17
Class.futureClassToProcess.parseCSV: line 53, column 38
Class.futureClassToProcess.PrepareCSV: line 35, column 20 External entry point

 The relevant code snippet is below:

 

 

 

public static List<List<String>> GetList(String Content)
{
    Content = Content.replaceAll(',"""',',"DBLQT').replaceAll('""",','DBLQT",');
    Content = Content.replaceAll('""','DBLQT');
    List<List<String>> lstCSV = new List<List<String>>();
    Boolean Cont = true;
    while (Cont == true){
        List<String> lstS = Content.split('\r\n',500);
        if(lstS.size() == 500){
            Content = lstS[499];
            lstS.remove(499);
        }else{
            Cont = false;
        }
        lstCSV.add(lstS);
    }
    return lstCSV;
}

 

Any suggestions gratefully received as to whether we're missing something obvious, whether 4MB+ files just can't be processed this way, or whether this might actually be a SFDC APEX bug.

 

 

 

public static List<List<String>> GetList(String Content)
{
    //Sanjeeb
    Log('GetList started.');
    Content = Content.replaceAll(',"""',',"DBLQT').replaceAll('""",','DBLQT",');
    Log('Replacing DBLQT.');
    Content = Content.replaceAll('""','DBLQT');
    Log('Replacing DBLQT.');
    List<List<String>> lstCSV = new List<List<String>>();
    Boolean Cont = true;
    while (Cont == true){
        List<String> lstS = Content.split('\r\n',500);
        Log('Split up to 500 rows.');
        //List<String> lstS = Content.split('\r\n',1000);
        if(lstS.size() == 500){
            Content = lstS[499];
            lstS.remove(499);
        }else{
            Cont = false;
        }
        lstCSV.add(lstS);
    }
    Log('GetList ends.');
    return lstCSV;
}
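
One possible explanation, offered as a hypothesis rather than a confirmed diagnosis: the Apex documentation for the Pattern and Matcher classes describes a limit of 1,000,000 on the number of times a regular expression may access its input sequence, and exceeding that limit raises a runtime error. Both String.split and String.replaceAll take regular expressions, so on a string of several megabytes even a trivial pattern could plausibly reach that limit, which would fit the size threshold described above. A minimal sketch of a way to test this is to do the same work without any regex, using String.replace (literal replacement) and indexOf/substring. The class name CsvLineSplitter and the methods splitLines and chunk are hypothetical names introduced for illustration only; they are not part of the original code.

```apex
// Hypothetical helper (not from the original post): regex-free row splitting.
public class CsvLineSplitter {

    // Splits content on '\r\n' using indexOf/substring, so no regex engine is involved.
    public static List<String> splitLines(String content) {
        List<String> rows = new List<String>();
        Integer startPos = 0;
        Integer endPos = content.indexOf('\r\n', startPos);
        while (endPos != -1) {
            rows.add(content.substring(startPos, endPos));
            startPos = endPos + 2; // skip the two-character line break
            endPos = content.indexOf('\r\n', startPos);
        }
        // Whatever follows the last line break (may be an empty string).
        rows.add(content.substring(startPos));
        return rows;
    }

    // Groups rows into lists of at most chunkSize rows each.
    public static List<List<String>> chunk(List<String> rows, Integer chunkSize) {
        List<List<String>> chunks = new List<List<String>>();
        List<String> current = new List<String>();
        for (String row : rows) {
            current.add(row);
            if (current.size() == chunkSize) {
                chunks.add(current);
                current = new List<String>();
            }
        }
        if (!current.isEmpty() || chunks.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }

    // Same shape of result as GetList above, assuming 499 rows per inner list.
    public static List<List<String>> getList(String content) {
        // String.replace treats its arguments as literals, not as regex patterns.
        content = content.replace(',"""', ',"DBLQT')
                         .replace('""",', 'DBLQT",')
                         .replace('""', 'DBLQT');
        return chunk(splitLines(content), 499);
    }
}
```

If the error disappears with a regex-free version like this on the same file, that would point towards the regex input-access limit rather than a storage limit. Heap size and CPU time governor limits still apply to a string of this size, so this sketch only removes the regex from the picture.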
