
RegEx for a very specific XML gives Error (Regex too complicated)

Hi all,


I need to trim unnecessary elements from an XML document before parsing it (because of its size). My idea is to loop over a list of the elements I don't need and remove each one with replaceAll().


Once I had what I thought was the right regular expression (tested in online regex engines against a sample XML), I used it in my code, but Salesforce gives me the error 'Regex too complicated'.


The regular expression I came up with is this:




This regular expression matches an entire XML element: the opening tag (with or without attributes), its value, and its closing tag.


I tried this regex in this engine: regex online engine tool


This is the text I use for testing purposes:



Using the engine linked above with my regex, the expected match is highlighted (underlined). I then use the same regex in a simple piece of Apex to trim the whole XML, something like this:



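String[] elementsToDelete = new String[]{'TEST'};
// Start from the full XML so each pass builds on the previous one
String result = xml;
for (String s : elementsToDelete) {
    // Strip whatever the regex matches for the element named s
    result = result.replaceAll('<('+s+')\b[^>]*>(.*?)<\\1>','');
}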


Notice I had to change the backreference from /\1 to \\1 to be able to save this code in Salesforce, following the java.util.regex Pattern class (which, according to the documentation, the Salesforce regex engine is based on). (More info about groupings here)
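For comparison, here is a minimal sketch of the same idea in plain Java (not Apex), since that is where the java.util.regex escaping rules come from. The class name BackrefDemo and the helper stripElement are made up for illustration. The `/` in the closing tag, the use of Pattern.quote, and the DOTALL flag are my assumptions and are not part of the Apex snippet above.

```java
import java.util.regex.Pattern;

class BackrefDemo {
    // Hypothetical helper: removes every <name ...>...</name> element from xml.
    static String stripElement(String xml, String name) {
        // Inside a Java string literal each regex backslash is doubled:
        // "\\b" is the regex \b (word boundary) and "\\1" is a
        // backreference to group 1 (the element name).
        // Assumption: the closing tag is </name>, hence the "/" before \\1.
        String regex = "<(" + Pattern.quote(name) + ")\\b[^>]*>(.*?)</\\1>";
        // Assumption: DOTALL, so '.' also matches line breaks inside the element.
        return Pattern.compile(regex, Pattern.DOTALL).matcher(xml).replaceAll("");
    }

    public static void main(String[] args) {
        String xml = "<root><TEST id=\"1\">drop me</TEST><KEEP>stay</KEEP></root>";
        // prints <root><KEEP>stay</KEEP></root>
        System.out.println(stripElement(xml, "TEST"));
    }
}
```

The word boundary after the group is what keeps an element such as TESTING from being matched when only TEST is listed.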


I know this is a very specific problem, but I have tried to provide all the information I have. Can anyone spot what the problem with the regex is? Is it even possible to write a regex like this that the Apex regex engine can process?


Any help is very welcome.







Just a quick follow-up: I tested this code in an anonymous block and it works perfectly, which makes me wonder whether the problem is the size of the String.


Can anyone shed some light on this?