jlcoverity
"Illegal regex" - HUH?!?!?
Hello all. Would someone please explain to me why this:
Pattern p = Pattern.compile('^AMAZON(\\.COM)?($|[,-\\s]+');
Would produce this error:
08:18:26.042 (42021000)|EXECUTION_STARTED
08:18:26.042 (42032000)|CODE_UNIT_STARTED|[EXTERNAL]|execute_anonymous_apex
08:18:26.043 (43805000)|FATAL_ERROR|System.StringException: Invalid regex: Illegal character range near index 22 ^AMAZON(\.COM)?($&#124;[,-\s]+
What's up with the substitution of the pipe character???
This regex works perfectly fine in JavaScript.
Thanks, -jl
All Answers
Hi,
Pattern p = Pattern.compile('^AMAZON(\\.COM)?($ | [,-\\s]+'); // the pipe character (highlighted in the original post) is what is causing this error
Remove it and try the following:
Pattern p = Pattern.compile('^AMAZON(\\.COM)?($ [,-\\s]+');
Hope this helps!
Please mark this answer as the solution and give kudos by clicking the star icon if you found it helpful.
Yes, I know that the "pipe" character is the problem...but I want the pipe character because it indicates "OR" in a regular expression. What I am trying to do here is match a string that starts with "Amazon" or "Amazon.com" and then either ends ($) or is followed by a comma, a dash, or whitespace (or any combination of the three, so that it will also match "Amazon - UK" or "Amazon.com, Inc").
The pipe character is a valid regular expression entity (see: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions#special-or). In fact, it is even referenced in the Apex documentation:
So...why is it now being escaped? The pattern compile method is failing on the semicolon in the HTML entity for the pipe (&#124;).
JavaScript isn't Java, so some regexes behave differently. However, I did find what you're looking for by playing around with a Java regex tester (http://www.regexplanet.com/advanced/java/index.html). I believe this is the code you're seeking:
The trick was that the parser was reading the unescaped - as a character range operator, so it tried to build the range "comma through \s", which isn't a valid range: \s is a whitespace class rather than a single character, and a literal space would sort before the comma in character-code order anyway. This code successfully matches "amazon", "amazon, inc", "amazon.com, inc", and "amazon.com", but not "amazon.co" or "amazons".
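As a rough illustration of that explanation (a sketch, not necessarily the exact pattern from this reply), the plain-Java snippet below assumes the fix is to move the hyphen to the end of the character class so it is read as a literal, close the second group, and add (?i) for case-insensitive matching. The class and method names are made up for the example. Java is used because Apex regular expressions are based on their Java counterparts; in Apex the string literals would use single quotes.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class AmazonPatternSketch {
    // Pattern from the question, as posted: '-' sits between ',' and '\s'.
    static final String ORIGINAL = "^AMAZON(\\.COM)?($|[,-\\s]+";

    // Assumed fix: hyphen last in the class (so it is a literal),
    // second group closed, and (?i) for case-insensitive matching.
    static final String FIXED = "(?i)^AMAZON(\\.COM)?($|[,\\s-]+)";

    // Returns true if java.util.regex accepts the pattern.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // True when the name starts with AMAZON or AMAZON.COM and is then
    // followed by end of input or at least one separator character.
    static boolean looksLikeAmazon(String name) {
        return Pattern.compile(FIXED).matcher(name).find();
    }

    public static void main(String[] args) {
        System.out.println("original compiles: " + compiles(ORIGINAL));
        System.out.println("fixed compiles: " + compiles(FIXED));
        String[] samples = {"amazon", "amazon, inc", "amazon.com, inc",
                            "amazon.com", "amazon.co", "amazons"};
        for (String s : samples) {
            System.out.println(s + " -> " + looksLikeAmazon(s));
        }
    }
}
```

Escaping the hyphen inside the class (\\- in the string literal) should work as well; putting it last just avoids one more backslash.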
That was it! I totally forgot that there are inconsistencies between Java & JS regex...been having too much fun with node.js lately, I guess. :)
Thank you!
UPDATED: To complete the full matching that I needed (in Apex), I also needed to add a [\\w\\s]* to the end (to match "Amazon.com - Palo Alto", for example):
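A sketch of how that update might look, again in plain Java and under assumptions: the [\\w\\s]* is appended right after the separator group of a hyphen-last, case-insensitive pattern, and the whole string is tested with matches(). The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class AmazonFullMatchSketch {
    // Assumed full pattern: prefix, optional ".COM", then either end of input
    // or separators, then any trailing word characters and whitespace.
    static final Pattern FULL = Pattern.compile(
        "(?i)^AMAZON(\\.COM)?($|[,\\s-]+)[\\w\\s]*");

    // matches() requires the entire input to match, which is why the
    // trailing [\\w\\s]* is needed for names with a suffix.
    static boolean isAmazonName(String name) {
        return FULL.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"Amazon", "Amazon - UK", "Amazon.com, Inc",
                            "Amazon.com - Palo Alto", "Amazons", "Amazon.co"};
        for (String s : samples) {
            System.out.println(s + " -> " + isAmazonName(s));
        }
    }
}
```

Because [\\w\\s]* does not include punctuation, a suffix such as "Inc." with a trailing period would need the class widened.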