java.util.regex.PatternSyntaxException: Dangling meta character '+' for the string +91 - regex

I am using the regular expression below to replace occurrences of a supplied string with ****:
String output=output.replaceAll("(?<!\\w)(?i)"+requesterView.getFirstname()+"(?!\\w)","****");
In the case above, the supplied string is +91.
If the supplied string contains +, I get the following exception:
java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 12
(?<!\w)(?i)(+91)(?!\w)
^
at java.util.regex.Pattern.error(Pattern.java:1955)
at java.util.regex.Pattern.sequence(Pattern.java:2123)
at java.util.regex.Pattern.expr(Pattern.java:1996)
at java.util.regex.Pattern.group0(Pattern.java:2905)
at java.util.regex.Pattern.sequence(Pattern.java:2051)
at java.util.regex.Pattern.expr(Pattern.java:1996)
at java.util.regex.Pattern.compile(Pattern.java:1696)
at java.util.regex.Pattern.<init>(Pattern.java:1351)
at java.util.regex.Pattern.compile(Pattern.java:1028)
How can I resolve this exception?

You need to escape the regex meta-characters in your input String, which you can do with the static method Pattern.quote(String str):
String output=output.replaceAll("(?<!\\w)(?i)"+Pattern.quote(requesterView.getFirstname())+"(?!\\w)","****");
Currently Java tries to parse the tokens of the input string (+91) as regex tokens and fails to make sense of the + meta-character in the context where it appears. Additionally, the parentheses would have been interpreted as a capturing group.
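As a minimal, self-contained sketch of the Pattern.quote approach (the class name MaskLiteral and the helper mask are hypothetical, introduced only for illustration):

```java
import java.util.regex.Pattern;

class MaskLiteral {
    // Replaces whole-token, case-insensitive occurrences of `literal` with ****.
    // Pattern.quote makes every character of `literal` match itself.
    static String mask(String text, String literal) {
        String regex = "(?<!\\w)(?i)" + Pattern.quote(literal) + "(?!\\w)";
        return text.replaceAll(regex, "****");
    }

    public static void main(String[] args) {
        System.out.println(mask("+91 123123123", "+91")); // **** 123123123
        System.out.println(mask("Call John now", "john")); // Call **** now
    }
}
```

Only the user-supplied part is quoted, so the lookarounds and the (?i) flag around it keep their regex meaning.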

As Aaron mentioned, you need to quote the supplied string before inserting it into the regular expression.
This can be achieved either with Pattern.quote or by wrapping the string in \Q and \E. Here is an example:
public static String transformRegex(String input, String testStr) {
return input.replaceAll("(?<!\\w)(?i)\\Q" + testStr + "\\E(?!\\w)", "****");
}
Here is a test of the method above:
String output = transformRegex("+91 123123123", "+91");
System.out.println(output);
This prints:
**** 123123123
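One difference between the two options is worth a small sketch: a hand-written \Q...\E wrapper stops quoting at the first \E, so an input that itself contains \E is no longer fully quoted, whereas Pattern.quote is built to cope with that case. The class and method names below are hypothetical.

```java
import java.util.regex.Pattern;

class QuoteEdgeCase {
    // True when `input`, quoted with Pattern.quote, matches itself literally.
    static boolean matchesItself(String input) {
        return Pattern.matches(Pattern.quote(input), input);
    }

    public static void main(String[] args) {
        // The five characters a \ E b + ; the \E would end a manual \Q...\E early.
        String tricky = "a\\Eb+";
        System.out.println(matchesItself(tricky)); // true
        System.out.println(matchesItself("+91"));  // true
    }
}
```

If the supplied string comes from users, Pattern.quote is therefore the safer of the two forms.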

Related

Formatting regex in Dart on several lines

I have
Pattern pattern = r'^((?:19|20)\d\d)[- /.]
(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])$';
My editor shows an error on this regexp:
How can I fix it?
You entered a line break inside a string literal, that is why you get a syntax issue.
If you want to split a pattern into several lines, just use string concatenation:
Pattern pattern = r'^((?:19|20)\d\d)[- /.]' +
r'(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])$';
Or, since string literals separated only with whitespace characters are concatenated automatically:
Pattern pattern = r'^((?:19|20)\d\d)[- /.]'
r'(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])$';
Or, if you plan to re-use a long pattern, you may define this part as a variable, and just use string interpolation:
String y = r'((?:19|20)\d\d)';
String M = r'(0[1-9]|1[012])';
String d = r'(0[1-9]|[12][0-9]|3[01])';
String sep = r'[- /.]';
Pattern pattern = '^$y$sep$M$sep$d\$';
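The same idea, building a long pattern from named parts joined across several lines, can be sketched in Java as well. This is an illustrative analogue rather than Dart code, and the class, constant, and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DatePattern {
    // Named building blocks for a year-month-day date.
    static final String YEAR = "((?:19|20)\\d\\d)";
    static final String MONTH = "(0[1-9]|1[012])";
    static final String DAY = "(0[1-9]|[12][0-9]|3[01])";
    static final String SEP = "[- /.]";

    // The pattern is split over several lines with plain concatenation.
    static final Pattern DATE = Pattern.compile(
            "^" + YEAR + SEP
                + MONTH + SEP
                + DAY + "$");

    // Returns {year, month, day}, or null when the input is not a date.
    static String[] parse(String s) {
        Matcher m = DATE.matcher(s);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", parse("2023-07-15"))); // 2023,07,15
        System.out.println(parse("2023-13-01") == null);           // true
    }
}
```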

regex_search trying to match a string containing '['

Being relatively new to regular expressions, I am having trouble figuring out the correct syntax. I am trying to match a string with the following pattern: String[string]!='string'. I want to divide it into three matches as follows:
String[string] // can contain numbers
!= // operator that can also be: =,> and < or a combination of these
'string' // can contain _ and numbers
So far I have managed to match a string with the following pattern: string='string'. Using this code:
const string strExpression = "TypeValue!='Set_Site'";
regex regex("([a-zA-Z0-9]+)([=><!]+)(['a-zA-Z0-9_]+)");
smatch match;
if (regex_search(strExpression.begin(), strExpression.end(), match, regex))
{
string indicator(match[1]);
string op(match[2]);
string value(match[3]);
}
However when I try to add '[' and ']' to the regex syntax I don't get any matches. I have modified the code like this:
const string strExpression = "Type[Value]!='Set_Site'";
regex regex("([]a-zA-Z0-9[]+)([=><!]+)(['a-zA-Z0-9_]+)");
smatch match;
if (regex_search(strExpression.begin(), strExpression.end(), match, regex))
{
string indicator(match[1]);
string op(match[2]);
string value(match[3]);
}
According to the documentation I am reading, the right square bracket ( ']' ) loses its special meaning (terminating the bracket expression) and represents itself in a bracket expression if it occurs first in the list. For the left square bracket ( '[' ), it says that it loses its special meaning within a bracket expression. Following these rules and definitions, I cannot work out why I am not getting any matches.
Can someone give me some guidelines what I am doing wrong?
Thank you.
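One possible explanation, offered as an assumption rather than a confirmed diagnosis: std::regex follows the ECMAScript grammar by default, where a leading ] is not treated as a literal and [] is an empty class, so the POSIX rule quoted above would not apply. Escaping both brackets inside the class avoids relying on that rule. The sketch below illustrates the escaped form with Java's java.util.regex; the class and method names are hypothetical, and the same pattern text should carry over to a C++ string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketExpression {
    // Both [ and ] are escaped inside the first character class.
    static final Pattern EXPR = Pattern.compile(
            "([a-zA-Z0-9\\[\\]]+)([=><!]+)(['a-zA-Z0-9_]+)");

    // Returns {indicator, operator, value}, or null when nothing matches.
    static String[] split(String expression) {
        Matcher m = EXPR.matcher(expression);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = split("Type[Value]!='Set_Site'");
        System.out.println(parts[0]); // Type[Value]
        System.out.println(parts[1]); // !=
        System.out.println(parts[2]); // 'Set_Site'
    }
}
```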


VB.NET separate strings using regex split?
I'm having a logic error with the pattern string variable. The error occurs after I extend the string from "(-)" to "(-)(+)(/)(*)".
Dim input As String = txtInput.Text
Dim pattern As String = "(-)(+)(/)(*)"
Dim substrings() As String = Regex.Split(input, pattern)
For Each match As String In substrings
lstOutput.Items.Add(match)
Next
This is my output when the pattern string variable is "(-)"; it works fine:
input: dog-
output: dog
-
This is my desired output, but something is wrong with the code: it raises an error once I use "(-)(+)(/)(*)", and even with this:
"(-)" + "(+)" + "(/)" + "(*)"
input: dog+cat/tree
output: dog
+
cat
/
tree
When the input from the textbox contains a space character, the listbox should show:
input: dog+cat/ tree
output: dog
+
cat
/
tree
You need a character class, not a sequence of subpatterns inside separate capturing groups:
Dim pattern As String = "([+/*-])"
This pattern will match and capture into Group 1 (and thus, all the captured values will be part of the resulting array) a char that is either a +, /, * or -. Note the position of the hyphen: since it is the last char in the character class, it is treated as a literal -, not a range operator.
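For readers on the JVM, note that Java's String.split does not add captured groups to the result the way Regex.Split does in .NET. A roughly equivalent sketch, which swaps in a different technique (splitting at zero-width lookaround positions around the same character class), is shown below. The class and method names are hypothetical, and Java 8 or later is assumed.

```java
import java.util.Arrays;
import java.util.List;

class SplitKeepOperators {
    // Splits just before and just after any of + / * - so the operators
    // stay in the result as their own tokens.
    static List<String> tokens(String input) {
        return Arrays.asList(input.split("(?<=[+/*-])|(?=[+/*-])"));
    }

    public static void main(String[] args) {
        System.out.println(tokens("dog+cat/tree")); // [dog, +, cat, /, tree]
    }
}
```

As in the .NET pattern, the hyphen sits last in the character class so that it is read as a literal rather than as a range operator.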

Groovy complaining about illegal character range in regex

Groovy 2.4 here. I am trying to build a regex that will filter out all the following characters:
`,./;[]-&<>?:"()|
Here's my best attempt:
static void main(String[] args) {
// `,./;[]-&<>?:"()|
String regex = "`,./;[]-&<>?:\"()|"
String test = "ooekrofkrofor ` oxkeoe , wdkeodeko / kodek ] woekoedk \" swjiej ' wsjwdjeiji :"
println test.replaceAll(regex, "")
}
However this produces a compile error on the regex string definition, complaining:
illegal character range (to < from)
Not sure if this is a Java or Groovy thing, but I can't figure out how to define the regex properly so that it quiets the error and correctly strips these "illegal characters" out of my string. Any ideas?
It seems to me you want to remove all the characters listed in your regex variable. The problem is that you declared a sequence while you need a character class (enclose the characters with []).
See Groovy demo:
String regex = "[`,./;\\[\\]&<>?:\"()|-]+"
// Changes: characters wrapped in [...], [ and ] escaped, - moved to the end, + quantifier added
String test = "ooekrofkrofor ` oxkeoe , wdkeodeko / kodek ] woekoedk \" swjiej ' wsjwdjeiji :"
println test.replaceAll(regex, "")
Output: ooekrofkrofor oxkeoe wdkeodeko kodek woekoedk swjiej ' wsjwdjeiji
The pattern now contains a character class matching any of the characters defined inside it - [`,./;\[\]&<>?:\"()|-] - one or more times due to the + quantifier. Note that inside the character class, ] and [ must always be escaped, and the - can be left unescaped when placed at the start/end of the character class.
You need to escape a few special characters in your pattern:
String regex = "[`,./;\\[\\]\\-&<>?:\"\\(\\)|]+"
Note the doubled \\: each pair becomes a single \ in the string, so when the pattern is parsed, the character that follows is escaped. Both [ and ] need the escape; an unescaped ] would close the character class early.
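Since Groovy delegates to Java's regex engine, the character-class approach can be checked with a small plain-Java sketch. The class and method names are hypothetical.

```java
class StripSpecials {
    // Character class listing every character to remove. [ and ] are escaped,
    // and - is placed last so it is read as a literal, not a range.
    static final String SPECIALS = "[`,./;\\[\\]&<>?:\"()|-]+";

    static String strip(String text) {
        return text.replaceAll(SPECIALS, "");
    }

    public static void main(String[] args) {
        System.out.println(strip("a,b.c[d]e-f")); // abcdef
        System.out.println(strip("it's (ok)"));   // it's ok
    }
}
```

The apostrophe survives because it is not listed in the class, which matches the output shown in the first answer.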

How to match a string with an opening brace { in C++ regex

I have a question about writing regexes in C++. I have two regexes which work fine in Java, but in C++ they throw an error, namely:
one of * + was not preceded by a valid regular expression C++
These regexes are as follows:
regex r1("^[\s]*{[\s]*\n"); //Space followed by '{' then followed by spaces and '\n'
regex r2("^[\s]*{[\s]*\/\/.*\n") // Space followed by '{' then by '//' and '\n'
Can someone help me how to fix this error or re-write these regex in C++?
See basic_regex reference:
By default, regex patterns follow the ECMAScript syntax.
ECMAScript syntax reference states:
characters:
\character
description: character
matches: the character character as it is, without interpreting its special meaning within a regex expression.
Any character can be escaped except those which form any of the special character sequences above.
Needed for: ^ $ \ . * + ? ( ) [ ] { } |
So, you need to escape { to get the code working:
std::string s("\r\n { \r\nSome text here");
regex r1(R"(^\s*\{\s*\n)");
regex r2(R"(^\s*\{\s*//.*\n)");
std::string newtext = std::regex_replace( s, r1, "" );
std::cout << newtext << std::endl;
Also, note how the R"(pattern_here_with_single_escaping_backslashes)" raw string literal syntax simplifies a regex declaration.
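For comparison, the escaped brace is also valid in Java's regex engine, so writing \{ should be safe in both languages. A minimal Java sketch mirroring the C++ snippet above (the class and method names are hypothetical; Java has no raw string literals here, so the backslashes are doubled):

```java
class BraceLine {
    // Leading whitespace, a literal {, optional whitespace, then a newline.
    static final String OPENING_BRACE_LINE = "^\\s*\\{\\s*\\n";

    // Removes the first opening-brace line from the start of the text.
    static String dropOpeningBrace(String text) {
        return text.replaceFirst(OPENING_BRACE_LINE, "");
    }

    public static void main(String[] args) {
        System.out.println(dropOpeningBrace("\r\n { \r\nSome text here")); // Some text here
    }
}
```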