Find and replace with regular expressions - c++

I'm trying to replace a bunch of function calls using regular expressions but can't seem to get it right. This is a simplified example of what I'm trying to do:
GetPetDog();
GetPetCat();
GetPetBird();
I want to change to:
GetPet<Animal_Dog>();
GetPet<Animal_Cat>();
GetPet<Animal_Bird>();

Use the regex below:
(GetPet)([^(]*) with substitution \1<Animal_\2>
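The \1 and \2 references above are the substitution syntax used by many editors and online testers. With std::regex in its default (ECMAScript) format, the same groups are written $1 and $2. A minimal sketch, assuming the source text is held in a std::string; the function name rewrite_get_pet_calls is made up for illustration:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites every GetPetXxx( call into GetPet<Animal_Xxx>(.
std::string rewrite_get_pet_calls(const std::string& source) {
    // Group 1 is the literal prefix, group 2 is everything up to the next '('.
    static const std::regex call_pattern(R"((GetPet)([^(]*))");
    // std::regex_replace replaces every match by default.
    return std::regex_replace(source, call_pattern, "$1<Animal_$2>");
}
```

Because [^(]* may also match nothing, a bare GetPet( call would be rewritten to GetPet<Animal_>(; tighten the class (for example to [A-Za-z]+) if that can occur in your code.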

You can use the following regex and code for that:
std::string ss("GetPetDog();");
static const std::regex ee("GetPet([^()]*)");
std::string result;
result = std::regex_replace(ss, ee, "GetPet<Animal_$1>");
std::cout << result << std::endl;
Regex:
GetPet - Matches GetPet literally (we need no capturing group here)
([^()]*) - A capturing group to match any characters other than ( or ) 0 or more times (*)
Output:
GetPet<Animal_Dog>();

Regex to replace single occurrence of character in C++ with another character

I am trying to replace a single occurrence of a character '1' in a String with a different character.
This same character can occur multiple times in the String which I am not interested in.
For example, in the below string I want to replace the single occurrence of 1 with 2.
input:-0001011101
output:-0002011102
I tried the regex below, but it gives me wrong results:
regex b1("(1){1}");
S1 = regex_replace(S, b1, "2");
Any help would be greatly appreciated.
If you used the Boost regex library (boost::regex), you could simply use a lookaround-based solution like
(?<!1)1(?!1)
And then replace with 2.
With std::regex, you cannot use lookbehinds, but you can use a regex that captures either start of string or any one char other than your char, then matches your char, and then makes sure your char does not occur immediately on the right.
Then, you may replace with $01 backreference to Group 1 (the 0 is necessary since the $12 replacement pattern would be parsed as Group 12, an empty string here since there is no Group 12 in the match structure):
std::regex reg("([^1]|^)1(?!1)");
S1 = std::regex_replace(S, reg, "$012");
See the complete C++ example:
#include <iostream>
#include <regex>
int main() {
std::string S = "-0001011101";
std::regex reg("([^1]|^)1(?!1)");
std::cout << std::regex_replace(S, reg, "$012") << std::endl;
return 0;
}
// => -0002011102
Details:
([^1]|^) - Capturing group 1: any char other than 1 ([^...] is a negated character class) or start of string (^ is a start of string anchor)
1 - a 1 char
(?!1) - a negative lookahead that fails the match if there is a 1 char immediately to the right of the current location.
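Use a negative lookahead in the regexp to match a 1 that isn't followed by another 1 (note that on its own this still matches the last 1 of a run such as 111, so the character before the 1 has to be checked as well):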
regex b1("1(?!1)");
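To see how a lookahead-only pattern differs from one that also checks the preceding character, both can be run on the sample input. A small sketch with std::regex; the two helper names are invented for this illustration:

```cpp
#include <regex>
#include <string>

// Lookahead only: replaces any 1 that is not followed by another 1,
// which includes the last 1 of a run such as 111.
std::string replace_not_followed(const std::string& s) {
    static const std::regex re("1(?!1)");
    return std::regex_replace(s, re, "2");
}

// Also checks the left side: group 1 holds the preceding non-1 character
// (or is empty at the start of the string) and is put back by $01.
std::string replace_isolated(const std::string& s) {
    static const std::regex re("([^1]|^)1(?!1)");
    return std::regex_replace(s, re, "$012");
}
```

For "-0001011101", replace_not_followed yields -0002011202, while replace_isolated yields the requested -0002011102.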

RE2 Nested Regex Group Match

I have a RE2 regex as following
const re2::RE2 numRegex("(([0-9]+),)+([0-9])+");
std::string inputStr;
inputStr = "apple with make,up things $312,412,3.00";
RE2::Replace(&inputStr, numRegex, "$1$3");
cout << inputStr;
Expected
apple with make,up things $3124123.00
I was trying to remove the commas in the recognized number, but $1 would only match 312 and not the 412 part. I am wondering how to extract the repeated pattern in the group.
Note that RE2 doesn't support lookahead (see Using positive-lookahead (?=regex) with re2) and the solutions I found all use lookaheads.
RE2 based solution
As RE2 does not support lookarounds, there is no pure single-pass regex solution.
You can use a workaround: run a global replace twice with the (\d),(\d) regex and the \1\2 rewrite string. Note that RE2 rewrite strings refer to groups as \1, \2 (not $1, $2), and that RE2::Replace only replaces the first match, so RE2::GlobalReplace is used here:
const re2::RE2 numRegex(R"((\d),(\d))");
std::string inputStr("apple with make,up things $312,412,3.00");
RE2::GlobalReplace(&inputStr, numRegex, R"(\1\2)");
RE2::GlobalReplace(&inputStr, numRegex, R"(\1\2)"); // <- Second pass to remove commas in 1,2,3,4 like strings
std::cout << inputStr;
C++ std::regex based solution:
You can remove the commas between digits using
std::string inputStr("apple with make,up things $312,412,3.00");
std::regex numRegex(R"((\d),(?=\d))");
std::cout << regex_replace(inputStr, numRegex, "$1") << "\n";
// => apple with make,up things $3124123.00
Details:
(\d) - Capturing group 1 ($1): a digit
, - a comma
(?=\d) - a positive lookahead that requires a digit immediately to the right of the current location.
In the pattern that you tried, you are repeating the outer group (([0-9]+),)+, which will then contain only the value of its last iteration, where it matches one or more digits and a comma.
The last iteration captures 412, while 312, is matched but not kept in the group.
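The behaviour of a repeated capturing group can be checked directly: a group inside a quantifier keeps only the text of its final iteration. A minimal sketch with std::regex (used here instead of RE2 purely for illustration; the struct and function names are hypothetical):

```cpp
#include <regex>
#include <string>

// Holds the three capture groups of (([0-9]+),)+([0-9])+ after one search.
struct NumberGroups {
    std::string outer;  // group 1: the last "digits," chunk
    std::string inner;  // group 2: the digits of that last chunk
    std::string tail;   // group 3: the last digit matched by ([0-9])+
};

NumberGroups capture_number_groups(const std::string& text) {
    static const std::regex re("(([0-9]+),)+([0-9])+");
    std::smatch m;
    NumberGroups groups;
    if (std::regex_search(text, m, re)) {
        groups.outer = m[1].str();
        groups.inner = m[2].str();
        groups.tail = m[3].str();
    }
    return groups;
}
```

For "$312,412,3.00" the groups come out as "412,", "412" and "3"; the 312, chunk is part of the overall match but is not retained in any group.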
You are using RE2, but as an alternative, if you have Boost available, you could make use of the \G anchor, which asserts the position at the end of the previous match and so allows iterative matches, and replace each matched comma with an empty string.
(?:\$|\G(?!^))\d+\K,(?=\d)
The pattern matches:
(?: Non capture group
\$ match $
| Or
\G(?!^) Assert the position at the end of the previous match, not at the start
) Close non capture group
\d+\K Match 1+ digits and forget what is matched so far
,(?=\d) Match a comma and assert a digit directly to the right
#include<iostream>
#include <string>
#include <boost/regex.hpp>
using namespace std;
int main()
{
std::string inputStr = "apple with make,up things $312,412,3.00";
boost::regex numRegex("(?:\\$|\\G(?!^))\\d+\\K,(?=\\d)");
std::string result = boost::regex_replace(inputStr, numRegex, "");
std::cout << result << std::endl;
}
Output
apple with make,up things $3124123.00

How to select the complete value within braces even if the value itself contains braces

Please suggest a solution for the following example.
Scenario-1:
My String : Password={my_pswd}}123}
I want to select the value enclosed within the {} brackets (for example, I want to select the complete password value {my_pswd}}123}, not {my_pswd}).
If I use the regex \{(.*?)\}, it selects {my_pswd}, not {my_pswd}}123}. So how can I get the complete value even if it contains } in between? Suggestions using regex or any other approach are welcome.
Scenario-2:
I am using the regex ^\{|\}$. If my string has both an opening { and a closing } bracket, like {{my_password}}, then only the first and last brackets should be selected. If my string is like {{my_password, the starting bracket should not be selected. It is like an AND condition in regex. I looked at many posts that solve this with lookarounds, but I could not get a clear idea. Please give me some suggestions.
Thanks.
It seems that the {...} substrings you want to match must be followed with ; or end of string.
This will not work for cases when a } inside the values can also be followed with ;.
You may solve the first issue by adding a (?![^;]) lookaround:
\{(.*?)\}(?![^;])
Details
\{ - a { char
(.*?) - Group 1: any 0+ chars as few as possible
\} - a } char
(?![^;]) - no char other than ; is allowed right after the current position
See the C++ demo:
#include <iostream>
#include <vector>
#include <regex>
int main() {
const std::regex reg("\\{(.*?)\\}(?![^;])");
std::smatch match;
std::string s = "Username={My_{}user};Password={my_pswd}}123}}}kk};Password={my_pswd}}123}";
std::vector<std::string> results(
std::sregex_token_iterator(s.begin(), s.end(), reg, 1), // See 1, it extracts Group 1 value
std::sregex_token_iterator());
for (auto result : results)
{
std::cout << result << std::endl;
}
return 0;
}
Output:
My_{}user
my_pswd}}123}}}kk
my_pswd}}123
As for the second scenario, you may use
std::regex reg("^\\{([^]*)\\}$");
std::string s = "{My_{}user}";
std::cout << regex_replace(s, reg, "$1") << std::endl; // => My_{}user
The ^\{([^]*)\}$ pattern matches the { at the start (^) of the string, then matches and captures into Group 1 (later referenced with $1 in the replacement pattern) any 0+ chars, as many as possible, and then matches a } at the end of the string ($). If either brace is missing, the pattern does not match and the string is left unchanged.
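The "both braces or nothing" behaviour of the second scenario can be wrapped in a small function. This is only a sketch: the name strip_outer_braces is hypothetical, and [\s\S]* is swapped in for [^]* as a widely supported way to say "any character" in the ECMAScript grammar.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: removes one leading '{' and one trailing '}' only when
// both are present; otherwise the input is returned unchanged.
std::string strip_outer_braces(const std::string& value) {
    // [\s\S]* matches any characters, including line breaks.
    static const std::regex re(R"(^\{([\s\S]*)\}$)");
    return std::regex_replace(value, re, "$1");
}
```

With this, {{my_password}} becomes {my_password}, while {{my_password is returned as is because the closing brace is missing.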

Regex in C++11 vs PHP

I'm new to regex and C++11. In order to match an expression like this :
TYPE SIZE NUMBER ("regina s x99");
I built a regex which looks like this one :
\b(regina|margarita|americaine|fantasia)\b \b(s|l|m|xl|xxl)\b x([1-9])([0-9])
In my code I did this to try the regex :
std::string s("regina s x99");
std::regex rgx($RGX); //$RGX corresponds to the regex above
if (std::regex_match(s, rgx))
std::cout << "It works !" << std::endl;
This code throws a std::regex_error, but I don't know where it comes from.
Thanks,
This works with g++ (4.9.2) in c++11 mode:
std::regex rgx("\\b(regina|margarita|americaine|fantasia)\\b\\s*(s|l|m|xl|xxl)\\b\\s*x([1-9]*[0-9])");
This will capture three groups: regina s 99 which matches the TYPE SIZE NUMBER pattern, while your original captured four groups regina s 9 9 and had the NUMBER as two values (maybe that was what you wanted though).
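To inspect what the three-group pattern actually captures, the groups can be read from a std::smatch. A minimal sketch; parse_order is a made-up name, and returning an empty vector on failure is just one possible convention:

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns the capture groups (without the full match) if the whole order
// string matches, or an empty vector otherwise.
std::vector<std::string> parse_order(const std::string& order) {
    static const std::regex re(
        R"(\b(regina|margarita|americaine|fantasia)\b\s*(s|l|m|xl|xxl)\b\s*x([1-9]*[0-9]))");
    std::smatch m;
    std::vector<std::string> groups;
    if (std::regex_match(order, m, re)) {
        for (std::size_t i = 1; i < m.size(); ++i) {
            groups.push_back(m[i].str());
        }
    }
    return groups;
}
```

For "regina s x99" this returns the three strings regina, s and 99.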
In C++ strings the \ character is special and needs to be escaped so that it gets passed to the regular expression engine, not interpreted by the compiler.
So you either need to use \\b:
std::regex rgx("\\b(regina|margarita|americaine|fantasia)\\b \\b(s|l|m|xl|xxl)\\b x([1-9])([0-9])");
or use a raw string, which means that \ is not special and doesn't need to be escaped:
std::regex rgx(R"(\b(regina|margarita|americaine|fantasia)\b \b(s|l|m|xl|xxl)\b x([1-9])([0-9]))");
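That the escaped literal and the raw string literal describe the same pattern can be verified by comparing the two strings directly. A short sketch; the constant and function names are assumptions made for this example:

```cpp
#include <regex>
#include <string>

// The same pattern written twice: with escaped backslashes and as a raw string.
const std::string kEscapedPattern =
    "\\b(regina|margarita|americaine|fantasia)\\b \\b(s|l|m|xl|xxl)\\b x([1-9])([0-9])";
const std::string kRawPattern =
    R"(\b(regina|margarita|americaine|fantasia)\b \b(s|l|m|xl|xxl)\b x([1-9])([0-9]))";

// Returns true when the whole input matches the given pattern.
bool matches_order(const std::string& input, const std::string& pattern) {
    return std::regex_match(input, std::regex(pattern));
}
```

Both constants hold identical characters, so matches_order("regina s x99", ...) gives the same answer with either one.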
There was a typo in this line in original question:
if (std::reegex_match(s, rgx))
Moreover, I am not sure what you are passing with this variable: $RGX
Corrected program as follows:
#include<regex>
#include<iostream>
int main()
{
std::string s("regina s x99");
std::regex rgx("\\b(regina|margarita|americaine|fantasia)\\b \\s*(s|l|m|xl|xxl)\\b \\s*x([1-9])([0-9])"); //$RGX corresponds to the regex above
if (std::regex_match(s, rgx))
std::cout << "It works !" << std::endl;
else
std::cout<<"No Match"<<std::endl;
}

Qt Regex Help (Array Keys)

Okay, so the following string is what my regex will attempt to match against:
[key1][key2][key3]
and here is my regex.
\[(.+?)\]
This is all being done in Qt, and here is the code I am using
QRegExp reg("\\[(.+?)\\]");
reg.indexIn(string);
qDebug() << "Matches: " << reg.capturedTexts();
The above returns this:
("", "")
So two questions then:
Why are the captures empty
On my regex, why did I need to put \\ for it to work? If I just put \ it will not capture anything.
Thank you!
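First, let's fix your regular expression: QRegExp does not support lazy quantifiers on individual items such as .+? (minimal matching can only be switched on for the whole pattern with setMinimal(true)), which is the likely reason the captures are empty. Instead of the reluctant .+? use the negated character class [^\]]+, which also avoids needless backtracking. Written as a C++ string literal, the new expression is as follows: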
\\[([^\\]]+)\\]
On my regex, why did I need to put \\ for it to work?
Because the regex goes through two compilers which pay attention to backslashes: first your C++ compiler, and then the regex compiler inside the QRegExp constructor. The first backslash of the pair is for the C++ compiler; the second one is for the regex compiler. Once the C++ compiler is finished, each pair of backslashes is replaced with a single backslash, which is what the regex needs.
I got key1, but now how do I get the other 2? reg.capturedCount() returns 1
Your regular expression captures one square bracket - delimited item at a time. If you want to capture them all, you need a loop:
int pos = 0;
while ((pos = reg.indexIn(str, pos)) != -1) {
    qDebug() << "Matches: " << reg.capturedTexts();
    pos += reg.matchedLength(); // continue right after the current match
}
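The same "find, record, continue after the match" idea can be sketched without Qt. The version below uses std::regex and std::sregex_iterator rather than QRegExp, so it is an analogue and not the Qt API; extract_keys is a hypothetical name:

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects the text inside each [...] pair, e.g. "[key1][key2]" -> key1, key2.
std::vector<std::string> extract_keys(const std::string& text) {
    static const std::regex re(R"(\[([^\]]+)\])");
    std::vector<std::string> keys;
    // sregex_iterator resumes after each match, so no manual position
    // bookkeeping is needed.
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        keys.push_back((*it)[1].str());
    }
    return keys;
}
```

Called on "[key1][key2][key3]", it returns key1, key2 and key3 in order.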