Replacing characters by a modified version of them in a string - c++

I want to replace each of the characters below (or substrings, for && and ||) in an input string, using regex_replace:
+ - ! ( ) { } [ ] ^ " ~ * ? : \ && ||
How should I write the pattern when constructing the std::regex?
For example, if I have
"(1+1):2"
I want to get this output:
"\(1\+1\)\:2"
The final code would look something like this:
std::string s ("(1+1):2");
std::regex e ("???"); // what should I put here ?
std::cout << std::regex_replace (s,e,"\\$2"); // is this correct ?

You can use std::regex_replace with capture:
#include <iostream>
#include <string>
#include <regex>
using namespace std;
int main() {
regex regex_a("(\\+|-|!|\\(|\\)|\\{|\\}|\\[|\\]|\\^|\"|~|\\*|\\?|:|\\\\|&&|\\|\\|)");
cout << regex_replace("(1+1):2", regex_a, "\\$0") << endl;
}
This prints
$ ./a.out
\(1\+1\)\:2
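The same replacement can also be written with a character class for the single characters, keeping only && and || as alternatives. This is a minimal sketch, not the answer's own code: the function name escapeSpecial is hypothetical, and it uses $& (the whole match) in the replacement string.

```cpp
#include <regex>
#include <string>

// Prefix every listed special character (and the substrings && and ||)
// with a backslash. escapeSpecial is a hypothetical helper name.
std::string escapeSpecial(const std::string& input) {
    // Single characters go into one character class; the two-character
    // operators stay as separate alternatives.
    static const std::regex special(R"re([+\-!(){}\[\]^"~*?:\\]|&&|\|\|)re");
    // "$&" in the replacement stands for the whole match.
    return std::regex_replace(input, special, "\\$&");
}
```

Calling escapeSpecial("(1+1):2") should return \(1\+1\)\:2, and a lone & or | is left untouched because only the doubled forms are listed.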

extract digits from string using Regex in c++

I have written this C++ code to extract digits from mixed strings delimited by xxx and yyy.
Here is my code:
#include <iostream>
#include <regex>
using namespace std;
int main() {
string text = "xxx1111yyy xxxrandomstring2222yyy";
string start_delimiter = "xxx";
string end_delimiter = "yyy";
regex pattern(start_delimiter + "([0-9]+)" + end_delimiter);
smatch match;
while (regex_search(text, match, pattern)) {
cout << match[1] << endl;
text = match.suffix().str();
}
return 0;
}
I expect the output:
1111
2222
But the output I get is only: 1111
Where is my mistake?
As I understand it, the delimiters xxx and yyy are fixed, while randomstring is not, so it can be any random string.
So the error is simply in your regex pattern: it only allows digits between the two delimiters.
It should be something like this:
regex pattern("xxx.*?(\\d+).*?yyy");
The whole code could look like this:
#include <iostream>
#include <regex>
#include <string>
int main() {
std::string text =
"xxxrandomstring2222yyy xxx1111yyy";
std::regex pattern("xxx.*?(\\d+).*?yyy");
std::smatch match;
while (regex_search(text, match, pattern)) {
std::cout << match[1] << std::endl;
text = match.suffix().str();
}
return 0;
}
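As an alternative to reassigning text in the loop, the matches can be collected with std::sregex_iterator. This is only a sketch under the same assumptions as the answer (fixed delimiters xxx and yyy); the helper name extractDigits is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collect the first run of digits found between each "xxx" ... "yyy" pair.
// extractDigits is a hypothetical helper name used only for this sketch.
std::vector<std::string> extractDigits(const std::string& text) {
    static const std::regex pattern("xxx.*?(\\d+).*?yyy");
    std::vector<std::string> digits;
    // sregex_iterator walks over all non-overlapping matches in text.
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        digits.push_back((*it)[1].str());
    }
    return digits;
}
```

For the input from the question, "xxx1111yyy xxxrandomstring2222yyy", this should yield 1111 and 2222, and the original string is left unmodified.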

How to match two groups with different surroundings? C++

I would like to parse strings like (X->Y) or [X=>Y], and extract the X and Y parts. I did it like this:
// Example program
#include <iostream>
#include <string>
#include <regex>
int main()
{
std::string text1 = "(X->Y)";
std::string text2 = "[X=>Y]";
std::regex my_regex("\\(([A-Z]+)->([A-Z]+)\\)|\\[([A-Z]+)=>([A-Z]+)\\]");
std::smatch reg_match;
if(std::regex_match(text1, reg_match, my_regex)) {
std::cout << reg_match[1].str() << ' ' << reg_match[2].str() << std::endl;
} else {
std::cout << "Nothing" << std::endl;
}
}
It works with text1, but it gives an empty result with text2. What am I doing wrong? Why aren't X and Y in reg_match[1] and reg_match[2] when I run the code with text2?
This is because when you are matching text1, groups 1 and 2 get matched:
\\(([A-Z]+)->([A-Z]+)\\)|\\[([A-Z]+)=>([A-Z]+)\\]
   ^^^^^^^^  ^^^^^^^^
Whereas in text2, groups 3 and 4 get matched:
\\(([A-Z]+)->([A-Z]+)\\)|\\[([A-Z]+)=>([A-Z]+)\\]
                            ^^^^^^^^  ^^^^^^^^
So you have to use reg_match[3] and reg_match[4] in the case of text2.
Of course, a more versatile solution would be to check first whether reg_match[1] matched. If it did, use groups 1 and 2; otherwise, use groups 3 and 4.
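Picking the groups of whichever alternative matched can be sketched as follows. The function parseArrow and its out-parameters are hypothetical names; the check relies on the matched member of std::sub_match, which is true only for groups that took part in the match.

```cpp
#include <regex>
#include <string>

// Parse "(X->Y)" or "[X=>Y]" and store the two names in lhs and rhs.
// parseArrow and its out-parameters are hypothetical names for this sketch.
bool parseArrow(const std::string& text, std::string& lhs, std::string& rhs) {
    static const std::regex re(
        "\\(([A-Z]+)->([A-Z]+)\\)|\\[([A-Z]+)=>([A-Z]+)\\]");
    std::smatch m;
    if (!std::regex_match(text, m, re)) {
        return false;
    }
    // Only the groups of the alternative that matched have matched == true.
    const bool roundForm = m[1].matched;
    lhs = (roundForm ? m[1] : m[3]).str();
    rhs = (roundForm ? m[2] : m[4]).str();
    return true;
}
```

With this, both "(X->Y)" and "[X=>Y]" should give X and Y, while a mixed form such as "(X=>Y)" is rejected.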
As an alternative to the answer given by @Sweeper, you could rewrite your regex to have only two capture groups:
// Example program
#include <iostream>
#include <string>
#include <regex>
int main()
{
std::string text1 = "(X->Y)";
std::string text2 = "[X=>Y]";
std::regex my_regex("[[(]([A-Z]+)(?:->|=>)([A-Z]+)[)\\]]");
std::smatch reg_match;
if(std::regex_match(text1, reg_match, my_regex)) {
std::cout << reg_match[1].str() << ' ' << reg_match[2].str() << std::endl;
} else {
std::cout << "Nothing" << std::endl;
}
}
This, however, has the small disadvantage that it also matches a few more variants:
(X=>Y)
[X->Y)
(X=>Y]
etc...
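If those mixed variants must be rejected, one option is to capture the brackets and the arrow as well and check that they belong together. This is a sketch, not part of the answer above: parseStrict is a hypothetical name, and the consistency rule (round brackets go with ->, square brackets with =>) is taken from the two example strings.

```cpp
#include <regex>
#include <string>

// Accept only "(X->Y)" and "[X=>Y]" while still using a single pattern.
// parseStrict is a hypothetical name for this sketch.
bool parseStrict(const std::string& text, std::string& lhs, std::string& rhs) {
    // Groups: 1 = opening bracket, 2 = X, 3 = arrow, 4 = Y, 5 = closing bracket.
    static const std::regex re("([\\[(])([A-Z]+)(->|=>)([A-Z]+)([)\\]])");
    std::smatch m;
    if (!std::regex_match(text, m, re)) {
        return false;
    }
    // Brackets and arrow must form one of the two allowed combinations.
    const std::string shape = m[1].str() + m[3].str() + m[5].str();
    if (shape != "(->)" && shape != "[=>]") {
        return false;  // mixed variants such as "(X=>Y)" or "[X->Y)"
    }
    lhs = m[2].str();
    rhs = m[4].str();
    return true;
}
```

The trade-off is three extra capture groups and a small check after the match, in exchange for X and Y always being in the same two groups.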

Boost regex cpp for finding strings between %% with output excluding the % character itself

I am having a problem with Boost.Regex in C++. I want to match a string like
"Hello %world% regex %cpp%" and the expected output is world, cpp.
Can somebody suggest a regex for this?
Thanks
Anil
I personally prefer "\\%([^\\%]*)\\%" (or, as a raw string, R"r(\%([^\%]*)\%)r").
It doesn't rely on non-greedy quantifiers.
It is essentially:
one percent character \\%
any number of non-percent characters [^\\%]*
one percent character \\%
I know this is tagged boost but here's a solution with std::regex
#include <string>
#include <regex>
#include <iostream>
int main()
{
using namespace std;
string source = "Hello %world%";
regex match_percent_enclosed (R"_(\%([^\%]*)\%)_");
smatch between_percent;
bool found_match = regex_search(source,between_percent,match_percent_enclosed);
if(found_match && between_percent.size()>1)
cout << "found: \"" << between_percent[1].str() << "\"." << endl;
else
cout << "no match found." << endl;
}
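The program above reports only the first match. To collect every %-enclosed substring, as the question asks, the same idea can be combined with std::sregex_iterator. This is a sketch: betweenPercents is a hypothetical helper name, and the pattern drops the backslashes because % has no special meaning in the default ECMAScript grammar.

```cpp
#include <regex>
#include <string>
#include <vector>

// Return every substring enclosed in a pair of percent signs.
// betweenPercents is a hypothetical helper name for this sketch.
std::vector<std::string> betweenPercents(const std::string& source) {
    static const std::regex enclosed("%([^%]*)%");
    std::vector<std::string> found;
    // Each iterator position is one non-overlapping "%...%" match.
    for (std::sregex_iterator it(source.begin(), source.end(), enclosed), end;
         it != end; ++it) {
        found.push_back((*it)[1].str());
    }
    return found;
}
```

For "Hello %world% regex %cpp%" this should return world and cpp, in that order.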
This may give you an idea:
%(.+?)%
Result:
Match 1
1. world
Match 2
1. cpp
You can use the regex \%(.*?)\% where the non-greedy .*? captures the smallest possible group.
Online regex: https://regex101.com/r/dSCE2a/2
And here is the code with Boost:
#include <iostream>
#include <cstdlib>
#include <boost/regex.hpp>
using namespace std;
int main()
{
boost::cmatch mat;
boost::regex reg( "\\%(.*?)\\%" );
char szStr[] = "Hello %world% regex %cpp%";
const char *where = szStr;
while (regex_search(where, mat, reg))
{
cout << mat[1] << endl; // 0 for whole match, 1 for sub
where = mat[0].second; // continue right after the end of this match
}
}

Taking every character literally in RegEx

Using std::regex, I want to create a function that takes, for example, a string
and creates a RegEx from that string, but with every char of the string matched literally.
For example, let's say s("[ds-aa]"); I want to create a RegEx from that string, but literally, so that the resulting RegEx is equivalent to "\[ds\-aa\]".
Assuming you are using std::regex, and the default ECMA regex flavor, you just need to escape
. * + ? ^ $ { } ( ) | [ ] \
So, you can use
#include <regex>
#include <string>
#include <iostream>
using namespace std;
std::string regexEscape(std::string str) {
return std::regex_replace(str, std::regex(R"([.^$|()[\]{}*+?\\])"), R"(\$&)");
}
int main()
{
std::cout << "Test escaped pattern: " << regexEscape("[da-d$\\]") << std::endl; // => \[da-d\$\\\]
std::string key = "\\56";
string input = "John\\56 Fred\\12";
std::regex rx(R"((\w+))" + regexEscape(key));
smatch m;
if (std::regex_search(input, m, rx)) {
std::cout << "Who has \\56? - " << m[1].str() << std::endl;
}
}
See IDEONE demo
Results:
Test escaped pattern: \[da-d\$\\\]
Who has \56? - John

Case Sensitive Partial Match with Boost's Regex

From the following code, I expect to get this output from the corresponding input:
Input: FOO Output: Match
Input: FOOBAR Output: Match
Input: BAR Output: No Match
Input: fOOBar Output: No Match
But why does it give "No Match" for the input FOOBAR?
#include <iostream>
#include <vector>
#include <fstream>
#include <sstream>
#include <boost/regex.hpp>
using namespace std;
using namespace boost;
int main ( int arg_count, char *arg_vec[] ) {
if (arg_count !=2 ) {
cerr << "expected one argument" << endl;
return EXIT_FAILURE;
}
string InputString = arg_vec[1];
string toMatch = "FOO";
const regex e(toMatch);
if (regex_match(InputString, e,match_partial)) {
cout << "Match" << endl;
} else {
cout << "No Match" << endl;
}
return 0;
}
Update:
Finally it works with the following approach:
#include <iostream>
#include <vector>
#include <fstream>
#include <sstream>
#include <boost/regex.hpp>
using namespace std;
using namespace boost;
bool testSearchBool(const boost::regex &ex, const string st) {
cout << "Searching " << st << endl;
string::const_iterator start, end;
start = st.begin();
end = st.end();
boost::match_results<std::string::const_iterator> what;
boost::match_flag_type flags = boost::match_default;
return boost::regex_search(start, end, what, ex, flags);
}
int main ( int arg_count, char *arg_vec[] ) {
if (arg_count !=2 ) {
cerr << "expected one argument" << endl;
return EXIT_FAILURE;
}
string InputString = arg_vec[1];
string toMatch = "FOO*";
static const regex e(toMatch);
if ( testSearchBool(e,InputString) ) {
cout << "MATCH" << endl;
}
else {
cout << "NOMATCH" << endl;
}
return 0;
}
Use regex_search instead of regex_match.
If you keep regex_match, your regular expression has to account for any characters before and after the substring "FOO", for example ".*FOO.*".
Note that "FOO*" does not do this: it matches "FO" followed by zero or more "O" characters.
match_partial would only return true if the partial string was found at the end of the text input, not the beginning.
A partial match is one that matched one or more characters at the end of the text input, but did not match all of the regular expression (although it may have done so had more input been available)
So matching FOOBAR against "FOO" with regex_match and match_partial returns false.
As the other answer suggests, using regex_search lets you look for substrings directly.
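The difference between matching the whole input and searching for a substring can be sketched with std::regex, which offers regex_match and regex_search under the same names as Boost. This is an illustration under that assumption, not the Boost code from the question; matchesWhole and matchesSomewhere are hypothetical helper names.

```cpp
#include <regex>
#include <string>

// True only if the whole input matches the pattern (std::regex_match).
bool matchesWhole(const std::string& pattern, const std::string& input) {
    return std::regex_match(input, std::regex(pattern));
}

// True if any substring of the input matches the pattern (std::regex_search).
// Matching is case sensitive by default.
bool matchesSomewhere(const std::string& pattern, const std::string& input) {
    return std::regex_search(input, std::regex(pattern));
}
```

With the pattern "FOO", matchesSomewhere should accept FOO and FOOBAR and reject BAR and fOOBar, which is the behaviour the question expects, while matchesWhole rejects FOOBAR unless the pattern is extended to something like "FOO.*".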