It is supposed to match "abababab" since "ab" is repeated more than two times consecutively but the code isn't printing any output.
Is there some other trick in using regex in C++.
I tried with other languages and it works just fine.
#include<bits/stdc++.h>
int main(){
std::string s ("xaxababababaxax");
std::smatch m;
std::regex e ("(.+)\1\1+");
while (std::regex_search (s,m,e)) {
for (auto x:m) std::cout << x << " ";
std::cout << std::endl;
s = m.suffix().str();
}
return 0;
}
Your problem is your backslashes are escaping the '1''s in your string. You need to inform std::regex to treat them as '\' 's. You can do this by using a raw string R"((.+)\1\1+)", or by escaping the slashes, as shown here:
#include <regex>
#include <string>
#include <iostream>
int main(){
std::string s ("xaxababababaxax");
std::smatch m;
std::regex e ("(.+)\\1\\1+");
while (std::regex_search (s,m,e)) {
for (auto x:m) std::cout << x << " ";
std::cout << std::endl;
s = m.suffix().str();
}
return 0;
}
Which produces the output
abababab ab
Related
It is supposed to match "abababab" since "ab" is repeated more than two times consecutively but the code isn't printing any output.
Is there some other trick in using regex in C++.
I tried with other languages and it works just fine.
#include<bits/stdc++.h>
int main(){
std::string s ("xaxababababaxax");
std::smatch m;
std::regex e ("(.+)\1\1+");
while (std::regex_search (s,m,e)) {
for (auto x:m) std::cout << x << " ";
std::cout << std::endl;
s = m.suffix().str();
}
return 0;
}
Your problem is your backslashes are escaping the '1''s in your string. You need to inform std::regex to treat them as '\' 's. You can do this by using a raw string R"((.+)\1\1+)", or by escaping the slashes, as shown here:
#include <regex>
#include <string>
#include <iostream>
int main(){
std::string s ("xaxababababaxax");
std::smatch m;
std::regex e ("(.+)\\1\\1+");
while (std::regex_search (s,m,e)) {
for (auto x:m) std::cout << x << " ";
std::cout << std::endl;
s = m.suffix().str();
}
return 0;
}
Which produces the output
abababab ab
I need to get sentences with regex from String with word "walk". Now I am trying just to get sentences
std::string s ("Hello world! My name is Mike. Why so serious?");
std::smatch m;
std::regex e ("^\\s+[A-Za-z,;'\"\\s]+[.?!]$"); // matches words beginning by "sub"
while (std::regex_search (s,m,e)) {
for (auto w:m)
std::cout << w << "\n" ;
}
And this doesn't work.
Apart from start and end of the string in regex, you are forgetting to update the 's' with the suffix.
#include <iostream>
#include <regex>
int main()
{
std::string s ("Hello world! My name is Mike. Why so serious?");
std::smatch m;
std::regex e ("\\s?[A-Za-z,;'\"\\s]+[.?!]");
while(std::regex_search (s,m,e))
{
std::cout << m.str() << "\n" ;
s = m.suffix();
}
return 0;
}
I'm reading a text file in the form of
People list
[Jane]
Female
31
...
and for each line I want to loop through and find the line that contains "[...]"
For example, [Jane]
I came up with the regex expression
"(^[\w+]$)"
which I tested that it works using regex101.com.
However, when I try to use that in my code, it fails to match with anything.
Here's my code:
void Jane::JaneProfile() {
// read each line, for each [title], add the next lines into its array
std::smatch matches;
for(int i = 0; i < m_numberOfLines; i++) { // #lines in text file
std::regex pat ("(^\[\w+\]$)");
if(regex_search(m_lines.at(i), matches, pat)) {
std::cout << "smatch " << matches.str(0) << std::endl;
std::cout << "smatch.size() = " << matches.size() << std::endl;
} else
std::cout << "wth" << std::endl;
}
}
When I run this code, all the lines go to the else loop and nothing matches...
I searched up for answers, but I got confused when I saw that for C++ you have to use double backslashes instead one backslash to escape... But it didn't work for my code even when I used double backslashes...
Where did I go wrong?
By the way, I'm using Qt Creator 3.6.0 Based on (Desktop) Qt 5.5.1 (Clang 6.1 (Apple), 64 bit)
---Edit----
I tried doing:
std::regex pat (R"(^\[\\w+\]$)");
But I get an error saying
Use of undeclared identifier 'R'
I already have #include <regex> but do I need to include something else?
Either escape the backslashes or use the raw character version with a prefix that won't appear in the regex:
escaped:
std::regex pat("^\\[\\w+\\]$");
raw character string:
std::regex pat(R"regex(^\[\w+\]$)regex");
working demo (adapted from OPs posted code):
#include <iostream>
#include <regex>
#include <sstream>
#include <string>
#include <vector>
int main()
{
auto test_data =
"People list\n"
"[Jane]\n"
"Female\n"
"31";
// initialise test data
std::istringstream source(test_data);
std::string buffer;
std::vector<std::string> lines;
while (std::getline(source, buffer)) {
lines.push_back(std::move(buffer));
}
// test the regex
// read each line, for each [title], add the next lines into its array
std::smatch matches;
for(int i = 0; i < lines.size(); ++i) { // #lines in text file
static const std::regex pat ("(^\\[\\w+\\]$)");
if(regex_search(lines.at(i), matches, pat)) {
std::cout << "smatch " << matches.str() << std::endl;
std::cout << "smatch.size() = " << matches.size() << std::endl;
} else
std::cout << "wth" << std::endl;
}
return 0;
}
expected output:
wth
smatch [Jane]
smatch.size() = 2
wth
wth
I have this sample code
// regex_search example
#include <iostream>
#include <string>
#include <regex>
int main ()
{
std::string s ("eritueriotu3498 \"pi656\" sdfs3646df");
std::smatch m;
std::string reg("\\(?<=pi\\)\\(\\d+\\)\\(?=\"\\)");
std::regex e (reg);
std::cout << "Target sequence: " << s << std::endl;
std::cout << "The following matches and submatches were found:" << std::endl;
while (std::regex_search (s,m,e)) {
for (auto x:m) std::cout << x << " ";
std::cout << std::endl;
s = m.suffix().str();
}
return 0;
}
I need to get number between pi and " -> (piMYNUMBER")
In online regex service my regex works fine (?<=pi)(\d+)(?=") but c++ regex don't match anything.
Who knows what is wrong with my expression?
Best regards
That is correct, C++ std::regex flavors do not support lookbehinds. You need to capture the digits between pi and ":
#include <iostream>
#include <vector>
#include <regex>
int main() {
std::string s ("eritueriotu3498 \"pi656\" sdfs3646df");
std::smatch m;
std::string reg("pi(\\d+)\""); // Or, with a raw string literal:
// std::string reg(R"(pi(\d+)\")");
std::regex e (reg);
std::vector<std::string> results(std::sregex_token_iterator(s.begin(), s.end(), e, 1),
std::sregex_token_iterator());
// Demo printing the results:
std::cout << "Number of matches: " << results.size() << std::endl;
for( auto & p : results ) std::cout << p << std::endl;
return 0;
}
See the C++ demo. Output:
Number of matches: 1
656
Here, pi(\d+)" pattern matches
pi - a literal substring
(\d+) - captures 1+ digits into Group 1
" - consumes a double quote.
Note the fourth argument to std::sregex_token_iterator, it is 1 because you need to collect only Group 1 values.
I am trying to match a literal number, e.g. 1600442 using a set of regular expressions in Microsoft Visual Studio 2010. My regular expressions are simply:
1600442|7654321
7895432
The problem is that both of the above matches the string.
Implementing this in Python gives the expected result:
import re
serial = "1600442"
re1 = "1600442|7654321"
re2 = "7895432"
m = re.match(re1, serial)
if m:
print "found for re1"
print m.groups()
m = re.match(re2, serial)
if m:
print "found for re2"
print m.groups()
Gives output
found for re1
()
Which is what I expected. Using this code in C++ however:
#include <string>
#include <iostream>
#include <regex>
int main(){
std::string serial = "1600442";
std::tr1::regex re1("1600442|7654321");
std::tr1::regex re2("7895432");
std::tr1::smatch match;
std::cout << "re1:" << std::endl;
std::tr1::regex_search(serial, match, re1);
for (auto i = 0;i <match.length(); ++i)
std::cout << match[i].str().c_str() << " ";
std::cout << std::endl << "re2:" << std::endl;
std::tr1::regex_search(serial, match, re2);
for (auto i = 0;i <match.length(); ++i)
std::cout << match[i].str().c_str() << " ";
std::cout << std::endl;
std::string s;
std::getline (std::cin,s);
}
gives me:
re1:
1600442
re2:
1600442
which is not what I expected. Why do I get match here?
The smatch does not get overwritten by the second call to regex_search thus, it is left intact and contains the first results.
You can move the regex searching code to a separate method:
void FindMeText(std::regex re, std::string serial)
{
std::smatch match;
std::regex_search(serial, match, re);
for (auto i = 0;i <match.length(); ++i)
std::cout << match[i].str().c_str() << " ";
std::cout << std::endl;
}
int main(){
std::string serial = "1600442";
std::regex re1("^(?:1600442|7654321)");
std::regex re2("^7895432");
std::cout << "re1:" << std::endl;
FindMeText(re1, serial);
std::cout << "re2:" << std::endl;
FindMeText(re2, serial);
std::cout << std::endl;
std::string s;
std::getline (std::cin,s);
}
Result:
Note that Python re.match searches for the pattern match at the start of string only, thus I suggest using ^ (start of string) at the beginning of each pattern.