C++ regex: how to use sub-matches

This code outputs 192.168.1.105, but I want it to find each numeric part of the IP address. The desired output is:
192
168
1
105
Since ip_result only has one sub-match (192.168.1.105), how do I get four sub-matches, one for each numeric part?
#include <iostream>
#include <regex>
#include <string>
std::regex ip_reg("\\d{1,3}."
"\\d{1,3}."
"\\d{1,3}."
"\\d{1,3}");
void print_results(const std::string& ip) {
std::smatch ip_result;
if (std::regex_match(ip, ip_result, ip_reg))
for (auto pattern : ip_result)
std::cout << pattern << std::endl;
else
std::cout << "No match!" << std::endl;
}
int main() {
const std::string ip_str("192.168.1.105");
print_results(ip_str);
}

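I rewrote ip_reg to use capture groups (and escaped the dots so they match only a literal '.'), and print_results to use iterators, skipping the first element because it holds the whole match: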
std::regex ip_reg("(\\d{1,3})\\."
"(\\d{1,3})\\."
"(\\d{1,3})\\."
"(\\d{1,3})");
void print_results(const std::string& ip) {
std::smatch ip_result;
if (std::regex_match(ip, ip_result, ip_reg)) {
std::smatch::iterator ip_it = ip_result.begin();
for (std::advance(ip_it, 1);
ip_it != ip_result.end();
std::advance(ip_it, 1))
std::cout << *ip_it << std::endl;
} else
std::cout << "No match!" << std::endl;
}

If you replace std::regex_match with std::regex_search, call it in a loop, and drop the already-matched part of the string after each iteration, you can visit every number in turn. The expression then only needs to match a single run of digits:
std::regex ip_reg{ "\\d{1,3}" };
void print_results(const std::string& ip_str) {
std::string ip = ip_str; //make a copy!
std::smatch ip_result;
while (std::regex_search(ip, ip_result, ip_reg)){ //loop
std::cout << ip_result[0] << std::endl;
ip = ip_result.suffix(); //remove "192", then "168" ...
}
}
output:
192
168
1
105

Related

Can't match curly brackets using regex_search [duplicate]

It is supposed to match "abababab", since "ab" is repeated more than two times consecutively, but the code isn't printing any output.
Is there some other trick to using regex in C++?
I tried the same pattern in other languages and it works just fine.
#include<bits/stdc++.h>
int main(){
std::string s ("xaxababababaxax");
std::smatch m;
std::regex e ("(.+)\1\1+");
while (std::regex_search (s,m,e)) {
for (auto x:m) std::cout << x << " ";
std::cout << std::endl;
s = m.suffix().str();
}
return 0;
}
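The problem is that, inside an ordinary C++ string literal, \1 is an octal escape sequence: the compiler turns it into the character with code 1 before std::regex ever sees the pattern, so the regex contains no backreference. The backslashes themselves have to reach std::regex. You can do this with a raw string, R"((.+)\1\1+)", or by escaping the backslashes, as shown here: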
#include <regex>
#include <string>
#include <iostream>
int main(){
std::string s ("xaxababababaxax");
std::smatch m;
std::regex e ("(.+)\\1\\1+");
while (std::regex_search (s,m,e)) {
for (auto x:m) std::cout << x << " ";
std::cout << std::endl;
s = m.suffix().str();
}
return 0;
}
This produces the output:
abababab ab

How to match two groups with different surroundings? C++

I would like to parse strings like (X->Y) or [X=>Y], and extract the X and Y parts. I did it like this:
// Example program
#include <iostream>
#include <string>
#include <regex>
int main()
{
std::string text1 = "(X->Y)";
std::string text2 = "[X=>Y]";
std::regex my_regex("\\(([A-Z]+)->([A-Z]+)\\)|\\[([A-Z]+)=>([A-Z]+)\\]");
std::smatch reg_match;
if(std::regex_match(text1, reg_match, my_regex)) {
std::cout << reg_match[1].str() << ' ' << reg_match[2].str() << std::endl;
} else {
std::cout << "Nothing" << std::endl;
}
}
It works with text1, but it gives an empty result with text2. What am I doing wrong? Why aren't X and Y in reg_match[1] and reg_match[2] when I run the code with text2?
This is because when you match text1, groups 1 and 2 participate in the match:
\\(([A-Z]+)->([A-Z]+)\\)|\\[([A-Z]+)=>([A-Z]+)\\]
    ^^^^^^    ^^^^^^
Whereas for text2, groups 3 and 4 participate:
\\(([A-Z]+)->([A-Z]+)\\)|\\[([A-Z]+)=>([A-Z]+)\\]
                             ^^^^^^    ^^^^^^
So you have to use reg_match[3] and reg_match[4] in the case of text2.
Of course, a more versatile solution would be to check first whether reg_match[1] participated in the match (for example with reg_match[1].matched). If it did, use groups 1 and 2; otherwise, use groups 3 and 4.
As an alternative to the answer by @Sweeper, you could rewrite your regex to have only two capture groups:
// Example program
#include <iostream>
#include <string>
#include <regex>
int main()
{
std::string text1 = "(X->Y)";
std::string text2 = "[X=>Y]";
std::regex my_regex("[[(]([A-Z]+)(?:->|=>)([A-Z]+)[)\\]]");
std::smatch reg_match;
if(std::regex_match(text1, reg_match, my_regex)) {
std::cout << reg_match[1].str() << ' ' << reg_match[2].str() << std::endl;
} else {
std::cout << "Nothing" << std::endl;
}
}
This however has the small disadvantage that it will also match a few more variants:
(X=>Y)
[X->Y)
(X=>Y]
etc...

C++ regex library

I have this sample code
// regex_search example
#include <iostream>
#include <string>
#include <regex>
int main ()
{
std::string s ("eritueriotu3498 \"pi656\" sdfs3646df");
std::smatch m;
std::string reg("\\(?<=pi\\)\\(\\d+\\)\\(?=\"\\)");
std::regex e (reg);
std::cout << "Target sequence: " << s << std::endl;
std::cout << "The following matches and submatches were found:" << std::endl;
while (std::regex_search (s,m,e)) {
for (auto x:m) std::cout << x << " ";
std::cout << std::endl;
s = m.suffix().str();
}
return 0;
}
I need to get the number between pi and " (as in piMYNUMBER").
In an online regex tester my regex (?<=pi)(\d+)(?=") works fine, but the C++ regex doesn't match anything.
Does anyone know what is wrong with my expression?
C++ std::regex grammars do not support lookbehinds; the default ECMAScript grammar only offers lookaheads. Instead, capture the digits between pi and ":
#include <iostream>
#include <regex>
#include <string>
#include <vector>
int main() {
std::string s ("eritueriotu3498 \"pi656\" sdfs3646df");
std::smatch m;
std::string reg("pi(\\d+)\""); // Or, with a raw string literal:
// std::string reg(R"(pi(\d+)\")");
std::regex e (reg);
std::vector<std::string> results(std::sregex_token_iterator(s.begin(), s.end(), e, 1),
std::sregex_token_iterator());
// Demo printing the results:
std::cout << "Number of matches: " << results.size() << std::endl;
for( auto & p : results ) std::cout << p << std::endl;
return 0;
}
Output:
Number of matches: 1
656
Here, the pattern pi(\d+)" matches:
pi - a literal substring
(\d+) - one or more digits, captured into Group 1
" - a double quote, which is consumed but not captured.
Note the fourth argument to std::sregex_token_iterator: it is 1 because you only need to collect the Group 1 values.

return boost::smatch and get substring "\000"

Here is my code. I get garbled output when I extract the boost::regex_search call into a function match():
boost::smatch match() {
std::string s = "foobar";
std::string re_s = "f(oo)(b)ar";
boost::regex re(re_s);
boost::smatch what;
if (boost::regex_search(s, what, re)) {
return what;
}
}
int main(int argc, char **argv) {
boost::smatch what = match();
std::cout << what.size() << std::endl;
std::cout << what[0] << std::endl;
std::cout << what[1] << std::endl;
std::cout << what[2] << std::endl;
return (0);
};
the output is:
3
\000\000\000\000\000
\000\000
\000
How can I make what[n] return the real string?
boost::smatch contains string::iterator values for tracking the matches internally. You are matching against a string object that is on the stack. When the match() function returns, that string is destructed and the iterators become invalid. Try moving the string s to the main() function and passing it into match() as a reference.
In Boost, smatch::operator[](int index) returns a const_reference, which is a typedef for sub_match<BidirectionalIterator>. A sub_match is only a pair of iterators into the searched string. It converts to std::string, and operator<< prints the same characters, but both read through those iterators, so the searched string must still be alive at that point. Copying what[0] into a std::string while the searched string exists gives you a value that stays valid afterwards.
The code below keeps the searched string in main() for that reason:
#include <iostream>
#include <string>
#include <boost/regex.hpp>
// s is taken by reference so the iterators stored in the result stay valid.
boost::smatch match(const std::string& s) {
boost::regex re("f(oo)(b)ar");
boost::smatch what;
boost::regex_search(s, what, re); // what stays empty if nothing matches
return what;
}
int main() {
const std::string s = "foobar"; // must outlive every use of what
boost::smatch what = match(s);
std::cout << what.size() << std::endl;
std::string what0 = what[0]; // sub_match converts to std::string
std::cout << what0 << std::endl;
std::cout << what[1] << std::endl;
std::cout << what[2] << std::endl;
return 0;
}
If you only need basic regular expressions, std::regex_search can replace boost::regex_search. std::smatch also stores iterators into the searched string, so the string is again passed in by reference. The following works well:
#include <iostream>
#include <regex>
#include <string>
std::smatch match(const std::string& s) {
std::regex re("f(oo)(b)ar");
std::smatch what;
std::regex_search(s, what, re); // what stays empty if nothing matches
return what;
}
int main() {
const std::string s = "foobar"; // must outlive every use of what
std::smatch what = match(s);
std::cout << what.size() << std::endl;
std::cout << what[0].str() << std::endl;
std::cout << what[1].str() << std::endl;
std::cout << what[2].str() << std::endl;
return 0;
}
