regex_match not returning true [duplicate] - c++

This question already has an answer here:
Regex not working as expected with C++ regex_match
(1 answer)
Closed 4 days ago.
I am very confused about why this regex match in C++ is not working.
#include <iostream>
#include <regex>
#include <string>
void test_code(){
const std::string test_string("this is a test of test");
const std::regex match_regex("test");
std::cout<<test_string<<std::endl;
std::smatch match;
if (std::regex_match(test_string, match, match_regex)){
std::cout<<match.size()<<std::endl;
}
}
int main() {
test_code();
}
I read the CPP reference documentation and tried to write a simple regex check. I am not sure why this is not working (i.e. it's not returning true for the std::regex_match(...) call).

As stated in the documentation for std::regex_match() (emphasis is mine):
Determines if the regular expression e matches the entire target character sequence, which may be specified as std::string, a C-string, or an iterator pair.
and your regex pattern obviously does not match the whole string. So you either need to change your regex to something like ".*test.*", or use std::regex_search() if you want to check whether a substring matches:
Determines if there is a match between the regular expression e and some subsequence in the target character sequence.

Related

Validating an NMEA sentence using C++ [duplicate]

This question already has answers here:
Regex statement in C++ isn't working as expected [duplicate]
(3 answers)
Closed 3 years ago.
I need help creating a regular expression for NMEA sentences, because I want to validate whether the data is a correctly formed NMEA sentence, using C++. Below are some example NMEA sentences in GLL form. If possible, I would also like a C++ sample that validates them.
$GPGLL,5425.32,N,106.92,W,82808*64
$GPGLL,5425.33,N,106.91,W,82826*6a
$GPGLL,5425.32,N,106.9,W,82901*5e
$GPGLL,5425.32,N,106.89,W,82917*61
I have also included the expression I tried, which I found online. But when I run it, it says unknown escape sequence.
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main()
{
// Target sequence
string s = "$GPGLL, 54 30.49, N, 1 06.74, W, 16 39 58 *5E";
// An object of regex for pattern to be searched
regex r("[A-Z] \w+,\d,\d,(?:\d{1}|),[A-B],[^,]+,0\*([A-Za-z0-9]{2})");
// flag type for determining the matching behavior
// here it is for matches on 'string' objects
smatch m;
// regex_search() for searching the regex pattern
// 'r' in the string 's'. 'm' is flag for determining
// matching behavior.
regex_search(s, m, r);
// for each loop
for (auto x : m)
cout << "The nmea sentence is correct ";
return 0;
}
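Inside an ordinary string literal, the C++ compiler treats a backslash as the start of an escape sequence, so it tries to interpret \d, \w and friends as character escape codes before the regex engine ever sees them. They are not valid escapes, hence the "unknown escape sequence" message.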
Either double the backslashes:
regex r("[A-Z] \\w+,\\d,\\d,(?:\\d{1}|),[A-B],[^,]+,0\\*([A-Za-z0-9]{2})");
or use a raw literal:
regex r(R"re([A-Z] \w+,\d,\d,(?:\d{1}|),[A-B],[^,]+,0\*([A-Za-z0-9]{2}))re");

Is regex match guaranteed to always only look out for the last pattern? C++ [duplicate]

This question already has answers here:
My regex is matching too much. How do I make it stop? [duplicate]
(5 answers)
Closed 4 years ago.
Assume I have a string like this:
"a-b-c-d"
n = 4 sequences separated by "-".
Now I want to receive the first n - 1 sequences ("a-b-c") and the last sequence - ("d").
I can achieve this with the following code:
std::string string{ "a-b-c-d" };
std::regex reg{ "^(.*)-(.*)$" };
std::smatch match;
std::regex_match(string, match, reg);
std::cout << match.str(1) << '\n';
std::cout << match.str(2) << '\n';
producing the expected output:
a-b-c
d
However, following the pure logical grammar of this regex ("^(.*)-(.*)$")
a
b-c-d
or
a-b
c-d
could also be valid matches of the string. After all, (.*) could be interpreted differently here, and the first (.*) could decide to stop at the first character sequence, or the second, etc.
So my question: is std::smatch guaranteed to behave this way? Does std::smatch always explicitly look for the last pattern when given the option to capture with (.*)? Is there a way to tell std::smatch to look for the first occurrence rather than the last?
* is greedy. So the first (.*) matches as much as it can while still leaving the rest of the pattern (the - and the second (.*)) something to match. Under these rules there is only one correct match, and it is the one you want.
If you want the first group to be matched non-greedily, add a ? after the *:
^(.*?)-(.*)$
For your example input a-b-c-d this leaves you with a in the first capture group and b-c-d in the second.

What is returned in std::smatch and how are you supposed to use it?

string "I am 5 years old"
regex "(?!am )\d"
If you go to http://regexr.com/ and apply the regex to the string, you'll get 5.
I would like to get this result with std::regex, but I do not understand how to use match results and probably regex has to be changed as well.
std::regex expression("(?!am )\\d");
std::smatch match;
std::string what("I am 5 years old.");
if (regex_search(what, match, expression))
{
//???
}
The std::smatch is an instantiation of the match_results class template for matches on string objects (with string::const_iterator as its iterator type). The members of this class are those described for match_results, but using string::const_iterator as its BidirectionalIterator template parameter.
std::match_results supports an operator[]:
If n > 0 and n < size(), returns a reference to the std::sub_match representing the part of the target sequence that was matched by the nth captured marked subexpression.
If n == 0, returns a reference to the std::sub_match representing the part of the target sequence matched by the entire matched regular expression.
If n >= size(), returns a reference to a std::sub_match representing an unmatched sub-expression (an empty subrange of the target sequence).
In your case, regex_search finds the first match only, and then match[0] holds the entire match text, match[1] would contain the text captured with the first capturing group (the first parenthesized pattern part), etc. In this case though, your regex does not contain capturing groups.
You need to use a capturing mechanism here, since std::regex does not support lookbehind. You used a lookahead, which checks the text that immediately follows the current location, so the regex you have is not doing what you think it is.
So, use the following code:
#include <regex>
#include <string>
#include <iostream>
using namespace std;
int main() {
std::regex expression(R"(am\s+(\d+))");
std::smatch match;
std::string what("I am 5 years old.");
if (regex_search(what, match, expression))
{
cout << match.str(1) << endl;
}
return 0;
}
Here, the pattern is am\s+(\d+). It matches am, one or more whitespace characters, and then captures one or more digits with (\d+). Inside the code, match.str(1) gives access to the value captured by the capturing group. As there is only one (...) in the pattern, i.e. one capturing group, its ID is 1. So, str(1) returns the text captured into this group.
The raw string literal (R"(...)") allows using a single backslash for regex escapes (like \d, \s, etc.).

C++11 regex to match nothing [duplicate]

This question already has answers here:
A Regex that will never be matched by anything
(30 answers)
Closed 9 years ago.
I encountered a task which requires a regex that matches nothing.
The C++ reference happily states that it already has such a thing:
http://en.cppreference.com/w/cpp/regex/basic_regex/basic_regex
1) Default constructor. Constructs an empty regular expression which will match nothing.
But in reality (clang 3.3) it happens not to be the case:
#include <iostream>
#include <string>
#include <regex>
int main(int argc, const char *argv[]) {
std::regex re1;
std::regex re2("");
std::smatch rt1, rt2;
const std::string empty; // named string: an smatch must not refer to a temporary
bool r1 = std::regex_match(empty, rt1, re1);
bool r2 = std::regex_match(empty, rt2, re2);
std::cerr << "r1:" << r1 << ", r2:" << r2 << std::endl;
}
This program prints: r1:1, r2:1
which means that both regexes matched the empty string.
Any idea what is wrong here, and how to create a "match nothing" regex?
The default constructor for std::basic_regex constructs a regular expression that "does not match any character sequence". [re.regex.construct]/1. If your implementation matches an empty character sequence it's wrong.

Put first boost::regex match into a string [duplicate]

This question already has an answer here:
Get last match with Boost::Regex
(1 answer)
Closed 9 years ago.
Somehow, I've failed to find out how to put only the first occurrence of a regular expression match into a string. I can create a regex object:
static const boost::regex e("<(From )?([A-Za-z0-9_]+)>(.*?)");
Now, I need to match ([A-Za-z0-9_]+) to std::string, say playername.
std::string chat_input("<Darker> Hello");
std::string playername = e.some_match_method(chat_input, 1); //Get contents of the second (...)
What have I missed?
What should be instead of some_match_method and what parameters should it take?
You can do something like this:
static const regex e("<(From )?([A-Za-z0-9_]+)>(.*?)");
string chat_input("<Darker> Hello");
smatch mr;
string playername;
if (regex_search(chat_input, mr, e)) {
playername = mr[2].str(); //Get contents of the second (...)
}
Please note that std::regex has been part of the standard library since C++11, so you don't need Boost for this unless your regular expression is complex (standard library implementations can still have difficulties processing complex regular expressions).
I think what you're missing is that boost::regex is the regular expression, but it doesn't do the parsing against a given input. You need to actually use it as a parameter to boost::regex_search or boost::regex_match, which evaluate a string (or iterator pairs) against the regular expression.
static const boost::regex e("<(From )?([A-Za-z0-9_]+)>(.*?)");
std::string chat_input("<Darker> Hello");
boost::match_results<std::string::const_iterator> results;
if (boost::regex_match(chat_input, results, e))
{
std::string playername = results[2]; //Get contents of the second (...)
}