C++ std::regex track 2 - c++

I want to extract track 2 data from a string, using std::regex in C++.
I have a piece of code, but it does not work. This is the code:
std::string buff("this is ateststring;5581123456781323=160710212423468?hjks");
std::regex e (";\d{0,19}=\d{7}\w*\?", std::regex_constants::basic);
if(std::regex_match(buff, e))
{
cout << "Found!";
}

From the regex_match documentation:
The entire target sequence must match the regular expression for this function to return true (i.e., without any additional characters before or after the match). For a function that returns true when the match is only part of the sequence, see regex_search.
So try using regex_search instead:
std::regex e (";\\d{0,19}=\\d{7}\\w*\\?", std::regex_constants::basic);
if(std::regex_search(buff, e))
{
cout << "Found!";
}

Related

C++: Matching regex, what is in smatch? [duplicate]

This question already has an answer here:
What is returned in std::smatch and how are you supposed to use it?
(1 answer)
Closed 2 years ago.
I'm using a modified regex example from Stroustrup C++ 4th Ed. Page 127 & 128. I'm trying to understand what is in the vector smatch matches.
$ ./a.out
AB00000-0000
AB00000-0000.-0000.
$ ./a.out
AB00000
AB00000..
It seems like the matches in parenthesis () appear in match[1], match[2], ... which the total match appears in match[0].
Appreciate any insight into this.
#include <iostream>
#include <regex>
using namespace std;
int main(int argc, char *argv[])
{
// ZIP code pattern: XXddddd-dddd and variants
regex pat (R"(\w{2}\s*\d{5}(-\d{4})?)");
for (string line; getline(cin,line);) {
smatch matches; // matched strings go here
if (regex_search(line, matches, pat)) { // search for pat in line
for (auto p : matches) {
cout << p << ".";
}
}
cout << endl;
}
return 0;
}
The type of matches is a std::match_results, not a vector, but it does have an operator[].
From the reference:
If n == 0, returns a reference to the std::sub_match representing the part of the target sequence matched by the entire matched regular expression.
If n > 0 and n < size(), returns a reference to the std::sub_match representing the part of the target sequence that was matched by the nth captured marked subexpression).
where n is the argument to operator[]. So matches[0] contains the entire matched expression, and matches[1], matches[2], ... contain consecutive capture group expressions.

Regex in C++11 vs PHP

I'm new to regex and C++11. In order to match an expression like this :
TYPE SIZE NUMBER ("regina s x99");
I built a regex which looks like this one :
\b(regina|margarita|americaine|fantasia)\b \b(s|l|m|xl|xxl)\b x([1-9])([0-9])
In my code I did this to try the regex :
std::string s("regina s x99");
std::regex rgx($RGX); //$RGX corresponds to the regex above
if (std::regex_match(s, rgx))
std::cout << "It works !" << std::endl;
This code throw a std::regex_error, but I don't know where it comes from..
Thanks,
This works with g++ (4.9.2) in c++11 mode:
std::regex rgx("\\b(regina|margarita|americaine|fantasia)\\b\\s*(s|l|m|xl|xxl)\\b\\s*x([1-9]*[0-9])");
This will capture three groups: regina s 99 which matches the TYPE SIZE NUMBER pattern, while your original captured four groups regina s 9 9 and had the NUMBER as two values (maybe that was what you wanted though).
Demo on IdeOne
In C++ strings the \ character is special and needs to be escaped so that it gets passed to the regular expression engine, not interpreted by the compiler.
So you either need to use \\b:
std::regex rgx("\\b(regina|margarita|americaine|fantasia)\\b \\b(s|l|m|xl|xxl)\\b x([1-9])([0-9])");
or use a raw string, which means that \ is not special and doesn't need to be escaped:
std::regex rgx(R"(\b(regina|margarita|americaine|fantasia)\b \b(s|l|m|xl|xxl)\b x([1-9])([0-9]))");
There was a typo in this line in original question:
if (std::reegex_match(s, rgx))
More over I am not sure what are you passing with this variable : $RGX
Corrected program as follows:
#include<regex>
#include<iostream>
int main()
{
std::string s("regina s x99");
std::regex rgx("\\b(regina|margarita|americaine|fantasia)\\b \\s*(s|l|m|xl|xxl)\\b \\s*x([1-9])([0-9])"); //$RGX corresponds to the regex above
if (std::regex_match(s, rgx))
std::cout << "It works !" << std::endl;
else
std::cout<<"No Match"<<std::endl;
}

C++ RegExp and placeholders

I'm on C++11 MSVC2013, I need to extract a number from a file name, for example:
string filename = "s 027.wav";
If I were writing code in Perl, Java or Basic, I would use a regular expression and something like this would do the trick in Perl5:
filename ~= /(\d+)/g;
and I would have the number "027" in placeholder variable $1.
Can I do this in C++ as well? Or can you suggest a different method to extract the number 027 from that string? Also, I should convert the resulting numerical string into an integral scalar, I think atoi() is what I need, right?
You can do this in C++, as of C++11 with the collection of classes found in regex. It's pretty similar to other regular expressions you've used in other languages. Here's a no-frills example of how you might search for the number in the filename you posted:
const std::string filename = "s 027.wav";
std::regex re = std::regex("[0-9]+");
std::smatch matches;
if (std::regex_search(filename, matches, re)) {
std::cout << matches.size() << " matches." << std::endl;
for (auto &match : matches) {
std::cout << match << std::endl;
}
}
As far as converting 027 into a number, you could use atoi (from cstdlib) like you mentioned, but this will store the value 27, not 027. If you want to keep the 0 prefix, I believe you will need to keep this as a string. match above is a sub_match so, extract a string and convert to a const char* for atoi:
int value = atoi(match.str().c_str());
Ok, I solved using std::regex which for some reason I couldn't get to work properly when trying to modify the examples I found around the web. It was simpler than I thought. This is the code I wrote:
#include <regex>
#include <string>
string FileName = "s 027.wav";
// The search object
smatch m;
// The regexp /\d+/ works in Perl and Java but for some reason didn't work here.
// With this other variation I look for exactly a string of 1 to 3 characters
// containing only numbers from 0 to 9
regex re("[0-9]{1,3}");
// Do the search
regex_search (FileName, m, re);
// 'm' is actually an array where every index contains a match
// (equally to $1, $2, $2, etc. in Perl)
string sMidiNoteNum = m[0];
// This casts the string to an integer number
int MidiNote = atoi(sMidiNoteNum.c_str());
Here is an example using Boost, substitute the proper namespace and it should work.
typedef std::string::const_iterator SITR;
SITR start = str.begin();
SITR end = str.end();
boost::regex NumRx("\\d+");
boost::smatch m;
while ( boost::regex_search ( start, end, m, NumRx ) )
{
int val = atoi( m[0].str().c_str() )
start = m[0].second;
}

c++ regex substring wrong pattern found

I'm trying to understand the logic on the regex in c++
std::string s ("Ni Ni Ni NI");
std::regex e ("(Ni)");
std::smatch sm;
std::regex_search (s,sm,e);
std::cout << "string object with " << sm.size() << " matches\n";
This form shouldn't give me the number of substrings matching my pattern? Because it always give me 1 match and it says that the match is [Ni , Ni]; but i need it to find every single pattern; they should be 3 and like this [Ni][Ni][Ni]
The function std::regex_search only returns the results for the first match found in your string.
Here is a code, merged from yours and from cplusplus.com. The idea is to search for the first match, analyze it, and then start again using the rest of the string (that is to say, the sub-string that directly follows the match that was found, which can be retrieved thanks to match_results::suffix ).
Note that the regex has two capturing groups (Ni*) and ([^ ]*).
std::string s("the knights who say Niaaa and Niooo");
std::smatch m;
std::regex e("(Ni*)([^ ]*)");
while (std::regex_search(s, m, e))
{
for (auto x : m)
std::cout << x.str() << " ";
std::cout << std::endl;
s = m.suffix().str();
}
This gives the following output:
Niaaa Ni aaa
Niooo Ni ooo
As you can see, for every call to regex_search, we have the following information:
the content of the whole match,
the content of every capturing group.
Since we have two capturing groups, this gives us 3 strings for every regex_search.
EDIT: in your case if you want to retrieve every "Ni", all you need to do is to replace
std::regex e("(Ni*)([^ ]*)");
with
std::regex e("(Ni)");
You still need to iterate over your string, though.

Get String Between 2 Strings

How can I get a string that is between two other declared strings, for example:
String 1 = "[STRING1]"
String 2 = "[STRING2]"
Source:
"832h0ufhu0sdf4[STRING1]I need this text here[STRING2]afyh0fhdfosdfndsf"
How can I get the "I need this text here"?
Since this is homework, only clues:
Find index1 of occurrence of String1
Find index2 of occurrence of String2
Substring from index1+lengthOf(String1) (inclusive) to index2 (exclusive) is what you need
Copy this to a result buffer if necessary (don't forget to null-terminate)
Might be a good case for std::regex, which is part of C++11.
#include <iostream>
#include <string>
#include <regex>
int main()
{
using namespace std::string_literals;
auto start = "\\[STRING1\\]"s;
auto end = "\\[STRING2\\]"s;
std::regex base_regex(start + "(.*)" + end);
auto example = "832h0ufhu0sdf4[STRING1]I need this text here[STRING2]afyh0fhdfosdfndsf"s;
std::smatch base_match;
std::string matched;
if (std::regex_search(example, base_match, base_regex)) {
// The first sub_match is the whole string; the next
// sub_match is the first parenthesized expression.
if (base_match.size() == 2) {
matched = base_match[1].str();
}
}
std::cout << "example: \""<<example << "\"\n";
std::cout << "matched: \""<<matched << "\"\n";
}
Prints:
example: "832h0ufhu0sdf4[STRING1]I need this text here[STRING2]afyh0fhdfosdfndsf"
matched: "I need this text here"
What I did was create a program that creates two strings, start and end that serve as my start and end matches. I then use a regular expression string that will look for those, and match against anything in-between (including nothing). Then I use regex_match to find the matching part of the expression, and set matched as the matched string.
For more info, see http://en.cppreference.com/w/cpp/regex and http://en.cppreference.com/w/cpp/regex/regex_search
Use strstr http://www.cplusplus.com/reference/clibrary/cstring/strstr/ , with that function you will get 2 pointers, now you should compare them (if pointer1 < pointer2) if so, read all chars between them.