C++ regex: how to extract given char values

How do I extract the letter and digit values?
std::regex legit_command("^\\([A-Z]+[0-9]+\\-[A-Z]+[0-9]+\\)$");
std::string input;
Let's say the user keys in
(AA11-BB22)
I want to get
first_character = "aa"
first_number = 11
second_character = "bb"
second_number = 22

You could use capture groups. In the example below I replaced (AA11+BB22) with (AA11-BB22) to match the regex you posted. Note that regex_match only succeeds if the entire string matches the pattern, so the beginning-of-line and end-of-line assertions (^ and $) are not required.
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main() {
const string input = "(AA11-BB22)";
const regex legit_command("\\(([A-Z]+)([0-9]+)-([A-Z]+)([0-9]+)\\)");
smatch matches;
if(regex_match(input, matches, legit_command)) {
cout << "first_character " << matches[1] << endl;
cout << "first_number " << matches[2] << endl;
cout << "second_character " << matches[3] << endl;
cout << "second_number " << matches[4] << endl;
}
}
Output:
$ c++ main.cpp && ./a.out
first_character AA
first_number 11
second_character BB
second_number 22
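The question asks for lower-case letters and numeric values, while the captures above are upper-case substrings. A minimal sketch of that last step, assuming ASCII input; the names Command, to_lower and parse_command are hypothetical and not part of the original answer:

```cpp
#include <algorithm>
#include <cctype>
#include <regex>
#include <string>

// Hypothetical holder for the four extracted fields.
struct Command {
    std::string first_character;
    int first_number;
    std::string second_character;
    int second_number;
};

// Hypothetical helper: lower-cases an ASCII string.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Fills `out` and returns true if `input` is a legit command, otherwise returns false.
bool parse_command(const std::string& input, Command& out) {
    static const std::regex legit_command(R"(\(([A-Z]+)([0-9]+)-([A-Z]+)([0-9]+)\))");
    std::smatch m;
    if (!std::regex_match(input, m, legit_command)) return false;
    // std::stoi throws std::out_of_range for very long digit runs; not handled here.
    out.first_character = to_lower(m[1].str());
    out.first_number = std::stoi(m[2].str());
    out.second_character = to_lower(m[3].str());
    out.second_number = std::stoi(m[4].str());
    return true;
}
```

Calling parse_command("(AA11-BB22)", cmd) on a Command cmd would leave "aa", 11, "bb" and 22 in its fields.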

C++ Regex Alpha without Equal sign

I'm new to regex and C++.
My problem is that '=' matches when I search for [a-zA-Z]. Shouldn't that class match only letters, not '='?
Can anyone help me, please?
string string1 = "s=s;";
enum states state = s1;
regex statement("[a-zA-Z]+[=][a-zA-Z0-9]+[;]");
regex rg_left_letter("[a-zA-Z]");
regex rg_equal("[=]");
regex rg_right_letter("[a-zA-Z0-9]");
regex rg_semicolon("[;]");
for (const auto &s : string1) {
cout << "Current Value: " << s << endl;
// step(&state, s);
if (regex_search(&s, rg_left_letter)) {
cout << "matching: " << s << endl;
} else {
cout << "not matching: " << s << endl;
}
// cout << "Step Executed with state: " << state << endl;
}
This outputs:
Current Value: s
matching: s
Current Value: =
matching: =
Current Value: s
matching: s
Current Value: ;
not matching: ;
When you write
regex_search(&s, rg_left_letter)
you pass a const char*, so regex_search treats &s as a C string that starts at the current character and runs to the end of string1. Your loop therefore searches each of the remaining substrings:
s=s;
=s;
s;
;
The search succeeds in every case except the last, because each of the first three substrings contains at least one letter. Note that this relies on the characters of string1 being followed by a null terminator. Since C++11, std::string stores its characters contiguously with a terminating null, so the code is well-defined there; before C++11 this was only guaranteed through c_str(), which made the code undefined behavior.
What you really want is regex_match together with your original regex, which is as simple as:
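The suffix behaviour described above can be reproduced directly. A minimal sketch; the helper name search_suffixes is hypothetical, and c_str() is used to make the null termination explicit:

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: for each position i, reports whether regex_search finds a
// letter anywhere in the suffix that starts at i (what regex_search(&s, ...) sees).
std::vector<bool> search_suffixes(const std::string& text) {
    static const std::regex letter("[a-zA-Z]");
    std::vector<bool> found;
    for (std::size_t i = 0; i < text.size(); ++i) {
        // text.c_str() + i is the null-terminated suffix beginning at character i.
        found.push_back(std::regex_search(text.c_str() + i, letter));
    }
    return found;
}
```

For "s=s;" this yields true, true, true, false, which mirrors the output in the question: the '=' position reports a match only because the suffix "=s;" still contains a letter.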
#include <iostream>
#include <regex>
int main()
{
std::regex statement("[a-zA-Z]+[=][a-zA-Z0-9]+[;]");
if(std::regex_match("s=s;", statement)) { std::cout << "Hooray!\n"; }
}
This is working for me:
int main(void) {
string string1 = "s=s;";
enum states state = s1;
regex statement("[a-zA-Z]+[=][a-zA-Z0-9]+[;]");
regex rg_left_letter("[a-zA-Z]");
regex rg_equal("[=]");
regex rg_right_letter("[a-zA-Z0-9]");
regex rg_semicolon("[;]");
//for (const auto &s : string1) {
for (size_t i = 0; i < string1.size(); i++) {
cout << "Current Value: " << string1[i] << endl;
// step(&state, s);
if (regex_match(string1.substr(i, 1), rg_left_letter)) {
cout << "matching: " << string1[i] << endl;
} else {
cout << "not matching: " << string1[i] << endl;
}
// cout << "Step Executed with state: " << state << endl;
}
cout << endl;
return 0;
}

Pcre php regex equal in c++

Hello, this is a PCRE (PHP) regex:
/\h*(.*?)\h*[=]\h*("(.*?(?:[\\\\]".*?)*)")\h*([,|.*?])/
This regex works for this string:
data1 = "value 1", data2 = "value 2", data3 = " data4(" hey ") ",
and gets
data, data2, data3
val, val2, data4("val3")
What is the equivalent of this regex for C++ std::regex?
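std::regex does not support \h in its default ECMAScript grammar, so replace it with \s (note that \s also matches line breaks, which \h does not). Put the pattern in a raw string literal, where the class for a literal backslash is written [\\] rather than [\\\\].
Refer to the following example code: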
#include <string>
#include <iostream>
#include <regex>
using namespace std;
int main() {
std::string pat = R"(\s*(.*?)\s*=\s*(\"(.*?(?:[\\]\".*?)*)\")\s*([,|.*?]))";
std::regex r(pat);
std::cout << pat << "\n";
std::string s = R"(data1 = "value 1", data2 = "value 2", data3 = " data4(" hey ") ",)";
std::cout << s << "\n";
for(std::sregex_iterator i = std::sregex_iterator(s.begin(), s.end(), r);
i != std::sregex_iterator();
++i)
{
std::smatch m = *i;
std::cout << "Capture 1: " << m[1].str() << " at Position " << m.position(1) << '\n';
std::cout << "Capture 3: " << m[3].str() << " at Position " << m.position(3) << '\n';
}
return 0;
}
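One difference between the two patterns is worth checking: PCRE's \h matches only horizontal whitespace, while \s also accepts line breaks. If that matters for the input, a character class such as [ \t] is a closer stand-in. A minimal sketch, assuming that spaces and tabs are the only horizontal whitespace of interest (PCRE's \h also covers some Unicode space characters); both helper names are hypothetical:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the whole string consists of "horizontal" whitespace,
// approximated here as spaces and tabs only.
bool is_horizontal_space(const std::string& s) {
    static const std::regex horizontal(R"([ \t]+)");
    return std::regex_match(s, horizontal);
}

// Same check with \s, which also accepts line breaks.
bool is_any_space(const std::string& s) {
    static const std::regex any(R"(\s+)");
    return std::regex_match(s, any);
}
```

A string made of a space followed by a newline passes is_any_space but fails is_horizontal_space.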

C++ stringstream value extraction

I am trying to extract values from myString1 using std::stringstream like shown below:
// Example program
#include <iostream>
#include <string>
#include <sstream>
using namespace std;
int main()
{
string myString1 = "+50years";
string myString2 = "+50years-4months+3weeks+5minutes";
stringstream ss (myString1);
char mathOperator;
int value;
string timeUnit;
ss >> mathOperator >> value >> timeUnit;
cout << "mathOperator: " << mathOperator << endl;
cout << "value: " << value << endl;
cout << "timeUnit: " << timeUnit << endl;
}
Output:
mathOperator: +
value: 50
timeUnit: years
In the output you can see that I successfully extract the values I need: the math operator, the value and the time unit.
Is there a way to do the same with myString2? Perhaps in a loop? I can extract the math operator, the value, but the time unit simply extracts everything else, and I cannot think of a way to get around that. Much appreciated.
The problem is that timeUnit is a string, so >> extracts everything up to the next whitespace, and your string contains none.
Alternatives:
You could extract the parts using getline(), which reads until it finds a separator. Unfortunately, you have not one potential separator but two (+ and -).
You could use a regex directly on the string.
You could split the string using find_first_of() and substr().
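The find_first_of() and substr() alternative is not shown in this answer, so here is a minimal sketch of it, assuming every term starts with '+' or '-' as in the question; the function name split_terms is hypothetical:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits "+50years-4months" into {"+50years", "-4months"} by cutting before each
// '+' or '-' that follows the first character of the current term.
std::vector<std::string> split_terms(const std::string& text) {
    std::vector<std::string> terms;
    std::size_t start = 0;
    while (start < text.size()) {
        // Look for the next operator after the one that opens the current term.
        std::size_t next = text.find_first_of("+-", start + 1);
        // substr clamps the length, so a huge count simply means "to the end".
        terms.push_back(text.substr(start, next - start));
        start = next;  // npos ends the loop because npos >= text.size()
    }
    return terms;
}
```

Each returned term has the shape of myString1, so it can be fed to the stringstream extraction from the question unchanged.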
As an illustration, here is the example with a regex:
regex rg("([+-][0-9]+[A-Za-z]+)", regex::extended);
smatch sm;
while (regex_search(myString2, sm, rg)) {
    cout << "Found:" << sm[0] << endl;
    //... process sm[0] here, before myString2 is overwritten
    myString2 = sm.suffix().str();
}
You could use something more robust, such as <regex>, as in the example below:
#include <iostream>
#include <regex>
#include <string>
int main () {
std::regex e ("(\\+|\\-)((\\d)+)(years|months|weeks|minutes|seconds)");
std::string str("+50years-4months+3weeks+5minutes");
std::sregex_iterator next(str.begin(), str.end(), e);
std::sregex_iterator end;
while (next != end) {
std::smatch match = *next;
std::cout << "Expression: " << match.str() << "\n";
std::cout << " mathOperator : " << match[1] << std::endl;
std::cout << " value : " << match[2] << std::endl;
std::cout << " timeUnit : " << match[4] << std::endl;
++next;
}
}
Output:
Expression: +50years
mathOperator : +
value : 50
timeUnit : years
Expression: -4months
mathOperator : -
value : 4
timeUnit : months
Expression: +3weeks
mathOperator : +
value : 3
timeUnit : weeks
Expression: +5minutes
mathOperator : +
value : 5
timeUnit : minutes
I'd use getline for the timeUnit, but since getline can take only one delimiter, I'd search the string separately for mathOperator and use that:
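string myString2 = "+50years-4months+3weeks+5minutes";
stringstream ss (myString2);
char mathOperator;
int value;
string timeUnit;
size_t pos = 0;
ss >> mathOperator;
do
{
    cout << "mathOperator: " << mathOperator << endl;
    ss >> value;
    cout << "value: " << value << endl;
    // The next operator delimits the current time unit.
    pos = myString2.find_first_of("+-", pos + 1);
    if (pos != string::npos) {
        mathOperator = myString2[pos];
        getline(ss, timeUnit, mathOperator);
    } else {
        getline(ss, timeUnit); // last unit: read to the end of the string
    }
    cout << "timeUnit: " << timeUnit << endl;
}
while (pos != string::npos);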

Finding a number between 2 numbers using regex/boost in c++

I feel like this is a pretty basic question but I did not find a post for it. If you know one please link it below.
So what I'm trying to do is look through a string and extract the numbers in groups of 2.
here is my code:
int main() {
string line = "P112233";
boost::regex e ("P([0-9]{2}[0-9]{2}[0-9]{2})");
boost::smatch match;
if (boost::regex_search(line, match, e))
{
boost::regex f("([0-9]{2})"); //finds 11
boost::smatch match2;
line = match[0];
if (boost::regex_search(line, match2, f))
{
float number1 = boost::lexical_cast<float>(match2[0]);
cout << number1 << endl; // this works and prints out 11.
}
boost::regex g(" "); // here I want it to find the 22
boost::smatch match3;
if (boost::regex_search(line, match3, g))
{
float number2 = boost::lexical_cast<float>(match3[0]);
cout << number2 << endl;
}
boost::regex h(" "); // here I want it to find the 33
boost::smatch match4;
if (boost::regex_search(line, match4, h))
{
float number3 = boost::lexical_cast<float>(match4[0]);
cout << number3 << endl;
}
}
else
cout << "found nothing"<< endl;
return 0;
}
I was able to get the first number, but I have no idea how to get the second (22) and third (33).
What is the proper expression to use?
As @Cornstalks mentioned, you need to use 3 capture groups and then access them like this:
int main()
{
std::string line = "P112233";
boost::regex e("P([0-9]{2})([0-9]{2})([0-9]{2})");
boost::smatch match;
if (boost::regex_search(line, match, e))
{
std::cout << match[0] << std::endl; // prints the whole string
std::cout << match[1] << ", " << match[2] << ", " << match[3] << std::endl;
}
return 0;
}
Output:
P112233
11, 22, 33
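The captures are still substrings at this point. A minimal sketch of converting them to integers, using std::regex and std::stoi from the standard library in place of boost::regex and boost::lexical_cast so that the block compiles without Boost; the helper name extract_pairs is hypothetical:

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the three two-digit groups that follow 'P' as ints,
// or an empty vector if the line does not contain such a token.
std::vector<int> extract_pairs(const std::string& line) {
    static const std::regex e("P([0-9]{2})([0-9]{2})([0-9]{2})");
    std::smatch match;
    std::vector<int> numbers;
    if (std::regex_search(line, match, e)) {
        // match[0] is the whole match; groups 1..3 hold the digit pairs.
        for (std::size_t i = 1; i < match.size(); ++i) {
            numbers.push_back(std::stoi(match[i].str()));
        }
    }
    return numbers;
}
```

For "P112233" this returns the integers 11, 22 and 33.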
I don't favour regular expressions for this kind of parsing. The key point being that the numbers are still strings when you're done with that hairy regex episode.
I'd use Boost Spirit here instead, which parses into the numbers all at once, and you don't even have to link to the Boost Regex library either, because Spirit is header-only.
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;
static qi::int_parser<int, 10, 2, 2> two_digits;
int main() {
std::string const s = "P112233";
std::vector<int> nums;
if (qi::parse(s.begin(), s.end(), "P" >> *two_digits, nums))
{
std::cout << "Parsed " << nums.size() << " pairs of digits:\n";
for(auto i : nums)
std::cout << " * " << i << "\n";
}
}
Output:
Parsed 3 pairs of digits:
* 11
* 22
* 33

Deleting a regular expression match

I have a program that needs to search a file with regular expressions and delete whatever the regex finds. Here is the code I have been working on:
#include <boost/regex.hpp>
#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
#include "time.h"
using namespace std;
class application{
private:
//Variables
boost::regex expression;
boost::smatch matches;
string line;
string pat;
int lineNumber;
string replace;
char time[9];
char date[9];
//Functions
void getExpression(){
cout << "Expression: ";
cin >> pat;
try{
expression = pat;
}
catch(boost::bad_expression){
cout << pat << " is not a valid regular expression\n";
exit(1);
}
}
void boostMatch(){
//Files to open
//Input Files
ifstream in("files/trff292010.csv");
if(!in) cerr << "no file\n";
//Output Files
ofstream out("files/ORIGtrff292010.csv");
ofstream newFile("files/NEWtrff292010.csv");
ofstream record("files/record.dat");
//time
_strdate_s(date);
_strtime_s(time);
lineNumber = 0;
while(in.peek() != EOF){
getline(in, line, '\n');
lineNumber++;
out << line << "\n";
if (regex_search(line, matches, expression)){
for (int i = 0; i<matches.size(); ++i){
record << "Date: "<< date << "Time: " << time << "\tmatches[" << i << "]: " << matches[i] << "\n\tLine Number: "<< lineNumber<< '\n\t\t' << line << '\n';
boost::regex_replace(line, expression, "");
newFile << line << "\n";
}
}else{
newFile << line << "\n";
}
}
}
public:
void run(){
replace = "";
getExpression();
boostMatch();
}
};
As you can see, I was trying to use boost::regex_replace to replace what was found with an empty string, but this did not work. The test I have been running uses [*] to find all the asterisks before some names in a list, for example *alice. The program does find the asterisk but does not remove it to leave just alice.
It seems that boost::regex_replace returns the modified string instead of modifying its input. See the documentation for this function.
Try this instead:
newFile << boost::regex_replace(line, expression, "") << "\n";
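The same return-value behaviour can be checked without Boost. A minimal sketch, using std::regex_replace from the standard library as a stand-in for boost::regex_replace; the helper name remove_matches is hypothetical:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns a copy of `line` with every match of `pattern` removed.
std::string remove_matches(const std::string& line, const std::string& pattern) {
    const std::regex expression(pattern);
    // regex_replace leaves `line` untouched and returns the edited copy.
    return std::regex_replace(line, expression, "");
}
```

With the pattern [*] from the question, remove_matches("*alice", "[*]") returns "alice", while the line passed in keeps its asterisk.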
Escape the * with a backslash (\*).
This is a fairly common issue; the following link may help:
http://bytes.com/topic/c/answers/166133-problem-boost-regex_replace