C++ regexp allowing digits separated by dots

I need a regexp allowing up to two digits in a row separated by dots, like 1.2 or 1.2.3 or 1.2.3.45 etc., but not 1234 or 1.234 etc. I'm trying "^[\d{1,2}.]+", but it allows all numbers. What's wrong?

You may try this:
^\d{1,2}(\.\d{1,2})+$
Explanation:
^ start of a string
\d{1,2} followed by one or two digits
( start of capture group
\.\d{1,2} followed by a dot and one or two digits
) end of capture group
+ indicates the previous capture group be repeated 1 or more times
$ end of string
Sample C++ source:
#include <regex>
#include <string>
#include <iostream>
using namespace std;
int main()
{
    string regx = R"(^\d{1,2}(\.\d{1,2})+$)";
    string input = "1.2.346";
    smatch matches;
    if (regex_search(input, matches, regex(regx)))
    {
        cout<<"match found";
    }
    else
        cout<<"No match found";
    return 0;
}

I think the last should not have more than 2 digits.
(\d{1,2}\.)+\d{1,2}(?=\b)

Related

Regex to replace single occurrence of character in C++ with another character

I am trying to replace a single occurrence of a character '1' in a String with a different character.
This same character can occur multiple times in the String which I am not interested in.
For example, in the below string I want to replace the single occurrence of 1 with 2.
input:-0001011101
output:-0002011102
I tried the regex below, but it is giving me wrong results:
regex b1("(1){1}");
S1 = regex_replace(S, b1, "2");
Any help would be greatly appreciated.
If you used boost::regex (the Boost regex library), you could simply use a lookaround-based solution like
(?<!1)1(?!1)
And then replace with 2.
With std::regex, you cannot use lookbehinds, but you can use a regex that captures either start of string or any one char other than your char, then matches your char, and then makes sure your char does not occur immediately on the right.
Then, you may replace with $01 backreference to Group 1 (the 0 is necessary since the $12 replacement pattern would be parsed as Group 12, an empty string here since there is no Group 12 in the match structure):
regex reg("([^1]|^)1(?!1)");
S1 = std::regex_replace(S, reg, "$012");
A complete C++ demo:
#include <iostream>
#include <regex>
#include <string>
int main() {
    std::string S = "-0001011101";
    std::regex reg("([^1]|^)1(?!1)");
    std::cout << std::regex_replace(S, reg, "$012") << std::endl;
    return 0;
}
// => -0002011102
Details:
([^1]|^) - Capturing group 1: any char other than 1 ([^...] is a negated character class) or start of string (^ is a start of string anchor)
1 - a 1 char
(?!1) - a negative lookahead that fails the match if there is a 1 char immediately to the right of the current location.
Use a negative lookahead in the regexp to match a 1 that isn't followed by another 1 (note that on its own this also matches the last 1 of a run such as 111):
regex b1("1(?!1)");

std::regex: Match string consisting of digits and white space and extract digits. How?

I want to do 2 things at the same time: Match a string against a pattern and extract groups.
The string consists of white spaces and digits. I want to match the string against this pattern. Additionally I want to extract the digits (not numbers, single digits only) using std::smatch.
I tried a lot, but no success.
For the dupe hunters: I checked many many answers on SO, but I could not find a solution.
Then I tried to use the std::sregex_token_iterator, and the result also baffled me. In
#include <string>
#include <regex>
#include <vector>
#include <iterator>
const std::regex re1{ R"(((?:\s*)|(\d))+)" };
const std::regex re2{ R"(\s*(\d)\s*)" };
int main() {
    std::string test(" 123 45 6 ");
    std::smatch sm;
    bool valid1 = std::regex_match(test, sm, re1);
    std::vector<std::string> v(std::sregex_token_iterator(test.begin(), test.end(), re2), {});
    return 0;
}
The vector contains not only the digits, but also spaces. I would like to have digits only.
The smatch does not contain any digits.
I know, that I can first remove all whitespaces from the string, but there should be a better, one step solution.
What is the proper regex to 1. match the string against my described pattern and 2. extract all single digits into the smatch?
The pattern you need to validate is
\s*(?:\d\s*)*
See the regex demo (note I added ^ and $ to make the pattern match the whole string at the regex testing site, since you use equivalent regex_match in the code, it requires a full string match).
Next, once your string is validated with the first regex, you just need to extract any single digit:
const std::regex re2{ R"(\d)" };
// ...
std::vector<std::string> v(std::sregex_token_iterator(test.begin(), test.end(), re2), {});
Full working snippet:
#include <string>
#include <regex>
#include <vector>
#include <iterator>
#include <iostream>
const std::regex re1{ R"(\s*(?:\d\s*)*)" };
const std::regex re2{ R"(\d)" };
int main() {
    std::string test(" 123 45 6 ");
    std::smatch sm;
    // Step 1: validate that the whole string is only whitespace and digits
    bool valid1 = std::regex_match(test, sm, re1);
    if (!valid1)
        return 1;
    // Step 2: extract every single digit
    std::vector<std::string> v(std::sregex_token_iterator(test.begin(), test.end(), re2), {});
    for (auto i: v)
        std::cout << i << std::endl;
    return 0;
}
Output:
1
2
3
4
5
6
Alternative solution using Boost
You may use a regex that will match all digits separately only if the whole string consists of whitespaces and digits using
\G\s*(\d)(?=[\s\d]*$)
See the regex demo.
Details
\G - start of string or end of the preceding successful match
\s* - 0+ whitespaces
(\d) - a digit captured in Group 1 (we'll return only this value when passing 1 as the last argument in boost::sregex_token_iterator iter(test.begin(), test.end(), re2, 1))
(?=[\s\d]*$) - there must be any 0 or more whitespaces or digits and then the end of string immediately to the right of the current location.
See the whole C++ snippet (compiled with the -lboost_regex option):
#include <iostream>
#include <string>
#include <vector>
#include <boost/regex.hpp>
int main()
{
    std::string test(" 123 45 6 ");
    boost::regex re2(R"(\G\s*(\d)(?=[\s\d]*$))");
    boost::sregex_token_iterator iter(test.begin(), test.end(), re2, 1);
    boost::sregex_token_iterator end;
    std::vector<std::string> v(iter, end);
    for (auto i: v)
        std::cout << i << std::endl;
    return 0;
}

Avoid empty elements in match when optional substrings are not present

I am trying to create a regex that match the strings returned by diff terminal command.
These strings start with a decimal number, optionally followed by a comma and another number, then a mandatory character (a, c or d), then another mandatory decimal number, optionally followed by the same kind of comma-and-number group.
Examples:
27a27
27a27,30
28c28
28,30c29,31
1d1
1,10d1
I am trying to extract all the groups separately, with the optional ones extracted without the comma.
I am doing this in C++:
#include <iostream>
#include <string>
#include <fstream>
#include <regex>
using namespace std;
int main(int argc, char* argv[])
{
    string t = "47a46";
    std::string result;
    std::regex re2("(\\d+)(?:,(\\d+))?([acd])(\\d+)(?:,(\\d+))?");
    std::smatch match;
    std::regex_search(t, match, re2);
    cout<<match.size()<<endl;
    cout<<match.str(0)<<endl;
    if (std::regex_search(t, match, re2))
    {
        for (int i=1; i<match.size(); i++)
        {
            result = match.str(i);
            cout<<i<<":"<<result<< " ";
        }
        cout<<endl;
    }
    return 0;
}
The string variable t is the string I want to manipulate.
My regular expression
(\\d+)(?:,(\\d+))?([acd])(\\d+)(?:,(\\d+))?
is working, but with strings that do not have the optional subgroups (such as 47a46), the match variable will contain empty elements in the positions of the missing substrings.
For example in the program above the elements of match (preceded by their index) are:
1:47 2: 3:a 4:46 5:
Elements in position 2 and 5 correspond to the optional substring that in this case are not present so I would like match to avoid retrieving them so that it would be:
1:47 2:a 3:46
How can I do it?
I think the best RE for you would be like this:
std::regex re2(R"((\d+)(?:,\d+)?([a-z])(\d+)(?:,\d+)?)");
That way only the required groups are captured; the optional parts are still matched, but not captured.
output:
4
47a46
1:47 2:a 3:46
Note: the pattern passed to re2 is written as a C++11 raw string literal.

How to consider taking dot in the number in regular expression

Take a look at the following regular expression
std::regex reg("[A][-+]?([0-9]*\\.[0-9]+|[0-9]+)");
This finds the letter A followed by a floating-point number. The problem is that for input such as A30. the regular expression ignores the trailing dot and prints A30. I would like the regular expression to include the decimal dot as well. Is this feasible?
#include <iostream>
#include <string>
#include <regex>
using namespace std;
int main()
{
    std::string line("A50. hsih Y0 his ");
    std::smatch match;
    std::regex reg("[A][-+]?([0-9]*\\.[0-9]+|[0-9]+)");
    if ( std::regex_search(line,match,reg) ){
        cout << match.str(0) << endl;
    }else{
        cout << "nothing found" << endl;
    }
    return 0;
}
You require the dot to be followed by one or more (+) digits. Just make the trailing digits optional by changing it to:
std::regex reg("[A][-+]?([0-9]*\\.[0-9]*|[0-9]+)");
The only problem with this expression is that it would also match A followed by a single dot without any digits. I don't know if you'd see this as a valid match. A more robust alternative would hence be:
std::regex reg("[A][-+]?([0-9]*\\.[0-9]+|[0-9]+\\.?)");
So either a dot with trailing digits, or digits optionally followed by a dot.
You can change your regex like this
A[-+]?(?:[0-9]*\\.?(?:[0-9]+)?)
A - Matches A.
[-+]? - Matches + or -. ( ? makes it optional)
(?:[0-9]*\\.?(?:[0-9]+)?)
[0-9]*\\.? - matches zero or more digits followed by a dot (? makes the dot optional)
(?:[0-9]+)? - matches one or more digits (? makes the group optional)

Using RegEx to filter wrong Input?

Look at this example:
string str = "January 19934";
The Outcome should be
Jan 1993
I think I have created the right regex, ([A-z]{3}).*([\d]{4}), to use in this case, but I do not know what to do next.
How can I extract what I am looking for using regex? Is there a way to receive two variables, the first being the result of the first bracket, ([A-z]{3}), and the second the result of the second bracket, ([\d]{4})?
Your regex contains a common typo: [A-z] matches more than just ASCII letters. Also, the .* will grab the whole string up to its end, and backtracking will force \d{4} to match the last 4 digits. You need to use a lazy quantifier with the dot, *?.
Then, use regex_search and concat the 2 group values:
#include <regex>
#include <string>
#include <sstream>
#include <iostream>
using namespace std;
int main() {
    regex r("([A-Za-z]{3}).*?([0-9]{4})");
    string s("January 19934");
    smatch match;
    std::stringstream res("");
    if (regex_search(s, match, r)) {
        res << match.str(1) << " " << match.str(2);
    }
    cout << res.str(); // => Jan 1993
    return 0;
}
Pattern explanation:
([A-Za-z]{3}) - Group 1: three ASCII letters
.*? - any 0+ chars other than line break symbols as few as possible
([0-9]{4}) - Group 2: 4 digits
This could work.
([A-Za-z]{3})([a-z ])+([\d]{4})
Note the space after a-z is important to catch space.