How to check for success in c++11 std::regex_replace?

I'd like to do the c++11 equivalent of a perl checked-replacement operation:
my $v = "foo.rat";
if ( $v =~ s/\.rat$/.csv/ )
{
...
}
I can do the replacement without trouble:
#include <iostream>
#include <string>
#include <regex>
int main()
{
std::string s{ "foo.rat" } ;
std::regex reg{ R"((.*)\.rat$)" } ;
s = std::regex_replace( s, reg, "$1.csv" ) ;
std::cout << s << std::endl ;
s = std::regex_replace( "foo.noo", reg, "$1.csv" ) ;
std::cout << s << std::endl ;
return 0 ;
}
This gives:
foo.csv
foo.noo
Notice that the replace operation on the non-matching expression doesn't throw an error (which is what I expected).
Looking at the regex_replace documentation, it's not obvious to me how to check for the success of the replace operation. I could do a string compare, but that seems backwards?

Try to find a match with std::regex_match or std::regex_search, check whether anything matched, and then replace the matched portion of the string using std::string::replace. That shouldn't cost any performance.
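A minimal sketch of that search-then-replace idea follows. The helper name replace_first is hypothetical, and it rewrites only the first match; the replacement is expanded with match_results::format so that references such as $1 still work.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites the first match of `re` in `s` and reports
// whether a match was found. `fmt` may use references such as $1.
bool replace_first(std::string& s, const std::regex& re, const std::string& fmt) {
    std::smatch m;
    if (!std::regex_search(s, m, re)) {
        return false;  // no match: s is left untouched
    }
    // Compute everything that depends on `m` before modifying `s`,
    // because `m` holds iterators into `s`.
    const std::string replacement = m.format(fmt);
    const auto pos = static_cast<std::string::size_type>(m.position(0));
    const auto len = static_cast<std::string::size_type>(m.length(0));
    s.replace(pos, len, replacement);
    return true;
}
```

Called as replace_first(s, reg, "$1.csv"), the return value plays the role of the Perl s/// result in the question.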

Just to add to the accepted answer: it can also be done with a std::regex_iterator. This may be handy when multiple replacements may take place.
A std::regex_iterator repeatedly calls std::regex_search() until all matches are found. If the begin iterator already equals the end iterator, no match was found.
The function bool regex_replace(std::string &str, const std::string &re, const std::string& replacement) implements this behaviour:
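#include <iostream>
#include <regex>
#include <string>
bool regex_replace(std::string &str, const std::string &re, const std::string& replacement) {
std::regex regexp(re);
//Search regex
std::sregex_iterator begin = std::sregex_iterator(str.begin(), str.end(), regexp);
std::sregex_iterator end = std::sregex_iterator();
//build the result in a separate string: modifying str inside the loop
//would invalidate the iterators and shift the match positions
std::string result;
std::string::const_iterator last = str.begin();
for (std::sregex_iterator i = begin; i != end; ++i) {
result.append(last, (*i)[0].first); //text before the match
result += replacement; //literal replacement text
last = (*i)[0].second;
}
result.append(last, str.cend());
//true if at least one match was found and replaced
const bool found = (begin != end);
str = result;
return found;
}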
This function operates in place: afterwards, str holds the result of the replacements. The function returns true only if at least one replacement was made.
The following code shows how to use it to make multiple replacements and detect whether any was made:
int main(int argc, char** argv) {
std::string rgx("[0-9]");
std::string str("0a1b2c3d4e5");
std::string replacement("?");
bool found = regex_replace(str, rgx, replacement);
std::cout << "Found any: " << (found ? "true" : "false") << std::endl;
std::cout << "string: " << str << std::endl;
return 0;
}
The code replaces every digit with a question mark '?':
Found any: true
string: ?a?b?c?d?e?

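Use the std::regex_constants::format_no_copy flag to change the behavior of regex_replace(); see the code below.
With this flag, the returned string is empty if the match failed, because the parts of the input that do not match are no longer copied to the output.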
#include <iostream>
#include <string>
#include <regex>
int main()
{
std::string s{ "foo.rat" } ;
std::regex reg{ R"((.*)\.rat$)" } ;
auto rxMatchFlag = std::regex_constants::format_no_copy; //<---use this to modify the behavior of regex_replace when matching failed.
s = std::regex_replace( s, reg, "$1.csv", rxMatchFlag) ;
if(!s.empty()) std::cout << s << std::endl ;
else std::cout << "failed match" << std::endl;
s = std::regex_replace( "foo.noo", reg, "$1.csv", rxMatchFlag) ;
if(!s.empty()) std::cout << s << std::endl ;
else std::cout << "failed match" << std::endl;
return 0 ;
}
For the other flags, see the documentation for std::regex_constants::match_flag_type.
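One caveat is worth spelling out: format_no_copy drops every unmatched part of the input, not just the input of a failed match. The sketch below wraps the call in a hypothetical helper, replace_no_copy, to make that visible. It suggests the empty-string check is only reliable when the pattern is meant to cover the whole string (as in the question) and the replacement itself is never empty.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns only the formatted matches; all text that the
// regex did not match is discarded because of format_no_copy.
std::string replace_no_copy(const std::string& s, const std::regex& re,
                            const std::string& fmt) {
    return std::regex_replace(s, re, fmt, std::regex_constants::format_no_copy);
}
```

For example, replacing [0-9] with # in "a1b2" yields "##" rather than "a#b#", and a successful match with an empty replacement is indistinguishable from a failed one.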

I don't believe there's any direct way to find out whether any replacements were made.
(Don't confuse this with "success / not success", which is not quite the same thing.)
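An indirect way, sketched here under the assumption that default match flags are used, is to count the matches yourself: std::regex_replace replaces every match a std::regex_iterator would visit, so the number of matches equals the number of replacements. The helper name count_matches is hypothetical.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: number of non-overlapping matches of `re` in `s`,
// which is how many replacements std::regex_replace would perform.
std::size_t count_matches(const std::string& s, const std::regex& re) {
    const std::sregex_iterator begin(s.begin(), s.end(), re);
    const std::sregex_iterator end;
    return static_cast<std::size_t>(std::distance(begin, end));
}
```

A result of zero means regex_replace would return the input unchanged.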

Related

Regex search overlapping matches c++11

What regex expression should I use to search all occurrences that match:
Start with 55 or 66
followed by a minimum 8 characters in the range of [0-9a-fA-F] (HEX numbers)
Ends with \r (a carriage return)
Example string: 0205065509085503400066\r09\r
My expected result:
5509085503400066\r
5503400066\r
My current result:
5509085503400066\r
Using
(?:55|66)[0-9a-fA-F]{8,}\r
As you can see, this finds only the first result but not the second one.
Edit clarification
I search the string using Regex. It'll select the message for further parsing. The target string can start anywhere in the string. The target string is only valid if it only contains base-16 (HEX) numbers, and ends with a carriage return.
[start] [information part minimum 8 chars] [end symbol: carriage return]
I'm using the std::regex library in c++11 with the flag ECMAScript
Edit
I have created an alternative solution that gives me the expected result. But this is not pure regex.
#include <iostream>
#include <string>
#include <regex>
int main()
{
// repeated search (see also std::regex_iterator)
std::string log("0055\r0655036608090705\r");
std::regex r("(?:55|66)[0-9a-fA-F]{8,}\r");
std::smatch sm;
while(regex_search(log, sm, r))
{
std::cout << sm.str() << '\n';
log = sm.str();
log += sm.suffix();
log[0] = 'a' ;
}
}
** Edit: Working regex solution based on comments **
#include <iostream>
#include <string>
#include <regex>
int main()
{
// repeated search (see also std::regex_iterator)
std::string s("0055\r06550003665508090705\r0970");
std::regex r("(?=((?:55|66)[0-9a-fA-F]{8,}\r))");
auto words_begin =
std::sregex_iterator(s.begin(), s.end(), r);
auto words_end = std::sregex_iterator();
std::cout << "Found "
<< std::distance(words_begin, words_end)
<< " words:\n";
for (std::sregex_iterator i = words_begin; i != words_end; ++i) {
std::smatch match = *i;
std::string match_str = s.substr(match.position(1), match.length(1) - 1); //-1 cr
std::cout << match_str << " =" << match.position(1) << '\n';
}
}

You are actually looking for overlapping matches. This can be achieved using a regex lookahead like this:
(?=((?:55|66)[0-9a-fA-F]{8,}\/r))
You will find the matches in question in group 1. The full-match, however, is empty.
Regex Demo (using /r instead of a carriage return for demonstration purposes only)
Sample Code:
#include <iostream>
#include <string>
#include <regex>
using namespace std;
int main() {
std::string subject("0055\r06550003665508090705\r0970");
try {
std::regex re("(?=((?:55|66)[0-9a-fA-F]{8,}\r))");
std::sregex_iterator next(subject.begin(), subject.end(), re);
std::sregex_iterator end;
while (next != end) {
std::smatch match = *next;
std::cout << match.str(1) << "\n";
next++;
}
} catch (std::regex_error& e) {
// Syntax error in the regular expression
}
return 0;
}
See also: Regex-Info: C++ Regular Expressions with std::regex

Find an exact substr in a string

I have a text file which contains the following text
License = "123456"
GeneralLicense = "56475655"
I want to search for License as well as for GeneralLicense.
while (getline(FileStream, CurrentReadLine))
{
if (CurrentReadLine.find("License") != std::string::npos)
{
std::cout << "License Line: " << CurrentReadLine;
}
if (CurrentReadLine.find("GeneralLicense") != std::string::npos)
{
std::cout << "General License Line: " << CurrentReadLine;
}
}
Since the word License is also present in the word GeneralLicense, the condition in the line if (CurrentReadLine.find("License") != std::string::npos) becomes true twice.
How can I specify that I want to search for the exact sub-string?
UPDATE: I can reverse the order as mentioned in some answers, or check whether License is at index zero. But isn't there anything ROBUST (a flag or something) which we can specify to look for the exact match (something like we have in most editors, e.g. MS Word)?

while (getline(FileStream, CurrentReadLine))
{
if (CurrentReadLine.find("GeneralLicense") != std::string::npos)
{
std::cout << "General License Line: " << CurrentReadLine;
}
else if (CurrentReadLine.find("License") != std::string::npos)
{
std::cout << "License Line: " << CurrentReadLine;
}
}

The more ROBUST search is called a regex:
#include <regex>
while (getline(FileStream, CurrentReadLine))
{
if(std::regex_match(CurrentReadLine,
std::regex(".*\\bLicense\\b.*=.*")))
{
std::cout << "License Line: " << CurrentReadLine << std::endl;
}
if(std::regex_match(CurrentReadLine,
std::regex(".*\\bGeneralLicense\\b.*=.*")))
{
std::cout << "General License Line: " << CurrentReadLine << std::endl;
}
}
The \b escape sequences denote word boundaries.
.* means "any sequence of characters, including zero characters"
EDIT: You could also use regex_search instead of regex_match to search for substrings that match instead of using .* to cover the parts that don't match:
#include <regex>
while (getline(FileStream, CurrentReadLine))
{
if(std::regex_search(CurrentReadLine, std::regex("\\bLicense\\b")))
{
std::cout << "License Line: " << CurrentReadLine << std::endl;
}
if(std::regex_search(CurrentReadLine, std::regex("\\bGeneralLicense\\b")))
{
std::cout << "General License Line: " << CurrentReadLine << std::endl;
}
}
This more closely matches your code, but note that it will get tripped up if the keywords are also found after the equals sign. If you want maximum robustness, use regex_match and specify exactly what the whole line should match.
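As a sketch of "specify exactly what the whole line should match", the function below parses a line of the form key = "value" with std::regex_match and returns the key, so the caller can compare it exactly. The name parse_assignment and the assumed line format (a word, an equals sign, a double-quoted value) are illustrative assumptions based on the sample file in the question.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: accepts only lines shaped like  key = "value"
// (optional surrounding whitespace) and extracts both parts.
bool parse_assignment(const std::string& line, std::string& key, std::string& value) {
    static const std::regex re(R"re(\s*(\w+)\s*=\s*"([^"]*)"\s*)re");
    std::smatch m;
    if (!std::regex_match(line, m, re)) {
        return false;  // the whole line must fit the pattern
    }
    key = m[1].str();
    value = m[2].str();
    return true;
}
```

With the key extracted, key == "License" and key == "GeneralLicense" are plain string comparisons and cannot be confused with each other.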

You can check if the position at which the substring appears is at index zero, or that the character preceding the initial position is a space:
bool findAtWordBoundary(const std::string& line, const std::string& search) {
size_t pos = line.find(search);
return (pos != std::string::npos) && (pos== 0 || isspace(line[pos-1]));
}
Isn't there anything ROBUST (flag or something) which we can specify to look for the exact match?
In a way, find already looks for exact match. However, it treats a string as a sequence of meaningless numbers that represent individual characters. That is why std::string class lacks the concept of "full word", which is present in other parts of the library, such as regular expressions.
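If you want a "whole word" notion without regular expressions, one possible sketch is to check every occurrence and both of its neighbours. The helper name find_whole_word and the definition of a word character (letter, digit, or underscore) are assumptions made for this example.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: true if `word` occurs in `line` with no letter, digit,
// or '_' directly before or after it.
bool find_whole_word(const std::string& line, const std::string& word) {
    if (word.empty()) {
        return false;
    }
    auto is_word_char = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
    };
    for (std::string::size_type pos = line.find(word); pos != std::string::npos;
         pos = line.find(word, pos + 1)) {
        const std::string::size_type after = pos + word.size();
        const bool left_ok = pos == 0 || !is_word_char(line[pos - 1]);
        const bool right_ok = after == line.size() || !is_word_char(line[after]);
        if (left_ok && right_ok) {
            return true;  // this occurrence stands on its own
        }
    }
    return false;
}
```

Unlike a single find call, this keeps searching after a rejected occurrence, so a line containing both GeneralLicense and a standalone License is still reported.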

You could write a function that tests for the largest match first and then returns whatever information you want about the match.
Something a bit like:
// find the largest matching element from the set and return it
std::string find_one_of(std::set<std::string, std::greater<std::string>> const& tests, std::string const& s)
{
for(auto const& test: tests)
if(s.find(test) != std::string::npos)
return test;
return {};
}
int main()
{
std::string text = "abcdef";
auto found = find_one_of({"a", "abc", "ab"}, text);
std::cout << "found: " << found << '\n'; // prints "abc"
}

If all matches start at position 0 and none is a prefix of another, then the following might work:
if (CurrentReadLine.substr( 0, 7 ) == "License")

You can tokenize your string and do a full comparison with your search key and the tokens
Example:
#include <string>
#include <sstream>
#include <vector>
#include <iostream>
auto tokenizer(const std::string& line)
{
std::vector<std::string> results;
std::istringstream ss(line);
std::string s;
while(std::getline(ss, s, ' '))
results.push_back(s);
return results;
}
auto compare(const std::vector<std::string>& tokens, const std::string& key)
{
for (auto&& i : tokens)
if ( i == key )
return true;
return false;
}
int main()
{
std::string x = "License = \"12345\"";
auto token = tokenizer(x);
std::cout << compare(token, "License") << std::endl;
std::cout << compare(token, "GeneralLicense") << std::endl;
}

find alphabetic substring

I have the following strings, from which I want to extract only the alphabetic parts (alphabetic substrings) that are longer than one character:
% d. i.p.p. attendu --> attendu
après. expertise --> apr, expertise
n.c.p.c. condamner --> condamner
I am trying the following piece of code:
#include <regex>
#include <iostream>
void main()
{
const std::string s = "% d. i.p.p. attendu";
std::regex rgx("[a-zA-Z]{2,20}");
std::smatch match;
if (std::regex_search(s.begin(), s.end(), match, rgx))
std::cout << "match: " << match[1] << '\n';
}
But I get the following error when I run the code:
Terminate called after throwing an instance of 'std::regex_error' what(): regex_error
Can you please help me,
Thank you,
Hani.
Ok I managed to use boost since gcc's regex is an abomination.
#include <boost/regex.hpp>
void main()
{
const std::string s = "% d. i.p.p. tototo attendu";
boost::regex re("[a-zA-Z]{4,7}");
boost::smatch matches;
if( boost::regex_search( s, matches, re ) )
{
std::string value( matches[0].first, matches[0].second );
cout << value << " ";
}
}
Fine, I found attendu, but the output is only tototo. It's not incrementing.
The return value is "tototo attendu" I was wondering if I can return each value at a time instead of 1 string

I was wondering if I can return each value at a time instead of 1 string
The only way of doing this seems to be via regex_iterator. Here’s an example using Boost:
#include <boost/regex.hpp>
#include <iostream>
int main() {
const std::string s = "% d. i.p.p. tototo attendu";
boost::regex rgx("([a-zA-Z]{2,20})");
boost::smatch match;
boost::sregex_iterator begin{s.begin(), s.end(), rgx},
end{};
for (auto&& i = begin; i != end; ++i)
std::cout << "match: " << *i << '\n';
}
This yields:
match: tototo
match: attendu
Two things:
The return type of main is always int. Your code shouldn’t even compile.
I’ve added parentheses around your (first, which was correct!) regular expression so that it creates a capture for each match. The iterators then iterate over each match in turn.
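For readers who can use a standard library with a working <regex> (for libstdc++ that means GCC 4.9 or later), roughly the same loop can be written without Boost. This is a sketch; the helper name all_matches is hypothetical, and it collects the matches into a vector instead of printing them.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every non-overlapping match of `re` in `s`.
std::vector<std::string> all_matches(const std::string& s, const std::regex& re) {
    std::vector<std::string> out;
    for (std::sregex_iterator it(s.begin(), s.end(), re), end; it != end; ++it) {
        out.push_back(it->str());  // str() with no argument is the whole match
    }
    return out;
}
```

Because it->str() returns the whole match, this version works with or without a capture group in the pattern.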

Different behavior in C regex VS C++11 regex

I need code that splits math-notation permutations into their elements. Suppose the permutation string is one of:
"(1,2,5)(3,4)" or "(3,4)(1,2,5)" or "(3,4)(5,1,2)"
The patterns I've tried are these:
([0-9]+[ ]*,[ ]*)*[0-9]+ for each permutation cycle. This would split the "(1,2,5)(3,4)" string into two strings, "1,2,5" and "3,4".
([0-9]+) for each element in a cycle. This would split each cycle into individual numbers.
When I tried these patterns on this page they worked well. I've also used them with the C++11 regex library with good results:
#include <iostream>
#include <string>
#include <regex>
void elements(const std::string &input)
{
const std::regex ElementRegEx("[0-9]+");
for (std::sregex_iterator Element(input.begin(), input.end(), ElementRegEx); Element != std::sregex_iterator(); ++Element)
{
const std::string CurrentElement(*Element->begin());
std::cout << '\t' << CurrentElement << '\n';
}
}
void cycles(const std::string &input)
{
const std::regex CycleRegEx("([0-9]+[ ]*,[ ]*)*[0-9]+");
for (std::sregex_iterator Cycle(input.begin(), input.end(), CycleRegEx); Cycle != std::sregex_iterator(); ++Cycle)
{
const std::string CurrentCycle(*Cycle->begin());
std::cout << CurrentCycle << '\n';
elements(CurrentCycle);
}
}
int main(int argc, char **argv)
{
std::string input("(1,2,5)(3,4)");
std::cout << "input: " << input << "\n\n";
cycles(input);
return 0;
}
The Output compiling with Visual Studio 2010 (10.0):
input: (1,2,5)(3,4)
1,2,5
1
2
5
3,4
3
4
But unfortunately, I cannot use the C++11 tools on my project: the project will run on a Linux platform and must be compiled with gcc 4.2.3, so I'm forced to use the C regex library in the regex.h header. Using the same patterns but with a different library, I'm getting different results:
Here is the test code:
void elements(const std::string &input)
{
regex_t ElementRegEx;
regcomp(&ElementRegEx, "([0-9]+)", REG_EXTENDED);
regmatch_t ElementMatches[MAX_MATCHES];
if (!regexec(&ElementRegEx, input.c_str(), MAX_MATCHES, ElementMatches, 0))
{
int Element = 0;
while ((ElementMatches[Element].rm_so != -1) && (ElementMatches[Element].rm_eo != -1))
{
regmatch_t &ElementMatch = ElementMatches[Element];
const std::string CurrentElement(input.substr(ElementMatch.rm_so, ElementMatch.rm_eo - ElementMatch.rm_so));
std::cout << '\t' << CurrentElement << '\n';
++Element;
}
}
regfree(&ElementRegEx);
}
void cycles(const std::string &input)
{
regex_t CycleRegEx;
regcomp(&CycleRegEx, "([0-9]+[ ]*,[ ]*)*[0-9]+", REG_EXTENDED);
regmatch_t CycleMatches[MAX_MATCHES];
if (!regexec(&CycleRegEx, input.c_str(), MAX_MATCHES, CycleMatches, 0))
{
int Cycle = 0;
while ((CycleMatches[Cycle].rm_so != -1) && (CycleMatches[Cycle].rm_eo != -1))
{
regmatch_t &CycleMatch = CycleMatches[Cycle];
const std::string CurrentCycle(input.substr(CycleMatch.rm_so, CycleMatch.rm_eo - CycleMatch.rm_so));
std::cout << CurrentCycle << '\n';
elements(CurrentCycle);
++Cycle;
}
}
regfree(&CycleRegEx);
}
int main(int argc, char **argv)
{
cycles("(1,2,5)(3,4)");
return 0;
}
The expected output is the same as with C++11 regex, but the real output was:
input: (1,2,5)(3,4)
1,2,5
1
1
2,
2
2
Finally, the questions are:
Could someone give me a hint about where I'm misunderstanding the C regex engine?
Why is the behavior different in the C regex vs the C++ regex?

You're misunderstanding the output of regexec. The pmatch buffer (after pmatch[0]) is filled with sub-matches of the regex, not with consecutive matches in the string.
For example, if your regex is [a-z]([+ ])([0-9]) matched against x+5, then pmatch[0] will reference x+5 (the whole match), and pmatch[1] and pmatch[2] will reference + and 5 respectively.
You need to repeat the regexec in a loop, starting from the end of the previous match:
int start = 0;
while (!regexec(&ElementRegEx, input.c_str() + start, MAX_MATCHES, ElementMatches, 0))
{
regmatch_t &ElementMatch = ElementMatches[0];
std::string CurrentElement(input.substr(start + ElementMatch.rm_so, ElementMatch.rm_eo - ElementMatch.rm_so));
std::cout << '\t' << CurrentElement << '\n';
start += ElementMatch.rm_eo;
}
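Put together, the loop can be wrapped in a small function that works for both the cycle and the element pattern. This is only a sketch: the name find_all is hypothetical, it assumes a POSIX regex.h, and it adds two details the loop above leaves out (REG_NOTBOL for later searches and a guard against empty matches).

```cpp
#include <regex.h>
#include <string>
#include <vector>

// Hypothetical helper: collects all consecutive matches of a POSIX extended
// regex by restarting regexec after the end of the previous match.
std::vector<std::string> find_all(const std::string& input, const char* pattern) {
    std::vector<std::string> matches;
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0) {
        return matches;  // the pattern failed to compile
    }
    regmatch_t m;  // only the whole match (pmatch[0]) is needed
    std::string::size_type start = 0;
    while (start <= input.size() &&
           regexec(&re, input.c_str() + start, 1, &m,
                   start == 0 ? 0 : REG_NOTBOL) == 0) {
        const std::string::size_type from =
            start + static_cast<std::string::size_type>(m.rm_so);
        const std::string::size_type len =
            static_cast<std::string::size_type>(m.rm_eo - m.rm_so);
        matches.push_back(input.substr(from, len));
        // Continue after this match; step one extra character on an empty
        // match so the loop always advances.
        start = from + len + (len == 0 ? 1 : 0);
    }
    regfree(&re);
    return matches;
}
```

Calling find_all on the input with the cycle pattern, and then on each cycle with [0-9]+, reproduces the nesting of the C++11 version in the question.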

How do I check if a C++ std::string starts with a certain string, and convert a substring to an int?

How do I implement the following (Python pseudocode) in C++?
if argv[1].startswith('--foo='):
foo_value = int(argv[1][len('--foo='):])
(For example, if argv[1] is --foo=98, then foo_value is 98.)
Update: I'm hesitant to look into Boost, since I'm just looking at making a very small change to a simple little command-line tool (I'd rather not have to learn how to link in and use Boost for a minor change).

Use rfind overload that takes the search position pos parameter, and pass zero for it:
std::string s = "tititoto";
if (s.rfind("titi", 0) == 0) { // pos=0 limits the search to the prefix
// s starts with prefix
}
Who needs anything else? Pure STL!
Many have misread this to mean "search backwards through the whole string looking for the prefix". That would give the wrong result (e.g. string("tititito").rfind("titi") returns 2 so when compared against == 0 would return false) and it would be inefficient (looking through the whole string instead of just the start). But it does not do that because it passes the pos parameter as 0, which limits the search to only match at that position or earlier. For example:
std::string test = "0123123";
size_t match1 = test.rfind("123"); // returns 4 (rightmost match)
size_t match2 = test.rfind("123", 2); // returns 1 (skipped over later match)
size_t match3 = test.rfind("123", 0); // returns std::string::npos (i.e. not found)
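Applied to the original question, the rfind check can be combined with std::stoi. The following is a sketch, not a drop-in: the function name parse_foo_arg is hypothetical, and rejecting trailing characters after the number is a design choice made for this example.

```cpp
#include <cstddef>
#include <exception>
#include <string>

// Hypothetical helper: if `arg` starts with "--foo=", parses the rest as an
// int and stores it in `value`. Returns false on any mismatch or parse error.
bool parse_foo_arg(const std::string& arg, int& value) {
    const std::string prefix = "--foo=";
    if (arg.rfind(prefix, 0) != 0) {
        return false;  // prefix not found at position 0
    }
    const std::string digits = arg.substr(prefix.size());
    try {
        std::size_t used = 0;
        const int parsed = std::stoi(digits, &used);
        if (used != digits.size()) {
            return false;  // trailing characters after the number
        }
        value = parsed;
        return true;
    } catch (const std::exception&) {
        return false;  // std::invalid_argument or std::out_of_range
    }
}
```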

You would do it like this:
std::string prefix("--foo=");
if (!arg.compare(0, prefix.size(), prefix))
foo_value = std::stoi(arg.substr(prefix.size()));
Looking for a lib such as Boost.ProgramOptions that does this for you is also a good idea.

Just for completeness, I will mention the C way to do it:
If str is your original string and substr is the substring you want to check, then
strncmp(str, substr, strlen(substr))
will return 0 if str starts with substr. The functions strncmp and strlen are in the C header file <string.h>.
(originally posted by Yaseen Rauf here, markup added)
For a case-insensitive comparison, use the non-standard strnicmp (or the POSIX strncasecmp) instead of strncmp.
This is the C way to do it, for C++ strings you can use the same function like this:
strncmp(str.c_str(), substr.c_str(), substr.size())
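To finish the job from the question in the same C style, the prefix check can be followed by strtol, which reports where parsing stopped. This is a sketch under assumptions: the name parse_foo_cstr is hypothetical, and the prefix is hard-coded for brevity.

```cpp
#include <cerrno>
#include <climits>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: checks the "--foo=" prefix with strncmp and converts
// the remainder with strtol, rejecting empty, partial, or out-of-range input.
bool parse_foo_cstr(const char* arg, int& value) {
    static const char prefix[] = "--foo=";
    const std::size_t n = sizeof(prefix) - 1;  // length without the terminator
    if (std::strncmp(arg, prefix, n) != 0) {
        return false;
    }
    const char* digits = arg + n;
    char* end = nullptr;
    errno = 0;
    const long parsed = std::strtol(digits, &end, 10);
    if (end == digits || *end != '\0') {
        return false;  // nothing parsed, or garbage after the number
    }
    if (errno == ERANGE || parsed < INT_MIN || parsed > INT_MAX) {
        return false;  // does not fit in an int
    }
    value = static_cast<int>(parsed);
    return true;
}
```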

If you're already using Boost, you can do it with boost string algorithms + boost lexical cast:
#include <boost/algorithm/string/predicate.hpp>
#include <boost/lexical_cast.hpp>
try {
if (boost::starts_with(argv[1], "--foo="))
foo_value = boost::lexical_cast<int>(argv[1]+6);
} catch (boost::bad_lexical_cast) {
// bad parameter
}
This kind of approach, like many of the other answers provided here is ok for very simple tasks, but in the long run you are usually better off using a command line parsing library. Boost has one (Boost.Program_options), which may make sense if you happen to be using Boost already.
Otherwise a search for "c++ command line parser" will yield a number of options.

Code I use myself:
std::string prefix = "-param=";
std::string argument = argv[1];
if(argument.substr(0, prefix.size()) == prefix) {
std::string argumentValue = argument.substr(prefix.size());
}

Nobody has used the STL algorithm std::mismatch yet. If the following expression is true, prefix is a prefix of toCheck:
std::mismatch(prefix.begin(), prefix.end(), toCheck.begin()).first == prefix.end()
Full example prog:
#include <algorithm>
#include <string>
#include <iostream>
int main(int argc, char** argv) {
if (argc != 3) {
std::cerr << "Usage: " << argv[0] << " prefix string" << std::endl
<< "Will print true if 'prefix' is a prefix of string" << std::endl;
return -1;
}
std::string prefix(argv[1]);
std::string toCheck(argv[2]);
if (prefix.length() > toCheck.length()) {
std::cerr << "Usage: " << argv[0] << " prefix string" << std::endl
<< "'prefix' is longer than 'string'" << std::endl;
return 2;
}
if (std::mismatch(prefix.begin(), prefix.end(), toCheck.begin()).first == prefix.end()) {
std::cout << '"' << prefix << '"' << " is a prefix of " << '"' << toCheck << '"' << std::endl;
return 0;
} else {
std::cout << '"' << prefix << '"' << " is NOT a prefix of " << '"' << toCheck << '"' << std::endl;
return 1;
}
}
Edit:
As @James T. Huggett suggests, std::equal is a better fit for the question "Is A a prefix of B?" and makes for slightly shorter code:
std::equal(prefix.begin(), prefix.end(), toCheck.begin())
Full example prog:
#include <algorithm>
#include <string>
#include <iostream>
int main(int argc, char **argv) {
if (argc != 3) {
std::cerr << "Usage: " << argv[0] << " prefix string" << std::endl
<< "Will print true if 'prefix' is a prefix of string"
<< std::endl;
return -1;
}
std::string prefix(argv[1]);
std::string toCheck(argv[2]);
if (prefix.length() > toCheck.length()) {
std::cerr << "Usage: " << argv[0] << " prefix string" << std::endl
<< "'prefix' is longer than 'string'" << std::endl;
return 2;
}
if (std::equal(prefix.begin(), prefix.end(), toCheck.begin())) {
std::cout << '"' << prefix << '"' << " is a prefix of " << '"' << toCheck
<< '"' << std::endl;
return 0;
} else {
std::cout << '"' << prefix << '"' << " is NOT a prefix of " << '"'
<< toCheck << '"' << std::endl;
return 1;
}
}

With C++17 you can use std::basic_string_view & with C++20 std::basic_string::starts_with or std::basic_string_view::starts_with.
The benefit of std::string_view in comparison to std::string - regarding memory management - is that it only holds a pointer to a "string" (contiguous sequence of char-like objects) and knows its size. Example without moving/copying the source strings just to get the integer value:
#include <exception>
#include <iostream>
#include <string>
#include <string_view>
int main()
{
constexpr auto argument = "--foo=42"; // Emulating command argument.
constexpr auto prefix = "--foo=";
auto inputValue = 0;
constexpr auto argumentView = std::string_view(argument);
if (argumentView.starts_with(prefix))
{
constexpr auto prefixSize = std::string_view(prefix).size();
try
{
// The underlying data of argumentView is nul-terminated, therefore we can use data().
inputValue = std::stoi(argumentView.substr(prefixSize).data());
}
catch (std::exception & e)
{
std::cerr << e.what();
}
}
std::cout << inputValue; // 42
}
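The example above depends on the view's data being nul-terminated. Where that cannot be guaranteed, a C++17 alternative is std::from_chars, which takes an explicit end pointer. The sketch below uses a hypothetical helper, parse_prefixed_int, and a substr comparison instead of starts_with so that it does not need C++20.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: returns the integer that follows `prefix` in `arg`,
// or std::nullopt if the prefix is missing or the rest is not a clean int.
std::optional<int> parse_prefixed_int(std::string_view arg, std::string_view prefix) {
    if (arg.substr(0, prefix.size()) != prefix) {
        return std::nullopt;
    }
    arg.remove_prefix(prefix.size());
    int value = 0;
    const char* first = arg.data();
    const char* last = first + arg.size();
    const auto result = std::from_chars(first, last, value);
    if (result.ec != std::errc() || result.ptr != last) {
        return std::nullopt;  // parse error, overflow, or trailing characters
    }
    return value;
}
```

Note that std::from_chars does not accept a leading plus sign or leading whitespace, which is stricter than std::stoi.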

Given that both strings, argv[1] and "--foo", are C strings, @FelixDombek's answer is hands-down the best solution.
Seeing the other answers, however, I thought it worth noting that, if your text is already available as a std::string, then a simple, zero-copy, maximally efficient solution exists that hasn't been mentioned so far:
const char * foo = "--foo";
if (text.rfind(foo, 0) == 0)
foo_value = text.substr(strlen(foo));
And if foo is already a string:
std::string foo("--foo");
if (text.rfind(foo, 0) == 0)
foo_value = text.substr(foo.length());

Starting with C++20, you can use the starts_with method.
std::string s = "abcd";
if (s.starts_with("abc")) {
...
}

text.substr(0, start.length()) == start

Using STL this could look like:
std::string prefix = "--foo=";
std::string arg = argv[1];
if (prefix.size()<=arg.size() && std::equal(prefix.begin(), prefix.end(), arg.begin())) {
std::istringstream iss(arg.substr(prefix.size()));
iss >> foo_value;
}

At the risk of being flamed for using C constructs, I do think this sscanf example is more elegant than most Boost solutions. And you don't have to worry about linkage if you're running anywhere that has a Python interpreter!
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv)
{
for (int i = 1; i != argc; ++i) {
int number = 0;
int size = 0;
sscanf(argv[i], "--foo=%d%n", &number, &size);
if (size == strlen(argv[i])) {
printf("number: %d\n", number);
}
else {
printf("not-a-number\n");
}
}
return 0;
}
Here's some example output that demonstrates the solution handles leading/trailing garbage as correctly as the equivalent Python code, and more correctly than anything using atoi (which will erroneously ignore a non-numeric suffix).
$ ./scan --foo=2 --foo=2d --foo='2 ' ' --foo=2'
number: 2
not-a-number
not-a-number
not-a-number

I use std::string::compare wrapped in utility method like below:
static bool startsWith(const string& s, const string& prefix) {
return s.size() >= prefix.size() && s.compare(0, prefix.size(), prefix) == 0;
}

C++20 update :
Use std::string::starts_with
https://en.cppreference.com/w/cpp/string/basic_string/starts_with
std::string str_value = /* smthg */;
const auto starts_with_foo = str_value.starts_with(std::string_view{"foo"});

In C++20 now there is starts_with available as a member function of std::string defined as:
constexpr bool starts_with(string_view sv) const noexcept;
constexpr bool starts_with(CharT c) const noexcept;
constexpr bool starts_with(const CharT* s) const;
So your code could be something like this:
std::string s{argv[1]};
if (s.starts_with("--foo="))

In case you need C++11 compatibility and cannot use boost, here is a boost-compatible drop-in with an example of usage:
#include <iostream>
#include <string>
static bool starts_with(const std::string str, const std::string prefix)
{
return ((prefix.size() <= str.size()) && std::equal(prefix.begin(), prefix.end(), str.begin()));
}
int main(int argc, char* argv[])
{
bool usage = false;
unsigned int foos = 0; // default number of foos if no parameter was supplied
if (argc > 1)
{
const std::string fParamPrefix = "-f="; // shorthand for foo
const std::string fooParamPrefix = "--foo=";
for (unsigned int i = 1; i < argc; ++i)
{
const std::string arg = argv[i];
try
{
if ((arg == "-h") || (arg == "--help"))
{
usage = true;
} else if (starts_with(arg, fParamPrefix)) {
foos = std::stoul(arg.substr(fParamPrefix.size()));
} else if (starts_with(arg, fooParamPrefix)) {
foos = std::stoul(arg.substr(fooParamPrefix.size()));
}
} catch (std::exception& e) {
std::cerr << "Invalid parameter: " << argv[i] << std::endl << std::endl;
usage = true;
}
}
}
if (usage)
{
std::cerr << "Usage: " << argv[0] << " [OPTION]..." << std::endl;
std::cerr << "Example program for parameter parsing." << std::endl << std::endl;
std::cerr << " -f, --foo=N use N foos (optional)" << std::endl;
return 1;
}
std::cerr << "number of foos given: " << foos << std::endl;
}

Why not use GNU getopt_long? Here's a basic example (without safety checks):
#include <getopt.h>
#include <stdio.h>
int main(int argc, char** argv)
{
option long_options[] = {
{"foo", required_argument, 0, 0},
{0,0,0,0}
};
getopt_long(argc, argv, "f:", long_options, 0);
printf("%s\n", optarg);
}
For the following command:
$ ./a.out --foo=33
You will get
33

Ok why the complicated use of libraries and stuff? C++ String objects overload the [] operator, so you can just compare chars.. Like what I just did, because I want to list all files in a directory and ignore invisible files and the .. and . pseudofiles.
while ((ep = readdir(dp)))
{
string s(ep->d_name);
if (!(s[0] == '.')) // Omit invisible files and .. or .
files.push_back(s);
}
It's that simple..

You can also use strstr:
if (strstr(str, substr) == substr) {
// 'str' starts with 'substr'
}
but I think it's good only for short strings because it has to loop through the whole string when the string doesn't actually start with 'substr'.

You can use std::string's find() and find_first_of():
Example using find to find a single char:
#include <string>
std::string name = "Aaah";
size_t found_index = name.find('a');
if (found_index != std::string::npos) {
// Found string containing 'a'
}
Example using find to find a single char, starting the search from position 3:
std::string name = "Aaah";
size_t found_index = name.find('h', 3);
if (found_index != std::string::npos) {
// Found string containing 'h'
}
Example using the find_first_of() and only the first char, to search at the start only:
std::string name = ".hidden._di.r";
size_t found_index = name.find_first_of('.');
if (found_index == 0) {
// Found '.' at first position in string
}
Good luck!

std::string text = "--foo=98";
std::string start = "--foo=";
if (text.find(start) == 0)
{
int n = stoi(text.substr(start.length()));
std::cout << n << std::endl;
}

Since C++11, std::regex_search can also be used to match even more complex expressions. The following example also handles floating-point numbers through std::stof and a subsequent cast to int.
However, the parseInt function shown below throws a std::invalid_argument exception if the prefix is not matched; this can be easily adapted depending on the given application:
#include <iostream>
#include <regex>
int parseInt(const std::string &str, const std::string &prefix) {
std::smatch match;
std::regex_search(str, match, std::regex("^" + prefix + "([+-]?(?=\\.?\\d)\\d*(?:\\.\\d*)?(?:[Ee][+-]?\\d+)?)$"));
return std::stof(match[1]);
}
int main() {
std::cout << parseInt("foo=13.3", "foo=") << std::endl;
std::cout << parseInt("foo=-.9", "foo=") << std::endl;
std::cout << parseInt("foo=+13.3", "foo=") << std::endl;
std::cout << parseInt("foo=-0.133", "foo=") << std::endl;
std::cout << parseInt("foo=+00123456", "foo=") << std::endl;
std::cout << parseInt("foo=-06.12e+3", "foo=") << std::endl;
// throw std::invalid_argument
// std::cout << parseInt("foo=1", "bar=") << std::endl;
return 0;
}
The regex pattern is explained in detail in the following answer.
EDIT: the previous answer did not perform the conversion to integer.
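One possible adaptation, sketched here, reports failure through the return value instead of an exception. The name tryParseInt is hypothetical, the prefix is assumed to contain no regex metacharacters, and the range check before the cast is an addition that the original does not have.

```cpp
#include <climits>
#include <exception>
#include <regex>
#include <string>

// Hypothetical variant of parseInt: returns false when the prefix or the
// number does not match, and truncates a fractional value toward zero.
bool tryParseInt(const std::string& str, const std::string& prefix, int& out) {
    const std::regex re("^" + prefix +
                        "([+-]?(?=\\.?\\d)\\d*(?:\\.\\d*)?(?:[Ee][+-]?\\d+)?)$");
    std::smatch match;
    if (!std::regex_search(str, match, re)) {
        return false;  // prefix or number format did not match
    }
    try {
        const double parsed = std::stod(match[1].str());
        if (parsed < INT_MIN || parsed > INT_MAX) {
            return false;  // would not fit in an int
        }
        out = static_cast<int>(parsed);
        return true;
    } catch (const std::exception&) {
        return false;  // e.g. std::out_of_range from std::stod
    }
}
```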

if(boost::starts_with(string_to_search, string_to_look_for))
intval = boost::lexical_cast<int>(string_to_search.substr(string_to_look_for.length()));
This is completely untested. The principle is the same as the Python one. Requires Boost.StringAlgo and Boost.LexicalCast.
Check if the string starts with the other string, and then get the substring ('slice') of the first string and convert it using lexical cast.
