Boost::xpressive regex_search concatenate matches in a single string - c++

I want to concatenate all matches found by regex_search into a single string, and then return it. I tried doing it with std::accumulate, but failed.
Is there a way to return something like std::accumulate(what.begin()+1, what.end(), someFunc)?
I'm not very familiar with functional programming. I know that I can make a for loop that adds strings together, but I want to try doing it otherwise. Thanks in advance!
Here is a pseudo-code snippet that might help you better understand what I want to do.
std::string foo(const std::string& text)
{
using namespace boost::xpressive;
sregex srx = +_d >> as_xpr("_") >> +_d; // some random regex
smatch what;
if (regex_search(text, what, srx))
{
// Here I want to return a string,
// concatenated from what[1].str() + what[2].str() + ... + what[n].str();
// How do I do this?
// What about what[1].str() + "-" + what[2].str()...?
}
return std::string();
}
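One way to express the fold with std::accumulate is sketched below. It uses std::regex rather than Boost.Xpressive so that it depends only on the standard library; the helper name join_submatches and its separator parameter are made up for illustration. Xpressive's match_results can be iterated in a similar way, so the same fold should carry over, but that is an assumption and was not checked here.

```cpp
#include <numeric>
#include <regex>
#include <string>

// Hypothetical helper: joins sub-matches 1..n of `what`, putting `sep` between them.
std::string join_submatches(const std::smatch& what, const std::string& sep = "")
{
    if (what.size() < 2)
        return std::string();  // no capture groups to join

    // Seed the fold with group 1, then append sep + each remaining group.
    return std::accumulate(
        what.begin() + 2, what.end(), what[1].str(),
        [&sep](std::string acc, const std::ssub_match& m) {
            acc += sep;
            acc += m.str();
            return acc;
        });
}
```

Seeding the fold with what[1] avoids a leading separator. Note that the pattern needs capture groups, e.g. (\d+)_(\d+), for what[1] and what[2] to exist at all; the xpressive regex in the question has none.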

Related

How can I use the C++ regex library to find a match and *then* replace it?

I am writing what amounts to a tiny DSL in which each script is read from a single string, like this:
"func1;func2;func1;4*func3;func1"
I need to expand the loops, so that the expanded script is:
"func1;func2;func1;func3;func3;func3;func3;func1"
I have used the C++ standard regex library with the following regex to find those loops:
regex REGEX_SIMPLE_LOOP("(:?[0-9]+)\\*([_a-zA-Z][_a-zA-Z0-9]*;)");
smatch match;
bool found = std::regex_search(*this, match, std::regex(REGEX_SIMPLE_LOOP));
Now, it's not too difficult to read out the loop multiplier and print the function N times, but how do I then replace the original match with this string? I want to do this:
if (found) match[0].replace(new_string);
But I don't see that the library can do this.
My backup plan is to call regex_search, construct the new string, and then call regex_replace, but doing two full searches like that seems clunky and inefficient. Is there a cleaner way?
You can also skip regex entirely: the parsing isn't too difficult, so regex might be overkill.
Demo here: https://onlinegdb.com/RXLqLtrUQ-
(and yes my output gives an extra ; at the end)
#include <cctype>
#include <string>
#include <sstream>
#include <iostream>
int main()
{
std::istringstream is{ "func1;func2;func1;4*func3;func1" };
std::string split;
// use getline to split
while (std::getline(is, split, ';'))
{
// assume 1 repeat
std::size_t count = 1;
// if split part starts with a digit
if (!split.empty() && std::isdigit(static_cast<unsigned char>(split.front())))
{
// look for a *
auto pos = split.find('*');
// the first part of the string contains the repeat count
auto count_str = split.substr(0, pos);
// convert that to a value
count = std::stoi(count_str);
// and keep the rest ("funcn")
split = split.substr(pos + 1, split.size() - pos - 1);
}
// now use the repeat count to build the output string
for (std::size_t n = 0; n < count; ++n)
{
std::cout << split << ";";
}
}
// TODO invalid input string handling.
return 0;
}
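For the regex route the question asks about, a single search per loop is enough: std::match_results reports where the match sits through position() and length(), and std::string::replace can then overwrite exactly that span. The sketch below assumes well-formed input in which every loop statement ends with ';'; the function name expand_loops is made up for illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: expands every "N*name;" in `script` into N copies of "name;".
// Assumes each loop statement is terminated by ';'.
std::string expand_loops(std::string script)
{
    static const std::regex loop(R"(([0-9]+)\*([_a-zA-Z][_a-zA-Z0-9]*);)");
    std::smatch m;
    std::size_t from = 0;  // offset at which the next search starts

    while (std::regex_search(script.cbegin() + static_cast<std::ptrdiff_t>(from),
                             script.cend(), m, loop)) {
        const std::size_t count = std::stoul(m[1].str());
        const std::string call = m[2].str() + ";";

        std::string expanded;
        for (std::size_t i = 0; i < count; ++i)
            expanded += call;

        // position() is relative to the start of this search, so add `from`.
        const std::size_t at = from + static_cast<std::size_t>(m.position(0));
        const std::size_t len = static_cast<std::size_t>(m.length(0));
        script.replace(at, len, expanded);  // invalidates the iterators held by m
        from = at + expanded.size();        // resume right after the inserted text
    }
    return script;
}
```

Everything needed from the match is copied out before replace() is called, because the replacement invalidates the iterators stored in the match object.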

C++ regex replace whole word

I have a small game to do in which I need to sometimes replace some group of characters with the name of the player in the sentences.
For example, I could have a sentence like :
"[Player]! Are you okay? A plane crash happened, it's on fire!"
And I need to replace the "[Player]" with some name contained in a std::string.
I have been looking for about 20 minutes in other SO questions and in the CPP reference and I really can't understand how to use the regex.
I would like to know how I can replace all instances of the "[Player]" string in a std::string.
Personally I would not use regex for this. A simple search and replace should be enough.
These are (roughly) the functions I use:
#include <iostream>
#include <string>
// change the string in-place
std::string& replace_all_mute(std::string& s,
const std::string& from, const std::string& to)
{
if(!from.empty())
// find() returns npos when nothing is left; npos + 1 wraps to 0 and ends the loop
for(std::size_t pos = 0; (pos = s.find(from, pos) + 1); pos += to.size())
s.replace(--pos, from.size(), to);
return s;
}
// return a copy of the string
std::string replace_all_copy(std::string s,
const std::string& from, const std::string& to)
{
return replace_all_mute(s, from, to);
}
int main()
{
std::string s = "[Player]! Are you okay? A plane crash happened, it's on fire!";
replace_all_mute(s, "[Player]", "Uncle Bob");
std::cout << s << '\n';
}
Output:
Uncle Bob! Are you okay? A plane crash happened, it's on fire!
Regex is meant for more complex patterns. Consider, for example, that instead of simply matching [Player], you wanted to match anything between brackets. That would be a good use for regex.
Following is an example that does just that. Unfortunately, the interface of <regex> is not flexible enough to enable dynamic replacements, so we have to implement the actual replacing ourselves.
#include <iostream>
#include <map>
#include <regex>
#include <string>
int main() {
// Anything stored here can be replaced in the string.
std::map<std::string, std::string> vars {
{"Player1", "Bill"},
{"Player2", "Ted"}
};
// Matches anything between brackets.
std::regex r(R"(\[([^\]]+?)\])");
std::string str = "[Player1], [Player1]! Are you okay? [Player2] said that a plane crash happened!";
// We need to keep track of where we are, or else we would need to search from the start of
// the string every time, which is very wasteful.
// std::regex_iterator won't help, because the replacement may be smaller
// than the match, and it would cause strings like "[Player1][Player1]" to not match properly.
auto pos=str.cbegin();
do {
// First, we try to get a match. If there's no more matches, exit.
std::smatch m;
regex_search(pos, str.cend(), m, r);
if (m.empty()) break;
// The interface of std::match_results is terrible. Let's get what we need and
// place it in appropriately named variables.
auto var_name = m[1].str();
auto start = m[0].first;
auto end = m[0].second;
auto value = vars[var_name];
// Remember the match offset: replace() invalidates iterators into str.
auto offset = start - str.cbegin();
// This does the actual replacement
str.replace(start, end, value);
// We update our position. The new search will start right at the end of the replacement.
pos = str.cbegin() + offset + value.size();
} while(true);
std::cout << str;
}
Output:
Bill, Bill! Are you okay? Ted said that a plane crash happened!
See it live on Coliru
Simply find and replace, e.g. boost::replace_all()
#include <boost/algorithm/string.hpp>
std::string target("[Player]! Are you okay? A plane crash happened, it's on fire!");
boost::replace_all(target, "[Player]", "NiNite");
As some people have mentioned, find and replace might be more useful for this scenario. You could do something like this (note that it replaces only the first occurrence):
std::string name = "Bill";
std::string strToFind = "[Player]";
std::string str = "[Player]! Are you okay? A plane crash happened, it's on fire!";
std::size_t pos = str.find(strToFind);
if (pos != std::string::npos) // replace() would throw if the tag were missing
str.replace(pos, strToFind.length(), name);
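Since the question asked specifically how to do this with regex, here is a minimal sketch using std::regex_replace, which replaces every match in one call. The function name fill_player is made up for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replaces every literal "[Player]" in `text` with `name`.
std::string fill_player(const std::string& text, const std::string& name)
{
    // '[' and ']' are regex metacharacters, so they are escaped in the pattern.
    static const std::regex tag(R"(\[Player\])");
    return std::regex_replace(text, tag, name);
}
```

One caveat: the replacement argument is a format string, so a '$' inside `name` would be interpreted as a back-reference marker rather than copied literally. For a fixed token like this, the plain find-and-replace answers above avoid that issue.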

Conditional replace string using boost::regex_replace

I want to simplify the signs in a mathematical expression using regex_replace, here is a sample code:
string entry="6+-3++5";
boost::regex signs("[-+]+");
cout<<boost::regex_replace(entry,signs,"?")<<endl;
The output is then 6?3?5. My question is: How can I get the proper result of 6-3+5 with some neat regular expression tools? Thanks a lot.
I tried something else with sregex_iterator and smatch, but it still has a problem:
string s="63--17--42+5555";
collect_sign(s);
Output is
63+17--42+5555+42+5555+5555
i.e.
63+(17--42+5555)+(42+5555)+5555
It seems to me that the problem is related to match.suffix(). Could anybody help, please? The collect_sign function basically just iterates through every run of signs, converts it to "-" if the number of "-" is odd and to "+" if it is even, and then appends the suffix that follows the run.
void collect_sign(string& entry)
{
boost::regex signs("[-+]+");
string output="";
auto signs_begin = boost::sregex_iterator(entry.begin(), entry.end(), signs);
auto signs_end = boost::sregex_iterator();
for (boost::sregex_iterator it = signs_begin; it != signs_end; ++it)
{
boost::smatch match = *it;
if (it ==signs_begin)
output+=match.prefix().str();
string match_signs = match.str();
int n_minus=count(match_signs.begin(),match_signs.end(),'-');
if (n_minus%2==0)
output+="+";
else
output+="-";
output+=match.suffix();
}
cout<<"simplify to: "<<output<<endl;
}
Use:
[+\-*\/]*([+\-*\/])
Replace:
$1
You can test here
If you just want a mathematical simplification, you can use:
s = boost::regex_replace(s, boost::regex("(?:\\+\\+|--)"), "+", boost::format_all);
s = boost::regex_replace(s, boost::regex("(?:\\+-|-\\+)"), "-", boost::format_all);
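The duplicated output in the question comes from match.suffix(), which is everything from the end of the current match to the end of the input, so each iteration appends the whole remainder again. A possible fix, sketched here with std::regex instead of boost::regex (the iterator interface is assumed to be equivalent), is to copy only the text between consecutive matches. The name simplify_signs is made up for illustration.

```cpp
#include <algorithm>
#include <regex>
#include <string>

// Hypothetical helper: collapses each run of '+'/'-' into a single sign,
// '-' if the run holds an odd number of minus signs, '+' otherwise.
std::string simplify_signs(const std::string& entry)
{
    static const std::regex signs("[-+]+");
    std::string output;
    auto last = entry.cbegin();  // end of the previous match

    for (std::sregex_iterator it(entry.cbegin(), entry.cend(), signs), end;
         it != end; ++it) {
        const std::smatch& match = *it;
        // Copy only the text between the previous run and this one.
        output.append(last, match[0].first);
        const auto minus = std::count(match[0].first, match[0].second, '-');
        output += (minus % 2 == 0) ? '+' : '-';
        last = match[0].second;
    }
    output.append(last, entry.cend());  // tail after the final run
    return output;
}
```

With this approach the suffix is appended exactly once, after the loop, instead of once per match.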

C++ Get String between two delimiter String

Is there any built-in function to get the string between two delimiter strings in C/C++?
My input looks like
_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_
And my output should be
_0_192.168.1.18_
Thanks in advance...
You can do it like this:
string str = "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
size_t first = str.find("STARTDELIMITER");
size_t last = str.find("STOPDELIMITER");
// note: the result still begins with STARTDELIMITER itself
string strNew = str.substr(first, last - first);
This assumes that STOPDELIMITER occurs only once, at the end.
EDIT:
As the delimiter can occur multiple times, change the statement that finds STOPDELIMITER to:
size_t last = str.rfind("STOPDELIMITER");
This gets you the text between the first STARTDELIMITER and the last STOPDELIMITER, even if they are repeated multiple times.
I have no idea why the top answer received as many votes as it did, when the question clearly asks how to get the string between two delimiter strings, not between a pair of characters.
To do that, you need to account for the length of the start delimiter, since it is not just a single character.
Case 1: Both delimiters are unique:
Given a string _STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_ that you want to extract _0_192.168.1.18_ from, you could modify the top answer like so to get the desired effect. This is the simplest solution without introducing extra dependencies (e.g Boost):
#include <iostream>
#include <string>
std::string get_str_between_two_str(const std::string &s,
const std::string &start_delim,
const std::string &stop_delim)
{
std::size_t first_delim_pos = s.find(start_delim);
if (first_delim_pos == std::string::npos)
return ""; // start delimiter not found
std::size_t end_pos_of_first_delim = first_delim_pos + start_delim.length();
std::size_t last_delim_pos = s.find(stop_delim);
if (last_delim_pos == std::string::npos)
return ""; // stop delimiter not found
return s.substr(end_pos_of_first_delim,
last_delim_pos - end_pos_of_first_delim);
}
int main() {
// Want to extract _0_192.168.1.18_
std::string s = "_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_";
std::string s2 = "ABC123_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_XYZ345";
std::string start_delim = "_STARTDELIMITER";
std::string stop_delim = "STOPDELIMITER_";
std::cout << get_str_between_two_str(s, start_delim, stop_delim) << std::endl;
std::cout << get_str_between_two_str(s2, start_delim, stop_delim) << std::endl;
return 0;
}
Will print _0_192.168.1.18_ twice.
The second argument to std::string::substr is a length, so it must be computed as last_delim_pos - (first_delim_pos + start_delim.length()). This ensures that the inner string is still extracted correctly when the start delimiter is not located at the very beginning of the string, as the second call above demonstrates.
See the demo.
Case 2: Unique first delimiter, non-unique second delimiter:
Say you want to get the string between a unique delimiter and the first occurrence of a non-unique delimiter after it. You could modify the above function get_str_between_two_str so that the search for the stop delimiter starts right after the start delimiter. (Do not use find_first_of here: it matches any single character from its argument, not the whole delimiter string.)
std::string get_str_between_two_str(const std::string &s,
const std::string &start_delim,
const std::string &stop_delim)
{
std::size_t first_delim_pos = s.find(start_delim);
std::size_t end_pos_of_first_delim = first_delim_pos + start_delim.length();
std::size_t last_delim_pos = s.find(stop_delim, end_pos_of_first_delim);
return s.substr(end_pos_of_first_delim,
last_delim_pos - end_pos_of_first_delim);
}
If instead you want to capture everything between the first unique delimiter and the last occurrence of the second delimiter, as the asker commented above, use rfind instead.
Case 3: Non-unique first delimiter, unique second delimiter:
Very similar to case 2, just reverse the logic between the first delimiter and second delimiter.
Case 4: Both delimiters are not unique:
Again, very similar to case 2. Make a container to capture all the strings found between the two delimiters. Loop through the string, moving the search position past each stop delimiter when it is encountered, and add the string in between to the container. Repeat until std::string::npos is returned.
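A minimal sketch of the loop described in case 4 might look like the following. The function name all_between is made up for illustration, and this version resumes the search after each stop delimiter, which is one possible reading of the description above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collects every substring that sits between an occurrence
// of start_delim and the next occurrence of stop_delim.
std::vector<std::string> all_between(const std::string& s,
                                     const std::string& start_delim,
                                     const std::string& stop_delim)
{
    std::vector<std::string> found;
    if (start_delim.empty() || stop_delim.empty())
        return found;  // empty delimiters would match everywhere

    std::size_t pos = 0;
    while (true) {
        const std::size_t start = s.find(start_delim, pos);
        if (start == std::string::npos) break;
        const std::size_t inner = start + start_delim.length();
        const std::size_t stop = s.find(stop_delim, inner);
        if (stop == std::string::npos) break;
        found.push_back(s.substr(inner, stop - inner));
        pos = stop + stop_delim.length();  // continue after this stop delimiter
    }
    return found;
}
```

An unmatched start delimiter at the end of the input is simply ignored here; whether that should be an error depends on the data.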
To get the string between two delimiter strings that are not separated from it by whitespace:
string str = "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
string startDEL = "STARTDELIMITER";
string stopDEL = "STOPDELIMITER";
size_t firstLim = str.find(startDEL);
size_t lastLim = str.find(stopDEL);
// substr takes (position, count), so pass a length, not the end position
string strNew = str.substr(firstLim, lastLim - firstLim);
// strNew still begins with the start delimiter, so
// start the final substring after that delimiter
strNew = strNew.substr(startDEL.size());
I tried combining the two substring functions but it started printing the STOPDELIMITER
Hope that helps
I hope you won't mind that I'm answering by pointing to another question :)
I would use boost::split or boost::iter_split.
http://www.boost.org/doc/libs/1_54_0/doc/html/string_algo/usage.html#idp166856528
For example code see this SO question:
How to avoid empty tokens when splitting with boost::iter_split?
Let's say you need to get the 6th field (brand) from the output below:
zoneid:zonename:state:zonepath:uuid:brand:ip-type:r/w:file-mac-profile
A single "str.find" call won't do, because the field is in the middle, but you can use 'strtok', e.g.:
char *brand;
brand = strtok( line, ":" ); // first field (zoneid)
for (int i=0;i<5;i++) { // advance five more fields to reach brand
brand = strtok( NULL, ":" );
}
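Keep in mind that strtok modifies the buffer it scans and treats consecutive delimiters as one, so empty fields are skipped. If either of those matters, a std::getline-based alternative is sketched below; the function name nth_field and its 0-based index are assumptions made for illustration.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: returns field `index` (0-based) of `line`,
// or an empty string if the line has fewer fields.
std::string nth_field(const std::string& line, std::size_t index, char delim = ':')
{
    std::istringstream in(line);
    std::string field;
    for (std::size_t i = 0; std::getline(in, field, delim); ++i)
        if (i == index)
            return field;
    return std::string();
}
```

With the line above, brand is the 6th field, so it would be requested as index 5.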
This is a late answer, but this might work too:
string strgOrg= "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
string strg= strgOrg;
strg.replace(strg.find("STARTDELIMITER"), 14, "");
strg.replace(strg.find("STOPDELIMITER"), 13, "");
Hope it works for others.
void getBtwString(std::string oStr, std::string sStr1, std::string sStr2, std::string &rStr)
{
std::size_t start = oStr.find(sStr1);
if (start != std::string::npos)
{
start += sStr1.length(); // first character after the start delimiter
std::size_t stop = oStr.find(sStr2, start);
if (stop != std::string::npos)
rStr = oStr.substr(start, stop - start);
else
rStr = "error";
}
else
rStr = "error";
}
Or, if you have access to C++14 (needed for the s literals in the example below; std::regex itself is available from C++11), the following:
void getBtwString(std::string oStr, std::string sStr1, std::string sStr2, std::string &rStr)
{
using namespace std::literals::string_literals;
auto start = sStr1;
auto end = sStr2;
std::regex base_regex(start + "(.*)" + end);
auto example = oStr;
std::smatch base_match;
std::string matched;
if (std::regex_search(example, base_match, base_regex)) {
if (base_match.size() == 2) {
matched = base_match[1].str();
}
rStr = matched;
}
}
Example:
string strout;
getBtwString("it's_12345bb2","it's","bb2",strout);
getBtwString("it's_12345bb2"s,"it's"s,"bb2"s,strout); // second solution
Headers:
#include <regex> // second solution
#include <string>

C++ How to get string/char in between 2 words

I have a word:
AD#Andorra
I have a few questions:
How do I check that
AD?Andorra exists?
Here ? is a wildcard; it could be a comma, a hash (#), a dollar sign, or some other character.
Then, after confirming that AD?Andorra exists, how do I get the value of ?
Thanks,
Chen
The problem can be solved generally with a regular expression match. However, for the specific problem you presented, this would work:
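std::string input = getinput();
// guard against inputs too short to have a character at index 2
char at2 = input.size() > 2 ? input[2] : '\0';
if (input.size() > 2) input[2] = '#';
if (input == "AD#Andorra") {
// match, and char of interest is in at2;
} else {
// doesn't match
}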
If the ? is supposed to represent a string also, then you can do something like this:
bool find_inbetween (std::string input,
std::string &output,
const std::string front = "AD",
const std::string back = "Andorra") {
if ((input.size() < front.size() + back.size())
|| (input.compare(0, front.size(), front) != 0)
|| (input.compare(input.size()-back.size(), back.size(), back) != 0)) {
return false;
}
output = input.substr(front.size(), input.size()-front.size()-back.size());
return true;
}
If you are on C++11/use Boost (which I strongly recommend!) use regular expressions. Once you gain some level of understanding all text processing becomes easy-peasy!
#include <regex> // or #include <boost/regex.hpp>
//! \return A separating character or 0, if str does not match the pattern
char getSeparator(const char* str)
{
using namespace std; // change to "boost" if not on C++11
static const regex re("^AD(.)Andorra$");
cmatch match;
if (regex_match(str, match, re))
{
return *(match[1].first);
}
return 0;
}
Assuming your character is always the third one (index 2), use the string function substr:
your_string.substr(2, 1)
If you are using C++11, I recommend using regex instead of searching the string directly.