This question already has an answer here:
Get last match with Boost::Regex
(1 answer)
Closed 9 years ago.
Somehow, I've failed to find out, how to put only the first occurrence or regular expression to string. I can create a regex object:
static const boost::regex e("<(From )?([A-Za-z0-9_]+)>(.*?)");
Now, I need to match ([A-Za-z0-9_]+) to std::string, say playername.
std::string chat_input("<Darker> Hello");
std::string playername = e.some_match_method(chat_input, 1); //Get contents of the second (...)
What have I missed?
What should be instead of some_match_method and what parameters should it take?
You can do something like this:
static const regex e("<(From )?([A-Za-z0-9_]+)>(.*?)");
string chat_input("<Darker> Hello");
smatch mr;
if (regex_search(begin(chat_input), end(chat_input), mr, e)
string playername = mr[2].str(); //Get contents of the second (...)
Please note that regex is part of C++11, so you don't need boost for it, unless your regular expression is complex (as C++11 and newer still has difficulties processing complex regular expressions).
I think what you're missing is that boost::regex is the regular expression, but it doesn't do the parsing against a given input. You need to actually use it as a parameter to boost::regex_search or boost::regex_match, which evaluate a string (or iterator pairs) against the regular expression.
static const boost::regex e("<(From )?([A-Za-z0-9_]+)>(.*?)");
std::string chat_input("<Darker> Hello");
boost::match_results<std::string::const_iterator> results;
if (boost::regex_match(chat_input, results, e))
{
std::string playername = results[2]; //Get contents of the second (...)
}
Related
This question already has an answer here:
Regex not working as expected with C++ regex_match
(1 answer)
Closed 4 days ago.
I am very confused why this regex match in C++ not working.
#include <iostream>
#include <regex>
#include <string>
void test_code(){
const std::string test_string("this is a test of test");
const std::regex match_regex("test");
std::cout<<test_string<<std::endl;
std::smatch match;
if (std::regex_match(test_string, match, match_regex)){
std::cout<<match.size()<<std::endl;
}
}
int main() {
test_code();
}
I read the CPP reference documentation and tried to write a simple regex check. I am not sure why this is not working (i.e. it s not returning true for std::regex_match(...) call.
As stated in documentation for std::regex_match() (emphasis is mine):
Determines if the regular expression e matches the entire target character sequence, which may be specified as std::string, a C-string, or an iterator pair.
and your regex pattern does not obviously match the whole string. So you either need to change your regex to something like ".*test.*" or use std::regex_search() If you want to check substring for matching:
Determines if there is a match between the regular expression e and some subsequence in the target character sequence.
I want to split the following mathematical expression -1+33+4.4+sin(3)-2-x^2 into tokens using regex. I use the following site to test my regex expression link, this says that nothing wrong. When I implement the regex into my C++, throwing the following error Invalid special open parenthesis I looked for the solution and I find the following stackoverflow site link but it do not helped me solve my problem.
My regex code is (?<=[-+*\/^()])|(?=[-+*\/^()]). In the C++ code I do not use \.
The other problem is that I do not know how to determine the minus sign is an unary operator or a binary operator, if the minus is an unary operator I want to look like this {-1}
I want the tokens looks like this : {-1,+,33,+4.4,+,sin,(,3,),-,2,-,x,^,2}
The unary minus can be anywhere in the string.
If I do not use ^ it still wrong.
code:
std::vector<std::string> split(const std::string& s, std::string rgx_str) {
std::vector<std::string> elems;
std::regex rgx (rgx_str);
std::sregex_token_iterator iter(s.begin(), s.end(), rgx);
std::sregex_token_iterator end;
while (iter != end) {
elems.push_back(*iter);
++iter;
}
return elems;
}
int main() {
std::string str = "-1+33+4.4+sin(3)-2-x^2";
std::string reg = "(?<=[-+*/()^])|(?=[-+*/()^])";
std::vector<std::string> s = split(str,reg);
for(auto& a : s)
cout << a << endl;
return 0;
}
C++ uses a modified ECMAScript regular expression grammar for its std::regex by default. It does support lookaheads (?=) and (?!), but not lookbehinds. So, the (?<=) is not a valid std::regex syntax.
There is a proposal to add this in C++23, but it is not currently implemented.
May we have similar question here stackoverflow:
But my question is:
First I tried to match all x in the string so I write the following code, and it's working well:
string str = line;
regex rx("x");
vector<int> index_matches; // results saved here
for (auto it = std::sregex_iterator(str.begin(), str.end(), rx);
it != std::sregex_iterator();
++it)
{
index_matches.push_back(it->position());
}
Now if I tried to match all { I tried to replace
regex rx("x"); with regex rx("{"); andregex rx("\{");.
So I got an exception and I think it should throw an exception because we use {
sometimes to express the regular expression, and it expect to have } in the regex at the end that's why it throw an exception.
So first is my explanation correct?
Second question I need to match all { using the same code above, is that possible to change the regex rx("{"); to something else?
You need to escape characters with special meaning in regular expressions, i.e. use \{ regular expression. But, \ has special meaning in C++ string literals. So, next you need to escape characters with special meaning in C++ string literals, i.e. write:
regex rx("\\{");
I am trying to store a regular expression within a variable, i.e if we had a regular expression, \\d and a string, std::string str; then I would store the regular expression \\d within std::string str. From that I could then use str whenever I wanted to use that regular expression.
I tried something like this:
Boost::regex const string_matcher("\\d");
std::string str = string_matcher;
However I realized that it would not work. Does anyone have any ides of how I can store a regular expression?
std::string regex = "\\d";
boost::regex expression(regex);
bool ok = boost::regex_match(testStr, expression);
You already have your regular expression stored in a variable. You called it string_matcher.
What is the regular expression for removing the suffix of file names? For example, if I have a file name in a string such as "vnb.txt", what is the regular expression to remove ".txt"?
Thanks.
Do you really need a regular expression to do this? Why not just look for the last period in the string, and trim the string up to that point? Frankly, there's a lot of overhead for a regular expression, and I don't think you need it in this case.
As suggested by tstenner, you can try one of the following, depending on what kinds of strings you're using:
std::strrchr
std::string::find_last_of
First example:
char* str = "Directory/file.txt";
size_t index;
char* pStr = strrchr(str,'.');
if(nullptr != pStr)
{
index = pStr - str;
}
Second example:
int index = string("Directory/file.txt").find_last_of('.');
If you are using Qt already, you could use QFileInfo, and use the baseName() function to get just the name (if one exists), or the suffix() function to get the extension (if one exists).
If you're looking for a solution that will give you anything except for the suffix, you should use string::find_last_of.
Your code could look like this:
const std::string removesuffix(const std::string& s) {
size_t suffixbegin = s.find_last_of('.');
//This will handle cases like "directory.foo/bar"
size_t dir = s.find_last_of('/');
if(dir != std::string::npos && dir > suffixbegin) return s;
if(suffixbegin == std::string::npos) return s;
else return s.substr(0,suffixbegin);
}
If you're looking for a regular expression, use \.[^.]+$.
You have to escape the first ., otherwise it will match any character, and put a $ at the end, so it will only match at the end of a string.
Different operating systems may allow different characters in filenams, the simplest regex might be (.+)\.txt$. Get the first capture group to get the filename sans extension.