Storing regular expressions within a variable (std::string) - c++

I am trying to store a regular expression in a variable. For example, given the regular expression \\d and a string, std::string str;, I would store \\d in str and then use str whenever I need that regular expression.
I tried something like this:
boost::regex const string_matcher("\\d");
std::string str = string_matcher;
However, I realized that this would not work. Does anyone have any ideas on how I can store a regular expression?

std::string regex = "\\d";
boost::regex expression(regex);
bool ok = boost::regex_match(testStr, expression);

You already have your regular expression stored in a variable. You called it string_matcher.
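Both answers can be combined: keep the pattern text in a std::string and build the regex object from it once. The following is a minimal sketch that uses std::regex instead of Boost so that it needs only the standard library. The StoredPattern type and its member names are hypothetical and not part of either library.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: keeps the pattern text and the compiled regex together.
// std::regex offers no accessor for the original pattern text, so the text is
// stored next to the compiled object.
struct StoredPattern {
    std::string text;     // the pattern as written, e.g. "\\d"
    std::regex compiled;  // built once from the pattern text

    explicit StoredPattern(const std::string& pattern)
        : text(pattern), compiled(pattern) {}

    // True when the whole input matches the pattern.
    bool matches(const std::string& input) const {
        return std::regex_match(input, compiled);
    }
};

// Small self-check showing the intended use.
inline void stored_pattern_example() {
    StoredPattern digit("\\d");
    assert(digit.text == "\\d");  // two characters: backslash and 'd'
    assert(digit.matches("7"));
    assert(!digit.matches("x"));
}
```

Compiling the regex once and reusing it avoids rebuilding it on every call, which is usually the expensive part.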

Related

Split a mathematical expression using regex

I want to split the mathematical expression -1+33+4.4+sin(3)-2-x^2 into tokens using a regex. I tested my regex on the following site (link), which reports nothing wrong. When I use the regex in my C++ code, it throws the error "Invalid special open parenthesis". I looked for a solution and found the following Stack Overflow question (link), but it did not help me solve my problem.
My regex is (?<=[-+*\/^()])|(?=[-+*\/^()]). In the C++ code I do not use \.
The other problem is that I do not know how to determine whether a minus sign is a unary or a binary operator. If the minus is unary, I want the token to look like this: {-1}
I want the tokens to look like this: {-1,+,33,+4.4,+,sin,(,3,),-,2,-,x,^,2}
The unary minus can be anywhere in the string.
If I leave out ^, it is still wrong.
code:
std::vector<std::string> split(const std::string& s, std::string rgx_str) {
std::vector<std::string> elems;
std::regex rgx (rgx_str);
std::sregex_token_iterator iter(s.begin(), s.end(), rgx);
std::sregex_token_iterator end;
while (iter != end) {
elems.push_back(*iter);
++iter;
}
return elems;
}
int main() {
std::string str = "-1+33+4.4+sin(3)-2-x^2";
std::string reg = "(?<=[-+*/()^])|(?=[-+*/()^])";
std::vector<std::string> s = split(str,reg);
for(auto& a : s)
std::cout << a << std::endl;
return 0;
}
C++ uses a modified ECMAScript regular expression grammar for std::regex by default. It supports the lookaheads (?=...) and (?!...), but not lookbehinds, so (?<=...) is not valid std::regex syntax.
There is a proposal to add this in C++23, but it is not currently implemented.
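One way to avoid lookbehind altogether is to match the tokens themselves instead of splitting between them. The sketch below is an assumption about what the asker needs, not a fix of the original pattern: the tokenize function, its token pattern, and the rule for detecting a unary minus (a '-' at the start, or right after an operator or an opening parenthesis) are all introduced here for illustration. Characters that match no token, such as spaces, are skipped.

```cpp
#include <cassert>
#include <cctype>
#include <regex>
#include <string>
#include <vector>

// True for tokens after which a '-' must be unary: an operator or '('.
static bool expects_operand(const std::string& t) {
    return t.size() == 1 && std::string("+-*/^(").find(t[0]) != std::string::npos;
}

// Splits an expression into numbers, names, and one-character operators.
std::vector<std::string> tokenize(const std::string& s) {
    // Alternatives: number with optional fraction | name | operator or parenthesis.
    static const std::regex token(R"(\d+(?:\.\d+)?|[A-Za-z]\w*|[-+*/^()])");
    std::vector<std::string> out;
    bool pending_minus = false;  // a unary '-' waiting for the token that follows it
    for (std::sregex_iterator it(s.begin(), s.end(), token), end; it != end; ++it) {
        std::string t = it->str();
        if (t == "-" && !pending_minus && (out.empty() || expects_operand(out.back()))) {
            pending_minus = true;
            continue;
        }
        if (pending_minus) {
            pending_minus = false;
            if (std::isalnum(static_cast<unsigned char>(t[0]))) {
                t = "-" + t;         // attach the sign to a number or name
            } else {
                out.push_back("-");  // e.g. "-(": keep the sign as its own token
            }
        }
        out.push_back(t);
    }
    if (pending_minus) out.push_back("-");  // dangling '-' at the end of the input
    return out;
}

// Self-check with the expression from the question.
inline void tokenize_example() {
    const std::vector<std::string> expected = {
        "-1", "+", "33", "+", "4.4", "+", "sin", "(", "3", ")",
        "-", "2", "-", "x", "^", "2"};
    assert(tokenize("-1+33+4.4+sin(3)-2-x^2") == expected);
}
```

Note that this yields + and 4.4 as two separate tokens. Matching tokens also makes the unary/binary decision a simple check on the previous token, which a split-based approach cannot express.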

How to match "{" using regex in c++

There may be a similar question here on Stack Overflow:
But my question is:
First I tried to match all x in the string, so I wrote the following code, and it works well:
string str = line;
regex rx("x");
vector<int> index_matches; // results saved here
for (auto it = std::sregex_iterator(str.begin(), str.end(), rx);
it != std::sregex_iterator();
++it)
{
index_matches.push_back(it->position());
}
Now, to match all {, I tried replacing
regex rx("x"); with regex rx("{"); and regex rx("\{");.
Both threw an exception. I think that is expected, because { is sometimes part of regular expression syntax, so the regex expects a closing } later in the pattern.
So first, is my explanation correct?
Second, I need to match all { using the same code above. Is it possible to change regex rx("{"); to something else?
You need to escape characters with special meaning in regular expressions, i.e. use the regular expression \{. But \ also has special meaning in C++ string literals, so you need to escape it there as well, i.e. write:
regex rx("\\{");
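A raw string literal (C++11) avoids the second level of escaping. The sketch below wraps the loop from the question in a hypothetical helper, match_positions, and shows both spellings of the same pattern. The helper name and the sample input are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Returns the offset of every match of pattern inside text.
std::vector<std::ptrdiff_t> match_positions(const std::string& text,
                                            const std::regex& pattern) {
    std::vector<std::ptrdiff_t> positions;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end; it != end; ++it) {
        positions.push_back(it->position());
    }
    return positions;
}

// Self-check: both literals produce the two-character pattern \{ .
inline void brace_example() {
    const std::regex escaped("\\{");  // ordinary literal: the backslash is doubled
    const std::regex raw(R"(\{)");    // raw literal: the backslash is written once
    const std::vector<std::ptrdiff_t> expected = {1, 3};
    assert(match_positions("a{b{c}", escaped) == expected);
    assert(match_positions("a{b{c}", raw) == expected);
}
```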

Put first boost::regex match into a string [duplicate]

This question already has an answer here:
Get last match with Boost::Regex
(1 answer)
Closed 9 years ago.
Somehow, I have failed to find out how to put only the first match of a regular expression into a string. I can create a regex object:
static const boost::regex e("<(From )?([A-Za-z0-9_]+)>(.*?)");
Now, I need to get the text matched by ([A-Za-z0-9_]+) into a std::string, say playername.
std::string chat_input("<Darker> Hello");
std::string playername = e.some_match_method(chat_input, 1); //Get contents of the second (...)
What have I missed?
What should be used instead of some_match_method, and what parameters should it take?
You can do something like this:
static const regex e("<(From )?([A-Za-z0-9_]+)>(.*?)");
string chat_input("<Darker> Hello");
smatch mr;
// Pass the string itself: smatch works on const iterators, which begin()/end() of a non-const string do not return.
if (regex_search(chat_input, mr, e))
{
string playername = mr[2].str(); //Get contents of the second (...)
}
Please note that std::regex has been part of the standard library since C++11, so you don't need Boost for it, unless your regular expression is complex (as C++11 and newer still have difficulties processing complex regular expressions).
I think what you're missing is that boost::regex is the regular expression, but it doesn't do the parsing against a given input. You need to actually use it as a parameter to boost::regex_search or boost::regex_match, which evaluate a string (or iterator pairs) against the regular expression.
static const boost::regex e("<(From )?([A-Za-z0-9_]+)>(.*?)");
std::string chat_input("<Darker> Hello");
boost::match_results<std::string::const_iterator> results;
if (boost::regex_match(chat_input, results, e))
{
std::string playername = results[2]; //Get contents of the second (...)
}
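The same idea can be packaged as a small function. This is a sketch using std::regex rather than Boost, so that it depends only on the standard library. The first_capture name and its convention of returning an empty string when nothing matches are assumptions, not part of either library.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the text of capture group `group` from the first match of `re`
// in `input`, or an empty string when there is no match or no such group.
std::string first_capture(const std::string& input, const std::regex& re,
                          std::size_t group) {
    std::smatch m;
    if (std::regex_search(input, m, re) && group < m.size()) {
        return m[group].str();  // an unmatched optional group yields ""
    }
    return std::string();
}

// Self-check with the pattern and input from the question.
inline void first_capture_example() {
    static const std::regex e("<(From )?([A-Za-z0-9_]+)>(.*?)");
    assert(first_capture("<Darker> Hello", e, 2) == "Darker");
}
```

regex_search finds the first match anywhere in the input, while regex_match succeeds only if the pattern matches the entire input.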

Capitalizing variable name in C program using regular expressions

I need to find a variable in a C program and need to convert its 1st letter to upper case. For example:
int sum;
sum = 50;
I need to find sum and I should convert it to Sum. How can I achieve this using regular expressions (find and replace)?
This can't be done with a regex. You need a C language parser for that, otherwise how would you know what is a variable, what is a keyword, what is a function name, what is a word inside a string or a comment...
.NET's Regex.Replace supports what you want to do (if you can come up with the regular expression you need). The ReplaceCC function at the bottom is invoked to provide the replacement value.
static void Main(string[] args)
{
string sInput, sRegex;
// The string to search.
sInput = @"int sum;
sum = 1;";
// A very simple regular expression.
sRegex = "sum";
Regex r = new Regex(sRegex);
MyClass c = new MyClass();
// Assign the replace method to the MatchEvaluator delegate.
MatchEvaluator myEvaluator = new MatchEvaluator(c.ReplaceCC);
// Write out the original string.
Console.WriteLine(sInput);
// Replace matched characters using the delegate method.
sInput = r.Replace(sInput, myEvaluator);
// Write out the modified string.
Console.WriteLine(sInput);
}
public string ReplaceCC(Match m)
{
// char has no instance ToUpper method; use the static char.ToUpper instead.
return char.ToUpper(m.Value[0]) + m.Value.Substring(1);
}
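For comparison, here is a rough C++ sketch of the same find-and-replace (the answer above is C#). Because the replacement text is fixed once the name is known, std::regex_replace is enough and no callback is needed. The capitalize_word function is hypothetical, and it assumes the name is a plain identifier so that it needs no regex escaping. As the first answer points out, a regex cannot tell a variable from the same word inside a string or comment, so this replaces every whole-word occurrence.

```cpp
#include <cassert>
#include <cctype>
#include <regex>
#include <string>

// Replaces every whole-word occurrence of `name` in `source` with a copy
// whose first letter is upper case. Assumes `name` contains only letters,
// digits, and underscores.
std::string capitalize_word(const std::string& source, const std::string& name) {
    if (name.empty()) return source;
    std::string replacement = name;
    replacement[0] = static_cast<char>(
        std::toupper(static_cast<unsigned char>(replacement[0])));
    // \b anchors the match at word boundaries, so "summary" is left alone.
    const std::regex word("\\b" + name + "\\b");
    return std::regex_replace(source, word, replacement);
}

// Self-check with the snippet from the question.
inline void capitalize_example() {
    assert(capitalize_word("int sum;\nsum = 50;", "sum") == "int Sum;\nSum = 50;");
}
```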

Regular Expression for removing suffix

What is the regular expression for removing the suffix of file names? For example, if I have a file name in a string such as "vnb.txt", what is the regular expression to remove ".txt"?
Thanks.
Do you really need a regular expression to do this? Why not just look for the last period in the string, and trim the string up to that point? Frankly, there's a lot of overhead for a regular expression, and I don't think you need it in this case.
As suggested by tstenner, you can try one of the following, depending on what kinds of strings you're using:
std::strrchr
std::string::find_last_of
First example:
const char* str = "Directory/file.txt"; // string literals are const in C++
size_t index = 0;
const char* pStr = strrchr(str, '.');
if (nullptr != pStr)
{
index = pStr - str;
}
Second example:
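// find_last_of returns size_t (std::string::npos when nothing is found), so do not store it in an int
size_t index = string("Directory/file.txt").find_last_of('.');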
If you are using Qt already, you could use QFileInfo, and use the baseName() function to get just the name (if one exists), or the suffix() function to get the extension (if one exists).
If you're looking for a solution that will give you anything except for the suffix, you should use string::find_last_of.
Your code could look like this:
const std::string removesuffix(const std::string& s) {
size_t suffixbegin = s.find_last_of('.');
//This will handle cases like "directory.foo/bar"
size_t dir = s.find_last_of('/');
if(dir != std::string::npos && dir > suffixbegin) return s;
if(suffixbegin == std::string::npos) return s;
else return s.substr(0,suffixbegin);
}
If you're looking for a regular expression, use \.[^.]+$.
You have to escape the first ., otherwise it will match any character, and put a $ at the end, so it will only match at the end of a string.
Different operating systems may allow different characters in filenames, so the simplest regex might be (.+)\.txt$. Take the first capture group to get the filename without the extension.
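If a regular expression is still preferred, the pattern \.[^.]+$ from the answer above can be applied with std::regex_replace. This is a minimal sketch. The strip_suffix name is made up for illustration, and the pattern is used unchanged, so it does not treat directory separators specially.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Removes the last ".something" at the end of name, if there is one.
std::string strip_suffix(const std::string& name) {
    static const std::regex suffix(R"(\.[^.]+$)");
    return std::regex_replace(name, suffix, "");
}

// Self-check with the filename from the question.
inline void strip_suffix_example() {
    assert(strip_suffix("vnb.txt") == "vnb");
    assert(strip_suffix("archive.tar.gz") == "archive.tar");
    assert(strip_suffix("noext") == "noext");
}
```

With this pattern a path such as "directory.foo/bar" becomes "directory", because everything after the last dot counts as the suffix. The find_last_of version above guards against that case.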