Is it possible to write a C++ regex evaluator with lambdas, like in Ruby? [duplicate] - c++

This question already has answers here:
regex replace with callback in c++11?
(4 answers)
Closed 5 years ago.
I'm trying to learn how to write a regex evaluator in C++ with a lambda expression.
Given this input:
5;4
11;2
7;3
the following Ruby code:
inputx.gsub(/(.*?);(.*?)\n/) { ($1.to_i - $2.to_i ).to_s + "\n" }
produces:
1
9
4
How could I do this in C++ with a lambda expression, if it is possible at all?
Please help me.

The dirty little secret is that regex replace functions generally maintain
an internal string in which the output is constructed from scratch.
The output is a concatenation of the between-match substrings plus each
matched string in its formatted form.
The new string is then returned to the caller.
But what if you want to supply a callback function to do your own
formatting, one that requires language constructs?
In any regex flavor, it's easy to simulate this by sitting in a
regex_search loop and constructing your new output inside it,
based on each match.
As far as C++ (>= 11) is concerned, you can't provide a callback to
std::regex_replace to do this automatically.
Pretty sad, huh.
(Boost.Regex has this built into its regex replace function as
an option: a callback functor.)
So, what do you do?
You have to roll your own general regex_replace() that takes a
callback function and does nothing more than what they all do, as described.
Luckily for you, someone has already done this using all the bells and
whistles of C++:
regex replace with callback in c++11?
Enjoy!
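For illustration, here is a minimal sketch of the loop described above, written with std::sregex_iterator. The names regex_replace_callback and subtract_pairs are made up for this example and are not part of the standard library. The pattern uses (\d+) instead of the (.*?) from the Ruby snippet, on the assumption that every line holds two non-negative integers.

```cpp
#include <cassert>
#include <functional>
#include <regex>
#include <string>

// Hypothetical helper (not part of <regex>): replaces every match of `re`
// in `input` with whatever `callback` returns for that match.
std::string regex_replace_callback(
    const std::string& input,
    const std::regex& re,
    const std::function<std::string(const std::smatch&)>& callback)
{
    assert(callback && "callback must be callable");

    std::string output;
    auto tail = input.cbegin();  // start of the text not yet copied

    for (std::sregex_iterator it(input.cbegin(), input.cend(), re), end;
         it != end; ++it) {
        const std::smatch& match = *it;
        output.append(tail, match[0].first);  // text between matches
        output += callback(match);            // caller-formatted replacement
        tail = match[0].second;
    }
    output.append(tail, input.cend());        // text after the last match
    return output;
}

// Counterpart of the Ruby gsub block: for each "a;b" line, emit a - b.
std::string subtract_pairs(const std::string& input)
{
    static const std::regex pair_re(R"((\d+);(\d+)\n)");
    return regex_replace_callback(input, pair_re, [](const std::smatch& m) {
        return std::to_string(std::stoi(m[1].str()) - std::stoi(m[2].str())) + "\n";
    });
}
```

Calling subtract_pairs("5;4\n11;2\n7;3\n") should return "1\n9\n4\n", matching the Ruby output in the question.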

Related

How can I support regex character class subtraction with C++ regular expressions? [duplicate]

This question already has answers here:
Regex character class subtraction in C++
(4 answers)
Closed 1 year ago.
I don't know who closed this question but please actually read the question... This is a legitimate problem and I have done a good amount of research online and cannot find any way to implement this in C++. I can only assume whoever closed the question did not read it. (They didn't provide any reason for the question being closed so if you are going to close it again please explain why.)
I'm writing a C++ program that will need to take regular expressions that are defined in a XML Schema file and use them to validate XML data. The problem is, the flavor of regular expressions used by XML Schemas does not seem to be directly supported in C++.
For example, there are a couple special character classes \i and \c that are not defined by default and also the XML Schema regex language supports something called "character class subtraction" that does not seem to be supported in C++.
Allowing the use of the \i and \c special character classes is pretty simple, I can just look for "\i" or "\c" in the regular expression and replace them with their expanded versions, but getting character class subtraction to work is a much more daunting problem...
For example, this regular expression that is valid in an XML Schema definition throws an exception in C++ saying it has unbalanced square brackets.
#include <iostream>
#include <regex>

int main()
{
    try
    {
        // Match any lowercase letter that is not a vowel
        std::regex rx("[a-z-[aeiuo]]");
    }
    catch (const std::regex_error& ex)
    {
        std::cout << ex.what() << std::endl;
    }
}
How can I get C++ to recognize character class subtraction within a regex? Or even better, is there a way to just use the XML Schema flavor of regular expressions directly within C++?
I have never heard of character class subtraction, but if you want any non-vowel lowercase letter you can express that easily enough with a regular character class that lists the remaining ranges:
std::regex rx("[b-df-hj-np-tv-z]");
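Another way to get the same effect, sketched here as a suggestion rather than something from the answer above, is a negative lookahead, which the default ECMAScript grammar of std::regex supports. The general idea is that [X-[Y]] can be written as (?![Y])[X]. The function names below are made up for the example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Emulates the XML Schema class [a-z-[aeiou]] with a negative lookahead:
// "(?![aeiou])" rejects a vowel at the current position, then "[a-z]"
// consumes one lowercase letter.
bool is_lowercase_consonant(const std::string& text)
{
    static const std::regex rx("(?![aeiou])[a-z]");
    return std::regex_match(text, rx);
}

// Small self-check showing the intended behaviour.
void check_is_lowercase_consonant()
{
    assert(is_lowercase_consonant("b"));
    assert(!is_lowercase_consonant("a"));
    assert(!is_lowercase_consonant("B"));
}
```

Rewriting an arbitrary schema pattern this way would still require finding each -[...] subtraction in the pattern text, which this sketch does not attempt.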

How do I regexp-match (remove) an arbitrary series of two-letter language codes separated by commas, to the right of a title? [duplicate]

This question already has answers here:
Apply Perl RegExp to Remove Parenthesis and Text at End of String
(1 answer)
Regex for Comma delimited list
(12 answers)
Closed 2 years ago.
I have a bunch of strings such as:
Super Mario Bros. 8 (En,Fr,De,Es,It)
Donald Duck in Whacky Land (En,Fr,De,Es,Sv)
Toadstool Adventures 3D (En)
Chinaland (En,De)
A title which doesn't have any such thing
...
That is, a title of a product followed by (sometimes) a list of one or more language codes in parentheses.
I am really struggling to come up with a (PCRE) regexp that removes these from the strings safely, that is, without being likely to touch the titles.
I know that ([A-Z]{1}[a-z]{1}) must be involved somewhere, to match a single language code such as "It" or "De", but how I should handle the possibility of any number of such in a row, with commas between or no comma (if it's just one), is beyond my regular expression skills.
I really wish that they had used some kind of unambiguous separator between the title part and the "metadata" part of the filenames... Then I wouldn't need to do all this manual trial-and-error removal. But they didn't.
Something like this would do it:
\([A-Z][a-z](?:,[A-Z][a-z])*\)$
https://regex101.com/r/xxNQ8h/1
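The question asks about PCRE, but the same pattern also works in the ECMAScript grammar used by std::regex. As a rough sketch in C++: the function name strip_language_codes is made up for the example, and the leading \s* is an addition that also removes the space before the parenthesis.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Strips a trailing "(En,Fr,...)" list, plus any whitespace before it.
// Titles without such a list are returned unchanged.
std::string strip_language_codes(const std::string& title)
{
    static const std::regex codes(R"(\s*\([A-Z][a-z](?:,[A-Z][a-z])*\)$)");
    return std::regex_replace(title, codes, "");
}

// Small self-check using titles from the question.
void check_strip_language_codes()
{
    assert(strip_language_codes("Chinaland (En,De)") == "Chinaland");
    assert(strip_language_codes("Toadstool Adventures 3D (En)") == "Toadstool Adventures 3D");
}
```

Because the pattern is anchored with $, a parenthesised part in the middle of a title is left alone.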
Try it like this:
\(([A-Z][a-z],?)+\).*$

Regex to extract function from c++ function calls [duplicate]

This question already has an answer here:
Regex for extracting functions from C++ code
(1 answer)
Closed 7 years ago.
I have to extract a function name from various C++ function calls. Below are some example calls, followed by the function name that should be extracted.
std::basic_fstream<char,std::char_traits<char> >::~basic_fstream<char,std::char_traits<char> > ~basic_fstream
CSocket::Send send
CMap<unsigned int,unsigned int &,tagLAUNCHOBJECT,tagLAUNCHOBJECT &>::RemoveAll
Cerner::Foundations::String::Rep::~Rep ~Rep
CCMessage::~CCMessage ~CCMessage
std::_Tree<std::_Tmap_traits<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,u_Tree
Lib::DispatcherCache::~DispatcherCache ~DispatcherCache
CPrefDataObjectLoader<CPrefManagerKey,CPrefManagerValue,CGetPrefManager,PrefManagerKeyFunctor>::Get Get
The following regexes work for most of the functions:
/((?:[^:]*))$';/ This regex gets the string after the last :
/+?(?=<)';/ This one removes the part of the string that starts with <
But for std::basic_fstream<char,std::char_traits<char> >::~basic_fstream<char,std::char_traits<char> > the output I get is char_traits because this string is after last ':' but the result should be ~basic_fstream. Is there a way I can combine both regex and ignore everything that is within <>?
The grammar of C++ is not only not regular, it's actually highly context-sensitive (especially near templates). Even a proper CFG parser won't help you, let alone a plain old regex… Rather than trying to approximate the impossible using ugly and fragile hacks, why not use an actual tool for the job? If you want to parse C++, then use a C++ parser, such as libclang.
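If a real parser is overkill and the input is limited to symbol strings like the ones in the question, a plain scan that tracks the nesting depth of < and > may be enough. The sketch below is a heuristic under that assumption, not a C++ parser: the function name extract_function_name is made up, and names such as operator< or operator-> will confuse it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Heuristic, not a parser: returns the text after the last "::" that sits
// outside any <...> pair, with a trailing template argument list removed.
std::string extract_function_name(const std::string& symbol)
{
    int depth = 0;                         // nesting level of '<' ... '>'
    std::size_t name_start = 0;            // index just past the last top-level "::"
    std::size_t name_end = symbol.size();  // first top-level '<' after name_start

    for (std::size_t i = 0; i < symbol.size(); ++i) {
        const char c = symbol[i];
        if (c == '<') {
            if (depth == 0 && name_end == symbol.size()) name_end = i;
            ++depth;
        } else if (c == '>') {
            if (depth > 0) --depth;
        } else if (depth == 0 && c == ':' &&
                   i + 1 < symbol.size() && symbol[i + 1] == ':') {
            name_start = i + 2;
            name_end = symbol.size();  // earlier '<' belonged to a qualifier
            ++i;                       // skip the second ':'
        }
    }
    assert(name_start <= name_end);
    return symbol.substr(name_start, name_end - name_start);
}
```

For the problematic input from the question, the "::" inside std::char_traits<char> is seen at depth 1 and ignored, so the result is ~basic_fstream rather than char_traits.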

Read inner Parentheses of a String (VB.NET) [duplicate]

This question already has answers here:
Evaluate mathematical expression from a string using VB
(3 answers)
Closed 4 years ago.
I am developing a calculator function in Visual Basic. I don't think there is a built-in one in the default .NET libraries.
I use System.Data.DataTable.Compute() to calculate ordinary math expressions, but I want my function to evaluate functions like Sinus() or Round() as well.
Currently I am using Regex for this. For example, for functions with one argument I use
Dim expr As String = Regex.Replace(Term, "(?<func>[A-Za-z]*?)\((?<arg1>[0-9,.]*?)\)", AddressOf FunctionLibrary.Funcs1)
The Sinus() function would look like this: sin(Number).
But this only allows Doubles or Integers as the argument. I cannot write inner functions or even another math expression between the parentheses.
If I wrote another function as an argument, Regex would treat the first ")" it finds, which is the closing parenthesis of the inner function, as the end of the outer function.
Is there any way to make Regex recognize that there's an inner function as well?
If anyone knows of an Evaluate() function for Visual Basic in the default .NET libraries, that might help me as well.
Is there any way to make Regex recognize that theres an inner function as well?
No, there is not. Regular expressions, as the name implies, recognize regular languages, whereas even simple mathematical expressions are context-free. Regular expressions do not have a stack with which to match arbitrarily nested expressions. For example, distinguishing between (()) and ()() requires at least one character of lookahead (or backtracking). Yes, PCRE-style regular expressions let you use a fixed number of lookahead characters, but as far as I know you have to specify the number of characters, and in any case this is not going to solve your problem.
Evaluating arithmetic expressions requires handling precedence, subexpressions, and possibly variables and types. Regular expressions cannot do this; attempting it with regular expressions will lead you into a pit of failure.
Nor are they even necessary. Evaluating mathematical expressions is a solved problem and there are dozens of parsers and evaluators written, tested, and ready for you to drop into your application. You have not given us enough information to decide which one would be best for you, and anyway Stack Overflow is not a tool advocacy site. You could start by going through the list at Gary Beene's Equation Parser review.
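To show why a parser copes with nesting where a single regex does not, here is a minimal recursive-descent sketch. It is written in C++ rather than VB.NET, and the class name Evaluator, the function evaluate and the supported functions (sin, round) are all choices made for this illustration, not an existing library. Each opening parenthesis simply recurses into expr(), so the call stack plays the role of the stack that regular expressions lack.

```cpp
#include <cassert>
#include <cctype>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Illustrative grammar:
//   expr   := term (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := number | '-' factor | '(' expr ')' | name '(' expr ')'
class Evaluator {
public:
    explicit Evaluator(std::string text) : text_(std::move(text)) {}

    double evaluate() {
        const double value = expr();
        skip_spaces();
        if (pos_ != text_.size()) throw std::runtime_error("unexpected trailing input");
        return value;
    }

private:
    std::string text_;
    std::size_t pos_ = 0;

    bool at_alpha() const {
        return pos_ < text_.size() && std::isalpha(static_cast<unsigned char>(text_[pos_]));
    }
    void skip_spaces() {
        while (pos_ < text_.size() && std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
    }
    // Consumes `c` if it is the next non-space character.
    bool accept(char c) {
        skip_spaces();
        if (pos_ < text_.size() && text_[pos_] == c) { ++pos_; return true; }
        return false;
    }
    void expect(char c) {
        if (!accept(c)) throw std::runtime_error(std::string("expected '") + c + "'");
    }
    double expr() {
        double value = term();
        for (;;) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }
    double term() {
        double value = factor();
        for (;;) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }
    double factor() {
        if (accept('-')) return -factor();
        if (accept('(')) {                 // nested parentheses recurse into expr()
            const double value = expr();
            expect(')');
            return value;
        }
        if (at_alpha()) {                  // function call such as sin(...)
            const std::size_t start = pos_;
            while (at_alpha()) ++pos_;
            const std::string name = text_.substr(start, pos_ - start);
            expect('(');
            const double arg = expr();     // the argument is a full expression
            expect(')');
            return call(name, arg);
        }
        return number();
    }
    double number() {
        std::size_t used = 0;
        double value = 0.0;
        try { value = std::stod(text_.substr(pos_), &used); }
        catch (const std::exception&) { throw std::runtime_error("expected a number"); }
        pos_ += used;
        return value;
    }
    static double call(const std::string& name, double arg) {
        if (name == "sin") return std::sin(arg);
        if (name == "round") return std::round(arg);
        throw std::runtime_error("unknown function: " + name);
    }
};

double evaluate(const std::string& expression) { return Evaluator(expression).evaluate(); }
```

With this sketch, evaluate("round(2.4 + sin(0)) * (3 - 1)") handles the inner sin(0) before the outer round(...) closes. A tested library is still the better choice for production use, as the answer says.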

Lua string.match uses irregular regular expressions?

I'm curious why this doesn't work, and need to know why/how to work around it; I'm trying to detect whether some input is a question, I'm pretty sure string.match is what I need, but:
print(string.match("how much wood?", "(how|who|what|where|why|when).*\\?"))
returns nil. I'm pretty sure Lua's string.match uses regular expressions to find matches in a string, as I've used wildcards (.) before with success, but maybe I don't understand all the mechanics? Does Lua require special delimiters in its string functions? I've tested my regular expression here, so if Lua used regular regular expressions, it seems like the above code would return "how much wood?".
Can any of you tell me what I'm doing wrong, what I mean to do, or point me to a good reference where I can get comprehensive information about how Lua's string manipulation functions utilize regular expressions?
Lua doesn't use regex. Lua uses Patterns, which look similar but match different input.
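The trailing \\? does not match a literal question mark either: the backslash is an ordinary character in a Lua pattern, so \\? means "an optional backslash". Special characters are escaped with %, so a literal question mark is written %?. Using [^?]* instead of .* also stops the match at the first question mark rather than the last one.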
"how[^?]*%?"
As Omri Barel said, there's no alternation operator. You probably need to use multiple patterns, one for each alternative word at the beginning of the sentence. Or you could use a library that supports regex like expressions.
According to the manual, patterns don't support alternation.
So while "how.*" works, "(how|what).*" doesn't.
And kapep is right that the question mark has to be escaped with %.
There's a related question: Lua pattern matching vs. regular expressions.
As the earlier answers explain, Lua patterns are different from the regexes of other languages. If you have not yet found a pattern that does all the work, you can try this simple function:
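local question_words = { "how", "who", "what", "where", "why", "when", "would" }
-- Returns every line that starts with a question word and ends with "?".
local function capture_answer(text)
  local found = {}
  for line in text:lower():gmatch("[^\n]+") do
    for _, word in ipairs(question_words) do
      if line:match("^" .. word .. "%W") and line:sub(-1) == "?" then
        found[#found + 1] = line
        break
      end
    end
  end
  return table.concat(found, "\n")
end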
print(capture_answer("how much wood?"))
Output: how much wood?
That function will also help you if you want to find a question in a larger text string
Ex.
print(capture_answer("Who is the best football player in the world?\nWho are your best friends?\nWho is that strange guy over there?\nWhy do we need a nanny?\nWhy are they always late?\nWhy does he complain all the time?\nHow do you cook lasagna?\nHow does he know the answer?\nHow can I learn English quickly?"))
Output:
who is the best football player in the world?
who are your best friends?
who is that strange guy over there?
why do we need a nanny?
why are they always late?
why does he complain all the time?
how do you cook lasagna?
how does he know the answer?
how can i learn english quickly?