Regex to extract function from c++ function calls [duplicate] - c++

This question already has an answer here:
Regex for extracting functions from C++ code
(1 answer)
Closed 7 years ago.
I need to extract the function name from various C++ function calls. Below are some example calls, each followed by the name that should be extracted.
std::basic_fstream<char,std::char_traits<char> >::~basic_fstream<char,std::char_traits<char> > ~basic_fstream
CSocket::Send Send
CMap<unsigned int,unsigned int &,tagLAUNCHOBJECT,tagLAUNCHOBJECT &>::RemoveAll RemoveAll
Cerner::Foundations::String::Rep::~Rep ~Rep
CCMessage::~CCMessage ~CCMessage
std::_Tree<std::_Tmap_traits<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,u_Tree
Lib::DispatcherCache::~DispatcherCache ~DispatcherCache
CPrefDataObjectLoader<CPrefManagerKey,CPrefManagerValue,CGetPrefManager,PrefManagerKeyFunctor>::Get Get
The following regexes work for most of the functions:
/((?:[^:]*))$/ This regex gets the string after the last ':'.
/.+?(?=<)/ This one keeps the part before the first '<', which removes the string that starts with '<'.
But for std::basic_fstream<char,std::char_traits<char> >::~basic_fstream<char,std::char_traits<char> > the output I get is char_traits, because that is the string after the last ':', whereas the result should be ~basic_fstream. Is there a way to combine both regexes and ignore everything that is within <>?

The grammar of C++ is not only not regular, it's actually highly context-sensitive (especially near templates). Even a proper CFG parser won't help you, let alone a plain old regex… Rather than trying to approximate the impossible using ugly and fragile hacks, why not use an actual tool for the job? If you want to parse C++, then use a C++ parser, such as libclang.
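For the narrower task in the question (picking the last component out of an already-formed qualified name, not parsing C++ source), a small scanner that counts angle-bracket depth may be enough. The following is only a sketch under that assumption; the function name extract_function_name is hypothetical, and names such as operator<, operator>> or operator-> would confuse the depth counter.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the last "::"-separated component of a
// qualified name, ignoring any "::" nested inside <...>, and drops the
// template argument list of that component.
std::string extract_function_name(const std::string& qualified) {
    int depth = 0;          // current <...> nesting level
    std::size_t start = 0;  // start of the last top-level component
    for (std::size_t i = 0; i < qualified.size(); ++i) {
        const char c = qualified[i];
        if (c == '<') {
            ++depth;
        } else if (c == '>') {
            if (depth > 0) --depth;
        } else if (c == ':' && depth == 0 && i + 1 < qualified.size() &&
                   qualified[i + 1] == ':') {
            start = i + 2;  // component begins after the "::"
            ++i;            // skip the second ':'
        }
    }
    std::string name = qualified.substr(start);
    const std::size_t lt = name.find('<');
    if (lt != std::string::npos) name.erase(lt);  // strip template arguments
    return name;
}
```

Called on the std::basic_fstream example from the question, this returns ~basic_fstream, because the "::" inside std::char_traits<char> sits at depth 1 and is skipped.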

Related

How can I support regex character class subtraction with C++ regular expressions? [duplicate]

This question already has answers here:
Regex character class subtraction in C++
(4 answers)
Closed 1 year ago.
I don't know who closed this question but please actually read the question... This is a legitimate problem and I have done a good amount of research online and cannot find any way to implement this in C++. I can only assume whoever closed the question did not read it. (They didn't provide any reason for the question being closed so if you are going to close it again please explain why.)
I'm writing a C++ program that will need to take regular expressions that are defined in an XML Schema file and use them to validate XML data. The problem is that the flavor of regular expressions used by XML Schemas does not seem to be directly supported in C++.
For example, there are a couple special character classes \i and \c that are not defined by default and also the XML Schema regex language supports something called "character class subtraction" that does not seem to be supported in C++.
Allowing the use of the \i and \c special character classes is pretty simple, I can just look for "\i" or "\c" in the regular expression and replace them with their expanded versions, but getting character class subtraction to work is a much more daunting problem...
For example, this regular expression that is valid in an XML Schema definition throws an exception in C++ saying it has unbalanced square brackets.
#include <iostream>
#include <regex>

int main()
{
    try
    {
        // Match any lowercase letter that is not a vowel
        std::regex rx("[a-z-[aeiuo]]");
    }
    catch (const std::regex_error& ex)
    {
        std::cout << ex.what() << std::endl;
    }
}
How can I get C++ to recognize character class subtraction within a regex? Or even better, is there a way to just use the XML Schema flavor of regular expressions directly within C++?
I have never heard of character class subtraction, but if you want any non-vowel lowercase letter, you can express that easily enough with an ordinary character class:
std::regex rx("[b-df-hj-np-tv-z]");
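Another way to emulate the subtraction, without spelling out the remaining ranges by hand, is a negative lookahead in front of the base class. This is a sketch that assumes the default ECMAScript grammar of std::regex (which supports (?!...)); the function name is_lowercase_consonant is made up for illustration, and this does not make std::regex understand the XML Schema syntax itself.

```cpp
#include <regex>
#include <string>

// Emulates the XML Schema class [a-z-[aeiou]]: the lookahead rejects the
// subtracted set, then the base class consumes one character.
bool is_lowercase_consonant(const std::string& s) {
    static const std::regex rx("(?![aeiou])[a-z]");
    return std::regex_match(s, rx);  // whole string must be one such letter
}
```

A translator for schema patterns could rewrite each [X-[Y]] into (?![Y])[X] before handing the pattern to std::regex, although nested subtractions would need a real parser for the pattern syntax.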

How do I regexp-match (remove) an arbitrary series of two-letter language codes separated by commas, to the right of a title? [duplicate]

This question already has answers here:
Apply Perl RegExp to Remove Parenthesis and Text at End of String
(1 answer)
Regex for Comma delimited list
(12 answers)
Closed 2 years ago.
I have a bunch of strings such as:
Super Mario Bros. 8 (En,Fr,De,Es,It)
Donald Duck in Whacky Land (En,Fr,De,Es,Sv)
Toadstool Adventures 3D (En)
Chinaland (En,De)
A title which doesn't have any such thing
...
That is, a title of a product followed by (sometimes) a list of one or more language codes in parentheses.
I really struggle to come up with a (PCRE) regexp that removes these from the strings safely, that is, without being likely to touch the titles.
I know that ([A-Z]{1}[a-z]{1}) must be involved somewhere, to match a single language code such as "It" or "De", but how I should handle the possibility of any number of such in a row, with commas between or no comma (if it's just one), is beyond my regular expression skills.
I really wish that they had used some kind of unambiguous separator between the title part and the "metadata" part of the filenames... Then I wouldn't need to do all this manual trial-and-error removal. But they didn't.
Something like this would do it:
\([A-Z][a-z](?:,[A-Z][a-z])*\)$
https://regex101.com/r/xxNQ8h/1
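As a rough illustration of applying that pattern, here is a sketch using C++ std::regex rather than PCRE (the syntax used here is common to both). The helper name strip_language_codes is hypothetical, and the leading \s* is an addition not present in the pattern above; it is assumed to be wanted so that the space before the parenthesis is removed as well.

```cpp
#include <regex>
#include <string>

// Removes a trailing "(En,Fr,...)" list of two-letter codes, together with
// any whitespace in front of it. Titles without such a list are unchanged.
std::string strip_language_codes(const std::string& title) {
    static const std::regex codes(R"(\s*\([A-Z][a-z](?:,[A-Z][a-z])*\)$)");
    return std::regex_replace(title, codes, "");
}
```

Because the pattern is anchored with $ and only accepts an uppercase letter followed by a lowercase letter between the commas, a parenthesized part such as "(Demo)" or "(EN)" is left alone.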
Try it like this:
\(([A-Z][a-z],?)+\).*$

Is a C++ regex evaluator with lambdas possible, like in Ruby? [duplicate]

This question already has answers here:
regex replace with callback in c++11?
(4 answers)
Closed 5 years ago.
I'm trying to learn how I could write a regex evaluator in C++ with a lambda expression. Given this input:
5;4
11;2
7;3
this Ruby code:
inputx.gsub(/(.*?);(.*?)\n/) { ($1.to_i - $2.to_i ).to_s + "\n" }
produces:
1
9
4
How could I do this in C++ using a lambda expression, if it is possible?
Please help me.
The dirty little secret is that the regex replace functions of all
languages maintain an internal string in which the output is constructed
from scratch.
The output is a concatenation of the between-match substrings plus each
matched string in formatted form.
The new string is then returned to the caller.
But what if you want to supply a callback function to do your own
formatting that requires language constructs?
In all of regex land, it's easy to simulate this by just sitting in a
regex_search loop and constructing your new output inside it,
based on each match.
Well, as far as C++ (>= 11) is concerned, you can't provide a callback to
do this automatically!
Pretty sad, huh.
(Boost.Regex has this built into its regex replace function as
an option (a callback functor).)
So, what do you do?
You have to roll your own general regex_replace() function that takes a
callback function, which does nothing more than what they all do, as described above.
Lucky for you, someone has already done this using all the bells and whistles
of C++:
regex replace with callback in c++11?
Enjoy!
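The loop described above can be sketched roughly as follows. This is not the implementation from the linked answer; regex_replace_callback and subtract_pairs are hypothetical names, and the pattern uses \d+ instead of the Ruby example's (.*?) on the assumption that both fields are always plain non-negative integers.

```cpp
#include <functional>
#include <regex>
#include <string>

// Rebuilds the string from the text between matches plus whatever the
// callback returns for each match.
std::string regex_replace_callback(
    const std::string& input, const std::regex& re,
    const std::function<std::string(const std::smatch&)>& callback) {
    std::string out;
    auto tail = input.cbegin();  // end of the previous match
    for (std::sregex_iterator it(input.cbegin(), input.cend(), re), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        out.append(tail, m[0].first);  // unmatched text before this match
        out += callback(m);            // formatted replacement
        tail = m[0].second;
    }
    out.append(tail, input.cend());    // unmatched text after the last match
    return out;
}

// The Ruby gsub from the question, expressed with a lambda.
std::string subtract_pairs(const std::string& input) {
    static const std::regex re("(\\d+);(\\d+)\\n");
    return regex_replace_callback(input, re, [](const std::smatch& m) {
        return std::to_string(std::stoi(m[1].str()) - std::stoi(m[2].str())) + "\n";
    });
}
```

With the input from the question ("5;4\n11;2\n7;3\n"), subtract_pairs yields "1\n9\n4\n". The input string must outlive the call, since the match objects hold iterators into it.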

Regex for extracting functions from C++ code

I have sample C++ code (http://pastebin.com/6q7zs7tc) from which I have to extract function names as well as the number of parameters that each function requires. So far I have written this regex, but it's not working perfectly for me.
(?![a-z])[^\:,>,\.]([a-z,A-Z]+[_]*[a-z,A-Z]*)+[(]
You can't parse C++ reliably with regex.
In fact, you can't parse it with weak parsing technology (see Why can't C++ be parsed with a LR(1) parser?). If you expect to extract this information reliably from source files, you will need a time-tested C++ parser; see https://stackoverflow.com/a/28825789/120163
If you don't care that your extraction process is flaky, then you can use a regex and maybe some additional hackery. Your key problem for heuristic extraction is matching various kinds of brackets, e.g., [...], < ... > (which won't quite work for shift operators) and { ... }. Bracket matching requires you to keep a stack of seen brackets. And bracket matching may fail in the presence of macros and preprocessor conditionals.
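The "stack of seen brackets" idea can be sketched as below. This is a heuristic only, with a hypothetical function name; it deliberately leaves out < and > because of the shift and comparison operators mentioned above, and it does not skip string literals, comments, or preprocessor lines.

```cpp
#include <stack>
#include <string>

// Returns true if every (, [ and { in the text is closed in the right order.
bool brackets_balanced(const std::string& text) {
    std::stack<char> expected;  // closing brackets we are still waiting for
    for (const char c : text) {
        switch (c) {
            case '(': expected.push(')'); break;
            case '[': expected.push(']'); break;
            case '{': expected.push('}'); break;
            case ')':
            case ']':
            case '}':
                if (expected.empty() || expected.top() != c) return false;
                expected.pop();
                break;
            default:
                break;  // every other character is ignored
        }
    }
    return expected.empty();
}
```

A heuristic extractor would use the same stack to find the ) that closes a parameter list and count the commas seen while the stack depth is exactly one.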

Read inner Parentheses of a String (VB.NET) [duplicate]

This question already has answers here:
Evaluate mathematical expression from a string using VB
(3 answers)
Closed 4 years ago.
I am developing a calculator function in Visual Basic. I think there is no basic one in the default .NET libraries.
I use System.Data.DataTable.Compute() to calculate ordinary math expressions. But I want my function to evaluate functions like Sinus() or Round() as well.
Currently I am using Regex for this. For example for functions with one argument I use
Dim expr As String = Regex.Replace(Term, "(?<func>[A-Za-z]*?)\((?<arg1>[0-9,.]*?)\)", AddressOf FunctionLibrary.Funcs1)
The Sinus() function would look like this: sin(Number).
But this only allows Doubles or Integers as the argument. I cannot write inner functions or even another math expression between the parentheses.
If I wrote further functions as an argument, Regex would treat the first ")" it finds, which is the closing parenthesis of the inner function, as the end of the outer function.
Is there any way to make Regex recognize that there's an inner function as well?
If anyone knows an Evaluate() function for Visual Basic which is in the default .NET libraries, this might help me as well.
Is there any way to make Regex recognize that there's an inner function as well?
No, there is not. Regular expressions, as the name implies, recognize regular grammars, whereas even simple mathematical expressions are context-free. Regular expressions do not have a stack with which to match arbitrarily nested expressions. For example, distinguishing between (()) and ()() requires at least one character of lookahead (or backtracking). Yes, PCRE-style regular expressions can let you create a fixed number of lookahead characters, but so far as I know you have to specify the number of characters, and anyway this is not going to solve your problem.
Evaluating arithmetic expressions requires handling precedence, subexpressions, and possibly variables and types. Regular expressions cannot do this, and attempting to do it with regular expressions will lead you into a pit of failure.
Nor are they even necessary. Evaluating mathematical expressions is a solved problem and there are dozens of parsers and evaluators written, tested, and ready for you to drop into your application. You have not given us enough information to decide which one would be best for you, and anyway Stack Overflow is not a tool advocacy site. You could start by going through the list at Gary Beene's Equation Parser review.
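To show why a stack is what makes nesting work, here is a minimal recursive-descent evaluator in which the call stack plays that role. It is written in C++ rather than VB.NET, purely as an illustrative sketch: the class Evaluator, the free function evaluate, and the two supported functions (sin and round) are assumptions made for the example, not part of any library mentioned above.

```cpp
#include <cctype>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Grammar handled by this sketch:
//   expr   := term (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := '-' factor | '(' expr ')' | name '(' expr ')' | number
class Evaluator {
public:
    explicit Evaluator(std::string text) : text_(std::move(text)) {}

    double evaluate() {
        pos_ = 0;
        const double value = expr();
        skip_spaces();
        if (pos_ != text_.size()) throw std::runtime_error("unexpected trailing input");
        return value;
    }

private:
    bool at_letter() const {
        return pos_ < text_.size() && std::isalpha(static_cast<unsigned char>(text_[pos_]));
    }
    void skip_spaces() {
        while (pos_ < text_.size() && std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
    }
    bool accept(char c) {  // consumes c if it is the next non-space character
        skip_spaces();
        if (pos_ < text_.size() && text_[pos_] == c) { ++pos_; return true; }
        return false;
    }
    void expect(char c) {
        if (!accept(c)) throw std::runtime_error(std::string("expected '") + c + "'");
    }
    double expr() {  // lowest precedence: + and -
        double value = term();
        for (;;) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }
    double term() {  // higher precedence: * and /
        double value = factor();
        for (;;) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }
    double factor() {
        if (accept('-')) return -factor();
        if (accept('(')) { const double v = expr(); expect(')'); return v; }
        if (at_letter()) {  // function call such as sin(...)
            std::string name;
            while (at_letter()) name += text_[pos_++];
            expect('(');
            const double arg = expr();  // recursion handles inner functions
            expect(')');
            return apply(name, arg);
        }
        return number();
    }
    double number() {
        std::size_t used = 0;
        double v = 0.0;
        try { v = std::stod(text_.substr(pos_), &used); }
        catch (const std::exception&) { throw std::runtime_error("expected a number"); }
        pos_ += used;
        return v;
    }
    static double apply(const std::string& name, double arg) {
        if (name == "sin") return std::sin(arg);
        if (name == "round") return std::round(arg);
        throw std::runtime_error("unknown function: " + name);
    }

    std::string text_;
    std::size_t pos_ = 0;
};

double evaluate(const std::string& expression) { return Evaluator(expression).evaluate(); }
```

For example, evaluate("round(sin(0) + 1.6) * 3") parses the inner sin(0) while the outer round( is still waiting for its closing parenthesis, which is exactly the case the regex in the question cannot handle. A production calculator would still be better served by one of the existing, tested evaluators.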