Regular expression to match (C) function calls - regex

Does anyone have a regular expression for matching function calls in C programs?

Since C isn't a regular language and C function calls can contain arbitrary argument expressions, I fear the answer to your question is “no.”

After a bit more searching I decided to let the compiler do the hard work.
Get the compiler to produce a Register Transfer Language (RTL) dump using gcc's -dr option (newer GCC releases use -fdump-rtl-expand for the same dump).
The produced RTL file has the suffix .rtl or .expand.
This file is far easier to parse because the function calls are already identified.

I doubt you can find a regex that matches all (and only) the function calls in some source code. But maybe you could use a tool like Understand, or your IDE, to browse your code.

Related

Is there any way to search the macro name by value?

Thanks to @selbie, a clearer statement of the question is:
I've got some magic number or string I need to reference in code. There's a good chance one of the platform header files has already defined this value as an existing macro. If so, how would I discover that macro, so I don't end up duplicating it under another name?
We know that macros are replaced at compile time. So I want to know if there is any way to search for a macro name by its value.
Here is an example. When I parse the USN record, I find that the FileReferenceNumber of the drive's root directory is always 1407374883553285, so I would like to check whether it is already defined in XXX.h; then I wouldn't need to define another one.
By the way, if we can search macro, how about constexpr?
Gcc and clang will print out a list of #defines if you invoke them with the options -E -dM. (If you don't use -E, -dM does something else.)
Unfortunately, macros and arithmetic expressions in the macro replacement texts are not expanded or evaluated, so you'll only be able to find the value if you know its textual representation. Still, it's a first step.
That won't work for enum member values and constexprs. I don't think there is any way to search for those which doesn't involve using some C parsing library to build a symbol table. Such libraries exist, but they're not necessarily well-documented, stable, or easy to use.
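As a rough sketch of that first step, the function below scans the text that gcc or clang prints with -E -dM and returns the names of macros whose replacement text is exactly the value you are looking for. The function name macros_with_value and the macro names in the usage note are hypothetical. It assumes one "#define NAME replacement" per line and does a purely textual comparison, so a macro defined as 0x5000000000005 or as an expression will not be found. Function-like macros with spaces in their parameter list are not handled.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: find macros whose replacement text equals `value`.
// `dump` is expected to hold preprocessor output from the -E -dM options.
std::vector<std::string> macros_with_value(std::istream& dump,
                                           const std::string& value) {
    std::vector<std::string> names;
    std::string line;
    while (std::getline(dump, line)) {
        std::istringstream fields(line);
        std::string directive, name, text;
        if (!(fields >> directive >> name) || directive != "#define") {
            continue;  // not a macro definition line
        }
        // The rest of the line (minus leading whitespace) is the replacement
        // text; it stays empty for macros defined without a value.
        std::getline(fields >> std::ws, text);
        if (text == value) {
            names.push_back(name);
        }
    }
    return names;
}
```

Fed a dump containing "#define ROOT_FRN 1407374883553285" and "#define OTHER 42", a search for "1407374883553285" returns only ROOT_FRN.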

Why is LLDB's print (or p) command so limited?

When I first heard about it, it sounded like a great feature: a C++ REPL. However, it cannot call STL functions or methods, and has a whole lot of other issues. This question also applies to conditional breakpoints.
Is it still an experimental feature, or have the developers just dropped it?
Example:
(lldb) p iterator->aField
error: call to a function 'std::__1::__wrap_iter<aClass const*>::operator->() const' ('_ZNKSt3__111__wrap_iterIPK8aClassEptEv') that is not present in the target
error: 0 errors parsing expression
error: The expression could not be prepared to run in the target
At present, there's no good way for the debugger to generate methods of template specializations for which the compiler only emitted inlined versions. And the debugger can't call inlined methods.
Here's one limited trick, explicit template instantiation, that you can use to force the compiler to generate full copies of the relevant template class so that there are functions the debugger can call. (It works in any standard C++ mode; only the extern template form needs C++11.) For instance, if I put:
template class std::vector<int>;
in my source code somewhere, the compiler will generate real copies of all the functions in the int specialization of std::vector. This obviously isn't a full solution, and you should only do this in debug builds or it will bloat your code. But when there are a couple of types that you really call methods on, it's a useful trick to know.
You mention a "whole lot of other issues". Please file bugs on any expression parser issues you find in lldb, either with the lldb bugzilla: https://llvm.org/bugs, or Apple's bug reporter: http://bugreporter.apple.com. The expression parser is under active development.
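A minimal sketch of how this might look in a source file follows. The Sample type and the first_field function are invented for illustration, and the NDEBUG guard is just one possible way to keep the instantiation out of release builds.

```cpp
#include <vector>

// Hypothetical element type, standing in for the class you want to inspect.
struct Sample {
    int aField;
};

#ifndef NDEBUG
// Explicit instantiation definition: asks the compiler to emit out-of-line
// copies of the member functions of std::vector<Sample>, so a debugger has
// real functions to call. Skipped when NDEBUG is defined (release builds).
template class std::vector<Sample>;
#endif

// Ordinary code keeps using the container as usual.
int first_field(const std::vector<Sample>& items) {
    return items.front().aField;
}
```

Whether a particular inlined method becomes callable from the debugger still depends on the compiler and standard library in use.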

Why this definition of vector of shared_ptr can pass the compiler checking?

I have seen the following definition throughout legacy code:
std::vector<boost::shared_ptr<ClassNameAAA>> vecClass;
I am able to compile it with VS2008 w/o problems.
Question: My understanding is that the following line should be used instead:
std::vector<boost::shared_ptr<ClassNameAAA> > vecClass;
(note the added space between the two closing angle brackets)
Am I correct about this? If so, why does VS2008 allow the first form?
Thank you
This is one of several MS extensions.
Am I correct about this?
Yes, your understanding is correct. Before C++11, >> in this position was parsed as the right-shift operator.
However C++11 supports right-angle brackets.
MSVC++ 2008 is able to parse this because of a language extension.
Many compilers have extensions for features that eventually become part of the language. Parsing nested template declarations without the space is required as of the C++11 standard.
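The difference can be sketched as follows, using standard containers instead of boost::shared_ptr so the example has no external dependencies. All names here (make_grid_cxx11, make_grid_cxx03, Constant, Four) are made up for illustration. The last two declarations show the other side of the C++11 rule: a genuine right shift inside a template argument list has to be parenthesized.

```cpp
#include <vector>

// C++11 and later: ">>" closes both template argument lists.
std::vector<std::vector<int>> make_grid_cxx11() {
    return std::vector<std::vector<int>>(2, std::vector<int>(3, 0));
}

// Portable to C++03 as well: the space keeps ">>" from being read as a shift.
std::vector<std::vector<int> > make_grid_cxx03() {
    return std::vector<std::vector<int> >(2, std::vector<int>(3, 0));
}

// A real shift inside template arguments needs parentheses in C++11.
template <int N>
struct Constant {
    static const int value = N;
};
typedef Constant<(16 >> 2)> Four;
```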

Build parser from grammar at runtime

Many (most) regular expression libraries for C++ allow for creating the expression from a string during runtime. Is anyone aware of any C++ parser generators that allow for feeding a grammar (preferably BNF) represented as a string into a generator at runtime? All the implementations I've found either require an explicit code generator to be run or require the grammar to be expressed via clever template meta-programming.
It should be pretty easy to build a recursive descent, backtracking parser that accepts a grammar as input. You can reduce all your rules to the following form (or act as if you have):
A = B C D ;
Parsing such a rule by recursive descent is easy: call a routine that corresponds to finding a B, then one that finds a C, then one that finds a D. Given you are doing a general parser, you can always call a "parse_next_sentential_form(x)" function, and pass the name of the desired form (terminal or nonterminal token) as x (e.g., "B", "C", "D").
In processing such a rule, the parser wants to produce an A, by finding a B, then C, then D. To find B (or C or D), you'd like to have an indexed set of rules in which all the left-hand sides are the same, so one can easily enumerate the B-producing rules, and recurse to process their content. If your parser gets a failure, it simply backtracks.
This won't be a lightning fast parser, but shouldn't be terrible if well implemented.
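The idea above can be sketched in a few dozen lines. This is only an illustration under several assumptions: the grammar text uses the "A = B C D ;" form with whitespace-separated symbols, any symbol without a rule is treated as a terminal token, the input is already tokenized, and the grammar has no left recursion (which would make this loop forever). The names load_grammar, parse_symbol and accepts are invented here. Backtracking is done by tracking every position at which a symbol can end.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using Alternative = std::vector<std::string>;  // right-hand side of one rule
using Grammar = std::map<std::string, std::vector<Alternative>>;
using Tokens = std::vector<std::string>;

// Reads rules of the form "A = B C D ;" from a string, at runtime.
Grammar load_grammar(const std::string& text) {
    Grammar grammar;
    std::istringstream in(text);
    std::string lhs, equals, symbol;
    while (in >> lhs >> equals) {
        Alternative alt;
        while (in >> symbol && symbol != ";") alt.push_back(symbol);
        grammar[lhs].push_back(alt);  // rules are indexed by left-hand side
    }
    return grammar;
}

// Returns every position where `symbol` can end if parsing starts at `pos`.
std::set<std::size_t> parse_symbol(const Grammar& grammar,
                                   const std::string& symbol,
                                   const Tokens& input, std::size_t pos) {
    std::set<std::size_t> ends;
    auto rules = grammar.find(symbol);
    if (rules == grammar.end()) {  // no rule: treat as a terminal token
        if (pos < input.size() && input[pos] == symbol) ends.insert(pos + 1);
        return ends;
    }
    for (const Alternative& alt : rules->second) {
        std::set<std::size_t> current{pos};
        for (const std::string& part : alt) {
            std::set<std::size_t> next;
            for (std::size_t p : current) {
                std::set<std::size_t> r = parse_symbol(grammar, part, input, p);
                next.insert(r.begin(), r.end());
            }
            current = std::move(next);
            if (current.empty()) break;  // this alternative failed; try the next
        }
        ends.insert(current.begin(), current.end());
    }
    return ends;
}

// True if `start` can derive the whole token sequence.
bool accepts(const Grammar& grammar, const std::string& start,
             const Tokens& input) {
    return parse_symbol(grammar, start, input, 0).count(input.size()) > 0;
}
```

With the grammar text "S = a S b ; S = ;", the token sequence a a b b is accepted and a b b is rejected. A practical version would also build a parse tree and memoize results to avoid repeated work.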
One could also use an Earley parser, which parses by creating states of partially-processed rules.
If you wanted it to be fast, I suppose you could simply take the guts of Bison and make it into a library. Then if you have grammar text or grammar rules (different entry points into Bison), you could start it and have it produce its tables in memory (which it must do in some form). Don't spit them out; simply build an LR parsing engine that uses them. Voila, on-the-fly efficient parser generation.
You have to worry about ambiguities and the LALR(1)ness of your grammar if you do this; the previous two solutions work with any context free grammar.
I am not aware of an existing library for this. However, if performance and robustness are not critical, you can spawn bison or any other tool that generates C code (via popen(3) or similar), run gcc on the generated code, link it into a shared library, and load the library via dlopen(3)/dlsym(3). On Windows, use a DLL and LoadLibrary() instead.
The easiest option is to embed some scripting language or even a full-blown VM (e.g., Mono), and run your generated parsers on top of it. Lua has quite a powerful JIT compiler, decent metaprogramming capabilities and several Packrat implementations ready to use, so probably it would be the least effort way.
I just came across this http://cocom.sourceforge.net/ammunition++-13.html
The last one is an Earley Parser and it appears to take the grammar as a string.
One of the functions is:
Public function `parse_grammar'
`int parse_grammar (int strict_p, const char *description)'
is another function which tunes the parser to given grammar.
The grammar is given by string `description'.
The description is similiar YACC one.
The actual code is at http://sourceforge.net/projects/cocom/
EDIT
A newer version is at https://github.com/vnmakarov/yaep
boost::spirit is a C++ parsing framework that can be used to construct parsers dynamically at runtime.

Is it possible to create wrong Regular expression in ActionScript/Flex which will cause runtime error?

Is it possible to write an invalid regular expression in ActionScript/Flex that will cause a runtime error? I've tried so many weird regexps in Flex and Flex never complained! How do I know if my regexp is valid?
In theory, according to the ActionScript 3.0 SyntaxError documentation, when a regular expression cannot be parsed a SyntaxError is generated at runtime that you can detect in a try/catch block.
In practice, I've never actually seen the RegExp class exhibit this behavior.
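For comparison only, here is what the documented try/catch approach looks like in a library that does reject malformed patterns at construction time. This sketch uses C++ and std::regex (which defaults to an ECMAScript-style grammar), not ActionScript, and the helper name is_valid_pattern is made up. It says nothing about how Flex's RegExp class actually behaves.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if std::regex can compile the pattern.
bool is_valid_pattern(const std::string& pattern) {
    try {
        std::regex compiled(pattern);  // throws std::regex_error if malformed
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}
```

Here a pattern such as "foo(bar)" is accepted, while "foo([" is rejected because its parenthesis and bracket are never closed.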
I don't have ActionScript/Flex, so I can't test this. Since you haven't given any examples, I don't know what you think is a "weird" regex. What happens if you try one of these:
/(?<=x*)foo/
(ECMAScript regexes don't support lookbehind)
/foo([/
(missing closing parentheses/brackets)
/foo)]/
(missing opening parentheses/brackets)
/foo(?)/
(Syntax error)
/foo\1/
(invalid backreference)
If your end goal is to determine whether a particular regular expression is valid or not then I'm not sure trying to intentionally generate runtime errors is the best way to accomplish that.
Instead, I would recommend testing your patterns against known inputs and making sure they behave as intended. You can use a tool like this to test:
RegExr