I have a program which I'd like to profile. However, it has some functions declared with attribute gnu_inline. If I try to build the program with -finstrument-functions flag, I get linker errors, for example:
#define always_inline __attribute__((always_inline, gnu_inline))
static int inline always_inline f()
{
return 0;
}
int main(int argc, char *argv[])
{
int i = f();
return i;
}
gives me "undefined reference to f()" error.
The problem is that the compiler tries to pass the address of f() to the profiling functions, but the function is inlined, so no out-of-line body is emitted and f() has no address.
I tried to build my program with -fkeep-inline-functions flag, but it apparently has no effect on functions declared with gnu_inline.
Is it possible to force the compiler to emit a separate copy of f() for the linker? Or are such functions unprofileable?
My program uses Qt 5, and these functions are located in Qt headers, so I'd prefer not to alter function declaration, if it is possible.
The following two options should at least make your scenario work, even though there might be a prettier solution:
-finstrument-functions-exclude-file-list=file,file,...
Set the list of functions that are excluded from instrumentation (see the description of "-finstrument-functions"). If the file that contains a function definition matches with one
of file, then that function is not instrumented. The match is done on substrings: if the file parameter is a substring of the file name, it is considered to be a match.
For example:
-finstrument-functions-exclude-file-list=/bits/stl,include/sys
excludes any inline function defined in files whose pathnames contain "/bits/stl" or "include/sys".
If, for some reason, you want to include letter ',' in one of sym, write '\,'. For example, "-finstrument-functions-exclude-file-list='\,\,tmp'" (note the single quote surrounding the option).
-finstrument-functions-exclude-function-list=sym,sym,...
This is similar to "-finstrument-functions-exclude-file-list", but this option sets the list of function names to be excluded from instrumentation. The function name to be matched is its user-visible name, such as "vector<int> blah(const vector<int> &)", not the internal mangled name (e.g., "_Z4blahRSt6vectorIiSaIiEE"). The match is done on substrings: if the sym parameter is a substring of the function name, it is considered to be a match. For C99 and C++ extended identifiers, the function name must be given in UTF-8, not using universal character names.
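For context, -finstrument-functions makes GCC call two hooks, __cyg_profile_func_enter and __cyg_profile_func_exit, around every instrumented function. A minimal sketch of such hooks follows; the counter names are made up for illustration and a real profiler would record the addresses instead of just counting.

```cpp
// Illustrative counters; these names are assumptions, not part of GCC.
static unsigned long g_enter_count = 0;
static unsigned long g_exit_count = 0;

extern "C" {
// The hooks themselves must not be instrumented, or they would recurse.
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

// Called on entry to every instrumented function.
void __cyg_profile_func_enter(void *this_fn, void *call_site) {
    (void)this_fn;    // address of the function being entered
    (void)call_site;  // address it was called from
    ++g_enter_count;
}

// Called just before every instrumented function returns.
void __cyg_profile_func_exit(void *this_fn, void *call_site) {
    (void)this_fn;
    (void)call_site;
    ++g_exit_count;
}
}
```

A build line might then look like g++ -finstrument-functions -finstrument-functions-exclude-file-list=qt5 main.cpp hooks.cpp, where "qt5" is only a guess at a substring of the Qt header path and the file names are placeholders.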
I am trying to find all places in a large and old code base where certain constructors or functions are called. Specifically, these are certain constructors and member functions in the std::string class (that is, basic_string<char>). For example, suppose there is a line of code:
std::string foo(fiddle->faddle(k, 9).snark);
In this example, it is not obvious looking at this that snark may be a char *, which is what I'm interested in.
Attempts To Solve This So Far
I've looked into some of the dump features of gcc, and generated some of them, but I haven't been able to find any that tell me that the given line of code will generate a call to the string constructor taking a const char *. I've also compiled some code with -S to save the generated equivalent assembly code. But this suffers from two things: the function names are "mangled," so it's impossible to know what is being called in C++ terms; and there are no line numbers of any sort, so even finding the equivalent place in the source file would be tough.
Motivation and Background
In my project, we're porting a large, old code base from HP-UX (and their aCC C++ compiler) to Red Hat Linux and gcc/g++ v.4.8.5. The HP tool chain allowed one to initialize a string with a NULL pointer, treating it as an empty string. The GNU tools' generated code fails with some flavor of a null dereference error. So we need to find all of the potential cases of this, and remedy them. (For example, by adding code to check for NULL and using a pointer to a "" string instead.)
So if anyone out there has had to deal with the base problem and can offer other suggestions, those, too, would be welcomed.
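The remedy mentioned above (checking for NULL and substituting "") could be wrapped in a small helper once the call sites are found. This is only a sketch; safe_cstr and make_string are hypothetical names, not something from the code base or the standard library.

```cpp
#include <string>

// Hypothetical helper: maps a null pointer to an empty C string.
inline const char *safe_cstr(const char *p) {
    return p ? p : "";
}

// Hypothetical helper: builds a std::string that tolerates NULL input,
// mimicking the HP aCC behaviour described above. Passing a null pointer
// straight to the std::string constructor is undefined behaviour.
inline std::string make_string(const char *p) {
    return std::string(safe_cstr(p));
}
```

A call site such as std::string foo(fiddle->faddle(k, 9).snark); would then become std::string foo(make_string(fiddle->faddle(k, 9).snark));.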
Have you considered using static analysis?
Clang has one, the Clang Static Analyzer, that is extensible.
You can write a custom plugin that checks for this particular behavior by implementing a Clang AST visitor that looks for string variable declarations and checks whether they are initialized from a null pointer.
There is a manual for that here.
See also: https://github.com/facebook/facebook-clang-plugins/blob/master/analyzer/DanglingDelegateFactFinder.cpp
First I'd create a header like this:
#include <string>
class dbg_string : public std::string {
public:
using std::string::string;
dbg_string(const char*) = delete;
};
#define string dbg_string
Then modify your makefile and add "-include dbg_string.h" to the compiler flags to force-include the header in each source file without modifying the sources.
You could also check how NULL is defined on your platform and add a specific overload for it (e.g. dbg_string(int)).
You can try CppDepend and its CQLinq, a powerful code query language, to detect where particular constructors, methods, fields or types are used.
from m in Methods where m.IsUsing ("CClassView.CClassView()") select new { m, m.NbLinesOfCode }
How do I detect function definitions which are never getting called and delete them from the file and then save it?
Suppose I have only 1 CPP file as of now, which has a main() function and many other function definitions (function definition can also be inside main() ). If I were to write a program to parse this CPP file and check whether a function is getting called or not and delete if it is not getting called then what is(are) the way(s) to do it?
There are few ways that come to mind:
1. I would find out line numbers of beginning and end of main(). I can do it by maintaining a stack of opening and closing braces { and }.
2. Anything after main would be function definition. Then I can parse for function definitions. To do this I can parse it the following way:
< string >< open paren >< comma separated string(s) for arguments >< closing paren >
3. Once I have all the names of such functions as described in (2), I can make a map with its names as key and value as a bool, indicating whether a function is getting called once or not.
4. Finally parse the file once again to check for any calls for functions with their name as in this map. The function call can be from within main or from some other function. The value for the key (i.e. the function name) could be flagged according to whether a function is getting called or not.
I feel I have complicated my logic and it could be done in a smarter way. With the above logic it would be hard to find all the corner cases (there would be many). Also, there could be function pointers to make parsing logic difficult. If that's not enough, the function pointers could be typedefed too.
How do I go about designing my program? Are a map (to maintain filenames) and stack (to maintain braces) the right data structures or is there anything else more suitable to deal with it?
Note: I am not looking for any tool to do this. Nor do I want to use any library (if it exists to make things easy).
I think you should not try to build a C++ parser from scratch because, as others said in the comments, that is really hard. IMHO, you'd better start from the Clang libraries, which can do the low-level parsing for you, and work directly with the abstract syntax tree.
You could even use crange as an example of how to use them to produce a cross reference table.
Alternatively, you could directly use GNU global, because its gtags command directly generates definition and reference databases that you have to analyse.
IMHO those two ways would be simpler than creating a C++ parser from scratch.
The simplest approach for doing it yourself I can think of is:
Write a minimal parser that can identify functions. It just needs to detect the start and ending line of a function.
Programmatically comment out the first function, save to a temp file.
Try to compile the file by invoking the compiler.
Check if there are compile errors, if yes, the function is called, if not, it is unused.
Continue with the next function.
This is a comment, rather than an answer, but I post it here because it's too long for a comment space.
There are lots of issues you should consider. First of all, you should not assume that main() is the first function in a source file.
Even if it is, there should be some function declarations before main() so that the compiler can recognize their invocation in main.
Next, a function's opening and closing braces needn't be on separate lines, and they needn't be the only characters on their lines. Generally, almost the whole C++ code can be put on a single line!
Furthermore, functions can differ in their parameters' types while having the same name (overloading), so you can't recognize which function is called if you don't parse the whole code down to the parameters' types. And even more: you will have to match the argument type lists while applying standard conversions/casts, possibly considering implicit constructor calls. Of course you should not forget default parameters. Google for resolving overloaded function call, for example see an outline here
Additionally, there may be chains of unused functions. For example if a() calls b() and b() calls c() and d(), but a() itself is not called, then all four are unused, even though there exist 'calls' to b(), c() and d().
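Assuming the call graph has somehow already been extracted (which is the hard part), chains like this can be found with a reachability search from main(). A rough sketch, where CallGraph and unused_functions are illustrative names:

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Maps each defined function to the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns every function defined in the graph that cannot be reached
// from root by following calls (breadth-first search).
std::set<std::string> unused_functions(const CallGraph &graph,
                                       const std::string &root) {
    std::set<std::string> reachable;
    std::queue<std::string> pending;
    reachable.insert(root);
    pending.push(root);
    while (!pending.empty()) {
        std::string current = pending.front();
        pending.pop();
        auto it = graph.find(current);
        if (it == graph.end()) {
            continue;  // callee defined elsewhere, e.g. a library function
        }
        for (const auto &callee : it->second) {
            if (reachable.insert(callee).second) {
                pending.push(callee);  // first time this callee is seen
            }
        }
    }
    std::set<std::string> unused;
    for (const auto &entry : graph) {
        if (reachable.count(entry.first) == 0) {
            unused.insert(entry.first);
        }
    }
    return unused;
}
```

With the a(), b(), c(), d() example above, all four end up in the returned set because none of them is reachable from main.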
There is also a possibility that functions are called through a pointer, in which case you may be unable to find a call. Example:
int (*testfun)(int) = whattotest ? TestFun1 : TestFun2; // no call
int testResult = testfun(paramToTest); // unknown function called
Finally the code can be pretty obfuscated with #defines.
Conclusion: you'll probably have to write your own C++ compiler (except the machine code generator) to achieve your goal.
This is a very rough idea and I doubt it's very efficient but maybe it can help you get started. First traverse the file once, picking out any function names (I'm not entirely sure how you would do this). But once you have those names, traverse the file again, looking for the function name anywhere in the file, inside main and other functions too. If you find more than 1 instance it means that the function is being called and should be kept.
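The counting step of this rough idea might look like the sketch below. It is purely textual: count_identifier is a made-up name, it only checks that the match is a whole identifier, and it will happily count occurrences inside comments, strings or declarations.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// True for characters that can appear inside a C++ identifier.
static bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Counts whole-identifier occurrences of name in src.
std::size_t count_identifier(const std::string &src, const std::string &name) {
    if (name.empty()) {
        return 0;
    }
    std::size_t count = 0;
    std::size_t pos = src.find(name);
    while (pos != std::string::npos) {
        std::size_t end = pos + name.size();
        bool left_ok = (pos == 0) || !is_ident_char(src[pos - 1]);
        bool right_ok = (end >= src.size()) || !is_ident_char(src[end]);
        if (left_ok && right_ok) {
            ++count;  // not part of a longer identifier such as foobar
        }
        pos = src.find(name, pos + 1);
    }
    return count;
}
```

A function whose name is counted only once (its own definition) would be a candidate for removal under this heuristic.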
I am unable to use lldb to invoke simple, non-templated functions that take string arguments. Is there any way to get lldb to understand the C++ datatype "string", which is a commonly used datatype in C++ programs?
The sample source code here just creates a simple class with a few constructors, and then calls them (includes of "iostream" and "string" omitted):
using namespace std;
struct lldbtest{
int bar=5;
lldbtest(){bar=6;}
lldbtest(int foo){bar=foo;}
lldbtest(string fum){bar=7;}
};
int main(){
string name="fum";
lldbtest x,y(3);
cout<<x.bar<<y.bar<<endl;
return 0;
}
When compiled on a Mac running OS X Mavericks with
g++ -g -std=c++11 -o testconstructor testconstructor.cpp
the program runs and prints the expected output of "63".
However, when a breakpoint is set in main just before the return statement, an attempt to invoke the constructor fails with a cryptic error message:
p lldbtest(string("hey there"))
error: call to a function 'lldbtest::lldbtest(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >)' ('_ZN8lldbtestC1ENSt3__112basic_stringIcNS0_11char_traitsIcEENS0_9allocatorIcEEEE') that is not present in the target
error: The expression could not be prepared to run in the target
Possibly relevant as well, the command:
p lldbtest(name)
prints nothing at all.
Calling the constructor the standard way, with a string literal, also fails:
p lldbtest("foo")
gives a similar long error:
error: call to a function 'lldbtest::lldbtest(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >)' ('_ZN8lldbtestC1ENSt3__112basic_stringIcNS0_11char_traitsIcEENS0_9allocatorIcEEEE') that is not present in the target
error: The expression could not be prepared to run in the target
Is there any way to get lldb to understand and use the C++ "string" datatype? I have a number of functions taking string arguments and need a way to invoke these functions from the debugger. On a Mac.
THE PROBLEM
This is due to a subtle problem with your code, that boils down to the following wording from the C++ Standard:
7.1.2p3-4 Function specifiers [dcl.fct.spec]
A function defined within a class definition is an inline function.
...
An inline function shall be defined in every translation unit in which it is odr-used, and shall have exactly the same definition in every case (3.2).
Your constructor, lldbtest(std::string) is defined within the body of lldbtest which means that it will implicitly be inline, which further means that the compiler will not generate any code for it, unless it is used in the translation unit.
Since the definition must be present in every translation unit that potentially calls it we can imagine the compiler saying; "heck, I don't need to do this.. if someone else uses it, they will generate the code".
lldb will look for a function definition which doesn't exist, since gcc didn't generate one; because you didn't use it.
THE SOLUTION
If we change the definition of lldbtest to the following I bet it will work as you intended:
struct lldbtest{
int bar=5;
lldbtest();
lldbtest(int foo);
lldbtest(string fum);
};
lldbtest::lldbtest() { bar=6; }
lldbtest::lldbtest(int foo) { bar=foo; }
lldbtest::lldbtest(string) { bar=7; }
But.. what about p lldbtest(name)?
The command p in lldb is used to print information, but it can also be used to evaluate expressions.
lldbtest(name) will not call the constructor of lldbtest with a variable called name; it is equivalent to declaring a variable called name of type lldbtest, i.e. lldbtest name is semantically equivalent.
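The parsing rule described above is ordinary C++ and can be reproduced outside the debugger. The following sketch uses a made-up Widget type and declared_value function purely for illustration:

```cpp
struct Widget {
    int value;
    Widget() : value(1) {}
    explicit Widget(int v) : value(v) {}
};

// Shows that `Widget(name);` declares a variable instead of
// constructing a temporary from the existing `name`.
int declared_value(int name) {
    {
        // Parsed as the declaration `Widget name;`: a new Widget called
        // name is default-constructed and shadows the int parameter.
        Widget(name);
        return name.value;
    }
}
```

Whatever integer is passed in, the Widget(int) constructor is never called here; writing Widget w(name); or Widget{name}; avoids the ambiguity.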
Going to answer the asked question here instead of addressing the problem with the op's code. Especially since this took me a while to figure out.
Use a string in a function invocation in lldb in C++
(This post helped greatly, and is a good read: Dancing in The Debugger)
From the example of hooking C++ methods with MobileSubstrate I found this:
void (*X_ZN20WebFrameLoaderClient23dispatchWillSendRequestEPN7WebCore14DocumentLoaderEmRNS0_15ResourceRequestERKNS0_16ResourceResponseE) (void* something, void* loader, unsigned long identifier, void* request, const void** response);
Why do we need this x_zn20...23....7...14 etc. between the names? What does this mean? I don't think that this is the real name.
C++ mangles names of symbols emitted to the binary, to distinguish void foo(int) and void foo(double). Also, on many platforms, it needs to encode X::Y somehow to make it an alphanumeric string. This adds the extra characters and is platform dependent.
The notation you see is called name mangling.
It's a way of encoding method signatures (in the binary) so that they are unique across the binary, even if two methods have the same name and belong to classes of the same name, but differ only by scope (namespace) or parameters.
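On toolchains that follow the Itanium C++ ABI (GCC and Clang, for instance), a mangled name can be turned back into a readable signature with abi::__cxa_demangle from <cxxabi.h>. A small sketch, where demangle is just an illustrative wrapper name:

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of an Itanium-ABI symbol name,
// or the input unchanged if it cannot be demangled.
std::string demangle(const char *mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc; the caller must release it with free.
    char *readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);
    return result;
}
```

For the two overloads mentioned earlier, demangle("_Z3fooi") gives "foo(int)" and demangle("_Z3food") gives "foo(double)". The c++filt command-line tool performs the same translation.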
It looks like some platform specific hack to bind to a symbol in a compiled object.
You should look for the header file which contains that function name and call it properly.
This is bad because, as compilers evolve and the code base changes, the specifics of the symbol name will change.
I just want to ask your ideas regarding this matter. For a certain important reason, I must extract/acquire all function names of functions that were called inside a "main()" function of a C source file (ex: main.c).
Example source code:
int main()
{
int a = functionA(); // functionA must be extracted
int b = functionB(); // functionB must be extracted
}
As you know, the only markers I can use to identify these function calls are their parentheses "()". I've already considered several factors in implementing this function name extraction. These are:
1. functions may have parameters. Ex: functionA(100)
2. Loop operators. Ex: while()
3. Other operators. Ex: if(), else if()
4. Other operator between function calls with no spaces. Ex: functionA()+functionB()
As of this moment I know what you're saying, this is a pain in the $$$... So please share your thoughts and ideas... and bear with me on this one...
Note: this is in C++ language...
You can write a small C++ parser by combining FLEX (or LEX) and BISON (or YACC).
Take C++'s grammar
Generate a C++ program parser with the mentioned tools
Make that program count the function calls you are mentioning
Maybe a little bit too complicated for what you need to do, but it should certainly work. And LEX/YACC are amazing tools!
One option is to write your own C tokenizer (simple: just be careful enough to skip over strings, character constants and comments), and to write a simple parser, which counts the number of {s open, and finds instances of identifier + ( within. However, this won't be 100% correct. The disadvantage of this option is that it's cumbersome to implement preprocessor directives (e.g. #include and #define): there can be a function called from a macro (e.g. getchar) defined in an #include file.
An option that works 100% is compiling your .c file to an assembly file, e.g. gcc -S file.c, and finding the call instructions in the resulting file.s. A similar option is compiling your .c file to an object file, e.g. gcc -c file.c, generating a disassembly dump with objdump -d file.o, and searching for call instructions.
Another option is finding a parser using Clang / LLVM.
GNU cflow might be helpful.