Automatically adding Enter/Exit Function Logs to a Project - c++

I have some third-party source code that I need to investigate. I want to see the order in which the functions are called, but I don't want to waste my time typing:
printf("Entered into %s", __FUNCTION__)
and
printf("Exited from %s", __FUNCTION__)
for each function, nor do I want to touch any source file.
Do you have any suggestions? Is there a compiler flag that automagically does this for me?
Clarifications to the comments:
I will cross-compile the source to run it on ARM.
I will compile it with gcc.
I don't want to analyze the static code. I want to trace the runtime. So doxygen will not make my life easier.
I have the source and I can compile it.
I don't want to use Aspect Oriented Programming.
EDIT:
I found that the 'frame' command at the gdb prompt prints the current frame (or the function name, you could say) at that point in time. Perhaps it is possible (using gdb scripts) to call the 'frame' command every time a function is called. What do you think?

Besides the usual debugger and aspect-oriented programming techniques, you can also inject your own instrumentation functions using gcc's -finstrument-functions command line option. You'll have to implement your own __cyg_profile_func_enter() and __cyg_profile_func_exit() functions (declare these as extern "C" in C++).
They provide a means to track what function was called from where. However, the interface is a bit difficult to use since the address of the function being called and its call site are passed instead of a function name, for example. You could log the addresses, and then pull the corresponding names from the symbol table using something like objdump --syms or nm, assuming of course the symbols haven't been stripped from the binaries in question.
It may just be easier to use gdb. YMMV. :)
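As a rough illustration of the -finstrument-functions approach described above, here is a minimal sketch of the two hooks. The hook names and signatures are the ones gcc expects; the g_depth counter and the output format are my own assumptions, and the sketch assumes a single-threaded program.

```cpp
#include <cstdio>

// Current nesting depth; used only to indent the trace.
// Assumption: single-threaded program (make this thread_local otherwise).
static int g_depth = 0;

extern "C" {

// Called by gcc-instrumented code on entry to every function.
// no_instrument_function keeps the hook itself from being instrumented,
// which would otherwise lead to infinite recursion.
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site) {
    std::fprintf(stderr, "%*senter %p (from %p)\n",
                 2 * g_depth, "", this_fn, call_site);
    ++g_depth;
}

// Called by gcc-instrumented code just before every function returns.
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site) {
    if (g_depth > 0) --g_depth;
    std::fprintf(stderr, "%*sexit  %p (from %p)\n",
                 2 * g_depth, "", this_fn, call_site);
}

}  // extern "C"
```

Link this file into a program built with something like g++ -g -finstrument-functions, then map the printed addresses back to names with nm or objdump --syms as described above. On position-independent executables the printed values are load addresses, so the load base may have to be subtracted first.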

You said "nor do I want to touch any source file"... fair game if you let a script do it for you?
Run this on all your .cpp files
sed 's/^{/{ENTRY/'
So that it transforms them into this:
void foo()
{ENTRY
// code here
}
Put this in a header that can be #included by every unit:
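#include <cstdio>
// Two-level concatenation so that __LINE__ is expanded before pasting.
#define ENTRY_CONCAT2(a, b) a ## b
#define ENTRY_CONCAT(a, b) ENTRY_CONCAT2(a, b)
#define ENTRY EntryRaiiObject ENTRY_CONCAT(obj, __LINE__) (__FUNCTION__);
struct EntryRaiiObject {
    EntryRaiiObject(const char *f) : f_(f) { printf("Entered into %s\n", f_); }
    ~EntryRaiiObject() { printf("Exited from %s\n", f_); }
    const char *f_;
};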
You may have to get fancier with the sed script. You can also put the ENTRY macro anywhere else you want to probe, like some deeply nested inner scope of a function.

Use the /Gh (Enable _penter Hook Function) and /GH (Enable _pexit Hook Function) compiler switches (if you can compile the sources, of course).
NOTE: you won't be able to use those macros. See here ("you will need to get the function address (in EIP register) and compare it against addresses in the map file that can be generated by the linker (assuming no rebasing has occurred). It'll be very slow though.")

If you're using gcc, the magic compiler flag is -g. Compile with debugging symbols, run the program under gdb, and generate stack traces. You could also use ptrace, but it's probably a lot easier to just use gdb.

I agree with William: use gdb to see the run-time flow.
There are some static code analyzers that can tell which functions call which and can give you a call flow graph. One such tool is "Understand C++" (it supports C/C++), but I guess it's not free. You can find similar tools, though.

Related

Logging code execution in C++

Having used gprof and callgrind many times, I have reached the (obvious) conclusion that I cannot use them efficiently when dealing with large (as in a CAD program that loads a whole car) programs. I was thinking that maybe, I could use some C/C++ MACRO magic and somehow build a simple (but nice) logging mechanism. For example, one can call a function using the following macro:
#define CALL_FUN(fun_name, ...) \
fun_name (__VA_ARGS__);
We could add some clocking/timing stuff before and after the function call, so that every function called with CALL_FUN gets timed, e.g.
#define CALL_FUN(fun_name, ...) \
time(&t0); \
fun_name (__VA_ARGS__); \
time(&t1);
The variables t0, t1 could be found in a global logging object. That logging object can also hold the calling graph for each function called through CALL_FUN. Afterwards, that object can be written in a (specifically formatted) file, and be parsed from some other program.
So here comes my (first) question: Do you find this approach tractable? If yes, how can it be enhanced, and if not, can you propose a better way to measure time and log call graphs?
A colleague proposed another approach to deal with this problem, which is annotating with a specific comment each function (that we care to log). Then, during the make process, a special preprocessor must be run, parse each source file, add logging logic for each function we care to log, create a new source file with the newly added (parsing) code, and build that code instead. I guess that reading CALL_FUN... macros (my proposal) all over the place is not the best approach, and his approach would solve this problem. So what is your opinion about this approach?
PS: I am not well versed in the pitfalls of C/C++ MACROs, so if this can be developed using another approach, please say so.
Thank you.
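The CALL_FUN idea from the question can also be sketched without macros, as a function template that records both the time and the caller/callee edges in a global log object. All names here (CallLog, call_log, CallScope, timed_call, the "<root>" label) are hypothetical, the recorded times are inclusive of callees, and the sketch assumes a single thread.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical global logging object, as suggested in the question.
struct CallLog {
    std::map<std::string, double> seconds;                     // total inclusive time per function
    std::map<std::pair<std::string, std::string>, int> edges;  // (caller, callee) -> call count
    std::vector<std::string> stack;                            // functions currently executing
};

inline CallLog &call_log() {
    static CallLog log;
    return log;
}

// Records a call-graph edge on construction and the elapsed time on destruction.
class CallScope {
public:
    explicit CallScope(const char *name)
        : start_(std::chrono::steady_clock::now()) {
        CallLog &log = call_log();
        const std::string caller =
            log.stack.empty() ? std::string("<root>") : log.stack.back();
        ++log.edges[std::make_pair(caller, std::string(name))];
        log.stack.push_back(name);
    }
    ~CallScope() {
        CallLog &log = call_log();
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        log.seconds[log.stack.back()] += elapsed.count();
        log.stack.pop_back();
    }
    CallScope(const CallScope &) = delete;
    CallScope &operator=(const CallScope &) = delete;

private:
    std::chrono::steady_clock::time_point start_;
};

// Calls fn(args...) inside a CallScope and forwards the result
// (this also works for functions returning void).
template <typename Fn, typename... Args>
auto timed_call(const char *name, Fn &&fn, Args &&...args)
    -> decltype(std::forward<Fn>(fn)(std::forward<Args>(args)...)) {
    CallScope scope(name);
    return std::forward<Fn>(fn)(std::forward<Args>(args)...);
}
```

A call such as timed_call("solve", solve, matrix) then replaces CALL_FUN(solve, matrix), and call_log() can be written to a file at exit for later parsing.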
Well, you could do some C++ magic to embed a logging object, something like:
class CDebug
{
public:
    CDebug() { /* ... log somehow ... */ }
    ~CDebug() { /* ... log somehow ... */ }
    void heythishappened() { /* ... log somehow ... */ }
};
In your functions you then simply write:
void foo()
{
    CDebug dbg;
    // ...
    // you could add some debug info:
    dbg.heythishappened();
    // ...
} // note: the dtor is called here, and also if the function exits early (return or exception)
I am bit late, but here is what I am doing for this:
On Windows there is a /Gh compiler switch which makes the compiler insert a call to a hidden _penter function at the start of each function. There is also a switch (/GH) for getting a _pexit call at the end of each function.
You can utilize this to get callbacks on each function call. Here is an article with more details and sample source code:
http://www.johnpanzer.com/aci_cuj/index.html
I am using this approach in my custom logging system for storing the last few thousand function calls in a ring buffer. This turned out to be useful for crash debugging (in combination with MiniDumps).
Some notes on this:
The performance impact very much depends on your callback code. You need to keep it as simple as possible.
You just need to store the function address and module base address in the log file. You can then later use the Debug Interface Access SDK to get the function name from the address (via the PDB file).
All this works surprisingly well for me.
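The ring buffer mentioned above might look roughly like the following sketch. CallRing is a hypothetical name and not taken from the linked article; the sketch is not thread-safe, and a real _penter callback would call push() with the address of the function being entered.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Fixed-size ring buffer that remembers the most recent N function addresses.
template <std::size_t N>
class CallRing {
public:
    // Record one call; overwrites the oldest entry once the buffer is full.
    void push(const void *fn) {
        slots_[next_] = fn;
        next_ = (next_ + 1) % N;
        if (count_ < N) ++count_;
    }

    // Return the recorded addresses, oldest first (e.g. for a crash report).
    std::vector<const void *> snapshot() const {
        std::vector<const void *> out;
        out.reserve(count_);
        const std::size_t start = (next_ + N - count_) % N;
        for (std::size_t i = 0; i < count_; ++i)
            out.push_back(slots_[(start + i) % N]);
        return out;
    }

private:
    std::array<const void *, N> slots_{};
    std::size_t next_ = 0;   // index of the slot that will be written next
    std::size_t count_ = 0;  // number of valid entries, at most N
};
```

Keeping push() this small matters because, as noted above, the performance impact depends almost entirely on what the callback does.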
Many nice industrial libraries have functions' declarations and definitions wrapped in void macros, just in case. If your code is already like that, go ahead and debug your performance problems with some simple asynchronous trace logger. If not, inserting such macros after the fact can be unacceptably time-consuming.
I can understand the pain of running a 1Mx1M matrix solver under valgrind, so I would suggest starting with the so-called "Monte Carlo profiling method": start the process and, in parallel, run pstack repeatedly, say every second. As a result you will have N stack dumps (N can be quite significant). Then, the mathematical approach would be to count the relative frequencies of each stack and draw a conclusion about the most frequent ones. In practice you either see the bottleneck immediately or, if not, you switch to bisection, gprof, and finally to valgrind's toolset.
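The frequency-counting step could be sketched like this. It assumes each stack dump has already been reduced to one string (here with frames joined by ';', which is my own convention and not pstack's output format); rank_stacks is a hypothetical helper name.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Given one string per stack dump, return (stack, count) pairs sorted by
// descending count, so the most frequently observed stacks come first.
std::vector<std::pair<std::string, int>>
rank_stacks(const std::vector<std::string> &samples) {
    std::map<std::string, int> counts;
    for (const std::string &s : samples) ++counts[s];

    std::vector<std::pair<std::string, int>> ranked(counts.begin(), counts.end());
    std::stable_sort(ranked.begin(), ranked.end(),
                     [](const std::pair<std::string, int> &a,
                        const std::pair<std::string, int> &b) {
                         return a.second > b.second;
                     });
    return ranked;
}
```

Dividing each count by the number of samples gives the relative frequency, which is a rough estimate of the fraction of wall-clock time spent in that stack.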
Let me assume the reason you are doing this is you want to locate any performance problems (bottlenecks) so you can fix them to get higher performance.
As opposed to measuring speed or getting coverage info.
It seems you're thinking the way to do this is to log the history of function calls and measure how long each call takes.
There's a different approach.
It's based on the idea that mainly the program walks a big call tree.
If time is being wasted it is because the call tree is more bushy than necessary,
and during the time that's being wasted, the code that's doing the wasting is visible on the stack.
It can be terminal instructions, but more likely function calls, at almost any level of the stack.
Simply pausing the program under a debugger a few times will eventually display it.
Anything you see it doing, on more than one stack sample, if you can improve it, will speed up the program.
It works whether or not the time is being spent in CPU, I/O or anything else that consumes wall clock time.
What it doesn't show you is tons of stuff you don't need to know.
The only way it can not show you bottlenecks is if they are very small,
in which case the code is pretty near optimal.
Here's more of an explanation.
Although I think it will be hard to do anything better than gprof, you can create a special class LOG for instance and instantiate it in the beginning of each function you want to log.
class LOG {
public:
    LOG(const char* function, const char* file) {
        // log time_t of the beginning of the call
    }
    ~LOG() {
        // calculate the total time spent,
        // by difference between current time and that saved in the constructor
    }
};
void somefunction() {
    LOG log(__FUNCTION__, __FILE__);
    // .. do other things
}
Now you can integrate this approach with the preprocessing one you mentioned. Just add something like this in the beginning of each function you want to log:
// ### LOG
and then you replace the string automatically in debug builds (shouldn't be hard).
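A slightly more complete version of the LOG idea, using std::chrono instead of time_t for finer resolution, might look like this. TimingRecord, timing_log, ScopeTimer, and LOG_SCOPE are hypothetical names, and the NDEBUG switch is only one possible way to limit the logging to debug builds.

```cpp
#include <chrono>
#include <string>
#include <vector>

// One finished measurement: which function ran and for how long.
struct TimingRecord {
    std::string function;
    long long microseconds;
};

// Hypothetical global log that collects one record per timed scope.
inline std::vector<TimingRecord> &timing_log() {
    static std::vector<TimingRecord> log;
    return log;
}

// Saves the start time in the constructor and appends the elapsed time
// to timing_log() in the destructor, i.e. on every exit path.
class ScopeTimer {
public:
    explicit ScopeTimer(const char *function)
        : function_(function), start_(std::chrono::steady_clock::now()) {}

    ~ScopeTimer() {
        auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
            std::chrono::steady_clock::now() - start_);
        timing_log().push_back(TimingRecord{function_, elapsed.count()});
    }

    ScopeTimer(const ScopeTimer &) = delete;
    ScopeTimer &operator=(const ScopeTimer &) = delete;

private:
    const char *function_;
    std::chrono::steady_clock::time_point start_;
};

// Marker to place at the top of a function; expands to nothing when
// NDEBUG is defined, so release builds carry no logging code.
#ifndef NDEBUG
#define LOG_SCOPE() ScopeTimer log_scope_timer_(__func__)
#else
#define LOG_SCOPE() ((void)0)
#endif
```

With this in a common header, writing LOG_SCOPE(); as the first statement of a function plays the role of the // ### LOG marker without needing a separate preprocessing step.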
Maybe you should use a profiler. AQTime is a relatively good one for Visual Studio. (If you have VS2010 Ultimate, you already have a profiler.)

How do I get a print of the functions called in a C++ program under Linux?

What I want is a mix of what can be obtained by a static code analysis like Doxygen and the stackframe you can see when using GDB. I know which problematic function I'm debugging and I want to see the neighbourhood of the function calls that guided the execution to this function call. For instance, running a simple HelloWorld! would output something like:
main:
Greeter::Greeter()
Greeter::printHello()
Greeter::printWorld()
denoting that from the main function, the constructor was called and then the printHello and printWorld functions were called. Notice that in GDB, if I break at printWorld, I won't be able to see in the stack frame that printHello was called.
Any ideas about how to trace function calls without going through the pain of inserting log messages in a myriad of source files?
Thanks!!
The -finstrument-functions option to gcc instructs the compiler to call a user-provided profiling function at every function entry and exit.
You could use this to write a function that just logs every function entry and exit.
From reading the question I understand that you want a list of all relevant functions executed in order as they're executed.
Unfortunately there is no application to generate this list automatically, but there are helper macros to save you a lot of time. Define a single macro called LOGFUNCTION or whatever you want and define it as:
#define LOGFUNCTION printf("In %s (%s:%d)\n", __PRETTY_FUNCTION__, __FILE__, __LINE__);
Now you just have to paste the line LOGFUNCTION wherever you want a trace to be added.
see http://gcc.gnu.org/onlinedocs/gcc/Function-Names.html and http://gcc.gnu.org/onlinedocs/cpp/Standard-Predefined-Macros.html
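To get the indented output shown in the question, the macro can create a small scope object instead of calling printf directly. This is a sketch under a few assumptions: TraceScope, trace_lines, TRACE_CALL, and run_demo are hypothetical names, __PRETTY_FUNCTION__ is a GCC/Clang extension, and the depth counter is not thread-safe.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical in-memory copy of the trace: one indented line per traced call.
inline std::vector<std::string> &trace_lines() {
    static std::vector<std::string> lines;
    return lines;
}

// Prints the function name indented by the current call depth, then
// increases the depth until the enclosing function returns.
class TraceScope {
public:
    explicit TraceScope(const char *name) {
        std::string line(static_cast<std::string::size_type>(2 * depth()), ' ');
        line += name;
        std::puts(line.c_str());
        trace_lines().push_back(line);
        ++depth();
    }
    ~TraceScope() { --depth(); }
    TraceScope(const TraceScope &) = delete;
    TraceScope &operator=(const TraceScope &) = delete;

private:
    static int &depth() {
        static int d = 0;
        return d;
    }
};

#define TRACE_CALL() TraceScope trace_scope_(__PRETTY_FUNCTION__)

// Demo class modelled on the example in the question.
struct Greeter {
    Greeter() { TRACE_CALL(); }
    void printHello() { TRACE_CALL(); }
    void printWorld() { TRACE_CALL(); }
};

// Stands in for main() in the question's example.
void run_demo() {
    TRACE_CALL();
    Greeter g;
    g.printHello();
    g.printWorld();
}
```

Calling run_demo() prints its own name at the left margin and the three Greeter calls indented one level below it. The macro still has to be pasted into every function of interest.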
GDB features a stack trace, it does what you ask for.
What he wants is to obtain that info (for example, the backtrace from gdb), but printed in a 'nicer' format than gdb produces.
I think you can't. I mean, maybe there is some type of app that traces your application and does something like that, but I have never heard of one.
The best thing you can do is use GDB, and maybe create a bash script that uses gdb to obtain the info and prints it out the way you like.
Of course, your application MUST be compiled with debug symbols (the -g parameter to gcc).
I'm not entirely sure what the problem is with gdb's backtrace, but maybe a profiler is closer to what you want? For example, using valgrind's callgrind tool:
valgrind --tool=callgrind ./myprogram
kcachegrind callgrind.out.NNNN
Have you tried to use gprof to generate a call graph? You can also convert gprof output to something easier on the eye with gprof2dot for example.

Preventing GDB from stepping into a function (or file)

I have some C++ code like this that I'm stepping through with GDB:
void foo(int num) { ... }
int main() {
Baz baz;
foo (baz.get());
}
When I'm in main(), I want to step into foo(), but I want to step over baz.get().
The GDB docs say that "the step command only enters a function if there is line number information for the function", so I'd be happy if I could remove the line number information for baz.get() from my executable. But ideally, I'd be able to tell GDB "never step into any function in the Baz class".
Does anyone know how to do this?
Starting with GDB 7.4, skip can be used.
Run info skip, or check out the manual for details: https://sourceware.org/gdb/onlinedocs/gdb/Skipping-Over-Functions-and-Files.html
Instead of choosing to "step", you can use the "until" command to usually behave in the way that you desire:
(gdb) until foo
I don't know of any way to permanently configure gdb to skip certain symbols (aside from eliding their debugging information).
Edit: actually, the GDB documentation states that you can't use until to jump to locations that aren't in the same frame. I don't think this is true, but in the event that it is, you can use advance for the same purpose:
(gdb) advance foo
Page 85 of the GDB manual defines what can be used as "location" arguments for commands that take them. Just putting "foo" will make it look for a function named foo, so as long as it can find it, you should be fine. Alternatively you're stuck typing things like the filename:linenum for foo, in which case you might just be better off setting a breakpoint on foo and using continue to advance to it.
(I think this might be better suited as a comment rather than an answer, but I don't have enough reputation to add a comment yet.)
So I've also been wanting to ignore STL, Boost, et al (collectively '3rd Party') files when debugging for a while. Yesterday I finally decided to look for a solution and it seems the nearest capability is the 'skip' command in GDB.
I found the 'skip' ability in GDB to be helpful, but it's still a nuisance for me because my program uses a lot of STL and other "3rd Party" template code. In this case I have to mark a bunch of files as skip. After the 2nd time doing so I realized it would be more helpful to be able to skip an entire directory--and most helpful to skip a directory and all subdirectories. That way I can skip, for example, /usr since none of my code lives there and I typically have no interest in debugging through 3rd party code. So I extended the 'skip' command in gdb to support a new type 'dir'. I can now do this in gdb:
skip dir /usr
and then I'm never stopped in any of my 3rd party headers.
Here's a webpage w/ this info + the patch if it helps anyone: info & patch to skip directories in GDB
It appears that this isn't possible in GDB. I've filed a bug.
Meanwhile, gdb has the skip function command. Just execute it when you are inside the uninteresting function and it will not bother you again.
skip file is also very useful to get rid of the STL internals.
As Justin has said, it has been added in gdb 7.4. For more details, take a look at the documentation.

Does an arbitrary instruction pointer reside in a specific function?

I have a very difficult problem I'm trying to solve: Let's say I have an arbitrary instruction pointer. I need to find out if that instruction pointer resides in a specific function (let's call it "Foo").
One approach to this would be to try to find the start and ending bounds of the function and see if the IP resides in it. The starting bound is easy to find:
void *start = reinterpret_cast<void *>(&Foo);
The problem is, I don't know how to get the ending address of the function (or how "long" the function is, in bytes of assembly).
Does anyone have any ideas how you would get the "length" of a function, or a completely different way of doing this?
Let's assume that there is no SEH or C++ exception handling in the function. Also note that I am on a win32 platform, and have full access to the win32 api.
This won't work. You're presuming functions are contiguous in memory and that one address will map to one function. The optimizer has a lot of leeway here and can move code from functions around the image.
If you have PDB files, you can use something like the dbghelp or DIA APIs to figure this out. For instance, SymFromAddr. There may be some ambiguity here, as a single address can map to multiple functions.
I've seen code that tries to do this before with something like:
#pragma optimize("", off)
void Foo()
{
}
void FooEnd()
{
}
#pragma optimize("", on)
And then FooEnd-Foo was used to compute the length of function Foo. This approach is incredibly error prone and still makes a lot of assumptions about exactly how the code is generated.
Look at the *.map file which can optionally be generated by the linker when it links the program, or at the program's debug (*.pdb) file.
OK, I haven't done assembly in about 15 years. Back then, I didn't do very much. Also, it was 680x0 asm. BUT...
Don't you just need to put a label before and after the function, take their addresses, subtract them for the function length, and then just compare the IP? I've seen the former done. The latter seems obvious.
If you're doing this in C, look first for debugging support: ChrisW is spot on with map files, but also see if your C compiler's standard library provides anything for this low-level stuff. Most compilers provide tools for analysing the stack etc., for instance, even though it's not standard. Otherwise, try just using inline assembly, or wrapping the C function with an assembly file and an empty wrapper function with those labels.
The most simple solution is maintaining a state variable:
volatile int FOO_is_running = 0;
int Foo( int par ){
FOO_is_running = 1;
/* do the work */
FOO_is_running = 0;
return 0;
}
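A variation on the state variable above uses a small guard object so that the state is reset on every exit path, and a counter instead of a 0/1 flag so that recursion and multiple threads are handled. ActiveGuard, foo_active, and foo_is_running are hypothetical names. Note that this only tells you whether Foo is currently active somewhere, not whether one particular instruction pointer lies inside it.

```cpp
#include <atomic>

// Number of activations of Foo currently in progress (threads or recursion).
static std::atomic<int> foo_active{0};

// Increments the counter on construction and decrements it on destruction,
// so the count is restored on every exit path, including exceptions.
struct ActiveGuard {
    explicit ActiveGuard(std::atomic<int> &c) : counter(c) { ++counter; }
    ~ActiveGuard() { --counter; }
    ActiveGuard(const ActiveGuard &) = delete;
    ActiveGuard &operator=(const ActiveGuard &) = delete;
    std::atomic<int> &counter;
};

int Foo(int par) {
    ActiveGuard guard(foo_active);
    if (par < 0) return -1;  // early return: the guard still resets the counter
    /* do the work */
    return 0;
}

// True while at least one call to Foo is in progress.
bool foo_is_running() { return foo_active.load() > 0; }
```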
Here's how I do it, but it's using gcc/gdb.
$ gdb ImageWithSymbols
gdb> info line * 0xYourEIPhere
Edit: Formatting is giving me fits. Time for another beer.

Tools for finding unused function declarations?

Whilst refactoring some old code I realised that a particular header file was full of function declarations for functions long since removed from the .cpp file. Does anyone know of a tool that could find (and strip) these automatically?
If possible, you could write a test.cpp file that calls them all; the linker will flag the ones that have no code as unresolved. This way your test code only needs to build, and you don't have to worry about actually running it.
PC-lint can be tuned for this purpose.
I tested the following code for your question:
void foo(int );
int main()
{
return 0;
}
lint.bat test_unused.cpp
and got the following result:
============================================================
--- Module: test_unused.cpp (C++)
--- Wrap-up for Module: test_unused.cpp
Info 752: local declarator 'foo(int)' (line 2, file test_unused.cpp) not referenced
test_unused.cpp(2) : Info 830: Location cited in prior message
============================================================
So you can enable just warning number 752 for your purpose:
lint.bat -"e*" +e752 test_unused.cpp
-"e*" will suppress all the warnings and +e752 will turn this specific one back on.
If you index the code with Doxygen, you can see where each function is referenced from. However, you would have to browse through each class (1 HTML page per class) and scan for those that don't have anything pointing to them.
Alternatively, you could use ctags to generate a list of all functions in the code, then use objdump or some similar tool to get a list of all functions in the .o files, and then compare those lists. However, this can be problematic due to name mangling.
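The comparison step can be sketched as a set difference. This assumes both lists have already been reduced to comparable, demangled names (for instance with nm -C or c++filt), which is the hard part in practice; declared_but_undefined is a hypothetical helper name.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Returns the names present in `declared` (e.g. from ctags) but absent
// from `defined` (e.g. from the object files), in sorted order.
std::vector<std::string>
declared_but_undefined(const std::set<std::string> &declared,
                       const std::set<std::string> &defined) {
    std::vector<std::string> missing;
    std::set_difference(declared.begin(), declared.end(),
                        defined.begin(), defined.end(),
                        std::back_inserter(missing));
    return missing;
}
```

Anything this reports is only a candidate for removal, since the function may be defined in an external library or be an inline or template function that never appears in the object files.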
I don't think there is such a thing, because some functions that have no body in the actual source tree might be defined in some external library. This can only be done by creating a script which makes a list of the functions declared in a header and verifies whether they are ever called.
I have a C++ ftplugin for vim that is able to check and report unmatched functions. Vimmers, note that the ftplugin suite is not yet straightforward to install. The ftplugin is based on ctags results (hence its heuristic could easily be adapted to other environments); sometimes there are false positives in the case of inline functions.
HTH,
In addition to Doxygen (@Milan Babuskov), you can see if there are warnings for this in your compiler. E.g. gcc has -Wunused-function for static functions, and -fdump-ipa-cgraph.
I've heard good things about PC-Lint, but I imagine it's probably overkill for your needs.