Custom stacktrace implementation for ARM - c++

I need a stack trace in my program, which is written in C++ and runs on an ARM device. I can't find any reliable way to get a stack trace, so I decided to write my own that is as simple as possible, just to get something like the stack trace in gdb.
Here's the idea: write a macro that pushes __FUNCTION__ and __PRETTY_FUNCTION__. I have several questions.
Suppose I have a macro like this:
#define STACKTRACE_ENTER_FUNC \
... lock mutex
... push info into the global list
... set scope-exit handler to delete info at function exit
... unlock mutex
Now I need to place this macro in every function in my code. But there are too many of them. Is there any better way to achieve the goal or should I really change every function to include this macro:
void foo()
{
STACKTRACE_ENTER_FUNC;
...
}
void bar()
{
STACKTRACE_ENTER_FUNC;
...
}
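One possible shape for such a macro, as a minimal sketch: it assumes a single global frame list guarded by a pthread mutex, and uses a destructor as the scope-exit handler. The names stacktrace::ScopedFrame, stacktrace::dump and stacktrace_frame_guard are hypothetical, not from any library. Each function still needs the one-line macro, and a single global list mixes frames from different threads, so a real multi-threaded trace would need one list per thread.

```cpp
#include <pthread.h>
#include <cstddef>
#include <string>
#include <vector>

namespace stacktrace {

// Global frame list and the mutex guarding it (assumed layout for this sketch).
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static std::vector<const char*> g_frames;

// Pushes a frame on construction and pops it on destruction, so the
// destructor plays the role of the scope-exit handler.
class ScopedFrame {
public:
    explicit ScopedFrame(const char* signature) {
        pthread_mutex_lock(&g_lock);
        g_frames.push_back(signature);
        pthread_mutex_unlock(&g_lock);
    }
    ~ScopedFrame() {
        pthread_mutex_lock(&g_lock);
        g_frames.pop_back();
        pthread_mutex_unlock(&g_lock);
    }
private:
    ScopedFrame(const ScopedFrame&);            // non-copyable
    ScopedFrame& operator=(const ScopedFrame&);
};

// Returns the recorded frames, innermost first, one per line.
inline std::string dump() {
    std::string out;
    pthread_mutex_lock(&g_lock);
    for (std::size_t i = g_frames.size(); i > 0; --i) {
        out += g_frames[i - 1];
        out += '\n';
    }
    pthread_mutex_unlock(&g_lock);
    return out;
}

}  // namespace stacktrace

// __PRETTY_FUNCTION__ is a GCC extension; the string it names has static
// storage, so storing the pointer is safe.
#define STACKTRACE_ENTER_FUNC \
    stacktrace::ScopedFrame stacktrace_frame_guard(__PRETTY_FUNCTION__)
```

With this in place, foo() and bar() above stay exactly as written, and stacktrace::dump() can be called from a signal handler replacement, an assertion hook, or a debug command to see the current chain of instrumented functions.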
The next question is: I can use __PRETTY_FUNCTION__ (because we use only gcc of a fixed version and the stacktrace implementation is only for debug builds on the fixed platform, so there are no cross-platform or compiler issues). I can even parse it a bit to split the string into the function name and the argument names. But how can I print all function arguments without knowing much about them, such as their types or how many there are? Like:
int foo(char x, float y)
{
PRINT_ARGS("arg1", "arg2"); // Gives me the string: "arg1 = 'A', arg2 = 13.37"
...
}
int main()
{
foo('A', 13.37);
...
}
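One way this could be approached, as a sketch only: assuming the build can use -std=c++11 (which gcc 4.7 accepts), a variadic template can stream any number of arguments of any streamable type. The macro here takes the variables themselves rather than quoted names, and stringifies them to recover the names. format_args and ARGS_TO_STRING are hypothetical names, and the name splitting only handles plain identifiers, not expressions containing commas.

```cpp
#include <sstream>
#include <string>

// Base case: no values left to print.
inline void format_args_impl(std::ostringstream&, const char*) {}

// Prints "name = value" for the first argument, then recurses on the rest.
// `names` is the stringified argument list, e.g. "x, y".
template <typename T, typename... Rest>
void format_args_impl(std::ostringstream& out, const char* names,
                      const T& value, const Rest&... rest) {
    while (*names == ' ') ++names;              // skip leading spaces
    const char* end = names;
    while (*end != '\0' && *end != ',') ++end;  // name runs up to the next comma
    out << std::string(names, end) << " = " << value;
    if (*end == ',') {
        out << ", ";
        format_args_impl(out, end + 1, rest...);
    }
}

template <typename... Args>
std::string format_args(const char* names, const Args&... args) {
    std::ostringstream out;
    format_args_impl(out, names, args...);
    return out.str();
}

// #__VA_ARGS__ turns the argument list into the string of names.
#define ARGS_TO_STRING(...) format_args(#__VA_ARGS__, __VA_ARGS__)
```

Inside foo(char x, float y), ARGS_TO_STRING(x, y) would then produce a string naming both arguments with their values. Characters are streamed as raw characters, without the quotes shown in the question, and any argument type without an operator<< will fail to compile.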
P.S. If you know a better approach to get a stack trace in a running program on ARMv6, please let me know (compiler: arm-openwrt-linux-uclibcgnueabi-gcc 4.7.3, libc: uClibc-0.9.33.2).
Thanks in advance.

The easier solution is to drop down to assembly; stack traces don't exist at the C++ level anyway.
From an assembly perspective, you use a map of function addresses (which any linker can generate). The current instruction pointer identifies the top frame, and the return addresses identify the rest of the call stack. The tricky part is tail-call optimization, which is a bit philosophical: do you want the logical or the actual call stack?
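The lookup half of that idea can be sketched as follows, assuming the function start addresses and names have already been loaded from a linker map file or similar (the parsing is not shown). SymbolMap is a hypothetical helper. Because only start addresses are stored, an address past the end of the last function is still attributed to that function.

```cpp
#include <stdint.h>
#include <map>
#include <string>

// Maps function start addresses to names and resolves arbitrary code
// addresses (instruction pointer or return address) to the enclosing function.
class SymbolMap {
public:
    void add(uintptr_t start, const std::string& name) {
        starts_[start] = name;
    }

    // Returns the function with the greatest start address <= addr,
    // or "??" if addr lies below every known function.
    std::string lookup(uintptr_t addr) const {
        std::map<uintptr_t, std::string>::const_iterator it =
            starts_.upper_bound(addr);   // first entry strictly above addr
        if (it == starts_.begin()) {
            return "??";
        }
        --it;                            // step back to the enclosing function
        return it->second;
    }

private:
    std::map<uintptr_t, std::string> starts_;
};
```

Each return address collected while walking the stack would be passed through lookup() to turn the raw trace into function names.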

Related

C++ function instrumentation via clang++'s -finstrument-functions : how to ignore internal std library calls?

Let's say I have a function like:
template<typename It, typename Cmp>
void mysort( It begin, It end, Cmp cmp )
{
std::sort( begin, end, cmp );
}
When I compile this using -finstrument-functions-after-inlining with clang++ --version:
clang version 11.0.0 (...)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: ...
The instrumentation code makes the execution time explode, because my entry and exit functions are called for every call of
void std::__introsort_loop<...>(...)
void std::__move_median_to_first<...>(...)
I'm sorting a really big array, so my program doesn't finish: without instrumentation it takes around 10 seconds, with instrumentation I've cancelled it at 10 minutes.
I've tried adding __attribute__((no_instrument_function)) to mysort (and the function that calls mysort), but this doesn't seem to have an effect as far as these standard library calls are concerned.
Does anyone know if it is possible to ignore function instrumentation for the internals of a standard library function like std::sort? Ideally, I would only have mysort instrumented, so a single entry and a single exit!
I see that clang++ sadly does not yet support anything like -finstrument-functions-exclude-function-list or -finstrument-functions-exclude-file-list, and g++ does not yet support -finstrument-functions-after-inlining, which I would ideally have, so I'm stuck!
EDIT: After playing more, it would appear the effect on execution-time is actually less than that described, so this isn't the end of the world. The problem still remains however, because most people who are doing function instrumentation in clang will only care about the application code, and not those functions linked from (for example) the standard library.
EDIT2: To further highlight the problem now that I've got it running in a reasonable time frame: the resulting trace that I produce from the instrumented code with those two standard library functions is 15GB. When I hard code my tracing to ignore the two function addresses, the resulting trace is 3.7MB!
I've run into the same problem. It looks like support for these flags was once proposed, but never merged into the main branch.
https://reviews.llvm.org/D37622
This is not a direct answer, since the tool doesn't support what you want to do, but I think I have a decent work-around. What I wound up doing was creating a "skip list" of sorts. In the instrumented functions (__cyg_profile_func_enter and __cyg_profile_func_exit), I would guess the part that is contributing most to your execution time is the printing. If you can come up with a way of short-circuiting the profile functions, that should help, even if it's not the most ideal. At the very least it will limit the size of the output file.
Something like
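#include <stddef.h>
#include <stdint.h>
/* Addresses of the functions whose enter events should be ignored. */
uintptr_t skipAddrs[] = {
    // assuming 64-bit addresses
    0x123456789abcdef, 0x2468ace2468ace24
};
size_t arrSize = 0;
int main(void)
{
    ...
    arrSize = sizeof(skipAddrs)/sizeof(skipAddrs[0]);
    // https://stackoverflow.com/a/37539/12940429
    ...
}
/* The hook itself must not be instrumented. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter (void *this_fn, void *call_site) {
    for (size_t idx = 0; idx < arrSize; idx++) {
        if ((uintptr_t) this_fn == skipAddrs[idx]) {
            return;
        }
    }
    /* not in the skip list: record the call here */
}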
I use something like objdump -t binaryFile to examine the symbol table and find what the addresses are for each function.
If you specifically want to ignore library calls, something that might work is examining the symbol table of your object file(s) before linking against libraries, then ignoring all the ones that appear new in the final binary.
All this should be possible with things like grep, awk, or python.
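For a longer skip list, the linear scan can be replaced by a sorted list and a binary search. The following is a sketch under assumptions: add_skip_address, is_skipped, g_logged_events and the NO_INSTRUMENT macro are hypothetical names, and the addresses are assumed to come from the symbol table as described above. This file should be built without -finstrument-functions, since the standard library templates it uses would otherwise be instrumented and call back into the hooks.

```cpp
#include <stdint.h>
#include <algorithm>
#include <cstdio>
#include <vector>

#define NO_INSTRUMENT __attribute__((no_instrument_function))

static std::vector<uintptr_t> g_skip_addrs;  // kept sorted for binary search
unsigned long g_logged_events = 0;           // events that passed the filter

// Inserts an address at its sorted position.
NO_INSTRUMENT void add_skip_address(void* fn) {
    uintptr_t addr = reinterpret_cast<uintptr_t>(fn);
    g_skip_addrs.insert(
        std::upper_bound(g_skip_addrs.begin(), g_skip_addrs.end(), addr),
        addr);
}

NO_INSTRUMENT bool is_skipped(void* fn) {
    return std::binary_search(g_skip_addrs.begin(), g_skip_addrs.end(),
                              reinterpret_cast<uintptr_t>(fn));
}

extern "C" {

// Hooks called by code compiled with -finstrument-functions.
NO_INSTRUMENT void __cyg_profile_func_enter(void* this_fn, void* call_site) {
    if (is_skipped(this_fn)) return;
    ++g_logged_events;
    std::fprintf(stderr, "enter %p (called from %p)\n", this_fn, call_site);
}

NO_INSTRUMENT void __cyg_profile_func_exit(void* this_fn, void* call_site) {
    if (is_skipped(this_fn)) return;
    ++g_logged_events;
    std::fprintf(stderr, "exit  %p (called from %p)\n", this_fn, call_site);
}

}  // extern "C"
```

The skip list would be filled once at startup, before the hot code runs. The hooks are still called for every skipped function, so this limits the trace size and the printing cost, not the call overhead itself.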
You have to add __attribute__((no_instrument_function)) to the functions that should not be instrumented. Unfortunately, that is not easy to do for C/C++ standard library functions, because it requires editing all of the C++ library functions.
There are some hacks you can do, like redefining existing macros from include/__config so that they carry this attribute as well, e.g.:
-D_LIBCPP_INLINE_VISIBILITY=__attribute__((no_instrument_function,internal_linkage))
Make sure to keep the macro's existing definition and append no_instrument_function to it, to avoid unexpected errors.

Variable name to string in function call, C++

I'm expanding our internal debugging library, and I've run into an odd wall. I'd like to output a variable name as a string. From elsewhere on this site, I found that a macro can be used to do this within a file:
#define VarToStr(v) #v
...
printf("%s\n", VarToStr(MatName));
This outputs MatName. But now let's try this through a function across files (Matrix is a defined type):
// DebugHelpers.h
#define VarToStr(v) #v
...
void PrintMatrix(const Matrix &InputMat)
{
printf("%s\n", VarToStr(InputMat));
... // output InputMat contents
}
// DataAnalysis.cc
#include "DebugHelpers.h"
...
void AnalysisSubProgram342()
{
Matrix MatName;
...
PrintMatrix(MatName);
}
This outputs InputMat, instead of MatName. How can a function in another file get the variable name from the calling file?
While more complex solutions (wrapper classes, etc.) would be useful for the larger community, my implementation needs to minimize impact to preexisting code/classes.
Update:
Inspired by zenith's comments, I implemented both of his proposed solutions for comparison's sake and got both working quickly. The macro works well for simple outputs, while the function allows for more complex work (and type checking/overloading). I hadn't known that preprocessor macros could be so complex. I'll remember both for future use. Thanks !
You can't. Neither C nor C++ retain variable names at runtime.
All your macros are doing is substituting text which happens at compile time.
As mentioned by others, C++ doesn't support runtime reflection, so if you want to have a string whose contents will only be known at runtime (which is when the call to PrintMatrix will happen), you need to pass it as an argument.
And because you always know what your variables' names are you don't need the VarToStr macro:
// DebugHelpers.h
void PrintMatrix(const Matrix &InputMat, const char* MatName)
{
printf("%s\n", MatName);
... // output InputMat contents
}
// DataAnalysis.cc
#include "DebugHelpers.h"
...
void AnalysisSubProgram342()
{
Matrix MatName;
...
PrintMatrix(MatName, "MatName");
}
But there's another choice: make PrintMatrix a macro itself, since it's only a debug thing anyway:
// DebugHelpers.h
#define PRINT_MATRIX(InputMat)\
printf(#InputMat "\n");\
... // output InputMat contents
// DataAnalysis.cc
#include "DebugHelpers.h"
...
void AnalysisSubProgram342()
{
Matrix MatName;
...
PRINT_MATRIX(MatName);
}
Now after preprocessing, AnalysisSubProgram342 will look like this:
void AnalysisSubProgram342()
{
Matrix MatName;
...
printf("MatName\n");
... // output InputMat contents
}
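The two options can also be combined: keep the real work in a function that takes the name as a parameter, and let a thin macro supply the name at the call site. This is a sketch with hypothetical names (FormatMatrix, FORMAT_MATRIX) and a stand-in 2x2 Matrix type, since the real Matrix is not shown; it returns a string instead of printing so the result is easy to inspect.

```cpp
#include <sstream>
#include <string>

// Stand-in for the real Matrix type, which is not shown in the question.
struct Matrix {
    double values[2][2];
};

// Ordinary function: type-checked, overloadable, receives the name as data.
inline std::string FormatMatrix(const Matrix& mat, const char* name) {
    std::ostringstream out;
    out << name << '\n';
    for (int row = 0; row < 2; ++row) {
        out << mat.values[row][0] << ' ' << mat.values[row][1] << '\n';
    }
    return out.str();
}

// Thin macro: #mat stringifies the expression written at the call site.
#define FORMAT_MATRIX(mat) FormatMatrix((mat), #mat)
```

Calling FORMAT_MATRIX(MatName) in AnalysisSubProgram342 passes both the object and the text "MatName", so the caller's variable name reaches the function without being typed twice.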
In general you cannot do that (getting the name of a variable at runtime, from e.g. its address or in C++ its reference).
I am focusing on Linux:
However, on Linux (and GNU glibc based systems), for global variables (and functions), you might use the GNU specific dladdr(3) function.
If all the relevant code was compiled with -g (to get debug info), you might parse the debug information in DWARF format (perhaps also using __builtin_frame_address, etc.). With some pain, you might be able to get the name of some local variables from its address on the call stack. This would be a significant effort (probably months of work). Ian Taylor's libbacktrace (inside GCC) might be useful as a starting point.
You could also start (assuming everything is compiled with -g), with e.g. popen(3), a gdb -p debugging process.
Notice that recent GDB debugger is scriptable in Python or Guile, so practically speaking developing Python or Guile functions for GDB would be quicker.
You could also simply add debug output like here.

Running Function Inside Stub. Passing Function Pointer

I'm working on creating a user-level thread library and what I want to do is run a function inside a stub and so I would like to pass the function pointer to the stub function.
Here is my stub function:
void _ut_function_stub(void (*f)(void), int id)
{
(*f)();
DeleteThread(id);
}
This is what the user calls. What I want to do is get pointer of _ut_function_stub to assign to pc and I've tried various different options including casting but the compiler keeps saying "invalid use of void expression".
int CreateThread (void (*f) (void), int weight)
{
... more code
pc = (address_t)(_ut_function_stub(f, tcb->id));
... more code
}
Any help is appreciated. Thanks!
If you're interested in implementing your own user-level-threads library, I'd suggest looking into the (now deprecated) ucontext implementation. Specifically, looking at the definitions for the structs used in ucontext.h will help you see all the stuff you actually need to capture to get a valid snapshot of the thread state.
What you're really trying to capture with the erroneous (address_t) cast in your example is the current continuation. Unfortunately, C doesn't support first-class continuations, so you're going to be stuck doing something much more low-level, like swapping stacks and dumping registers (hence why I pointed you to ucontext as a reference—it's going to be kind of complicated if you really want to get this right).
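As a rough illustration of the ucontext route, here is a minimal sketch that starts a function inside a stub on its own stack. It assumes a Linux/glibc system where ucontext is available; Tcb, create_thread, run_thread, thread_finished and demo_work are hypothetical names, and there is no scheduling, preemption or cleanup. Note that the stub's address is handed to makecontext: writing _ut_function_stub(f, tcb->id) as in the question calls the stub and yields void, which is what the compiler is complaining about.

```cpp
#include <ucontext.h>
#include <cstddef>
#include <vector>

typedef void (*ThreadFunc)(void);

// Hypothetical thread control block: saved context, entry point, own stack.
struct Tcb {
    ucontext_t ctx;
    ThreadFunc fn;
    std::vector<char> stack;
    bool finished;
};

static ucontext_t g_scheduler_ctx;   // where control returns when a thread ends
static std::vector<Tcb*> g_threads;  // the index in this vector is the thread id

// Runs the user function, then marks the thread as done. Returning from the
// stub resumes uc_link, which is the scheduler context.
static void thread_stub(int id) {
    g_threads[id]->fn();
    g_threads[id]->finished = true;
}

// Prepares a context that will start in thread_stub; nothing runs yet.
int create_thread(ThreadFunc fn) {
    Tcb* tcb = new Tcb();            // never freed in this sketch
    tcb->fn = fn;
    tcb->stack.resize(64 * 1024);
    tcb->finished = false;
    int id = static_cast<int>(g_threads.size());
    g_threads.push_back(tcb);

    getcontext(&tcb->ctx);
    tcb->ctx.uc_stack.ss_sp = &tcb->stack[0];
    tcb->ctx.uc_stack.ss_size = tcb->stack.size();
    tcb->ctx.uc_link = &g_scheduler_ctx;
    // Pass the stub's address; the id travels as an int argument.
    makecontext(&tcb->ctx, reinterpret_cast<void (*)()>(thread_stub), 1, id);
    return id;
}

// Switches to the thread and comes back when its stub returns.
void run_thread(int id) {
    swapcontext(&g_scheduler_ctx, &g_threads[id]->ctx);
}

bool thread_finished(int id) {
    return g_threads[id]->finished;
}

// Example workload for trying the sketch out.
int g_demo_runs = 0;
void demo_work(void) { ++g_demo_runs; }
```

The function pointer is stored in the control block rather than passed through makecontext, because makecontext only forwards int arguments and a pointer may not fit in one.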

Trace all user defined function calls using gdb

I want to trace all the user defined functions that have been called (in order, and preferably with the input params). Is there a way to do this using gdb? OR is there a better free/opensource application out there to do this job?
Please note that I want to print only the user defined function calls.
for example:
int abc(int a, char b) {
return xyz(a+b);
}
int xyz(int theta) {
return theta * theta;
}
I need the following output:
abc(a, b);
xyz(theta);
My codebase is pretty huge and is compiled in various pieces and hence I want to avoid using a tool which needs me to compile my source code again with some options enabled.
PS: I found that you can define functions in gdb and pass in function names as params to find out their call trace. But in my case the codebase is pretty huge and I'm just starting off with it, so I'm not even sure which functions are called. It wouldn't be practical to list all the functions here.
TIA,
You need to run some form of third-party tool against your binary, something like Quantify (IBM) or Callgrind (or, as @Paul R mentioned above, gprof). They will generate a call tree, which will give you the information you need. Searching for "call tree C functions", for example, will turn up plenty of tools you can link against your code.
If you want to roll your own, you'd need to add one line to the top of each of your functions that creates a stack-allocated object; you can then catch the ctor/dtor sequence to know when you've entered and exited the function, and maintain a "stack" of these to generate your own call tree. This is pretty easy to do single-threaded, and complex multi-threaded.
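The roll-your-own variant could look roughly like this for the single-threaded case. CallTracer and TRACE_CALL are hypothetical names; the sketch records an indented enter/leave log in memory and, unlike the output requested in the question, does not capture parameter names or values. It also requires recompiling every traced function.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stack-allocated guard: the constructor logs entry, the destructor logs exit.
// Single-threaded only; the depth counter and log are shared statics.
class CallTracer {
public:
    explicit CallTracer(const char* name) : name_(name) {
        log().push_back(indent() + "enter " + name_);
        ++depth();
    }
    ~CallTracer() {
        --depth();
        log().push_back(indent() + "leave " + name_);
    }

    // The recorded call tree, one entry per enter/leave event.
    static std::vector<std::string>& log() {
        static std::vector<std::string> entries;
        return entries;
    }

private:
    static int& depth() {
        static int current = 0;
        return current;
    }
    // Two spaces of indentation per nesting level.
    static std::string indent() {
        return std::string(static_cast<std::size_t>(depth()) * 2, ' ');
    }
    const char* name_;
};

// The one line to add at the top of each function.
#define TRACE_CALL() CallTracer call_tracer_guard(__func__)

int xyz(int theta) {
    TRACE_CALL();
    return theta * theta;
}

int abc(int a, char b) {
    TRACE_CALL();
    return xyz(a + b);
}
```

After a call to abc, CallTracer::log() holds the nested enter/leave sequence for abc and xyz, which can be printed or written to a file.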

Does an arbitrary instruction pointer reside in a specific function?

I have a very difficult problem I'm trying to solve: Let's say I have an arbitrary instruction pointer. I need to find out if that instruction pointer resides in a specific function (let's call it "Foo").
One approach to this would be to try to find the start and ending bounds of the function and see if the IP resides in it. The starting bound is easy to find:
void *start = &Foo;
The problem is, I don't know how to get the ending address of the function (or how "long" the function is, in bytes of assembly).
Does anyone have any ideas how you would get the "length" of a function, or a completely different way of doing this?
Let's assume that there is no SEH or C++ exception handling in the function. Also note that I am on a win32 platform, and have full access to the win32 api.
This won't work. You're presuming that functions are contiguous in memory and that one address maps to one function. The optimizer has a lot of leeway here and can move code from functions around the image.
If you have PDB files, you can use something like the dbghelp or DIA APIs to figure this out, for instance SymFromAddr. There may be some ambiguity here, as a single address can map to multiple functions.
I've seen code that tries to do this before with something like:
#pragma optimize("", off)
void Foo()
{
}
void FooEnd()
{
}
#pragma optimize("", on)
And then FooEnd-Foo was used to compute the length of function Foo. This approach is incredibly error prone and still makes a lot of assumptions about exactly how the code is generated.
Look at the *.map file which can optionally be generated by the linker when it links the program, or at the program's debug (*.pdb) file.
OK, I haven't done assembly in about 15 years. Back then, I didn't do very much. Also, it was 680x0 asm. BUT...
Don't you just need to put a label before and after the function, take their addresses, subtract them for the function length, and then just compare the IP? I've seen the former done. The latter seems obvious.
If you're doing this in C, look first for debugging support. ChrisW is spot on with map files, but also check whether your C compiler's standard library provides anything for this low-level work; most compilers provide tools for analysing the stack, for instance, even though they're not standard. Otherwise, try inline assembly, or wrap the C function in an assembly file with an empty wrapper function that carries those labels.
The most simple solution is maintaining a state variable:
volatile int FOO_is_running = 0;
int Foo( int par ){
FOO_is_running = 1;
/* do the work */
FOO_is_running = 0;
return 0;
}
Here's how I do it, but it's using gcc/gdb.
$ gdb ImageWithSymbols
gdb> info line * 0xYourEIPhere