I have FreeRTOS running on an ARM processor and dump_stack() is not available to me. I am trying to inspect the call chain and badly miss dump_stack(). While searching, I found something close to what I was looking for, GCC's _Unwind_Backtrace() facility, but it only prints the addresses of the stack frames. It doesn't map them to meaningful symbols (like function names). Any help is really appreciated.
#include <stdio.h>
#include <unwind.h>
#include <stdint.h>
static _Unwind_Reason_Code unwind_backtrace_callback(struct _Unwind_Context* context, void* arg)
{
uintptr_t pc = _Unwind_GetIP(context);
if (pc) {
printf("unwind got pc ...0x%lx\n", (unsigned long)pc); /* cast so the argument matches the format */
}
return _URC_NO_REASON;
}
ssize_t unwind_backtrace()
{
_Unwind_Reason_Code rc = _Unwind_Backtrace(unwind_backtrace_callback, 0);
return rc == _URC_END_OF_STACK ? 0 : -1;
}
void func_1()
{
int ret = unwind_backtrace();
printf("unwind_backtrace return ...%d\n", ret);
}
void func_2()
{
func_1();
}
int main()
{
func_2();
return 0;
}
Result:
unwind got pc ...0x40076b
unwind got pc ...0x400796
unwind got pc ...0x4007bd
unwind got pc ...0x400819
unwind got pc ...0x67314b15
unwind got pc ...0x400649
unwind_backtrace return ...0
All the IDEs I use (and I use a lot) show me the stack trace in a window - but only for the currently executing task. If I want to see the trace for all the tasks I need a fully thread aware FreeRTOS plug-in of the type provided by Segger, IAR and Code Confidence.
It doesn't provide mapping to meaningful symbol
The "standard" way to perform this mapping is by using addr2line. Something like:
addr2line -fe a.out 0x40076b 0x400796 0x4007bd ...
Update:
I want on the fly convert ...
Well, you should have asked for that then.
It's a simple matter of writing code. You need to write code that will map address ranges to symbol names (just like addr2line does).
On an ELF platform, this is actually quite simple: read Elf32_Syms from .symtab section to build address to symbol map, and look up your addresses in that map. You'll also need to read corresponding symbol names from .strtab section (Elf32_Sym.st_name is the offset into .strtab).
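The lookup step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a full ELF reader: it assumes the symbol entries and the .strtab bytes have already been read into memory, and the names sym32 and lookup_symbol are hypothetical. The struct mirrors only the Elf32_Sym fields that the lookup needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the Elf32_Sym fields used here; with <elf.h>
 * available you would use Elf32_Sym directly. */
struct sym32 {
    uint32_t st_name;   /* offset of the symbol name inside .strtab */
    uint32_t st_value;  /* start address of the symbol */
    uint32_t st_size;   /* size of the symbol in bytes */
};

/* Return the name of the symbol whose [st_value, st_value + st_size) range
 * contains addr, or NULL if no symbol covers it. This is a linear scan;
 * sorting the table once and using a binary search would be faster. */
const char *lookup_symbol(const struct sym32 *syms, size_t nsyms,
                          const char *strtab, uint32_t addr)
{
    assert(syms != NULL && strtab != NULL);
    for (size_t i = 0; i < nsyms; i++) {
        uint32_t start = syms[i].st_value;
        if (syms[i].st_size != 0 && addr >= start &&
            addr - start < syms[i].st_size)
            return strtab + syms[i].st_name;
    }
    return NULL;
}
```

The unwind callback could then pass each pc to lookup_symbol and print the returned name (or the raw address when the result is NULL).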
I have been working on this simple hobbyist OS, and I have decided to add some C++ support. Here is the simple script I wrote. When I compile it, I get this message:
cp.o: In function `caller':
test.cpp:(.text+0x3a): undefined reference to `__stack_chk_fail'
Here is the script:
class CPP {
public:
int a;
void test(void);
};
void CPP::test(void) {
// Code here
}
int caller() {
CPP caller;
caller.test();
return CPP.a;
}
Try it like this.
#include <iostream> // needed for std::cout in main()

class CPP {
public:
int a;
void test(void);
};
void CPP::test(void) {
CPP::a = 4;
}
int caller() {
CPP caller;
caller.test();
return caller.a;
}
int main(){
int called = caller();
std::cout << called << std::endl;
return 0;
}
It seems to me that the linker you are using can't find the library containing __stack_chk_fail, a security function that crashes the program when stack smashing is detected. (It may be that the toolchain doesn't supply the function for some reason; I am not sure who actually defines this specific function.) Try compiling with -fno-stack-protector or the equivalent for your compiler.
Which compiler are you using? A workaround might be to define the function yourself, with a body like exit(1); or similar. That would produce the intended effect and fix the link error for now.
I created a test program to show how this actually plays out. Test program:
int main(){
int a[50]; // To have the compiler manage the stack
return 0;
}
With only -O0 as the flag, Ghidra decompiles this to:
undefined8 main(void){
long in_FS_OFFSET;
if (*(long *)(in_FS_OFFSET + 0x28) != *(long *)(in_FS_OFFSET + 0x28)) {
/* WARNING: Subroutine does not return */
__stack_chk_fail();
}
return 0;
}
With -fno-stack-protector:
undefined8 main(void){
return 0;
}
Ghidra dropped the array during decompilation, but we can see that the stack protection is missing when you use the flag. Some parts of the Ghidra output are also off (e.g. int became undefined8), but this is normal for decompilation.
Consequences of using the flag
Compiling without stack protection is not ideal, but it shouldn't affect you much. If you write unsafe code (the kind the compiler warns you about), you can end up with a program that is vulnerable to buffer overflows, which should not be that big of an issue here, in my opinion.
Alternative
Alternatively have a look at this. They are talking about embedded systems, but the topic seems appropriate.
Why is the code there
Look up stack smashing for the details, but I will try to explain it to the best of my knowledge. When the program enters a function (main in this case), it stores the location of the next instruction on the stack.
If you write an OS you probably know what the stack is, but for completeness: the stack is just some memory onto which you can push data and off which you can pop data. You always pop the most recently pushed item. C++ and other languages also use the stack to store local variables. The stack sits at the end of memory and grows toward lower addresses: each newly pushed item is placed below the previous one, so the stack fills up 'backwards'.
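To make the 'backwards' growth concrete, here is a toy model in C. It is only a sketch: a real hardware stack is managed through the stack pointer register, and every name here (toy_stack, toy_push, toy_pop) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_WORDS 8

/* A toy stack that grows toward lower indices, the way a typical hardware
 * stack grows toward lower addresses. */
struct toy_stack {
    int mem[STACK_WORDS];
    size_t sp;  /* index of the most recently pushed word */
};

/* An empty stack has sp one past the last slot. */
void toy_init(struct toy_stack *s)
{
    assert(s != NULL);
    s->sp = STACK_WORDS;
}

/* Push moves sp down first, then stores. Returns -1 when the stack is full. */
int toy_push(struct toy_stack *s, int value)
{
    assert(s != NULL);
    if (s->sp == 0)
        return -1;
    s->mem[--s->sp] = value;
    return 0;
}

/* Pop loads the last pushed word, then moves sp back up.
 * Returns -1 when the stack is empty. */
int toy_pop(struct toy_stack *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (s->sp == STACK_WORDS)
        return -1;
    *out = s->mem[s->sp++];
    return 0;
}
```

After two pushes, sp has moved from STACK_WORDS down to STACK_WORDS - 2, and the first pop returns the second value pushed.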
You can declare a buffer as a local variable, e.g. char buffer[20]. If you fill the buffer without checking the length, you might overfill it and overwrite things on the stack other than the buffer. The return address is on the stack as well. So if we have a program like this:
int test(){
int a;
char buffer[20];
int c;
// someCode;
}
Then the stack will look something like this at someCode:
[ Unused space, c, buffer[0], buffer[1] ..., buffer[19], a, Return Address, variables of calling function ]
Now if I fill the buffer without checking the length, I can overwrite a (which is a problem, as I can modify how the program runs) or even the return address (which is a major flaw, as I might be able to execute malicious shellcode by injecting it into the buffer). To avoid this, compilers insert a 'stack cookie' between a and the return address. If that value has changed, the program should terminate before returning, and that is what __stack_chk_fail() is for. It seems that it is defined in a library, so you might not be able to use it, even though technically the compiler is the one that emits the call.
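The cookie check can be imitated in plain C. This is a simulation under stated assumptions, not what the compiler actually emits: the struct fake_frame stands in for a stack frame, CANARY_VALUE is an arbitrary constant (real guard values are normally randomized), and unchecked_fill is a hypothetical name. The copy is clamped to the size of the struct only so that the sketch itself stays within defined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CANARY_VALUE 0xDEADC0DEUL  /* arbitrary value chosen for this sketch */

/* Stand-in for a stack frame: a buffer followed by the cookie. */
struct fake_frame {
    char buffer[20];
    unsigned long canary;
};

/* Copy n bytes to the start of the frame without checking n against the
 * buffer size (the bug being modelled), then verify the cookie.
 * Returns 1 if the cookie is intact, 0 if it was overwritten. */
int unchecked_fill(struct fake_frame *f, const void *src, size_t n)
{
    assert(f != NULL && src != NULL);
    f->canary = CANARY_VALUE;
    if (n > sizeof *f)
        n = sizeof *f;   /* keep the demo inside the struct */
    memcpy(f, src, n);   /* raw byte copy, may run past buffer[] */
    return f->canary == CANARY_VALUE;
}
```

A copy of 20 bytes leaves the cookie alone and returns 1. A copy of sizeof(struct fake_frame) bytes overwrites it and returns 0, which is the point where compiler-generated code would call __stack_chk_fail().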
I was trying to debug a program that has a corrupted stack and seems too big (it has multiple threads) to manually debug. So I was wondering if there was a way to print out the symbols that correspond to the addresses on the stack after the corruption to try and get a better idea of how it got there.
I noticed the "info symbol" command (which normally prints out the symbol at a given address) only accepts one address at a time. So, I tried to write a script to do what I wanted, but when I tried to store the addresses in convenience variables so I could iterate through the stack manually, the info symbol command wouldn't work.
I know on WinDBG there is the dds command which does what I'm looking for, but I have not been able to find an equivalent in GDB. Does anyone know an equivalent?
The x command with the a format flag decodes memory as addresses and tries to look up the corresponding symbols.
Given this code:
int func3(int a)
{
return a+a;
}
int func2(int b)
{
return func3(b+b);
}
int func1(int c)
{
return func2(c+c);
}
int main(int argc, char** argv)
{
return func1(argc);
}
and a breakpoint at func3, the output will be:
(gdb) x /16ga $rsp
0x7fffffffe150: 0x7fffffffe168 0x5555555545fa <func2+23>
0x7fffffffe160: 0x2000000c2 0x7fffffffe180
0x7fffffffe170: 0x555555554613 <func1+23> 0x100000000
0x7fffffffe180: 0x7fffffffe1a0 0x55555555462e <main+25>
0x7fffffffe190: 0x7fffffffe288 0x100000000
0x7fffffffe1a0: 0x555555554630 <__libc_csu_init> 0x7ffff7a05b97 <__libc_start_main+231>
0x7fffffffe1b0: 0x1 0x7fffffffe288
0x7fffffffe1c0: 0x100008000 0x555555554615
This might not answer your question, but it could help you identify where the stack corruption happens. Have you tried compiling with one of the -fstack-protector flags (for example -fstack-protector-all)?
https://en.wikibooks.org/wiki/Linux_Applications_Debugging_Techniques/Stack_corruption
Is it possible to write some f() template function that takes a type T and a pointer to member function of signature void(T::*pmf)() as (template and/or function) arguments and returns a const char* that points to the member function's __func__ variable (or to the mangled function name)?
EDIT: I am asked to explain my use-case. I am trying to write a unit-test library (I know there is a Boost Test library for this purpose). And my aim is not to use any macros at all:
struct my_test_case : public unit_test::test {
void some_test()
{
assert_test(false, "test failed.");
}
};
My test suite runner will call my_test_case::some_test(), and if its assertion fails, I want it to log:
ASSERTION FAILED (&my_test_case::some_test()): test failed.
I can use <typeinfo> to get the name of the class but the pointer-to-member-function is just an offset, which gives no clue to the user about the test function being called.
It seems that what you are trying to achieve is to get the name of the calling function in assert_test(). With gcc you can use backtrace to do that. Here is a naive example:
#include <cstdlib> // free
#include <iostream>
#include <string>
#include <execinfo.h>
#include <cxxabi.h>
namespace unit_test
{
struct test {};
}
std::string get_my_caller()
{
std::string caller("???");
void *bt[3]; // backtrace
char **bts; // backtrace symbols
size_t size = sizeof(bt)/sizeof(*bt);
int ret = -4;
/* get backtrace symbols */
size = backtrace(bt, size);
bts = backtrace_symbols(bt, size);
if (size >= 3) {
caller = bts[2];
/* demangle function name*/
char *name;
size_t pos = caller.find('(') + 1;
size_t len = caller.find('+') - pos;
name = abi::__cxa_demangle(caller.substr(pos, len).c_str(), NULL, NULL, &ret);
if (ret == 0)
caller = name;
free(name);
}
free(bts);
return caller;
}
void assert_test(bool expression, const std::string& message)
{
if (!expression)
std::cout << "ASSERTION FAILED " << get_my_caller() << ": " << message << std::endl;
}
struct my_test_case : public unit_test::test
{
void some_test()
{
assert_test(false, "test failed.");
}
};
int main()
{
my_test_case tc;
tc.some_test();
return 0;
}
Compiled with:
g++ -std=c++11 -rdynamic main.cpp -o main
Output:
ASSERTION FAILED my_test_case::some_test(): test failed.
Note: This is a gcc (linux, ...) solution, which might be difficult to port to other platforms!
TL;DR: It is not possible to do this in a reasonably portable way other than by using macros. Using debug symbols is a hard solution and a bad one, and it will introduce maintenance and architecture problems in the future.
The names of functions, in any form, are not guaranteed to be stored in the binary [or anywhere else for that matter]. Static free functions certainly won't have to expose their name to the rest of the world, and there is no real need for virtual member functions to have their names exposed either (except when the vtable is formed in A.c and the member function is in B.c).
It is also entirely permissible for the linker to remove ALL names of functions and variables. Names MAY be used by shared libraries to find functions not present in the binary, but the "ordinal" way can avoid that too, if the system is using that method.
I can't see any other solution than making assert_test a macro, and this is actually a GOOD use-case of macros. [Well, you could of course pass __func__ as an argument, but that's certainly NOT better than using macros in this limited case].
Something like:
#define assert_test(x, y) do_assert_test(x, y, __func__)
and then implement do_assert_test to do what your original assert_test would do [less the impossible bit of figuring out the name of the function].
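A minimal sketch of that arrangement, written in C for brevity. The macro is the one from the answer (with the arguments parenthesized); the body of do_assert_test, the last_report buffer and the some_test function are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The macro captures the name of the enclosing function at the call site. */
#define assert_test(x, y) do_assert_test((x), (y), __func__)

/* Hypothetical: keeps the most recent failure text so it can be inspected. */
static char last_report[128];

/* Returns 1 if the expression held, 0 otherwise. On failure the report is
 * formatted in the style the question asks for and printed. */
int do_assert_test(int expression, const char *message, const char *func)
{
    assert(message != NULL && func != NULL);
    if (expression)
        return 1;
    snprintf(last_report, sizeof last_report,
             "ASSERTION FAILED (%s): %s", func, message);
    puts(last_report);
    return 0;
}

/* Example test function: __func__ expands to "some_test" here. */
int some_test(void)
{
    return assert_test(0, "test failed.");
}
```

Calling some_test() prints ASSERTION FAILED (some_test): test failed. In a C++ member function, __func__ yields only the unqualified function name, so the class name would still have to come from elsewhere, for example <typeinfo> as mentioned in the question.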
If it's unit tests, and you can be sure that you will always do this with debug symbols, you could solve it in a very non-portable way by building with debug symbols and then using the debug interface to find the name of the function you are currently in. The reason I say it's non-portable is that the debug API for a given OS is not standard - Windows does it one way, Linux another, and I'm not sure how it works in MacOS - and to make matters worse, my quick search on the subject seems to indicate that reading debug symbols doesn't have an API as such - there is a debug API that allows you to inspect the current process and figure out where you are, what the registers contain, etc, but not to find out what the name of the function is. So that's definitely a harder solution than "convince whoever needs to be convinced that this is a valid use of a macro".
I am catching an exception using Win32 SEH:
try
{
// illegal operation that causes access violation
}
__except( seh_filter_func(GetExceptionInformation()) )
{
// abort
}
where the filter function looks like:
int seh_filter_func(EXCEPTION_POINTERS *xp)
{
// log EIP, other registers etc. to file
return 1;
}
This works so far and the value in xp->ContextRecord->Eip tells me which function caused the access violation (actually - ntdll.dll!RtlEnterCriticalSection , and the value for EDX tells me that this function was called with a bogus argument).
However, this function is called in many places, including from other WinAPI functions, so I still don't know which code is responsible for calling this function with the bogus argument.
Is there any code I can use to generate a trace of the chain of function calls leading up to where EIP is now, based on the info in EXCEPTION_POINTERS or otherwise? (Running the program under an external debugger isn't an option).
Just EIP values would be OK as I can look them up in the linker map and symbol tables, although if there is a way to automatically map them to symbol names that'd be even better.
I am using C++Builder 2006 for this project, although an MSVC++ solution might work anyway.
I think you can use Boost.Stacktrace for this:
#include <boost/stacktrace.hpp>
int seh_filter_func(EXCEPTION_POINTERS *xp)
{
const auto stack = to_string( boost::stacktrace::stacktrace() );
LOG( "%s", stack.c_str() );
return 1;
}
I want to intercept an application's calls to dlsym. I have tried declaring dlsym inside the .so that I am preloading and using dlsym itself to get its real address, but for quite obvious reasons that didn't work.
Is there a way easier than taking process' memory maps, and using libelf to find the real location of dlsym inside loaded libdl.so?
WARNING:
I have to explicitly warn everyone who tries to do this. The general premise of having a shared library hook dlsym has several significant drawbacks. The biggest issue is that the original dlsym implementation of glibc internally uses stack unwinding techniques to find out which loaded module the function was called from. If the intercepting shared library then calls the original dlsym on behalf of the original application, this will break lookups that use things like RTLD_NEXT, as the current module is now your hook library, not the original caller.
It might be possible to implement this the correct way, but it requires a lot more work. Without having tried it, I think that by using dlinfo to get to the chained list of link maps, you could walk through all modules individually and do a separate dlsym for each one, to get the RTLD_NEXT behavior right. You still need the address of your caller for that, which you might get via the old backtrace(3) family of functions.
MY OLD ANSWER FROM 2013
I stumbled across the same problem with hdante's answer as the commenter: calling __libc_dlsym() directly crashes with a segfault. After reading some glibc sources, I came up with the following hack as a workaround:
extern void *_dl_sym(void *, const char *, void *);
extern void *dlsym(void *handle, const char *name)
{
/* my target binary is even asking for dlsym() via dlsym()... */
if (!strcmp(name,"dlsym"))
return (void*)dlsym;
return _dl_sym(handle, name, dlsym);
}
NOTE two things with this "solution":
This code bypasses the locking which is done internally by (__libc_)dlsym(), so to make this threadsafe, you should add some locking.
The third argument of _dl_sym() is the address of the caller. glibc seems to reconstruct this value by stack unwinding, but I just use the address of the function itself. The caller address is used internally to find the link map the caller is in, to get things like RTLD_NEXT right (and using NULL as the third argument will make the call fail with an error when using RTLD_NEXT). However, I have not looked at glibc's unwinding functionality, so I'm not 100% sure that the above code will do the right thing, and it may happen to work just by chance alone...
The solution presented so far has some significant drawbacks: _dl_sym() acts quite differently from the intended dlsym() in some situations. For example, trying to resolve a symbol which does not exist exits the program instead of just returning NULL. To work around that, one can use _dl_sym() just to get the pointer to the original dlsym() and use that for everything else (like in the "standard" LD_PRELOAD hook approach without hooking dlsym at all):
extern void *_dl_sym(void *, const char *, void *);
extern void *dlsym(void *handle, const char *name)
{
static void * (*real_dlsym)(void *, const char *)=NULL;
if (real_dlsym == NULL)
real_dlsym=_dl_sym(RTLD_NEXT, "dlsym", dlsym);
/* my target binary is even asking for dlsym() via dlsym()... */
if (!strcmp(name,"dlsym"))
return (void*)dlsym;
return real_dlsym(handle,name);
}
UPDATE FOR 2021 / glibc-2.34
Beginning with glibc 2.34, the function _dl_sym() is no longer publicly exported. Another approach I can suggest is to use dlvsym() instead, which is officially part of the glibc API and ABI. The only downside is that you now need the exact version to ask for the dlsym symbol. Fortunately, that is also part of the glibc ABI; unfortunately, it varies per architecture. However, a grep 'GLIBC_.*\bdlsym\b' -r sysdeps in the root folder of the glibc sources will tell you what you need:
[...]
sysdeps/unix/sysv/linux/i386/libc.abilist:GLIBC_2.0 dlsym F
sysdeps/unix/sysv/linux/i386/libc.abilist:GLIBC_2.34 dlsym F
[...]
sysdeps/unix/sysv/linux/x86_64/64/libc.abilist:GLIBC_2.2.5 dlsym F
sysdeps/unix/sysv/linux/x86_64/64/libc.abilist:GLIBC_2.34 dlsym F
Glibc 2.34 actually introduced new versions of this function, but the old versions are still kept around for backwards compatibility.
For x86_64, you could use:
real_dlsym=dlvsym(RTLD_NEXT, "dlsym", "GLIBC_2.2.5");
And if you would like to get both the newest version and, potentially, the one of another interceptor in the same process, you can use that pointer to do an unversioned query again:
real_dlsym=real_dlsym(RTLD_NEXT, "dlsym");
If you actually need to hook both dlsym and dlvsym in your shared object, this approach of course won't work either.
UPDATE: hooking both dlsym() and dlvsym() at the same time
Out of curiosity, I thought about some approach to hook both of the glibc symbol query methods, and I came up with a solution using an additional wrapper library which links to libdl. The idea is that the interceptor library can dynamically load this library at runtime using dlopen() with the RTLD_LOCAL | RTLD_DEEPBIND flags, which will create a separate linker scope for this object, also containing the libdl, so that the dlsym and dlvsym will be resolved to the original methods, and not the one in the interceptor library. The problem now is that our interceptor library can not directly call any function inside the wrapper library, because we can not use dlsym, which is our original problem.
However, the shared library can have an initialization function, which the linker will call before the dlopen() returns. We just need to pass some information from the initialization function of the wrapper library to the interceptor library. Since both are in the same process, we can use the environment block for that.
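The hand-off through the environment can be sketched in isolation like this. It is a simplified illustration with hypothetical names (publish_pointer, fetch_pointer), and it passes an ordinary data pointer, since ISO C only defines "%p" for void * values; passing a function pointer such as dlsym this way relies on the platform allowing the conversion.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for setenv() */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Store ptr in the environment variable `name` as "%p" text.
 * Returns 0 on success, -1 on failure. */
int publish_pointer(const char *name, void *ptr)
{
    char buf[64];
    int n;

    assert(name != NULL);
    n = snprintf(buf, sizeof buf, "%p", ptr);
    if (n < 0 || (size_t)n >= sizeof buf)
        return -1;               /* formatting failed or was truncated */
    return setenv(name, buf, 1); /* overwrite any previous value */
}

/* Read the pointer back. Returns NULL if the variable is unset or its
 * value cannot be parsed. */
void *fetch_pointer(const char *name)
{
    const char *text;
    void *ptr = NULL;

    assert(name != NULL);
    text = getenv(name);
    if (text == NULL || sscanf(text, "%p", &ptr) != 1)
        return NULL;
    return ptr;
}
```

Within one process, a pointer printed with "%p" and read back with "%p" compares equal to the original, which is what makes this channel workable between the two libraries.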
This is the code I came up with:
dlsym_wrapper.h:
#ifndef DLSYM_WRAPPER_H
#define DLSYM_WRAPPER_H
#define DLSYM_WRAPPER_ENVNAME "DLSYM_WRAPPER_ORIG_FPTR"
#define DLSYM_WRAPPER_NAME "dlsym_wrapper.so"
typedef void* (*DLSYM_PROC_T)(void*, const char*);
#endif
dlsym_wrapper.c, compiled to dlsym_wrapper.so:
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include "dlsym_wrapper.h"
__attribute__((constructor))
static void dlsym_wrapper_init()
{
if (getenv(DLSYM_WRAPPER_ENVNAME) == NULL) {
/* big enough to hold our pointer as hex string, plus a NUL-terminator */
char buf[sizeof(DLSYM_PROC_T)*2 + 3];
DLSYM_PROC_T dlsym_ptr=dlsym;
if (snprintf(buf, sizeof(buf), "%p", dlsym_ptr) < (int)sizeof(buf)) {
buf[sizeof(buf)-1] = 0;
if (setenv(DLSYM_WRAPPER_ENVNAME, buf, 1)) {
// error, setenv failed ...
}
} else {
// error, writing pointer hex string failed ...
}
} else {
// error: environment variable already set ...
}
}
And one function in the interceptor library to get the pointer to the original dlsym() (it should be called only once, guarded by a mutex):
static void *dlsym_wrapper_get_dlsym(void)
{
const char *dlsym_wrapper_name = DLSYM_WRAPPER_NAME;
void *wrapper;
const char * ptr_str;
void *res = NULL;
void *ptr = NULL;
if (getenv(DLSYM_WRAPPER_ENVNAME)) {
// error: already defined, shouldn't be...
}
wrapper = dlopen(dlsym_wrapper_name, RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND | RTLD_NOLOAD);
if (wrapper) {
// error: dlsym_wrapper.so already loaded ...
// it is important that we load it ourselves into a separate linker scope
}
wrapper = dlopen(dlsym_wrapper_name, RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND);
if (!wrapper) {
// error: dlsym_wrapper.so can't be loaded
}
ptr_str = getenv(DLSYM_WRAPPER_ENVNAME);
if (!ptr_str) {
// error: dlsym_wrapper.so failed...
}
if (sscanf(ptr_str, "%p", &ptr) == 1) {
if (ptr) {
// success!
res = ptr;
} else {
// error: got invalid pointer ...
}
} else {
// error: failed to parse pointer...
}
// this is a bit evil: close the wrapper. we can be sure
// that libdl is still in use, as this module uses it (dlopen)
dlclose(wrapper);
return res;
}
This of course assumes that dlsym_wrapper.so is in the library search path. However, you may prefer to just inject the interceptor library via LD_PRELOAD using a full path, and not modifying LD_LIBRARY_PATH at all. To do so, you can add dladdr(dlsym_wrapper_get_dlsym,...) to find the path of the injector library itself, and use that for searching the wrapper library, too.
http://www.linuxforu.com/2011/08/lets-hook-a-library-function/
From the text:
Do beware of functions that themselves call dlsym(), when you need to call __libc_dlsym (handle, symbol) in the hook.
extern void *__libc_dlsym (void *, const char *);
void *dlsym(void *handle, const char *symbol)
{
printf("Ha Ha...dlsym() Hooked\n");
void* result = __libc_dlsym(handle, symbol); /* now, this will call dlsym() library function */
return result;
}