Meaning of a gdb backtrace when there is no source code - c++

I have a gdb backtrace of a crashed process, but I can't see the specific line in which the crash occurred because the source code was not present at that moment. I don't understand some of the information given by the backtrace.
The backtrace is made of lines like the following one:
<path_to_binary_file>(_Z12someFunction+0x18)[0x804a378]
Notice that _Z12someFunction is the mangled name of int someFunction(double).
My questions are:
Does the +0x18 indicate the offset, starting at _Z12someFunction address, of the assembly instruction that produced the crash?
If the previous question is affirmative, and taking into account that I am working with a 32-bit architecture, does the +0x18 indicate 0x18 * 4 bytes?
If the above is affirmative, I assume that the address 0x804a378 is the _Z12someFunction address plus 0x18, am I right?
EDIT:
The error occurred on a production machine (no cores enabled), and it seems to be a timing-dependent bug, so it is not easy to reproduce. That is why the information I am asking for is important to me on this occasion.

Most of your assumptions are correct. The +0x18 indeed means an offset in bytes (regardless of architecture, so no multiplying by 4) from the address of _Z12someFunction.
0x804a378 is the actual address at which the error occurred, i.e. the symbol's address plus 0x18.
With that said, it is important to understand what you can do about it.
First of all, compiling with -g will produce debug symbols. You, rightfully, strip those for your production build, but all is not lost. If you take your original executable (i.e. before you stripped it), you can run:
addr2line -e executable
You can then feed into stdin the addresses gdb is giving you (0x804a378), and addr2line will give you the precise file and line to which this address refers.
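For example, a session might look like this (the binary path and the reported file/line are placeholders; -f also prints the function name and -C demangles it, and you can pass the address on the command line instead of stdin):
$ addr2line -f -C -e ./my_unstripped_binary 0x804a378
someFunction(double)
/path/to/source/file.cpp:123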
If you have a core file, you can also load this core file with the unstripped executable, and get full debug info. It would still be somewhat mangled, as you're probably building with optimizations, but some variables should still be accessible.
Building with debug symbols and stripping before shipping is the best option. Even if you did not, however, if you build the same sources again with the same build tools on the same environment and using the same build options, you should get the same binary with the same symbols locations. If the bug is really difficult to reproduce, it might be worthwhile to try.
EDITED to add
Two more important tools are worth mentioning. The first is c++filt. You feed it a mangled symbol, and it produces the demangled C++ name of the actual source symbol. It works as a filter, so you can just copy the backtrace and paste it into c++filt, and it will give you the same backtrace, only more readable.
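A quick sketch of the filter usage (backtrace.txt and ./myprogram are placeholders for wherever your backtrace text comes from):
$ c++filt < backtrace.txt        # demangles every _Z... symbol it finds
$ ./myprogram 2>&1 | c++filt     # or demangle the output live as the program prints it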
The second tool is gdb remote debugging. This allows you to run gdb on a machine that has the executable with debug symbols, but run the actual code on the production machine. This allows live debugging in production (including attaching to already running processes).
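A rough sketch of that setup (host name, port, and binary paths are placeholders):
# on the production machine, running the stripped binary
$ gdbserver :2345 ./myserver
$ gdbserver --attach :2345 <pid>      # or attach to an already running process

# on the development machine, with the unstripped binary
$ gdb ./myserver.debug
(gdb) target remote production-host:2345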

You are confused. What you are seeing is backtrace output from glibc's backtrace function, not gdb's backtrace.
but I can't see the specific line in which the crash occurred because the source code was not present at that moment
Now you can load the executable in gdb and examine the address 0x804a378 to get line numbers. You can use list *0x804a378 or info symbol 0x804a378. See Convert a libc backtrace to a source line number and How to use addr2line command in linux.
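For example, with the unstripped binary (the output in the comments is only illustrative):
$ gdb /path/to/binary
(gdb) list *0x804a378          # shows the source around that address (needs -g)
(gdb) info symbol 0x804a378    # prints something like: _Z12someFunction + 24 in section .text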

Run man gcc; there you should see the -g option, which adds debug information to the binary object file. When a crash happens and a core is dumped, gdb can then detect the exact lines where and why the crash happened. Alternatively, you can run the process under gdb, or attach to it, and see the trace directly without searching for the core file.
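As a minimal sketch (file names are placeholders; keep an unstripped copy of the binary around):
$ g++ -g -o myprog myprog.cpp
$ ./myprog                      # crashes and dumps core (if core dumps are enabled)
$ gdb ./myprog core             # post-mortem analysis of the core file
$ gdb -p $(pidof myprog)        # or attach to the running process instead
(gdb) bt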

Related

loading libc's symbols into gdb

I'm debugging a binary with an older libc version than my system's (my system has libc-2.31, the binary runs against 2.24). I execute gdb with LD_LIBRARY_PATH set and it works like a charm, but I cannot load any symbols.
I downloaded the closest symbols file from http://archive.ubuntu.com/ubuntu/pool/main/g/glibc/libc6-dbg_2.23-0ubuntu11.2_amd64.deb, extracted it and after loading the binary into gdb, I execute:
add-symbol-file <path_to_libc-2.27.so from the deb package>
The file was loaded successfully, but the addresses are incorrect. For example, trying to examine a symbol such as 'main_arena' (x/40gx &main_arena) produces the following error:
0x3ebc40 <main_arena>: Cannot access memory at address 0x3ebc40
Obviously this address is too low, so I guess it's only the offset. What is my problem? Maybe I need to find the exact debug file that suits my version (2.24)? Because there doesn't seem to be one.
Thanks!
I execute gdb with the LD_LIBRARY_PATH and it works like a charm,
It is not supposed to work, and if it happens to work today, it will likely break tomorrow.
The easiest solution is to debug inside a VM or a docker container with the desired version of GLIBC installed.
If you don't want to do that, see this answer on how to properly set things up for multiple GLIBCs on a single host.
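For instance, the container approach might look roughly like this (the image tag is an assumption; pick one that ships the GLIBC version you need, then install gdb and the debug symbols inside it):
$ docker run --rm -it -v "$PWD":/work -w /work ubuntu:16.04 bash
root@container:/work# apt-get update && apt-get install -y gdb libc6-dbg
root@container:/work# gdb ./mybinary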

gdb segmentation fault line number missing with c++11 option [duplicate]

Is there any gcc option I can set that will give me the line number of the segmentation fault?
I know I can:
Debug line by line
Put printfs in the code to narrow down.
Edit:
bt / where in gdb give "No stack".
I don't know of a gcc option, but you should be able to run the application under gdb and then, when it crashes, type where to take a look at the stack at the point it crashed, which should get you close.
$ gdb blah
(gdb) run
(gdb) where
Edit for completeness:
You should also make sure to build the application with debug flags on using the -g gcc option to include line numbers in the executable.
Another option is to use the bt (backtrace) command.
Here's a complete shell/gdb session
$ gcc -ggdb myproj.c
$ gdb a.out
gdb> run --some-option=foo --other-option=bar
(gdb will say your program hit a segfault)
gdb> bt
(gdb prints a stack trace)
gdb> q
[are you sure, your program is still running]? y
$ emacs myproj.c # heh, I know what the error is now...
Happy hacking :-)
You can have your program print a stack trace when it gets a SEGV signal, similar to how Java and other friendlier languages handle null pointer exceptions. See my answer here for more details:
how to generate a stacktrace when my C++ app crashes (using gcc compiler)
The nice thing about this is you can just leave it in your code; you don't need to run things through gdb to get the nice debug output.
If you compile with -g and follow the instructions there, you can use a command-line tool like addr2line to get file/line information from the output.
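A minimal sketch of that approach (this is not the exact code from the linked answer; the frame count is an arbitrary choice, and you would build with -g and -rdynamic so the printed addresses can be resolved to names):
#include <csignal>
#include <execinfo.h>
#include <unistd.h>

// On SIGSEGV, dump the raw frame addresses to stderr, then re-raise the
// signal so the process still terminates (and can still dump core).
static void segv_handler(int sig) {
    void *frames[64];
    int count = backtrace(frames, 64);
    backtrace_symbols_fd(frames, count, STDERR_FILENO);  // avoids malloc inside the handler
    signal(sig, SIG_DFL);
    raise(sig);
}

int main() {
    signal(SIGSEGV, segv_handler);
    int *p = nullptr;   // deliberate crash to demonstrate the handler
    return *p;
}
The addresses it prints can then be run through addr2line (and c++filt) exactly as described above.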
Run it under valgrind.
You also need to build with debug flags on: -g.
You can also open the core dump with gdb (you need -g though).
If all the preceding suggestions to compile with debugging (-g) and run under a debugger (gdb, run, bt) are not working for you, then:
Elementary: Maybe you're not running under the debugger, you're just trying to analyze the postmortem core dump. (If you start a debug session, but don't run the program, or if it exits, then when you ask for a backtrace, gdb will say "No stack" -- because there's no running program at all. Don't forget to type "run".) If it segfaulted, don't forget to add the third argument (core) when you run gdb, otherwise you start in the same state, not attached to any particular process or memory image.
Difficult: If your program is/was really running but your gdb is saying "No stack", perhaps your stack pointer is badly smashed. In that case, you may have a buffer overflow problem somewhere, severe enough to mash your runtime state entirely. GCC 4.1 supports the ProPolice "Stack Smashing Protector" that is enabled with -fstack-protector-all. It can be added to GCC 3.x with a patch.
There is no method for GCC to provide this information, you'll have to rely on an external program like GDB.
GDB can give you the line where a crash occurred with the "bt" (short for "backtrace") command after the program has seg faulted. This will give you not only the line of the crash, but the whole stack of the program (so you can see what called the function where the crash happened).
The "No stack" problem seems to happen when the program exits successfully.
For the record, I had this problem because I had forgotten a return in my code, which made my program exit with a failure code.

Program crash - how to read appcompat.txt?

After the program I am debugging crashes, I am left with a heap dump (*.mdmp) file and appcompat.txt in my Temp directory. I understand that appcompat.txt is an error report. Is there a description of its format?
My appcompat.txt lists a number of DLLs. Am I correct assuming that the reason for a crash could have only come from one of the listed DLLs? Can I limit my debugging effort to the DLLs listed in appcompat.txt?
Thanks in advance!
The minidump file is far more informative for diagnosing crashes:
Install Debugging Tools for Windows, if you don't already have it.
Set up the symbol path variable _NT_SYMBOL_PATH to point to the Microsoft symbol server
Run Windbg and do File -> Open Crash Dump and locate your .dmp or .mdmp file
Type !analyze -v.
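A typical session might look roughly like this (the symbol cache directory and dump path are placeholders; the URL is the standard Microsoft symbol server):
C:\> set _NT_SYMBOL_PATH=srv*C:\symbols*https://msdl.microsoft.com/download/symbols
C:\> windbg -z C:\Temp\crash.mdmp
0:000> !analyze -v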
This will try to isolate the location of the crash. Note that just because a crash occurs in a particular dll it doesn't mean that is where the bug resides - it could be because an invalid parameter has been passed in from your application code. The analysis should hopefully show you a meaningful stack and an error code which should help in working out the actual cause of the crash.

How to map PC (ARMv5) address to source code?

I'm developing on an ARM9E processor running Linux. Sometimes my application crashes with the following message :
[ 142.410000] Alignment trap: rtspserverd (996) PC=0x4034f61c
Instr=0xe591300c Address=0x0000000d FSR 0x001
How can I translate the PC address to actual source code? In other words, how can I make sense out of this message?
With objdump. Dump your executable, then search for 4034f61c:.
The -x, --disassemble, and -l options are particularly useful.
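A rough sketch of that (the binary name comes from the crash message; -C additionally demangles C++ names, and -l only adds source lines if the binary was built with -g):
$ objdump -d -l -C rtspserverd > rtspserverd.lst   # disassembly with source line annotations
$ grep -n '4034f61c:' rtspserverd.lst
If the PC turns out to fall inside a shared library rather than the executable itself, disassemble that library instead and subtract its load address from the PC first.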
You can turn on listings in the compiler and tell the linker to produce a map file. The map file will give you the meaning of the absolute addresses up to the function where the problem occurs, while the listing will help you pinpoint the exact location of the exception within the function.
For example in gcc you can do
gcc -Wa,-a,-ad -c foo.c > foo.lst
to produce a listing in the file foo.lst.
-Wa, sends the following options to the assembler (gas).
-a tells gas to produce a listing on standard output.
-ad tells gas to omit debug directives, which would otherwise add a lot of clutter.
The option for the GNU linker to produce a map file is -M or --print-map. If you link with gcc you need to pass the option through to the linker using the -Wl, prefix, for example -Wl,-M.
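For example (foo.map is just a chosen output name):
$ gcc -Wl,-Map=foo.map -o foo foo.o    # write the link map to foo.map
$ gcc -Wl,-M -o foo foo.o > foo.map    # or print it to stdout and redirect it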
Alternatively you could also run your application in the debugger (e.g. gdb) and look at the stack dump after the crash with the bt command.

analysis of core file

I'm using Red Hat Linux 3. Can someone explain how it is possible that I am able to analyze, with gdb, a core dump generated on Red Hat Linux 5?
Not that I'm complaining :) but I need to be sure this will always work...
EDIT: the shared libraries are the same version, so no worries about that; they are placed on shared storage so they can be accessed from both Red Hat 5 and Red Hat 3.
Thanks.
You can try following commands of GDB to open a core file
gdb
(gdb) exec-file <path to executable>
(gdb) set solib-absolute-prefix <path to shared library>
(gdb) core-file <path to core file>
The reason you can't rely on it is that every process uses libc or other system shared libraries, which will certainly have changed from Red Hat 3 to Red Hat 5. So the instruction addresses and the number of instructions in native functions will differ, and that is where the debugger gets goofed up and can possibly show you wrong data to analyze. So it's always good to analyze the core on the same platform, or, if you can, copy all the required shared libraries to the other machine and set the path through set solib-absolute-prefix.
In my experience, analysing a core file generated on another system does not work, because the standard library (and other libraries your program probably uses) will typically be different, so the addresses of the functions are different and you cannot even get a sensible backtrace.
Don't do it, because even if it works sometimes, you cannot rely on it.
You can always run gdb -c /path/to/corefile /path/to/program_that_crashed. However, if program_that_crashed has no debug info (i.e. was not compiled and linked with the -g gcc/ld flag) the coredump is not that useful unless you're a hard-core debugging expert ;-)
Note that the generation of corefiles can be disabled (and it's very likely that it is disabled by default on most distros). See man ulimit. Call ulimit -c to see the limit of core files, "0" means disabled. Try ulimit -c unlimited in this case. If a size limit is imposed the coredump will not exceed the limit size, thus maybe cutting off valuable information.
Also, the path where a coredump is generated depends on /proc/sys/kernel/core_pattern. Use cat /proc/sys/kernel/core_pattern to query the current pattern. It's actually a path, and if it doesn't start with / then the file will be generated in the current working directory of the process. And if cat /proc/sys/kernel/core_uses_pid returns "1" then the coredump will have the PID of the crashed process as file extension. You can also set both values, e.g. echo -n /tmp/core > /proc/sys/kernel/core_pattern will force all coredumps to be generated in /tmp.
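Putting those pieces together, a quick check might look like this (the /tmp/core pattern and the resulting file name are only examples; with core_uses_pid enabled the PID is appended):
$ ulimit -c                                                    # 0 means core dumps are disabled
$ ulimit -c unlimited
$ cat /proc/sys/kernel/core_pattern
$ echo -n /tmp/core | sudo tee /proc/sys/kernel/core_pattern   # needs root
$ ./program_that_crashed
$ gdb -c /tmp/core.12345 ./program_that_crashed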
I understand the question as:
how is it possible that I am able to analyse a core that was produced under one version of an OS under another version of that OS?
Just because you are lucky (even that is questionable). There are a lot of things that can go wrong by trying to do so:
the tool chains (gcc, gdb etc.) will be of different versions
the shared libraries will be of different versions
so no, you shouldn't rely on that.
You have asked a similar question and accepted an answer (written, of course, by yourself) here: Analyzing core file of shared object
Once you load the core file you can get the stack trace, find the last function call, and check the code for the reason for the crash.
There is a small tutorial here to get started with.
EDIT:
Assuming you want to know how to analyse a core file using gdb on Linux, as your question is a little unclear.