GDB call stack addresses: virtual or physical?

My gdb bt call stack gives each function name with a function address. I then ran nm on the binary
and generated a function-name-to-address mapping. When I tried to match the gdb addresses against the nm output, they did not match: the function addresses in (gdb) bt were too high (they look like physical addresses).
gdb function addresses (e.g. 0x00007fffe6fc150f):
#9 0x00007fffe6fc150f in read_alias_file (fname=<value optimized out>, fname_len=<value optimized out>) at localealias.c:224
#10 0x00007fffe6fc1a4e in _nl_expand_alias (name=0x7fffffffed04 "en_IN") at localealias.c:189
#11 0x00007fffe6fbb62f in _nl_find_locale (locale_path=0x7fffe70df580 "/usr/lib/locale", locale_path_len=16, category=12, name=0x7fffffffdb90) at findlocale.c:119
#12 0x00007fffe6fbacf6 in *__GI_setlocale (category=12, locale=<value optimized out>) at setlocale.c:303
#13 0x00007ffff17b8686 in
but when I ran nm on the binary, the addresses I got look like this:
0000000005ddda04 t StubGLBindFragDataLocationIndexed
0000000005ddda3f t StubGLBindFramebuffer
0000000005ddda65 t StubGLBindRenderbuffer
0000000005ddda8b t StubGLBindTexture
0000000005dddab1 t StubGLBlendColor
0000000005dddaef t StubGLBlendFunc
0000000005dddb15 t StubGLBlitFramebuffer
0000000005dddb7e t StubGLBufferData
0000000005dddbbd t StubGLBufferSubData
0000000005dddc00 t StubGLCheckFramebufferStatus
0000000005dddc1e t StubGLClear
0000000005dddc3c t StubGLClearColor
0000000005dddc7a t StubGLClearStencil
0000000005dddc98 t StubGLColorMask
0000000005dddcda t StubGLCompileShader
The machine is 64-bit.
As far as I know, gdb shows only virtual addresses. But I don't know why they come out so high and don't match the addresses in the nm output.
Are the gdb addresses virtual addresses? The nm output looks like actual virtual addresses, since it starts near 0. But then why is a base address added automatically?
Note: I tried with a sample test.out; that works fine. The bt call stack addresses are virtual addresses and match the nm a.out symbol output perfectly.

The 0x00007fffe6fc150f address is coming from a shared library (libc.so.6 in this case). It is a virtual address, but it will not match the output of nm /lib/libc.so.6, because the library is loaded at a certain load address, which varies from execution to execution.
You can find out where libc.so.6 is loaded by executing the info proc map GDB command. Once you know the load address of libc.so.6, add it to every address in the nm output, and the result will match the GDB output.
The reason this worked for a simple a.out is that (unlike shared libraries) a.out is linked to be loaded at a fixed load address (usually 0x400000 on Linux x86_64) and is not relocated by the dynamic loader.
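For example, a sketch of that arithmetic (the load address and the nm offset below are made up for illustration; only their sum is chosen to match frame #9 above):
(gdb) info proc map
...
0x7fffe6f97000    0x7fffe713b000    ...    /lib/libc.so.6
$ nm /lib/libc.so.6 | grep read_alias_file           # assumes an unstripped libc
000000000002a50f t read_alias_file
$ printf '0x%x\n' $(( 0x7fffe6f97000 + 0x2a50f ))    # load address + nm offset
0x7fffe6fc150f                                       # matches frame #9 in bt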

No symbols loaded in gdb with breakpad core file

I am already using google-crashdumper, but I want to try breakpad now. I have integrated google-breakpad into my project and I'm deliberately crashing the application to test it.
I convert the minidump to a core file and load it in gdb as follows:
gdb application --core=corefile.core
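(For reference, the conversion step is typically done with breakpad's minidump-2-core tool, which writes the core to stdout; the file names here are placeholders:)
minidump-2-core crash.dmp > corefile.core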
The problem is that there are no symbols from the shared libraries. It looks like the following:
Thread 2 (LWP 16357):
#0 0xf7789bd9 in ?? ()
#1 0x00000a48 in CountAUXV (pvdso_ehdr=<optimized out>, pnum_auxv=<optimized out>)
#2 CreateElfCore (handle=<error reading variable: Cannot access memory at address 0xf70befac>,
writer=<error reading variable: Cannot access memory at address 0xf70befa8>,
is_done=<error reading variable: Cannot access memory at address 0xf70bef74>, prpsinfo=0x80, user=0xf769b9eb, prstatus=0x0,
num_threads=1314, pids=0x0, i386_regs=0x0, fpregs=0x0, fpxregs=0x8e763f8 <_GLOBAL_OFFSET_TABLE_>, pagesize=175652892,
prioritize_max_length=175652896, main_pid=-150208408,
extra_notes=0x8494476 <boost::asio::detail::posix_event::wait<boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex> >(boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>&)+134>, extra_notes_count=175652440) at src/elfcore.c:770
#3 0x00000a48 in CountAUXV (pvdso_ehdr=<optimized out>, pnum_auxv=<optimized out>)
#4 CreateElfCore (handle=<error reading variable: Cannot access memory at address 0xf70befb0>,
writer=<error reading variable: Cannot access memory at address 0xf70befac>,
is_done=<error reading variable: Cannot access memory at address 0xf70bef78>, prpsinfo=0xf769b9eb, user=0x0,
prstatus=0x522 <CryptoPP::PSSR_MEM_Base::RecoverMessageFromRepresentative(CryptoPP::HashTransformation&, std::pair<unsigned char const*, unsigned int>, bool, unsigned char*, unsigned int, unsigned char*) const+600>, num_threads=0, pids=0x0, i386_regs=0x0,
fpregs=0x8e763f8 <_GLOBAL_OFFSET_TABLE_>, fpxregs=0xa78401c, pagesize=175652896, prioritize_max_length=4144758888,
main_pid=139019382, extra_notes=0xa783e58, extra_notes_count=175652416) at src/elfcore.c:770
#5 0x00000080 in ?? ()
#6 0xf769b9eb in ?? ()
#7 0x00000000 in ?? ()
Thread 1 (LWP 16350):
#0 0xf7789bd9 in ?? ()
#1 0xff8d29b8 in ?? ()
#2 0xf74f0527 in ?? ()
I am posting only two threads; it is similar for every thread, which is quite weird, since I also provided my executable to gdb.
Then I compared breakpad's core file with crashdumper's core file. In the crashdumper core file everything loads perfectly: all the symbols from all the libraries, and it shows the thread and the place where the crash took place. Nothing like that in the breakpad version.
What am I missing with breakpad? I googled a lot, but in vain; I didn't find anything, nor anyone facing such a problem.
UPDATE
I might know why it is behaving like that. I checked info sharedlibrary in gdb and found the following:
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
No /var/lib/breakpad/D05FAC9D-0A87-6A47-5B5F-4ACE88DA8B2B-linux-gate.so
No /var/lib/breakpad/07158AB3-A302-F4D9-E226-2E743AAD5F62-libarmmem.so
No /var/lib/breakpad/0CF3E746-A497-4FC2-344C-5150C99DA98F-libdbus-1.so.3.8.13
No /var/lib/breakpad/86022950-B6CD-75CC-5231-9E660744CC01-librt-2.19.so
No /var/lib/breakpad/D43EAF3E-9294-46AB-EBEC-7D2843FAD327-libdl-2.19.so
No /var/lib/breakpad/083C9754-79F6-5740-5007-420864280D28-libm-2.19.so
No /var/lib/breakpad/73F07B39-C2C2-F2E1-976B-28C79E9C7380-libpthread-2.19.so
No /var/lib/breakpad/8E621420-AFA9-0E78-0FC6-66408F455863-libc-2.19.so
No /var/lib/breakpad/2848F9C5-0705-5011-7118-B3528CB1B127-ld-2.19.so
No /var/lib/breakpad/98309410-5F29-2228-E94C-CE5597E94B8E-libnss_compat-2.19.so
No /var/lib/breakpad/ADB0DF4C-35D2-97E7-D08B-08CCC5D05BAE-libnsl-2.19.so
No /var/lib/breakpad/7A15AA2B-CFE8-EAE9-ED53-5AE09F11D847-libnss_nis-2.19.so
No /var/lib/breakpad/0B47D611-FAE4-DF70-897D-B17FC2403E6B-libnss_files-2.19.so
No /var/lib/breakpad/44B0344D-3E34-451F-180E-80F7260552C9-libX11.so.6.3.0
No /var/lib/breakpad/6980DABF-E4A3-BA5A-77BD-A926F982F7DA-libxcb.so.1.1.0
No /var/lib/breakpad/761E80BE-9902-2C81-CE65-EB25C918F928-libXau.so.6.0.0
No /var/lib/breakpad/E82DCDA7-DBC9-E32F-4910-42EB91EE45E1-libXdmcp.so.6.0.0
No /var/lib/breakpad/61020107-52E1-1B5E-F21D-C4B038AB639A-libXext.so.6.4.0
No /var/lib/breakpad/129CD9AD-EAC2-ACF7-CB4A-1676EAE9A2C5-libXrandr.so.2.2.0
No /var/lib/breakpad/A9E8A41A-1DA0-1FDD-A54D-0B1C5D35E90F-libXrender.so.1.3.0
No /var/lib/breakpad/DC369B36-7E04-CEC6-4D5B-3FDF02CB5A94-libXtst.so.6.1.0
No /var/lib/breakpad/F0A290AE-076C-3270-25B8-52C134D70034-libXi.so.6.1.0
No /var/lib/breakpad/A77F22F7-692A-A25D-BA51-9F725850878B-libXdamage.so.1.1.0
No /var/lib/breakpad/4C202434-CFCB-ABB5-A350-73E99C5D9E2F-libXfixes.so.3.1.0
No /var/lib/breakpad/E35954A9-31A1-A86D-6CEE-9A4532E31D10-libSM.so.6.0.1
No /var/lib/breakpad/2254A820-8A49-A402-DC7B-7BCC21EF2BC3-libICE.so.6.3.0
No /var/lib/breakpad/129A60DD-4279-492F-67BB-BD62B86BE6B3-libuuid.so.1.3.0
So, if I am not wrong, it is looking for the shared libraries in a place where they do not exist. Even after I installed breakpad there was no such folder as /var/lib/breakpad.
Found the answer:
https://breakpad.appspot.com/1214002
This patch had already been applied but was not mentioned anywhere; I'm posting it for anyone who faces the same problem.
There is still one issue with it, though: the user can provide only one path, while the libraries were loaded from multiple paths. I don't know whether this has already been implemented.
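(As an aside, and assuming gdb can match the library file names, one possible workaround on the gdb side is to point it at extra search directories with set solib-search-path, which accepts a colon-separated list:)
(gdb) set solib-search-path /var/lib/breakpad:/other/lib/dir
(gdb) info sharedlibrary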

How to get a line number from libunwind and AddressSanitizer traces listed as <shared object>+offset?

I often get stack traces from libunwind or AddressSanitizer like this:
#12 0x7ffff4b47063 (/home/janw/src/pl-devel/lib/x86_64-linux/libswipl.so.7.1.13+0x1f5063)
#13 0x7ffff4b2c783 (/home/janw/src/pl-devel/lib/x86_64-linux/libswipl.so.7.1.13+0x1da783)
#14 0x7ffff4b2cca4 (/home/janw/src/pl-devel/lib/x86_64-linux/libswipl.so.7.1.13+0x1daca4)
#15 0x7ffff4b2cf42 (/home/janw/src/pl-devel/lib/x86_64-linux/libswipl.so.7.1.13+0x1daf42)
I know that if I have gdb attached to the still-living process, I can use this to get details on the location:
(gdb) list *0x7ffff4b47063
But if the process has died, I cannot just restart it under gdb and use the above, because
address randomization means I get the wrong result (at least, that is my assumption;
I clearly do not get meaningful locations). So I tried:
% gdb program
% run
<get to the place everything is loaded and type Control-C>
(gdb) info shared
<Dumps mapping location of shared objects>
(gdb) list *(<base of libswipl.so.7.1.13>+0x1f5063)
But this either lists nothing or clearly the wrong location. This sounds simple, but
I failed to find the answer :-( The platform is 64-bit Linux, but I guess this applies to
any platform.
(gdb) info shared
<Dumps mapping location of shared objects>
Unfortunately, the above does not dump the actual mapping location in a form usable with
libswipl.so.7.1.13+0x1f5063
(as you've discovered). Rather, the GDB output lists where the .text section was mapped, not where the ELF binary itself was mapped.
You can adjust for the .text offset by finding it with:
readelf -WS libswipl.so.7.1.13 | grep '\.text'
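For example, a sketch with made-up numbers (only the final sum is chosen to match frame #12 above): if info shared reports the From address of libswipl.so.7.1.13 as 0x7ffff49c6a10 and readelf shows .text at address 0x74a10, the load base is 0x7ffff49c6a10 - 0x74a10 = 0x7ffff4952000, and then:
(gdb) list *(0x7ffff4952000 + 0x1f5063)    # resolves the 0x7ffff4b47063 frame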
It might be easier to use addr2line instead. Something like
addr2line -fe libswipl.so.7.1.13 0x1f5063 0x1da783
should work.
Please see http://clang.llvm.org/docs/AddressSanitizer.html for instructions on using the asan_symbolize.py script and/or the symbolize=true option.
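A sketch of both approaches (the compiler-rt script path and exact option spelling are assumptions that may vary by version):
# symbolize at run time:
ASAN_SYMBOLIZER_PATH=$(which llvm-symbolizer) ASAN_OPTIONS=symbolize=1 ./program
# or post-process a saved, unsymbolized report:
./program 2>&1 | compiler-rt/lib/asan/scripts/asan_symbolize.py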

Get line number for lambda using GDB

We have a backtrace for a segfault that quotes a compiler-generated name for a lambda:
(gdb) bt
#0 std::_Function_handler<std::function<bool()>(), bold::AdHocOptionTreeBuilder::buildTree(bold::Agent*)::__lambda59>::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/4.8/functional:2057
#1 0x08146d2c in operator() (this=<optimized out>) at /usr/include/c++/4.8/functional:2464
...
The assigned name is bold::AdHocOptionTreeBuilder::buildTree(bold::Agent*)::__lambda59. However, as you can tell, that file has a lot of lambdas in it! Is there a way to map that generated function's name to a line number in the source code? We have line numbers for other functions, but here the name is only quoted as a type parameter of std::_Function_handler<>.
The linker option -Map mapfile should give you information showing where each function originated, including lambdas. nm --line-numbers might work too, if the program was compiled with debug info (-g).
Also, I think you can use set print symbol-filename on in GDB, and then evaluate &bold::AdHocOptionTreeBuilder::buildTree(bold::Agent*)::__lambda59
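A sketch of that last approach (the quoting and the output shape are illustrative, including the file name and line number; the exact spelling of the lambda's name can differ between GCC versions):
(gdb) set print symbol-filename on
(gdb) print &'bold::AdHocOptionTreeBuilder::buildTree(bold::Agent*)::__lambda59'
$1 = (...) 0x...... <operator() at AdHocOptionTreeBuilder.cc:NNN>   # the file:line is what you're after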

Analyze backtrace of a crash occurring due to a faulty library

In my application I have set up a signal handler to catch segfaults and print backtraces.
My application loads some plugin libraries when the process starts.
If my application crashes with a segfault due to an error in the main executable binary, I can analyze the backtrace with:
addr2line -Cif -e ./myapplication 0x4...
It accurately displays the function and the source_file:line_no.
However, how do I analyze the crash if it occurs due to an error in a plugin, as in the backtrace below?
/opt/myapplication(_Z7sigsegvv+0x15)[0x504245]
/lib64/libpthread.so.0[0x3f1c40f500]
/opt/myapplication/modules/myplugin.so(_ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi+0x6af)[0x7f5588fe4bbf]
/opt/myapplication/modules/myplugin.so(_Z11myplugin_reqmodP12CONNECTION_TP7Filebuf+0x68)[0x7f5588fe51e8]
/opt/myapplication(_ZN10Processors7ExecuteEiP12CONNECTION_TP7Filebuf+0x5b)[0x4e584b]
/opt/myapplication(_Z15process_requestP12CONNECTION_TP7Filebuf+0x462)[0x4efa92]
/opt/myapplication(_Z14handle_requestP12CONNECTION_T+0x1c6d)[0x4d4ded]
/opt/myapplication(_Z13process_entryP12CONNECTION_T+0x240)[0x4d79c0]
/lib64/libpthread.so.0[0x3f1c407851]
/lib64/libc.so.6(clone+0x6d)[0x3f1bce890d]
Both my application and the plugin libraries have been compiled with gcc and are unstripped.
My application, when executed, loads the plugin .so files with dlopen.
Unfortunately, the crash occurs at a site where I cannot run the application under gdb.
I googled around frantically for an answer, but all sites discussing backtrace and addr2line exclude scenarios where analysis of faulty plugins may be required.
I hope some kind-hearted hacker knows a solution to this dilemma and can share some insights. It would be invaluable for fellow programmers.
Tons of thanks in advance.
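(For reference, the trace above is in the format produced by glibc's backtrace_symbols_fd; a minimal sketch of the kind of handler described, assuming glibc, not the actual code from the question:)
#include <execinfo.h>   /* backtrace, backtrace_symbols_fd (glibc) */
#include <signal.h>
#include <unistd.h>

static void sigsegv_handler(int sig) {
    void *frames[64];
    int n = backtrace(frames, 64);
    /* prints lines like: module(_Zmangled+0xoff)[0xaddr] */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    /* note: backtrace() is not guaranteed async-signal-safe; acceptable for a debugging aid */
    _exit(1);
}

int main(void) {
    signal(SIGSEGV, sigsegv_handler);
    /* ... application and plugin code ... */
    return 0;
}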
Here are some hints that may help you debug this:
The address in your backtrace is an address in the address space of the process at the time it crashed. That means that, if you want to translate it into a 'physical' address relative to the start of the .text section of your library, you have to subtract the start address of the relevant mapping (as shown by pmap) from the address in your backtrace.
Unfortunately, this means that you need a pmap of the process before it crashed. I admittedly have no idea whether load addresses for libraries on a single system stay constant across runs (address-space randomization may well change them), but they certainly aren't portable across systems, as you have noticed.
In your position, I would try:
1. Demangling the symbol names with c++filt -n or manually. I don't have a shell right now, so here is my manual attempt: _ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi is ICAPSection::process(CONNECTION_T*, Filebuf*, int). This may already be helpful. If not:
2. Using objdump or nm (I'm pretty sure they can do that) to find the address corresponding to the mangled name, then adding the offset (+0x6af as per your stack trace) to it, and looking up the resulting address with addr2line.
us2012's answer was exactly the trick required to solve the problem. I am restating it here to help any other newbie struggling with the same problem, and in case somebody wishes to offer improvements.
The backtrace clearly shows that the flaw is in the code of myplugin.so, at:
/opt/myapplication/modules/myplugin.so(_ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi+0x6af)[0x7f5588fe4bbf]
The line corresponding to this fault cannot be located as simplistically as:
addr2line -Cif -e /opt/myapplication/modules/myplugin.so 0x7f5588fe4bbf
The correct procedure here is to use nm or objdump to determine the address of the mangled name. (Demangling, as done by us2012, is not really necessary at this point.) So, using:
nm -Dlan /opt/myapplication/modules/myplugin.so | grep "_ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi"
I get:
0000000000008510 T _ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi /usr/local/src/unstable/myapplication/sources/modules/myplugin/myplugin.cpp:518
Interesting to note: myplugin.cpp:518 actually points to the line containing the opening "{" of the function ICAPSection::process(CONNECTION_T*, Filebuf*, int).
Next, we add 0x6af (the offset from the backtrace) to the address 0x8510 revealed by the nm output above, using the shell command:
printf '0x%x\n' $(( 0x0000000000008510 + 0x6af ))
which results in 0x8bbf.
This address points into the faulty code, and the actual source_file:line_no can be determined precisely with addr2line:
addr2line -Cif -e /opt/myapplication/modules/myplugin.so 0x8bbf
Which displays:
std::char_traits<char>::length(char const*)
/usr/include/c++/4.4/bits/char_traits.h:263
std::string::assign(char const*)
/usr/include/c++/4.4/bits/basic_string.h:970
std::string::operator=(char const*)
/usr/include/c++/4.4/bits/basic_string.h:514
??
/usr/local/src/unstable/myapplication/sources/modules/myplugin/myplugin.cpp:622
I am not too sure why the outermost function name is displayed as ??, but myplugin.cpp:622 was precisely where the fault was.
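(If the ?? is a problem, one alternative worth trying is to disassemble around the computed offset and read the enclosing symbol from the listing; the address range below simply brackets 0x8bbf:)
objdump -dC --start-address=0x8b80 --stop-address=0x8c00 /opt/myapplication/modules/myplugin.so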

Linux Kernel Text Symbols

When I look through Linux kernel OOPS output, the EIP and other code addresses have values in the range 0xC01-----. In my System.map and objdump -S vmlinux output, all the code addresses are at least above 0xC1------. My vmlinux has debug symbols included (CONFIG_DEBUG_INFO).
When I debug over a serial connection (kgdb) and load gdb with gdb ./vmlinux, again I cannot reconcile $eip with what I have in System.map and the objdump output. When I run where in gdb, I get a jumbled mess on the stack:
#0 0xC01----- in ?? ()
#1 0xC01----- in ?? ()
#2 0xC01----- in ?? ()
...
Can anyone suggest how to resolve these issues? My main concern is how to actually map an EIP value from an OOPS to System.map or objdump -S vmlinux output. I know that the OOPS gives me the function name and offset into the object code, but I am more concerned with the issue above, and with why gdb cannot display a correct stack backtrace.
It looks like the OOPS occurred because you jumped to a place that is not a function.
This would easily cause a crash, and it would also prevent the debugger from resolving the address to a symbol.
You can check this by disassembling the area around this EIP. If I'm correct, it won't make sense as machine code.
There are generally two causes for such things:
1. A function call through a corrupt function pointer. In this case, the stack frame before the last should show the caller. But you don't have that frame, so it may be the other reason:
2. A stack overrun: your return address is corrupt, so you've returned to a bad location. If so, the data ESP points to should contain the address now in EIP. Debugging stack overruns is hard, because the most important source of information is missing. You can try to print the stack in "raw" format (x/xa addr) and try to make sense of it.
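A sketch of both checks from within kgdb (the element counts are arbitrary):
(gdb) x/20i $eip       # disassemble from EIP: does it make sense as machine code?
(gdb) x/32xa $esp      # raw stack dump; look for return addresses into kernel text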