gdb coredump analysis failure - gdb

(gdb) info shared -> this shows that the symbols for all the shared libraries have been read.
But the backtrace (bt) still shows 'No symbol table info available' for every function.
Can someone shed some light on what the issue could be? How can I resolve this and get a meaningful backtrace?

That column in the info shared output is perennially confusing. It doesn't mean that there actually is debug info -- it just means that gdb tried to read it. This information isn't actually all that useful to ordinary users.
It's surprisingly hard to find out if you actually have debug info. One way to do it is to use readelf -WS on your various files and look for the relevant debug sections (for DWARF, names such as .debug_info and .debug_line). This will tell you if it exists.
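As a rough illustration of the readelf check, the following Python sketch reads an ELF file's section header table and lists the sections that usually carry debug info. The function names (elf_section_names, debug_sections) are invented for this example, and it assumes a well-formed ELF file; it validates nothing beyond the magic number.

```python
import struct

def elf_section_names(data):
    """Return the section names of an ELF image given as bytes."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    if is64:
        ehdr = struct.unpack_from(endian + "HHIQQQIHHHHHH", data, 16)
        sh_fmt = endian + "IIQQQQIIQQ"
    else:
        ehdr = struct.unpack_from(endian + "HHIIIIIHHHHHH", data, 16)
        sh_fmt = endian + "IIIIIIIIII"
    # e_shoff, e_shentsize, e_shnum, e_shstrndx
    shoff, shentsize, shnum, shstrndx = ehdr[5], ehdr[10], ehdr[11], ehdr[12]
    headers = [struct.unpack_from(sh_fmt, data, shoff + i * shentsize)
               for i in range(shnum)]
    if not headers:
        return []
    strtab_off = headers[shstrndx][4]      # sh_offset of the name string table
    names = []
    for sh in headers:
        start = strtab_off + sh[0]         # sh_name is an offset into that table
        end = data.index(b"\0", start)
        names.append(data[start:end].decode("ascii", "replace"))
    return names

def debug_sections(data):
    """Names that hint at debug info being present or stored elsewhere."""
    return [n for n in elf_section_names(data)
            if n.startswith((".debug_", ".zdebug_")) or n == ".gnu_debuglink"]
```

For a hypothetical libfoo.so you would call debug_sections(open("libfoo.so", "rb").read()). An empty list means the file carries no debug info of its own; a lone .gnu_debuglink entry suggests the debug info lives in a separate file.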
However! Even this isn't enough. Maybe a section is missing (it is unlikely unless you have been mucking around with the files). Or maybe gdb rejected some part of the debug info (also reasonably unlikely).
Another possibility is that you have separated the debug info from the libraries. This is typical in distros. In this case you have to make sure to install the debug info packages -- in Fedora you can do this with debuginfo-install; presumably there are similar methods on other distros.
In the separate debug info case you have to be sure to install exactly the same versions of the files that were used by the process that made the core. This can be difficult. Sometimes it can be done by inspecting the build ids, but this isn't always possible, as distros frequently purge out-of-date versions of the files.
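To make the build-id comparison concrete, here is a small Python sketch that pulls the build ID out of the raw bytes of a .note.gnu.build-id section, so two files can be compared. The function name and the little_endian parameter are assumptions made for this example; in practice readelf -n prints the same value.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type used for the GNU build ID

def _align4(n):
    return (n + 3) & ~3

def parse_gnu_build_id(note, little_endian=True):
    """Return the build ID as a hex string, or None if the note is absent.

    `note` is the raw content of a .note.gnu.build-id section.
    """
    fmt = ("<" if little_endian else ">") + "III"
    offset = 0
    while offset + 12 <= len(note):
        namesz, descsz, ntype = struct.unpack_from(fmt, note, offset)
        name_off = offset + 12
        desc_off = name_off + _align4(namesz)   # name is padded to 4 bytes
        name = note[name_off:name_off + namesz]
        desc = note[desc_off:desc_off + descsz]
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\0":
            return desc.hex()
        offset = desc_off + _align4(descsz)     # descriptor is padded too
    return None
```

If the value from the library that was loaded when the core was made differs from the value in the debug info file you installed, the two do not belong together.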
If the libraries in question are your libraries, then you have to recompile them with -g and then try to recreate the core. There's really no reliable way to generate the necessary debug info after the fact.
If the core was created on some other machine, you can try to find those files and install them locally. You can install them pretty much anywhere and use the set sysroot feature to tell gdb how to find them.
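As a sketch of the sysroot approach, this Python helper builds a batch-mode gdb command line that sets the sysroot before loading the executable and the core, then prints a backtrace. The helper name and the example paths are hypothetical; set sysroot, file, core-file and bt are ordinary gdb commands, and -batch and -ex are ordinary gdb options. Paths containing spaces would need extra quoting, which this sketch does not attempt.

```python
import subprocess

def gdb_backtrace_command(exe, core, sysroot=None):
    """Build an argv list that makes gdb print a backtrace from a core."""
    cmd = ["gdb", "-batch"]
    if sysroot:
        # Set the sysroot first so the libraries are looked up under it
        # when the core file is loaded.
        cmd += ["-ex", f"set sysroot {sysroot}"]
    cmd += ["-ex", f"file {exe}", "-ex", f"core-file {core}", "-ex", "bt"]
    return cmd

def run_backtrace(exe, core, sysroot=None):
    """Run gdb and return whatever it printed (requires gdb on PATH)."""
    result = subprocess.run(gdb_backtrace_command(exe, core, sysroot),
                            capture_output=True, text=True)
    return result.stdout
```

For example, run_backtrace("./myprog", "core.1234", sysroot="/srv/target-root") would look for the shared libraries under /srv/target-root instead of the local filesystem root.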

Related

GDB: How can I find the in-memory total size of loaded symbols (msymbols, psymbols, symbols)?

Our dev environment is configured in such a way that when we run the debug version of our code, it breaks into gdb on a crash or ^C. With some recent changes this is not happening anymore (exiting the program instead of breaking into gdb), and I'm suspecting the increase in symbols size is causing this issue.
Is there a way to find the sizes of msymbols, psymbols & symbols (memory consumption of symbols when they are loaded into the gdb session)?
Also, is there a way to limit the memory used for symbols in gdb? Google suggests that HP's version supports such a feature, and that with other versions the only way out is to disable automatic loading of shared library symbols and load them on demand. What would it take to have HP-like support on, say, FreeBSD?
Thank you.
There is no way to get that information directly. You could add it pretty easily, but I personally wouldn't bother.
Your report isn't really detailed enough to understand what is going on. However I tend to doubt the behavior you are seeing is caused by gdb's size.
You can disable automatic loading of shared library symbols with set auto-solib-add off, and then load the ones you need on demand with the sharedlibrary command.

Executable runtime crash caused by linking with dynamic library

Stack: MIPS, Linux, C, C++ using GNU Tools to compile and link (building on x86 for MIPS)
Fair warning: I'm a C, C++ novice, feel free to suggest anything which might be obvious as it's possible I have not tried it yet.
I am able to build an executable that dynamically links to a library (live555). If I link it statically, everything works fine, but when I link dynamically the executable crashes at runtime. To confirm that I am building the .so files correctly, I've also built other executables (the test tools included with live555) against these .so libs, and those tools work fine.
The build and link steps seem to work fine; no errors or warnings are reported. I can inspect the crashing executable with readelf -d and clearly see the .so references. I can also run ldd on the executable on the MIPS system, and the libraries seem to be found; strace output also shows these libraries being loaded. Unfortunately the strace output doesn't give me any insight. I've talked with others familiar with this system and they are not sure what the problem is.
I'm just looking for ideas and tools to try; if anyone has any thoughts, I'd appreciate them!
Thanks for reading
There is not enough information here to start troubleshooting in depth. Some ideas to start debugging, from least to most time-consuming:
After you run ldd on your executable, check the path(s) the library is being loaded from and make sure it is the version you compiled and linked against. An easy way is to compute its MD5 hash on both the target and the host and make sure they are the same.
Also check that you don't have multiple copies of the library installed.
Double-check the aliases for your library and make sure they point to the same place.
Try enabling core dump generation ($> ulimit -c unlimited), then run gdb or DDD, load the core dump, and inspect your environment.
Check your CFLAGS; it could be, as @YannRamin said, that you need -fPIC for MIPS. You can run make -n to see how your binary is being generated.
Check the LD_LIBRARY_PATH environment variable on the target and make sure it is sensible; empty is perfectly fine, by the way.
Check your LDFLAGS during compile / linking. You might have to run make -n, look for the gcc (or collect2) command, then copy-paste the entire line and add --verbose to the end so you can see exactly what the linker is doing. You might have to fix paths for sources / object files, depending on how your build system is set up.
The idea is to try and eliminate potential issues, such as:
wrong library version: installed vs compiled against
multiple locations / bad aliasing
symbol pollution when compiling / linking
many others
You're lucky that you have Linux installed, so this should be fairly easy; it just might be time-consuming.
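The MD5 comparison suggested above can be sketched in Python as follows. The function names are made up for this example. On a small MIPS target without Python, running md5sum on the file and comparing the printed hash against the host's copy achieves the same thing.

```python
import hashlib

def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_file_contents(path_a, path_b):
    """True if both files hash to the same MD5 value."""
    return file_md5(path_a) == file_md5(path_b)
```

A mismatch between the library you linked against on the host and the one ldd resolves on the target points to the "wrong library version" case in the list above.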

What is "system-supplied DSO" that gdb references?

I'm running gdb with set verbose on and I'm trying to understand one of the messages I am getting:
Reading symbols from system-supplied DSO at 0x7ffff7ffb000...(no debugging symbols found)...done.
What is the system-supplied DSO? After some searching I think that DSO might stand for "dynamic shared object". But I still don't understand exactly what gdb is doing here and how I might solve the problem of the debugging symbols not being found (or whether it even matters).
Also, the program that I am debugging is compiled with llvm-gcc and has an LLVM pass applied to it. I think that is affecting the behavior of gdb, but I'm not exactly sure how.
So essentially my question is what does the message that gdb prints mean, is it likely to cause a problem, and if so any suggestions on how I could help gdb find the debugging symbols.
According to this document a DSO is:
A dynamic shared object (DSO) is an object file that’s meant to be
used simultaneously— or shared—by multiple applications (a.out files)
while they’re executing.
I believe that a system-supplied DSO is just a DLL provided by the OS and loaded by the main executable. Since this is an external library, you don't have its debugging symbols unless you download them separately. Release binaries are typically stripped of debugging symbols, but they can carry a link to a separate file. A typical Linux distribution provides a package containing the debugging symbols for such binaries (like the xxx-debuginfo-xxx.rpm packages on Red Hat-based distributions).
In this context, "system-supplied DSO" means a shared library provided directly by the Linux kernel, such as the vDSO. Debuginfo is indeed available for it, but it is packaged with the kernel rather than with userspace. Use debuginfod to fetch it automatically if your distro supports that.
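One way to see that the address in gdb's message belongs to the kernel-provided vDSO is to look for the [vdso] entry in the process's memory map. The Python sketch below parses text in the /proc/<pid>/maps format; the function name is invented, and the sample line in the comment is hypothetical, built from the address in the gdb message.

```python
def find_vdso(maps_text):
    """Return (start, end) of the [vdso] mapping, or None if not present.

    `maps_text` is the content of /proc/<pid>/maps on Linux.
    """
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "[vdso]":
            start, end = (int(x, 16) for x in fields[0].split("-"))
            return start, end
    return None

# Hypothetical line of the kind you might see for the process in question:
# 7ffff7ffb000-7ffff7ffd000 r-xp 00000000 00:00 0    [vdso]
```

On a Linux system, find_vdso(open("/proc/self/maps").read()) reports the mapping for the Python process itself. If the start address matches the one gdb prints, the "system-supplied DSO" is the vDSO and the missing debug symbols are harmless for debugging your own code.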

Read debugging information at runtime from an application

I have some questions regarding debugging symbols and what can be done with them besides, well, debugging. I'm mostly interested in answers regarding GCC, but I'd also be happy to know what it looks like under other compilers, including MSVC.
First of all:
What are the common formats/types of debugging symbols?
How do they relate to compilers and platforms? Is it always the same format on GCC and MinGW among platforms?
Can I check in runtime whether the build has them and what format are they in?
And some more practical questions... How can I:
Check the current file and line number?
Obtain the (qualified) function name being executed?
Obtain a full current stack trace?
Let me emphasize that I'm talking about run-time checks. All of those can be read and pretty-printed by GDB, but I don't know how much info comes from the debugging symbols themselves and how much from the source code which GDB also has access to.
Maybe there's a library which is able to parse the debugging symbols and yield such information?
Are the debugging symbols standardised well enough that I can expect some degree of portability for such solutions?
What are the common formats/types of debugging symbols?
DWARF and STABS (those are embedded inside executable, in special sections), Program Database (PDB; external file, used by MSVC).
How do they relate to compilers and platforms? Is it always the same format on GCC and MinGW among platforms?
GCC uses DWARF/STABS (I think it's a GCC compile-time option) both on Linux (ELF) and Windows (PE), don't know about others. MSVC always uses PDB.
Can I check in runtime whether the build has them and what format are they in?
You can parse the executable image and see if there are sections with debugging info (see STABS documentation and DWARF specs). PDB files are distributed either with executables or via symbol servers (so if you don't want to go online, check if there is X.pdb for X.exe/X.dll).
About how to read and use those symbols — I don't know about DWARF/STABS (there's probably something around GNU binutils that can locate and extract those), but for PDB your best bet is to use dbghelp — its usage is pretty well documented and there are a lot of examples available on the net. There's also DIA SDK that can be used to query PDB files.
Are the debugging symbols standardised well enough that I can expect some degree of portability for such solutions?
DWARF has a formal specification, and it's complicated as hell. PDB AFAIK is not documented, but dbghelp/DIA are, and are the recommended way.

How to analyze and debug gdb core with no symbols, using registers and raw stack

At a customer site, a piece of third-party software has crashed. The process and the libraries are stripped (no symbols), so the call stack does not give any useful information. All that I have is the registers, which may not be corrupted. This third-party code is written in C.
So far I have used gdb only to debug simpler issues, and this one is a bit more complicated. I think the register and raw stack information could be used to correlate where the crash occurred, and I need help with that.
It may not be possible to deploy a non-stripped binary at customer site, nor would it be possible to do inhouse crash reproduction. Also, I am not familiar with this third party code.
Also I require pointers/sites/documents for the following:
1) ELF and various section headers.
2) How to create a symbol file (during compilation) for a library and a process.
3) How to tell gdb to read symbols from a symbol file.
One thing you should be able to do is open your core file against a non-stripped (with symbols) version of your process. As long as the compilation process (compiler, optimization flags, etc.) is the same and you simply keep all the debugging information, GDB should be able to give you all the information you can expect from a core.
gdb [options] executable-file core-file
To compile your process with debugging information (symbols, plus DWARF data for lines, types, ...), you need to add -g to your compiler flags. The same applies to your custom libraries.
For system libraries it can sometimes (not always) be more convenient: modern Linux distributions (at least Fedora) make their debugging information available to gdb directly.