In one of my C++-projects I found an issue related to a linked library which results in a segfault directly after starting the compiled executable. I tried to dive into the issue using gdb, but it fails with the output:
../../gdb/dwarf2/read.c:1857: internal-error: bool dwarf2_per_objfile::symtab_set_p(const dwarf2_per_cu_data*) const: Assertion `per_cu->index < this->m_symtabs.size ()' failed.
After it is an internal error I am not able to do much (except reporting it), but I still would like to be able to debug the program itself. Which options do I have for that?
Use of a different debugger than gdb?
Manual update to a newer version of gdb (currently on 10.1)?
Somehow catch the segfault before it is damaging the debugger?
?
Most likely this is already fixed gdb bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28160. It has Target Milestone 11.1, so update to a latest version of gdb which is 11.1 now. It should be fixed in that version.
Related
Whenever I try to debug TensorFlow's C++ code with Eclipse + GDB, I get GDB crashing, or actually exiting with: error code = -1.
As long as I don't set a breakpoint in TensorFlow's C++ code, the program runs just fine. But when I do, and when the debugger gets to the breakpoint, it crashes after a few seconds with error code -1. There is nothing meaningful in the GDB traces which can explain this behavior.
The GDB version I am using is 7.7.1, running with Eclipse Neon under Ubuntu 14.04.
TensorFlow is compiled in debug mode. I don't think that Eclipse is missing the debug symbols for it, as it is not complaining that those are missing (and also, occasionally, the debugger is able to step through a few steps in the code before it crashes).
An easy way to reproduce is to try and debug the label_image example:
https://www.tensorflow.org/versions/r0.11/tutorials/image_recognition/index.html
Compile it and then create a 'C/C++ Application' debug configuration in Eclipse, directing it to the compiled binary of the label_image app.
I've encountered the same problem with GDB on macOS. But, I finally succeeded to debug tf with lldb. And I also found that using VisualStudio Code + lldb makes it easy for debugging.
Here is my way of debugging. Maybe you can give it a try.
I have a program that I have written in C++ under linux (Ubuntu 10.10).
The programming and debugging worked perfectly until the moment I added the following lines to the code:
mapfile = fopen(map_filename,"wb");
fwrite(map_header,1,20,mapfile); // <-- this is the problem line
fclose(mapfile);
After I added those, the program compiles ok, but the debugger now won't start. It immediately fails with this message:
Program completed, Exit code 0x177
error while loading shared libraries: unexpected PLT reloc type 0xcc
And if I remove the line with the "fwrite", the debugger will start normally.
This problem only happends inside Netbeans.
When I debug it using the command-line "gdb" it also works ok without any problems.
Anyone have idea why its happening and how to fix it?
P.S: Those problems started recently so I assume maybe it has to do something with system updates, I'm not sure.
Found the problem:
Not long ago, I removed some old C++ projects from netbeans. It figures out that netbeans (at least v7.0) remembers all the breakpoints that I put on old projects that don't even exist in the IDE anymore.
I found this by looking at the Debugger Console (Window->Debugging->Debugging Console) and seeing that when "gdb" starts, it tries to setup all these breakpoints from other projects or from projects that do not exists (this is a bug in netbeans, btw)
The solution: I simply cleaned all the breakpoints (inside Window->Debugging->Breakpoints) and now the program can be debugged properly.
Hope this will help to anyone out there who has the similar problem.
A shared object was built on RedHat Linux and while all the code was compiled with debug, the debugger (gdb) refused to load the symbols and issued an error as in:
...
GNU gdb Fedora (6.8-37.el5)
...
This GDB was configured as "x86_64-redhat-linux-gnu"...
Dwarf Error: wrong version in compilation unit header (is 4, should be 2) [in module libgrokf.so]
With this error, I could not get break points to trigger in any function nor see proper stack trace. I recompiled the entire project but nothing helped. I do know that some time in the past there was no problem in debugging that module.
What is causing this problem?
The problem is that your version of gdb doesn't support the DWARF version used in one of your binaries.
The solution: Update gdb or compile your files using another debug format (DWARF2 works on gdb 6).
I have recently had this problem with freeBSD and nasm, nasm compiling binaries with DWARF3 and the gdb that ships with freeBSD 9.1 doesn't accept it.
I hope this answer helps anyone having a similar problem :P
Debug options for GCC
As it happens, the module that could not debug was mostly built from sources except for one little 'external' object file someextcode.o that was provided by a 3rd party.
In investigating the problem it was found that the someextcode.c was compiled with the -g3 flag which, apparently, places DWARF version of 4 in the compilation unit header. Changing that to -g resolved the problem.
Unfortunately, it appears a problem with a single module can break the debug-ability of an entire shared object (.so) without giving a clear indication of root of the problem.
My issue got resolved by choosing the right gdb version for debugging. Earlier I was using the gdb 7.0... and when I started using the gdb version 7.10, i was able to debug my application.
I have a problem with my C++ application. It was developed on a 32bit pc, on Microsoft Visual Studio 2008, and now I am trying to run it on a 64bit pc.
On my 32bit pc it works fine; on the 64bit pc, Visual Studio does not give any compilation problem, but then on execution gives wrong results.
And I have undestood why.
In the code, I define a variable, of tipe "dag", that is a structure for a direct acyclic graph. By debugging the software, I noticed that, although I declared it, later the software is not able to insert data in it, and the debugger says:
CXX0017: Error: symbol "dags" not found
Here's my code:
Dag<int64_t>* dags = new Dag<int64_t>();
dags = getDagsFromRequest2(request, dags);
The very strange thing is that, if I follow the flow inside getDagsFromRequest2() function, I can clearly see that dags variable is full of data: on "quickwatch", it shows 2342 nodes inside it. But when I come back from getDagsFromRequest2() function to this part of the code, debugger says "CXX0017: Error: symbol "dags" not found". How is it possible?
You can also see this screenshot from my Visual Studio debug set.
What could be the problem?
Thanks a lot
There are a few possibilities to consider:
Running in Release builds. Switch to a Debug build.
Using a Debug build that has optimizations enabled and/or debug information disabled. Disable the optimizations and enable the debug information (look in another project for the relevant settings).
A corrupt build of some sort. Clean and rebuild the entire solution.
Memory corruption which is preventing the debugger from displaying the variable. Ensure that no memory issues exist with a tool like Valgrind.
A VS bug. This report for VS2010 seems to suggest a known bug with similar characteristics for example. Ensure all patches and hotfixes for VS2008 are installed.
The variable dags is defined as your code compiles. The error you see is simply related to the debugger. I am guessing it is caused by running the application in Release mode which sometimes causes confusing and wrong watches values. Try changing the mode to debug(there is a drop down from which you can choose the build mode).
EDIT: as you say you are running in Debug mode, my next guess is that this behavior could be caused by stack corruption. Try using valgrind to detect if that is the case. It may take a while to start with it,but it is worth it and will detect if you have some memory corruption.
Does anyone know a good debugger for C++ segmentation errors on the Linux environment? It would be good enough if the debugger could trace which function is causing the error.
Also consider some techniques that do require from you code changes:
Run your app via valgrind memcheck tool. It's possible to catch error when you access wrong address (e.g. freed pointer, not initialized) - see here.
If you use extensievly stl/boost, consider compiling with -D_GLIBCXX_DEBUG and -D_GLIBCXX_DEBUG_PEDANTIC (see here). This can catch such errors as using invalidated iterator, accessing incorrect index in vector etc.
tcmalloc (from google per tool). When linking with it's debug enabled version, it may find memory related problems
Even more ...
GDB! what else is available on Linux?
Check this out for starting up with GDB, its a nice, concise and easy to understand tutorial.
GDB is indeed about the only choice. There are some GUI's but they are allmost all wrappers for gdb. Finding a segfault is easy. Make sure you compile with -g -O0 then start gdb with your program as argument.
In gdb type run
To start your program running, gdb will stop it is soon as it hits a segfault and report on which line that was. If you need a full backtrace then just type bt. To get out of gdb enter quit.
BTW gdb has a build in help, just type help.