freeglut segfaults, no useful info from gdb - opengl

In this program I'm writing, I've been using freeglut and, generally, it has been working. However, sometimes when there is some issue in the program that often has nothing to do with rendering at all, I get a segfault at glutInit() and no explanation from GDB.
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7942409 in glutInit () from /usr/lib/x86_64-linux-gnu/libglut.so.3
Backtrace:
#0 0x00007ffff7942409 in glutInit () from /usr/lib/x86_64-linux-gnu/libglut.so.3
#1 0x0000000000415d4c in initGL () at ../gfx/render.cpp:62
#2 0x00000000004035f3 in main () at battle.cpp:49
Running with rendering disabled produces no errors.
So, I am wondering what I need to do to get more information on these failures. Can I get the backtrace to look inside libglut.so.3?
(As an aside, any recommendations for a more reliable toolkit than freeglut are appreciated.)

Can I get the backtrace to look inside libglut.so.3?
You already have a backtrace that is looking inside libglut.so.3.
You need to either
install debug symbols for it (sudo apt-get install freeglut3-dbg or some such), or
compile libglut.so from source, or
debug at assembly level: x/i $pc, disas, info registers, etc.

Related

gdb only finds some debug symbols

So I am experiencing this really weird behavior of gdb on Linux (KDE Neon 5.20.2):
I start gdb and load my executable using the file command:
(gdb) file mumble
Reading symbols from mumble...
As you can see, it did find debug symbols. Then I start my program (using start), which causes gdb to pause at the entry to the main function. At this point I can also print the backtrace using bt and it works as expected.
If I now continue my program and interrupt it at any point during startup, I can still display the backtrace without issues. However, if I do something in my application that happens in a thread other than the startup thread (startup all happens in thread 1) and interrupt my program there, gdb is no longer able to display the stack trace properly. Instead it gives
(gdb) bt
#0 0x00007ffff5bedaff in ?? ()
#1 0x0000555556a863f0 in ?? ()
#2 0x0000555556a863f0 in ?? ()
#3 0x0000000000000004 in ?? ()
#4 0x0000000100000001 in ?? ()
#5 0x00007fffec005000 in ?? ()
#6 0x00007ffff58a81ae in ?? ()
#7 0x0000000000000000 in ?? ()
which shows that it can't find the respective debug symbols.
I compiled my application with cmake (gcc) using -DCMAKE_BUILD_TYPE=Debug. I also verified that a bunch of debug symbols are present in the binary using objdump --debug mumble (which also printed a few lines of objdump: Error: LEB value too large, but I'm not sure whether this is related to the problem I am seeing).
While playing around with gdb, I also encountered the error
Cannot find user-level thread for LWP <SomeNumber>: generic error
a few times, which leads me to suspect that there may indeed be some issue involving threads here...
Finally, I tried running set verbose on in gdb before loading my binary, which yields
(gdb) set verbose on
(gdb) file mumble
Reading symbols from mumble...
Reading in symbols for /home/user/Documents/Git/mumble/src/mumble/main.cpp...done.
This also looks suspicious to me, as only main.cpp is explicitly listed here (even though the project has many more source files). I should also note that all the successful backtraces I am able to produce (as described above) originate from main.cpp.
I am honestly a bit clueless as to what might be the root issue here. Does someone have an idea what could be going on? Or how I could investigate further?
Note: I also tried using clang as a compiler but the result was the same.
Used program versions:
cmake: 3.18.4
gcc: 9.3.0
clang: 10.0.0
make: 4.2.1

How to debug DPDK libraries to diagnose segmentation fault?

I am working with DPDK version 18.11.8 stable on Linux, using a gcc x64 build.
At runtime I get a segmentation fault. Running gdb on the core dump gives this backtrace:
#0 0x0000000000f65680 in rte_eth_devices ()
#1 0x000000000048a03a in rte_eth_rx_burst (nb_pkts=7,
rx_pkts=0x7fab40620480, queue_id=0, port_id=<optimized out>)
at
/opt/dpdk/dpdk-18.08/x86_64-native-linuxapp-gcc/include/rte_ethdev.h:3825
#2 Socket_poll (ucRxPortId=<optimized out>, ucRxQueId=ucRxQueId@entry=0
'\000', uiMaxNumOfRxFrm=uiMaxNumOfRxFrm@entry=7,
pISocketListener=pISocketListener@entry=0xf635d0 <FH_gtFrontHaulObj+16>)
at /data/<snip>/SocketClass.c:2188
#3 0x000000000048b941 in FH_perform (args_ptr=<optimized out>) at
/data/<snip>/FrontHaul.c:281
#4 0x00000000005788e4 in eal_thread_loop ()
#5 0x00007fab419fae65 in start_thread () from /lib64/libpthread.so.0
#6 0x00007fab4172388d in clone () from /lib64/libc.so.6
So it seems that rte_eth_rx_burst() calls rte_eth_devices () and that function crashes, presumably because of an illegal memory access. Possibly a hugepages problem?
I want to enable more debug info in DPDK. I am building DPDK using:
usertools/dpdk-setup.sh
Am I correct in thinking that the build commands in that script use make and I should modify the appropriate:
config/defconfig_*
file (defconfig_x86_64-native-linuxapp-gcc in my case) ?
If so, would these values be appropriate?
CONFIG_RTE_LIBRTE_ETHDEV_DEBUG=y
RTE_LOG_LEVEL=RTE_LOG_DEBUG
RTE_LIBRTE_ETHDEV_DEBUG=y
(not sure whether all values should be prefixed by 'CONFIG_'?)
I tried building DPDK using:
$ export EXTRA_CFLAGS='-O0 -g'
$ make install T=x86_64-native-linuxapp-gcc
but that gave no extra info in the backtrace.
EDIT: The error has been identified and fixed; the application now runs without crashing.
Using the dpdk-debug chat room, we were able to rebuild the libraries and the application with the proper CFLAGS. With gdb we identified the probable cause: rte_eth_rx_burst was not being passed a pointer array for the mbufs.
Based on the GDB details for frame 1, it looks like the application was not built with EXTRA_CFLAGS (assuming you are using the DPDK example Makefile). The right way to build a DPDK application for debugging is to follow these steps:
cd [dpdk target folder]
make clean
make EXTRA_CFLAGS='-O0 -ggdb'
cd [application folder]
make EXTRA_CFLAGS='-O0 -ggdb'
Then use GDB in TUI or non-TUI mode to analyze the error.
Notes:
One of the most common mistakes I make with rx_burst is passing *mbuf_array instead of **mbuf_array as the argument.
If a custom Makefile is used for the application, pass the EXTRA_CFLAGS values as CFLAGS+="-O0 -ggdb".

How do I debug an executable with gdb when it crashes during startup?

I have a C-and-C++-based project I just got to build and link for the first time, and it segfaults on execution. I tried running it in gdb to get a backtrace, and saw this:
(gdb) run
Starting program: /home/jon/controlix-code/bin/controlix
During startup program terminated with signal SIGSEGV, Segmentation fault.
(gdb) bt
No stack.
(gdb)
I assume it is crashing before main() is called, but beyond that I don't have a clue. I haven't been able to find much about this type of situation on Google, so I thought I'd ask here.
One approach is to catch all exceptions before running:
catch throw
run
And if that does not help, you may have to single-step through the assembly from the very beginning. But before you do that, try
break main
run
Single-stepping through the code with step and next should then lead you to the culprit.

Gdb cannot find assertion failure positions after recompiling

It seems that gdb fails to find the code position of an assertion failure after I recompile my code. More precisely, I expect the position of the signal raised by an assertion failure to be
0x00007ffff7a5ff00 in raise () from /lib64/libc.so.6
while instead I obtain
0x00007ffff7a5ff00 in ?? ()
For instance, consider the following code
#include <assert.h>

int main()
{
    assert(0);
    return 0;
}
compiled with debug symbols and debugged with gdb.
> gcc -g main.c
> gdb a.out
On the first run of gdb, the position is found, and the backtrace is reported correctly:
GNU gdb (Gentoo 8.0.1 p1) 8.0.1
...
(gdb) r
Starting program: /home/myself/a.out
a.out: main.c:5: main: Assertion `0' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff7a5ff00 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff7a5ff00 in raise () from /lib64/libc.so.6
#1 0x00007ffff7a61baa in abort () from /lib64/libc.so.6
#2 0x00007ffff7a57cb7 in ?? () from /lib64/libc.so.6
#3 0x00007ffff7a57d72 in __assert_fail () from /lib64/libc.so.6
#4 0x00005555555546b3 in main () at main.c:5
(gdb)
The problem comes when I recompile the code. After recompiling, I issue the run command in the same gdb instance. Gdb re-reads the symbols, starts the program from the beginning, but does not find the right position:
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
`/home/myself/a.out' has changed; re-reading symbols.
Starting program: /home/myself/a.out
a.out: main.c:5: main: Assertion `0' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff7a5ff00 in ?? ()
(gdb) bt
#0 0x00007ffff7a5ff00 in ?? ()
#1 0x0000000000000000 in ?? ()
(gdb) up
Initial frame selected; you cannot go up.
(gdb) n
Cannot find bounds of current function
At this point the debugger is unusable. One can neither go up nor step forward.
As a workaround, I can manually reload the file, and positions are found again.
(gdb) file a.out
Load new symbol table from "a.out"? (y or n) y
Reading symbols from a.out...done.
(gdb) r
Starting program: /home/myself/a.out
a.out: main.c:5: main: Assertion `0' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff7a5ff00 in raise () from /lib64/libc.so.6
(gdb)
Unfortunately, after reloading the file this way, gdb fails to reset the breakpoints.
ERRATA CORRIGE: I was experiencing failure in resetting the breakpoints using gdb 7.12.1. After upgrading to 8.0.1 the problem vanished. Supposedly, this was related to the bugfix https://sourceware.org/bugzilla/show_bug.cgi?id=21555. However, code positions where assertions fail still cannot be found correctly.
Does anybody have any idea about what is going on here?
This started happening after a system update. The system update recompiled all system libraries, including glibc, as position-independent code, i.e., compiled with -fPIC.
Also, the version of gcc I am using is 6.4.0.
Here is a workaround. Since file re-reads the symbols correctly while run does not, we can define a hook for the run command so that file is executed first:
define hook-run
pi gdb.execute("file %s" % gdb.current_progspace().filename)
end
After you change the source file and recompile, you are generating a different file from the one loaded into GDB.
You need to stop the running debug session and reload the file.
You can't carry the previously defined breakpoints and watchpoints over to a changed source, since gdb actually patches the loaded program's code to support breakpoints.
If you change the source, the behavior is undefined and you need to reset those breakpoints.
You can refer to the gdb manual regarding saving breakpoints in a file, as Mark Plotnick suggested, but in my experience it won't work if you change the file:
https://sourceware.org/gdb/onlinedocs/gdb/Save-Breakpoints.html

Debugging a haxe based ndk app on android with gdb

I am trying to debug an OpenFL/Haxe app on Android with gdb. Since Haxe/OpenFL compiles to C++, which is then compiled using the NDK, I am basically trying to debug an NDK app.
I got gdbserver attached to the app's process and can debug it remotely using arm-linux-androideabi-gdb. But as soon as I try to get a backtrace (the app is running fine at this moment), I get this:
#0 0xb6e70b10 in ?? ()
#1 0xb6e4833c in ?? ()
#2 0xb6e4833c in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
I understand that this can happen when the stack has been corrupted by bad pointers, but since the app is running fine (I am actually looking for a runtime error, not a crash), that is not the case here.