GDB doesn't show function names - c++

I am debugging from an embedded device using gdbserver:
./gdbserver HOST:5000 /home/test_app
On my PC, I run gdb like this:
arm-none-linux-gnueabi-gdb test_app
Once the application is running, I get the segfault I want to debug, but it is impossible to tell which line produced it:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 715]
0x31303030 in ?? ()
(gdb) bt
#0 0x31303030 in ?? ()
#1 0x0000dff8 in ?? ()
#2 0x0000dff8 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(I must say I'm totally new to GDB)

This usually happens when debug symbols are missing. To make sure, run the following commands:
file <your_executable>
This prints information about your binary, such as its format and architecture. The last part of the output says whether the binary is stripped. For debugging in GDB, the binary must not be stripped.
nm --debug-sym <your_executable> | grep debug
If you get output like the following, debug symbols are present.
00000000 N .debug_abbrev
00000000 N .debug_aranges
00000000 N .debug_frame
00000000 N .debug_info
00000000 N .debug_line
00000000 N .debug_loc
00000000 N .debug_pubnames
00000000 N .debug_str
Further, when you invoke GDB you should see the following line:
Reading symbols from <your_executable>...done.
At this point you should be able to list sources with the list command.
Make sure gdb and gdbserver are the same version.
arm-none-linux-gnueabi-gdb --version
./gdbserver --version
If all of the above hold and you still don't get a backtrace, something is corrupting your stack. Try running static analysis or valgrind on your code, especially on newly added code.
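One hint that the stack really was overwritten is visible in the question itself: the faulting address 0x31303030 consists entirely of printable ASCII bytes. The sketch below decodes a 32-bit address into the characters it would spell if a string had been written over a saved return address. The helper name address_as_ascii is made up for illustration, and little-endian byte order is assumed.

```cpp
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical helper: interpret a 32-bit address as four bytes in
// little-endian memory order and return them as text. Non-printable
// bytes are shown as '.', so an ordinary code address mostly yields dots.
std::string address_as_ascii(std::uint32_t addr) {
    std::string out;
    for (int i = 0; i < 4; ++i) {
        unsigned char byte = static_cast<unsigned char>((addr >> (8 * i)) & 0xFF);
        out += std::isprint(byte) ? static_cast<char>(byte) : '.';
    }
    return out;
}
```

address_as_ascii(0x31303030) yields "0001", which would be consistent with text (for example a numeric string) having been copied past the end of a stack buffer. That is a hypothesis to check, not a diagnosis.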

You need to build your application with debug symbols enabled. The switch for gcc is -g

For others: if nm --debug-sym <your_executable> | grep debug prints the debug symbols but you do not get them in gdb, this might be because you are opening a core in gdb using an executable that is different from the one that generated the core.

You will need to include -g for every translation unit. For example, if you have a bunch of object files that are linked to build your final executable, you will need to include -g in each compilation command.
g++ -g file1.cpp -c -o file1.o
g++ -g file2.cpp -c -o file2.o
...
g++ -g file1.o file2.o -o main

Related

How do you find out the cause of rare crashes that are caused by things that are not caught by try catch (access violation, divide by zero, etc.)?

I am a .NET programmer who is starting to dabble into C++. In C# I would put the root function in a try catch, this way I would catch all exceptions, save the stack trace, and this way I would know what caused the exception, significantly reducing the time spent debugging.
But in C++ some things (access violation, divide by zero, etc.) are not caught by try catch. How do you deal with them, and how do you know which line of code caused the error?
For example let's assume we have a program that has 1 million lines of code. It's running 24/7, has no user-interaction. Once in a month it crashes because of something that is not caught by try catch. How do you find out which line of code caused the crash?
Environment: Windows 10, MSVC.
C++ is meant to be a high performance language and checks are expensive. You can't run at C++ speeds and at the same time have all sorts of checks. It is by design.
Running .Net this way is akin to running C++ in debug mode with sanitizers on. So if you want to run your application with all the information you can, turn on debug mode in your cmake build and add sanitizers, at least undefined and address sanitizers.
For Windows/MSVC it seems that address sanitizers were just added in 2021. You can check the announcement here: https://devblogs.microsoft.com/cppblog/addresssanitizer-asan-for-windows-with-msvc/
For Windows/mingw or Linux/* you can use Gcc and Clang's builtin sanitizers that have largely the same usage/syntax.
To set your build to debug mode:
cd <builddir>
cmake -DCMAKE_BUILD_TYPE=debug <sourcedir>
To enable sanitizers, add this to your compiler command line: -fsanitize=address,undefined
One way to do that is to add it to your cmake build so altogether it becomes:
cmake -DCMAKE_BUILD_TYPE=debug \
-DCMAKE_CXX_FLAGS_DEBUG_INIT="-fsanitize=address,undefined" \
<sourcedir>
Then run your application binary normally as you do. When an issue is found a meaningful message will be printed along with a very informative stack trace.
Alternatively, you can make the sanitizer trap so that the debugger (gdb) stops at the fault and you can inspect it live, but that only works with the undefined sanitizer. To do so, replace
-fsanitize=address,undefined
with
-fsanitize-undefined-trap-on-error -fsanitize-trap=undefined -fsanitize=address
For example, this code has a clear problem:
void doit( int* p ) {
*p = 10;
}
int main() {
int* ptr = nullptr;
doit(ptr);
}
Compile it in the optimized way and you get:
$ g++ -O3 test.cpp -o test
$ ./test
Segmentation fault (core dumped)
Not very informative. You can try to run it inside the debugger but no symbols are there to see.
$ g++ -O3 test.cpp -o test
$ gdb ./test
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
...
Reading symbols from ./test...
(No debugging symbols found in ./test)
(gdb) r
Starting program: /tmp/test
Program received signal SIGSEGV, Segmentation fault.
0x0000555555555044 in main ()
(gdb)
That's useless so we can turn on debug symbols with
$ g++ -g3 test.cpp -o test
$ gdb ./test
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
...
Reading symbols from ./test...
(gdb) r
Starting program: /tmp/test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
test.cpp:4:5: runtime error: store to null pointer of type 'int'
Program received signal SIGSEGV, Segmentation fault.
0x0000555555555259 in doit (p=0x0) at test.cpp:4
4 *p = 10;
Then you can inspect inside:
(gdb) p p
$1 = (int *) 0x0
Now, turn on sanitizers to get even more messages without the debugger:
$ g++ -O0 -g3 test.cpp -fsanitize=address,undefined -o test
$ ./test
test.cpp:4:5: runtime error: store to null pointer of type 'int'
AddressSanitizer:DEADLYSIGNAL
=================================================================
==931717==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x563b7b66c259 bp 0x7fffd167c240 sp 0x7fffd167c230 T0)
==931717==The signal is caused by a WRITE memory access.
==931717==Hint: address points to the zero page.
#0 0x563b7b66c258 in doit(int*) /tmp/test.cpp:4
#1 0x563b7b66c281 in main /tmp/test.cpp:9
#2 0x7f36164a9082 in __libc_start_main ../csu/libc-start.c:308
#3 0x563b7b66c12d in _start (/tmp/test+0x112d)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /tmp/test.cpp:4 in doit(int*)
==931717==ABORTING
That is much better!
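Separately from sanitizers, some checks can be opted into in the code itself, and those do surface as exceptions that try/catch can see. A minimal sketch, assuming the standard library's std::vector::at (the helper name checked_read is made up for illustration):

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: read v[index] with bounds checking.
// std::vector::at() throws std::out_of_range for a bad index, which a
// try/catch can handle; v[index] with a bad index is undefined behavior
// and is not reported as an exception.
bool checked_read(const std::vector<int>& v, std::size_t index, int& out) {
    try {
        out = v.at(index);
        return true;
    } catch (const std::out_of_range&) {
        return false;  // the check fired; the caller can log and recover
    }
}
```

This only covers accesses that go through the checked call. It does not turn a null pointer store like the one in doit() into an exception.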

gdb fails to insert a breakpoint even if compiled with -no-pie

I'm trying to get gdb working with C++ programs on Ubuntu 20.04. What I need is to be able to set a breakpoint (for example, break main.cpp:3 gdb command) and then run until the breakpoint, but at the moment both start and run fail because they "Cannot insert breakpoint" and "Cannot access memory". For me gdb fails even with very simple examples. This is main.cpp content:
#include <iostream>
int main() {
std::cout << "Hello World!";
return 0;
}
I found somewhere that using -no-pie might help to get gdb working (with breakpoints), so I compile the program by running g++ -ggdb3 -no-pie -o main main.cpp (I also tried -g instead of -ggdb3, and -fno-PIE in addition to -no-pie). When I try to use gdb, it complains "Cannot insert breakpoint 1":
gdb -q main
Reading symbols from main...
(gdb) start
Temporary breakpoint 1 at 0x1189: file main.cpp, line 3.
Starting program: /tmp/main
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x1189
Without -no-pie result is the same. Only thing that changes with or without -no-pie is the hexadecimal address, without -no-pie it is low like 0x1189 (as shown above), with -no-pie it can be 0x401176, but everything else exactly the same, I keep getting the "Cannot access memory at address" warning in both cases.
If I use starti instead of start, it works at first, but after a few nexti iterations it prints the usual "Cannot insert breakpoint" message:
gdb -q main
Reading symbols from main...
(gdb) starti
Starting program: /tmp/main
Program stopped.
0x00007ffff7fd0100 in ?? () from /lib64/ld-linux-x86-64.so.2
(gdb) nexti
0x00007ffff7fd0103 in ?? () from /lib64/ld-linux-x86-64.so.2
...
(gdb) nexti
Warning:
Cannot insert breakpoint 0.
Cannot access memory at address 0x4
0x00007ffff7fd0119 in ?? () from /lib64/ld-linux-x86-64.so.2
(gdb) nexti
0x00007ffff7fd011c in ?? () from /lib64/ld-linux-x86-64.so.2
...
(gdb) nexti
Warning:
Cannot insert breakpoint 0.
Cannot access memory at address 0x1c
0x000055555556ca22 in ?? ()
(gdb) nexti
[Detaching after fork from child process 3829827]
...
[Detaching after fork from child process 3829840]
Hello World![Inferior 1 (process 3819010) exited normally]
So I can use nexti, but cannot use next and obviously cannot insert breakpoints.
I tried -Wl,-no-pie (by running g++ -Wl,-no-pie -ggdb3 -o main main.cpp; adding -no-pie does not change anything) but this option causes a strange linker error:
/usr/bin/ld: cannot find -lgcc_s
/usr/bin/ld: cannot find -lgcc_s
collect2: error: ld returned 1 exit status
When I googled the error, I only found advice to try -no-pie instead of -Wl,-no-pie, and no other solutions. Since debugging C++ programs is a very common activity, I feel like I'm missing something obvious, but I have found no solution so far.
To make it easier to understand what exact commands I use and to make it clear I'm not mixing up directories and to show what versions of g++ and gdb I'm using, here is full terminal log:
$ ls
main.cpp
$ g++ --version | grep Ubuntu
g++ (Ubuntu 9.3.0-10ubuntu2) 9.3.0
$ g++ -ggdb3 -no-pie -o main main.cpp
$ ls
main main.cpp
$ gdb --version | grep Ubuntu
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
$ readelf -h main | grep 'Type: .*EXEC'
Type: EXEC (Executable file)
$ gdb -q main
Reading symbols from main...
(gdb) start
Temporary breakpoint 1 at 0x401176: file main.cpp, line 3.
Starting program: /tmp/main/main
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x401176
For completeness, I tried the same without -no-pie:
$ rm main
$ g++ -ggdb3 -o main main.cpp
$ readelf -h main | grep 'Type: .*'
Type: DYN (Shared object file)
$ gdb -q main
Reading symbols from main...
(gdb) start
Temporary breakpoint 1 at 0x1189: file main.cpp, line 3.
Starting program: /tmp/main/main
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x1189
As you can see the only difference with or without -no-pie is the memory address, but the issue and warnings are the same. Without -no-pie this may be expected, but I do not understand why this is happening if I compiled with -no-pie and what else I can try to solve the issue.
This:
g++ -ggdb3 -no-pie -o main main.cpp
should produce a non-PIE executable. You should be able to verify that it is non-PIE by looking at readelf -h main | grep 'Type: .*EXEC' (PIE binaries have ET_DYN type).
This:
Temporary breakpoint 1 at 0x1189: file main.cpp, line 3.
is unambiguously a PIE binary (a non-PIE binary will not have any code below 0x400000 on x86_64 Linux).
Conclusion: you are either debugging the wrong binary (e.g. you are compiling main in a different directory from the one in which you are debugging), or you are not telling us the whole story.
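The readelf check above looks at the e_type field of the ELF header. As a rough sketch of what that check amounts to, the function below classifies a binary from the first 18 bytes of its header. The helper name elf_type is made up for illustration; the field offsets and the values ET_EXEC = 2 and ET_DYN = 3 come from the ELF format.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: classify an ELF image from its first 18 bytes.
// e_type is the 16-bit field at offset 16; byte 5 (EI_DATA) gives the
// byte order (1 = little-endian, anything else is treated as big-endian).
std::string elf_type(const std::vector<std::uint8_t>& header) {
    if (header.size() < 18 || header[0] != 0x7f || header[1] != 'E' ||
        header[2] != 'L' || header[3] != 'F') {
        return "not ELF";
    }
    const bool little = (header[5] == 1);
    const std::uint16_t e_type = little
        ? static_cast<std::uint16_t>(header[16] | (header[17] << 8))
        : static_cast<std::uint16_t>((header[16] << 8) | header[17]);
    switch (e_type) {
        case 2: return "EXEC";  // ET_EXEC: fixed load address, non-PIE
        case 3: return "DYN";   // ET_DYN: shared object or PIE executable
        default: return "other";
    }
}
```

To use it on a real file, read the first 18 bytes of the binary you are actually passing to gdb and compare the result with what you expect from your compile flags.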

Why is "gdb" listing multiple functions after executing the "start" command even when the C++ source file doesn't contain any function?

The context
Consider the following file
$ cat main.cpp
int main() {return 0;}
I can list all the available functions by executing
$ g++ -g main.cpp && gdb -q -batch -ex 'info functions -n' a.out
All defined functions:
File main.cpp:
1: int main();
When executing start before executing info functions more than 1000 functions are listed (see below)
g++ -g main.cpp && \
gdb -q -batch -ex 'start' -ex 'info functions -n' a.out | \
head -n 10
Temporary breakpoint 1 at 0x111d: file main.cpp, line 1.
Temporary breakpoint 1, main () at main.cpp:1
1 int main() {return 0;}
All defined functions:
File /build/gcc/src/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/allocated_ptr.h:
70: void std::__allocated_ptr<std::allocator<std::_Sp_counted_ptr_inplace<std::filesystem::__cxx11::filesystem_error::_Impl, std::allocator<std::filesystem::__cxx11::filesystem_error::_Impl>, (__gnu_cxx::_Lock_policy)2> > >::~__allocated_ptr();
70: void std::__allocated_ptr<std::allocator<std::_Sp_counted_ptr_inplace<std::filesystem::filesystem_error::_Impl, std::allocator<std::filesystem::filesystem_error::_Impl>, (__gnu_cxx::_Lock_policy)2> > >::~__allocated_ptr();
As seen below, the total number of lines printed is 4436, so apparently more than 1000 functions are being listed
g++ -g main.cpp && gdb -q -batch -ex 'start' -ex 'info functions -n' a.out | wc -l
4436
The question
As we can see above, the main.cpp file does not contain any function, so why is gdb listing those functions when the start command has been executed before but not when start hasn't been executed?
Additional context
As suggested in one of the comments of this question, here's the output of executing info shared after start has been executed
g++ -g main.cpp && gdb -q -batch -ex 'start' -ex 'info shared' a.out
Temporary breakpoint 1 at 0x111d: file main.cpp, line 1.
Temporary breakpoint 1, main () at main.cpp:1
1 int main() {return 0;}
From To Syms Read Shared Object Library
0x00007ffff7fd2090 0x00007ffff7ff2746 Yes (*) /lib64/ld-linux-x86-64.so.2
0x00007ffff7e4c040 0x00007ffff7f37b52 Yes /usr/lib/libstdc++.so.6
0x00007ffff7c7f3b0 0x00007ffff7d1a658 Yes (*) /usr/lib/libm.so.6
0x00007ffff7c59020 0x00007ffff7c69ca5 Yes /usr/lib/libgcc_s.so.1
0x00007ffff7ab3650 0x00007ffff7bfe6bd Yes (*) /usr/lib/libc.so.6
(*): Shared library is missing debugging information.
main.cpp file does not contain any function, so why is gdb listing those functions when the start command has been executed before but not when start hasn't been executed?
Before start, GDB reads symbols (and debug info) only for the main executable.
After start, a dynamically linked executable loads shared libraries (seen in info shared), and GDB (by default) reads symbol tables and debug info for each of them. And since these libraries contain hundreds of functions, GDB knows about all of them.
You can prevent this with set auto-solib-add off, but usually you don't want to do that. If you do, and your program crashes in e.g. abort, GDB will not know where you crashed unless you manually add the symbols back using sharedlibrary or add-symbol-file command.
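The same effect can be observed from inside the process. The sketch below assumes Linux with glibc, whose dl_iterate_phdr function walks the objects the dynamic loader has mapped; the helper names collect_name and loaded_objects are made up for illustration. Even a program whose source defines only main would typically report several entries here, matching what info shared shows.

```cpp
#include <link.h>   // dl_iterate_phdr (glibc extension)
#include <cstddef>
#include <string>
#include <vector>

// Callback invoked once per loaded object; collects each object's path.
// The main executable is reported with an empty name.
static int collect_name(struct dl_phdr_info* info, std::size_t, void* data) {
    auto* names = static_cast<std::vector<std::string>*>(data);
    names->emplace_back(info->dlpi_name ? info->dlpi_name : "");
    return 0;  // returning non-zero would stop the iteration
}

// Hypothetical helper: names of all objects currently mapped into this process.
std::vector<std::string> loaded_objects() {
    std::vector<std::string> names;
    dl_iterate_phdr(collect_name, &names);
    return names;
}
```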

Global constructor call not in .init_array section

I'm trying to add global constructor support on an embedded target (ARM Cortex-M3).
Let's say I have the following code:
class foobar
{
int i;
public:
foobar()
{
i = 100;
}
void inc()
{
i++;
}
};
foobar foo;
int main()
{
foo.inc();
for (;;);
}
I compile it like this:
arm-none-eabi-g++ -O0 -gdwarf-2 -mcpu=cortex-m3 -mthumb -c foo.cpp -o foo.o
When I look at the .init_array section with objdump, it shows that the section has zero size.
I do get a symbol named _Z41__static_initialization_and_destruction_0ii.
When I disassemble the object file, I see that the global construction is done in the static_initialization_and_destruction symbol.
Why isn't a pointer to this symbol added to the .init_array section?
I know it has been almost two years since this question was asked, but I just had to figure out the mechanics of bare-metal C++ initialization with GCC myself, so I thought I'd share the details here. There turns out to be a lot of out-of-date or confusing information on the web. For example, the oft-mentioned collect2 wrapper does not appear to be used for ARM ELF targets, since ELF's support for arbitrary sections enables the approach described below.
First, when I compile the code above with the given command line using Sourcery CodeBench Lite 2012.09-63, I do see the correct .init_array section size of 4:
$ arm-none-eabi-objdump -h foo.o
foo.o: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
...
13 .init_array 00000004 00000000 00000000 0000010c 2**2
CONTENTS, ALLOC, LOAD, RELOC, DATA
...
When I look at the section contents, it just contains 0:
$ arm-none-eabi-objdump -j .init_array -s foo.o
Contents of section .init_array:
0000 00000000 ....
However, there is also a relocation section that sets it correctly to _GLOBAL__sub_I_foo:
$ arm-none-eabi-objdump -x foo.o
...
RELOCATION RECORDS FOR [.init_array]:
OFFSET TYPE VALUE
00000000 R_ARM_TARGET1 _GLOBAL__sub_I_foo
In general, .init_array points to all of your _GLOBAL__sub_I_XXX initializer stubs, each of which calls its own copy of _Z41__static_initialization_and_destruction_0ii (yes, it is multiply-defined), which calls the constructor with the appropriate arguments.
Because I'm using -nostdlib in my build, I can't use CodeSourcery's __libc_init_array to execute the .init_array for me, so I need to call the static initializers myself:
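extern "C"
{
// The linker script places these symbols at the start and end of .init_array.
// Declare them as arrays so that the symbol's address is the table itself.
extern void (*__init_array_start[])();
extern void (*__init_array_end[])();
inline void static_init()
{
    // Call every initializer stub in [__init_array_start, __init_array_end).
    for (void (**p)() = __init_array_start; p < __init_array_end; ++p)
        (*p)();
}
}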
__init_array_start and __init_array_end are defined by the linker script:
. = ALIGN(4);
.init_array :
{
__init_array_start = .;
KEEP (*(.init_array*))
__init_array_end = .;
}
This approach seems to work with both the CodeSourcery cross-compiler and native ARM GCC, e.g. in Ubuntu 12.10 for ARM. Supporting both compilers is one reason for using -nostdlib and not relying on the CodeSourcery CS3 bare-metal support.
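The loop in static_init() can be tried on a host machine without a linker script. The sketch below is only a simulation under that assumption: fake_init_array stands in for the .init_array contents, and all names in it are made up for illustration.

```cpp
using init_fn = void (*)();

int g_constructed = 0;  // stands in for state set by global constructors

void init_a() { g_constructed += 1; }
void init_b() { g_constructed += 10; }

// Stand-in for the .init_array contents; on the real target the linker
// script symbols mark the bounds of this table instead.
init_fn fake_init_array[] = { init_a, init_b };

// Same loop shape as static_init(): call every entry in [start, end).
void run_init_array(init_fn* start, init_fn* end) {
    for (init_fn* p = start; p < end; ++p)
        (*p)();
}
```

Calling run_init_array(fake_init_array, fake_init_array + 2) runs both stubs once. An empty range (start equal to end) runs nothing, which is what happens on the target if the linker discards or never collects the .init_array input sections.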
Timmmm,
I just had the same issue on the nRF51822 and solved it by adding KEEP() around a couple lines in the stock Nordic .ld file:
KEEP(*(SORT(.init_array.*)))
KEEP(*(.init_array))
While at it, I did the same to the fini_array area too. Solved my problem and the linker can still remove other unused sections...
You have only produced an object file, due to the -c argument to gcc. To create the .init section, I believe that you need to link that .o into an actual executable or shared library. Try removing the -c argument and renaming the output file to "foo", and then check the resulting executable with the disassembler.
If you look carefully, _Z41__static_initialization_and_destruction_0ii is called from inside the global constructor function, which in turn is referenced from the .init_array section (with arm-none-eabi- from CodeSourcery) or from some other function (__main() if you are using Linux g++). This should be called at startup or at main().
See also this link.
I had a similar issue where my constructors were not being called (nRF51822 Cortex-M0 with GCC). The problem turned out to be due to this linker flag:
-Wl,--gc-sections
Don't ask me why! I thought it only removed dead code.

no debugging symbols found when using gdb

GNU gdb Fedora (6.8-37.el5)
Kernel 2.6.18-164.el5
I am trying to debug my application. However, every time I pass the binary to gdb it says:
(no debugging symbols found)
Here is the file output of the binary, and as you can see it is not stripped:
vid: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), for GNU/Linux 2.6.9, not stripped
I am compiling with the following CFLAGS:
CFLAGS = -Wall -Wextra -ggdb -O0 -Wunreachable-code
Can anyone tell me if I am missing something simple here?
The most frequent cause of "no debugging symbols found" when -g is present is that there is some "stray" -s or -S argument somewhere on the link line.
From man ld:
-s
--strip-all
Omit all symbol information from the output file.
-S
--strip-debug
Omit debugger symbol information (but not all symbols) from the output file.
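Hunting for a stray strip option in a long link line can be automated. The sketch below assumes you have the link command split into arguments (for example from verbose make output) and flags the options quoted above, whether given to ld directly or forwarded by the compiler driver as -Wl,... groups. The helper names is_strip_option and find_strip_flags are made up for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// True if a single ld option requests symbol stripping.
static bool is_strip_option(const std::string& opt) {
    return opt == "-s" || opt == "--strip-all" ||
           opt == "-S" || opt == "--strip-debug";
}

// Hypothetical helper: scan link-line arguments for strip options,
// either given directly to ld or forwarded by the compiler driver as
// -Wl,opt1,opt2. Returns the offending arguments.
std::vector<std::string> find_strip_flags(const std::vector<std::string>& args) {
    std::vector<std::string> hits;
    for (const std::string& arg : args) {
        if (arg.rfind("-Wl,", 0) == 0) {
            std::stringstream parts(arg.substr(4));
            std::string opt;
            while (std::getline(parts, opt, ',')) {
                if (is_strip_option(opt)) { hits.push_back(arg); break; }
            }
        } else if (is_strip_option(arg)) {
            hits.push_back(arg);
        }
    }
    return hits;
}
```

Note that a bare -S given to the gcc driver means "stop after generating assembly" rather than "strip debug", so a hit on -S only matters when the argument actually reaches ld.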
The application has to be both compiled and linked with the -g option, i.e. you need to put -g in both the compiler flags (CFLAGS/CXXFLAGS) and LDFLAGS.
Some Linux distributions don't use the gdb style debugging symbols. (IIRC they prefer dwarf2.)
In general, gcc and gdb will be in sync as to what kind of debugging symbols they use, and forcing a particular style will just cause problems; unless you know that you need something else, use just -g.
You should also try -ggdb instead of -g if you're compiling for Android!
Replace -ggdb with -g and make sure you aren't stripping the binary with the strip command.
I know this was answered a long time ago, but I've recently spent hours trying to solve a similar problem. The setup is local PC running Debian 8 using Eclipse CDT Neon.2, remote ARM7 board (Olimex) running Debian 7. Tool chain is Linaro 4.9 using gdbserver on the remote board and the Linaro GDB on the local PC. My issue was that the debug session would start and the program would execute, but breakpoints did not work and when manually paused "no source could be found" would result. My compile line options (Linaro gcc) included -ggdb -O0 as many have suggested but still the same problem. Ultimately I tried gdb proper on the remote board and it complained of no symbols. The curious thing was that 'file' reported debug not stripped on the target executable.
I ultimately solved the problem by adding -g to the linker options. I won't claim to fully understand why this helped, but I wanted to pass this on for others just in case it helps. In this case Linux did indeed need -g on the linker options.
Make sure the system you compiled on and the system you are debugging on have the same architecture. I ran into an issue where the debugging symbols of a 32-bit binary refused to load on my 64-bit machine. Switching to a 32-bit system worked for me.
Bazel can strip binaries by default without warning, if that's your build manager. I had to add --strip=never to my bazel build command to get gdb to work, --compilation_mode=dbg may also work.
$ bazel build -s :mithral_wrapped
...
#even with -s option, no '-s' was printed in gcc command
...
$ file bazel-bin/mithral_wrapped.so
../cpp/bazel-bin/mithral_wrapped.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=4528622fb089b579627507876ff14991179a1138, not stripped
$ objdump -h bazel-bin/mithral_wrapped.so | grep debug
$ bazel build -s :mithral_wrapped --strip=never
...
$ file bazel-bin/mithral_wrapped.so
bazel-bin/mithral_wrapped.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=28bd192b145477c2a7d9b058f1e722a29e92a545, not stripped
$ objdump -h bazel-bin/mithral_wrapped.so | grep debug
30 .debug_info 002c8e0e 0000000000000000 0000000000000000 0006b11e 2**0
31 .debug_abbrev 000030f6 0000000000000000 0000000000000000 00333f2c 2**0
32 .debug_loc 0013cfc3 0000000000000000 0000000000000000 00337022 2**0
33 .debug_aranges 00002950 0000000000000000 0000000000000000 00473fe5 2**0
34 .debug_ranges 00011c80 0000000000000000 0000000000000000 00476935 2**0
35 .debug_line 0001e523 0000000000000000 0000000000000000 004885b5 2**0
36 .debug_str 0033dd10 0000000000000000 0000000000000000 004a6ad8 2**0
For those that came here with this question and who are using Qt: in the release config there is a step where the binary is stripped as part of doing the make install. You can pass the configuration option CONFIG+=nostrip to tell it not to:
Instead of:
qmake <your options here, e.g. CONFIG=whatever>
you add CONFIG+=nostrip, so:
qmake <your options here, e.g. CONFIG=whatever> CONFIG+=nostrip
The solutions I've seen so far are good:
must compile with the -g debugging flag to tell the compiler to generate debugging symbols
make sure there is no stray -s in the compiler flags, which strips the output of all symbols.
Just adding on here, since the solution that worked for me wasn't listed anywhere. The order of the compiler flags matters. I was including multiple header files from many locations (-I/usr/local/include -Iutil -I.) and I was compiling with all warnings on (-Wall).
The correct recipe for me was:
gcc -I/usr/local/include -Iutil -I -Wall -g -c main.c -o main.o
Notice:
include flags are at the beginning
-Wall is after include flags and before -g
-g is at the end
Any other ordering of the flags would cause no debug symbols to be generated.
I'm using gcc version 11.3.0 on Ubuntu 22.04 on WSL2.