gdb fails to insert a breakpoint even if compiled with -no-pie - c++

I'm trying to get gdb working with C++ programs on Ubuntu 20.04. What I need is to be able to set a breakpoint (for example, break main.cpp:3 gdb command) and then run until the breakpoint, but at the moment both start and run fail because they "Cannot insert breakpoint" and "Cannot access memory". For me gdb fails even with very simple examples. This is main.cpp content:
#include <iostream>
int main() {
std::cout << "Hello World!";
return 0;
}
I found somewhere that using -no-pie might help to get gdb working (with breakpoints), so I compile the program by running g++ -ggdb3 -no-pie -o main main.cpp (I also tried -g instead of -ggdb3, and -fno-PIE in addition to -no-pie). When I try to use gdb, it complains "Cannot insert breakpoint 1":
gdb -q main
Reading symbols from main...
(gdb) start
Temporary breakpoint 1 at 0x1189: file main.cpp, line 3.
Starting program: /tmp/main
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x1189
Without -no-pie result is the same. Only thing that changes with or without -no-pie is the hexadecimal address, without -no-pie it is low like 0x1189 (as shown above), with -no-pie it can be 0x401176, but everything else exactly the same, I keep getting the "Cannot access memory at address" warning in both cases.
If I use starti instead of start, it works at first, but after a few nexti iterations it prints usual message "Cannot insteart breakpoint":
gdb -q main
Reading symbols from main...
(gdb) starti
Starting program: /tmp/main
Program stopped.
0x00007ffff7fd0100 in ?? () from /lib64/ld-linux-x86-64.so.2
(gdb) nexti
0x00007ffff7fd0103 in ?? () from /lib64/ld-linux-x86-64.so.2
...
(gdb) nexti
Warning:
Cannot insert breakpoint 0.
Cannot access memory at address 0x4
0x00007ffff7fd0119 in ?? () from /lib64/ld-linux-x86-64.so.2
(gdb) nexti
0x00007ffff7fd011c in ?? () from /lib64/ld-linux-x86-64.so.2
...
(gdb) nexti
Warning:
Cannot insert breakpoint 0.
Cannot access memory at address 0x1c
0x000055555556ca22 in ?? ()
(gdb) nexti
[Detaching after fork from child process 3829827]
...
[Detaching after fork from child process 3829840]
Hello World![Inferior 1 (process 3819010) exited normally]
So I can use nexti, but cannot use next and obviously cannot insert breakpoints.
I tried -Wl,-no-pie (by running g++ -Wl,-no-pie -ggdb3 -o main main.cpp; adding -no-pie does not change anything) but this option causes a strange linker error:
/usr/bin/ld: cannot find -lgcc_s
/usr/bin/ld: cannot find -lgcc_s
collect2: error: ld returned 1 exit status
When I google the error, I only found advice to try -no-pie instead of -Wl,-no-pie, and no other solutions. Since debugging C++ programs is very common activity, I feel like I'm missing something obvious but I found no solution so far.
To make it easier to understand what exact commands I use and to make it clear I'm not mixing up directories and to show what versions of g++ and gdb I'm using, here is full terminal log:
$ ls
main.cpp
$ g++ --version | grep Ubuntu
g++ (Ubuntu 9.3.0-10ubuntu2) 9.3.0
$ g++ -ggdb3 -no-pie -o main main.cpp
$ ls
main main.cpp
$ gdb --version | grep Ubuntu
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
$ readelf -h main | grep 'Type: .*EXEC'
Type: EXEC (Executable file)
$ gdb -q main
Reading symbols from main...
(gdb) start
Temporary breakpoint 1 at 0x401176: file main.cpp, line 3.
Starting program: /tmp/main/main
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x401176
For completeness, I tried the same without -no-pie:
$ rm main
$ g++ -ggdb3 -o main main.cpp
$ readelf -h main | grep 'Type: .*'
Type: DYN (Shared object file)
$ gdb -q main
Reading symbols from main...
(gdb) start
Temporary breakpoint 1 at 0x1189: file main.cpp, line 3.
Starting program: /tmp/main/main
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x1189
As you can see the only difference with or without -no-pie is the memory address, but the issue and warnings are the same. Without -no-pie this may be expected, but I do not understand why this is happening if I compiled with -no-pie and what else I can try to solve the issue.

This:
g++ -ggdb3 -no-pie -o main main.cpp
should produce a non-PIE executable. You should be able to verify that it non-PIE by looking at readelf -h main | grep 'Type: .*EXEC' (PIE binaries have ET_DYN type).
This:
Temporary breakpoint 1 at 0x1189: file main.cpp, line 3.
is unambiguously a PIE binary (a non-PIE binary will not have any code below 0x40000 on x86_64 Linux).
Conclusion: you are either debugging the wrong binary (e.g. you are compiling main in a different directory from the one in which you are debugging), or you are not telling is the whole story.

Related

How do you find out the cause of rare crashes that are caused by things that are not caught by try catch (access violation, divide by zero, etc.)?

I am a .NET programmer who is starting to dabble into C++. In C# I would put the root function in a try catch, this way I would catch all exceptions, save the stack trace, and this way I would know what caused the exception, significantly reducing the time spent debugging.
But in C++ some stuff(access violation, divide by zero, etc.) are not caught by try catch. How do you deal with them, how do you know which line of code caused the error?
For example let's assume we have a program that has 1 million lines of code. It's running 24/7, has no user-interaction. Once in a month it crashes because of something that is not caught by try catch. How do you find out which line of code caused the crash?
Environment: Windows 10, MSVC.
C++ is meant to be a high performance language and checks are expensive. You can't run at C++ speeds and at the same time have all sorts of checks. It is by design.
Running .Net this way is akin to running C++ in debug mode with sanitizers on. So if you want to run your application with all the information you can, turn on debug mode in your cmake build and add sanitizers, at least undefined and address sanitizers.
For Windows/MSVC it seems that address sanitizers were just added in 2021. You can check the announcement here: https://devblogs.microsoft.com/cppblog/addresssanitizer-asan-for-windows-with-msvc/
For Windows/mingw or Linux/* you can use Gcc and Clang's builtin sanitizers that have largely the same usage/syntax.
To set your build to debug mode:
cd <builddir>
cmake -DCMAKE_BUILD_TYPE=debug <sourcedir>
To enable sanitizers, add this to your compiler command line: -fsanitize=address,undefined
One way to do that is to add it to your cmake build so altogether it becomes:
cmake -DCMAKE_BUILD_TYPE=debug \
-DCMAKE_CXX_FLAGS_DEBUG_INIT="-fsanitize=address,undefined" \
<sourcedir>
Then run your application binary normally as you do. When an issue is found a meaningful message will be printed along with a very informative stack trace.
Alternatively you can set so the sanitizer breaks inside the debugger (gdb) so you can inspect it live but that only works with the undefined sanitizer. To do so, replace
-fsanitize=address,undefined
with
-fsanitize-undefined-trap-on-error -fsanitize-trap=undefined -fsanitize=address
For example, this code has a clear problem:
void doit( int* p ) {
*p = 10;
}
int main() {
int* ptr = nullptr;
doit(ptr);
}
Compile it in the optimized way and you get:
$ g++ -O3 test.cpp -o test
$ ./test
Segmentation fault (core dumped)
Not very informative. You can try to run it inside the debugger but no symbols are there to see.
$ g++ -O3 test.cpp -o test
$ gdb ./test
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
...
Reading symbols from ./test...
(No debugging symbols found in ./test)
(gdb) r
Starting program: /tmp/test
Program received signal SIGSEGV, Segmentation fault.
0x0000555555555044 in main ()
(gdb)
That's useless so we can turn on debug symbols with
$ g++ -g3 test.cpp -o test
$ gdb ./test
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
...
Reading symbols from ./test...
(gdb) r
Starting program: /tmp/test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
test.cpp:4:5: runtime error: store to null pointer of type 'int'
Program received signal SIGSEGV, Segmentation fault.
0x0000555555555259 in doit (p=0x0) at test.cpp:4
4 *p = 10;
Then you can inspect inside:
(gdb) p p
$1 = (int *) 0x0
Now, turn on sanitizers to get even more messages without the debugger:
$ g++ -O0 -g3 test.cpp -fsanitize=address,undefined -o test
$ ./test
test.cpp:4:5: runtime error: store to null pointer of type 'int'
AddressSanitizer:DEADLYSIGNAL
=================================================================
==931717==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x563b7b66c259 bp 0x7fffd167c240 sp 0x7fffd167c230 T0)
==931717==The signal is caused by a WRITE memory access.
==931717==Hint: address points to the zero page.
#0 0x563b7b66c258 in doit(int*) /tmp/test.cpp:4
#1 0x563b7b66c281 in main /tmp/test.cpp:9
#2 0x7f36164a9082 in __libc_start_main ../csu/libc-start.c:308
#3 0x563b7b66c12d in _start (/tmp/test+0x112d)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /tmp/test.cpp:4 in doit(int*)
==931717==ABORTING
That is much better!

Why is "gdb" listing multiple functions after executing the "start' command even when the C++ source file doesn't contain any function?

The context
Consider the following file
$ cat main.cpp
int main() {return 0;}
I can list all the available functions by executing
$ g++ -g main.cpp && gdb -q -batch -ex 'info functions -n' a.out
All defined functions:
File main.cpp:
1: int main();
When executing start before executing info functions more than 1000 functions are listed (see below)
g++ -g main.cpp && \
gdb -q -batch -ex 'start' -ex 'info functions -n' a.out | \
head -n 10
Temporary breakpoint 1 at 0x111d: file main.cpp, line 1.
Temporary breakpoint 1, main () at main.cpp:1
1 int main() {return 0;}
All defined functions:
File /build/gcc/src/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/allocated_ptr.h:
70: void std::__allocated_ptr<std::allocator<std::_Sp_counted_ptr_inplace<std::filesystem::__cxx11::filesystem_error::_Impl, std::allocator<std::filesystem::__cxx11::filesystem_error::_Impl>, (__gnu_cxx::_Lock_policy)2> > >::~__allocated_ptr();
70: void std::__allocated_ptr<std::allocator<std::_Sp_counted_ptr_inplace<std::filesystem::filesystem_error::_Impl, std::allocator<std::filesystem::filesystem_error::_Impl>, (__gnu_cxx::_Lock_policy)2> > >::~__allocated_ptr();
As seen below, the total number of lines printed is so, apparently, more than 1000 functions are being listed
g++ -g main.cpp && gdb -q -batch -ex 'start' -ex 'info functions -n' a.out | wc -l
4436
The question
As we can see above, the main.cpp file does not contain any function, so why is gdb listing those functions when the start command has been executed before but not when start hasn't been executed?
Additional context
As suggested in one of the comments of this question, here's the output of executing info shared after start has been executed
g++ -g main.cpp && gdb -q -batch -ex 'start' -ex 'info shared' a.out
Temporary breakpoint 1 at 0x111d: file main.cpp, line 1.
Temporary breakpoint 1, main () at main.cpp:1
1 int main() {return 0;}
From To Syms Read Shared Object Library
0x00007ffff7fd2090 0x00007ffff7ff2746 Yes (*) /lib64/ld-linux-x86-64.so.2
0x00007ffff7e4c040 0x00007ffff7f37b52 Yes /usr/lib/libstdc++.so.6
0x00007ffff7c7f3b0 0x00007ffff7d1a658 Yes (*) /usr/lib/libm.so.6
0x00007ffff7c59020 0x00007ffff7c69ca5 Yes /usr/lib/libgcc_s.so.1
0x00007ffff7ab3650 0x00007ffff7bfe6bd Yes (*) /usr/lib/libc.so.6
(*): Shared library is missing debugging information.
main.cpp file does not contain any function, so why is gdb listing those functions when the start command has been executed before but not when start hasn't been executed?
Before start, GDB reads symbols (and debug info) only for the main executable.
After start, a dynamically linked executable loads shared libraries (seen in info shared), and GDB (by default) reads symbol tables and debug info for each of them. And since these libraries contain hundreds of functions, GDB knows about all of them.
You can prevent this with set auto-solib-add off, but usually you don't want to do that. If you do, and your program crashes in e.g. abort, GDB will not know where you crashed unless you manually add the symbols back using sharedlibrary or add-symbol-file command.

gdb cannot find source file of libarary

I want to debug with stander library in gdb.
I ran gdb with argument -d to specify the source file of standard libary.
$ gdb ./test -d /usr/src/glibc/glibc-2.27/
test.cpp:
#include<iostream>
#include<string>
using namespace std;
int main(){
string hex("0x0010");
long hex_number = strtol(hex.c_str(), NULL, 16);
cout << hex_number << endl;
return 0;
}
However, gdb told me it can not find source file strtol.c.
(gdb) b 8
Breakpoint 1 at 0xc8c: file /home/purin/Desktop/CodeWork/adv_unix_programming/hw1/test.cpp, line 8.
(gdb) run
Starting program: /home/purin/Desktop/CodeWork/adv_unix_programming/hw1/test
Breakpoint 1, main () at /home/purin/Desktop/CodeWork/adv_unix_programming/hw1/test.cpp:8
8 long hex_number = strtol(hex.c_str(), NULL, 16);
(gdb) s
__strtol (nptr=0x7fffffffd820 "0x0010", endptr=0x0, base=16) at ../stdlib/strtol.c:106
106 ../stdlib/strtol.c: file or directory does not exits
(gdb) info source
Current source file is ../stdlib/strtol.c
Compilation directory is /build/glibc-OTsEL5/glibc-2.27/stdlib
Source language is c.
Producer is GNU C11 7.3.0 -mtune=generic -march=x86-64 -g -O2 -O3 -std=gnu11 -fgnu89-inline -fmerge-all-constants -frounding-math -fstack-protector-strong -fPIC -ftls-model=initial-exec -fstack-protector-strong.
Compiled with DWARF 2 debugging format.
Does not include preprocessor macro info.
Since gdb can find source code if I execute this command:
(gdb) dir /usr/src/glibc/glibc-2.27/stdlib/
I am sure that I have strtol.c in path /usr/src/glibc/glibc-2.27/stdlib/strtol.c and the library has debug information.
Why gdb cannot search the directories under /usr/src/glibc/glibc-2.27/ to find out stdlib/strtol.c?
Maybe the reason is that if you join two paths /usr/src/glibc/glibc-2.27/ and ../stdlib/strtol.c you get the path /usr/src/glibc/stdlib/strtol.c, but not the expected path /usr/src/glibc/glibc-2.27/stdlib/strtol.c. The reason why ../stdlib/strtol.c is stored in the debug info, very likely is a build directory /build/glibc-OTsEL5/glibc-2.27/build used, and ../stdlib/strtol.c is the relative path to the build directory.
One of solutions is run
$ gdb ./test -d /usr/src/glibc/glibc-2.27/stdlib/

GDB fails to run program due to KERNELBASE.dll error

I've only started getting this problem recently, and I have no idea when it started occuring/what causes it.
I have this simple test program here:
#include <iostream>
int main()
{
return 0;
}
but when I try to run it normally, it creates a stackdump.
Stack trace:
Frame Function Args
00CBC498 6101D93A (00000198, 0000EA60, 000000A4, 00CBC508)
00CBC5C8 610E2F3F (00000000, 60FC04E8, 00CBC658, 7794ABEE)
When I try to run it in GDB, however, it just plain fails to do so.
gdb: unknown target exception 0x406d1388 at 0x778edae8
Program received signal ?, Unknown signal.
0x778edae8 in RaiseException ()
from /cygdrive/c/WINDOWS/SYSTEM32/KERNELBASE.dll
(gdb) n
Single stepping until exit from function RaiseException,
which has no line number information.
[Thread 14880.0x11ac exited with code 1080890248]
[Thread 14880.0x3fd8 exited with code 1080890248]
[Thread 14880.0x3b24 exited with code 1080890248]
[Inferior 1 (process 14880) exited with code 010033211610]
This is how my compiler's set up:
g++ -g -std=c++1y -Wall -c main.cpp -o main.o main.cpp compiled...
g++ -g -std=c++1y -Wall -o a main.o Successfully compiled!
Any idea as to what I'm doing wrong here?
The platform I'm using is Windows, and I'm using Cygwin as my development environment.
OF NOTE: I still get the aforementioned GDB error no matter what I have in main.cpp.
ALSO OF NOTE: Here are the additional files in the a.exe executable:
I formerly needed cygboost_filesystem.dll to get boost_filesystem to work, and I need libgcc_s_sjlj-1.dll and libstdc++6 because it won't work without them.
Update gdb to the 7.11.1-1 experimental release version.
Details of GDB bug

C++0x thread static linking problem

I am having some issues trying to statically link programs using c++0x thread features. Code looks this: (Compiler is gcc 4.6.1 on Debian x86_64 testing)
#include <iostream>
#include <thread>
static void foo() {
std::cout << "FOO BAR\n";
}
int main() {
std::thread t(foo);
t.join();
return 0;
}
I link it with:
g++ -static -pthread -o t-static t.cpp -std=c++0x
When I execute the program, I have the following error:
terminate called after throwing an instance of 'std::system_error'
what(): Operation not permitted
Aborted
GDB Debug output looks like this:
Debugger finished
Current directory is ~/testspace/thread/
GNU gdb (GDB) 7.2-debian
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/will/testspace/thread/t-static...done.
(gdb) list -
1 #include <iostream>
(gdb) b 1
Breakpoint 1 at 0x4007c8: file t.cpp, line 1.
(gdb) r
Starting program: /home/will/testspace/thread/t-static
terminate called after throwing an instance of 'std::system_error'
what(): Operation not permitted
Program received signal SIGABRT, Aborted.
0x00000000004a8e65 in raise ()
(gdb) bt
#0 0x00000000004a8e65 in raise ()
#1 0x000000000045df90 in abort ()
#2 0x000000000044570d in __gnu_cxx::__verbose_terminate_handler() ()
#3 0x0000000000442fb6 in __cxxabiv1::__terminate(void (*)()) ()
#4 0x0000000000442fe3 in std::terminate() ()
#5 0x0000000000443cbe in __cxa_throw ()
#6 0x0000000000401fe4 in std::__throw_system_error(int) ()
#7 0x00000000004057e7 in std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) ()
#8 0x0000000000400b18 in std::thread::thread<void (&)()> (this=0x7fffffffe540, __f=#0x4007c4) at /usr/include/c++/4.6/thread:135
#9 0x00000000004007f3 in main () at t.cpp:11
(gdb)
Update:
Linking with static libstdc++ could (possibly) make this error disappear, and the compiled C++0x programs can run on systems without gcc 4.6 libs:
g++ -static-libgcc -pthread -L.-o t thread.cpp -std=c++0x
But first, we should make a symbolic link to 'libstdc++.a' at current directory:
ln -s `g++ -print-file-name=libstdc++.a`
(Reference: http://www.trilithium.com/johan/2005/06/static-libstdc/)
you can use -u to resolve the problem (test in gcc version 4.6.3/(Ubuntu EGLIBC 2.15-0ubuntu10.4) 2.15 , gcc version 4.8.1/(Ubuntu EGLIBC 2.15-0ubuntu10.5~ppa1) 2.15)
-Wl,-u,pthread_cancel,-u,pthread_cond_broadcast,-u,pthread_cond_destroy,-u,pthread_cond_signal,-u,pthread_cond_wait,-u,pthread_create,-u,pthread_detach,-u,pthread_cond_signal,-u,pthread_equal,-u,pthread_join,-u,pthread_mutex_lock,-u,pthread_mutex_unlock,-u,pthread_once,-u,pthread_setcancelstate
1. reproduce the bug
g++ -g -O0 -static -std=c++11 t.cpp -lpthread
./a.out
terminate called after throwing an instance of 'std::system_error'
what(): Enable multithreading to use std::thread: Operation not permitted
Aborted (core dumped)
nm a.out | egrep "\bpthread_.*"
w pthread_cond_broadcast
w pthread_cond_destroy
w pthread_cond_signal
w pthread_cond_wait
w pthread_create
w pthread_detach
w pthread_equal
w pthread_join
w pthread_mutex_lock
w pthread_mutex_unlock
w pthread_once
w pthread_setcancelstate
2. resolve the bug
g++ -g -O0 -static -std=c++11 t.cpp -lpthread -Wl,-u,pthread_join,-u,pthread_equal
./a.out
FOO BAR
nm a.out | egrep "\bpthread_.*"
0000000000406320 T pthread_cancel
w pthread_cond_broadcast
w pthread_cond_destroy
w pthread_cond_signal
w pthread_cond_wait
0000000000404970 W pthread_create
w pthread_detach
00000000004033e0 T pthread_equal
00000000004061a0 T pthread_getspecific
0000000000403270 T pthread_join
0000000000406100 T pthread_key_create
0000000000406160 T pthread_key_delete
00000000004057b0 T pthread_mutex_lock
00000000004059c0 T pthread_mutex_trylock
0000000000406020 T pthread_mutex_unlock
00000000004063b0 T pthread_once
w pthread_setcancelstate
0000000000406220 T pthread_setspecific
For reasons exactly unknown to me (I consider this a bug) you can not use std::thread on gcc 4.6 when linking statically, since the function __ghtread_active_p() will be inlined as returning false (look at the assembly of _M_start_thread), causing this exception to be thrown. It might be that they require weak symbols for the pthread_create function there and when statically linking they are not there, but why they don't do it otherwise is beyond me (Note that the assembly later contains things like callq 0x0, there seems to be going something very wrong).
For now I personally use boost::threads since I am using boost anyways...
You should make sure you're linking against pthread library, otherwise you'll get the "Operation not permitted" message.
For instance, to compile your source code, I'd use this:
g++ -Wall -fexceptions -std=c++0x -g -c file.cpp -o file.o
Then linking with this:
g++ -o file file.o -lpthread
When doing it without object files, you could try something like this:
g++ -Wall -fexceptions -std=c++0x -g main.cpp -o file -lpthread
Remember to leave libraries at the end, since they'll be used on the linking process only.
My previous answer was deleted and I wrote detailed answer.
In common cases this issue happens due incomplete linking of the libpthread. I found information about this here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52590
You can try link you app with the following flags:
-Wl,--whole-archive -lpthread -Wl,--no-whole-archive
Also you could look at this questions with the similar problem:
What is the correct link options to use std::thread in GCC under linux?
Starting a std::thread with static linking causes segmentation fault