I'm debugging a C++ application which creates trees of forks. Using GDB defaults, the child processes will be detached on the fork and as a result I see only one inferior shown afterwards.
I tried to attach to one of the child processes and despite it not being listed as an inferior for the other GDB process, in the new GDB session I get an error that the process is already being traced (by the first GDB session).
Is this expected behavior? What steps can I take to debug the forked process in a separate GDB session? What steps can I take to debug the problem further?
I have a big real-time project to deal with (multiple processes with IPC, multithreaded; multi-everything, in short).
The process I'm working on is started as a service on Linux. I have root access.
Here is the problem:
I'm trying to attach to a running process. I also tried starting it under gdb, but the result is the same: the executable stops as soon as gdb "touches" it, or sometimes gdb reports:
Program received signal SIGUSR1, User defined signal 1.
[Switching to Thread 0x7f9fe869f700 (LWP 2638)]
Of course, nothing can be done from there.
Tried:
handle all nostop
attaching to the process, launched either as a service (daemon) or as a regular process
starting it from gdb
suspecting a forking/multithreading problem, I added a 10-second sleep at the very beginning of the program, attached during the sleep, and issued "continue"
All I want is to debug: hit the breakpoints, etc.
Please help! Share ideas.
Edit, with the actual commands:
1) gdb attach myProcId. Then, after the symbols are read, I hit "c", which results in:
Program received signal SIGUSR1, User defined signal 1.
[Switching to Thread 0x7f9fe869f700 (LWP 2638)]
0x00007f9fec09bf73 in select () from /lib64/libc.so.6
2) If I make the first line of the code a 10-second sleep, attach to the process during the sleep, and hit "c", it runs: I can see info threads and a backtrace of main, but it never hits the breakpoint. The code there definitely runs (I get logs, and the behaviour changes when I change the code there), so the process appears to be stuck.
3) All other combinations, such as gdb path/to/my/proc followed by start with the argument list, where I played with the various related options gdb offers.
It may be worth mentioning that the process handles network packets and is also timer-driven.
But for me the important thing is a snapshot of the current state on break; I don't care what happens to the system after the timers expire.
Since you mentioned that you are debugging a multiprocess program, I think the underlying problem you have is setting the breakpoint in the correct subprocess.
Try break fork and set follow-fork-mode child/parent. What you want to achieve is to have gdb attached to the process that is running the code you want to debug.
Refer to this link.
Another thought is to generate a crash, since you can compile the program. For example, add int i = *(int*)NULL; which will normally crash with a segmentation fault and, if core dumps are enabled, generate a core dump. You can then debug the core dump with gdb <program> <core dump>. You can refer to this page for how to configure core dumps.
I'm having a tough time figuring this one out. I have a program, iverilog, that executes a system() call to invoke another program, ivl. I want to debug the second program, ivl, in gdb, but I can't get gdb to set any breakpoints in the child process when I invoke gdb with the parent process. Here's what the programs look like:
// iverilog-main.cpp (parent process)
#include <cstdlib>  // std::system

int main() {
    // ...
    std::system("ivl arg1 arg2");  // system() runs the command through the shell
    // ...
    return 0;
}

// ivl-main.cpp (child process)
int main() {
    // ...
    // stuff I want to debug
    // ...
    return 0;
}
The gdb commands I'm running are: gdb iverilog -x cmds.gdb
# cmds.gdb
set args hello.v
set follow-fork-mode child
set breakpoint pending on
break ivl-main.cpp:main
run
Unfortunately, gdb doesn't break at ivl-main.cpp:main; it just runs to completion without ever breaking. The output I get is:
Starting program: /usr/local/bin/iverilog hello.v
[New process 18117]
process 18117 is executing new program: /bin/dash
[Inferior 2 (process 18117) exited normally]
I'm certain ivl-main.cpp:main is being called because when I run the ivl program in gdb it successfully breaks there.
My thinking is that gdb doesn't recognize ivl-main.cpp as a source file when it's started as gdb iverilog, and that it isn't setting that breakpoint when it enters the child process, which does contain ivl-main.cpp as a source file. So I think if I set the breakpoint for ivl-main.cpp when gdb enters the child process, it should work. The only way I can think of doing this is to manually break at the system() call and step into the child process, then set the breakpoint. Is there a more elegant approach that would force gdb to break whenever it enters a child process?
Normally GDB only debugs one process at a time: if your program forks, you will debug either the parent or the child, but not both simultaneously. By default, GDB continues debugging the parent after a fork, but you can change this behavior with the following command:
set follow-fork-mode child
Alternately, you can tell GDB to keep both the parent and the child under its control. By default GDB only follows one process, but you can tell it to follow all child processes with this command:
set detach-on-fork off
GDB refers to each debugged process as an "inferior". When debugging multiple processes, you can list them with "info inferiors" and switch between them with "inferior N", similar to how you would use "info threads" and "thread N" with multiple threads.
See more documentation here:
https://sourceware.org/gdb/onlinedocs/gdb/Forks.html
This answer provides one way to achieve what you want.
In theory, set follow-fork-mode child should work.
In practice, system() runs its command through the shell (the /bin/dash in your output), and iverilog may itself be a wrapper that runs (forks) multiple commands, so at every fork you need to decide whether you want to continue debugging the parent or the child. One wrong decision and you've lost control of the process that will eventually execute your program. This very likely explains why it didn't work for you.
If I have a multithreaded process and want to trace it with gdb using the attach command, which thread will it connect to (e.g. the currently running one or the main one)? I know that I can discover it with info threads, but I want to know which thread it chooses by default.
On Linux, gdb attaches to every thread with ptrace, so all of the threads are stopped when gdb attaches.
It has been my experience that gdb defaults to the main thread for C/C++ applications. If you attach to a process and do a 'bt' it will list the stack for 'main'.
Information is available for all threads, however. gdb can look at the thread information in the /proc filesystem, which contains an entry for each thread under /proc/<pid>/task. Details about the stack address are located in the stat file as well as the maps file. Details are also available regarding the register values for each thread.
Along the lines of your question, I've often wondered why stepping through a multithreaded application will cause gdb to jump from thread to thread. I think that gdb is still at the mercy of the kernel scheduler so that a step on a thread may lead to a different thread getting the CPU resource and a breakpoint being triggered.
On Linux, where thread ids exist in the same space as process ids, it appears you can run gdb -p tid to attach to the thread with given tid and its owning process, without knowing the pid. Because the main thread of a process has tid == pid, it makes sense that running gdb -p pid connects to the main thread.
Example code that connects gdb to the currently executing thread, e.g. for generating a pretty stack trace: https://github.com/facebook/rocksdb/pull/11150
I will try to be as specific as I can, but so far I have worded this problem so poorly that Google failed to return any useful results (hence my question here).
I am attaching gdb to a multi-threaded C++ server process. All I can say is that strange things have been happening while trying to do the usual set-breakpoint-break-investigate.
First, while waiting for the breakpoint to be hit (in 'Continuing' mode), I suddenly got back the (gdb) prompt with the message:
Continuing.
[Thread 0x54d5b940 (LWP 28503) exited]
[New Thread 0x54d5b940 (LWP 28726)]
Cannot get thread event message: debugger service failed
Second, also while waiting for the breakpoint to be hit, I'm suddenly told the program has received SIGSEGV and - back to the (gdb) prompt - backtrace tells me the segfault happened in pthread_cancel(). Note the process under investigation does not normally segfault.
I clearly lack enough information about how gdb works to even begin guessing what is happening. Am I doing anything wrong? The steps I take are the same each time:
gdb attach
break 'MyFunction()'
continue
Thoughts? Thanks.
I fought with similar gdb issues for a while. My case was having lots of threads spawned that executed few functions and then exited.
It appears that if threads exit too fast and there are lots of them doing so, gdb sometimes cannot keep up, and when it fails, it fails with style, as in it crashes :) Judging by the error message, I think it tries to attach to a thread that is already gone.
I have seen this issue in gdb 6.5 through 7.6, and it is still happening. I did not try older versions.
My advice is to look for this use case or a similar one in your code. Once I changed my design to have a thread serving a queue of requests, gdb worked flawlessly.
Design-wise, it is healthier to have pre-created threads that consume work items than to always spawn new threads.
Still, the same code debugs without a problem in Visual Studio, so I have to say this is a small disappointment to me with regard to gdb.
I use Eclipse and looking at the GDB traces (usually enabled by default) will give you a better hint of where GDB fails. One of the buttons on the console shows you the GDB trace.
Is there a way to stop the inferior without using Ctrl+C (or an equivalent signal sent from another process)? I'm on a Windows platform and am managing GDB from another process, so with no notion of signals, there doesn't seem to be a good way to break execution of my program when it's running freely without any breakpoints.
EDIT FOR CLARITY:
There are 2 processes involved here. There's process A, which is the parent of GDB. GDB is managing a process, but it's on a remote host, and we'll call that process C.
When I tell GDB to "run", it kicks off process C on the remote host and blocks until a breakpoint is hit, process C encounters an error or a fatal signal, or GDB itself receives an interrupt signal. If working interactively, you would simply press Ctrl+C at the GDB command console, which GDB interprets as a SIGINT (somehow), triggering GDB to halt process C. Since I'm actually managing GDB from process A (and not dealing with it interactively at the shell), I can't very well press Ctrl+C, and since Windows has no native notion of "signals" like UNIX has, I can't figure out how to interrupt GDB while it's blocked waiting for process C to be interrupted or hit a breakpoint.
Did you try to take a look at the remote control protocols? For instance, Emacs uses MI to control GDB; you should check whether and how they offer such a Ctrl+C mechanism, and how they implement it.
EDIT: it seems to be -exec-interrupt which interrupts the execution.