Is GDB catching the debuggee's signals via SIGCHLD? - gdb

Came across this doc: https://idea.popcount.org/2012-12-11-linux-process-states/ (a bit old). It says that with ptrace, the debugger handles the debuggee's signals by receiving SIGCHLD. Does GDB rely on this?
Relatedly, does GDB still get notified when a signal's handling is set to "noprint nostop pass"?
Further, the doc above says that, in the case of ptrace, the system blocks the debuggee when a signal arrives, until the debugger finishes handling it and continues the debuggee by waitpid(). Is this still the case nowadays?
Thanks in advance!

The answer is "yes" to every single question you posed, except:
and continues the debuggee by waitpid()
The waitpid() call doesn't continue the debuggee, it merely waits for it. The "continue" is done with (surprise!) ptrace(PTRACE_CONT, ...).
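The split between waiting and continuing can be seen in a minimal tracer. The following is only a sketch for Linux, not how GDB itself is written: the helper name `trace_one_signal`, the choice of SIGUSR1, and the exit code 7 are all made up for illustration. The child asks to be traced and raises a signal; the parent's waitpid() reports the stop, and only ptrace(PTRACE_CONT, ...) lets the child run again.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a traced child that raises SIGUSR1 and count how
 * many signal stops the tracer observes. Returns the child's exit code, or
 * -1 on error. */
int trace_one_signal(int *stops_seen)
{
    *stops_seen = 0;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL); /* ask to be traced by the parent */
        raise(SIGUSR1);                        /* the kernel stops the child here */
        _exit(7);                              /* reached only after PTRACE_CONT */
    }

    for (;;) {
        int status;
        if (waitpid(pid, &status, 0) < 0)      /* only waits; resumes nothing */
            return -1;
        if (WIFSTOPPED(status)) {
            ++*stops_seen;
            /* Resume the child. The last argument is the signal to deliver:
             * 0 suppresses it, WSTOPSIG(status) would pass it on. */
            ptrace(PTRACE_CONT, pid, NULL, NULL);
        } else if (WIFEXITED(status)) {
            return WEXITSTATUS(status);
        } else {
            return -1;                         /* child was killed by a signal */
        }
    }
}
```

Passing 0 as the last PTRACE_CONT argument roughly corresponds to "nopass" in GDB terms; passing WSTOPSIG(status) instead would forward the signal, and with SIGUSR1's default action the child would then be killed rather than exit normally.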

Related

Prevent breaking/stopping program on signals within GDB

I've run into a bit of an issue. I'm debugging a BOCHS OS emulator in GDB, and it sends Signal 0 fairly often (every time there is a page fault). I was wondering if there was a way to explicitly tell gdb to not break/stop execution on signals?
I've tried "handle all nostop" and specifically "handle 0 nostop", but it doesn't work.
Let me know if there's any additional information I can provide. I'd consider myself only an intermediate gdb user, so any help is great!
I've read this SO question and this man page but neither worked.
I believe you want to set
handle 0 noprint
From the man page
GDB should not mention the occurrence of the signal at all. This implies the nostop keyword as well.
If you run info signals in gdb, it gives you a list of signals by name, which works fine with handle.
For example:
(gdb) handle SIG34 noprint
Signal        Stop    Print   Pass to program  Description
SIG34         No      No      Yes              Real-time event 34

Who sends and who receives SIGTRAP in the case of interrupt 3?

In a debugging session, when the debugger wants to set a breakpoint, it replaces an instruction with int3. When the target process reaches this instruction, the process stops. I have read that a signal is sent at this time, but I did not manage to capture this signal (I wrote my own mini debugger for testing). Who sends this signal? The kernel? And who is the receiver?
I had to put a wait() call just after the PTRACE_CONT. Do you think it is this wait() call that catches the signal in order to notify the debugger that the process has reached a breakpoint?
When the target process reaches this instruction, the process stops.
That's not quite accurate. When the trap instruction (0xCC on x86) is executed, the processor notifies the OS. On UNIX, the OS checks to see whether the process is being ptraced by somebody.
If not, the SIGTRAP signal is delivered to the application, which usually results in the process being killed (but you can catch and handle the signal in the application).
If there is a ptracer (usually a debugger), then the signal is not delivered to the application. Instead, the debugger's wait is unblocked to notify the debugger that the inferior has changed state. The debugger then looks at where the inferior process stopped, discovers that it did so because of a breakpoint, and handles the situation as appropriate (lets you examine the inferior, or resumes it if the breakpoint is conditional and current conditions don't match, etc.)
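The sequence described above can be reproduced with a few lines of ptrace code. This is a rough sketch for Linux, assuming an x86 target for the int3 instruction (other architectures fall back to raise(SIGTRAP) as a stand-in); the helper name `stop_signal_at_breakpoint` is invented for this example. The child executes the trap, and the tracer learns about it only through waitpid(), where WSTOPSIG() reports SIGTRAP.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run a traced child that hits a breakpoint trap and
 * return the signal that the tracer's waitpid() reports, or -1 on error. */
int stop_signal_at_breakpoint(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
#if defined(__x86_64__) || defined(__i386__)
        __asm__ __volatile__("int3");   /* the 0xCC breakpoint instruction */
#else
        raise(SIGTRAP);                 /* stand-in on other architectures */
#endif
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    int sig = WSTOPSIG(status);         /* what the debugger gets to see */

    ptrace(PTRACE_CONT, pid, NULL, NULL); /* resume without delivering SIGTRAP */
    waitpid(pid, &status, 0);             /* reap the child */
    return sig;
}
```

Because the tracer continues the child with signal 0, the child never receives SIGTRAP itself and simply exits; without a tracer, the same trap would terminate it by default.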

What is the correct way to force an app to core dump and quit?

I just came across some code which used the kill system call to send a SIGSEGV signal to an app. The rationale behind this was that this would force the app to core dump and quit. This seems so wrong to me, is this normal practice?
SIGQUIT is the correct signal to send to a program if you wish to produce a core dump. kill is the correct command line program to send signals (it is of course poorly named, since not all signals will kill the program).
Note that you should not send arbitrary signals to the program, because not all of them will produce a core dump. Many will be handled by the program itself: consumed, ignored, or used to trigger other processing. Thus sending SIGSEGV is wrong.
The GNU C Library manual says:
http://www.gnu.org/s/libc/manual/html_node/Termination-Signals.html
POSIX/Unix Says:
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/signal.h.html
Yes. kill is somewhat misnamed -- it can send any signal. There are many uses for kill which don't result in the process being killed at all!
If you want to make an application dump its core from another program, pretty much the only way to do it is via a signal. SEGV would be fine for this. Alternatively, you can hook a debugger up to the program, freeze it, and view its registers and such without killing it.
If you want to dump a core from within an application there are nicer ways to do it, like via an assert().
So, no, it's not particularly wrong to send a SEGV to a program. You could also send things like SIGILL for illegal instruction, or a divide by zero signal. It's all fine.
The way to do it in Unix/Linux is to call abort(), which sends SIGABRT to the current process. The other option is raise(), which lets you specify the signal to send to the current process.
Richard Stevens (Advanced Programming in the UNIX Environment) wrote:
The generation of core is an implementation feature of most Unix. It is not part of POSIX.1.
He lists 12 signals whose default action is to terminate with a core (ANSI: SIGABRT, SIGFPE, SIGILL, SIGSEGV, POSIX: SIGQUIT, Other: SIGBUS, SIGEMT, SIGIOT, SIGSYS, SIGTRAP, SIGXCPU, SIGXFSZ); all of them can be overridden (the two signals that cannot be overridden are SIGKILL and SIGSTOP).
I've never seen a way to generate a core which isn't the use of a default signal handler.
So if your goal is to generate a core and stop, the best approach is to choose a signal whose default handler does the job (SIGSEGV does), reset that signal to its default handler if your program installs its own, and then use kill.
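As a rough illustration of the kill-based approach discussed in these answers, the sketch below starts a child, resets the chosen signal to its default action, sends the signal from the parent, and returns the wait status. The helper name `kill_child_with` and the pipe handshake are assumptions made for this example; they are not taken from the code in the question.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start an idle child, terminate it with `sig`, and
 * return the raw wait status, or -1 on error. */
int kill_child_with(int sig)
{
    int ready[2];
    if (pipe(ready) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }
    if (pid == 0) {
        close(ready[0]);
        signal(sig, SIG_DFL);              /* make sure the default action applies */
        if (write(ready[1], "x", 1) != 1)  /* tell the parent we are set up */
            _exit(1);
        for (;;)
            pause();                       /* idle until the signal arrives */
    }

    close(ready[1]);
    char byte;
    ssize_t got = read(ready[0], &byte, 1); /* wait for the child's handshake */
    close(ready[0]);
    if (got == 1)
        kill(pid, sig);                    /* "kill" just sends a signal */

    int status;
    if (waitpid(pid, &status, 0) < 0 || got != 1)
        return -1;
    return status;
}
```

A caller can then inspect the status with WIFSIGNALED() and WTERMSIG(), and on Linux with WCOREDUMP(), for example after kill_child_with(SIGQUIT). Whether a core file is actually written also depends on the core size limit (ulimit -c).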

What is SIG44 in gdb?

Sometimes when I am debugging I get message like this.
Program received signal SIG44, Real-time event 44.
What does it mean?
Thank you.
EDIT :
Platform is Linux
A signal is a message sent by the kernel to a process in order to notify the process that an event of some kind has occurred in the system.
Usual signals on Linux are, for example, SIGINT (value 2, interrupt from keyboard) or SIGKILL (value 9, kill a program).
Signals are received either when the kernel detects a system event (for example, division by zero raises SIGFPE, value 8) or when a process invokes the kill() function to explicitly ask the kernel to send a signal to a process (possibly the very process that called kill()).
A signal can often be caught by the process in order to do something.
So to answer your question, the code is most likely calling the kill() function and sending a signal with value 44 when something happens. Since you are getting that message, the process has received the signal and will either exit or do whatever the code specifies for that signal.
Unlike standard signals, real-time signals have no predefined meanings: the entire set of real-time signals can be used for application-defined purposes. (Note, however, that the LinuxThreads implementation uses the first three real-time signals.)
Source for the quote here
The GNU C++ library uses SIG44 to awaken sleeping threads when signalling condition variables.
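To check whether a given number such as 44 falls in the real-time range on a particular system, it can be compared against SIGRTMIN and SIGRTMAX. The helper below is a small sketch; the name `realtime_offset` is made up here, and the actual bounds depend on the C library in use.

```c
#include <signal.h>

/* Hypothetical helper: return the offset of `signo` from SIGRTMIN, or -1 if
 * it is not a real-time signal. SIGRTMIN and SIGRTMAX are evaluated at run
 * time because the C library may reserve the first few for its own use. */
int realtime_offset(int signo)
{
    if (signo < SIGRTMIN || signo > SIGRTMAX)
        return -1;
    return signo - SIGRTMIN;
}
```

A non-negative result for 44 means the signal has no predefined meaning, so which component raised it has to be found in the program or its libraries rather than in a signal table.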

How to check whether a process is running, segfaulted, or terminated in Linux from its PID in my main() in C++

I am launching several processes from my main and I can get the PIDs of those processes. Now I want to wait until all of these processes have finished and then clear the shared memory block from my parent process. Also, if any process has not finished but has segfaulted, I want to kill that process. So how can I check, from the PIDs in my parent process code, whether a process finished without any error or broke down because of a runtime error or some other cause, so that I can kill that process?
Also, what if I want to see the status of some other process that is not a child process but whose PID is known?
Code is appreciated (I am not looking for a script, but code).
Look into waitpid(2) with WNOHANG option. Check the "fate" of the process with macros in the manual page, especially WIFSIGNALED().
Also, a segfaulted process is already dead (unless SIGSEGV is specifically handled by the process, which is usually not a good idea).
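A minimal sketch of that suggestion follows. The enum, `poll_child`, and the demo helper `spawn_crashing_child` are hypothetical names introduced for this example; only waitpid(), WNOHANG, and the status macros come from the manual page.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum child_fate {
    STILL_RUNNING, EXITED_OK, EXITED_ERROR, KILLED_BY_SIGNAL, WAIT_FAILED
};

/* Poll one child without blocking. On KILLED_BY_SIGNAL, *sig holds the
 * signal number (SIGSEGV for a segfault). */
enum child_fate poll_child(pid_t pid, int *sig)
{
    int status;
    pid_t r = waitpid(pid, &status, WNOHANG);
    if (r == 0)
        return STILL_RUNNING;          /* child exists but has not changed state */
    if (r < 0)
        return WAIT_FAILED;            /* not our child, or already reaped */
    if (WIFSIGNALED(status)) {
        *sig = WTERMSIG(status);
        return KILLED_BY_SIGNAL;
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status) == 0 ? EXITED_OK : EXITED_ERROR;
    return WAIT_FAILED;                /* not reached without WUNTRACED/WCONTINUED */
}

/* Demo only: start a child that dies from SIGSEGV. */
pid_t spawn_crashing_child(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL);
        raise(SIGSEGV);
        _exit(0);                      /* not reached */
    }
    return pid;
}
```

A parent would typically call poll_child() for each PID it started, in a loop or from a SIGCHLD handler, and clean up the shared memory once none of them reports STILL_RUNNING.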
From your updates, it looks like you also want to check on other processes, which are not children of your current process.
You can look at /proc/{pid}/status to get an overview of what a process is currently doing. Its state is going to be one of:
Running
Stopped
Sleeping
Disk (D) sleep (I/O bound, uninterruptible)
Zombie
However, once a process dies (fully, unless zombied) so does its entry in /proc. There's no way to tell if it exited successfully, segfaulted, caught a signal that could not be handled, or failed to handle a signal that could be handled. Not unless its parent logs that information somewhere.
It sounds like you're writing a watchdog for other processes that you did not start, rather than keeping track of child processes.
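Reading the state from /proc/{pid}/status could look roughly like this. It is a Linux-only sketch that assumes /proc is mounted; the function name `proc_state` is made up for the example.

```c
#include <stdio.h>
#include <sys/types.h>

/* Return the one-letter state code from /proc/<pid>/status (R, S, D, T, Z,
 * ...), or 0 if the entry cannot be read, for example because the process
 * is gone. */
char proc_state(pid_t pid)
{
    char path[64];
    char line[256];
    char state = 0;

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return 0;

    /* The file contains a line such as "State:\tS (sleeping)". */
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "State: %c", &state) == 1)
            break;
    }
    fclose(f);
    return state;
}
```

A return value of 0 only says that the entry is missing; as noted above, it cannot tell you how the process ended.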
If a program segfaults, you won't need to kill it. It's dead already.
Use the wait and waitpid calls to wait for children to finish and check the status for some idea of how they exited. See here for details on how to use these functions. Note especially the WIFSIGNALED and WTERMSIG macros.
Use waitpid() from a SIGCHLD handler to catch the moment when an application terminates. Note that if you start multiple processes, you have to loop on waitpid() with WNOHANG until it returns 0 (or -1 once no children remain).
Use kill() with signal 0 to check whether the process is still running. IIRC zombies still qualify as processes, so you need a proper SIGCHLD handler for that to work.
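The signal-0 check can be sketched as follows. The function name `process_exists` is hypothetical; the EPERM case is handled because kill() can fail with a permission error even though the target process exists.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Return 1 if a process with this PID exists (zombies included), 0 if not.
 * Signal 0 performs the existence and permission checks without sending
 * anything. */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    return errno == EPERM;   /* exists, but we are not allowed to signal it */
}
```

Keep in mind that PIDs are reused, so a positive answer for a process you did not start only means that some process currently has that PID.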