How to use SIGFPE with signal? - c++

I just read up on "signals" in C/C++ and played around with them, but I have trouble understanding the logic of SIGFPE.
I wrote a little program which runs into a division by zero. When this happens, the signal should be triggered and the signal handler should be executed. But instead my program just crashes. So what is the purpose of SIGFPE if it does not even work on division by zero?
#include <stdio.h>
#include <stdlib.h> // for system
#include <signal.h>
#include <iostream>
int signal_status = 0;
void my_handler (int param)
{
    signal_status = 1;
    printf ("DIVISION BY ZERO!");
}
int main ()
{
    signal (SIGFPE, my_handler);
    int result = 0;
    while(1)
    {
        system("cls");
        printf ("signaled is %d.\n", signal_status);
        for(int i=10000; i>-1; i--)
        {
            result = 5000 / i; // divides by zero when i reaches 0
        }
    }
    getchar();
    return 0;
}

As I commented, most signals are OS specific. For Linux, read signal(7) carefully. You forgot a \n inside your printf (with it you would usually be lucky enough to see some output, but read my whole answer). And in principle you should not call printf from your signal handler: it is not an async-signal-safe function, and you should use write(2) directly, and only that, inside the handler.
What is probably happening (ignoring the undefined behavior caused by wrongly using printf inside the signal handler) is that:
your stdout buffer is never flushed since you forgot a \n (you might add a fflush(NULL);...) in the printf inside my_handler in your code
probably, the SIGFPE handler restarts again the machine code instruction triggering it. (More exactly, after returning from sigreturn(2) your machine is in the same state as it was before SIGFPE was delivered, so the same divide-by-zero condition happens, etc...)
It is difficult (but painfully possible, if you accept writing hardware-specific and operating-system-specific code) to handle SIGFPE; you would use sigaction(2) with SA_SIGINFO and handle the third argument to the signal handler (which is a ucontext_t pointer indirectly giving the machine state, including processor registers, which you might change inside your handler; in particular you could change your return program counter there). You might also consider calling siglongjmp(3) from inside your signal handler, back to a point saved earlier with sigsetjmp(3) (but this is in theory forbidden, since it is not async-signal-safe).
(You certainly need to understand the details of your processor's instruction set architecture and your operating system's ABI; and you probably would need a week of coding work after having mastered these)
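To make the sigaction(2) plus SA_SIGINFO part concrete, here is a minimal sketch of just the registration and the three-argument handler. It deliberately does not touch the ucontext_t, since that part is the machine-specific work described above. The names on_fpe, install_fpe_handler and probe_fpe_handler are made up for illustration, and the signal is sent with raise() so that the handler can return safely (a real divide fault would re-execute the faulting instruction).

```cpp
#include <signal.h>  // sigaction, siginfo_t, raise (POSIX)

// Hypothetical globals for this sketch: what the handler last observed.
static volatile sig_atomic_t g_last_signo = 0;
static volatile sig_atomic_t g_last_code = 0;

// Three-argument handler form selected by SA_SIGINFO. The third argument
// points to a ucontext_t with the machine state; it is ignored here because
// interpreting or modifying it is processor- and OS-specific.
static void on_fpe(int signo, siginfo_t* info, void* /*ucontext*/) {
    g_last_signo = signo;
    g_last_code = info->si_code;  // FPE_INTDIV for a real integer divide fault
}

// Installs the handler for SIGFPE; returns true on success.
bool install_fpe_handler() {
    struct sigaction sa {};
    sa.sa_sigaction = on_fpe;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGFPE, &sa, nullptr) == 0;
}

// Sends SIGFPE with raise() and reports the signal number the handler saw.
int probe_fpe_handler() {
    g_last_signo = 0;
    raise(SIGFPE);
    return g_last_signo;
}
```

Calling install_fpe_handler() and then probe_fpe_handler() only shows that the handler runs and sees the signal number; recovering from a genuine fault still needs the register-level work mentioned above.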
In a portable POSIX way, SIGFPE cannot really be handled, as explained in Blue Moon's answer
Probably, the runtime of JVM or of SBCL is handling SIGFPE in a machine & operating system specific way to report zero-divides as divide-by-zero exceptions .... (to Java programs for JVM, to Common Lisp programs for SBCL). Alternatively their JIT or compiler machinery could generate a test before every division.
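The "test before every division" alternative is easy to do by hand in your own code. A small sketch, assuming a hypothetical helper called checked_div (this is not what any particular JVM or SBCL version emits, just the same idea written out):

```cpp
#include <climits>

// Stores the quotient in `quotient` and returns true, or returns false when
// the division would trap or overflow: a zero divisor, or INT_MIN / -1.
bool checked_div(int numerator, int denominator, int& quotient) {
    if (denominator == 0) return false;
    if (numerator == INT_MIN && denominator == -1) return false;
    quotient = numerator / denominator;
    return true;
}
```

In the question's loop you would call checked_div(5000, i, result) and report the error in normal code when it returns false, with no signal handler involved.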
BTW, a flag set inside a signal handler should be declared volatile sig_atomic_t. See POSIX specification about <signal.h>
As a pragmatic rule of thumb, a POSIX-portable and robust signal handler should only set some volatile sig_atomic_t and/or perhaps write(2) a few bytes to some pipe(7) (your process could set up a pipe to itself, as recommended by Qt, with another thread and/or some event loop reading it), but this does not work for synchronous, process-generated signals like SIGFPE, SIGBUS, SIGILL, and SIGSEGV (which can only be handled by painful computer-specific code).
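A rough sketch of the pipe-to-self idea, for signals like SIGUSR1 or SIGTERM. The function names (setup_self_pipe, next_signal) are invented for this example and it is not Qt's actual implementation: the handler only calls write(2), and ordinary code reads the pipe later.

```cpp
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

static int g_pipe_fds[2] = {-1, -1};  // [0] is the read end, [1] the write end

// Handler: write one byte holding the signal number, nothing else.
static void on_signal(int signo) {
    int saved_errno = errno;  // write() may clobber errno of interrupted code
    unsigned char byte = static_cast<unsigned char>(signo);
    ssize_t ignored = write(g_pipe_fds[1], &byte, 1);
    (void)ignored;
    errno = saved_errno;
}

// Creates the pipe (non-blocking on both ends) and installs the handler.
bool setup_self_pipe(int signo) {
    if (pipe(g_pipe_fds) != 0) return false;
    for (int fd : g_pipe_fds) {
        int flags = fcntl(fd, F_GETFL);
        if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) return false;
    }
    struct sigaction sa {};
    sa.sa_handler = on_signal;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, nullptr) == 0;
}

// Called from normal code (for instance an event loop): returns the next
// signal number recorded by the handler, or -1 if none is waiting.
int next_signal() {
    unsigned char byte = 0;
    ssize_t n = read(g_pipe_fds[0], &byte, 1);
    return n == 1 ? byte : -1;
}
```

In a real event loop you would hand g_pipe_fds[0] to poll(2) or similar instead of calling next_signal() in a busy loop.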
See also this answer to a very related question.
Finally, on Linux, signal processing is believed not to be very quick. Even with a lot of machine-specific coding, emulating GNU Hurd external pagers by tricky SIGSEGV handling (which would mmap lazily ....) is believed to be quite slow.

Divide by zero is undefined behaviour. So whether you have installed a handler for SIGFPE or not is of little significance when your program invokes undefined behaviour.
POSIX says:
Delivery of the signal shall have no effect on the process. The
behavior of a process is undefined after it ignores a SIGFPE, SIGILL,
SIGSEGV, or SIGBUS signal that was not generated by kill(),
sigqueue(), or raise().
A signal is raised as a result of an event (e.g. sending SIGINT by pressing CTRL+C), which can be handled by the process if said event is non-fatal. SIGFPE indicates an erroneous condition in the program, and you can't handle that. A similar case would be attempting to handle SIGSEGV, raised when your process attempts to access memory it doesn't have access to, which is equivalent to this (undefined behaviour). It would be silly if you could just ignore it and carry on as if nothing happened.

Related

C++ printing in signal handlers?

I searched a lot but nothing answered my question. I read that it's not safe to use cout in signal handlers like this:
void ctrlZHandler(int sig_num) {
//SIGTSTP-18
std::cout << "smash: got ctrl-Z" << std::endl;
SmallShell::route_signal(sig_num);
}
will it solve the problem if I move the printing inside route_signal?
Is there a list of safe-to-call functions in C++11?
What if the only solution is to use write? Can you show me a short example? And let's say route_signal has 100 print statements: should I replace all of them with write()? That sounds exhausting, with the need to allocate and free memory...
The reason why using std::cout inside signal handlers isn't recommended is that signals might interrupt your running code at any time, and std::cout's operator<< is not reentrant.
This means that if you are executing operator<< on std::cout when a signal is raised whose handler also uses it, the result is undefined.
So, no. Moving it into route_signal would not solve this and you should replace every call of std::cout within!
One workaround would be to set a flag recording that this signal was received and produce the output outside the signal handler, after it has returned.
Signal handlers need to run quickly and be reentrant, which is why they shouldn’t call output stream functions like cout <<, either directly or indirectly.
If you are doing this temporarily under controlled conditions for testing, it might be okay, but make sure the signal you are handling is not triggered again until the handler has finished and be aware that stream functions can be slow, which might mess up your tests as well.
will it solve the problem if I move the printing inside route_signal?
No.
Is there a list of safe-to-call functions in C++11?
For practical purposes, the only safe thing you can do is set a volatile sig_atomic_t or lock-free atomic flag inside a signal handler. (N3690 intro.execution §1.9 ¶6)
I'm no C nor C++ language lawyer, but I believe anything permitted in a conforming C application is allowed in a C++11 signal handler. However, that set is very, very limited: abort, quick_exit, _Exit, and signal. (ISO/IEC 9899:2011 §7.14.1.1 ¶5).
What if the only solution is to use write? Can you show me a short example? And let's say route_signal has 100 print statements: should I replace all of them with write()? That sounds exhausting, with the need to allocate and free memory...
A better solution is to redesign your program to use sigwait, or to check a flag that is safely set inside the signal handler.
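For the sigwait route, a minimal POSIX sketch could look like this. The helper names block_signal and wait_for_signal are hypothetical. The signal is blocked so no handler ever runs, and ordinary code then collects it synchronously, where std::cout is fine to use.

```cpp
#include <signal.h>

// Blocks `signo` for the calling thread, so that it stays pending instead
// of interrupting the program. In a multi-threaded program, pthread_sigmask
// should be used instead, before any other thread is created.
bool block_signal(int signo) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    return sigprocmask(SIG_BLOCK, &set, nullptr) == 0;
}

// Waits until `signo` is pending and consumes it. Runs in normal context,
// so the caller may print, allocate, or take locks afterwards.
// Returns the signal number received, or -1 on error.
int wait_for_signal(int signo) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    int received = 0;
    return sigwait(&set, &received) == 0 ? received : -1;
}
```

A shell like the one in the question could call block_signal(SIGTSTP) at startup and run wait_for_signal(SIGTSTP) in a dedicated thread that then calls route_signal with all its printing intact.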
If you insist on using write, and if you trust that it is safe to call inside a signal handler in your C++ implementation — which it probably is but, again, is not guaranteed by C++ itself — then you simply have a coding problem. You'll need to figure out formatting yourself, bearing in mind that even on POSIX-conforming systems malloc and free are not async-signal-safe. It can certainly be done.
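To give an idea of what "figure out formatting yourself" means, here is one possible sketch that prints a prefix and a number using only a stack buffer and write(2). The names format_decimal and safe_report are invented for this example, and it assumes a POSIX system where write is async-signal-safe.

```cpp
#include <stddef.h>
#include <unistd.h>

// Formats `value` in decimal into `buf` (capacity `cap`), without heap or
// stdio. Returns the number of characters stored (no terminating NUL), or
// 0 if the buffer is too small.
size_t format_decimal(long value, char* buf, size_t cap) {
    char tmp[24];  // enough for a 64-bit long including the sign
    size_t n = 0;
    unsigned long magnitude = value < 0 ? 0UL - static_cast<unsigned long>(value)
                                        : static_cast<unsigned long>(value);
    do {  // digits come out least significant first
        tmp[n++] = static_cast<char>('0' + magnitude % 10);
        magnitude /= 10;
    } while (magnitude != 0);
    if (value < 0) tmp[n++] = '-';
    if (n > cap) return 0;
    for (size_t i = 0; i < n; ++i) buf[i] = tmp[n - 1 - i];  // reverse
    return n;
}

// Writes "<prefix><value>\n" to stderr using only write(2).
void safe_report(const char* prefix, long value) {
    size_t prefix_len = 0;
    while (prefix[prefix_len] != '\0') ++prefix_len;  // hand-rolled strlen
    char digits[24];
    size_t len = format_decimal(value, digits, sizeof digits);
    ssize_t ignored = write(STDERR_FILENO, prefix, prefix_len);
    ignored = write(STDERR_FILENO, digits, len);
    ignored = write(STDERR_FILENO, "\n", 1);
    (void)ignored;
}
```

With something like this, a handler could call safe_report("smash: got signal ", sig_num), but a hundred such call sites is still a good argument for the flag or sigwait designs.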

Is signal.h a reliable way to catch null pointers?

I am currently writing a small VM in C/C++. Obviously I can't let the whole VM crash if the user dereferences a null pointer, so I have to check every access, which is becoming cumbersome as the VM grows and more systems are implemented.
So I had an idea: write a signal handler for SIGSEGV and let the OS do its thing, but instead of closing the program, call the VM exception handler.
It seems to work (with my very simple test cases), but I didn't find anything guaranteeing that a SIGSEGV is raised on null dereferences, nor that the handler is called for OS-generated signals.
So my question is:
Can I count on signal.h on modern desktop OSes? (I don't really care if it's not standard or doesn't work on something other than Linux/Windows: it's a pet project.) Is there any non-trivial stuff I should be aware of (obscure limitations of signal(...) or longjmp(...))?
Thank you !
Here is the pseudo implementation:
/* ... */
jmp_buf env;
/* ... */
void handler(int) {
longjmp(env, VM_NULLPTR);
}
/* ... */
if(setjmp(env)) {
return vm_throw("NullPtrException");
}
switch(opcode) {
/* instructions */
case INVOKE:
// Nothing is checked here: if stack_top or stack_top->obj is null,
// handler() will be called and a "NullPtrException" will be thrown.
*stack_top = vm_call(stack_top->obj);
break;
/* more instructions */
}
/* ... */
Note : i only need to check nulls, garbage (dangling) pointers are handled by the GC and should not happen.
Calling longjmp() from a signal handler is only safe if the signal handler is never invoked from async signal unsafe code. So for example if you might receive SIGSEGV by passing a bad pointer to any of the printf() family of functions, you cannot longjmp() from your signal handler.
Can I count on signal.h on modern desktop OSes
You can count on it in the sense that the header and the functions within it will be available on all standard-compliant systems. However, exactly which signals are raised, and when, is not consistent across OSes.
On windows, you may need to compile your program with cygwin or similar environment to get the system to raise a segmentation fault. Programs compiled with visual studio use "structured exceptions" to handle invalid storage access.
Is signal.h a reliable way to catch null pointers?
There are situations where a null pointer dereference does not result in a segmentation fault signal being raised, even on a POSIX system.
One case might be that the compiler has optimized the operation away, which is typical for example in the case of dereferencing a null pointer to call a member function that does not access any data members. When there is no invalid memory access, there is no signal either. Of course, in that case there won't be a crash either.
Another case might be that address 0 is in fact valid. That's the case on AIX, which you don't care about. It is also the case on Linux, which you do care about, but not by default and you might choose to not care about the case where it is. See this answer for more details.
Then there is your implementation of the signal handler. longjmp is not async signal safe, so if the signal was raised while another non-safe operation was being performed, the interrupted operation may have left your program in an inconsistent state. See the answer by John Zwinck and the libc documentation for details.
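For reference, a slightly more careful version of the question's pseudo implementation might look like the sketch below. It uses sigsetjmp/siglongjmp so the signal mask is restored after the jump, and it only jumps while a guard flag is set. All names are hypothetical, and the "fault" is simulated with raise(SIGSEGV), because a real null dereference is undefined behaviour and, as noted above, is not guaranteed to produce the signal. The caveat about interrupting async-signal-unsafe code still applies in full.

```cpp
#include <setjmp.h>
#include <signal.h>

static sigjmp_buf g_recover_point;
static volatile sig_atomic_t g_guard_active = 0;

static void on_segv(int) {
    if (g_guard_active) siglongjmp(g_recover_point, 1);
    // Not inside a guarded region: fall back to the default action.
    signal(SIGSEGV, SIG_DFL);
    raise(SIGSEGV);
}

bool install_segv_guard() {
    struct sigaction sa {};
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}

// Runs `operation`; returns true if it finished normally and false if
// SIGSEGV arrived while it was running.
bool run_guarded(void (*operation)()) {
    // Second argument 1: save the signal mask, so that SIGSEGV is unblocked
    // again after jumping out of the handler.
    if (sigsetjmp(g_recover_point, 1) != 0) {
        g_guard_active = 0;
        return false;
    }
    g_guard_active = 1;
    operation();
    g_guard_active = 0;
    return true;
}

// Stand-ins for VM instructions; raise() simulates the fault.
void simulated_fault() { raise(SIGSEGV); }
void harmless() {}
```

In the VM, a false result from run_guarded would be the place to call vm_throw("NullPtrException").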

Boost Semaphores under linux and EINTR return code

In boost (I use 1.54.0) I see implementation for posix semaphore wait:
inline void semaphore_wait(sem_t *handle)
{
int ret = sem_wait(handle);
if(ret != 0){
throw interprocess_exception(system_error_code());
}
}
Manual on posix semaphore says:
ERRORS
EINTR The call was interrupted by a signal handler; see signal(7).
Am I right that the Boost semaphore throws an exception if a signal is delivered (e.g. with kill) to the waiting thread? If so, how do you handle this situation?
In my opinion, this is probably a bug in Boost.Interprocess. Please, report it to developers, at the very least they will be able to provide a rationale if this is intentional.
Commenting on signal management suggestion in the comments above. It is true that a typical multi-threaded application should mask out signals that are not intended to be processed by threads, leaving only one thread to handle signals. However, this is not a mandatory rule.
First, auxiliary threads can be spawned by libraries which do not internally handle signals, leaving that to the application. Signal handlers can potentially be called in these threads.
Second, some signals may be intentionally left unmasked to catch events related to that particular thread. For example, one can register a handler for SIGSEGV to detect segmentation errors. This handler will be invoked in the offending thread, and the application can theoretically deal with the error. Similarly, SIGUSR1 or SIGUSR2 can be used to signal application-defined events to particular threads.
The bottom line is that even though a well-designed application should move signal handling to a separate thread, libraries should not assume that it does and should be prepared for the case where it doesn't. In any case, throwing on EINTR doesn't look like correct behavior.
The implementation looks OK. SA_RESTART flag can be used so the call is restarted automatically. http://man7.org/linux/man-pages/man7/signal.7.html
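If you cannot rely on every handler being installed with SA_RESTART, the usual workaround is to retry the call yourself. A small sketch (retry_on_eintr and semaphore_wait_retrying are names made up here, this is not Boost code):

```cpp
#include <cerrno>
#include <semaphore.h>

// Calls `call()` again for as long as it fails with EINTR, and returns the
// first result that is a success or a different error.
template <typename Call>
int retry_on_eintr(Call call) {
    int result;
    do {
        result = call();
    } while (result == -1 && errno == EINTR);
    return result;
}

// The wait from the question, but an interrupted wait is simply restarted.
inline int semaphore_wait_retrying(sem_t* handle) {
    return retry_on_eintr([handle] { return sem_wait(handle); });
}
```

A wrapper around the Boost call would then only throw when semaphore_wait_retrying returns non-zero, i.e. for errors other than EINTR.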

signal vs thread

I was looking for some info about reentrancy and came across signals and threads. What is the difference between the two?
Please advise.
Many thanks.
You are comparing apples and oranges. Signal Programming is event driven programming and can be used to influence threads. However the signal programming paradigm can be used in a single threaded application.
To understand signals it is best to start by thinking about a single-threaded program. This program is doing whatever it does with its one thread, and then a signal is delivered to it. If the program has registered a signal handler (a function to call) for that signal, then the normal execution of the program is put on hold for a little while and the registered handler function is run (very much like a hardware interrupt interrupts the operating system to run an interrupt service routine). So with the code:
#include <stdio.h>
#include <signal.h>
#include <unistd.h> // for alarm
volatile int x = 0;
void foo(int sig_num) {
    x = sig_num;
}
int main(void) {
    unsigned long long count = 0;
    signal(SIGALRM, foo);
    alarm(1); // This is a posix function and may not be in all hosted
              // C implementations.
              // SIGALRM will be sent to this process in 1 second.
    while (!x) {
        printf("not x\n");
        count++;
    }
    printf("x is %i and count = %llu\n", x, count);
}
The program will loop until someone sends it a signal (how this happens may differ by platform). If the signal SIGALRM is sent then foo will set x and the loop will exit. Exactly where in the loop foo is called is not clear. It could happen just between the print and incrementing the count, just after the while conditional is tested, during the print, ... lots of places, really. This is why signals may pose a concurrency or reentrancy problem -- they can change things without the other code knowing that it happened.
The reason that x was declared as volatile was that without that many compilers might think "hey, no one in main changes x and main doesn't call any other functions, so x never changes" and optimize out the loop test. Specifying volatile tells the C compiler that this variable can be changed by unseen forces (such as signal handlers, other threads, or sometimes even hardware in the case of memory mapped device control registers).
It was pretty easy to make sure that x was looked after properly between the signal handler and the main execution code because x is just an integer (load and store for it were probably single assembly instructions each), it was only altered by one thing (the signal handler, rather than the main code) in this case, and it was only used as a simple boolean value. If x were some other type, such as a string, then since signals can interrupt you at any time, the signal handler may overwrite part of the string while the main code is in the middle of reading the string. This could have results as bad as someone freezing time while you were in the middle of brushing your teeth, replacing your toothbrush with a cobra, and then unfreezing time.
A little bit more on signals -- they are part of the C language, but most of their use is not really covered by C. Many of the Linux, Unix, and POSIX functions that have to do with signals are not part of the C language, but it is difficult to come up with reasonable (and small) examples of signal use that doesn't rely on something not in the C standard, which is why I used the alarm function. The raise function, which is part of C, can be used to send a signal to yourself, but it is more difficult to make examples for.
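For what it's worth, a small raise-based example that stays within the standard library (no alarm, no POSIX) could look like this sketch. The names note_signal and raise_and_observe are just made up for the example.

```cpp
#include <csignal>

// Flag type that the standard guarantees can be written from a handler.
static volatile std::sig_atomic_t g_seen = 0;

// Handler: only records which signal arrived.
extern "C" void note_signal(int signo) { g_seen = signo; }

// Installs the handler, sends the signal to ourselves with raise(), and
// returns what the handler recorded (or -1 if setup failed).
int raise_and_observe(int signo) {
    g_seen = 0;
    if (std::signal(signo, note_signal) == SIG_ERR) return -1;
    if (std::raise(signo) != 0) return -1;
    return g_seen;
}
```

Because raise runs the handler before it returns, there is no waiting involved, which also means the example says nothing about the "interrupted at an arbitrary point" behaviour that the alarm version shows.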
As scary as signals may seem now, most systems have more functions that make them much easier to use.
threads, finally
Threads execute concurrently, while signals interrupt. While there are some threading libraries that actually implement threading in such a way that this is not really the case, it is best to think of threads this way. Since computer programs are actually very limited in their ability to see what is going on, threads can get in each other's way just like signal handlers can get in the way of the main execution code (probably more often than signal handlers, though).
Imagine that you are about to brush your teeth again, but this time you are deaf and blind. Now your roommate, who is also deaf and blind, comes in to fix the sink with some silicone sealer. Just as you reach for the toothpaste he lays down the tube of silicone right on top of the tube of toothpaste and you grab the tube of silicone instead of the toothpaste. Remember, since you are both blind and deaf (and somehow not bumping into each other) you both assume that no one else is using the sink, so you never realize that you have just put the silicone on your toothbrush, and your roommate doesn't realize that he is trying to fill the cracks between the tile and the back of the sink with toothpaste.
Luckily there are ways that threads can communicate to each other that something is currently in use so other threads should stay away (like locking the door while you brush your teeth).
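In C++11 terms the locked door is a std::mutex. A minimal sketch, with a hypothetical function count_with_lock in which several threads bump one shared counter and each takes the lock first:

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Starts `thread_count` threads that each add 1 to a shared counter
// `increments_per_thread` times, then returns the final value.
long count_with_lock(int thread_count, int increments_per_thread) {
    long counter = 0;
    std::mutex door;  // only the thread holding it may touch `counter`
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&counter, &door, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                std::lock_guard<std::mutex> lock(door);  // lock the door
                ++counter;
            }  // the door is unlocked when `lock` goes out of scope
        });
    }
    for (std::thread& worker : workers) worker.join();
    return counter;
}
```

Without the lock_guard line the increments from different threads can overlap and the total may come out too small. Note that this does not carry over to signal handlers: taking a mutex inside a handler can deadlock if the interrupted code already holds it.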
A thread lives inside a process, whereas signals arrive from outside the normal flow of execution; a signal can be sent to a process or to a specific thread inside a process.

Portable way to catch signals and report problem to the user

If by some miracle a segfault occurs in our program, I want to catch the SIGSEGV and let the user (possibly a GUI client) know with a single return code that a serious problem has occurred. At the same time I would like to display information on the command line to show which signal was caught.
Today our signal handler looks as follows:
void catchSignal (int reason) {
    std::cerr << "Caught a signal: " << reason << std::endl;
    exit (1);
}
I can hear the screams of horror with the above, as I have read from this thread that it is evil to call a non-reentrant function from a signal handler.
Is there a portable way to handle the signal and provide information to users?
EDIT: Or at least portable within the POSIX framework?
This table lists all of the functions that POSIX guarantees to be async-signal-safe and so can be called from a signal handler.
By using the 'write' command from this table, the following relatively "ugly" solution hopefully will do the trick:
#include <csignal>
#include <cstdlib> // for _Exit
#ifdef _WINDOWS_
#define _exit _Exit
#else
#include <unistd.h>
#endif
// Expands to a case label that prints the signal's name with write(),
// which is async-signal-safe.
#define PRINT_SIGNAL(X) case X: \
    write (STDERR_FILENO, #X ")\n" , sizeof(#X ")\n")-1); \
    break;
void catchSignal (int reason) {
    char s[] = "Caught signal: (";
    write (STDERR_FILENO, s, sizeof(s) - 1);
    switch (reason)
    {
        // These are the signals that we catch
        PRINT_SIGNAL(SIGUSR1);
        PRINT_SIGNAL(SIGHUP);
        PRINT_SIGNAL(SIGINT);
        PRINT_SIGNAL(SIGQUIT);
        PRINT_SIGNAL(SIGABRT);
        PRINT_SIGNAL(SIGILL);
        PRINT_SIGNAL(SIGFPE);
        PRINT_SIGNAL(SIGBUS);
        PRINT_SIGNAL(SIGSEGV);
        PRINT_SIGNAL(SIGTERM);
        default: // any other signal: still close the parenthesis
            write (STDERR_FILENO, "unknown)\n", sizeof("unknown)\n") - 1);
            break;
    }
    _Exit (1); // 'exit' is not async-signal-safe
}
EDIT: Building on windows.
After trying to build this on Windows, it appears that 'STDERR_FILENO' is not defined. From the documentation, however, its value appears to be '2'.
#include <io.h>
#define STDERR_FILENO 2
EDIT: 'exit' should not be called from the signal handler either!
As pointed out by fizzer, calling _Exit in the above is a sledge hammer approach for signals such as HUP and TERM. Ideally, when these signals are caught a flag with "volatile sig_atomic_t" type can be used to notify the main program that it should exit.
The following I found useful in my searches.
Introduction To Unix Signals Programming
Extending Traditional Signals
FWIW, 2 is standard error on Windows also, but you're going to need some conditional compilation because their write() is called _write(). You'll also want
#ifdef SIGUSR1 /* or whatever */
etc around all references to signals not guaranteed to be defined by the C standard.
Also, as noted above, you don't want to handle SIGUSR1, SIGHUP, SIGINT, SIGQUIT and SIGTERM like this.
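As a sketch of what those guards might look like, here is a small name-lookup function (signal_name is a hypothetical helper, not part of the answer above). The six signals the C standard defines are used unconditionally, everything else is wrapped in #ifdef.

```cpp
#include <csignal>

// Returns a printable name for `signo`, or "unknown signal".
const char* signal_name(int signo) {
    switch (signo) {
        // Guaranteed to be defined by the C standard:
        case SIGABRT: return "SIGABRT";
        case SIGFPE:  return "SIGFPE";
        case SIGILL:  return "SIGILL";
        case SIGINT:  return "SIGINT";
        case SIGSEGV: return "SIGSEGV";
        case SIGTERM: return "SIGTERM";
        // POSIX extras, only referenced where the platform defines them:
#ifdef SIGBUS
        case SIGBUS:  return "SIGBUS";
#endif
#ifdef SIGHUP
        case SIGHUP:  return "SIGHUP";
#endif
#ifdef SIGQUIT
        case SIGQUIT: return "SIGQUIT";
#endif
#ifdef SIGUSR1
        case SIGUSR1: return "SIGUSR1";
#endif
        default:      return "unknown signal";
    }
}
```

The function only returns pointers to string literals, so a handler can pass the result straight to write() (or _write() on Windows) after computing the length by hand.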
Richard, still not enough karma to comment, so a new answer I'm afraid. These are asynchronous signals; you have no idea when they are delivered, so possibly you will be in library code which needs to complete to stay consistent. Signal handlers for these signals are therefore required to return. If you call exit(), the library will do some work post-main(), including calling functions registered with atexit() and cleaning up the standard streams. This processing may fail if, say, your signal arrived in a standard library I/O function. Therefore in C90 you are not allowed to call exit(). I see now C99 relaxes the requirement by providing a new function _Exit() in stdlib.h. _Exit() may safely be called from a handler for an asynchronous signal. _Exit() will not call atexit() functions and may omit cleaning up the standard streams at the implementation's discretion.
To bk1e (commenter a few posts up)
The fact that SIGSEGV is synchronous is why you can't use functions that are not designed to be reentrant. What if the function that crashed was holding a lock, and the function called by the signal handler tries to acquire the same lock?
This is a possibility, but it's not 'the fact that SIGSEGV is synchronous' which is the problem. Calling non-reentrant functions from the handler is much worse with asynchronous signals for two reasons:
asynchronous signal handlers are (generally) hoping to return and resume normal program execution. A handler for a synchronous signal is (generally) going to terminate anyway, so you've not lost much if you crash.
in a perverse sense, you have absolute control over when a synchronous signal is delivered - it happens as you execute your defective code, and at no other time. You have no control at all over when an async signal is delivered. Unless the OP's own I/O code is itself the cause of the defect - e.g. outputting a bad char* - his error message has a reasonable chance of succeeding.
Write a launcher program to run your program and report abnormal exit code to the user.
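A bare-bones POSIX sketch of such a launcher is below. ChildResult and launch_and_wait are names invented for this example, and argv[0] is assumed to be a full path because execv does not search PATH. The parent runs in a perfectly normal context, so it can report a crash with std::cerr or a GUI dialog.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct ChildResult {
    bool launched;  // false if fork or waitpid failed
    bool signaled;  // true if the child was killed by a signal
    int code;       // signal number if signaled, otherwise the exit status
};

// Runs the program argv[0] with arguments argv (null-terminated array) and
// waits for it to finish.
ChildResult launch_and_wait(char* const argv[]) {
    ChildResult result{false, false, 0};
    pid_t pid = fork();
    if (pid < 0) return result;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);  // only reached if execv failed
    }
    int status = 0;
    pid_t waited;
    do {
        waited = waitpid(pid, &status, 0);
    } while (waited == -1 && errno == EINTR);
    if (waited != pid) return result;
    result.launched = true;
    if (WIFSIGNALED(status)) {
        result.signaled = true;
        result.code = WTERMSIG(status);
    } else if (WIFEXITED(status)) {
        result.code = WEXITSTATUS(status);
    }
    return result;
}
```

The launcher's own main would then print something like "caught signal N" when signaled is true and return a single agreed-upon error code to the GUI client.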