i am currently writing a small VM in C/C++. Obviously i can't let the whole VM crash if the user dereferences a null pointer so i have to check every access which is becoming cumbersome as the VM grows and more systems are implemented.
So i had an idea: write a signal handler for sigsegv and let the OS do its thing but instead of closing the program call the VM exception handler.
It seems to work (with my very simple test cases), but i didn't find anything guaranteeing a Sigsegv being thrown on null-derefs nor the handler being called for OS generated signals.
So my question is:
Can i count on signal.h on modern destkop OSes (i don't really care if it's not standard on doesn't work on something other than linux/win: it's a pet project). Are there any non trivial stuff i should be aware of (obscure limitations of signal(...) or longjmp(...) ?)
Thank you !
Here is the pseudo implementation:
/* ... */
jmp_buf env;
/* ... */
void handler(int) {
longjmp(env, VM_NULLPTR);
}
/* ... */
if(setjmp(env)) {
return vm_throw("NullPtrException");
}
switch(opcode) {
/* instructions */
case INVOKE:
*stack_top = vm_call(stack_top->obj); // don't check anything in the case where stack_top or stack_top->obj is null handler() will be called an a "NullPtrException" will be thrown
break;
/* more instructions */
}
/* ... */
Note : i only need to check nulls, garbage (dangling) pointers are handled by the GC and should not happen.
Calling longjmp() from a signal handler is only safe if the signal handler is never invoked from async signal unsafe code. So for example if you might receive SIGSEGV by passing a bad pointer to any of the printf() family of functions, you cannot longjmp() from your signal handler.
Can i count on signal.h on modern destkop OSes
You can count on it in the sense that the header, and the functions within will be available on all standard compliant systems. However, exactly what what signals are thrown and when is not consistent across OSes.
On windows, you may need to compile your program with cygwin or similar environment to get the system to raise a segmentation fault. Programs compiled with visual studio use "structured exceptions" to handle invalid storage access.
Is signal.h a reliable way to catch null pointers?
There are situations where null pointer dereference does not result in the raising a segmentation fault signal, even on a POSIX system.
One case might be that the compiler has optimized the operation away, which is typical for example in the case of dereferencing a null pointer to call a member function that does not access any data members. When there is no invalid memory access, there is no signal either. Of course, in that case there won't be a crash either.
Another case might be that address 0 is in fact valid. That's the case on AIX, which you don't care about. It is also the case on Linux, which you do care about, but not by default and you might choose to not care about the case where it is. See this answer for more details.
Then there is your implementation of the signal handler. longjmp is not async signal safe, so if the signal was raised while another non-safe operation was being performed, the interrupted operation may have left your program in an inconsistent state. See the answer by John Zwinck and the libc documentation for details.
Related
Is there any way to convert a NULL pointer access into a C++ exception under Linux ? Something similar to the NullPointerException in Java. I hope the following program would return successfully, instead of crash (assume the compiler cannot figure out this NULL pointer access during compile time):
class NullPointerException {};
void accessNullPointer(char* ptr) {
*ptr = 0;
}
int main() {
try {
accessNullPointer(0);
} catch (NullPointerException&) {
return 1;
}
return 0;
}
I'm not expecting any standard way of doing it, since NULL pointer access under C++ is undefined-behavior, just want to know how to get it done under x86_64 Linux/GCC.
I did some very primitive research in this, it might be possible:
When a NULL pointer is access under Linux, a SIGSEGV will be generated.
Inside the SIGSEGV handler, the program's memory and register information will be available (if sigaction() is used to register the signal handler). The instruction which caused the SIGSEGV is also available if the program is disassembled.
Modify the program's memory and/or register, and create/fake an exception instance (maybe by invoking the low level unwind library functions, like _Unwind_RaiseException, etc.)
Finally return from the signal handler, hope the program would start a C++ stack unwinding process like a normal exception was thrown.
Here's a quote from GCC's man page (-fnon-call-exceptions):
Generate code that allows trapping instructions to throw exceptions. Note that this requires platform-specific runtime support that does not exist everywhere. Moreover, it only allows trapping instructions to throw exceptions, i.e. memory references or floating point instructions. It does not allow exceptions to be
thrown from arbitrary signal handlers such as "SIGALRM".
It seems this "platform-specific runtime" is exactly what I want. Anyone knows such a runtime for Linux/x86_64 ? Or give me some information on how to implement such a runtime if no such runtime already exists ?
I want the solution to work in multi-threaded program as well.
No, there's no good way to do that, and there shouldn't be. Exceptions are thrown by a throw statement in source code. That's important for reasoning about exception safety: you can look at the code and see the places where exceptions can be thrown and, perhaps more important, you can look a the code and see the places where exceptions will not be thrown. If pretty much anything you do can throw an exception it becomes very difficult to write exception-safe code without cluttering it with catch clauses. Microsoft tried this in their early C++ compilers: they piggybacked C++ exception handling on top of their OS's structured exceptions, and the result was a disaster.
Register an alternative signal stack with signalaltstack().
The 3rd argument to a signal handler handler registered with the SA_SIGINFO is a pointer to a ucontext_t which contains the saved register. The signal handler should adjust this to simulate a call to a function. That function can then throw the exception.
Potential complications include the need to preserve the value of callee saved registers, the red-zone on x86-64 (which can be disabled) and the return address register on some ISAs, such as ARM.
I just informed myselve about "signals" in C/C++ and played around. But i have a problem to understand the logic of SIGFPE.
I wrote a little program which will run into a division by zero, if this happens then the signal should be triggered and the signal handler should be executed. But instead my program just crashes. So what is the purpose of the SIGFPE if it does not even work on division by zero?
#include <stdio.h>
#include <signal.h>
#include <iostream>
int signal_status = 0;
void my_handler (int param)
{
signal_status = 1;
printf ("DIVISION BY ZERO!");
}
int main ()
{
signal (SIGFPE, my_handler);
int result = 0;
while(1)
{
system("cls");
printf ("signaled is %d.\n", signal_status);
for(int i=10000; i>-1; i--)
{
result = 5000 / i;
}
}
getchar();
return 0;
}
As I commented, most signals are OS specific. For Linux, read carefully signal(7). You forgot a \n inside your printf (usually, you'll be lucky enough to see something work in your code, but read all my answer). And in principle you should not call printf (which is not an async-signal-safe function, you should use directly and only write(2) inside) from your signal handler.
What probably is happening is (ignoring the undefined behavior posed by wrongly using printf inside the signal handler) is that:
your stdout buffer is never flushed since you forgot a \n (you might add a fflush(NULL);...) in the printf inside my_handler in your code
probably, the SIGFPE handler restarts again the machine code instruction triggering it. (More exactly, after returning from sigreturn(2) your machine is in the same state as it was before SIGFPE was delivered, so the same divide-by-zero condition happens, etc...)
It is difficult (but painfully possible, if you accept coding hardware-specific and operating-system specific code) to handle SIGFPE; you would use sigaction(2) with SA_SIGINFO and handle the third argument to the signal handler (which is a ucontext_t pointer indirectly giving the machine state, including processor registers, which you might change inside your handler; in particular you could change your return program counter there). You might also consider using sigsetjmp(3) inside your signal handler (but it is in theory forbidden, since not async-signal-safe).
(You certainly need to understand the details of your processor's instruction set architecture and your operating system's ABI; and you probably would need a week of coding work after having mastered these)
In a portable POSIX way, SIGFPE cannot really be handled, as explained in Blue Moon's answer
Probably, the runtime of JVM or of SBCL is handling SIGFPE in a machine & operating system specific way to report zero-divides as divide-by-zero exceptions .... (to Java programs for JVM, to Common Lisp programs for SBCL). Alternatively their JIT or compiler machinery could generate a test before every division.
BTW, a flag set inside a signal handler should be declared volatile sig_atomic_t. See POSIX specification about <signal.h>
As a pragmatical rule of thumb, a POSIX portable and robust signal handler should only set some volatile sig_atomic_t and/or perhaps write(2) a few bytes to some pipe(7) (your process could set up a pipe to itself -as recommended by Qt-, with another thread and/or some event loop reading it), but this does not work for asynchronous process-generated signals like SIGFPE, SIGBUS, SIGILL, and SIGSEGV, etc... (which could only be handled by painful computer-specific code).
See also this answer to a very related question.
At last, on Linux, signal processing is believed to be not very quick. Even with a lot of machine-specific coding, emulating GNU Hurd external pagers by tricky SIGSEGV handling (which would mmap lazily ....) is believed to be quite slow.
Divide by zero is undefined behaviour. So whether you have installed a handler for SIGFPE or not is of little significance when your program invokes undefined behaviour.
POSIX says:
Delivery of the signal shall have no effect on the process. The
behavior of a process is undefined after it ignores a SIGFPE, SIGILL,
SIGSEGV, or SIGBUS signal that was not generated by kill(),
sigqueue(), or raise().
A signal is raised as a result of an event (e.g. sending SIGINT by pressing CTRL+C) which can be handled by the process if said event non-fatal. SIGFPE is an erroneous condition in the program and you can't handle that. A similar case would be attempting to handle SIGSEGV, which is equivalent to this (undefined behaviour). When your process attempts to access some memory for which it doesn't have access. It would be silly if you could just ignore it and carry on as if nothing happened.
Some C++ libraries call abort() function in the case of error (for example, SDL). No helpful debug information is provided in this case. It is not possible to catch abort call and to write some diagnostics log output. I would like to override this behaviour globally without rewriting/rebuilding these libraries. I would like to throw exception and handle it. Is it possible?
Note that abort raises the SIGABRT signal, as if it called raise(SIGABRT). You can install a signal handler that gets called in this situation, like so:
#include <signal.h>
extern "C" void my_function_to_handle_aborts(int signal_number)
{
/*Your code goes here. You can output debugging info.
If you return from this function, and it was called
because abort() was called, your program will exit or crash anyway
(with a dialog box on Windows).
*/
}
/*Do this early in your program's initialization */
signal(SIGABRT, &my_function_to_handle_aborts);
If you can't prevent the abort calls (say, they're due to bugs that creep in despite your best intentions), this might allow you to collect some more debugging information. This is portable ANSI C, so it works on Unix and Windows, and other platforms too, though what you do in the abort handler will often not be portable. Note that this handler is also called when an assert fails, or even by other runtime functions - say, if malloc detects heap corruption. So your program might be in a crazy state during that handler. You shouldn't allocate memory - use static buffers if possible. Just do the bare minimum to collect the information you need, get an error message to the user, and quit.
Certain platforms may allow their abort functions to be customized further. For example, on Windows, Visual C++ has a function _set_abort_behavior that lets you choose whether or not a message is displayed to the user, and whether crash dumps are collected.
According to the man page on Linux, abort() generates a SIGABRT to the process that can be caught by a signal handler. EDIT: Ben's confirmed this is possible on Windows too - see his comment below.
You could try writing your own and get the linker to call yours in place of std::abort. I'm not sure if it is possible however.
In certain cases - especially when an exception escapes a destructor during stack unwinding - C++ runtime calls terminate() which must do something reasonable post-mortem and then exit the program. When a question "why so harsh" arises the answer is usually "there's nothing more reasonable to do in such error situations". That sounds reasonable if the whole program is in C++.
Now what if the C++ code is in a library and the program that uses the library is not in C++? This happens quite often - for example I might have a native C++ COM component consumed by a .NET program. Once terminate() is called inside the component code the .NET program suddenly ends abnormally. The program author will first of all think "I don't care of C++, why the hell is this library make my program exit?"
How do I handle the latter scenario when developing libraries in C++? Is it reasonable that terminate() unexpectedly ends the program? Is there a better way to handle such situations?
Why is the C++ runtime calling terminate()? It doesn't do it at random, or due to circumstances which cannot be defined and/or avoided when the code is written. It does it because your code does something that is defined to result in a call to terminate(), such as throwing an exception from a destructor during stack unwinding.
There is a list in the C++ standard of all the situations which are defined to result in call to terminate(). If you don't want terminate() to be called, don't do any of those things in your code. The same applies to unexpected(), abort(), and so on.
I don't think this is any different really from the fact that you have to avoid undefined behavior, or in general avoid writing code which is wrong. You also have to avoid behavior which is defined but undesirable.
Perhaps you have a particular example where it is difficult to avoid a call to terminate(), but throwing an exception from a destructor during stack unwinding isn't it. Just don't throw exceptions out of destructors, ever. This means designing your destructors such that if they do something which might fail, the destructor catches the exception and your code continues in a defined state.
There are some situations where your system will impolitely hack your process away at the knees because of something that your C++ code does (although not by calling terminate()). For example, if the system is overcommitting memory and the VMM can't honour the promises malloc/new have made, then your process might be killed. But that's a system feature, and probably applies equally to the other language that's calling your C++. I don't think there's anything you can (or need to) do about it, as long as your caller is aware that your library might allocate memory. In that circumstance it's not your code's fault the process died, it's the defined response of the operating system to low-memory conditions.
I think the more fundamental issue isn't specifically what terminate does, but the library's design. The library may be designed to be used only with C++ in which case exceptions can be caught and handled as appropriate within the app.
If the library is intended to be used in conjunction with non-C++ apps, it needs to provide an interface that ensures no exceptions leave the library, for example an interface that does a catch(...).
Suppose you have a function in C++ called cppfunc and you are invoking it from another langugage (such as C or .NET). I suggest you create a wrapper function, let's say exportedfunc, like so:
int exportedfunc(resultype* outresult, paramtype1 param1, /* ... */)
{
try {
*outresult = cppfunc(param1,param2,/* ... */);
return 0; // indicate success
}catch( ... ) { // may want to have other handlers
/* possibly set other error status info */
return -1; // indicate failure
}
}
Basically, you need to ensure that exceptions do not cross language boundaries... so you need to wrap your C++ functions with a function that catches all exceptions and reports a status code or does something acceptable other than invoking std::terminate.
The default terminate handler will call abort. If you don't want this behavior, define your own terminate handler and set it using set_terminate.
So, I have a big piece of legacy software, coded in C. It's for an embedded system, so if something goes wrong, like divide by zero, null pointer dereference, etc, there's not much to do except reboot.
I was wondering if I could implement just main() as c++ and wrap it's contents in try/catch. That way, depending on the type of exception thrown, I could log some debug info just before reboot.
Hmm, since there are multiple processes I might have to wrap each one, not just main(), but I hope that you see what I mean...
Is it worthwhile to leave the existing C code (several 100 Klocs) untouched, except for wrapping it with try/catch?
Division by zero or null pointer dereferencing don't produce exceptions (using the C++ terminology). C doesn't even have a concept of exceptions. If you are on an UNIX-like system, you might want to install signal handlers (SIGFPE, SIGSEGV, etc.).
Since it's an embedded application it probably runs only on one platform.
If so its probably much simpler and less intrusive to install a proper interrupt handler / kernel trap handler for the division by zero and other hard exceptions.
In most cases this can be done with just a few lines of code. Check if the OS supports such installable exception handlers. If you're working on an OS-less system you can directly hook the CPU interrupts that are called during exceptions.
First of all, divide by zero or null pointer dereference won't throw exceptions. Fpe exceptions are implemented in the GNU C library with signals (SIGFPE). And are part of the C99 standard, not the C++ standard.
A few hints based in my experience in embedded C and C++ development. Depending on your embedded OS:
Unless you are absolutely sure of what you are doing, don't try/catch between threads . Only inside the same thread. Each thread usually has its own stack.
Throwing in the middle of legacy C code and catching, is going to lead to problems, since for example when you have jumped to the catch block out of the normal program flow, allocated memory may not be reclaimed, thus you have leaks which is far more problematic than divide by zero or null pointer dereference in an embedded system.
If you are going to use exceptions, throw them inside your C++ code with allocation of other "C" resources in constructors and deallocation in destructors.
Catch them just above the C++ layer.
I don't think machine code compiled from C does the same thing with regard to exceptions, as machine code compiled for C++ which "knows" about exceptions. That is, I don't think it's correct to expect to be able to catch "exceptions" from C code using C++ constructs. C doesn't throw exceptions, and there's nothing that guarantees that errors that happen when C code does something bad are caught by C++'s exception mechanism.
Divide by zero, null pointer dereference are not C or C++ exceptions they are hardware exceptions.
On Windows you can get hardware exceptions like these wrapped in a normal C++ exception if _set_se_translator() is used.
__try/__catch may also help. Look up __try/__catch, GetExceptionCode(), _set_se_translator() on MSDN, you may find what you need.
I think I remember researching on how to handle SIGSEGV ( usually generated when derefing invalid pointer, or dividing by 0 ) and the general advise I found on the net was: do nothing, let the program die. The explanation, I think a correct one, is that by the time you get to a SIGSEGV trap, the stack of the offending thread is trashed beyond repair.
Can someone comment if this is true?
Only work on msvc
__try
{
int p = 0;
p=p / 0;
}
__except (1)
{
printf("Division by 0 \n");
}