I'd like to write a test in Rust where the expected behaviour of the #[test] function is to segfault. Is this possible?
First, I'd like to point out that the only sure way to segfault that I am aware of is to send the SIGSEGV signal to your own process, for example with the "raise" function or a Rust equivalent.
Dereferencing a null pointer or a pointer to unallocated memory is undefined behaviour, so it doesn't actually guarantee a segfault, though it will produce one on most modern platforms.
The simplest way to check for a segfault is to fork your program (possibly using the nix crate). Once done, execute the function that should make you segfault on the child process, while the parent process waits.
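After waiting long enough (a few hundred milliseconds is plenty), check that the child process is dead and how it died. Sending it a kill and expecting an error is not reliable, because a dead child that has not been waited on still accepts signals. Wait on it instead: the status returned by waitpid tells you whether the child was terminated by a signal, and by which one.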
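The fork-and-wait idea can be sketched with the POSIX calls directly. This is a minimal sketch in C++ rather than Rust (the nix crate wraps the same fork and waitpid calls), and the helper names dies_with_signal, crash_with_segv and return_normally are made up for illustration. It blocks in waitpid instead of sleeping for a fixed time.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `fn` in a forked child and reports whether the
// child was terminated by signal `sig`.
bool dies_with_signal(void (*fn)(), int sig) {
    pid_t pid = fork();
    if (pid < 0) {
        return false;  // fork failed, nothing was tested
    }
    if (pid == 0) {
        fn();      // expected to crash the child
        _exit(0);  // reached only if fn returned normally
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return false;
    }
    // WIFSIGNALED is true only when a signal killed the child.
    return WIFSIGNALED(status) && WTERMSIG(status) == sig;
}

// Example inputs: raise() delivers the signal for certain.
void crash_with_segv() { raise(SIGSEGV); }
void return_normally() {}
```

A test would then assert that dies_with_signal(crash_with_segv, SIGSEGV) is true, while the parent process itself keeps running.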
The only correct way to test this is, in my opinion, pretty heavy. In the test, I would run a static analyser that can detect this possible undefined behavior, and verify that this very issue is still there.
I am not aware of a Rust crate that does static analysis, though, so I guess that you would depend on an external tool using the C ABI.
I use a third-party library in my C++ program which under certain circumstances raises the SIGABRT signal. I know that trying to free an uninitialized pointer or something like that can be the cause of this signal. Nevertheless, I want to keep my program running after the signal is raised, to show a message and allow the user to change the settings, in order to cope with this signal.
(I use Qt for development.)
How can I do that?
I use a third library in my c++ program which under certain circumstances emits SIGABRT signal
If you have the source code of that library, you need to correct the bug (and the bug could be in your code).
BTW, SIGABRT probably happens because abort(3) gets called indirectly (perhaps because you violated some conventions or invariants of that library, which might use assert(3), which in turn calls abort). I guess that in Caffe the various CHECK* macros could indirectly call abort. I leave you to investigate that.
If you don't have the source code or don't have the capacity or time to fix that bug in that third party library, you should give up using that library and use something else.
In many cases, you should trust external libraries more than your own code. Probably, you are abusing or misusing that library. Read carefully its documentation and be sure that your own code calling it is using that library correctly and respects its invariants and conventions. Probably the bug is in your own code, at some other place.
I want to keep running my program
This is impossible (or very unreliable, so unreasonable). I guess that your program has some undefined behavior. Be very scared, and work hard to avoid UB.
You need to improve your debugging skills. Learn better how to use the gdb debugger, valgrind, GCC sanitizers (e.g. instrumentation options like -fsanitize=address, -fsanitize=undefined and others), etc...
You reasonably should not try to handle SIGABRT, even if in principle you might (but then read signal(7) and signal-safety(7) carefully, along with the hints about handling Unix signals in Qt). I strongly recommend not even trying to catch SIGABRT.
Unfortunately, you can't.
The SIGABRT signal is raised by abort() itself, and abort() does not return to its caller even if the signal is handled.
Ref:
https://stackoverflow.com/a/3413215/9332965
You can handle SIGABRT, but you probably shouldn't.
The "can" is straightforward - just trap it in the usual way, using signal(). You don't want to return from this signal handler - you probably got here from abort() - possibly originally from assert() - and that function will exit after raising the signal. You could however longjmp() back to a state you set up earlier.
The "shouldn't" is because once SIGABRT has been raised, your data structures (including those of Qt and any other libraries) are likely in an inconsistent state and actually using any of your program's state is likely to be unpredictable at best. Apart from exiting immediately, there's not much you can do other than exec() a replacement program to take over in a sane initial state.
If you just want to show a friendly message, then you perhaps could exec() a small program to do that (or just use xmessage), but beware of exiting this with a success status where you would have had an indication of the SIGABRT otherwise.
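The "can" part might look like the sketch below. It is a minimal illustration, not a recommendation, and it swaps signal() and longjmp() for sigaction() and sigsetjmp()/siglongjmp() so that the signal mask is restored after the jump. The names g_abort_jump, on_sigabrt, raises_abort and the two example callees are hypothetical. No destructors run for the skipped frames, which is exactly why the state is suspect afterwards.

```cpp
#include <setjmp.h>
#include <signal.h>
#include <cstdlib>

static sigjmp_buf g_abort_jump;  // hypothetical global recovery point

extern "C" void on_sigabrt(int) {
    // Do not return from here: abort() would terminate the process anyway.
    siglongjmp(g_abort_jump, 1);
}

// Calls `fn` and reports whether it raised SIGABRT.
bool raises_abort(void (*fn)()) {
    struct sigaction sa = {};
    struct sigaction old = {};
    sa.sa_handler = on_sigabrt;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGABRT, &sa, &old);

    bool aborted = false;
    if (sigsetjmp(g_abort_jump, 1) == 0) {
        fn();            // may call abort(), directly or via assert()
    } else {
        aborted = true;  // arrived here through siglongjmp in the handler
    }
    sigaction(SIGABRT, &old, nullptr);  // restore the previous disposition
    return aborted;
}

// Hypothetical stand-ins for calls into the third-party library.
void misbehaving_call() { std::abort(); }
void well_behaved_call() {}
```

After raises_abort returns true, the only reasonably safe follow-up is to report the failure and exit or exec() a fresh program.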
Unfortunately there isn't much you can do to prevent SIGABRT from terminating your program. Not without modifying some code that was hopefully written by you.
You would either need to change code to not throw an abort, or you would have to spawn a new process that runs the code instead of the current process. I do not suggest you use a child process to solve this problem. It's most likely caused by misuse of an api or computer resources, such as low memory.
I am using google's v8 javascript engine to have an embedded js interpreter in my project, which must be able to execute user-provided code, but I am wondering if it is possible to set something up in advance of calling any user code which ensures that if the code tries to recurse indefinitely (or even if it just executes for too long), that it can somehow be made to abort, throw an otherwise uncaught exception, and report the issue back to the caller.
Thank you all for responses so far... yes, I realized not long after I posted this that I was basically asking for some kind of solution to the halting problem, which I know is unsolvable, and is actually far more than what I really need.
What I'd need is either some mechanism for detecting when something running in the v8 environment is returning quickly enough, or else simply a mechanism to detect if recursion is happening at all... my use cases are such that the end user should not be utilizing any recursion anyways, and if I can possibly even detect that, then I could reject it at that point instead of blindly executing it. It would be allowed, however, for different threads, with different isolates to invoke the same functions at the same time, so I can't just use a static local variable to lock out another call to the same function.
A compiler [V8 is definitely a compiler in this context, even if it isn't "always" a compiler] can detect recursion, but if the code is clever enough (for example depending on variables that aren't known at compile time), it's not possible to detect whether it has infinite or finite recursion.
I would simply state that "execution over X seconds is disallowed", and abort any execution that takes longer. You can do this with a "watchdog thread" that is started alongside the script and cancelled when the code completes; if the watchdog is still waiting after X seconds, it stops the running script and the failure is reported back to the caller. No, I don't know EXACTLY how to write this code in conjunction with V8.
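Independent of V8, the watchdog idea could be sketched as follows. This is a minimal sketch with hypothetical names (run_with_deadline, demo_runaway_is_stopped); it assumes the work does not throw and that on_timeout is able to make the work stop. With V8, on_timeout would presumably call something like v8::Isolate::TerminateExecution(), but that wiring is not shown and is an assumption here.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Runs `work` on the calling thread while a watchdog thread waits up to
// `limit`. If the limit passes first, the watchdog calls `on_timeout`, which
// must make `work` stop. Returns true when `work` finished in time.
bool run_with_deadline(const std::function<void()>& work,
                       std::chrono::milliseconds limit,
                       const std::function<void()>& on_timeout) {
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
    bool timed_out = false;

    std::thread watchdog([&] {
        std::unique_lock<std::mutex> lock(m);
        // wait_for returns false when the limit expired and `done` is unset.
        if (!cv.wait_for(lock, limit, [&] { return done; })) {
            timed_out = true;
            lock.unlock();
            on_timeout();
        }
    });

    work();
    {
        std::lock_guard<std::mutex> lock(m);
        done = true;
    }
    cv.notify_one();
    watchdog.join();
    return !timed_out;
}

// Demo: a runaway loop that only ends when the watchdog sets the flag.
bool demo_runaway_is_stopped() {
    std::atomic<bool> stop{false};
    bool in_time = run_with_deadline(
        [&] { while (!stop.load()) std::this_thread::yield(); },
        std::chrono::milliseconds(50),
        [&] { stop.store(true); });
    return !in_time;
}
```

The important design point is that the watchdog never kills a thread; it only asks the running code to stop through a mechanism that code already honours.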
I am refactoring some old code, and one of the things I'd like to address is the way that errors are handled. I'm well aware of exceptions and how they work, but I'm not entirely sure they're the best solution for the situations I'm trying to handle.
In this code, if things don't validate, there's really no reason or advantage to unwind the stack. We're done. There's no point in trying to save the ship, because it's a non-interactive code that runs in parallel through the Sun Grid Engine. The user can't intervene. What's more, these validation failures don't really represent exceptional circumstances. They're expected.
So how do I best deal with this? One thing I'm not sure I want is an exit point in every class method that can fail. That seems unmaintainable. Am I wrong? Is it acceptable practice to just call exit() or abort() at the failure point in codes like this? Or should I throw an exception all the way back to some generic catch statement in main? What's the advantage?
Throwing an exception to be caught in main and then exiting means your RAII resource objects get cleaned up. On most systems this isn't needed for a lot of resource types. The OS will clean up memory, file handles, etc. (though I've used a system where failing to free memory meant it remained allocated until system restart, so leaking on program exit wasn't a good idea.)
But there are other resource types that you may want to release cleanly such as network or database connections, or a mechanical device you're driving and need to shut down safely. If an application uses a lot of such things then you may prefer to throw an exception to unwind the stack back to main, and then exit.
So the appropriate method of exiting depends on the application. If an application knows it's safe then calling _Exit(), abort(), exit(), or quick_exit() may be perfectly reasonable. (Library code shouldn't call these, since obviously the library has no idea whether it's safe for every application that will ever use the library.) If there is some critical clean up that must be performed before an application exits but you know it's limited, then the application can register that clean up code via atexit() or at_quick_exit().
So basically decide what you need cleaned up, document it, implement it, and try to make sure it's tested.
It is acceptable to terminate the program if it cannot handle the error gracefully. There are a few things you can do:
Call abort() if you need a core dump.
Call exit() if you want the routines registered with atexit() to get a chance to run (this is also what most likely calls the destructors of global C++ objects).
Call _exit() to terminate a process immediately.
There is nothing wrong with using those functions as long as you understand what you are doing, know your other choices, and choose that path willingly. After all, that's why those functions exist. So if you don't think it makes any sense to try to handle the error or do anything else when it happens, go ahead. What I would probably do is try to log some informative message (say, to syslog), and call _exit. If logging fails, call abort to get a core dump on termination.
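The difference between the three calls can be observed from a parent process. The following is a minimal POSIX sketch with hypothetical names (ExitStyle, g_report_fd, report_handler_ran, atexit_handler_runs): a forked child registers an atexit handler that writes one byte to a pipe, terminates in the requested way, and the parent checks whether the byte arrived.

```cpp
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum class ExitStyle { Exit, Abort, ImmediateExit };

static int g_report_fd = -1;  // write end of the pipe, set in the child only

extern "C" void report_handler_ran() {
    const char mark = 'x';
    ssize_t ignored = write(g_report_fd, &mark, 1);
    (void)ignored;
}

// Forks a child that registers an atexit handler and then terminates in the
// requested way. Returns true if the handler ran in the child.
bool atexit_handler_runs(ExitStyle style) {
    int fds[2];
    if (pipe(fds) != 0) return false;
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (pid == 0) {
        close(fds[0]);
        g_report_fd = fds[1];
        std::atexit(report_handler_ran);
        switch (style) {
            case ExitStyle::Exit:          std::exit(1);
            case ExitStyle::Abort:         std::abort();
            case ExitStyle::ImmediateExit: _exit(1);
        }
        _exit(1);  // not reached; guards against falling out of the switch
    }
    close(fds[1]);
    char mark = 0;
    // read() returns 0 if the child died without writing anything.
    bool ran = read(fds[0], &mark, 1) == 1;
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    return ran;
}
```

Only ExitStyle::Exit is expected to report true; abort() and _exit() both skip the registered handlers.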
I'd suggest calling a global function:
#include <cstdlib>

// Single place that decides how the program terminates.
void stopProgram() {
    std::exit(1);
}
Later you can change its behavior, so it is maintainable.
As you pointed out, having an exit or abort thrown around throughout your code is not maintainable ... additionally, there may be a mechanism in the future that could allow you to recover from an error, or handle an error in a more graceful manner than simply exiting, and if you've already hard-coded this functionality in, then it would be very hard to undo.
Throwing an exception that is caught in main() is your best bet at this point, and it will also give you flexibility in the future should you run the code under a different scenario that allows you to recover from errors or handle them differently. Additionally, throwing exceptions could help should you decide to add more debugging support, as it gives you isolated and maintainable points in the software where you can log and record the program state before you decide to let the program exit.
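A single catch point near main() might look roughly like this. It is a minimal sketch; ValidationError, run_to_exit_code and validate_input are hypothetical names chosen for illustration, not anything from the code being refactored.

```cpp
#include <cstdlib>
#include <functional>
#include <iostream>
#include <stdexcept>

// Hypothetical error type for expected validation failures.
struct ValidationError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Single exit point: run the job and turn any escaping exception into an
// exit code. Stack unwinding runs the destructors of RAII objects on the way.
int run_to_exit_code(const std::function<void()>& job) {
    try {
        job();
        return EXIT_SUCCESS;
    } catch (const ValidationError& e) {
        std::cerr << "validation failed: " << e.what() << '\n';
    } catch (const std::exception& e) {
        std::cerr << "unexpected error: " << e.what() << '\n';
    }
    return EXIT_FAILURE;
}

// Hypothetical validation step somewhere deep inside the program.
void validate_input(int value) {
    if (value < 0) {
        throw ValidationError("value must be non-negative");
    }
}
```

main() then reduces to returning run_to_exit_code(...) with the real job, and logging or recovery can later be added in that one place.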
I wrote a Linux program based on a buggy open source library. This library sometimes triggers segfaults that I cannot control. And of course once the library has segfaults, the entire program dies. However, I have to make sure my program keeps running even if the library has segfaults. This is because my program sort of serves as a "server" and it needs to at least tell the clients something bad happened and recover from the errors, rather than chicken out... Is there any way to do that?
I understand in Java one just needs to catch an exception. But how does C++ handle this?
[UPDATE]I understand there is also exception handling in C++, but Segfault is not an exception, is it? I don't think anything is thrown when segfault happens. You have to explicitly "throw" something to use try.... catch.... as far as I know.
Thanks so much, I am quite new to C++.
You cannot reliably resume execution after a segmentation violation. If your program must remain running, fence off the offending library in a separate process and communicate with it over a pipe. When it takes a segmentation violation, your program will notice the closed pipe.
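As a rough illustration of fencing a call off in a separate process, here is a minimal POSIX sketch (C++17 for std::optional). The names run_isolated, stable_call and crashing_call are hypothetical; a real server would more likely keep a long-lived worker process and a message protocol rather than fork per call.

```cpp
#include <optional>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `fn` in a child process and returns its result, or std::nullopt if the
// child died before reporting one (for example from a segmentation fault).
std::optional<int> run_isolated(int (*fn)()) {
    int fds[2];
    if (pipe(fds) != 0) return std::nullopt;
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return std::nullopt;
    }
    if (pid == 0) {
        close(fds[0]);
        int result = fn();  // the risky library call happens in the child
        ssize_t n = write(fds[1], &result, sizeof result);
        _exit(n == static_cast<ssize_t>(sizeof result) ? 0 : 1);
    }
    close(fds[1]);
    int result = 0;
    // read() returns 0 when the pipe was closed without any data: the child
    // crashed before it could report.
    ssize_t n = read(fds[0], &result, sizeof result);
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    if (n != static_cast<ssize_t>(sizeof result)) return std::nullopt;
    return result;
}

// Hypothetical stand-ins for calls into the buggy library.
int stable_call() { return 42; }
int crashing_call() { raise(SIGSEGV); return 0; }
```

When run_isolated returns std::nullopt, the server can tell its client that the request failed and carry on, because its own memory was never touched by the crashing code.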
Unfortunately, you cannot make the program continue. The buggy code that resulted in SIGSEGV usually triggers undefined behaviour like dereferencing a null pointer or reading garbage memory. You cannot possibly continue if your code operates on invalid data.
You can handle the signal, but the most you can do is dump the stack trace and die.
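A handler that only reports and then dies could be sketched like this. It is a minimal POSIX sketch with hypothetical names (report_and_die, install_crash_reporter); it prints a fixed message instead of a stack trace, because only async-signal-safe calls are allowed inside the handler.

```cpp
#include <signal.h>
#include <unistd.h>

extern "C" void report_and_die(int sig) {
    // Only async-signal-safe calls here: write() and raise() qualify.
    static const char msg[] = "fatal: segmentation fault, terminating\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    // SA_RESETHAND has already restored the default action, so re-raising
    // terminates the process with the original signal.
    raise(sig);
}

// Installs the reporter for SIGSEGV; returns true on success.
bool install_crash_reporter() {
    struct sigaction sa = {};
    sa.sa_handler = report_and_die;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;  // one-shot: default action after first use
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

Because the process still dies from SIGSEGV, a parent or supervisor sees the same termination status it would have seen without the handler.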
C and C++ are inherently unsafe, you cannot handle errors triggered by undefined behaviour and let the program continue.
You can use signal handlers. It's not really recommended though because you can't guarantee that you've eliminated the cause of the problem. The best thing to do would be to isolate it in a separate process- this is the approach Google Chrome takes.
If it's FOSS, the easiest thing to do would be to just debug it.
If you have access to the source, it might be useful to run the program in a debugger like GDB. GDB stops at the line which causes the segfault.
If you really want to catch the signal though, you need to hook up a signal handler, using the signal system call. I would probably just stick to the debugger though.
EDIT:
Since you write that the library segfaults, I would just like to point out the first rule of programming: It's always your fault. Especially if you are new to C++, the segfault probably happens because you have used the library incorrectly in some way. C++ is a very subtle language and it is easy to do things you don't intend.
As mentioned elsewhere, you can't catch segfault signals with try blocks or "map" segmentation violations to anything. It's a really bad idea to handle SIGSEGV yourself. A SEGV from C++ code is a severe error. You can use gdb to figure out why it happens and fix it.
When I exit my C++ program it crashes with errors like:
EAccessViolation with message 'Access violation at address 0...
and
Abnormal Program Termination
It is probably caused by some destructor because it happens only when the application exits. I use a few external libraries and cannot find the code that causes it. Is there a function that forces immediate program exit (something like kill in Linux) so that memory would have to be freed by the operating system? I could use this function in app exit event.
I know that it would be a terrible solution because it'd just hide the problem.
I'm just asking out of sheer curiosity, so please don't give me -1 :)
I tried exit(0) from stdlib but it didn't help.
EDIT:
Thanks for your numerous replies:)
I use C++Builder 6 (I know it's outdated but for some reasons I had to use it). My app uses a neural network library (FANN). Using the debugger I found that the program crashes in:
~neural_net()
{
    destroy();
}
destroy() calls another function, fann_safe_free(ptr), multiple times. It is defined as:
#define fann_safe_free(x) {if(x) { free(x); x = NULL; }}
The library works great, problem only appears when it does cleaning. That's why I asked about so brutal solution. My app is multi-threaded but other threads operate on different data.
I will analyze my code for the n-th time(the bug must be somewhere), thanks for all your tips :)
You should fix the problem.
First step: find and check all functions you register with atexit() (not many I hope).
Second step: find all global variables and check their destructors.
Third step: find all static function variables and check their destructors.
But otherwise you can abort.
Note: abort is for Abnormal program termination.
abort()
The difference: (note letting an application leave the main function is the equivalent of exit())
exit()
Call the functions registered with the atexit(3) function, in the reverse order of their registration. This includes the destruction of all global (static storage duration) variables.
Flush all open output streams.
Close all open streams.
Unlink all files created with the tmpfile(3) function.
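abort()
Does not call the functions registered with atexit(3) and does not run destructors.
Whether open streams are flushed and closed is implementation-defined (some implementations do so, but do not rely on it).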
It's a terrible solution for more than one reason. It will hide the problem (maybe), but it could also corrupt data, depending on the nature of your application.
Why don't you use a debugger and try to find out what is causing the error?
If your application is multi-threaded, you should make sure that all threads are properly shut down before exiting the application. This is a fairly common cause of that type of error on exit, when a background thread is attempting to use memory/objects that have already been destructed.
Edit:
based on your updated question, I have the following suggestions:
Try to find out more specifically what is causing the crash in the destructor.
The first thing I would do is make sure that it's not trying to destruct a NULL object. When you get your crash in ~neural_net in your debugger, check your "this" pointer to make sure it's not NULL. If it is, then check your call-stack and see where it's being destructed, and do a check to make sure it's not NULL before calling delete.
If it's not NULL, then I would unroll that macro in destroy, so you can see if it's crashing on the call to free.
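One way to "unroll" the macro is to replace it with a function, so that a debugger can break inside it and show which pointer is being freed. This is a minimal sketch: safe_free, Net, destroy and destroy_twice_is_safe are hypothetical stand-ins, not FANN's actual types or functions.

```cpp
#include <cstdlib>

// Function form of a free-and-null macro: unlike the macro, it has its own
// stack frame, so a debugger can break on it and inspect the pointer.
template <typename T>
void safe_free(T*& ptr) {
    if (ptr != nullptr) {
        std::free(ptr);
        ptr = nullptr;
    }
}

// Hypothetical stand-in for the library's network structure.
struct Net {
    int* weights = nullptr;
    int* outputs = nullptr;
};

// One call per member, so a crash points at a specific line.
void destroy(Net& net) {
    safe_free(net.weights);
    safe_free(net.outputs);
}

// Demo: destroying twice is harmless because the pointers are nulled.
bool destroy_twice_is_safe() {
    Net net;
    net.weights = static_cast<int*>(std::malloc(4 * sizeof(int)));
    net.outputs = static_cast<int*>(std::malloc(2 * sizeof(int)));
    destroy(net);
    destroy(net);
    return net.weights == nullptr && net.outputs == nullptr;
}
```

If the crash still happens inside free with a non-null pointer, the likely suspects are a pointer that was never allocated with malloc, heap corruption elsewhere, or a copy of the object that frees the same memory.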
You could try calling abort(); (declared in <stdlib.h> and in <process.h>)
The version in VisualC++, however, will print a warning message as it exits: "This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information."
On Linux/UNIX you can use _exit:
#include <unistd.h>
void _exit(int status);
The function _exit() is like exit(), but does not call any functions registered with atexit() or on_exit(). Whether it flushes standard I/O buffers and removes temporary files created with tmpfile(3) is implementation dependent. On the other hand, _exit() does close open file descriptors, and this may cause an unknown delay, waiting for pending output to finish. If the delay is undesired, it may be useful to call functions like tcflush() before calling _exit(). Whether any pending I/O is cancelled, and which pending I/O may be cancelled upon _exit(), is implementation-dependent.
Have you tried the gruesome step-by-step approach? If your project/solution is simply too large to do so, maybe you could try segmenting it, assuming you use a modular build, and test each component individually. Without any code or visible destructors, abstract advice is all I can give you, I'm afraid. But nonetheless I hope narrowing down the debugging field will help in some way.
Good luck with getting an answer :)
That immediate program exit (and yes, that's a terrible solution) is abort()
That happens most likely because a NULL pointer is being accessed. Depending on your OS try getting a stack trace and identify the culprit, don't just exit.
If you use Linux, valgrind should help you find the problem.
If it is Windows, try one of these: MemoryValidator, BoundsChecker or other tools like these.
Simply closing your application is not the best way to deal with bugs ...