Keep running a C++ program after a SIGABRT signal

I use a third-party library in my C++ program which, under certain circumstances, raises SIGABRT. I know that freeing an uninitialized pointer or something similar can cause this signal. Nevertheless, I want to keep my program running after the signal is raised, show a message, and let the user change the settings so the program can cope with it.
(I use Qt for development.)
How can I do that?

I use a third library in my c++ program which under certain circumstances emits SIGABRT signal
If you have the source code of that library, you need to correct the bug (and the bug could be in your code).
BTW, probably SIGABRT happens because abort(3) gets indirectly called (perhaps because you violated some conventions or invariants of that library, which might use assert(3) - and indirectly call abort). I guess that in caffe the various CHECK* macros could indirectly call abort. I leave you to investigate that.
If you don't have the source code or don't have the capacity or time to fix that bug in that third party library, you should give up using that library and use something else.
In many cases, you should trust external libraries more than your own code. You are probably abusing or misusing that library. Read its documentation carefully and make sure that your own code calls it correctly and respects its invariants and conventions. The bug is probably in your own code, at some other place.
I want to keep running my program
This is impossible (or very unreliable, so unreasonable). I guess that your program has some undefined behavior. Be very scared, and work hard to avoid UB.
You need to improve your debugging skills. Learn to make better use of the gdb debugger, valgrind, GCC sanitizers (e.g. instrumentation options like -fsanitize=address, -fsanitize=undefined and others), etc.
It is not reasonable to try to handle SIGABRT, even if in principle you can (if you do, read signal(7) and signal-safety(7) carefully, along with hints about handling Unix signals in Qt). I strongly recommend against even trying to catch SIGABRT.

Unfortunately, you can't.
The SIGABRT signal is raised by abort() itself, and abort() does not return to its caller even if you catch the signal.
Ref:
https://stackoverflow.com/a/3413215/9332965

You can handle SIGABRT, but you probably shouldn't.
The "can" is straightforward - just trap it in the usual way, using signal(). You don't want to return from this signal handler - you probably got here from abort() - possibly originally from assert() - and that function will exit after raising the signal. You could however longjmp() back to a state you set up earlier.
The "shouldn't" is because once SIGABRT has been raised, your data structures (including those of Qt and any other libraries) are likely in an inconsistent state and actually using any of your program's state is likely to be unpredictable at best. Apart from exiting immediately, there's not much you can do other than exec() a replacement program to take over in a sane initial state.
If you just want to show a friendly message, then you perhaps could exec() a small program to do that (or just use xmessage), but beware of exiting this with a success status where you would have had an indication of the SIGABRT otherwise.
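A minimal sketch of the "can" part, assuming a POSIX system. The handler name on_sigabrt, the message text, the exit code 70 and the helper exit_status_of_aborting_child are all made up for illustration. The handler uses only async-signal-safe calls and terminates with a non-zero status instead of returning; the abort() call in the child merely stands in for the failing library call.

```cpp
#include <csignal>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical handler: report the failure and terminate. It never returns,
// because abort() would end the process anyway once the handler came back.
extern "C" void on_sigabrt(int) {
    const char msg[] = "fatal: library called abort()\n";
    // write() and _exit() are async-signal-safe; printf() and exit() are not.
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    _exit(70);  // non-zero, so the parent still sees a failure
}

// Demonstration helper (an assumption of this sketch): runs the aborting
// call in a child process and returns the child's exit status, or -1.
int exit_status_of_aborting_child() {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        std::signal(SIGABRT, on_sigabrt);
        std::abort();  // stands in for the failing third-party call
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The non-zero exit code is deliberate: it keeps the warning above in mind, so a caller or shell script does not mistake the abort for a successful run.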

Unfortunately, there isn't much you can do to prevent SIGABRT from terminating your program, at least not without modifying some code, which was hopefully written by you.
You would either need to change the code so that it does not call abort(), or you would have to spawn a new process that runs the code instead of the current process. I do not suggest you use a child process to solve this problem. The abort is most likely caused by misuse of an API or of computer resources, such as running low on memory.

Related

How can I block in my Qt application signal SIGSEGV from cURL library?

My Qt app uses cURL library to send HTTP requests and sometimes cURL sends SIGSEGV and after that my app crashes.
Is it possible to catch this signal and prevent segmentation fault ?
TL;DR:
Don't attempt to block or ignore SIGSEGV. It will just bring you pain.
Explanation:
SIGSEGV means very bad things have happened, and your program has wandered into memory that it is not allowed to access. This means your program is utterly hosed.
Pardon my Canadian.
Your program is broken. You can't trust it to behave in any rational fashion anymore and should actually thank that SIGSEGV for letting you know this. As much as you don't want to see SIGSEGV, it's a hell of a lot better than the alternative of a broken program continuing to run and spewing out false information.
You can catch a SIGSEGV with a signal handler, but say you do catch it and try to handle it. Odds are the program goes right back into the code that triggered the SIGSEGV and very likely raises it again as soon as the signal handler returns. Or does something else weird. Possibly something worse.
It's not really even worth catching SIGSEGV with a signal handler to make an attempt to output a message, because what's the message going to say? "Oh Smurf. Something very bad happened."?
So, yes, you can catch it, but you can't do anything to recover. The program is already too damaged to continue. That leaves preventing SIGSEGV in the first place, and that means fixing your code.
The best thing you can do is determine why the program crashed. The easiest way to do that is run your program with whatever debugger came with your development environment, wait for the crash, then inspect the backtrace for hints on how the program came to meet its doom.
Typically a link to Eric Lippert's How to debug small programs can be found right about here, and I can't think of a good reason to leave it out.
One other thing, though. cURL is a pretty solid library. Odds are overwhelmingly good that you are using it wrong or passed it bad information: a dead pointer, an unterminated C-style string, a pointer to an explosive function. I'd start by looking at how you are using cURL.
No, it's not. Instead, fix the bug and prevent the segmentation fault that way. Presumably your platform supplies some kind of debugger.

Let gcc call a specific function between c operations

I am trying to make a watchdog for a single-threaded program. The problem is, that we run some foreign so/dlls (the code is available) which means that we pass control there.
The idea is to recompile these with some callback to a sort of a cancellation routine.
Is it possible to let GCC call some callback functions in between of C-transactions or asm-transactions in this compiled foreign code?
What I'm about to suggest does not involve the compiler, but this sounds like a problem you can solve at runtime with POSIX signals or ptrace ...
With a signal you can interrupt the current context, similar to what would happen in kernel mode with an IRQ. You will have to worry about being "signal-safe" (example: your handler can't use malloc because it might interrupt malloc itself while its data structures are in an indeterminate state.)
With ptrace you can step through instructions in another process as if in a debugger.
Tread carefully, as these are difficult mechanisms to use correctly and it's very easy to shoot yourself in the foot.
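A minimal sketch of the signal-safe pattern described above, assuming a POSIX system. The names cancel_requested, request_cancel and run_steps are hypothetical, and std::raise(SIGUSR1) stands in for whatever watchdog would actually deliver the signal. The handler only sets a flag; the real work of cancelling happens at a point where the program itself decides it is safe.

```cpp
#include <csignal>

// The handler does only async-signal-safe work: it records the request.
volatile std::sig_atomic_t cancel_requested = 0;

extern "C" void request_cancel(int) {
    cancel_requested = 1;
}

// Hypothetical long-running loop that polls the flag at safe points.
// Returns the number of steps completed before cancellation was seen.
int run_steps(int max_steps, int raise_at) {
    std::signal(SIGUSR1, request_cancel);
    cancel_requested = 0;
    int done = 0;
    while (done < max_steps && !cancel_requested) {
        ++done;  // one unit of work
        if (done == raise_at) {
            std::raise(SIGUSR1);  // stands in for the external watchdog
        }
    }
    return done;
}
```

This only helps if the interrupted code checks the flag, which is why recompiling the foreign code with such a check (as the question proposes) or interrupting it from outside are the two options.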

How to end a C++ program after error?

I am refactoring an old code, and one of the things I'd like to address is the way that errors are handled. I'm well aware of exceptions and how they work, but I'm not entirely sure they're the best solution for the situations I'm trying to handle.
In this code, if things don't validate, there's really no reason or advantage to unwind the stack. We're done. There's no point in trying to save the ship, because it's a non-interactive code that runs in parallel through the Sun Grid Engine. The user can't intervene. What's more, these validation failures don't really represent exceptional circumstances. They're expected.
So how do I best deal with this? One thing I'm not sure I want is an exit point in every class method that can fail. That seems unmaintainable. Am I wrong? Is it acceptable practice to just call exit() or abort() at the failure point in codes like this? Or should I throw an exception all the way back to some generic catch statement in main? What's the advantage?
Throwing an exception to be caught in main and then exiting means your RAII resource objects get cleaned up. On most systems this isn't needed for a lot of resource types. The OS will clean up memory, file handles, etc. (though I've used a system where failing to free memory meant it remained allocated until system restart, so leaking on program exit wasn't a good idea.)
But there are other resource types that you may want to release cleanly such as network or database connections, or a mechanical device you're driving and need to shut down safely. If an application uses a lot of such things then you may prefer to throw an exception to unwind the stack back to main, and then exit.
So the appropriate method of exiting depends on the application. If an application knows it's safe then calling _Exit(), abort(), exit(), or quick_exit() may be perfectly reasonable. (Library code shouldn't call these, since obviously the library has no idea whether it's safe for every application that will ever use the library.) If there is some critical cleanup that must be performed before an application exits but you know it's limited, then the application can register that cleanup code via atexit() or at_quick_exit().
So basically decide what you need cleaned up, document it, implement it, and try to make sure it's tested.
It is acceptable to terminate the program if it cannot handle the error gracefully. There are few things you can do:
Call abort() if you need a core dump.
Call exit() if you want to give the routines registered with atexit() a chance to run (this is also when destructors for global C++ objects are called).
Call _exit() to terminate a process immediately.
There is nothing wrong with using those functions as long as you understand what you are doing, know your other choices, and choose that path willingly. After all, that's why those functions exist. So if you don't think it makes any sense to try to handle the error or do anything else when it happens, go ahead. What I would probably do is try to log an informative message (say, to syslog) and call _exit(). If logging fails, call abort() to get a core dump on termination.
I'd suggest calling a global function:
#include <cstdlib>

void stopProgram() {
    std::exit(1);
}
Later you can change its behavior in one place, so it stays maintainable.
As you pointed out, having an exit or abort thrown around throughout your code is not maintainable ... additionally, there may be a mechanism in the future that could allow you to recover from an error, or handle an error in a more graceful manner than simply exiting, and if you've already hard-coded this functionality in, then it would be very hard to undo.
Throwing an exception that is caught in main() is your best bet at this point. It also gives you flexibility in the future, should you run the code under a different scenario that allows you to recover from errors or handle them differently. Additionally, throwing exceptions could help should you decide to add more debugging support, etc., as it gives you spots to implement logging features and record the program state from isolated and maintainable points in the software before you decide to let the program exit.
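A minimal sketch of that approach. The names ValidationError, validate and run are hypothetical, and the non-negative check is only a placeholder for a real validation rule. The failure is thrown where it is detected and turned into an exit code in exactly one place.

```cpp
#include <cstdlib>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical error type for expected validation failures.
struct ValidationError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical validation step somewhere deep in the call stack.
void validate(int value) {
    if (value < 0) {
        throw ValidationError("value must be non-negative: " +
                              std::to_string(value));
    }
}

// What main() would delegate to: one place decides how a failure ends the run.
int run(int value) {
    try {
        validate(value);
        return EXIT_SUCCESS;
    } catch (const ValidationError& e) {
        // The stack has been unwound here, so RAII cleanup has already run.
        std::cerr << "validation failed: " << e.what() << '\n';
        return EXIT_FAILURE;
    }
}
```

A real main() would then be little more than `return run(...);`, and swapping the catch block for logging or a recovery path later does not touch the code that throws.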

Catch Segfault or any other errors/exceptions/signals in C++ like catching exceptions in Java

I wrote a Linux program based on a buggy open source library. This library sometimes triggers segfaults that I cannot control. And of course once the library has segfaults, the entire program dies. However, I have to make sure my program keeps running even if the library has segfaults. This is because my program sort of serves as a "server" and it needs to at least tell the clients something bad happened and recover from the errors, rather than chicken out... Is there any way to do that?
I understand in Java one just needs to catch an exception. But how does C++ handle this?
[UPDATE]I understand there is also exception handling in C++, but Segfault is not an exception, is it? I don't think anything is thrown when segfault happens. You have to explicitly "throw" something to use try.... catch.... as far as I know.
Thanks so much, I am quite new to C++.
You cannot reliably resume execution after a segmentation violation. If your program must remain running, fence off the offending library in a separate process and communicate with it over a pipe. When it takes a segmentation violation, your program will notice the closed pipe.
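A rough sketch of that isolation, assuming a POSIX system. ChildResult and run_isolated are hypothetical names, and the callable passed in stands in for the call into the buggy library. The parent reads the pipe until it is closed and then checks whether the child was killed by a signal.

```cpp
#include <csignal>
#include <cstdlib>
#include <functional>
#include <stdexcept>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct ChildResult {
    bool crashed = false;   // true if the child was killed by a signal
    int signal_number = 0;  // the terminating signal, or 0
    std::string output;     // whatever the child wrote to the pipe
};

// Runs `work` in a forked child; `work` receives the pipe's write end.
ChildResult run_isolated(const std::function<void(int)>& work) {
    int fds[2];
    if (pipe(fds) != 0) throw std::runtime_error("pipe failed");
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        throw std::runtime_error("fork failed");
    }
    if (pid == 0) {      // child: do the risky work, then leave quietly
        close(fds[0]);
        work(fds[1]);
        _exit(0);
    }
    close(fds[1]);       // parent: read until the child closes the pipe
    ChildResult result;
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0) {
        result.output.append(buf, static_cast<std::size_t>(n));
    }
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    if (WIFSIGNALED(status)) {
        result.crashed = true;
        result.signal_number = WTERMSIG(status);
    }
    return result;
}
```

If the child dies from SIGSEGV or SIGABRT, the read loop ends at end-of-file and the parent's own state is untouched, so the server can report the failure to its clients and start a fresh child.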
Unfortunately, you cannot make the program continue. The buggy code that resulted in SIGSEGV usually triggers undefined behaviour like dereferencing a null pointer or reading garbage memory. You cannot possibly continue if your code operates on invalid data.
You can handle the signal, but the most you can do is dump the stack trace and die.
C and C++ are inherently unsafe, you cannot handle errors triggered by undefined behaviour and let the program continue.
You can use signal handlers. It's not really recommended though because you can't guarantee that you've eliminated the cause of the problem. The best thing to do would be to isolate it in a separate process- this is the approach Google Chrome takes.
If it's FOSS, the easiest thing to do would be to just debug it.
If you have access to the source, it might be useful to run the program in a debugger like GDB. GDB stops at the line which causes the segfault.
If you really want to catch the signal though, you need to hook up a signal handler, using the signal system call. I would probably just stick to the debugger though.
EDIT:
Since you write that the library segfaults, I would just like to point out the first rule of programming: It's always your fault. Especially if you are new to C++, the segfault probably happens because you have used the library incorrectly in some way. C++ is a very subtle language and it is easy to do things you don't intend.
As mentioned over here, you can't catch segfault signals with try blocks or "map" segmentation violations to anything. It's a really bad idea to handle SIGSEGV yourself. A SEGV from C++ code is a severe error. You can use gdb to figure out why it happens and fix it.

How to terminate program in C++

When I exit my C++ program it crashes with errors like:
EAccessViolation with message 'Access violation at address 0...
and
Abnormal Program Termination
It is probably caused by some destructor because it happens only when the application exits. I use a few external libraries and cannot find the code that causes it. Is there a function that forces immediate program exit (something like kill in Linux) so that memory would have to be freed by the operating system? I could use this function in app exit event.
I know that it would be a terrible solution because it'd just hide the problem.
I'm just asking out of sheer curiosity, so please don't give me -1 :)
I tried exit(0) from stdlib but it didn't help.
EDIT:
Thanks for your numerous replies:)
I use C++Builder 6 (I know it's outdated, but for some reasons I had to use it). My app uses a neural network library (FANN). Using the debugger I found that the program crashes in:
~neural_net()
{
    destroy();
}
destroy() calls another function, fann_safe_free(ptr), multiple times. It is defined as:
#define fann_safe_free(x) {if(x) { free(x); x = NULL; }}
The library works great, problem only appears when it does cleaning. That's why I asked about so brutal solution. My app is multi-threaded but other threads operate on different data.
I will analyze my code for the n-th time(the bug must be somewhere), thanks for all your tips :)
You should fix the problem.
First step: find and check all functions you register with atexit() (not many, I hope).
Second step: find all global variables and check their destructors.
Third step: find all function-local static variables and check their destructors.
But otherwise you can abort.
Note: abort is for Abnormal program termination.
abort()
The difference: (note letting an application leave the main function is the equivalent of exit())
exit()
Call the functions registered with the atexit(3) function, in the reverse order of their registration. This includes the destruction of all global (static storage duration) variables.
Flush all open output streams.
Close all open streams.
Unlink all files created with the tmpfile(3) function.
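abort()
Raises SIGABRT and terminates the program abnormally. Functions registered with atexit(3) and destructors are not called.
Whether open streams are flushed and closed is implementation-defined.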
It's a terrible solution for more than one reason. It will hide the problem (maybe), but it could also corrupt data, depending on the nature of your application.
Why don't you use a debugger and try to find out what is causing the error?
If your application is multi-threaded, you should make sure that all threads are properly shut down before exiting the application. This is a fairly common cause of that type of error on exit, when a background thread is attempting to use memory/objects that have already been destructed.
Edit:
based on your updated question, I have the following suggestions:
Try to find out more specifically what is causing the crash in the destructor.
The first thing I would do is make sure that it's not trying to destruct a NULL object. When you get your crash in ~neural_net in your debugger, check your "this" pointer to make sure it's not NULL. If it is, then check your call-stack and see where it's being destructed, and do a check to make sure it's not NULL before calling delete.
If it's not NULL, then I would unroll that macro in destroy, so you can see if it's crashing on the call to free.
You could try calling abort(); (declared in <stdlib.h> and in <process.h>)
The version in VisualC++, however, will print a warning message as it exits: "This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information."
On Linux/UNIX you can use _exit:
#include <unistd.h>
void _exit(int status);
The function _exit() is like exit(), but does not call any functions registered with atexit() or on_exit(). Whether it flushes standard I/O buffers and removes temporary files created with tmpfile(3) is implementation dependent. On the other hand, _exit() does close open file descriptors, and this may cause an unknown delay, waiting for pending output to finish. If the delay is undesired, it may be useful to call functions like tcflush() before calling _exit(). Whether any pending I/O is cancelled, and which pending I/O may be cancelled upon _exit(), is implementation-dependent.
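The difference between exit() and _exit() with respect to atexit() can be checked with a small experiment. This is only a sketch, assuming a POSIX system; g_report_fd, report_cleanup and atexit_handler_ran are hypothetical names. A child process registers a handler that writes one byte to a pipe and then terminates either way, and the parent reports whether the byte arrived.

```cpp
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

namespace {
int g_report_fd = -1;  // write end of the pipe, used by the atexit handler

void report_cleanup() {
    const char mark = 'c';
    ssize_t n = write(g_report_fd, &mark, 1);
    (void)n;
}
}  // namespace

// Forks a child that registers an atexit() handler and then terminates with
// exit() or _exit(). Returns true if the parent received the handler's mark.
bool atexit_handler_ran(bool use_exit) {
    int fds[2];
    if (pipe(fds) != 0) return false;
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (pid == 0) {
        close(fds[0]);
        g_report_fd = fds[1];
        std::atexit(report_cleanup);
        if (use_exit) std::exit(0);  // runs registered handlers
        _exit(0);                    // skips them
    }
    close(fds[1]);
    char mark = 0;
    bool ran = read(fds[0], &mark, 1) == 1 && mark == 'c';
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    return ran;
}
```

With exit() the parent sees the mark; with _exit() the pipe is simply closed without it, which is the behaviour a crashing-on-exit program would rely on to skip faulty cleanup code.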
Have you tried the gruesome step-by-step approach? If your project/solution is simply too large to do so, maybe you could try segmenting it, assuming you use a modular build, and test each component individually. Without any code or visible destructors, abstract advice is all I can give you, I'm afraid. Nonetheless, I hope that narrowing down the debugging field will help in some way.
Good luck with getting an answer :)
That immediate program exit (and yes, that's a terrible solution) is abort()
That happens most likely because a NULL pointer is being accessed. Depending on your OS try getting a stack trace and identify the culprit, don't just exit.
If you use linux, valgrind should solve your problem.
but if it is windows, try one of these: MemoryValidator, BoundsChecker or other tools like these.
Simply closing your application is not the best way to deal with bugs ...