signal() : any performance impact? - c++

I need to catch SIGABRT, SIGSEGV and SIGILL to present the user with a proper critical error message when something out of my control fails and the program needs to exit.
However, my program does a lot of realtime computing, so performance is important.
Does signal() ( http://www.cplusplus.com/reference/csignal/signal/ ) cause any performance loss (some sort of constant monitoring?) or none at all (only triggered when an exception happens, with no performance lost otherwise)?
edit: My software runs on Windows (7 and higher) and OS X (10.7 and higher).

If your time-critical process catches signals, there is no "special" time wasted. The kernel does hold a table of signals and actions for your process, which it has to consult when a signal is sent. But every way of sending a message to a process or invoking a handler takes time. A message queue or waiting on a "flag" will have nearly the same overhead.
But using signals can have other implications which should be mentioned. Nearly every blocking system call can be interrupted if a signal arrives. The call then fails with errno set to EINTR. If many signals are delivered to your process, this may slow down your application a lot, because you always have to check for EINTR and re-enter the system call. And every system call is a bit expensive. So looping a lot over system calls that fail with EINTR can be bad design.
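The EINTR check described above usually ends up as a small retry loop. The following is a minimal sketch for POSIX systems only (so it applies to the OS X side of the question, not to Windows). The helper name read_retry is made up for this illustration.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>   // read(), POSIX only

// Hypothetical helper: call read() again for as long as it fails with EINTR,
// i.e. for as long as a signal handler interrupted it before any data arrived.
// Returns whatever read() returns once the failure is not EINTR.
ssize_t read_retry(int fd, void* buf, std::size_t count) {
    ssize_t n;
    do {
        n = ::read(fd, buf, count);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

Each retry is another system call, which is the cost the answer warns about. Installing handlers with sigaction() and the SA_RESTART flag lets the kernel restart many (not all) interrupted calls without such a loop.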
But in your question, you only handle SIGABRT, SIGSEGV and SIGILL. These signals are typically raised only for rare exceptional conditions. So don't be afraid to use them as needed. But avoid using these signals frequently for your own IPC. That can be done but is very bad design. For user IPC there are better signals (such as SIGUSR1 and SIGUSR2) and also better methods.
In short: if you only catch exception signals, you don't have any time-critical issues here.
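For illustration, installing the three handlers could look roughly like the sketch below. It uses only std::signal from the standard library, so the installation part should build on both Windows and OS X. The names on_fatal_signal and install_crash_handlers and the message text are assumptions made for this example. The message is written only on POSIX here, because the set of calls that is safe inside a handler differs on Windows.

```cpp
#include <csignal>
#include <cstdlib>
#ifndef _WIN32
#include <unistd.h>   // write(), POSIX only
#endif

// Placeholder text for this sketch.
static const char kCrashMsg[] = "Fatal error: the program has to exit.\n";

// Runs only when one of the registered signals is actually delivered.
extern "C" void on_fatal_signal(int) {
#ifndef _WIN32
    // write() is async-signal-safe; printf and iostreams are not.
    ssize_t ignored = write(2, kCrashMsg, sizeof kCrashMsg - 1);
    (void)ignored;
#endif
    std::_Exit(EXIT_FAILURE);  // exit without running destructors or atexit handlers
}

// Registers the handler once at startup. Returns false if any registration fails.
bool install_crash_handlers() {
    const int signals[] = {SIGABRT, SIGSEGV, SIGILL};
    for (int sig : signals) {
        if (std::signal(sig, on_fatal_signal) == SIG_ERR) return false;
    }
    return true;
}
```

Calling install_crash_handlers() once at the start of main is the only work done on the normal path. Nothing polls for signals afterwards, which matches the answer's point that there is no ongoing cost.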

Related

How to safely terminate a multithreaded process

I am working on a project where we have used pthread_create to create several child threads.
The thread creation logic is not in my control as its implemented by some other part of project.
Each thread perform some operation which takes more than 30 seconds to complete.
Under normal condition the program works perfectly fine.
But the problem occurs at the time of termination of the program.
I need to exit from main as quickly as possible when I receive the SIGINT signal.
When I call exit() or return from main, the exit handlers and global objects' destructors are called. And I believe these operations are having a race condition with the running threads. And I believe there are many race conditions, which makes it hard to solve all of them.
The way I see it there are two solutions.
1. Call _exit() and forget all de-allocation of resources.
2. When SIGINT arrives, close/kill all threads and then call exit() from the main thread, which will release resources.
I think 1st option will work, but I do not want to abruptly terminate the process.
So I want to know if it is possible to terminate all child threads as quickly as possible so that exit handler & destructor can perform required clean-up task and terminate the program.
I have gone through this post, let me know if you know other ways: POSIX API call to list all the pthreads running in a process
Also, let me know if there is any other solution to this problem
What is it that you need to do before the program quits? If the answer is 'deallocate resources', then you don't need to worry. If you call _exit then the program will exit immediately and the OS will clean up everything for you.
Be aware also that what you can safely do in a signal handler is extremely limited, so attempting to perform any cleanup yourself is not recommended. If you're interested, there's a list of what you can do here. But you can't flush a buffered stdio stream with fflush(), for example (and flushing a file is about the only thing I can think of that you might legitimately want to do here). That's off limits.
I need to exit from main as quickly as possible when I receive the SIGINT signal.
How is that defined? Because there's no way to "exit quickly as possible" when you receive one signal like that.
You can either set flag(s), post to semaphore(s), or similar to set a state that tells other threads it's time to shut down, or you can kill the entire process.
If you elect to set flag(s) or similar to tell the other threads to shut down, you set those flags and return from your signal handler and hope the threads behave and the process shuts down cleanly.
If you elect to kill threads, there's effectively no difference in killing a thread, killing the process, or calling _exit(). You might as well just keep it simple and call _exit().
That's all you can choose between when you have to make your decision in a single signal handler call. Pick one.
A better solution is to use escalating signals. For example, when you get SIGQUIT or SIGINT, you set flag(s) or otherwise tell threads it's time to clean up and exit the process - or else. Then, say five seconds later whatever is shutting down your process sends SIGTERM and the "or else" happens. When you get SIGTERM, your signal handler simply calls _exit() - those threads had their chance and they messed it up and that's their fault. Or you can call abort() to generate a core file and maybe provide enough evidence to fix the miscreant threads that won't shut down.
And finally, five seconds later the managing process will nuke the process from orbit with SIGKILL just to be sure.
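The escalation described above could be sketched as follows on a POSIX system. This is only an illustration: the function and variable names are invented, and std::atomic<bool> is assumed to be lock-free on the target (which is what makes it usable inside a handler).

```cpp
#include <atomic>
#include <csignal>
#include <signal.h>   // sigaction(), POSIX only
#include <unistd.h>   // _exit()

// Polled by the worker threads. Assumed lock-free on the target platform.
static std::atomic<bool> g_shutdown_requested{false};

// First stage (SIGINT or SIGQUIT): only record the request and return.
extern "C" void on_polite_request(int) {
    g_shutdown_requested.store(true);
}

// Second stage (SIGTERM): the threads had their chance, leave immediately.
extern "C" void on_final_warning(int) {
    _exit(1);
}

// Installs both stages. Returns false if any sigaction() call fails.
bool install_shutdown_handlers() {
    struct sigaction sa {};
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sa.sa_handler = on_polite_request;
    if (sigaction(SIGINT, &sa, nullptr) != 0) return false;
    if (sigaction(SIGQUIT, &sa, nullptr) != 0) return false;
    sa.sa_handler = on_final_warning;
    return sigaction(SIGTERM, &sa, nullptr) == 0;
}

// Worker threads call this between units of work and wind down when it is true.
bool shutdown_requested() {
    return g_shutdown_requested.load();
}
```

The final SIGKILL stage needs no code in the process, since it cannot be caught. Replacing _exit(1) with abort() in the second handler gives the core-file variant mentioned above.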

Handling Signals in an MPI Application / Gracefully exit

How can signals be handled safely in an MPI application (for example SIGUSR1, which should tell the application that its runtime has expired and that it should terminate within the next 10 minutes)?
I have several constraints:
Finish all parallel/serial IO first before quitting the application!
In all other circumstances the application can exit without any problem
How can this be achieved safely, no deadlocks while trying to exit, and properly leaving the current context jumping back to main() and calling MPI_FINALIZE() ?
Somehow the processes have to agree on exiting (I think this is the same in multithreaded applications), but how is that done efficiently without having to communicate too much? Is anybody aware of some standard way of doing this properly?
Below are some thought which might or might not work:
Idea 1:
Let's say each process catches the signal in a signal handler, pushes it onto an "unhandled signals stack" (USS), and simply returns from the signal handler routine. We then have certain termination points in our application, especially before and after IO operations, which handle all signals in the USS.
If there is a SIGUSR1 in USS for example, each process would then exit at a termination point.
This idea has the problem that there could still be deadlocks: process 1 catches a signal just before a termination point, while process 2 has already passed this point and is now starting parallel IO. Process 1 would exit, which results in a deadlock in process 2 (waiting for IO with process 1, which has exited)...
Idea 2:
Only the master process 0 catches the signal in the signal handler and then sends a broadcast message, "all processes exit!", at a specific point in the application. All processes receive the broadcast and throw an exception which is caught in main, and MPI_FINALIZE is called.
This way the exit happens safely, but at the cost of having to continuously receive broadcast messages to see if we should exit or not.
Thanks a lot!
If your goal is to stop all processes at the same point, then there is no way around always synchronizing at the possible termination points. That is, a collective call at the termination points is required.
Of course, you can try to avoid an extra broadcast by using the synchronization of another collective call to ensure proper termination, or piggyback the termination information on an existing broadcast, but I don't think that's worth it. After all, you only need to synchronize before I/O and at least once per ten minutes. At such a frequency, even a broadcast is not a performance problem.
Using signals in your MPI application in general is not safe. Some implementations may support it and others may not.
For instance, in MPICH, SIGUSR1 is used by the process manager for internal notification of abnormal failures.
http://lists.mpich.org/pipermail/discuss/2014-October/003242.html
Open MPI, on the other hand, will forward SIGUSR1 and SIGUSR2 from mpiexec to the other processes.
http://www.open-mpi.org/doc/v1.6/man1/mpirun.1.php#sect14
Other implementations will differ. So before you go too far down this route, make sure that the implementation you're using can deal with it.

Catch process kills in c++ under Unix

Is it possible with a C++ program to monitor which processes get killed (either by the user or by the OS), or whether a process terminates for some other reason that is not a segmentation fault or illegal operation, and perform some actions afterwards?
Short answer, yes it's possible.
Long answer:
You will need to implement signal handlers for the different signals that may kill a process. You can't necessarily catch EVERY type of signal (in particular, SIGKILL is not possible to catch since that would potentially make a process unkillable).
Use the sigaction function call to set up your signal handlers.
There is a decent list of which signals do what here (about 1/3 down from the top):
http://pubs.opengroup.org/onlinepubs/7908799/xsh/signal.h.html
Edit: Sorry, thought you meant within the process, not from outside of the process. If you "own" the process, you can use ptrace and its PTRACE_GETSIGINFO request to find out what the signal was.
To generally "find processes killed" would be quite difficult - or at least to tell the difference between processes just exiting on their own, as opposed to those that exit because they are killed for some other reason.
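One case where the distinction is easy to make is when the monitored process is a child of the monitoring program: the status reported by waitpid() says whether the child exited on its own or was terminated by a signal. The sketch below assumes exactly that parent/child relationship, which the question does not guarantee. Both helper names are invented for the example.

```cpp
#include <csignal>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Turns a waitpid() status into a short human-readable description.
std::string describe_status(int status) {
    if (WIFEXITED(status))
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return "killed by signal " + std::to_string(WTERMSIG(status));
    return "stopped or continued";
}

// Demo helper (hypothetical): forks a child that raises `sig`, or exits
// normally when sig == 0, then waits for it.
// Returns the raw wait status, or -1 if fork() or waitpid() failed.
int spawn_and_wait(int sig) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        if (sig != 0) raise(sig);
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return status;
}
```

For example, describe_status(spawn_and_wait(SIGKILL)) reports a signal death, while describe_status(spawn_and_wait(0)) reports a normal exit with code 0. For processes that are not children, this approach does not apply.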

Catching signals such as SIGSEGV and SIGFPE in multithreaded program

I am trying to write a multithreaded logging system for a program running on linux.
Calls to the logging system in the main program threads push a data structure containing the data to be logged into a FIFO queue. A dedicated thread picks the data off the queue and outputs it, while the program's main thread continues with its task.
If the main program causes SIGSEGV or other signals to be raised I need to make sure that the queue is empty before terminating.
My plan is to block the signals using pthread_sigmask http://man7.org/linux/man-pages/man3/pthread_sigmask.3.html for all but one thread, but reading the list of signals on http://man7.org/linux/man-pages/man7/signal.7.html I noticed:
"A signal may be generated (and thus pending) for a process as a whole (e.g., when sent using kill(2)) or for a specific thread (e.g., certain signals, such as SIGSEGV and SIGFPE, generated as a consequence of executing a specific machine-language instruction are thread directed, as are signals targeted at a specific thread using pthread_kill(3))."
If I block SIGSEGV on all threads but a thread dedicated to catching signals, will it then catch a SIGSEGV raised by a different thread?
I found the question Signal handling with multiple threads in Linux, but I am clueless as to which signals are thread specific and how to catch them.
I agree with the comments: in practice catching and handling SIGSEGV is often a bad thing.
And SIGSEGV is delivered to a specific thread (see this), the one running the machine instruction which accessed an illegal address.
So you cannot run a thread dedicated to catching SIGSEGV in other threads. And you probably could not easily use signalfd(2) for SIGSEGV...
Catching (and returning normally from its signal handler) SIGSEGV is a complex and processor specific thing (it cannot be "portable C code"). You need to inspect and alter the machine state in the handler, that is either modify the address space (by calling mmap(2) etc...) or modify the register state of the current thread. So use sigaction(2) with SA_SIGINFO and change the machine specific state pointed by the third argument (of type ucontext_t*) of the signal handler. Then dive into the processor specific uc_mcontext field of it. Have fun changing individual registers, etc... If you don't alter the machine state of the faulty thread, execution is resumed (after returning from your SIGSEGV handler) in the same situation as before, and another SIGSEGV signal is immediately sent.... Or simply, don't return normally from a SIGSEGV handler (e.g. use siglongjmp(3) or abort(3) or _exit(2) ...).
Even if you happen to do all this, it is rumored that Linux kernels are not extremely efficient on such executions. So it is rumored that trying to mimic Hurd/Mach external pagers this way on Linux is not very efficient. See this answer...
Of course signal handlers should call only async-signal-safe functions (see signal(7) for more). In particular, you cannot in principle call fprintf from them (and you might not be able to use your logging system reliably, although it could work in most but not all cases).
What I said on SIGSEGV also holds for SIGBUS and SIGFPE (and other thread-directed synchronous signals, if they exist).
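To make the "don't return normally" option concrete, here is a minimal sketch that installs a SIGSEGV handler with sigaction(2) and SA_SIGINFO and leaves the handler with siglongjmp(3). It is an illustration under strong assumptions, not a recommendation: the names are invented, the jump buffer is a single global (a multithreaded program would need one per thread), and siglongjmp skips C++ destructors, so the guarded code must not rely on them. After a real fault the program state may also be corrupted.

```cpp
#include <csetjmp>
#include <csignal>
#include <setjmp.h>   // sigsetjmp(), siglongjmp(), POSIX only
#include <signal.h>   // sigaction(), siginfo_t

static sigjmp_buf g_recover_point;
static volatile sig_atomic_t g_last_signal = 0;

// Runs on the thread that triggered the signal. It does not return normally:
// it records the signal number and jumps back to the recovery point.
extern "C" void on_sigsegv(int sig, siginfo_t*, void*) {
    g_last_signal = sig;
    siglongjmp(g_recover_point, 1);
}

// Installs the handler. Returns false if sigaction() fails.
bool install_sigsegv_handler() {
    struct sigaction sa {};
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = on_sigsegv;
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}

// Runs `risky` and reports whether it finished without raising SIGSEGV.
// Passing 1 to sigsetjmp saves the signal mask, so SIGSEGV is unblocked
// again after the jump out of the handler.
bool run_guarded(void (*risky)()) {
    if (sigsetjmp(g_recover_point, 1) == 0) {
        risky();
        return true;
    }
    return false;
}
```

For the logging question this only shows where control ends up after the fault: on the faulting thread itself. Draining the queue from there would still be limited to async-signal-safe calls.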

How do I guarantee fast shutdown of my win32 app?

I've got a C++ Win32 application that has a number of threads that might be busy doing IO (HTTP calls, etc) when the user wants to shutdown the application. Currently, I play nicely and wait for all the threads to end before returning from main. Sometimes, this takes longer than I would like and indeed, it seems kind of pointless to make the user wait when I could just exit. However, if I just go ahead and return from main, I'm likely to get crashes as destructors start getting called while there are still threads using the objects.
So, recognizing that in an ideal, platonic world of virtue, the best thing to do would be to wait for all the threads to exit and then shutdown cleanly, what is the next best real world solution? Simply making the threads exit faster may not be an option. The goal is to get the process dead as quickly as possible so that, for example, a new version can be installed over it. The only disk IO I'm doing is in a transactional db, so I'm not terribly concerned about pulling the plug on that.
Use overlapped IO so that you're always in control of the threads that are dealing with your I/O and can always stop them at any point; you either have them waiting on an IOCP and can post an application level shutdown code to it, OR you can wait on the event in your OVERLAPPED structure AND wait on your 'all threads please shutdown now' event as well.
In summary, avoid blocking calls that you can't cancel.
If you can't and you're stuck in a blocking socket call doing IO then you could always just close the socket from the thread that has decided that it's time to shut down and have the thread that's doing IO always check the 'shutdown now' event before retrying...
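The "wait on the I/O event AND on the shutdown event" idea can be shown without any Win32 calls. The sketch below is a portable analogue built on std::condition_variable, not the overlapped-I/O code itself. In a real Win32 program the same shape would typically be a WaitForMultipleObjects call on the OVERLAPPED event and the shutdown event. The class and enum names are invented for this example.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

enum class WakeReason { Work, Shutdown, Timeout };

// A wait that can always be cancelled: it wakes up for new work, for a
// shutdown request, or when the timeout expires, whichever comes first.
class ShutdownAwareWaiter {
public:
    // Called by whoever produces work (for example an I/O completion).
    void notify_work() {
        { std::lock_guard<std::mutex> lock(m_); work_ready_ = true; }
        cv_.notify_all();
    }

    // Called once by the thread that decides it is time to shut down.
    void request_shutdown() {
        { std::lock_guard<std::mutex> lock(m_); shutdown_ = true; }
        cv_.notify_all();
    }

    // Called by worker threads instead of an uncancellable blocking call.
    // Shutdown takes priority over pending work.
    WakeReason wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait_for(lock, timeout, [this] { return shutdown_ || work_ready_; });
        if (shutdown_) return WakeReason::Shutdown;
        if (work_ready_) { work_ready_ = false; return WakeReason::Work; }
        return WakeReason::Timeout;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool work_ready_ = false;
    bool shutdown_ = false;
};
```

A worker loop would call wait() and leave as soon as it sees WakeReason::Shutdown, so no thread is ever parked in a call that the shutting-down thread cannot interrupt.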
I use an exception-based technique that's worked pretty well for me in a number of Win32 applications.
To terminate a thread, I use QueueUserAPC() to queue a call to a function which throws an exception. However, the exception that's thrown isn't derived from the type "Exception", so will only be caught by my thread's wrapper procedure.
The advantages of this are as follows:
No special code needed in your thread to make it 'stoppable' - as soon as it enters an alertable wait state, it will run the APC function.
All destructors get invoked as the exception runs up the stack, so your thread exits cleanly.
The things you need to watch for:
Anything doing catch (...) will eat your exception. User code should always use catch(const Exception &e) or similar!
Make sure your I/O and delays are done in an "alertable" way. For example, this means calling SleepEx(N, TRUE) instead of Sleep(N).
CPU-bound threads need to call SleepEx(0, TRUE) occasionally to check for termination.
You can also 'protect' areas of your code to prevent task termination during critical sections.
Best way: Do your work while the app is running, and do nothing (or as close to) at shutdown (works for startup too). If you stick to that pattern, then you can tear down the threads immediately (rather than "being nice" about it) when the shutdown request comes without worrying about work that still needs to be done.
In your specific situation, you'd probably need to wait for IO to finish (writes, at least) if you're doing local work there. HTTP requests and such you can probably just abandon/close outright (again, unless you're writing something). But if it is the case that you're writing during this shutdown and waiting on that, then you may want to notify the user of that, rather than letting your process look hung while you're wrapping things up.
I'd recommend having your GUI and work be done on different threads. When a user requests a shutdown, dismiss the GUI immediately giving the appearance that the application has closed. Allow the worker threads to close gracefully in the background.
If you want to pull the plug messily, exit(0) will do the trick.
I once had a similar problem, albeit in Visual Basic 6: threads from an app would connect to different servers, download some data, perform some operations looping upon that data, and store on a centralized server the result.
Then, new requirement was that threads should be stoppable from main form. I accomplished this in an easy though dirty fashion, by having the threads stop after N loops (equivalent roughly to half a second) to try to open a mutex with a specific name. Upon success, they immediately stopped whatever they were doing and quit, continued otherwise.
This mutex was created only by the main form, once it was created all the threads would soon close themselves. The disadvantage was that user needed to manually specify it wanted to run the threads again - another button to "Enable threads to run" accomplished this by releasing the mutex :D
This trick is guaranteed to work because mutex operations are atomic. The problem is that you're never sure a thread really closed - a failure in the logic of handling the "openMutex succeeded" case could mean it never ends. You also don't know when/if all the threads have closed (assuming your code is right, this would take roughly the same time it takes for the loops to stop and "listen").
With VB's "apartment" model of multi-threading it's somewhat difficult to send info back and forth between the threads and the main app; it's much easier to "fire and forget" or to send it only from the main app to the thread. Thus the need for these kinds of long-cuts. Using C++ you're free to use your own multi-threading model, so these constraints might not apply to you.
Whatever you do, do NOT use TerminateThread, especially on anything that could be in OS HTTP calls. You could potentially break IE until reboot.
Change all of your IO to an asynchronous or non-blocking model so that they can watch for termination events.
If you need to shutdown suddenly: Just call ExitProcess - which is what is going to be called just as soon as you return from WinMain anyway. Windows itself creates many worker threads that have no way to be cleaned up - they are terminated by process shutdown.
If you have any threads that are performing writes of some kind - obviously those need a chance to close their resources. But anything else - ignore the bounds checker warnings and just pull the rug from under their feet.
You can call TerminateProcess - this will stop the process immediately, without notifying anyone and without waiting for anything.
*NULL = 0 is the fastest way. if you don't want to crash, call exit() or its win32 equivalent.
Instruct the user to unplug the computer. Short of that, you have to abandon your asynchronous activities to the wind. Or is that HWIND? I can never remember in C++. Of course, you could take the middle road and quickly note in a text file or reg key what action was abandoned so that the next time the program runs it can take up that action again automatically or ask the user if they want to do so. Depending on what data you lose when you abandon the async action, you may not be able to do that. If you're interacting with the user, you may want to consider a dialog or some UI interaction that explains why it's taking so long.
Personally, I prefer the instruction to the user to just unplug the computer. :)