Preface
I have a multi-threaded application built on Boost.Asio. There is only one boost::asio::io_service for the whole application, and all work is done inside it by a group of threads. Sometimes the application needs to spawn child processes using fork and exec. When a child terminates, I need to call waitpid on it to check the exit code and to reap the zombie. I used the recently added boost::asio::signal_set but ran into a problem on ancient systems with linux-2.4.* kernels (which are unfortunately still used by some customers). Under older Linux kernels threads are actually special cases of processes, so if a child was spawned by one thread, another thread is unable to wait for it using the waitpid family of system calls. Asio's signal_set posts the signal handler to the io_service, and any thread running this service can run the handler, which is inappropriate for my case. So I decided to handle signals the good old signal/sigaction way: all threads have the same handler, which calls waitpid. That leads to another problem:
The problem
When the signal is caught by the handler and the child process has been successfully waited for, how can I "post" this to my io_service from the handler? It seems to me that the obvious io_service::post() method is not an option, because it can deadlock on the io_service's internal mutexes if the signal arrives at the wrong time. The only thing that came to mind is to use a pipe or socketpair: write notifications to one end and async_wait on the other end, as is sometimes done to handle signals in poll() event loops.
Are there any better solutions?
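For reference, the pipe idea from the question could be sketched in plain C roughly as follows. This is only a sketch under assumptions: the names child_exit, child_notifier_init and spawn_and_collect are hypothetical, and the blocking read() stands in for whatever asynchronous read the event loop would perform on the read end of the pipe. The handler restricts itself to waitpid() and write(), both of which are async-signal-safe.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical record passed from the handler to the event loop. */
struct child_exit { pid_t pid; int status; };

/* [0] is read by the event loop, [1] is written by the handler. */
static int notify_fd[2];

/* SIGCHLD handler: uses only async-signal-safe calls. */
static void on_sigchld(int signo)
{
    int saved_errno = errno;
    struct child_exit ev;
    (void)signo;
    /* Reap every child that has exited so far. */
    while ((ev.pid = waitpid(-1, &ev.status, WNOHANG)) > 0) {
        /* A record smaller than PIPE_BUF is written atomically. */
        ssize_t n = write(notify_fd[1], &ev, sizeof ev);
        (void)n;   /* nothing safe can be done about a failure here */
    }
    errno = saved_errno;
}

/* Create the pipe and install the handler; returns the read end or -1. */
int child_notifier_init(void)
{
    struct sigaction sa;
    if (pipe(notify_fd) != 0) return -1;
    /* Non-blocking write end, so the handler never stalls on a full pipe. */
    fcntl(notify_fd[1], F_SETFL, O_NONBLOCK);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) != 0) return -1;
    return notify_fd[0];
}

/* Demo: fork a child that exits with `code` and read the code back. */
int spawn_and_collect(int code)
{
    struct child_exit ev;
    int fd = child_notifier_init();
    pid_t pid;
    if (fd < 0) return -1;
    pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(code);
    /* An event loop would wait on fd asynchronously instead of blocking. */
    for (;;) {
        ssize_t n = read(fd, &ev, sizeof ev);
        if (n == (ssize_t)sizeof ev && ev.pid == pid) break;
        if (n < 0 && errno != EINTR) return -1;
    }
    return WIFEXITED(ev.status) ? WEXITSTATUS(ev.status) : -1;
}
```

Calling spawn_and_collect(7) forks a child that exits with status 7 and returns that status once the handler has pushed it through the pipe. Because the handler itself calls waitpid(), the child is reaped in whichever thread the signal interrupts.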
I've not dealt with boost::asio but I have solved a similar problem. I believe my solution works for both LinuxThreads and the newer NPTL threads.
I'm assuming that the reason you want to "post" signals to your io_service is to interrupt a system call so the thread/program will exit cleanly. Is this correct? If not, maybe you can better describe your end goal.
I tried a lot of different solutions including some which required detecting which type of threads were being used. The thing that finally helped me solve this was the section titled Interruption of System Calls and Library Functions by Signal Handlers of man signal(7).
The key is to use sigaction() in your signal handling thread without SA_RESTART to create handlers for all the signals you want to catch, unmask these signals using pthread_sigmask(SIG_UNBLOCK, sig_set, 0) in the signal handling thread, and mask the same signal set in all other threads. The handler does not have to do anything. Just having a handler changes the behavior, and not setting SA_RESTART allows interruptible system calls (like write()) to be interrupted. If you use sigwait() instead, system calls in other threads are not interrupted.
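The effect of leaving out SA_RESTART can be seen in a small single-threaded sketch. The function name read_is_interrupted is hypothetical, and SIGALRM driven by setitimer() is only an assumption chosen to make the demo self-triggering; it stands in for the SIGINT/SIGTERM signals discussed here.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* The handler deliberately does nothing; its presence is what matters. */
static void noop_handler(int signo) { (void)signo; }

/* Returns 1 if a blocking read() failed with EINTR, 0 otherwise. */
int read_is_interrupted(void)
{
    struct sigaction sa, old_sa;
    struct itimerval timer, stop;
    int fds[2];
    char byte;
    ssize_t n;
    int interrupted;

    if (pipe(fds) != 0) return 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                 /* SA_RESTART is intentionally NOT set */
    sigaction(SIGALRM, &sa, &old_sa);

    /* Fire SIGALRM every 50 ms so that read() is hit while it blocks. */
    memset(&timer, 0, sizeof timer);
    timer.it_value.tv_usec = 50000;
    timer.it_interval.tv_usec = 50000;
    setitimer(ITIMER_REAL, &timer, NULL);

    n = read(fds[0], &byte, 1);      /* nothing is ever written to the pipe */
    interrupted = (n == -1 && errno == EINTR);

    /* Stop the timer and restore the previous disposition. */
    memset(&stop, 0, sizeof stop);
    setitimer(ITIMER_REAL, &stop, NULL);
    sigaction(SIGALRM, &old_sa, NULL);
    close(fds[0]);
    close(fds[1]);
    return interrupted;
}
```

With sa_flags set to SA_RESTART instead, the same read() would be restarted transparently and would never return.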
To mask the signals easily in all other threads, I start the signal handling thread, then mask all the signals I want to handle in the main thread before starting any other threads. When the other threads are started, they copy the main thread's signal mask.
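The mask inheritance this relies on can be checked with a short sketch; the function names are hypothetical, and SIGTERM and SIGINT are just example signals.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Thread body: report whether SIGTERM and SIGINT are blocked here. */
static void *check_mask(void *arg)
{
    sigset_t current;
    int *blocked = arg;
    /* A NULL new set means "query only": the mask is not changed. */
    pthread_sigmask(SIG_SETMASK, NULL, &current);
    *blocked = sigismember(&current, SIGTERM) == 1 &&
               sigismember(&current, SIGINT) == 1;
    return NULL;
}

/* Returns 1 if a thread created after masking starts with the same mask. */
int new_thread_inherits_mask(void)
{
    sigset_t to_block, previous;
    pthread_t worker;
    int blocked = 0;

    sigemptyset(&to_block);
    sigaddset(&to_block, SIGTERM);
    sigaddset(&to_block, SIGINT);

    /* Mask in the creating thread first ... */
    pthread_sigmask(SIG_BLOCK, &to_block, &previous);

    /* ... so every thread started from here on inherits that mask. */
    if (pthread_create(&worker, NULL, check_mask, &blocked) == 0)
        pthread_join(worker, NULL);

    /* Restore the caller's original mask. */
    pthread_sigmask(SIG_SETMASK, &previous, NULL);
    return blocked;
}
```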
The point is if you do this then you may not need to post signals to your *io_service* because you can just check your system calls for interrupt return codes. I don't know how this works with boost::asio though.
So the end result of all this is that I can catch the signals I want, like SIGINT, SIGTERM, SIGHUP and SIGQUIT, in order to perform a clean shutdown, while my other threads still get their system calls interrupted and can also exit cleanly. This works without any communication between the signal thread and the rest of the system and without doing anything dangerous in the signal handler, and a single implementation works on both LinuxThreads and NPTL.
Maybe that wasn't the answer you were looking for but I hope it helps.
NOTE: If you want to figure out whether the system is running LinuxThreads, you can do this by spawning a thread and then comparing its PID to the main thread's PID. If they differ, it's LinuxThreads. You can then choose the best solution for the thread type.
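A minimal sketch of that check, with a hypothetical function name, might look like this. On NPTL, or any thread library where all threads share one process ID, it returns 0.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Thread body: record the PID as seen from inside the new thread. */
static void *report_pid(void *arg)
{
    *(pid_t *)arg = getpid();
    return NULL;
}

/* Returns 1 if a new thread sees a different PID (LinuxThreads-like),
   0 if all threads share one PID, and -1 if no thread could be created. */
int threads_have_own_pid(void)
{
    pthread_t t;
    pid_t thread_pid = 0;
    if (pthread_create(&t, NULL, report_pid, &thread_pid) != 0)
        return -1;
    pthread_join(t, NULL);
    return thread_pid != getpid();
}
```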
If you are already polling your IO, another very simple solution is to use a boolean flag to signal the other threads. A boolean is always either zero or not, so there is no possibility of a partial update. You can then set this flag, which the other threads read, without any mutexes. Strictly speaking, an unsynchronized flag is a data race under the C11/C++11 memory models and tools like valgrind won't like it, but in practice it works.
If you want to be even more correct you can use gcc's atomics, but these are compiler specific; C11's <stdatomic.h> and C++11's std::atomic are the portable equivalents.
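As a sketch of the flag approach using the standard C11 atomics rather than compiler builtins (the names stop_requested and run_until_flag_set are hypothetical, and the 1 ms sleep stands in for one round of polled IO):

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Shared stop flag; static storage means it starts out false. */
static atomic_bool stop_requested;

static void sleep_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Worker: polls the flag between units of work. */
static void *poll_loop(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_requested))
        sleep_ms(1);              /* stand-in for one round of polled IO */
    return NULL;
}

/* Starts a worker, raises the flag, and returns 1 once the worker exits. */
int run_until_flag_set(void)
{
    pthread_t worker;
    atomic_store(&stop_requested, false);
    if (pthread_create(&worker, NULL, poll_loop, NULL) != 0)
        return 0;
    sleep_ms(20);
    atomic_store(&stop_requested, true);   /* no mutex needed */
    pthread_join(worker, NULL);            /* returns only if the flag was seen */
    return 1;
}
```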
Related
Thread A executes a blocking call in a loop, until Thread B signals it to continue with the rest of the execution.
I tried the classic approach of a signal handler that changes a condition variable, so that I can test the condition before the next call starts.
The problem arises when the signal arrives after the condition check but before the blocking call.
Short pseudo code example of the problem:
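while (!isInterrupted) {
    /* the signal arrives here, after the check: raise(SIGINT) */
    block();   /* blocks even though isInterrupted is now set */
}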
Assuming I cannot access or change the implementation of the blocking code and the blocking call doesn't provide an internal timeout functionality, which the signal handler could set to the minimal value, what would be the correct way for C and C++ to handle this?
Signals are used as the blocking call may only be woken up by receiving a SIGINT.
Thank you in advance for your help.
If you can modify the calling assemblies of your libc, like I have with https://github.com/pskocik/musl, then you can eliminate this time-of-check to time-of-use problem by having your signal handler call a special function (provided in the modified libc) that will break the system call if the signal is received while your code is in the function call wrapper after the check but not yet in kernel mode (in kernel mode, blocking calls are naturally broken by signal deliveries).
Without access to your libc (/ you're building purely on top of POSIX), I believe the best you can do is a protocol-based solution:
set up a mechanism by which signal receivers acknowledge signal receipts
have the signal-sending code repeat (preferably with some sleeping) until receipt is acknowledged
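The two steps above could be sketched roughly as follows. All names are hypothetical, a 10 second nanosleep() stands in for block(), and SIGUSR1 is used instead of SIGINT purely so the demo does not disturb the terminal; none of these choices come from the original answer.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

static volatile sig_atomic_t is_interrupted;   /* set by the handler */
static atomic_int receipt_acknowledged;        /* set by thread A */

static void on_wakeup(int signo) { (void)signo; is_interrupted = 1; }

static void sleep_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);    /* returns early with EINTR if a handler runs */
}

/* Thread A: the loop from the question; sleep_ms() stands in for block(). */
static void *thread_a(void *arg)
{
    (void)arg;
    while (!is_interrupted)
        sleep_ms(10000);
    atomic_store(&receipt_acknowledged, 1);    /* step 1: acknowledge */
    return NULL;
}

/* Thread B's side: returns the number of signals sent, or -1 on error. */
int interrupt_with_retry(void)
{
    struct sigaction sa;
    pthread_t a;
    int attempts = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_wakeup;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    is_interrupted = 0;
    atomic_store(&receipt_acknowledged, 0);
    if (pthread_create(&a, NULL, thread_a, NULL) != 0) return -1;

    /* step 2: keep signalling until A confirms that it left the loop */
    while (!atomic_load(&receipt_acknowledged)) {
        pthread_kill(a, SIGUSR1);
        ++attempts;
        sleep_ms(1);
    }
    pthread_join(a, NULL);
    return attempts;
}
```

If a signal lands in the window between the check and the blocking call, thread A goes to sleep anyway, but the next repeated signal interrupts the sleep, so the loop still terminates.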
That might not be the easiest to set up though (essentially, you'd be fighting POSIX to a degree). If you can afford it, doing the blocking operation in a new thread should be simpler, and pthread_cancel, unlike pthread_kill, should be able to reliably elicit a response (in this case, complete thread cancellation) in the target.
The downside of using a separate thread is it will be a bit more resource hungry.
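A minimal sketch of the separate-thread variant, assuming the blocking call is a cancellation point (pause() is used here as a stand-in; the function names are hypothetical):

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Worker: blocks forever; pause() is a POSIX cancellation point. */
static void *blocking_worker(void *arg)
{
    (void)arg;
    for (;;)
        pause();
    return NULL;
}

/* Returns 1 if the blocked worker was cancelled, 0 otherwise. */
int cancel_blocked_worker(void)
{
    pthread_t worker;
    void *result = NULL;
    if (pthread_create(&worker, NULL, blocking_worker, NULL) != 0)
        return 0;
    /* With the default deferred cancel type, the request stays pending
       until the worker is in (or next reaches) a cancellation point. */
    if (pthread_cancel(worker) != 0)
        return 0;
    pthread_join(worker, &result);
    return result == PTHREAD_CANCELED;
}
```

Unlike a signal, the cancellation request is not lost if it arrives just before the blocking call starts.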
Stop using blocking calls, then switch to actual synchronisation primitives.
Please look at mutexes and condition variables for this.
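As a sketch of what that could look like (hypothetical names; it assumes the blocking call can be replaced by a wait on a condition variable, which the question says may not be possible):

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t may_continue = PTHREAD_COND_INITIALIZER;
static bool go_ahead = false;

/* Thread A: waits until thread B allows it to continue. */
static void *waiter(void *arg)
{
    int *done = arg;
    pthread_mutex_lock(&lock);
    while (!go_ahead)                      /* guards against spurious wakeups */
        pthread_cond_wait(&may_continue, &lock);
    pthread_mutex_unlock(&lock);
    *done = 1;
    return NULL;
}

/* Thread B's side: returns 1 once thread A has continued. */
int wait_until_signalled(void)
{
    pthread_t a;
    int done = 0;

    pthread_mutex_lock(&lock);
    go_ahead = false;
    pthread_mutex_unlock(&lock);

    if (pthread_create(&a, NULL, waiter, &done) != 0) return 0;

    pthread_mutex_lock(&lock);
    go_ahead = true;                       /* change the condition ... */
    pthread_cond_signal(&may_continue);    /* ... and wake the waiter */
    pthread_mutex_unlock(&lock);

    pthread_join(a, NULL);
    return done;
}
```

The check of go_ahead and the start of the wait both happen under the mutex, so there is no window in which the wake-up can be missed.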
I have a multi-threaded program which needs to handle the Linux signal SIGVTALRM sent by a setitimer() every 25ms. However, I am confused. I do not know why I need to use pthread_sigmask() to block and unblock the signal. Won't the signal be handled anyway when it is sent, regardless of which thread is processing at the given time instant?
Won't the signal be handled anyway when it is sent, regardless of which thread is processing at the given time instant?
In a single-threaded program, yes. But in a multi-threaded program, POSIX doesn't specify which thread will receive the SIGVTALRM signal you send. Hence, pthread_sigmask() is typically used to block the signals of interest in all threads and to fetch them with sigwait() in a dedicated thread. This is probably the reason why you are using, or were asked to use, pthread_sigmask().
The linked POSIX manual also provides a simple example showing how this can be done.
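A sketch along the same lines, not copied from the manual: the names are hypothetical, and kill(getpid(), SIGVTALRM) stands in for one setitimer() tick so the demo is deterministic.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Dedicated thread: fetches SIGVTALRM synchronously with sigwait(). */
static void *signal_thread(void *arg)
{
    sigset_t wait_set;
    int signo = 0;
    sigemptyset(&wait_set);
    sigaddset(&wait_set, SIGVTALRM);
    if (sigwait(&wait_set, &signo) == 0)
        *(int *)arg = signo;
    return NULL;
}

/* Returns the signal number fetched by the dedicated thread, or 0. */
int handle_in_dedicated_thread(void)
{
    sigset_t block_set, previous;
    pthread_t handler_thread;
    int received = 0;

    /* Block the signal before creating threads so all of them inherit it. */
    sigemptyset(&block_set);
    sigaddset(&block_set, SIGVTALRM);
    pthread_sigmask(SIG_BLOCK, &block_set, &previous);

    if (pthread_create(&handler_thread, NULL, signal_thread, &received) == 0) {
        kill(getpid(), SIGVTALRM);         /* stand-in for a timer tick */
        pthread_join(handler_thread, NULL);
    }

    pthread_sigmask(SIG_SETMASK, &previous, NULL);
    return received;
}
```

Because every thread blocks the signal, it stays pending until the dedicated thread picks it up, even if it is sent before that thread reaches sigwait().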
This is a follow-up to this question and I have looked at the related questions.
I am still attempting to do some cleanup when SIGTERM is received, and then achieve the effect of TERM, which is the default behavior when no thread in a process is waiting for a signal and no signal handler is defined.
In the earlier question, I had assumed that signals had deterministic behavior in a multithreaded application, but after some non-deterministic results and some research, I realized that this is not a safe assumption.
My application is multithreaded and using a normal signal handler via sigaction or signal has non-deterministic results because the arbitrary choice of which thread is actually interrupted to run the signal handler matters to the signal handler's operation (it is cleaning up some threads which might deadlock if the wrong thread is interrupted).
Therefore, I switched to doing synchronous signal handling. In particular, I am blocking SIGTERM using pthread_sigmask in the starting thread, and then calling sigwait() in the thread that will actually perform the cleanup. However, after sigwait() returns in the thread, and I finish my cleanup, I then try the following:
kill(getpid(), SIGTERM);
However, this signal is obviously ignored, because all the other active threads are blocking SIGTERM. Thus, I need to unblock signals in all other threads, before returning from the function that my cleanup thread is running. Is there a function call that can be used to set pthread_sigmask on other threads?
There is no call that can set the signal mask of another thread. The received signal is just a trigger, and there is no point in resending it to yourself. Try using pthreads condition variables instead.
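A different route, not part of the answer above and offered only as a sketch under assumptions, is to unblock SIGTERM in the cleanup thread itself before re-sending it: once one thread accepts the signal and no handler is installed, the default action applies. All names are hypothetical, and the pipes exist only so the demo can observe the child process from outside.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd;   /* hypothetical: lets the demo see that cleanup ran */

static void *cleanup_thread(void *arg)
{
    sigset_t term_set;
    int signo;
    ssize_t n;
    (void)arg;
    sigemptyset(&term_set);
    sigaddset(&term_set, SIGTERM);
    sigwait(&term_set, &signo);          /* SIGTERM is blocked in all threads */

    n = write(report_fd, "C", 1);        /* application cleanup would go here */
    (void)n;

    /* Unblock SIGTERM in THIS thread only, then re-send it. */
    pthread_sigmask(SIG_UNBLOCK, &term_set, NULL);
    kill(getpid(), SIGTERM);
    return NULL;                         /* not reached */
}

/* Runs the scenario in a child process; returns 1 if cleanup ran and the
   child was then terminated by SIGTERM. */
int cleanup_then_terminate(void)
{
    int ready[2], report[2], status = 0;
    char c = 0;
    pid_t child;

    if (pipe(ready) != 0 || pipe(report) != 0) return 0;
    child = fork();
    if (child < 0) return 0;
    if (child == 0) {
        sigset_t term_set;
        pthread_t t;
        signal(SIGTERM, SIG_DFL);
        report_fd = report[1];
        sigemptyset(&term_set);
        sigaddset(&term_set, SIGTERM);
        pthread_sigmask(SIG_BLOCK, &term_set, NULL);   /* inherited by t */
        if (pthread_create(&t, NULL, cleanup_thread, NULL) != 0) _exit(1);
        if (write(ready[1], "R", 1) != 1) _exit(1);
        pthread_join(t, NULL);
        _exit(1);                                      /* not reached */
    }
    close(ready[1]);
    close(report[1]);
    if (read(ready[0], &c, 1) != 1) return 0;   /* child has blocked SIGTERM */
    kill(child, SIGTERM);
    if (read(report[0], &c, 1) != 1 || c != 'C') return 0;
    waitpid(child, &status, 0);
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM;
}
```

The parent then sees the child as killed by SIGTERM rather than as having exited normally, which is the effect the question asks for.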
I want to profile my daemon program, that pauses the main thread:
sigset_t signal_mask;
sigemptyset(&signal_mask);
sigaddset(&signal_mask, SIGTERM);
sigaddset(&signal_mask, SIGINT);
int sig;
sigwait(&signal_mask, &sig);
All other threads simply block all signals.
As far as I know the profiler uses SIGPROF signal for its operations. If I start profiling with such a code, the output .prof file is empty:
env CPUPROFILE=daemon.prof ./daemon
How should I properly handle signals in the main thread and the other threads to enable profiling? Or maybe the issue is somewhere else?
All other threads simply block all signals.
You simply need to unblock SIGPROF in all threads (or in those that you want to profile). We were just solving exactly the same problem in a multi-threaded daemon.
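In a worker thread that otherwise blocks everything, this could be sketched as follows (the function names are hypothetical):

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Block every signal except SIGPROF in the calling thread.
   Returns 0 on success; the old mask is stored in *previous. */
int block_all_but_sigprof(sigset_t *previous)
{
    sigset_t mask;
    sigfillset(&mask);
    sigdelset(&mask, SIGPROF);   /* keep the profiler's signal deliverable */
    return pthread_sigmask(SIG_SETMASK, &mask, previous);
}

/* Returns 1 if SIGPROF is deliverable while SIGTERM stays blocked. */
int sigprof_stays_unblocked(void)
{
    sigset_t previous, current;
    int ok;
    if (block_all_but_sigprof(&previous) != 0) return 0;
    /* A NULL new set means "query only": the mask is not changed. */
    pthread_sigmask(SIG_SETMASK, NULL, &current);
    ok = sigismember(&current, SIGPROF) == 0 &&
         sigismember(&current, SIGTERM) == 1;
    pthread_sigmask(SIG_SETMASK, &previous, NULL);   /* restore */
    return ok;
}
```

Each thread that should show up in the profile would call block_all_but_sigprof() instead of blocking the full set.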
I need to see more of your code, but your statement that "All other threads simply block all signals" raises...signals.
You have to remember that most system calls were created before the concept of threads existed. Signal handling is one of them. Thus, when you block a signal on ANY thread, it's likely blocked for ALL threads.
In fact, check out the signal(2) manpage:
The effects of signal() in a multithreaded process are unspecified.
Yes, this is sad, but it is the price you must pay for using a low-overhead statistical sampling profiler. And working around it is very easy: just remove SIGPROF (or SIGALRM if you are using the REAL mode) from your signal mask set and you should be fine.
And in general, unless you absolutely have to, you should not be doing process-level signal masking in anything other than the main thread, where "main" doesn't necessarily mean the thread that main() is running in, but rather the thread you consider the "boss" of all the others, for reasons you have made all too clear already. :)
You can also try using the pthread library's sigmask wrapper pthread_sigmask, but it is unclear to me how well it works in situations such as a child thread REMOVING an entry from a sigmask (pthreads inherit their parent's pthread sigmask).
What is the alternative to pthread_setcanceltype in Windows thread programming in C++?
Windows threads do not have cancellation points, so there's no system cancel type to consider.
As such, "canceling" a thread on Windows means that you, the developer, need to come up with a strategy for telling a thread to exit. If it is a GUI thread, you can post it a WM_QUIT message. If it is a non-GUI thread, then it really depends on what the thread is doing. You need to analyse the thread and see if there is a point where your code can explicitly check whether it needs to keep going or exit.
There is a pthreads-win32 implementation available if you'd rather avoid the question and get pthreads-compliant behavior on Win32.
You could use events as synchronization objects. In your thread, check the status of the event from time to time (WaitForSingleObject with a zero timeout), and if it is signaled, return from the thread's main function. To cancel the thread from outside, just set the event.