sigwait() does not work in multithreaded program - c++

I'm trying to write a multithreaded program in which one thread (the variable thread below) is responsible for handling any asynchronous signals that might be sent to this process.
The thread calls sigwait() but does not react to any signals sent to the process (such as SIGUSR1 below).
static void * signal_thread(void *arg = nullptr)
{
int sig = -1;
sigset_t sigset;
sigfillset(&sigset);
pthread_sigmask(SIG_BLOCK, &sigset, NULL);
while(1)
{
int s = sigwait(&sigset, &sig);
if(s == 0)
printf("SIG %d received!...\n", sig);
usleep(20);
}
}
int main()
{
sigset_t signalset;
pthread_t thread;
pthread_create(&thread, NULL, &signal_thread, nullptr);
sigfillset(&signalset);
pthread_sigmask(SIG_BLOCK, &signalset, NULL);
while(1)
{
raise(SIGUSR1);
usleep(20);
}
}

The problem comes down to two issues:
First, the call to raise() in main sends the signal only to the main thread, not to the whole process.
Second, std::cout should be used instead of printf in signal_thread.

raise(sig) is the equivalent of calling pthread_kill(pthread_self(), sig).
Since the main thread raise()s the signal, the SIGUSR1 will be generated for that thread and not for any other. Thus, your signal_thread will be unable to sigwait() for the USR1, which will be held pending for the thread that generated it.
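To make the difference concrete, here is a minimal sketch that sends a process-directed signal with kill(getpid(), ...) instead of raise(). The helper name run_sigwait_demo and the global g_received are made up for illustration, and the sketch assumes SIGUSR1 is blocked before the waiting thread is created so that every thread inherits the mask.

```cpp
#include <cassert>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

// Signal number seen by the waiting thread (0 until one arrives).
static int g_received = 0;

// Dedicated thread: synchronously waits for a signal from the set.
static void *signal_thread(void *arg)
{
    sigset_t *set = static_cast<sigset_t *>(arg);
    int sig = 0;
    if (sigwait(set, &sig) == 0)
        g_received = sig;
    return nullptr;
}

// Hypothetical demo helper: returns the signal number the thread received.
int run_sigwait_demo()
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);

    // Block SIGUSR1 first so the new thread inherits the blocked mask.
    int rc = pthread_sigmask(SIG_BLOCK, &set, nullptr);
    assert(rc == 0);

    pthread_t tid;
    rc = pthread_create(&tid, nullptr, signal_thread, &set);
    assert(rc == 0);

    // kill() targets the whole process, so the thread sitting in sigwait()
    // can accept the signal; raise() would target only the calling thread.
    kill(getpid(), SIGUSR1);

    pthread_join(tid, nullptr);   // after the join, g_received is safe to read
    return g_received;
}
```

If the signal is sent before the thread reaches sigwait(), it simply stays pending on the process until sigwait() picks it up, so no extra synchronisation is needed here.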

Related

Terminating a program with calling atexit functions (Linux)

Is there any way to send a signal to a process (in Linux), that results in a termination of the process after going through the "atexit-functions" (in this case: void shutdownEngines())? Using "pkill name" does not work.
#include <cstdlib>
void shutdownEngines() {/*is not executed by "pkill name"*/}
int main() {
atexit(shutdownEngines);
while(true)
doStuff();
}
Usage: I'm currently programming a robot. Every time I want to test it, I'll start the program and terminate it with "pkill name", but "shutdownEngines" isn't called and the robot keeps moving, falling off the table etc.
I know I could do "pkill name; ./shutdownEngines.sh", but this would be very bad style in my case: the numbers of the gpio pins connected to the engines are defined in a header file of the main program, and the source code of the main program is not on the robot but on my computer. Making sure that there's always a "shutdownEngines.sh" program/script with the right pins on every robot would be very complicated.
Update
The following code works perfectly:
#include <iostream>
#include <csignal>
#include <cstdlib>
void signalHandler(__attribute__((unused)) const int signum) {
exit(EXIT_FAILURE);
}
void driverEpilog() {
std::cout << "shutting down engines...";
//drv255(0,0);
}
int main() {
signal(SIGTERM, signalHandler);
atexit(driverEpilog);
while(true)
system("sleep 1");
}
From the man page of atexit:
Functions registered using atexit() (and on_exit(3)) are not called
if a process terminates abnormally because of the delivery of a
signal.
Functions registered with atexit are called when your main routine returns or when you call exit, not when the process is terminated by a signal.
When you call pkill you're sending a SIGTERM signal. Handle this signal with signal or sigaction instead (define handlers for SIGTERM, SIGINT, SIGFPE, ...) to stop the engines before exiting your program.
Example lifted from GNU C library documentation:
void
termination_handler (int signum)
{
struct temp_file *p;
for (p = temp_file_list; p; p = p->next)
unlink (p->name); // don't delete files, stop your engines instead :)
}
int
main (void)
{
…
struct sigaction new_action, old_action;
/* Set up the structure to specify the new action. */
new_action.sa_handler = termination_handler;
sigemptyset (&new_action.sa_mask);
new_action.sa_flags = 0;
sigaction (SIGINT, NULL, &old_action);
if (old_action.sa_handler != SIG_IGN)
sigaction (SIGINT, &new_action, NULL);
sigaction (SIGHUP, NULL, &old_action);
if (old_action.sa_handler != SIG_IGN)
sigaction (SIGHUP, &new_action, NULL);
sigaction (SIGTERM, NULL, &old_action);
if (old_action.sa_handler != SIG_IGN)
sigaction (SIGTERM, &new_action, NULL);
…
}
(of course, no handler can handle the SIGKILL "signal", which tells the OS to remove your process from the active process list, without further notice!)
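One way to combine the two ideas above is to let the handler only set a flag and have the main loop return normally, so the atexit functions run outside signal context. This is a sketch under assumptions: shutdown_engines_stub stands in for the real shutdownEngines(), and the self_signal_after parameter exists only so the demo can signal itself in place of "pkill name".

```cpp
#include <cassert>
#include <cstdlib>
#include <signal.h>
#include <unistd.h>

// Set by the handler; sig_atomic_t is safe to write from a signal handler.
static volatile sig_atomic_t g_stop = 0;

static void on_terminate(int) { g_stop = 1; }

// Hypothetical cleanup hook, standing in for shutdownEngines().
static void shutdown_engines_stub() { /* stop the motors here */ }

// Installs the handler and registers the cleanup hook; returns 0 on success.
int install_shutdown_hooks()
{
    struct sigaction sa = {};
    sa.sa_handler = on_terminate;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTERM, &sa, nullptr) != 0) return -1;
    if (sigaction(SIGINT, &sa, nullptr) != 0) return -1;
    return atexit(shutdown_engines_stub);
}

// Main loop: runs until a termination signal sets the flag.
// Returns the number of iterations executed.
int run_until_signalled(int self_signal_after)
{
    int iterations = 0;
    while (!g_stop) {
        ++iterations;
        if (iterations == self_signal_after)
            raise(SIGTERM);   // demo only: stands in for "pkill name"
    }
    return iterations;
}
```

A main() built on this would call install_shutdown_hooks(), run the loop, and then return 0; the normal return is what triggers the registered cleanup.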

Example of handling signals in multi-threaded process

Can anyone give me the steps or even the code for the following situation:
A process contains multiple threads, and one of these threads is responsible for catching a user-defined signal, SIGUSR1. Only this thread should be capable of receiving the signal, and upon receiving it I do some work.
In my situation the signal is sent by a kernel module to my process ID. It is then the responsibility of my process to deliver it to the correct listening thread, which has also established the signal handler, i.e. the signal handler is not in the main thread.
I already wrote some code that runs in a single-threaded process, but I have a problem running it in a multithreaded environment.
I am running my code on Ubuntu 12.04.3 with kernel version 3.8.0-29. For the creation of the process I am mixing the Boost Threads and POSIX threads APIs.
#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/mman.h>
#include <string.h>
/* Value of the last signal caught */
volatile sig_atomic_t sig_value;
static void sig_handler(const int sig_number, siginfo_t *sig_info, void *context)
{
if (sig_number == SIGSEGV)
{
error_sys("Error at address 0x%lx", (long)sig_info->si_addr);
exit(-1);
}
sig_value = sig_number;
}
int init_signal_catcher()
{
struct sigaction sig_action; /* Structure describing the action to be taken when a signal arrives. */
sigset_t oldmask; /* Signal mask before signal disposition change. */
sigset_t newmask; /* Signal mask after signal disposition change. */
sigset_t zeromask; /* Signal mask to unblock all signal while suspended. */
/* Define signal mask and install signal handlers */
memset(&sig_action, 0, sizeof(struct sigaction));
sig_action.sa_flags = SA_SIGINFO;
sig_action.sa_sigaction = sig_handler;
/* Examine and change a signal action. */
sigaction(SIGHUP, &sig_action, NULL);
sigaction(SIGINT, &sig_action, NULL);
sigaction(SIGTERM, &sig_action, NULL);
sigaction(SIGSEGV, &sig_action, NULL);
sigaction(SIGUSR1, &sig_action, NULL);
/* Block SIGHUP, SIGINT, SIGTERM, SIGSEGV and SIGUSR1 signals. */
sigemptyset(&newmask);
sigaddset(&newmask, SIGHUP);
sigaddset(&newmask, SIGINT);
sigaddset(&newmask, SIGTERM);
sigaddset(&newmask, SIGSEGV);
sigaddset(&newmask, SIGUSR1);
/* Examine and change blocked signals. */
pthread_sigmask(SIG_BLOCK, &newmask, &oldmask);
/* Initialize the empty signal set. */
sigemptyset(&zeromask);
sig_value = 0;
while ((sig_value != SIGINT) && (sig_value != SIGTERM))
{
sig_value = 0;
/*
* Go to sleep (unblocking all signals) until a signal is caught.
* On return from sleep, the signals SIGHUP, SIGINT, SIGTERM and
* SIGUSR1 are again blocked.
*/
printf("Suspending on %lu mask.", zeromask);
// Wait for a signal.
sigsuspend(&zeromask);
switch(sig_value)
{
printf("Caught Signal %d", sig_value);
case SIGUSR1:
printf("Caught SIGUSR1");
break;
}
}
return 0;
}
The signals need to be blocked in every thread. The safest way to do this is to block them in the first thread before any others are created. Then a single, specially chosen thread can call sigsuspend() and only that thread will execute the signal handlers.
void *signal_handling_thread(void *whatever) {
sig_value := 0
while (sig_value not in (SIGTERM, SIGINT)) {
sigsuspend(empty_mask)
...
}
...
}
int main(int argc, char **argv) {
block_relevant_signals(); // SIG_BLOCK HUP, TERM, USR1, etc.
catch_relevant_signals(); // SA_SIGINFO ...
spawn_signal_handling_thread(); // spawned with relevant signals blocked
for (int i = 0; i < NUM_WORKERS; i++) {
spawn_worker_thread(); // spawned with relevant signals blocked
}
...
}
It's time to refactor your code to break apart concerns: do global process attribute manipulation in one place, signal-specific reaction in another, etc.
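The outline above can be turned into a small runnable sketch. It assumes only SIGUSR1 matters, uses kill(getpid(), ...) to stand in for the kernel module, and the names run_sigsuspend_demo and g_sig_value are illustrative rather than taken from the question.

```cpp
#include <cassert>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

// Last signal caught; written only by the handler.
static volatile sig_atomic_t g_sig_value = 0;

static void handler(int sig) { g_sig_value = sig; }

// The one thread that ever unblocks the signal, and only inside sigsuspend().
static void *signal_handling_thread(void *)
{
    sigset_t wait_mask;
    sigemptyset(&wait_mask);          // nothing blocked while suspended
    while (g_sig_value != SIGUSR1)
        sigsuspend(&wait_mask);       // the handler runs here, in this thread
    return nullptr;
}

// Hypothetical demo helper: returns the signal number the handler recorded.
int run_sigsuspend_demo()
{
    // 1. Block the signal before any thread exists, so all threads inherit it.
    sigset_t block;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    int rc = pthread_sigmask(SIG_BLOCK, &block, nullptr);
    assert(rc == 0);

    // 2. Install the handler (dispositions are process-wide).
    struct sigaction sa = {};
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, nullptr);
    assert(rc == 0);

    // 3. Spawn the signal-handling thread with the signal still blocked.
    pthread_t tid;
    rc = pthread_create(&tid, nullptr, signal_handling_thread, nullptr);
    assert(rc == 0);

    kill(getpid(), SIGUSR1);          // stands in for the kernel module
    pthread_join(tid, nullptr);
    return g_sig_value;
}
```

Because the signal stays blocked everywhere except during sigsuspend(), it remains pending if it arrives early and is delivered as soon as the dedicated thread suspends.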
In your signal handler, you are calling exit(-1). exit() is not async-signal-safe. Use _exit(-1) instead.
The difference between the two functions is that exit() calls all of the registered atexit() routines (including C++ static destructors). Before exit() does that shutdown step, it uses pthread_mutex_lock() to ensure a thread-safe shutdown. If the lock happens to be held by another thread, your program will deadlock.
_exit() skips those atexit routines and terminates the process.
I'm not familiar with error_sys(), but it looks like it ends up using printf()/fprintf(). Those routines also tend to be protected by mutexes.
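As a rough illustration of a handler that sticks to async-signal-safe calls, the sketch below uses only write() and _exit(). It runs the handler in a forked child so the exit status can be observed; SIGUSR2, the exit code 42, and the helper name run_exit_demo are arbitrary choices for the demo.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

// Handler restricted to async-signal-safe calls: write() and _exit().
static void fatal_handler(int)
{
    static const char msg[] = "fatal signal, exiting\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;
    _exit(42);   // skips atexit() routines and stdio flushing
}

// Hypothetical demo: forks a child that triggers the handler and returns
// the child's exit status as seen by the parent (or -1 on failure).
int run_exit_demo()
{
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        struct sigaction sa = {};
        sa.sa_handler = fatal_handler;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGUSR2, &sa, nullptr);
        raise(SIGUSR2);
        _exit(1);            // not reached if the handler ran
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```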
Here is an example to organize which thread gets which signal using pthread_sigmask: http://man7.org/linux/man-pages/man3/pthread_sigmask.3.html

I want to know which signal arrived when a system call is interrupted

My application has two threads. Each thread receives data from the server over its own socket. The threads block in epoll_wait(). Sometimes epoll_wait() returns -1 with errno set to EINTR, which means the system call was interrupted by a signal. I added handling for EINTR.
However, I do not know which signal arrived or why it arrived.
Method 1.
I created a thread.
sigset_t sMaskOfSignal;
sigset_t sOldMaskOfSignal;
sigfillset(&sMaskOfSignal);
sigprocmask(SIG_UNBLOCK, &sMaskOfSignal, &sOldMaskOfSignal)
while(1)
{
sigwait(&sMaskOfSignal, &sArrivedSignal);
fprintf(stdout, "%d(%s) signal caught\n", sArrivedSignal, strsignal(sArrivedSignal));
}
I could not catch a signal when epoll_wait() was interrupted.
Method 2
When I run my application under strace, epoll_wait() is never interrupted.
The problem reproduces reliably under GDB. I need help.
You can try installing your own signal handler. If your application is interrupted by a signal again, your handler will be called and you can see which signal was raised.
void
signal_callback_handler(int signum)
{
printf("Caught signal %d\n",signum);
exit(signum); // terminate application
}
int main()
{
// Register signal handler for all signals you want to handle
signal(SIGINT, signal_callback_handler);
signal(SIGABRT, signal_callback_handler);
signal(SIGSEGV, signal_callback_handler);
// .. and even more, if you want to
}
Not a very handy method, but it should (hopefully) let you find out which signal was raised. Take a look here to see the different signals that can be handled (note: not all signals can be handled in your own signal handler!).
Maybe you should try installing a handler that catches all signals, with the SA_SIGINFO flag set,
something like this:
struct sigaction act;
sigemptyset(&act.sa_mask);
act.sa_flags = SA_SIGINFO;
act.sa_sigaction = <handler>;
sigaction(SIGFPE, &act, 0);
sigaction(SIGHUP, &act, 0);
sigaction(SIGABRT, &act, 0);
sigaction(SIGILL, &act, 0);
sigaction(SIGALRM, &act, 0);
.
.
.
//and your handler looks like
void handle_sig (int sig, siginfo_t *info, void *ptr)
{
printf ("Signal is %d\n",sig);
}
Register the handler in your main program and ignore EINTR in epoll_wait().
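A possible shape for this, sketched under assumptions: the handler only records the signal number and the sender's PID in globals (so it stays async-signal-safe), and a small wrapper retries epoll_wait() on EINTR. The names watch_signal, epoll_wait_retry, g_last_signo and g_last_sender are invented for the example.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/epoll.h>
#include <unistd.h>

// Details of the most recent signal, recorded by the handler.
static volatile sig_atomic_t g_last_signo = 0;
static volatile sig_atomic_t g_last_sender = 0;   // si_pid; meaningful for kill()-style signals

static void record_signal(int signo, siginfo_t *info, void *)
{
    g_last_signo = signo;
    g_last_sender = info ? static_cast<sig_atomic_t>(info->si_pid) : 0;
}

// Installs the recording handler for one signal; returns 0 on success.
int watch_signal(int signo)
{
    struct sigaction sa = {};
    sa.sa_sigaction = record_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, nullptr);
}

// epoll_wait() wrapper that retries when a signal interrupts the call.
// Note: each retry starts again with the full timeout.
int epoll_wait_retry(int epfd, struct epoll_event *events, int maxevents, int timeout_ms)
{
    int n;
    do {
        n = epoll_wait(epfd, events, maxevents, timeout_ms);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

After an unexpected EINTR, the recorded values can be logged from normal (non-handler) code to see which signal arrived and which process sent it.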

Unable to handle signal, causing interrupted system calls

I am having trouble trying to handle a signal...
I have a multithreaded application, which receives a signal that interrupts system calls in a library that I am using. After some research, I found that if a signal is not handled, it gets sent to a random thread in the application. I have confirmed from the library I am using that they are not using any signals in their application. Neither am I. Here is my main driver class:
void sig_handler(int signum)
{
cout << "Signal Handle: " << signum << endl;
signal(signum, SIG_IGN);
}
class Driver
{
public:
Driver();
void LaunchLog();
void logonListen();
};
Driver::Driver()
{
pthread_t logThread;
pthread_create(&logThread, NULL, LaunchLogThread, (void*) this);
pthread_detach(logThread);
pthread_t listenThread;
pthread_create(&listenThread, NULL, LaunchListenThread, (void*)this);
pthread_detach(listenThread);
bool running = true;
while(running)
{
//Simple loop to resemble a menu for the console
}
}
void Driver::logonListen()
{
char buffer[256];
int logonPort = BASEPORT;
int sockfd;
struct sockaddr_in serv_addr, cli_addr;
socklen_t clilen;
//initialize socket
sockfd = socket(AF_INET, SOCK_DGRAM, 0);
if(sockfd < 0){
perror("Error opening socket");
exit(1);
}
bzero((char*)&serv_addr, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
serv_addr.sin_addr.s_addr = INADDR_ANY;
serv_addr.sin_port = htons(logonPort);
//bind socket to our address
if(bind(sockfd,(struct sockaddr*) &serv_addr, sizeof(serv_addr)) < 0)
{
perror("Error on bind");
exit(1);
}
clilen = sizeof(struct sockaddr_in);
int n;
int nextUser = 2;
char reply[32];
while(1)
{
//Wait for incoming connection on a socket
}
}
void Driver::LaunchLog()
{
Logger::Instance()->WriteToFile();
}
void* LaunchListenThread(void* ptr)
{
Driver* driver = (Driver*) ptr;
driver->logonListen();
}
int main()
{
signal (SIGHUP, sig_handler);
signal (SIGINT, sig_handler);
signal (SIGQUIT, sig_handler);
signal (SIGILL, sig_handler);
signal (SIGTRAP, sig_handler);
signal (SIGFPE, sig_handler);
signal (SIGKILL, sig_handler);
signal (SIGUSR1, sig_handler);
signal (SIGSEGV, sig_handler);
signal (SIGUSR2, sig_handler);
signal (SIGPIPE, sig_handler);
signal (SIGALRM, sig_handler);
signal (SIGTERM, sig_handler);
signal (SIGCHLD, sig_handler);
signal (SIGCONT, sig_handler);
signal (SIGSTOP, sig_handler);
signal (SIGTSTP, sig_handler);
signal (SIGTTIN, sig_handler);
signal (SIGTTOU, sig_handler);
signal (SIGABRT, sig_handler);
Driver driver;
return 0;
}
I am unable to handle the signals. Interrupted System Calls keep creeping up, and my signal handler never gets used. Even when I press CTRL+C in the console, the program ends with interruption rather than SIGINT being handled. Am I installing the handler incorrectly?
Is there a way to handle all signals and to ignore them if they arise?
Thanks
You want to use sigaction(2) with the SA_RESTART flag to ignore the signals and ensure that system calls get restarted instead of interrupted:
struct sigaction sa;
sa.sa_handler = SIG_IGN;
sigemptyset(&sa.sa_mask); /* no extra signals blocked */
sa.sa_flags = SA_RESTART;
sigaction(SIGALRM, &sa, NULL);
sigaction(SIGPIPE, &sa, NULL);
/* repeat for all the signals you want to ignore */
Note that you might not want to ignore things like SIGINT, as you then won't be able to stop your program with ctrl-C. Likewise, ignoring SIGSEGV may cause your program to hang if it contains a bug, rather than exiting.
edit
Your description that neither you nor the library is using any signals doesn't quite ring true -- the signals are coming from SOMEWHERE, and it may just be a case that you don't realize something you are doing is using signals under the hood. If you're using any alarms or itimers anywhere, those involve signals. If your sig_handler is truly not being called, that implies that someone else (your library?) is installing a signal handler to replace it.
If you still can't figure out where the signals are coming from, you can run under a debugger, and enable the debugger's signal debugging ability (if needed). That should at least tell you which signals are occurring, and where they are occurring.
In general, with any system call, you should ALWAYS check for errors, and if you see the error EINTR unexpectedly, you should probably just loop and redo the system call.
A couple things. First, you should be checking the return value of signal(). It can return SIG_ERR if there are problems installing your signal handler (which there probably are in some of those cases at least, because some of those signals are not trappable). Second, using stdio/iostream functions in a signal handler can be problematic due to their asynchronous (with respect to something else in the program that might be using the same facilities) nature. And third, in your signal handler, even if it's called the first time, you are then setting that signal to be ignored, so any subsequent instances of that signal (assuming it's one that's catchable) will simply be ignored. If you want to ignore them to begin with, you don't need to write a handler, just signal(SIG<whatever>, SIG_IGN) in your main().
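The point about checking return values can be shown with a small sketch. It uses sigaction() rather than signal(), and the helper name ignore_signal is made up for illustration; attempts to change the disposition of SIGKILL or SIGSTOP fail with EINVAL.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>

// Tries to ignore `signo` with SA_RESTART set.
// Returns 0 on success, or the errno value reported by sigaction().
int ignore_signal(int signo)
{
    struct sigaction sa = {};
    sa.sa_handler = SIG_IGN;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, nullptr) == -1)
        return errno;   // e.g. EINVAL for SIGKILL and SIGSTOP
    return 0;
}
```

Looping over the signals of interest and logging any non-zero result makes it obvious which registrations silently failed.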

Waitpid equivalent with timeout?

Imagine I have a process that starts several child processes. The parent needs to know when a child exits.
I can use waitpid, but then if/when the parent needs to exit I have no way of telling the thread that is blocked in waitpid to exit gracefully and join it. It's nice to have things clean up themselves, but it may not be that big of a deal.
I can use waitpid with WNOHANG, and then sleep for some arbitrary time to prevent a busy wait. However then I can only know if a child has exited every so often. In my case it may not be super critical that I know when a child exits right away, but I'd like to know ASAP...
I can use a signal handler for SIGCHLD, and in the signal handler do whatever I was going to do when a child exits, or send a message to a different thread to do some action. But using a signal handler obfuscates the flow of the code a little bit.
What I'd really like to do is use waitpid on some timeout, say 5 sec. Since exiting the process isn't a time critical operation, I can lazily signal the thread to exit, while still having it blocked in waitpid the rest of the time, always ready to react. Is there such a call in linux? Of the alternatives, which one is best?
EDIT:
Another method based on the replies would be to block SIGCHLD in all threads with pthread_sigmask(). Then in one thread, keep calling sigtimedwait() while looking for SIGCHLD. This means that I can time out on that call and check whether the thread should exit, and if not, remain blocked waiting for the signal. Once a SIGCHLD is delivered to this thread, we can react to it immediately, and in line of the wait thread, without using a signal handler.
Don't mix alarm() with wait(). You can lose error information that way.
Use the self-pipe trick. This turns any signal into a select()able event:
int selfpipe[2];
void selfpipe_sigh(int n)
{
int save_errno = errno;
(void)write(selfpipe[1], "",1);
errno = save_errno;
}
void selfpipe_setup(void)
{
static struct sigaction act;
if (pipe(selfpipe) == -1) { abort(); }
fcntl(selfpipe[0],F_SETFL,fcntl(selfpipe[0],F_GETFL)|O_NONBLOCK);
fcntl(selfpipe[1],F_SETFL,fcntl(selfpipe[1],F_GETFL)|O_NONBLOCK);
memset(&act, 0, sizeof(act));
act.sa_handler = selfpipe_sigh;
sigaction(SIGCHLD, &act, NULL);
}
Then, your waitpid-like function looks like this:
int selfpipe_waitpid(void)
{
static char dummy[4096];
fd_set rfds;
struct timeval tv;
int died = 0, st;
tv.tv_sec = 5;
tv.tv_usec = 0;
FD_ZERO(&rfds);
FD_SET(selfpipe[0], &rfds);
if (select(selfpipe[0]+1, &rfds, NULL, NULL, &tv) > 0) {
while (read(selfpipe[0],dummy,sizeof(dummy)) > 0);
while (waitpid(-1, &st, WNOHANG) > 0) died++; /* 0 means children remain but none has exited yet */
}
return died;
}
You can see in selfpipe_waitpid() how you can control the timeout and even mix with other select()-based IO.
Fork an intermediate child, which forks the real child and a timeout process and waits for all (both) of its children. When one exits, it'll kill the other one and exit.
pid_t intermediate_pid = fork();
if (intermediate_pid == 0) {
pid_t worker_pid = fork();
if (worker_pid == 0) {
do_work();
_exit(0);
}
pid_t timeout_pid = fork();
if (timeout_pid == 0) {
sleep(timeout_time);
_exit(0);
}
pid_t exited_pid = wait(NULL);
if (exited_pid == worker_pid) {
kill(timeout_pid, SIGKILL);
} else {
kill(worker_pid, SIGKILL); // Or something less violent if you prefer
}
wait(NULL); // Collect the other process
_exit(0); // Or some more informative status
}
waitpid(intermediate_pid, 0, 0);
Surprisingly simple :)
You can even leave out the intermediate child if you're sure no other module in the program is spawning child processes of its own.
This is an interesting question.
I found that sigtimedwait can do it.
EDIT 2016/08/29:
Thanks to Mark Edington for the suggestion. I've tested his example on Ubuntu 16.04, and it works as expected.
Note: this only works for child processes. It's a pity that there seems to be no equivalent of Windows' WaitForSingleObject(unrelated_process_handle, timeout) in Linux/Unix for being notified of an unrelated process's termination within a timeout.
OK, Mark Edington's sample code is here:
/* The program creates a child process and waits for it to finish. If a timeout
* elapses the child is killed. Waiting is done using sigtimedwait(). Race
* condition is avoided by blocking the SIGCHLD signal before fork().
*/
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
static pid_t fork_child (void)
{
int p = fork ();
if (p == -1) {
perror ("fork");
exit (1);
}
if (p == 0) {
puts ("child: sleeping...");
sleep (10);
puts ("child: exiting");
exit (0);
}
return p;
}
int main (int argc, char *argv[])
{
sigset_t mask;
sigset_t orig_mask;
struct timespec timeout;
pid_t pid;
sigemptyset (&mask);
sigaddset (&mask, SIGCHLD);
if (sigprocmask(SIG_BLOCK, &mask, &orig_mask) < 0) {
perror ("sigprocmask");
return 1;
}
pid = fork_child ();
timeout.tv_sec = 5;
timeout.tv_nsec = 0;
do {
if (sigtimedwait(&mask, NULL, &timeout) < 0) {
if (errno == EINTR) {
/* Interrupted by a signal other than SIGCHLD. */
continue;
}
else if (errno == EAGAIN) {
printf ("Timeout, killing child\n");
kill (pid, SIGKILL);
}
else {
perror ("sigtimedwait");
return 1;
}
}
break;
} while (1);
if (waitpid(pid, NULL, 0) < 0) {
perror ("waitpid");
return 1;
}
return 0;
}
If your program runs only on contemporary Linux kernels (5.3 or later), the preferred way is to use pidfd_open (https://lwn.net/Articles/789023/ https://man7.org/linux/man-pages/man2/pidfd_open.2.html).
This system call returns a file descriptor representing a process, and then you can select, poll or epoll it, the same way you wait on other types of file descriptors.
For example,
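int fd = pidfd_open(pid, 0); /* glibc 2.36+ declares this in <sys/pidfd.h>; otherwise use syscall(SYS_pidfd_open, pid, 0) */
struct pollfd pfd = {fd, POLLIN, 0};
int exited = (poll(&pfd, 1, 1000) == 1); /* true if the process exited within 1000 ms */
close(fd);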
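A fuller sketch of the same idea follows, wrapped in a waitpid-with-timeout helper. It goes through syscall() so it does not depend on a libc wrapper; the fallback value 434 for SYS_pidfd_open is an assumption for toolchains whose headers predate the call, and the name waitpid_timeout_pidfd is invented here.

```cpp
#include <cassert>
#include <cerrno>
#include <poll.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434   // assumption: syscall number used by Linux 5.3+
#endif

// Waits up to timeout_ms for child `pid` to exit.
// Returns 1 if it exited (and was reaped), 0 on timeout, -1 on error.
int waitpid_timeout_pidfd(pid_t pid, int timeout_ms, int *status)
{
    int fd = static_cast<int>(syscall(SYS_pidfd_open, pid, 0));
    if (fd == -1) return -1;            // e.g. ENOSYS on kernels before 5.3

    struct pollfd pfd = {fd, POLLIN, 0};
    int n;
    do {
        n = poll(&pfd, 1, timeout_ms);  // the pidfd becomes readable when the process exits
    } while (n == -1 && errno == EINTR);
    int saved = errno;
    close(fd);
    errno = saved;

    if (n <= 0) return n;               // 0 = timeout, -1 = error
    return waitpid(pid, status, WNOHANG) == pid ? 1 : -1;
}
```

On timeout the child is left running and unreaped, so the caller decides whether to kill it or wait again.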
waitpid() can be interrupted by a signal, so you could set a timer before calling it, and it will fail with EINTR when the timer signal is raised. Edit: It should be as simple as calling alarm(5) before calling waitpid(), provided a SIGALRM handler is installed without SA_RESTART (with the default disposition, SIGALRM terminates the process).
I thought that select() would return EINTR when SIGCHLD is raised by one of the children.
I believe this should work:
while(1)
{
// assumes a SIGCHLD handler is installed; with the default disposition the signal is discarded and select() is not interrupted
int retval = select(0, NULL, NULL, NULL, &tv);
if (retval == -1 && errno == EINTR) // some signal
{
pid_t pid = waitpid(-1, &st, WNOHANG);
if (pid > 0) // some child exited
{
// handle the exited child here
}
}
else if (retval == 0)
{
// timeout
break;
}
else // error
{
break;
}
}
Note: you can use pselect to override current sigmask and avoid interrupts from unneeded signals.
Instead of calling waitpid() directly, you could call sigtimedwait() with SIGCHLD (which is sent to the parent process after a child exits) and wait for it to be delivered to the current thread; as the function name suggests, a timeout parameter is supported.
Please check the following code snippet for details:
static bool waitpid_with_timeout(pid_t pid, int timeout_ms, int* status) {
sigset_t child_mask, old_mask;
sigemptyset(&child_mask);
sigaddset(&child_mask, SIGCHLD);
if (sigprocmask(SIG_BLOCK, &child_mask, &old_mask) == -1) {
printf("*** sigprocmask failed: %s\n", strerror(errno));
return false;
}
timespec ts;
ts.tv_sec = MSEC_TO_SEC(timeout_ms);
ts.tv_nsec = (timeout_ms % 1000) * 1000000;
int ret = TEMP_FAILURE_RETRY(sigtimedwait(&child_mask, NULL, &ts));
int saved_errno = errno;
// Set the signals back the way they were.
if (sigprocmask(SIG_SETMASK, &old_mask, NULL) == -1) {
printf("*** sigprocmask failed: %s\n", strerror(errno));
if (ret == 0) {
return false;
}
}
if (ret == -1) {
errno = saved_errno;
if (errno == EAGAIN) {
errno = ETIMEDOUT;
} else {
printf("*** sigtimedwait failed: %s\n", strerror(errno));
}
return false;
}
pid_t child_pid = waitpid(pid, status, WNOHANG);
if (child_pid != pid) {
if (child_pid != -1) {
printf("*** Waiting for pid %d, got pid %d instead\n", pid, child_pid);
} else {
printf("*** waitpid failed: %s\n", strerror(errno));
}
return false;
}
return true;
}
Refer: https://android.googlesource.com/platform/frameworks/native/+/master/cmds/dumpstate/DumpstateUtil.cpp#46
If you're going to use signals anyways (as per Steve's suggestion), you can just send the signal manually when you want to exit. This will cause waitpid to return EINTR and the thread can then exit. No need for a periodic alarm/restart.
Due to circumstances I absolutely needed this to run in the main thread and it was not very simple to use the self-pipe trick or eventfd because my epoll loop was running in another thread. So I came up with this by scrounging together other stack overflow handlers. Note that in general it's much safer to do this in other ways but this is simple. If anyone cares to comment about how it's really really bad then I'm all ears.
NOTE: It is absolutely necessary to block signal handling in every thread except the one you want to run this in. I do this by default, as I believe it is messy to handle signals in random threads.
static void ctlAlarmSignalHandler(int s); /* forward declaration; defined below */
static void ctlWaitPidTimeout(pid_t child, useconds_t usec, int *timedOut) {
int rc = -1;
static pthread_mutex_t alarmMutex = PTHREAD_MUTEX_INITIALIZER;
TRACE("ctlWaitPidTimeout: waiting on %lu\n", (unsigned long) child);
/**
* paranoid, in case this was called twice in a row by different
* threads, which could quickly turn very messy.
*/
pthread_mutex_lock(&alarmMutex);
/* set the alarm handler */
struct sigaction alarmSigaction;
struct sigaction oldSigaction;
sigemptyset(&alarmSigaction.sa_mask);
alarmSigaction.sa_flags = 0;
alarmSigaction.sa_handler = ctlAlarmSignalHandler;
sigaction(SIGALRM, &alarmSigaction, &oldSigaction);
/* set alarm, because no alarm is fired when the first argument is 0, 1 is used instead */
ualarm((usec == 0) ? 1 : usec, 0);
/* wait for the child we just killed */
rc = waitpid(child, NULL, 0);
/* if errno == EINTR, the alarm went off, set timedOut to true */
*timedOut = (rc == -1 && errno == EINTR);
/* in case we did not time out, unset the current alarm so it doesn't bother us later */
ualarm(0, 0);
/* restore old signal action */
sigaction(SIGALRM, &oldSigaction, NULL);
pthread_mutex_unlock(&alarmMutex);
TRACE("ctlWaitPidTimeout: timeout wait done, rc = %d, error = '%s'\n", rc, (rc == -1) ? strerror(errno) : "none");
}
static void ctlAlarmSignalHandler(int s) {
TRACE("ctlAlarmSignalHandler: alarm occurred, %d\n", s);
}
EDIT: I've since transitioned to using a solution that integrates well with my existing epoll()-based eventloop, using timerfd. I don't really lose any platform-independence since I was using epoll anyway, and I gain extra sleep because I know the unholy combination of multi-threading and UNIX signals won't hurt my program again.
I can use a signal handler for SIGCHLD, and in the signal handler do whatever I was going to do when a child exits, or send a message to a different thread to do some action. But using a signal handler obfuscates the flow of the code a little bit.
In order to avoid race conditions you should avoid doing anything more complex than changing a volatile flag in a signal handler.
I think the best option in your case is to send a signal to the parent. waitpid() will then set errno to EINTR and return. At this point you check for waitpid return value and errno, notice you have been sent a signal and take appropriate action.
If a third party library is acceptable then the libkqueue project emulates kqueue (the *BSD eventing system) and provides basic process monitoring with EVFILT_PROC + NOTE_EXIT.
The main advantages of using kqueue or libkqueue are that it's cross-platform and doesn't have the complexity of signal handling. If your program uses async I/O, you may also find it a lower-friction interface than something like epoll and the various *fd functions (signalfd, eventfd, pidfd, etc.).
#include <stdio.h>
#include <stdint.h>
#include <string.h> /* for strerror */
#include <errno.h>
#include <unistd.h> /* for close */
#include <sys/event.h> /* kqueue header */
#include <sys/types.h> /* for pid_t */
/* Link with -lkqueue */
int waitpid_timeout(pid_t pid, struct timespec *timeout)
{
struct kevent changelist, eventlist;
int kq, ret;
/* Populate a changelist entry (an event we want to be notified of) */
EV_SET(&changelist, pid, EVFILT_PROC, EV_ADD, NOTE_EXIT, 0, NULL);
kq = kqueue();
/* Call kevent with a timeout */
ret = kevent(kq, &changelist, 1, &eventlist, 1, timeout);
/* Kevent returns 0 on timeout, the number of events that occurred, or -1 on error */
switch (ret) {
case -1:
printf("Error %s\n", strerror(errno));
break;
case 0:
printf("Timeout\n");
break;
case 1:
printf("PID %u exited, status %u\n", (unsigned int)eventlist.ident, (unsigned int)eventlist.data);
break;
}
close(kq);
return ret;
}
Behind the scenes on Linux libkqueue uses either pidfd on Linux kernels >= 5.3 or a waiter thread that listens for SIGCHLD and notifies one or more kqueue instances when a process exits. The second approach is not efficient (it scans PIDs that interest has been registered for using waitid), but that doesn't matter unless you're waiting on large numbers of PIDs.
EVFILT_PROC support has been included in kqueue since its inception, and in libkqueue since v2.5.0.