Weird sigwait() behaviour and understanding blocking and unblocking signals - C++

Consider the following piece of code:
// Sigaction and timers
struct sigaction sa;
sigset_t maskSet, pendingSet;

/**
 * Blocks the SIG_SETMASK in maskSet.
 * @return None.
 */
static void block_signal(void)
{
    // ~~~Blocking signal~~~
    if (sigemptyset(&pendingSet) == -1)
    {
        // error handling
    }
    if (sigpending(&pendingSet) == -1)
    {
        // error handling
    }
    if (sigprocmask(SIG_BLOCK, &maskSet, NULL) == -1)
    {
        // error handling
    }
}

/**
 * Unblocks the SIG_SETMASK in maskSet.
 * @return None.
 */
static void unblock_signal(void)
{
    // ~~~Blocking signal~~~
    int result;
    int sig;
    // If we got a signal while performing operation that require signal block.
    if (sigismember(&pendingSet, SIGVTALRM) != -1)
    {
        result = sigwait(&pendingSet, &sig);
        if (result == 0)
        {
            printf("sigwait() returned for signal %d\n", sig);
            //Do stuff
        }
    }
    if (sigprocmask(SIG_UNBLOCK, &maskSet, NULL) == -1)
    {
        // error handling
    }
}
My main goal is to be able to run functions during which SIGVTALRM, raised by a timer I defined, is blocked (block_signal -> do something while signals are blocked -> unblock_signal). maskSet is initialized with SIGVTALRM.
Two questions:
The way I implemented it, result = sigwait(&pendingSet, &sig); causes the program to go into an infinite loop.
Without sigwait(): I use a virtual timer that raises SIGVTALRM at every defined interval of time.
Suppose I blocked SIGVTALRM. I understand that as soon as it is raised while blocked, the signal becomes pending, and that as soon as I unblock it, the signal is received and treated by a signal handler.
What I don't understand is what happens with the NEXT signal raised. Will the next signal be raised a defined interval of time after the PREVIOUS signal was released, or a defined interval of time from the moment the PREVIOUS signal was raised (and blocked -> became pending)?
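The block -> pending -> unblock sequence described above can be checked with a small sketch. It is an illustration under assumptions, not the asker's program: raise() stands in for the timer expiring, and the names on_vtalrm and pending_then_delivered are hypothetical. Note that sigismember() returns 1, 0 or -1, so a test of the form `!= -1` is true for any valid signal whether or not it is pending; the sketch compares against 1 instead.

```cpp
#include <csignal>
#include <signal.h>

namespace {
// Counts handler invocations; sig_atomic_t keeps the access signal-safe.
volatile std::sig_atomic_t g_handler_runs = 0;

void on_vtalrm(int) { g_handler_runs = g_handler_runs + 1; }
}  // namespace

// Returns true if SIGVTALRM stayed pending while blocked and the handler
// ran only once the signal was unblocked.
bool pending_then_delivered() {
    struct sigaction sa{};
    sa.sa_handler = on_vtalrm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGVTALRM, &sa, nullptr) == -1) return false;

    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGVTALRM);
    if (sigprocmask(SIG_BLOCK, &mask, nullptr) == -1) return false;

    g_handler_runs = 0;
    raise(SIGVTALRM);  // assumption: stands in for the virtual timer firing

    // Query the pending set AFTER the signal was raised, while still blocked.
    sigset_t pending;
    sigemptyset(&pending);
    if (sigpending(&pending) == -1) return false;
    const bool was_pending = sigismember(&pending, SIGVTALRM) == 1;
    const bool ran_while_blocked = g_handler_runs != 0;

    // Unblocking delivers the pending signal before sigprocmask() returns.
    if (sigprocmask(SIG_UNBLOCK, &mask, nullptr) == -1) return false;

    return was_pending && !ran_while_blocked && g_handler_runs == 1;
}
```

Calling pending_then_delivered() from a single-threaded program should return true. The sketch does not model the periodic timer itself, only how one blocked signal is held and then delivered.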


libuv signal handling in multithreaded programs

In a multithreaded C++ program where the main thread is executing a libuv event loop, is it guaranteed that this event loop thread is executing signal handlers registered using uv_signal_start?
Background information:
The I/O (or event) loop is [...] meant to be tied to a single thread.
But as we are in a multithreaded program, signal handlers can be executed by other threads
According to POSIX.1, a process-directed signal (sent using kill(2), for example) should be handled by a single, arbitrarily selected thread within the process.
So my question is basically whether libuv signal handling works as advertised
Signal handles implement Unix style signal handling on a per-event loop bases.
even in multithreaded programs.
TLDR: Yes, should work as advertised.
From my understanding of libuv's source code unix/signal.c there is a generic signal handler
static void uv__signal_handler(int signum) {
  uv__signal_msg_t msg;
  uv_signal_t* handle;
  int saved_errno;

  saved_errno = errno;
  memset(&msg, 0, sizeof msg);

  if (uv__signal_lock()) {
    errno = saved_errno;
    return;
  }

  for (handle = uv__signal_first_handle(signum);
       handle != NULL && handle->signum == signum;
       handle = RB_NEXT(uv__signal_tree_s, &uv__signal_tree, handle)) {
    int r;

    msg.signum = signum;
    msg.handle = handle;

    /* write() should be atomic for small data chunks, so the entire message
     * should be written at once. In theory the pipe could become full, in
     * which case the user is out of luck.
     */
    do {
      r = write(handle->loop->signal_pipefd[1], &msg, sizeof msg);
    } while (r == -1 && errno == EINTR);

    assert(r == sizeof msg ||
           (r == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)));

    if (r != -1)
      handle->caught_signals++;
  }

  uv__signal_unlock();
  errno = saved_errno;
}
in which the pipe handle->loop->signal_pipefd[1] is used to tell the handle's associated loop about the incoming signal. Indeed, this generic signal handler can be called from any thread; however, the loop thread (the main thread in my setting) will then call the user's specific signal handler registered with uv_signal_start when it reads from the other end of the pipe, signal_pipefd[0], in the next loop iteration.
This was for the unix source code and the windows win/signal.c source has a similar mechanism.
So the answer should be yes, it should also work as advertised in a multithreaded setting, i.e. the registered handler will be executed by the loop thread.
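The mechanism described in this answer is the classic self-pipe pattern. The following is a minimal sketch of that pattern in plain POSIX C++, not libuv's implementation: the names g_pipe, forward_signal, install_forwarder and next_signal are hypothetical, and a single byte is sent instead of libuv's message struct.

```cpp
#include <cerrno>
#include <csignal>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

namespace {
int g_pipe[2] = {-1, -1};  // [0] = read end (loop thread), [1] = write end (handler)

// Runs on whichever thread the kernel picks; only async-signal-safe calls here.
void forward_signal(int signum) {
    const int saved_errno = errno;
    const unsigned char byte = static_cast<unsigned char>(signum);
    ssize_t r;
    do {
        r = ::write(g_pipe[1], &byte, 1);
    } while (r == -1 && errno == EINTR);
    errno = saved_errno;
}
}  // namespace

// Creates the pipe on first use and installs the forwarding handler.
bool install_forwarder(int signum) {
    if (g_pipe[0] == -1) {
        if (::pipe(g_pipe) == -1) return false;
        // A full pipe must not block the handler, so the write end is non-blocking.
        const int flags = ::fcntl(g_pipe[1], F_GETFL, 0);
        if (flags == -1 || ::fcntl(g_pipe[1], F_SETFL, flags | O_NONBLOCK) == -1)
            return false;
    }
    struct sigaction sa{};
    sa.sa_handler = forward_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signum, &sa, nullptr) == 0;
}

// Called by the loop thread: blocks until a signal number arrives.
// The user's callback would be invoked here, on the loop thread.
int next_signal() {
    unsigned char byte = 0;
    ssize_t r;
    do {
        r = ::read(g_pipe[0], &byte, 1);
    } while (r == -1 && errno == EINTR);
    return r == 1 ? static_cast<int>(byte) : -1;
}
```

In a real event loop the read end would be registered with the poller rather than read with a blocking call; the point of the sketch is only that the handler does nothing but write, while the user-visible work happens on the thread that reads.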

How to do mutually exclusive read or write operation on a global variable in signal handler and normal function

The application is multi-threaded.
Inside main(), I register the signal handler for SIGUSR1:
// Global variable to indicate whether directory's
// content needs to be reloaded
bool reload_dir = 0;

int main (int argc, char *argv[])
{
    signal(SIGUSR1, sigusr1_handler);
    RunServer(arg1, ...);
    return 0;
}
Signal handler:
static void
sigusr1_handler (int signo __unused)
{
    reload_dir = 1;
}
The following function (which is called from main) is only executed by the main thread:
RunServer (arg1, ...)
{
    // do some stuffs
    while (cond != true) {
        ...
    }
}
Now, when SIGUSR1 is caught (by any thread, including the main thread), I set the reload_dir variable to 1, and in RunServer() I reload the directory based on that value. However, I also reset the global variable reload_dir to 0 to avoid reloading the directory indefinitely, and this reset introduces a race.
Since we should not use locks or mutexes in a signal handler, how can I achieve this without majorly changing the existing application design?
RunServer (arg1, ...)
{
    // do some stuffs
    while (cond != true) {
        if (reload_dir) {
            // reset reload_dir to avoid loading repeatedly indefinitely
            reload_dir = 0; // Race condition?
            ...
        }
    }
}
Block SIGUSR1 with pthread_sigmask before any threads are spawned, so that all threads inherit that mask. Then, in main(), use sigtimedwait instead of sleep in your main loop to check if USR1 has been delivered.
int main(...) {
    sigset_t ss_usr1;
    sigemptyset(&ss_usr1);  // the set must be initialized before sigaddset
    sigaddset(&ss_usr1, SIGUSR1);
    // block SIGUSR1
    pthread_sigmask(SIG_BLOCK, &ss_usr1, NULL);
    ... spawn threads ...
    // RunServer
    struct timespec patience = { .tv_sec = 1 };
    while (! some_condition) {
        int s = sigtimedwait(&ss_usr1, NULL, &patience);
        if (s == SIGUSR1) dir_reinit();
        // handle errors other than timeout appropriately
    }
}
Benefits: no signal handler complexity (and you won't be chastised, rightly, for using signal instead of the superior sigaction); no need for atomic flags or sig_atomic_t; easier to reason about behavior.
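A compilable sketch of the same idea follows, assuming a platform that provides sigtimedwait (Linux does). The helper names block_usr1 and reload_requested are hypothetical; the second one reports whether a SIGUSR1 was consumed within the timeout and retries on EINTR.

```cpp
#include <cerrno>
#include <csignal>
#include <ctime>
#include <signal.h>

// Blocks SIGUSR1 in the calling thread and fills *set with it.
// Threads created afterwards inherit this mask.
bool block_usr1(sigset_t* set) {
    sigemptyset(set);
    sigaddset(set, SIGUSR1);
    return pthread_sigmask(SIG_BLOCK, set, nullptr) == 0;
}

// Waits up to timeout_ms for a pending SIGUSR1.
// Returns true if one was consumed, false on timeout or other error.
bool reload_requested(const sigset_t* set, long timeout_ms) {
    timespec patience{};
    patience.tv_sec = timeout_ms / 1000;
    patience.tv_nsec = (timeout_ms % 1000) * 1000000L;
    for (;;) {
        const int s = sigtimedwait(set, nullptr, &patience);
        if (s == SIGUSR1) return true;
        if (s == -1 && errno == EINTR) continue;  // interrupted by another signal
        return false;                             // EAGAIN means the timeout expired
    }
}
```

The main loop would then call reload_requested(&set, 1000) once per iteration and reload the directory when it returns true. Because the signal is consumed synchronously, there is no flag to reset and therefore no race on it.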

Creating a dispatch queue / thread handler in C++ with pipes: FIFOs overfilling

Threads are resource-heavy to create and use, so often a pool of threads will be reused for asynchronous tasks. A task is packaged up, and then "posted" to a broker that will enqueue the task on the next available thread.
This is the idea behind dispatch queues (i.e. Apple's Grand Central Dispatch), and thread handlers (Android's Looper mechanism).
Right now, I'm trying to roll my own. In fact, I'm plugging a gap in Android whereby there is an API for posting tasks in Java, but not in the native NDK. However, I'm keeping this question platform independent where I can.
Pipes are the ideal choice for my scenario. I can easily poll the file descriptor of the read-end of a pipe(2) on my worker thread, and enqueue tasks from any other thread by writing to the write-end. Here's what that looks like:
int taskRead, taskWrite;

void setup() {
    // Create the pipe
    int taskPipe[2];
    ::pipe(taskPipe);
    taskRead = taskPipe[0];
    taskWrite = taskPipe[1];

    // Set up a routine that is called when taskRead reports new data
    function_that_polls_file_descriptor(taskRead, []() {
        // Read the callback data
        std::function<void(void)>* taskPtr;
        ::read(taskRead, &taskPtr, sizeof(taskPtr));

        // Run the task - this is unsafe! See below.
        (*taskPtr)();

        // Clean up
        delete taskPtr;
    });
}

void post(const std::function<void(void)>& task) {
    // Copy the function onto the heap
    auto* taskPtr = new std::function<void(void)>(task);

    // Write the pointer to the pipe - this may block if the FIFO is full!
    ::write(taskWrite, &taskPtr, sizeof(taskPtr));
}
This code puts a std::function on the heap, and passes the pointer to the pipe. The function_that_polls_file_descriptor then calls the provided expression to read the pipe and execute the function. Note that there are no safety checks in this example.
This works great 99% of the time, but there is one major drawback. Pipes have a limited size, and if the pipe is filled, then calls to post() will hang. This in itself is not unsafe, until a call to post() is made within a task.
std::function<void(void)> evil = []() {
    // Post a new task back onto the queue
    post(evil);
    // Not enough new tasks, let's make more!
    for (int i = 0; i < 3; i++) {
        post(evil);
    }
    // Now for each time this task is posted, 4 more tasks will be added to the queue.
};
If this happens, then the worker thread will be blocked, waiting to write to the pipe. But the pipe's FIFO is full, and the worker thread is not reading anything from it, so the entire system is in deadlock.
What can be done to ensure that calls to post() emanating from the worker thread always succeed, allowing the worker to continue processing the queue in the event it is full?
Thanks to all the comments and other answers in this post, I now have a working solution to this problem.
The trick I've employed is to prioritise worker threads by checking which thread is calling post(). Here is the rough algorithm:
overflow ← Ø
success ← WRITE(task, pipe)
IF NOT success THEN
overflow ← overflow ∪ {task}
Then on the worker thread:
task ← READ(pipe)
FOR EACH overtask ∈ overflow
overflow ← Ø
The wait is performed with pselect(2), adapted from the answer by @Sigismondo.
Here's the algorithm implemented in my original code example that will work for a single worker thread (although I haven't tested it after copy-paste). It can be extended to work for a thread pool by having a separate overflow queue for each thread.
int taskRead, taskWrite;

// These variables are only allowed to be modified by the worker thread
std::thread::id workerId;
std::queue<std::function<void(void)>*> overflow;
bool overflowInUse;

void setup() {
    int taskPipe[2];
    ::pipe(taskPipe);
    taskRead = taskPipe[0];
    taskWrite = taskPipe[1];

    // Make the pipe non-blocking to check pipe overflows manually
    ::fcntl(taskWrite, F_SETFL, ::fcntl(taskWrite, F_GETFL, 0) | O_NONBLOCK);

    // Save the ID of this worker thread to compare later
    workerId = std::this_thread::get_id();
    overflowInUse = false;

    function_that_polls_file_descriptor(taskRead, []() {
        // Read the callback data
        std::function<void(void)>* taskPtr;
        ::read(taskRead, &taskPtr, sizeof(taskPtr));

        // Run the task
        (*taskPtr)();
        delete taskPtr;

        // Run any tasks that were posted to the overflow
        while (!overflow.empty()) {
            taskPtr = overflow.front();
            overflow.pop();
            (*taskPtr)();
            delete taskPtr;
        }

        // Release the overflow mechanism if applicable
        overflowInUse = false;
    });
}

bool write(std::function<void(void)>* taskPtr, bool blocking = true) {
    ssize_t rc = ::write(taskWrite, &taskPtr, sizeof(taskPtr));

    // Failure handling
    if (rc < 0) {
        // If blocking is allowed, wait for pipe to become available
        int err = errno;
        if ((err == EAGAIN || err == EWOULDBLOCK) && blocking) {
            fd_set fds;
            FD_ZERO(&fds);
            FD_SET(taskWrite, &fds);
            // nfds must be the highest descriptor in the set plus one
            ::pselect(taskWrite + 1, nullptr, &fds, nullptr, nullptr, nullptr);

            // Try again
            return write(taskPtr);
        }

        // Otherwise return false
        return false;
    }

    return true;
}

void post(const std::function<void(void)>& task) {
    auto* taskPtr = new std::function<void(void)>(task);

    if (std::this_thread::get_id() == workerId) {
        // The worker thread gets 1st-class treatment.
        // It won't be blocked if the pipe is full, instead
        // using an overflow queue until the overflow has been cleared.
        if (!overflowInUse) {
            bool success = write(taskPtr, false);
            if (!success) {
                overflow.push(taskPtr);
                overflowInUse = true;
            }
        } else {
            overflow.push(taskPtr);
        }
    } else {
        write(taskPtr);
    }
}
Make the pipe write file descriptor non-blocking, so that write fails with EAGAIN when the pipe is full.
One improvement is to increase the pipe buffer size.
Another is to use a UNIX socket/socketpair and increase the socket buffer size.
Yet another solution is to use a UNIX datagram socket which many worker threads can read from, but only one gets the next datagram. In other words, you can use a datagram socket as a thread dispatcher.
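The first suggestion in this answer, a non-blocking write end that fails with EAGAIN when the pipe is full, can be sketched as follows. The helper names make_nonblocking, try_write_token and count_capacity are hypothetical and only illustrate the behaviour; they are not taken from any of the answers.

```cpp
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Adds O_NONBLOCK to the descriptor's existing status flags.
bool make_nonblocking(int fd) {
    const int flags = ::fcntl(fd, F_GETFL, 0);
    return flags != -1 && ::fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Tries to enqueue one pointer-sized token.
// Returns true if it was written, false if the pipe is full (errno is
// EAGAIN or EWOULDBLOCK) or another error occurred.
bool try_write_token(int fd, const void* token) {
    for (;;) {
        const ssize_t rc = ::write(fd, &token, sizeof token);
        if (rc == static_cast<ssize_t>(sizeof token)) return true;
        if (rc == -1 && errno == EINTR) continue;  // retry after a signal
        return false;
    }
}

// Fills the pipe with null tokens and reports how many fitted.
std::size_t count_capacity(int fd) {
    std::size_t written = 0;
    while (try_write_token(fd, nullptr)) ++written;
    return written;
}
```

A caller that gets false back can park the task in an in-memory queue and retry later instead of blocking, which is what the overflow approach earlier in this thread does for the worker thread.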
You can use good old select to determine whether the file descriptors are ready to be used for writing:
The file descriptors in writefds will be watched to see if
space is available for write (though a large write may still block).
Since you are writing a pointer, your write() cannot be classified as large at all.
Clearly you must be ready to handle the fact that a post may fail, and then be ready to retry it later; otherwise you will be facing indefinitely growing pipes, until your system breaks again.
More or less (not tested):
bool post(const std::function<void(void)>& task) {
    bool post_res = false;

    // Copy the function onto the heap
    auto* taskPtr = new std::function<void(void)>(task);

    fd_set wfds;
    struct timeval tv;
    int retval;

    FD_ZERO(&wfds);
    FD_SET(taskWrite, &wfds);

    // Don't wait at all
    tv.tv_sec = 0;
    tv.tv_usec = 0;

    // The first argument must be the highest descriptor in the set plus one
    retval = select(taskWrite + 1, NULL, &wfds, NULL, &tv);

    // select() returns 0 when no FD's are ready
    if (retval == -1) {
        // handle error condition
    } else if (retval > 0) {
        // Write the pointer to the pipe. This write should succeed,
        // provided no other thread fills the pipe in the meantime
        ::write(taskWrite, &taskPtr, sizeof(taskPtr));
        post_res = true;
    }

    // Nothing was queued, so free the heap copy instead of leaking it
    if (!post_res) delete taskPtr;

    return post_res;
}
If you only target Android/Linux, a pipe is not state of the art; an event file descriptor (eventfd) together with epoll is the way to go.
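As a rough, Linux-only illustration of the eventfd suggestion: the tasks themselves can live in an ordinary mutex-protected queue, and the eventfd serves only as the wake-up descriptor that the worker polls. The class name EventQueue and its methods are hypothetical. Since the in-memory queue is unbounded, post() never blocks on a full FIFO, at the cost of unbounded memory growth if tasks multiply.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <mutex>
#include <sys/eventfd.h>
#include <unistd.h>
#include <utility>

class EventQueue {
public:
    EventQueue() : fd_(::eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC)) {}
    ~EventQueue() { if (fd_ != -1) ::close(fd_); }
    EventQueue(const EventQueue&) = delete;
    EventQueue& operator=(const EventQueue&) = delete;

    // Descriptor to register with epoll/poll; readable when tasks are queued.
    int fd() const { return fd_; }

    // Safe to call from any thread, including from inside a running task.
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push_back(std::move(task));
        }
        const std::uint64_t one = 1;
        const ssize_t r = ::write(fd_, &one, sizeof one);  // bumps the counter
        (void)r;  // a failed bump only delays the wake-up; the task is queued
    }

    // Called on the worker when fd() is readable. Runs the tasks queued so far
    // and returns how many ran. Tasks posted meanwhile wait for the next call.
    std::size_t drain() {
        std::uint64_t counter = 0;
        const ssize_t r = ::read(fd_, &counter, sizeof counter);  // resets to 0
        (void)r;
        std::deque<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            batch.swap(tasks_);
        }
        for (auto& task : batch) task();
        return batch.size();
    }

private:
    int fd_;
    std::mutex mutex_;
    std::deque<std::function<void()>> tasks_;
};
```

The mutex is released before any task runs, so a task that calls post() on the same queue does not deadlock; its new task is simply picked up by the next drain().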

Multiple Signal usage in cpp

with reference to this, I have included two timers (it_val1, it_val) in setTimer() in my program as below:
void stepRoutingTable(){
    ...
}

void incrementCounter(){
    ...
}

void setTimer(){
    struct itimerval it_val1;
    if (signal(SIGALRM, (void (*)(int)) incrementCounter) == SIG_ERR) {
        cerr<<"Unable to catch SIGALRM"<<endl;
    }
    it_val1.it_value.tv_sec = updateInterval;
    it_val1.it_value.tv_usec = (updateInterval) % 1000000;
    it_val1.it_interval = it_val1.it_value;
    if (setitimer(ITIMER_REAL, &it_val1, NULL) == -1) {
        cerr<<"error calling setitimer()";
    }

    struct itimerval it_val;
    if (signal(SIGALRM, (void (*)(int)) stepRoutingTable) == SIG_ERR) {
        cerr<<"Unable to catch SIGALRM"<<endl;
    }
    it_val.it_value.tv_sec = updateInterval;
    it_val.it_value.tv_usec = (updateInterval) % 1000000;
    it_val.it_interval = it_val.it_value;
    if (setitimer(ITIMER_REAL, &it_val, NULL) == -1) {
        cerr<<"error calling setitimer()";
    }
}

int main(int argc, char* ipCmd[]){
    ...
}
But only it_val is triggered upon execution and not it_val1, what could be the error?
There's only one SIGALRM signal handler, and only one ITIMER_REAL timer.
Installing a handler for SIGALRM removes the previous signal handler, and replaces it with a new one.
Setting the ITIMER_REAL timer clears any previously set timer, and replaces it with a new one.
The shown code sets the SIGALRM handler, and sets the ITIMER_REAL timer. Then, the shown code does this again.
The final result is that only the second timer and signal handler remain in effect. There's only one ITIMER_REAL timer, and when it expires, as described, a SIGALRM signal is generated, and whatever signal handler is installed at that time is the one that will be invoked.
If you need to implement a framework for multiple timeouts, with a signal handler for each one, you will have to write this framework yourself, in terms of a single timer, and a single signal handler.
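One possible shape for such a framework is sketched below, under assumptions: the single ITIMER_REAL timer ticks at some base period, the single SIGALRM handler only sets a flag, and the main loop calls on_tick() once per tick. The class name MultiTimer and its methods are hypothetical. The callbacks run from normal code rather than from the signal handler, because std::function is not async-signal-safe.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Multiplexes several periodic callbacks onto one base tick.
class MultiTimer {
public:
    // Registers a callback that fires every `every_n_ticks` base ticks.
    void add(unsigned every_n_ticks, std::function<void()> callback) {
        if (every_n_ticks == 0) every_n_ticks = 1;  // a period of 0 makes no sense
        entries_.push_back({every_n_ticks, every_n_ticks, std::move(callback)});
    }

    // Call once per expiry of the single underlying timer.
    void on_tick() {
        for (auto& entry : entries_) {
            if (--entry.remaining == 0) {
                entry.remaining = entry.period;  // re-arm for the next period
                entry.callback();
            }
        }
    }

private:
    struct Entry {
        unsigned period;     // length of the period in base ticks
        unsigned remaining;  // ticks left until the next firing
        std::function<void()> callback;
    };
    std::vector<Entry> entries_;
};
```

With this in place, the two routines from the question could be registered as two entries (possibly with different periods) while setitimer and the SIGALRM handler are each set up exactly once.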

Say I have 3 functions that can be called by an upper layer:
Start - Will only be called if we haven't been started yet, or Stop was previously called
Stop - Will only be called after a successful call to Start
Process - Can be called at any time (simultaneously on different threads); if started, will call into lower layer
In Stop, it must wait for all Process calls to finish calling into the lower layer, and prevent any further calls. With a locking mechanism, I can come up with the following pseudo code:
Start() {
IsStarted = true;
RefCount = 0;
Stop() {
IsStarted = false;
WaitForCompletionEvent = (RefCount != 0);
if (WaitForCompletionEvent)
ASSERT(RefCount == 0);
Process() {
AddedRef = IsStarted;
if (AddedRef)
if (!AddedRef) return;
FireCompletionEvent = (--RefCount == 0);
if (FireCompletionEvent)
Is there a way to achieve the same behavior without a locking mechanism? Perhaps with some fancy usage of InterlockedCompareExchange and InterlockedIncrement/InterlockedDecrement?
The reason I ask is that this is in the data path of a network driver and I would really prefer not to have any locks.
I believe it is possible to avoid the use of explicit locks and any unnecessary blocking or kernel calls.
Note that this is pseudo-code only, for illustrative purposes; it hasn't seen a compiler. And while I believe the threading logic is sound, please verify its correctness for yourself, or get an expert to validate it; lock-free programming is hard.
#define STOPPING 0x20000000
#define STOPPED 0x40000000
volatile LONG s = STOPPED;
// state and count
// bit 30 set -> stopped
// bit 29 set -> stopping
// bits 0 through 28 -> thread count

Start()
{
    LONG n = InterlockedExchange(&s, 0); // sets s to 0
    if ((n & STOPPED) == 0)
        bluescreen("Invalid call to Start()");
}

Stop()
{
    LONG n = InterlockedCompareExchange(&s, STOPPED, 0);
    if (n == 0)
    {
        // No calls to Process() were running so we could jump directly to stopped.
        // Mission accomplished!
        return;
    }
    n = InterlockedOr(&s, STOPPING);
    if ((n & STOPPED) != 0)
        bluescreen("Stop called when already stopped");
    if ((n & STOPPING) != 0)
        bluescreen("Stop called when already stopping");
    n = InterlockedCompareExchange(&s, STOPPED, STOPPING);
    if (n == STOPPING)
    {
        // The last call to Process() exited before we set the STOPPING flag.
        // Mission accomplished!
        return;
    }
    // Now that STOPPING mode is set, and we know at least one call to Process
    // is running, all we need do is wait for the event to be signaled.
    // The event is only ever signaled after a thread has successfully
    // changed the state to STOPPED. Mission accomplished!
    return;
}

Process()
{
    LONG n = InterlockedCompareExchange(&s, STOPPED, STOPPING);
    if (n == STOPPING)
    {
        // We've just stopped; let the call to Stop() complete.
        return;
    }
    if ((n & STOPPED) != 0 || (n & STOPPING) != 0)
    {
        // Checking here avoids changing the state unnecessarily when
        // we already know we can't enter the lower layer.
        // It also ensures that the transition from STOPPING to STOPPED can't
        // be delayed even if there are lots of threads making new calls to Process().
        return;
    }
    n = InterlockedIncrement(&s);
    if ((n & STOPPED) != 0)
    {
        // Turns out we've just stopped, so the call to Process() must be aborted.
        // Explicitly set the state back to STOPPED, rather than decrementing it,
        // in case Start() has been called. At least one thread will succeed.
        InterlockedCompareExchange(&s, STOPPED, n);
        return;
    }
    if ((n & STOPPING) == 0)
    {
        // Not stopping: call into the lower layer here.
    }
    n = InterlockedDecrement(&s);
    if ((n & STOPPED) != 0 || n == (STOPPED - 1))
        bluescreen("Stopped during call to Process, shouldn't be possible!");
    if (n != STOPPING)
        return;
    // Stop() has been called, and it looks like we're the last
    // running call to Process() in which case we need to change the
    // status to STOPPED and signal the call to Stop() to exit.
    // However, another thread might have beaten us to it, so we must
    // check again. The event MUST only be set once per call to Stop().
    n = InterlockedCompareExchange(&s, STOPPED, STOPPING);
    if (n == STOPPING)
    {
        // We've just stopped; let the call to Stop() complete.
    }
}
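For readers who want something they can compile and experiment with, here is a portable sketch of the same state-plus-count idea using std::atomic instead of the Interlocked functions. It is a simplified variant under stated assumptions, not a translation of the pseudo-code above: entry uses a compare-exchange loop rather than increment-then-check, Stop() spins with yield() as a stand-in for waiting on an event, and the class name Gate and its methods are hypothetical. As with the pseudo-code, please validate it independently before relying on it.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

class Gate {
public:
    // Only valid while stopped, as in the question's contract.
    void start() { state_.store(0); }

    // Sets STOPPING, then waits until the last in-flight call has left.
    void stop() {
        const std::uint32_t prev = state_.fetch_or(kStopping);
        if (prev == 0) {  // nobody inside: jump straight to stopped
            state_.store(kStopped);
            return;
        }
        while (state_.load() != kStopped)
            std::this_thread::yield();  // assumption: stand-in for an event wait
    }

    // Runs lower() only while started. Returns false if the gate was closed.
    template <class F>
    bool process(F&& lower) {
        if (!try_enter()) return false;
        lower();
        leave();
        return true;
    }

private:
    static constexpr std::uint32_t kStopping = 0x20000000u;  // bit 29
    static constexpr std::uint32_t kStopped = 0x40000000u;   // bit 30

    // Increments the count only if neither flag is set.
    bool try_enter() {
        std::uint32_t cur = state_.load();
        while ((cur & (kStopping | kStopped)) == 0) {
            if (state_.compare_exchange_weak(cur, cur + 1)) return true;
            // cur was refreshed by the failed exchange; re-check the flags
        }
        return false;
    }

    // Decrements the count; the last leaver during a stop completes it.
    void leave() {
        const std::uint32_t prev = state_.fetch_sub(1);
        if (prev == (kStopping | 1u)) state_.store(kStopped);
    }

    std::atomic<std::uint32_t> state_{kStopped};
};
```

Once the STOPPING bit is set, no compare-exchange in try_enter() can succeed, so the count can only fall; exactly one thread observes the transition to "stopping with zero callers" and publishes STOPPED. The sketch uses the default sequentially consistent ordering throughout for simplicity.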