using sigaction and alarm to break from infinite loop - c++

This is my code :
#define _OPEN_SYS
#include <unistd.h>
#include <stdio.h>
#include <signal.h>
#include <time.h>
volatile int footprint = 0;
void catcher(int signum) {
puts("inside signal catcher!");
alarm(0);
footprint = 1;
return;
}
main() {
printf("footprint=%d\n", footprint);
struct sigaction sact;
sigemptyset(&sact.sa_mask);
sact.sa_flags = 0;
sact.sa_handler = catcher;
if (footprint == 0) {
puts("the signal catcher never gained control");
sigaction(SIGALRM, &sact, NULL);
printf("before loop");
alarm(5); /* timer will pop in five seconds */
while (true);
} else
puts("the signal catcher gained control");
printf("after loop");
}
my output is :
footprint=0
the signal catcher never gained control
before loopinside signal catcher!
and the application keeps running forever. I need some way to break out of this loop. I'm using similar code to implement a timeout for Sybase statement execution, as OCCI doesn't support timeouts.

Signals such as SIGALRM will interrupt most system calls (but beware of automatically restartable calls). You cannot rely on them to interrupt your syscall-free loop. And even when a signal is delivered, execution resumes where it left off once the handler returns, so your code happily goes right back to looping.
In fact, your code is not even valid C++ (!!!). Section 1.10p24 of the Standard says:
The implementation may assume that any thread will eventually do one
of the following:
terminate,
make a call to a library I/O function,
access or modify a volatile object, or
perform a synchronization operation or
an atomic operation.
Alex's suggestion of while ( footprint == 0 ) ; will at least correct this defect.

A loop such as while (true); can't be interrupted, except by terminating the thread executing it. The loop has to be coded to check for an interrupt condition and exit.
As Alex mentioned in a comment, while ( footprint == 0 ) ; would correctly implement a loop checking for the given signal handler.
Just being pedantic, footprint should be declared volatile sig_atomic_t rather than volatile int, but in practice it probably doesn't matter.
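To make the suggestions above concrete, here is a minimal sketch of the flag-based loop, assuming a POSIX system. The names g_alarm_fired, on_alarm and wait_for_alarm are illustrative and do not come from the question; the handler deliberately does nothing except set the flag.

```cpp
#include <cassert>
#include <signal.h>
#include <unistd.h>

// Flag shared between the handler and the loop. volatile sig_atomic_t is the
// type that may safely be written from a signal handler.
static volatile sig_atomic_t g_alarm_fired = 0;

// Handler: only sets the flag (no I/O, which is not async-signal-safe).
extern "C" void on_alarm(int) {
    g_alarm_fired = 1;
}

// Hypothetical helper: spins until SIGALRM arrives after `seconds` seconds.
// Returns true once the handler has run.
bool wait_for_alarm(unsigned seconds) {
    struct sigaction sact {};
    sigemptyset(&sact.sa_mask);
    sact.sa_flags = 0;
    sact.sa_handler = on_alarm;
    const int rc = sigaction(SIGALRM, &sact, nullptr);
    assert(rc == 0);
    (void)rc;

    g_alarm_fired = 0;
    alarm(seconds);               // timer will pop after `seconds` seconds
    while (g_alarm_fired == 0) {  // reads a volatile object on every iteration
        // the work that should be timed out would go here
    }
    return g_alarm_fired == 1;
}
```

Calling wait_for_alarm(5) from main() reproduces the five-second timeout of the question, except that the loop now ends once the handler has run.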

Related

Signal Handling (c++)

I have this simple code that prints "SIGNALS ARE COOL" in a loop. I'm trying to make it handle signals such as SIGFPE, SIGABRT, SIGINT and SIGSEGV and show the signal type and the time. So far I have made it handle SIGINT. How do I add more signals, and how do I control what my program shows when the user triggers one of them?
// ConsoleApplication3.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
#include <csignal>
using namespace std;
void signalHandler(int signum) {
cout << "Interrupt signal (" << signum << ") received.\n";
// cleanup and close up stuff here
// terminate program
exit(signum);
}
int main() {
// register signal SIGINT and signal handler
signal(SIGINT, signalHandler);
while (1) {
cout << "SIGNALS ARE COOL" << endl;
}
return 0;
}
I see that this looks like an assignment; so what I'm saying may not be relevant to you (but might be to someone someday).
--EDIT--
I see you've also got stdafx.h, which I think is a Visual Studio Windows thing, and here I am suggesting a POSIX solution (not pure C++). I didn't read carefully enough, and that invalidates my whole answer (I think). You probably can't use my suggestion, and for that I'm sorry.
However, I'm going to leave it here in case someone one day finds this and needs to work with signals in a Unix system.
--
I've found that it's often a lot more practical to avoid signal handling functions like this altogether, and take signals on your own terms. As noted by others, there's a lot of rules about what you can and can't do within a signal handler, because they can be invoked at any time, in any thread, unless you take extra precautions. I've seen this result in a lot of messy code, things like 'have a global bool got_signal that gets checked by things all over the application to know if they're supposed to shut down'. There's obviously nice ways to do signal handling, but at this point I try to avoid it altogether in favor of other options.
The functions pthread_sigmask and sigwait can be used to invert control here and allow you to accept signals within the defined flow of program execution where you want it, and then you don't need to worry about taking invalid actions when you handle them. Using pthread_sigmask you can tell the OS not to interrupt your program to deliver signals and instead keep them pending, and then sigwait can be used to handle them at an appropriate time. You can't do this with all signals (SIGKILL, i.e. kill -9, can't be blocked, and the SIGSEGV from a segfault shouldn't be), but it works well for most of them.
Using an approach like this, it's really easy to interact with signals in a larger application too. You can block signals at the start of main, and that will propagate to all child threads, and then you can designate one specific child thread to just wait for signals and pass events into the rest of the application by whatever method is appropriate for the framework of your application.
#include <signal.h>
#include <unistd.h>
#include <cstdint>
#include <initializer_list>
#include <iostream>
// Build a signal set from a list of signal numbers
sigset_t make_sigset(std::initializer_list<int32_t> signals)
{
sigset_t set;
sigemptyset(&set);
for (const int32_t sig : signals)
{
sigaddset(&set, sig);
}
return set;
}
int main()
{
const auto signal_list = make_sigset({SIGTERM, SIGSEGV, SIGINT, SIGABRT});
pthread_sigmask(SIG_BLOCK, &signal_list, nullptr);
int32_t last_signal;
do
{
sigwait(&signal_list, &last_signal);
std::cout << "Got signal " << last_signal << std::endl;
// Exit on sigint so ctrl+c still works
} while (last_signal != SIGINT);
return 0;
}
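The idea of designating one thread to wait for signals can be sketched as follows. This is only an illustration under stated assumptions: wait_for_signal_in_thread is a made-up name, SIGUSR1 is used so the demo does not interfere with ctrl+c, and the kill() call merely stands in for a signal sent from outside the process.

```cpp
#include <atomic>
#include <cassert>
#include <signal.h>
#include <thread>
#include <unistd.h>

// Hypothetical demo: block SIGUSR1 everywhere, let one thread collect it.
int wait_for_signal_in_thread() {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    // Block before creating threads so every new thread inherits the mask.
    const int rc = pthread_sigmask(SIG_BLOCK, &set, nullptr);
    assert(rc == 0);
    (void)rc;

    std::atomic<int> received{0};
    std::thread signal_thread([&] {
        int sig = 0;
        if (sigwait(&set, &sig) == 0) {
            received.store(sig);  // ordinary code: no handler restrictions
        }
    });

    kill(getpid(), SIGUSR1);  // stands in for an external `kill -USR1 <pid>`
    signal_thread.join();
    return received.load();
}
```

Because the signal is blocked in every thread, it stays pending until the dedicated thread picks it up with sigwait, so the rest of the application is never interrupted.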
As already mentioned by @Eljay in the comments, you have to be careful with the things you do in a signal handler.
I'd also suggest not writing using namespace std, but that's a story for another time.
I'd recommend this page, which explains a lot about what signal handlers can and cannot do according to the C++ standard. What they actually do with your compiler (which I assume is MSVC) may be different.
Some of the important bits, as already mentioned: you shouldn't do I/O, you shouldn't throw, etc.
To answer your question, you were on the right track; adding other signals can be done via:
// catch SIGTERM
std::signal(SIGTERM, signalHandler);
std::signal(SIGSEGV, signalHandler);
std::signal(SIGINT, signalHandler);
std::signal(SIGABRT, signalHandler);
// insert others
Then, what I'd suggest is storing the value of your signal into some atomic variable, like: gSignalThatStoppedMe.
std::atomic<int> gSignalThatStoppedMe = -1;
// I also added 'extern "C"' because the standard says so
extern "C" void signalHandler(int signum) {
gSignalThatStoppedMe.store(signum);
}
Then your while loop keeps running as long as the value is still -1. You can pick another sentinel value; I've not checked whether some implementations use -1 as a valid signal number.
// ...
while(gSignalThatStoppedMe.load() == -1)
{
// your old code
}
Now, do a switch of sorts, with the values inside and output the signal that stopped it, something like:
switch(gSignalThatStoppedMe.load())
{
case SIGINT:
std::puts("It was SIGINT");
break;
case SIGTERM:
std::puts("It was SIGTERM");
break;
default:
break;
}
I think this has less undefined behavior, which is always a good thing.
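Putting the pieces of this answer together, a minimal sketch could look like the following. The names g_last_signal, record_signal, signal_name and install_handlers are illustrative. It also swaps in volatile std::sig_atomic_t for the std::atomic<int> used above and takes 0 as the "no signal yet" value, on the assumption that signal numbers are always positive.

```cpp
#include <cassert>
#include <csignal>

// Last signal seen by the handler; 0 means "none yet" (assumed never to be a
// real signal number).
static volatile std::sig_atomic_t g_last_signal = 0;

extern "C" void record_signal(int signum) {
    g_last_signal = signum;  // the only thing the handler does
}

// Maps the signals from the question to printable names.
const char* signal_name(int signum) {
    switch (signum) {
        case SIGINT:  return "SIGINT";
        case SIGTERM: return "SIGTERM";
        case SIGABRT: return "SIGABRT";
        case SIGSEGV: return "SIGSEGV";
        case SIGFPE:  return "SIGFPE";
        default:      return "unknown";
    }
}

// Registers the handler for each signal; returns false if any registration fails.
bool install_handlers() {
    const int signals[] = {SIGINT, SIGTERM, SIGABRT, SIGSEGV, SIGFPE};
    for (int sig : signals) {
        if (std::signal(sig, record_signal) == SIG_ERR) return false;
    }
    return true;
}
```

The main loop would then run while g_last_signal is still 0 and print signal_name(g_last_signal) together with the current time after the loop, outside the handler.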
EDIT: here's a compiler explorer link
The output with CTRL-C:
SIGNALS ARE COOL
SIGNALS ARE COOL
SIGNALS ARE COOL
SIGNALS ARE COOL
SIGNALS ARE COOL
SIGNALS ARE COOL
SIGNALS ARE COOL
SIGNALS ARE COOL
SIGNALS ARE COOL
Interrupt signal SIGINT (2) received.

I'm trying to use std::signal to cleanly end my multithreaded program, what am I doing wrong?

What I'm trying to do
I have various things that must run concurrently on Linux, until the program is told to stop through ctrl-C (in which case SIGINT is received) or the service is stopped (in which case SIGTERM is received).
What I've come up with
For each thing that needs to be done concurrently, I have a class that launches a thread in the constructor and whose destructor makes the thread stop and joins it. It looks basically like this:
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <system_error>
class SomeClassToDoStuff
{
public:
// Constructor
SomeClassToDoStuff()
{
_thread = std::thread([this]() {
while (true)
{
// Do some stuff
...
// Before going on to the next iteration
{
std::unique_lock<std::mutex> dataLock(_mutex);
// Wait for 2ms
if (!_shouldStopThreadExecution)
{
_conditionVariable.wait_for(dataLock, std::chrono::milliseconds(2));
}
// End the loop if requested
if (_shouldStopThreadExecution)
{
break;
}
}
}
// Some cleanup
...
});
}
// Destructor
~SomeClassToDoStuff()
{
if (_thread.joinable())
{
{
std::lock_guard<std::mutex> dataLock(_mutex);
_shouldStopThreadExecution = true;
}
_conditionVariable.notify_all();
try
{
_thread.join();
} catch (std::system_error const&) {}
}
}
private:
mutable std::mutex _mutex; // Mutex to keep things thread-safe
std::condition_variable _conditionVariable; // Condition variable used to wait
std::thread _thread; // Thread in which to do the stuff
bool _shouldStopThreadExecution = false; // Whether to stop the thread's execution
};
Then my main() looks like this:
#include <atomic>
#include <chrono>
#include <csignal>
#include <iostream>
#include <thread>
namespace {
std::atomic<int> programReturnValue(-1); // If positive or zero, the program must return with that value
}
static void signalHandler(int sig)
{
std::cout << "Signal received (" << sig << "). This iteration of the main loop will be the last." << std::endl;
programReturnValue.store(0);
}
int main()
{
// Make sure the program stops cleanly when receiving SIGTERM or SIGINT
{
std::signal(SIGTERM, signalHandler);
std::signal(SIGINT, signalHandler);
}
SomeClassToDoStuffA a;
SomeClassToDoStuffB b;
SomeClassToDoStuffC c;
SomeClassToDoStuffD d;
while (programReturnValue.load() < 0)
{
// Check that everything is alright
if (someCondition)
{
programReturnValue.store(1);
}
// Wait for 100us to avoid taking all of the processor's resources
std::this_thread::sleep_for(std::chrono::microseconds(100));
}
return programReturnValue.load();
}
(By the way, if there's an easier way to go about all this I'm interested to know)
The issue
When I hit ctrl+C or end the service, the program prints that signal 2 or 15 has been received (depending on which I used), and the program ends, which is good.
However:
The cleanup involves writing something to a file (in which things are successfully written during execution), but it seems that that doesn't always happen, which means that the cleanup isn't fully performed all the time, and that is a problem
The return code of the program isn't 0 as expected, or even 1, but either 130 or 143 depending on what signal is received
Why is that, and what am I doing wrong?
Edit: From what I understand, 130 and 143 are actually 128 + signal, i.e. what the program would return if I didn't try to handle the signals
Edit2: I'm getting a better idea of what's happening, and only half the issue seems to be coming from my program itself.
The program is actually run by a bash script, which then prints its return value and may relaunch it depending on the situation. Sending SIGINT and SIGTERM to the script is also supposed to send SIGTERM to the program.
It turns out that I suck at bash. I had written something like this:
#!/bin/sh
trap "killall myProgram --quiet --wait" 2 15
/path/to/myProgram&
wait $!
RETURN_VALUE=$?
echo "Code exited with return code ${RETURN_VALUE}"
# Some other stuff
...
ctrl-C while running the script in terminal actually leads to the program receiving both SIGINT then SIGTERM
the return code I'm printing is actually the result of wait+trap, not my program's
I will rework the script, but could the fact that both signals are sent to my program be the reason why the cleanup sometimes fails? How? What can I do about it?
I am a bit confused about your signal handling:
As far as I can tell, you use the terminating signal only to set the return value and break out of the while loop in main. The threads, or rather their wrapper objects, are destroyed only when they go out of scope, which happens at the end of main, after the return statement has already been executed. Exceptions thrown at that point (in your destructors) can no longer be caught.
Your threads have therefore not ended yet when you return from main.
As a solution, I would recommend setting the stop flag _shouldStopThreadExecution as soon as main receives the stop signal, and then removing the try/catch around .join() in your destructor so that you can see whether the threads really end correctly.
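A minimal sketch of that recommendation follows, assuming a worker modelled on SomeClassToDoStuff. Worker, stop() and cleaned_up() are hypothetical names, and the cleaned_up_ flag merely stands in for the file-writing cleanup from the question.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical worker with an explicit stop() that main can call right after
// it notices the signal, instead of relying on the destructor alone.
class Worker {
public:
    Worker() : thread_([this] { run(); }) {}
    ~Worker() { stop(); }

    // Ask the thread to finish and wait for it; safe to call more than once.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            should_stop_ = true;
        }
        cv_.notify_all();
        if (thread_.joinable()) thread_.join();
    }

    bool cleaned_up() const { return cleaned_up_.load(); }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!should_stop_) {
            // "Do some stuff" would go here.
            cv_.wait_for(lock, std::chrono::milliseconds(2),
                         [this] { return should_stop_; });
        }
        cleaned_up_.store(true);  // stands in for the real cleanup
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    bool should_stop_ = false;
    std::atomic<bool> cleaned_up_{false};
    std::thread thread_;  // declared last so the other members exist before it starts
};
```

In main, the loop would end once programReturnValue is set, then call stop() on each worker and only afterwards return, so that the cleanup has finished before the return value is produced.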

Thread ending unexpectedly. c++

I'm trying to get a hold on pthreads. I see some people also have unexpected pthread behavior, but none of the questions seemed to be answered.
The following piece of code should create two threads, one of which relies on the other. I read that each thread creates variables on its own stack (which can't be shared between threads) and that using a global pointer is a way to have threads share a value. One thread should print its current iteration while the other thread sleeps for 10 seconds. Ultimately one would expect 10 iterations. Using breakpoints, it seems the program just dies at
while (*pointham != "cheese"){
It could also be I'm not properly utilizing code blocks debug functionality. Any pointers (har har har) would be helpful.
#include <iostream>
#include <cstdlib>
#include <pthread.h>
#include <unistd.h>
#include <string>
using namespace std;
string hamburger = "null";
string * pointham = &hamburger;
void *wait(void *)
{
int i {0};
while (*pointham != "cheese"){
sleep (1);
i++;
cout << "Waiting on that cheese " << i;
}
pthread_exit(NULL);
}
void *cheese(void *)
{
cout << "Bout to sleep then get that cheese";
sleep (10);
*pointham = "cheese";
pthread_exit(NULL);
}
int main()
{
pthread_t threads[2];
pthread_create(&threads[0], NULL, cheese, NULL);
pthread_create(&threads[1], NULL, wait, NULL);
return 0;
}
The problem is that you start your threads, then exit the process (thereby killing your threads). You have to wait for your threads to exit, preferably with the pthread_join function.
If you don't want to have to join all your threads, you can call pthread_exit() in the main thread instead of returning from main().
But note the BUGS section from the manpage:
Currently, there are limitations in the kernel implementation logic for
wait(2)ing on a stopped thread group with a dead thread group leader.
This can manifest in problems such as a locked terminal if a stop sig‐
nal is sent to a foreground process whose thread group leader has
already called pthread_exit().
According to this tutorial:
If main() finishes before the threads it has created, and exits with pthread_exit(), the other threads will continue to execute. Otherwise, they will be automatically terminated when main() finishes.
So you shouldn't end the main function with the statement return 0; use pthread_exit(NULL); instead.
If this doesn't work with you, you may need to learn about joining threads here.
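The join-based fix from the first answer can be sketched like this. It is an illustration rather than a drop-in replacement: the sleeps are shortened to milliseconds, the names are made up, and an atomic flag replaces the shared std::string, on the assumption that the unsynchronised string access in the question would otherwise be a data race.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>
#include <unistd.h>

// Shared state between the two threads.
static std::atomic<bool> g_cheese_ready{false};

extern "C" void* cheese_thread(void*) {
    usleep(50 * 1000);  // 50 ms instead of the question's 10 s
    g_cheese_ready.store(true);
    return nullptr;
}

extern "C" void* waiter_thread(void* arg) {
    int* iterations = static_cast<int*>(arg);
    while (!g_cheese_ready.load()) {
        usleep(10 * 1000);
        ++*iterations;
    }
    return nullptr;
}

// Starts both threads and joins them; returns how often the waiter polled.
int run_cheese_demo() {
    pthread_t threads[2];
    int iterations = 0;
    int rc = pthread_create(&threads[0], nullptr, cheese_thread, nullptr);
    assert(rc == 0);
    rc = pthread_create(&threads[1], nullptr, waiter_thread, &iterations);
    assert(rc == 0);
    (void)rc;
    // Without these joins, returning from main() would end the whole process.
    pthread_join(threads[0], nullptr);
    pthread_join(threads[1], nullptr);
    return iterations;
}
```

Calling run_cheese_demo() from main() and then returning normally is enough, because both threads have finished by the time it returns.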

How to pause a pthread ANY TIME I want?

Recently I set out to port ucos-ii to an Ubuntu PC.
As we know, it's not possible to simulate a ucos-ii "process" by simply adding a flag to the "while" loop in the pthread's callback function to perform pause and resume (like the solution below), because a "process" in ucos-ii can be paused or resumed at any time!
How to sleep or pause a PThread in c on Linux
I have found one solution on the web-site below, but it can't be built because it's out of date. It uses the process in Linux to simulate the task(acts like the process in our Linux) in ucos-ii.
http://www2.hs-esslingen.de/~zimmerma/software/index_uk.html
If pthread can act like the process which can be paused and resumed at any time, please tell me some related functions, I can figure it out myself. If it can't, I think I should focus on the older solution. Thanks a lot.
The Modula-3 garbage collector needs to suspend pthreads at an arbitrary time, not just when they are waiting on a condition variable or mutex. It does it by registering a (Unix) signal handler that suspends the thread and then using pthread_kill to send a signal to the target thread. I think it works (it has been reliable for others but I'm debugging an issue with it right now...) It's a bit kludgy, though....
Google for ThreadPThread.m3 and look at the routines "StopWorld" and "StartWorld". Handler itself is in ThreadPThreadC.c.
If stopping at specific points with a condition variable is insufficient, then you can't do this with pthreads. The pthread interface does not include suspend/resume functionality.
See, for example, answer E.4 here:
The POSIX standard provides no mechanism by which a thread A can suspend the execution of another thread B, without cooperation from B. The only way to implement a suspend/restart mechanism is to have B check periodically some global variable for a suspend request and then suspend itself on a condition variable, which another thread can signal later to restart B.
That FAQ answer goes on to describe a couple of non-standard ways of doing it, one in Solaris and one in LinuxThreads (which is now obsolete; do not confuse it with current threading on Linux); neither of those apply to your situation.
On Linux you can probably set up a custom signal handler (e.g. using signal()) that waits for another signal (e.g. using sigsuspend()). You then send the signals using pthread_kill() or tgkill(). It is important to use so-called "realtime signals" for this, because normal signals like SIGUSR1 and SIGUSR2 don't get queued, which means that they can get lost under high load. You may send a signal several times but have it delivered only once, because while the signal handler is running, further signals of the same kind are merged into a single pending instance. So if you have concurrent threads doing PAUSE/RESUME, you can lose a RESUME event and cause a deadlock. Pending realtime signals (like SIGRTMIN+1 and SIGRTMIN+2), on the other hand, are not deduplicated, so several instances of the same realtime signal can be in the queue at the same time.
DISCLAIMER: I have not tried this yet, but in theory it should work.
Also see man 7 signal-safety. There is a list of calls that you can safely call in signal handlers. Fortunately sigsuspend() seems to be one of them.
UPDATE: I have working code right here:
//Filename: pthread_pause.c
//Author: Tomas 'Harvie' Mudrunka 2021
//Build: CFLAGS=-lpthread make pthread_pause; ./pthread_pause
//Test: valgrind --tool=helgrind ./pthread_pause
//I wrote this code as an exercise to answer the following Stack Overflow question:
// https://stackoverflow.com/questions/9397068/how-to-pause-a-pthread-any-time-i-want/68119116#68119116
#define _GNU_SOURCE //pthread_yield() needs this
#include <signal.h>
#include <pthread.h>
//#include <pthread_extra.h>
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <errno.h>
#include <sys/resource.h>
#include <time.h>
#define PTHREAD_XSIG_STOP (SIGRTMIN+0)
#define PTHREAD_XSIG_CONT (SIGRTMIN+1)
#define PTHREAD_XSIGRTMIN (SIGRTMIN+2) //First unused RT signal
pthread_t main_thread;
sem_t pthread_pause_sem;
pthread_once_t pthread_pause_once_ctrl = PTHREAD_ONCE_INIT;
void pthread_pause_once(void) {
sem_init(&pthread_pause_sem, 0, 1);
}
#define pthread_pause_init() (pthread_once(&pthread_pause_once_ctrl, &pthread_pause_once))
#define NSEC_PER_SEC (1000*1000*1000)
// timespec_normalise() from https://github.com/solemnwarning/timespec/
struct timespec timespec_normalise(struct timespec ts)
{
while(ts.tv_nsec >= NSEC_PER_SEC) {
++(ts.tv_sec); ts.tv_nsec -= NSEC_PER_SEC;
}
while(ts.tv_nsec <= -NSEC_PER_SEC) {
--(ts.tv_sec); ts.tv_nsec += NSEC_PER_SEC;
}
if(ts.tv_nsec < 0) { // Negative nanoseconds isn't valid according to POSIX.
--(ts.tv_sec); ts.tv_nsec = (NSEC_PER_SEC + ts.tv_nsec);
}
return ts;
}
void pthread_nanosleep(struct timespec t) {
//Sleep calls on Linux get interrupted by signals, causing premature wake
//Pthread (un)pause is built using signals
//Therefore we need self-restarting sleep implementation
//IO timeouts are restarted by SA_RESTART, but sleeps do need explicit restart
//We also need to sleep using absolute time, because relative time is paused
//You should use this in any thread that gets (un)paused
struct timespec wake;
clock_gettime(CLOCK_MONOTONIC, &wake);
t = timespec_normalise(t);
wake.tv_sec += t.tv_sec;
wake.tv_nsec += t.tv_nsec;
wake = timespec_normalise(wake);
while(clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &wake, NULL) == EINTR); //clock_nanosleep() returns the error number directly, it does not set errno
return;
}
void pthread_nsleep(time_t s, long ns) {
struct timespec t;
t.tv_sec = s;
t.tv_nsec = ns;
pthread_nanosleep(t);
}
void pthread_sleep(time_t s) {
pthread_nsleep(s, 0);
}
void pthread_pause_yield() {
//Call this to give other threads chance to run
//Wait until last (un)pause action gets finished
sem_wait(&pthread_pause_sem);
sem_post(&pthread_pause_sem);
//usleep(0);
//nanosleep(&((const struct timespec){.tv_sec=0,.tv_nsec=1}), NULL);
//pthread_nsleep(0,1); //pthread_yield() is not enough, so we use sleep
pthread_yield();
}
void pthread_pause_handler(int signal) {
//Do nothing when there are more signals pending (to cleanup the queue)
//This is no longer needed, since we use semaphore to limit pending signals
/*
sigset_t pending;
sigpending(&pending);
if(sigismember(&pending, PTHREAD_XSIG_STOP)) return;
if(sigismember(&pending, PTHREAD_XSIG_CONT)) return;
*/
//Post semaphore to confirm that signal is handled
sem_post(&pthread_pause_sem);
//Suspend if needed
if(signal == PTHREAD_XSIG_STOP) {
sigset_t sigset;
sigfillset(&sigset);
sigdelset(&sigset, PTHREAD_XSIG_STOP);
sigdelset(&sigset, PTHREAD_XSIG_CONT);
sigsuspend(&sigset); //Wait for next signal
} else return;
}
void pthread_pause_enable() {
//Having signal queue too deep might not be necessary
//It can be limited using RLIMIT_SIGPENDING
//You can get runtime SigQ stats using following command:
//grep -i sig /proc/$(pgrep binary)/status
//This is no longer needed, since we use semaphores
//struct rlimit sigq = {.rlim_cur = 32, .rlim_max=32};
//setrlimit(RLIMIT_SIGPENDING, &sigq);
pthread_pause_init();
//Prepare sigset
sigset_t sigset;
sigemptyset(&sigset);
sigaddset(&sigset, PTHREAD_XSIG_STOP);
sigaddset(&sigset, PTHREAD_XSIG_CONT);
//Register signal handlers
//signal(PTHREAD_XSIG_STOP, pthread_pause_handler);
//signal(PTHREAD_XSIG_CONT, pthread_pause_handler);
//We now use sigaction() instead of signal(), because it supports SA_RESTART
const struct sigaction pause_sa = {
.sa_handler = pthread_pause_handler,
.sa_mask = sigset,
.sa_flags = SA_RESTART,
.sa_restorer = NULL
};
sigaction(PTHREAD_XSIG_STOP, &pause_sa, NULL);
sigaction(PTHREAD_XSIG_CONT, &pause_sa, NULL);
//UnBlock signals
pthread_sigmask(SIG_UNBLOCK, &sigset, NULL);
}
void pthread_pause_disable() {
//This is important for when you want to do some signal unsafe stuff
//Eg.: locking mutex, calling printf() which has internal mutex, etc...
//After unlocking mutex, you can enable pause again.
pthread_pause_init();
//Make sure all signals are dispatched before we block them
sem_wait(&pthread_pause_sem);
//Block signals
sigset_t sigset;
sigemptyset(&sigset);
sigaddset(&sigset, PTHREAD_XSIG_STOP);
sigaddset(&sigset, PTHREAD_XSIG_CONT);
pthread_sigmask(SIG_BLOCK, &sigset, NULL);
sem_post(&pthread_pause_sem);
}
int pthread_pause(pthread_t thread) {
sem_wait(&pthread_pause_sem);
//If signal queue is full, we keep retrying
while(pthread_kill(thread, PTHREAD_XSIG_STOP) == EAGAIN) usleep(1000);
pthread_pause_yield();
return 0;
}
int pthread_unpause(pthread_t thread) {
sem_wait(&pthread_pause_sem);
//If signal queue is full, we keep retrying
while(pthread_kill(thread, PTHREAD_XSIG_CONT) == EAGAIN) usleep(1000);
pthread_pause_yield();
return 0;
}
void *thread_test() {
//Whole process dies if you kill thread immediately before it is pausable
//pthread_pause_enable();
while(1) {
//Printf() is not async signal safe (because it holds internal mutex),
//you should call it only with pause disabled!
//Will throw helgrind warnings anyway, not sure why...
//See: man 7 signal-safety
pthread_pause_disable();
printf("Running!\n");
pthread_pause_enable();
//Pausing main thread should not cause deadlock
//We pause main thread here just to test it is OK
pthread_pause(main_thread);
//pthread_nsleep(0, 1000*1000);
pthread_unpause(main_thread);
//Wait for a while
//pthread_nsleep(0, 1000*1000*100);
pthread_unpause(main_thread);
}
}
int main() {
pthread_t t;
main_thread = pthread_self();
pthread_pause_enable(); //Will get inherited by all threads from now on
//you need to call pthread_pause_enable (or disable) before creating threads,
//otherwise first (un)pause signal will kill whole process
pthread_create(&t, NULL, thread_test, NULL);
while(1) {
pthread_pause(t);
printf("PAUSED\n");
pthread_sleep(3);
printf("UNPAUSED\n");
pthread_unpause(t);
pthread_sleep(1);
/*
pthread_pause_disable();
printf("RUNNING!\n");
pthread_pause_enable();
*/
pthread_pause(t);
pthread_unpause(t);
}
pthread_join(t, NULL);
printf("DIEDED!\n");
}
I am also working on a library called "pthread_extra", which will have stuff like this and much more. Will publish soon.
UPDATE2: This is still causing deadlocks when calling pause/unpause rapidly (with the sleep() calls removed). The printf() implementation in glibc has a mutex, so if you suspend a thread which is in the middle of printf() and then want to printf() from the thread which plans to unpause that thread later, it will never happen, because printf() is locked. Unfortunately, even after I removed the printf() and only ran an empty while loop in the thread, I still got deadlocks under high pause/unpause rates, and I don't know why. Maybe (even realtime) Linux signals are not 100% safe. There is a realtime signal queue; maybe it just overflows or something...
UPDATE3: I think I've managed to fix the deadlock, but had to completely rewrite most of the code. Now I have one (sig_atomic_t) variable per thread which holds the state of whether that thread should be running or not. It works kind of like a condition variable. pthread_(un)pause() transparently remembers this for each thread. I no longer have two signals, only one. The handler of that signal looks at the variable and only blocks on sigsuspend() when the variable says the thread should NOT run; otherwise it returns from the signal handler. In order to suspend/resume the thread I now set the sig_atomic_t variable to the desired state and send that signal (which is common to both suspend and resume). It is important to use realtime signals to be sure the handler will actually run after you've modified the state variable. The code is a bit complex because of the thread status database. I will share the code in a separate solution as soon as I manage to simplify it enough. But I want to preserve the two-signal version here, because it kind of works, I like the simplicity, and maybe people will give us more insight into how to optimize it.
UPDATE4: I've fixed the deadlock in the original code (no need for a helper variable holding the status) by using a single handler for the two signals and optimizing the signal queue a bit. There is still some problem with printf() shown by helgrind, but it is not caused by my signals; it happens even when I do not call pause/unpause at all. Overall this was only tested on Linux; I'm not sure how portable the code is, because there seems to be some undocumented behaviour of signal handlers which was originally causing the deadlock.
Please note that pause/unpause cannot be nested. if you pause 3 times, and unpause 1 time, the thread WILL RUN. If you need such behaviour, you should create some kind of wrapper which will count the nesting levels and signal the thread accordingly.
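Such a nesting wrapper could look roughly like the sketch below. NestedPause is a hypothetical class, not part of the code above. It takes the real pause and unpause actions as callbacks (for example lambdas calling pthread_pause(thread) and pthread_unpause(thread)) and forwards only the outermost pause and its matching final unpause.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical wrapper that counts nesting levels for one target thread.
class NestedPause {
public:
    NestedPause(std::function<void()> do_pause, std::function<void()> do_unpause)
        : do_pause_(std::move(do_pause)), do_unpause_(std::move(do_unpause)) {}

    void pause() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (depth_++ == 0) do_pause_();    // 0 -> 1: really pause
    }

    void unpause() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (depth_ == 0) return;           // unbalanced unpause: ignore
        if (--depth_ == 0) do_unpause_();  // 1 -> 0: really resume
    }

    int depth() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return depth_;
    }

private:
    mutable std::mutex mutex_;
    int depth_ = 0;
    std::function<void()> do_pause_;
    std::function<void()> do_unpause_;
};
```

With this wrapper, pausing three times and unpausing once leaves the thread paused; it resumes only after the third unpause.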
UPDATE5: I've improved the robustness of the code with the following changes: I ensure proper serialization of pause/unpause calls by using semaphores. This hopefully fixes the last remaining deadlocks. Now you can be sure that when the pause call returns, the target thread is actually already paused. This also solves the issues with the signal queue overflowing. I've also added the SA_RESTART flag, which prevents the internal signals from interrupting IO waits. Sleeps/delays still have to be restarted manually, but I provide a convenient wrapper called pthread_nanosleep() which does just that.
UPDATE6: I realized that simply restarting nanosleep() is not enough, because that way the timeout does not advance while the thread is paused. Therefore I've modified pthread_nanosleep() to convert the timeout interval to an absolute point in time in the future and sleep until then. I've also hidden the semaphore initialization, so the user does not need to do it.
Here is an example of a thread function within a class with pause/resume functionality...
class SomeClass
{
public:
// ... construction/destruction
void Resume();
void Pause();
void Stop();
private:
static void* ThreadFunc(void* pParam);
pthread_t thread;
pthread_mutex_t mutex;
pthread_cond_t cond_var;
int command;
};
SomeClass::SomeClass()
{
pthread_mutex_init(&mutex, NULL);
pthread_cond_init(&cond_var, NULL);
// create thread in suspended state..
command = 0;
pthread_create(&thread, NULL, ThreadFunc, this);
}
SomeClass::~SomeClass()
{
// we should stop the thread and exit ThreadFunc before calling of blocking pthread_join function
// also it prevents the mutex staying locked..
Stop();
pthread_join(thread, NULL);
pthread_cond_destroy(&cond_var);
pthread_mutex_destroy(&mutex);
}
void* SomeClass::ThreadFunc(void* pParam)
{
SomeClass* pThis = (SomeClass*)pParam;
timespec time_ns = {0, 50*1000*1000}; // 50 milliseconds
while(1)
{
pthread_mutex_lock(&pThis->mutex);
if (pThis->command == 2) // command to stop thread..
{
// be sure to unlock mutex before exit..
pthread_mutex_unlock(&pThis->mutex);
return NULL;
}
else if (pThis->command == 0) // command to pause thread..
{
pthread_cond_wait(&pThis->cond_var, &pThis->mutex);
// dont forget to unlock the mutex..
pthread_mutex_unlock(&pThis->mutex);
continue;
}
if (pThis->command == 1) // command to run..
{
// normal runing process..
fprintf(stderr, "*");
}
pthread_mutex_unlock(&pThis->mutex);
// it's important to give main thread few time after unlock 'this'
pthread_yield();
// ... or...
//nanosleep(&time_ns, NULL);
}
pthread_exit(NULL);
}
void SomeClass::Stop()
{
pthread_mutex_lock(&mutex);
command = 2;
pthread_cond_signal(&cond_var);
pthread_mutex_unlock(&mutex);
}
void SomeClass::Pause()
{
pthread_mutex_lock(&mutex);
command = 0;
// in pause command we dont need to signal cond_var because we not in wait state now..
pthread_mutex_unlock(&mutex);
}
void SomeClass::Resume()
{
pthread_mutex_lock(&mutex);
command = 1;
pthread_cond_signal(&cond_var);
pthread_mutex_unlock(&mutex);
}

Cancelling a thread using pthread_cancel : good practice or bad

I have a C++ program on Linux (CentOS 5.3) spawning multiple threads which are in an infinite loop to perform a job and sleep for certain minutes.
Now I have to cancel the running threads when a new configuration notification comes in and start a fresh set of threads, for which I have used pthread_cancel.
What I observed was that the threads did not stop even after receiving the cancel indication; some sleeping threads even came back up after their sleep completed.
As this behavior is not what I wanted, I wonder whether using pthread_cancel in this scenario is good or bad practice.
Please comment on the pthread_cancel usage in above mentioned scenario.
In general thread cancellation is not a really good idea. It is better, whenever possible, to have a shared flag, that is used by the threads to break out of the loop. That way, you will let the threads perform any cleanup they might need to do before actually exiting.
On the issue of the threads not actually cancelling, the POSIX specification determines a set of cancellation points ( man 7 pthreads ). Threads can be cancelled only at those points. If your infinite loop does not contain a cancellation point you can add one by calling pthread_testcancel. If pthread_cancel has been called, then it will be acted upon at this point.
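As an illustration of adding a cancellation point to a loop that has none, here is a minimal sketch. busy_worker and cancel_busy_worker are made-up names, and the counter only stands in for real work.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>

static std::atomic<unsigned long> g_spins{0};

// A loop with no blocking calls: without pthread_testcancel() a deferred
// cancellation request would never be acted upon here.
extern "C" void* busy_worker(void*) {
    for (;;) {
        g_spins.fetch_add(1, std::memory_order_relaxed);
        pthread_testcancel();  // explicit cancellation point
    }
    return nullptr;  // not reached
}

// Starts the worker, cancels it and reports whether it ended by cancellation.
bool cancel_busy_worker() {
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, busy_worker, nullptr);
    assert(rc == 0);
    rc = pthread_cancel(tid);
    assert(rc == 0);
    (void)rc;
    void* result = nullptr;
    pthread_join(tid, &result);
    return result == PTHREAD_CANCELED;
}
```

pthread_join reports PTHREAD_CANCELED as the thread's result, which confirms that the request was honoured at the pthread_testcancel() call.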
If you are writing exception-safe C++ code (see http://www.boost.org/community/exception_safety.html) then your code is naturally ready for thread cancellation. glibc throws a C++ exception (abi::__forced_unwind) on thread cancel, so that your destructors can do the appropriate clean-up.
You can do the equivalent of the code below.
#include <pthread.h>
#include <cxxabi.h>
#include <unistd.h>
#include <exception>
...
void *Control(void* pparam)
{
    try
    {
        // do your work here, maybe long loop
    }
    catch (abi::__forced_unwind&)
    {   // handle pthread_cancel stack unwinding exception
        throw; // must be rethrown so the cancellation can complete
    }
    catch (std::exception&)
    {
        throw; // rethrow the original exception without slicing it
    }
    return NULL;
}
int main()
{
    pthread_t tid;
    int rtn;
    rtn = pthread_create( &tid, NULL, Control, NULL );
    usleep(500);
    // some other work here
    rtn = pthread_cancel( tid );
    pthread_join( tid, NULL ); // wait for the cancelled thread to finish
}
I'd use boost::asio.
Something like:
#include <boost/asio.hpp>
struct Wait {
    Wait() : timer_(io_service_), run_(true) {}
    void doWork();
    void halt();
    boost::asio::io_service io_service_;
    mutable boost::asio::deadline_timer timer_;
    bool run_;
};
void Wait::doWork() {
    while (run_) {
        boost::system::error_code ec;
        timer_.wait(ec);
        io_service_.run();
        if (ec) {
            if (ec == boost::asio::error::operation_aborted) {
                // cleanup
            } else {
                // Something else, possibly nasty, happened
            }
        }
    }
}
void Wait::halt() {
    run_ = false;
    timer_.cancel();
}
Once you've got your head round it, asio is a wonderful tool.