Implementing a semaphore on my own - C++

let's pretend there are no libraries that provide semaphores for C++. I wrote this:
#include <vector>
#include <Windows.h>
class Semaphore {
HANDLE mutexS; // provides mutex in semaphore routines
std::vector<HANDLE> queue; // provides FIFO queue for blocked threads
int value; // semaphore's value
public:
Semaphore(int init=1);
~Semaphore();
void wait();
void signal();
};
Semaphore::Semaphore(int init) {
value = init;
queue = std::vector<HANDLE>();
mutexS = CreateMutex(0,0,0);
}
Semaphore::~Semaphore() {
CloseHandle(mutexS);
}
void Semaphore::signal() {
WaitForSingleObject(mutexS, INFINITE);
if (++value <= 0) {
HANDLE someOldThread = queue.front();
ResumeThread(someOldThread);
queue.erase(queue.begin());
CloseHandle(someOldThread);
}
ReleaseMutex(mutexS);
}
I would like to know why this implementation of wait() doesn't work:
void Semaphore::wait() {
WaitForSingleObject(mutexS, INFINITE);
if (--value < 0) {
HANDLE thisThread = GetCurrentThread();
queue.push_back(thisThread);
ReleaseMutex(mutexS);
SuspendThread(thisThread);
}
else
ReleaseMutex(mutexS);
}
And this one works:
void Semaphore::wait() {
WaitForSingleObject(mutexS, INFINITE);
if (--value < 0) {
HANDLE thisThread = GetCurrentThread();
HANDLE alsoThisThread;
DuplicateHandle(GetCurrentProcess(), thisThread, GetCurrentProcess(), &alsoThisThread, 0, 0, DUPLICATE_SAME_ACCESS);
queue.push_back(alsoThisThread);
ReleaseMutex(mutexS);
SuspendThread(alsoThisThread);
}
else
ReleaseMutex(mutexS);
}
What exactly happens in each case? I've been banging my head over it for a long time now. The first implementation of wait, which doesn't work, makes my program block (well, it probably blocks some thread forever). The second implementation works like a charm. What gives? Why do I need to duplicate thread handles and block the duplicate?

MSDN helps a lot here ;)
GetCurrentThread returns a pseudo-handle which is a constant for "the current thread":
A pseudo handle is a special constant that is interpreted as the current thread handle.
So when you push it in the queue, you are always pushing a constant that says "the current thread", which is obviously not what you want.
To get a real handle, you have to use DuplicateHandle
If hSourceHandle is a pseudo handle returned by GetCurrentProcess or GetCurrentThread, DuplicateHandle converts it to a real handle to a process or thread, respectively.
A final note: I suppose you are implementing this as a "test", right? Because there are several potential problems. A very good learning exercise would be to dig them out. But you should not use this in production code.
Out of curiosity: if you want to experiment a little more, the "canonical" way of implementing a semaphore with mutexes is to use two mutexes: see here

MSDN documentation for GetCurrentThread has the answer (emphasis mine):
The return value is a pseudo handle for the current thread.
A pseudo handle is a special constant that is interpreted as the current thread handle. The calling thread can use this handle to specify itself whenever a thread handle is required.
...
The function cannot be used by one thread to create a handle that can be used by other threads to refer to the first thread. The handle is always interpreted as referring to the thread that is using it. A thread can create a "real" handle to itself that can be used by other threads, or inherited by other processes, by specifying the pseudo handle as the source handle in a call to the DuplicateHandle function.

Related

Synchronize Threads - InterlockedExchange

I'd like to check whether a thread is doing work. If the thread is doing work, I will wait on an event until the thread has stopped its work; the thread sets the event at the end.
To check if the thread is working I declared a volatile bool variable. The bool variable will be true if the thread is running, else it is false. At the end of the thread the bool variable will be set to false.
Is it adequate to use a volatile bool variable or do I have to use an atomic function?
BTW: Can someone please explain the InterlockedExchange method to me? I don't understand the use case in which I would need this function.
Update
I see that without my code it is not clear whether a volatile bool variable is adequate. I wrote a test class which shows my problem.
class Testclass
{
public:
Testclass(void);
~Testclass(void);
void doThreadedWork();
void Work();
void StartWork();
void WaitUntilFinish();
private:
HANDLE hHasWork;
HANDLE hAbort;
HANDLE hFinished;
volatile bool m_bWorking;
};
//.cpp
#include "stdafx.h"
#include "Testclass.h"
CRITICAL_SECTION cs;
DWORD WINAPI myThread(LPVOID lpParameter)
{
Testclass* pTestclass = (Testclass*) lpParameter;
pTestclass->doThreadedWork();
return 0;
}
Testclass::Testclass(void)
{
InitializeCriticalSection(&cs);
DWORD myThreadID;
HANDLE myHandle = CreateThread(0, 0, myThread, this, 0, &myThreadID);
m_bWorking = false;
hHasWork = CreateEvent(NULL,TRUE,FALSE,NULL);
hAbort = CreateEvent(NULL,TRUE,FALSE,NULL);
hFinished = CreateEvent(NULL,FALSE,FALSE,NULL);
}
Testclass::~Testclass(void)
{
DeleteCriticalSection(&cs);
CloseHandle(hHasWork);
CloseHandle(hAbort);
CloseHandle(hFinished);
}
void Testclass::Work()
{
// do some work
m_bWorking = false;
SetEvent(hFinished);
}
void Testclass::StartWork()
{
EnterCriticalSection(&cs);
m_bWorking = true;
ResetEvent(hFinished);
SetEvent(hHasWork);
LeaveCriticalSection(&cs);
}
void Testclass::doThreadedWork()
{
HANDLE hEvents[2];
hEvents[0] = hHasWork;
hEvents[1] = hAbort;
while(true)
{
DWORD dwEvent = WaitForMultipleObjects(2, hEvents, FALSE, INFINITE);
if(WAIT_OBJECT_0 == dwEvent)
{
Work();
}
else
{
break;
}
}
}
void Testclass::WaitUntilFinish()
{
EnterCriticalSection(&cs);
if(!m_bWorking)
{
// if the thread is not working, do not wait and return
LeaveCriticalSection(&cs);
return;
}
WaitForSingleObject(hFinished,INFINITE);
LeaveCriticalSection(&cs);
}
For me it is not really clear whether m_bWorking needs to be accessed in an atomic way or whether the volatile qualifier is adequate.
There is a lot of background to cover for your question. We don't know, for example, what tool chain you are using, so I am going to answer it as a winapi question. I further assume you have something in mind like this:
volatile bool flag = false;
DWORD WINAPI WorkFn(void*) {
flag = true;
// work here
....
// done.
flag = false;
return 0;
}
int main() {
HANDLE th = CreateThread(...., &WorkFn, NULL, ..);
// wait for start of work.
while (!flag) {
// ?? # 1
}
// Seems thread is busy now. Time to wait for it to finish.
while (flag) {
// ?? # 2
}
}
There are many things wrong here. For starters the volatile does very little here. When flag = true happens it will eventually be visible to the other thread because it is backed by a global variable. This is so because it will at least make it into the cache and the cache has ways to tell other processors that a given line (which is a range of addresses) is dirty. The only way it would not make it into the cache is that if the compiler makes a super crazy optimization in which flag stays in the cpu as a register. That could actually happen but not in this particular code example.
So volatile tells the compiler to never keep the variable as a register. That is what it is, every time you see a volatile variable you can translate it as "never enregister this variable". Its use here is just basically a paranoid move.
If this code is what you had in mind then this looping over a flag pattern is called a Spinlock and this one is a really poor one. It is almost never the right thing to do in a user mode program.
Before we go into better approaches let me tackle your Interlocked question. What people usually mean is this pattern
volatile long flag = 0;
DWORD WINAPI WorkFn(void*) {
InterlockedExchange(&flag, 1);
....
}
int main() {
...
while (InterlockedCompareExchange(&flag, 1, 1) == 0L) {
YieldProcessor();
}
...
}
Assume the ... means similar code as before. What the InterlockedExchange() is doing is forcing the write to memory to happen in a deterministic, "broadcast the change now", kind of way and the typical way to read it in the same "bypass the cache" way is via InterlockedCompareExchange().
One problem with them is that they generate more traffic on the system bus. That is, the bus now being used to broadcast cache synchronization packets among the cpus on the system.
std::atomic<bool> flag would be the modern, C++11 way to do the same, but still not what you really want to do.
I added the YieldProcessor() call there to point to the real problem. When you wait for a memory address to change you are using cpu resources that would be better used somewhere else, for example in the actual work (!!). If you actually yield the processor there is at least a chance that the OS will give it to the WorkFn, but in a multicore machine it will quickly go back to polling the variable. In a modern machine you will be checking this flag millions of times per second, with the yield, probably 200000 times per second. Terrible waste either way.
What you want to do here is to leverage Windows to do a zero-cost wait, or at least as low-cost a wait as you want:
DWORD WINAPI WorkFn(void*) {
// work here
....
return 0;
}
int main() {
HANDLE th = CreateThread(...., &WorkFn, NULL, ..);
WaitForSingleObject(th, INFINITE);
// work is done!
CloseHandle(th);
}
When you return from the worker thread the thread handle gets signaled and the wait is satisfied. While stuck in WaitForSingleObject you don't consume any cpu cycles. If you want to do a periodic activity in the main() function while you wait, you can replace INFINITE with 1000, which will release the main thread every second. In that case you need to check the return value of WaitForSingleObject to tell the timeout case from the thread-being-done case.
If you need to actually know when work started, you need an additional waitable object, for example, a Windows event which is obtained via CreateEvent() and can be waited on using the same WaitForSingleObject.
Update [1/23/2016]
Now that we can see the code you have in mind, you don't need atomics, volatile works just fine. The m_bWorking is protected by the cs mutex anyhow for the true case.
If I might suggest, you can use TryEnterCriticalSection and cs to accomplish the same without m_bWorking at all:
void Testclass::Work()
{
EnterCriticalSection(&cs);
// do some work
LeaveCriticalSection(&cs);
SetEvent(hFinished); // could be removed as well
}
void Testclass::StartWork()
{
ResetEvent(hFinished); // could be removed.
SetEvent(hHasWork);
}
void Testclass::WaitUntilFinish()
{
if (TryEnterCriticalSection(&cs)) {
// Not busy now.
LeaveCriticalSection(&cs);
return;
} else {
// busy doing work. If we use EnterCriticalSection(&cs)
// here we can even eliminate hFinished from the code.
}
...
}
For some reason, the Interlocked API does not include an "InterlockedGet" or "InterlockedSet" function. This is a strange omission and the typical work around is to cast through volatile.
You can use code like the following on Windows:
#include <intrin.h>
__inline int InterlockedIncrement(int *j)
{ // This is VS-specific
return _InterlockedIncrement((volatile LONG *) j);
}
__inline int InterlockedDecrement(int *j)
{ // This is VS-specific
return _InterlockedDecrement((volatile LONG *) j);
}
__inline static void InterlockedSet(int *val, int newval)
{
*((volatile int *)val) = newval;
}
__inline static int InterlockedGet(int *val)
{
return *((volatile int *)val);
}
Yes, it's ugly. But it's the best way to work around the deficiency if you're not using C++11. If you're using C++11, use std::atomic instead.
Note that this is Windows-specific code and should not be used on other platforms.
No, volatile bool will not be enough. You need an atomic bool, as you correctly suspect. Otherwise, you might never see your bool updated.
There is also no InterlockedExchange in C++ (the tags of your question), but there are compare_exchange_weak and compare_exchange_strong functions in C++11. Those are used to set the value of an object to a certain NewValue, provided its current value is TestValue, and to indicate the status of this attempt (was the change made or not). The benefit of those functions is that this is done in such a fashion that you are guaranteed that if two threads are trying to perform this operation, only one will succeed. This is very helpful when you need to take certain actions depending on the result of the operation.

notify thread about changes in variable (signals?)

I have main() and a thread in the same program.
There is a variable named "status" that can take several values.
I need to notify the thread when the variable changes (the thread can't wait on the status variable; it is already busy with its regular task).
Is there an easy way to do so? Something similar to interrupts? How about signals?
the function inside the main:
int main()
{
char *status;
...
...
while (1)
{
switch (status)
{
case status1: // ...notify the thread
case status2: // ...notify the thread
case status3: // ...notify the thread
}
}
}
If someone could give me an example it would be great!
Thanks!
Since you're already using the pthread library you can use condition variables to tell the thread that there is data ready for processing. Take a look at this StackOverflow question for more information.
I understand that you do not want to wait indefinitely for this notification; however, C++ only implements cooperative scheduling. You cannot just pause a thread, fiddle with its memory, and resume it.
Therefore, the first thing you have to understand is that the thread which has to process the signal/action you want to send must be willing to do so, which in other words means it must explicitly check for the signal at some point.
There are multiple ways for a thread to check for a signal:
condition variable: they require waiting for the signal (which might be undesirable) but that wait can be bounded by a duration
action queue (aka channel): you create a queue of signals/actions and every so often the target thread checks for something to do; if there is nothing it just goes on doing whatever it has to do, if there is something you have to decide whether it should do everything or only process the N firsts. Beware of overflowing the queue.
just check the status variable directly every so often: it does not tell you how many times it changed (unless it keeps a history, but then we are back to the queue), but it allows you to amend your ways.
Given your requirements, I would think that the queue is probably the best idea among those three.
Maybe this example is helpful for you.
DWORD WINAPI sampleThread(LPVOID argument);
int main()
{
bool defValue = false;
bool* status = &defValue;
CreateThread(NULL, 0, sampleThread, status, 0,NULL);
while(1)
{
//.............
defValue = true; //trigger thread
// ...
}
return 0;
}
DWORD WINAPI sampleThread(LPVOID argument)
{
bool* syncPtr = reinterpret_cast<bool*>(argument);
while (1)
{
if (false == *syncPtr)
{
// do something
}
else // *syncPtr == true
{
//do something else
}
}
}

c++ winapi threads

These days I'm trying to learn more things about threads in windows. I thought about making this practical application:
Let's say there are several threads started when a button "Start" is pressed. Assume these threads are intensive (they keep running / always have something to work on).
This app would also have a "Stop" button. When this button is pressed all the threads should close in a nice way: free resources, abandon their work, and return to the state they were in before the "Start" button was pressed.
Another requirement of the app is that the functions run by the threads shouldn't contain any instruction checking whether the "Stop" button was pressed. The function running in the thread shouldn't care about the stop button.
Language: C++
OS: Windows
Problems:
WrapperFunc(function, param)
{
// what to write here ?
// if i write this:
function(param);
// i cannot stop the function from executing
}
How should I construct the wrapper function so that I can stop the thread properly?
( without using TerminateThread or some other functions )
What if the programmer allocates some memory dynamically? How can I free it before closing
the thread? (Note that when I press the "Stop" button the thread is still processing data.)
I thought about overloading the new operator or just imposing the usage of a predefined
function to be used when allocating memory dynamically. This, however, means
that the programmer who uses this API is constrained, and that's not what I want.
Thank you
Edit: Skeleton to describe the functionality I'd like to achieve.
struct wrapper_data
{
void* (*function)(LPVOID);
LPVOID *params;
};
/*
this function should make sure that the threads stop properly
( free memory allocated dynamically etc )
*/
void* WrapperFunc(LPVOID *arg)
{
wrapper_data *data = (wrapper_data*) arg;
// what to write here ?
// if i write this:
data->function(data->params);
// i cannot stop the function from executing
delete data;
}
// will have exactly the same arguments as CreateThread
MyCreateThread(..., function, params, ...)
{
// this should create a thread that runs the wrapper function
wrapper_data *data = new wrapper_data;
data->function = function;
data->params = params;
CreateThread(..., WrapperFunc, (LPVOID) data, ...);
}
thread_function(LPVOID *data)
{
while(1)
{
//do stuff
}
}
// as you can see I want it to be completely invisible
// to the programmer who uses this
MyCreateThread(..., thread_function, (LPVOID) params,...);
One solution is to have some kind of signal that tells the threads to stop working. Often this can be a global boolean variable that is normally false but when set to true tells the threads to stop. As for the cleaning up, do it when the thread's main loop is done, before returning from the thread.
I.e. something like this:
volatile bool gStopThreads = false; // Defaults to false, threads should not stop
void thread_function()
{
while (!gStopThreads)
{
// Do some stuff
}
// All processing done, clean up after my self here
}
As for the cleaning up bit, if you keep the data inside a struct or a class, you can forcibly kill the threads from outside and just either delete the instances if you allocated them dynamically, or let the system handle it if they were created e.g. on the stack or as global objects. Of course, all data your thread allocates (including files, sockets etc.) must be placed in this structure or class.
A way of keeping the stopping functionality in the wrapper is to have the actual main loop in the wrapper, together with the check for the stop signal. Then in the main loop just call a doStuff-like function that does the actual processing. However, if it contains operations that might take time, you end up with the first problem again.
See my answer to this similar question:
How do I guarantee fast shutdown of my win32 app?
Basically, you can use QueueUserAPC to queue a proc which throws an exception. The exception should bubble all the way up to a 'catch' in your thread proc.
As long as any libraries you're using are reasonably exception-aware and use RAII, this works remarkably well. I haven't successfully got this working with boost::threads however, as it doesn't put suspended threads into an alertable wait state, so QueueUserAPC can't wake them.
If you don't want the "programmer" of the function that the thread will execute to deal with the "stop" event, make the thread execute a function of yours that deals with the "stop" event and, while that event isn't signaled, executes the "programmer" function...
In other words the "while(!event)" will be in a function that calls the "job" function.
Code Sample.
typedef void (*JobFunction)(LPVOID params); // The prototype of the function to execute inside the thread
struct structFunctionParams
{
int iCounter;
structFunctionParams()
{
iCounter = 0;
}
};
struct structJobParams
{
bool bStop;
JobFunction pFunction;
LPVOID pFunctionParams;
structJobParams()
{
bStop = false;
pFunction = NULL;
pFunctionParams = NULL;
}
};
DWORD WINAPI ThreadProcessJob(IN LPVOID pParams)
{
structJobParams* pJobParams = (structJobParams*)pParams;
while(!pJobParams->bStop)
{
// Execute the "programmer" function
pJobParams->pFunction(pJobParams->pFunctionParams);
}
return 0;
}
void ThreadFunction(LPVOID pParams)
{
// Do Something....
((structFunctionParams*)pParams)->iCounter ++;
}
int _tmain(int argc, _TCHAR* argv[])
{
structFunctionParams stFunctionParams;
structJobParams stJobParams;
stJobParams.pFunction = &ThreadFunction;
stJobParams.pFunctionParams = &stFunctionParams;
DWORD dwIdThread = 0;
HANDLE hThread = CreateThread(
NULL,
0,
ThreadProcessJob,
(LPVOID) &stJobParams, 0, &dwIdThread);
if(hThread)
{
// Give it 5 seconds to work
Sleep(5000);
stJobParams.bStop = true; // Signal to Stop
WaitForSingleObject(hThread, INFINITE); // Wait to finish
CloseHandle(hThread);
}
}

How to pause a pthread ANY TIME I want?

Recently I set out to port ucos-ii to an Ubuntu PC.
As we know, it's not possible to simulate a ucos-ii "process" by simply adding a flag to a "while" loop in the pthread's callback function to perform pause and resume (like the solution below), because a "process" in ucos-ii can be paused or resumed at any time!
How to sleep or pause a PThread in c on Linux
I have found one solution on the website below, but it can't be built because it's out of date. It uses Linux processes to simulate the tasks in ucos-ii (a task acts like a process in Linux).
http://www2.hs-esslingen.de/~zimmerma/software/index_uk.html
If a pthread can act like a process which can be paused and resumed at any time, please tell me some related functions and I can figure it out myself. If it can't, I think I should focus on the older solution. Thanks a lot.
The Modula-3 garbage collector needs to suspend pthreads at an arbitrary time, not just when they are waiting on a condition variable or mutex. It does it by registering a (Unix) signal handler that suspends the thread and then using pthread_kill to send a signal to the target thread. I think it works (it has been reliable for others but I'm debugging an issue with it right now...) It's a bit kludgy, though....
Google for ThreadPThread.m3 and look at the routines "StopWorld" and "StartWorld". Handler itself is in ThreadPThreadC.c.
If stopping at specific points with a condition variable is insufficient, then you can't do this with pthreads. The pthread interface does not include suspend/resume functionality.
See, for example, answer E.4 here:
The POSIX standard provides no mechanism by which a thread A can suspend the execution of another thread B, without cooperation from B. The only way to implement a suspend/restart mechanism is to have B check periodically some global variable for a suspend request and then suspend itself on a condition variable, which another thread can signal later to restart B.
That FAQ answer goes on to describe a couple of non-standard ways of doing it, one in Solaris and one in LinuxThreads (which is now obsolete; do not confuse it with current threading on Linux); neither of those apply to your situation.
On Linux you can probably set up a custom signal handler (e.g. using signal()) that contains a wait for another signal (e.g. using sigsuspend()). You then send the signals using pthread_kill() or tgkill(). It is important to use so-called "realtime signals" for this, because normal signals like SIGUSR1 and SIGUSR2 don't get queued, which means that they can get lost under high load conditions. You send a signal several times, but it gets received only once, because while the signal handler is running, new signals of the same kind are ignored. So if you have concurrent threads doing PAUSE/RESUME, you can lose a RESUME event and cause a deadlock. On the other hand, pending realtime signals (like SIGRTMIN+1 and SIGRTMIN+2) are not deduplicated, so there can be several identical rt signals in the queue at the same time.
DISCLAIMER: I have not tried this yet, but in theory it should work.
Also see man 7 signal-safety. There is a list of calls that you can safely call in signal handlers. Fortunately sigsuspend() seems to be one of them.
UPDATE: I have working code right here:
//Filename: pthread_pause.c
//Author: Tomas 'Harvie' Mudrunka 2021
//Build: CFLAGS=-lpthread make pthread_pause; ./pthread_pause
//Test: valgrind --tool=helgrind ./pthread_pause
//I've written this code as an exercise to solve the following Stack Overflow question:
// https://stackoverflow.com/questions/9397068/how-to-pause-a-pthread-any-time-i-want/68119116#68119116
#define _GNU_SOURCE //pthread_yield() needs this
#include <signal.h>
#include <pthread.h>
//#include <pthread_extra.h>
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <errno.h>
#include <sys/resource.h>
#include <time.h>
#define PTHREAD_XSIG_STOP (SIGRTMIN+0)
#define PTHREAD_XSIG_CONT (SIGRTMIN+1)
#define PTHREAD_XSIGRTMIN (SIGRTMIN+2) //First unused RT signal
pthread_t main_thread;
sem_t pthread_pause_sem;
pthread_once_t pthread_pause_once_ctrl = PTHREAD_ONCE_INIT;
void pthread_pause_once(void) {
sem_init(&pthread_pause_sem, 0, 1);
}
#define pthread_pause_init() (pthread_once(&pthread_pause_once_ctrl, &pthread_pause_once))
#define NSEC_PER_SEC (1000*1000*1000)
// timespec_normalise() from https://github.com/solemnwarning/timespec/
struct timespec timespec_normalise(struct timespec ts)
{
while(ts.tv_nsec >= NSEC_PER_SEC) {
++(ts.tv_sec); ts.tv_nsec -= NSEC_PER_SEC;
}
while(ts.tv_nsec <= -NSEC_PER_SEC) {
--(ts.tv_sec); ts.tv_nsec += NSEC_PER_SEC;
}
if(ts.tv_nsec < 0) { // Negative nanoseconds isn't valid according to POSIX.
--(ts.tv_sec); ts.tv_nsec = (NSEC_PER_SEC + ts.tv_nsec);
}
return ts;
}
void pthread_nanosleep(struct timespec t) {
//Sleep calls on Linux get interrupted by signals, causing premature wake
//Pthread (un)pause is built using signals
//Therefore we need self-restarting sleep implementation
//IO timeouts are restarted by SA_RESTART, but sleeps do need explicit restart
//We also need to sleep using absolute time, because relative time is paused
//You should use this in any thread that gets (un)paused
struct timespec wake;
clock_gettime(CLOCK_MONOTONIC, &wake);
t = timespec_normalise(t);
wake.tv_sec += t.tv_sec;
wake.tv_nsec += t.tv_nsec;
wake = timespec_normalise(wake);
while(clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &wake, NULL)) if(errno!=EINTR) break;
return;
}
void pthread_nsleep(time_t s, long ns) {
struct timespec t;
t.tv_sec = s;
t.tv_nsec = ns;
pthread_nanosleep(t);
}
void pthread_sleep(time_t s) {
pthread_nsleep(s, 0);
}
void pthread_pause_yield() {
//Call this to give other threads chance to run
//Wait until last (un)pause action gets finished
sem_wait(&pthread_pause_sem);
sem_post(&pthread_pause_sem);
//usleep(0);
//nanosleep(&((const struct timespec){.tv_sec=0,.tv_nsec=1}), NULL);
//pthread_nsleep(0,1); //pthread_yield() is not enough, so we use sleep
pthread_yield();
}
void pthread_pause_handler(int signal) {
//Do nothing when there are more signals pending (to cleanup the queue)
//This is no longer needed, since we use semaphore to limit pending signals
/*
sigset_t pending;
sigpending(&pending);
if(sigismember(&pending, PTHREAD_XSIG_STOP)) return;
if(sigismember(&pending, PTHREAD_XSIG_CONT)) return;
*/
//Post semaphore to confirm that signal is handled
sem_post(&pthread_pause_sem);
//Suspend if needed
if(signal == PTHREAD_XSIG_STOP) {
sigset_t sigset;
sigfillset(&sigset);
sigdelset(&sigset, PTHREAD_XSIG_STOP);
sigdelset(&sigset, PTHREAD_XSIG_CONT);
sigsuspend(&sigset); //Wait for next signal
} else return;
}
void pthread_pause_enable() {
//Having signal queue too deep might not be necessary
//It can be limited using RLIMIT_SIGPENDING
//You can get runtime SigQ stats using following command:
//grep -i sig /proc/$(pgrep binary)/status
//This is no longer needed, since we use semaphores
//struct rlimit sigq = {.rlim_cur = 32, .rlim_max=32};
//setrlimit(RLIMIT_SIGPENDING, &sigq);
pthread_pause_init();
//Prepare sigset
sigset_t sigset;
sigemptyset(&sigset);
sigaddset(&sigset, PTHREAD_XSIG_STOP);
sigaddset(&sigset, PTHREAD_XSIG_CONT);
//Register signal handlers
//signal(PTHREAD_XSIG_STOP, pthread_pause_handler);
//signal(PTHREAD_XSIG_CONT, pthread_pause_handler);
//We now use sigaction() instead of signal(), because it supports SA_RESTART
const struct sigaction pause_sa = {
.sa_handler = pthread_pause_handler,
.sa_mask = sigset,
.sa_flags = SA_RESTART,
.sa_restorer = NULL
};
sigaction(PTHREAD_XSIG_STOP, &pause_sa, NULL);
sigaction(PTHREAD_XSIG_CONT, &pause_sa, NULL);
//UnBlock signals
pthread_sigmask(SIG_UNBLOCK, &sigset, NULL);
}
void pthread_pause_disable() {
//This is important for when you want to do some signal unsafe stuff
//Eg.: locking mutex, calling printf() which has internal mutex, etc...
//After unlocking mutex, you can enable pause again.
pthread_pause_init();
//Make sure all signals are dispatched before we block them
sem_wait(&pthread_pause_sem);
//Block signals
sigset_t sigset;
sigemptyset(&sigset);
sigaddset(&sigset, PTHREAD_XSIG_STOP);
sigaddset(&sigset, PTHREAD_XSIG_CONT);
pthread_sigmask(SIG_BLOCK, &sigset, NULL);
sem_post(&pthread_pause_sem);
}
int pthread_pause(pthread_t thread) {
sem_wait(&pthread_pause_sem);
//If signal queue is full, we keep retrying
while(pthread_kill(thread, PTHREAD_XSIG_STOP) == EAGAIN) usleep(1000);
pthread_pause_yield();
return 0;
}
int pthread_unpause(pthread_t thread) {
sem_wait(&pthread_pause_sem);
//If signal queue is full, we keep retrying
while(pthread_kill(thread, PTHREAD_XSIG_CONT) == EAGAIN) usleep(1000);
pthread_pause_yield();
return 0;
}
void *thread_test(void *arg) {
(void)arg; //unused parameter
//Whole process dies if you kill thread immediately before it is pausable
//pthread_pause_enable();
while(1) {
//Printf() is not async signal safe (because it holds internal mutex),
//you should call it only with pause disabled!
//Will throw helgrind warnings anyway, not sure why...
//See: man 7 signal-safety
pthread_pause_disable();
printf("Running!\n");
pthread_pause_enable();
//Pausing main thread should not cause deadlock
//We pause main thread here just to test it is OK
pthread_pause(main_thread);
//pthread_nsleep(0, 1000*1000);
pthread_unpause(main_thread);
//Wait for a while
//pthread_nsleep(0, 1000*1000*100);
pthread_unpause(main_thread);
}
}
int main() {
pthread_t t;
main_thread = pthread_self();
pthread_pause_enable(); //Will get inherited by all threads from now on
//you need to call pthread_pause_enable (or disable) before creating threads,
//otherwise first (un)pause signal will kill whole process
pthread_create(&t, NULL, thread_test, NULL);
while(1) {
pthread_pause(t);
printf("PAUSED\n");
pthread_sleep(3);
printf("UNPAUSED\n");
pthread_unpause(t);
pthread_sleep(1);
/*
pthread_pause_disable();
printf("RUNNING!\n");
pthread_pause_enable();
*/
pthread_pause(t);
pthread_unpause(t);
}
pthread_join(t, NULL);
printf("DIEDED!\n");
}
I am also working on a library called "pthread_extra", which will have stuff like this and much more. I will publish it soon.
UPDATE2: This is still causing deadlocks when calling pause/unpause rapidly (with the sleep() calls removed). The printf() implementation in glibc holds an internal mutex, so if you suspend a thread which is in the middle of printf() and then want to printf() from the thread which plans to unpause it later, that will never happen, because printf() is locked. Unfortunately, even after removing the printf() and only running an empty while loop in the thread, I still get deadlocks under high pause/unpause rates, and I don't know why. Maybe (even realtime) Linux signals are not 100% safe. There is a realtime signal queue; maybe it just overflows or something...
UPDATE3: I think I've managed to fix the deadlock, but I had to completely rewrite most of the code. Now I have one (sig_atomic_t) variable per thread which holds the state of whether that thread should be running or not. It works kind of like a condition variable; pthread_(un)pause() transparently remembers this for each thread. I no longer have two signals, only one. The handler of that signal looks at the variable and only blocks on sigsuspend() when the variable says the thread should NOT run; otherwise it returns from the signal handler. In order to suspend/resume a thread I now set the sig_atomic_t variable to the desired state and raise that signal (which is common for both suspend and resume). It is important to use realtime signals to be sure the handler will actually run after you've modified the state variable. The code is a bit complex because of the thread status database. I will share that code in a separate solution as soon as I manage to simplify it enough, but I want to preserve the two-signal version here, because it kind of works, I like its simplicity, and maybe people will give us more insight into how to optimize it.
UPDATE4: I've fixed the deadlock in the original code (no need for a helper variable holding the status) by using a single handler for both signals and optimizing the signal queue a bit. There is still some problem with printf() shown by helgrind, but it is not caused by my signals; it happens even when I do not call pause/unpause at all. Overall this was only tested on Linux; I'm not sure how portable the code is, because there seems to be some undocumented behaviour of signal handlers which was originally causing the deadlock.
Please note that pause/unpause cannot be nested. If you pause 3 times and unpause 1 time, the thread WILL RUN. If you need such behaviour, you should create some kind of wrapper which counts the nesting levels and signals the thread accordingly (see the sketch after the updates below).
UPDATE5: I've improved the robustness of the code with the following changes: I ensure proper serialization of pause/unpause calls by use of semaphores, which hopefully fixes the last remaining deadlocks. Now you can be sure that when the pause call returns, the target thread is actually already paused. This also solves the issues with the signal queue overflowing. I've also added the SA_RESTART flag, which prevents the internal signals from interrupting IO waits. Sleeps/delays still have to be restarted manually, but I provide a convenient wrapper called pthread_nanosleep() which does just that.
UPDATE6: I realized that simply restarting nanosleep() is not enough, because that way the timeout does not run while the thread is paused. Therefore I've modified pthread_nanosleep() to convert the timeout interval to an absolute time point in the future and sleep until that point. I've also hidden the semaphore initialization, so the user does not need to do it.
Here is an example of a thread function within a class with pause/resume functionality...
class SomeClass
{
public:
// ... construction/destruction
void Resume();
void Pause();
void Stop();
private:
static void* ThreadFunc(void* pParam);
pthread_t thread;
pthread_mutex_t mutex;
pthread_cond_t cond_var;
int command;
};
SomeClass::SomeClass()
{
pthread_mutex_init(&mutex, NULL);
pthread_cond_init(&cond_var, NULL);
// create thread in suspended state..
command = 0;
pthread_create(&thread, NULL, ThreadFunc, this);
}
SomeClass::~SomeClass()
{
// we should stop the thread and exit ThreadFunc before calling of blocking pthread_join function
// also it prevents the mutex staying locked..
Stop();
pthread_join(thread, NULL);
pthread_cond_destroy(&cond_var);
pthread_mutex_destroy(&mutex);
}
void* SomeClass::ThreadFunc(void* pParam)
{
SomeClass* pThis = (SomeClass*)pParam;
timespec time_ns = {0, 50*1000*1000}; // 50 milliseconds
while(1)
{
pthread_mutex_lock(&pThis->mutex);
if (pThis->command == 2) // command to stop thread..
{
// be sure to unlock mutex before exit..
pthread_mutex_unlock(&pThis->mutex);
return NULL;
}
else if (pThis->command == 0) // command to pause thread..
{
pthread_cond_wait(&pThis->cond_var, &pThis->mutex);
// dont forget to unlock the mutex..
pthread_mutex_unlock(&pThis->mutex);
continue;
}
if (pThis->command == 1) // command to run..
{
// normal runing process..
fprintf(stderr, "*");
}
pthread_mutex_unlock(&pThis->mutex);
// it's important to give main thread few time after unlock 'this'
pthread_yield();
// ... or...
//nanosleep(&time_ns, NULL);
}
pthread_exit(NULL);
}
void SomeClass::Stop()
{
pthread_mutex_lock(&mutex);
command = 2;
pthread_cond_signal(&cond_var);
pthread_mutex_unlock(&mutex);
}
void SomeClass::Pause()
{
pthread_mutex_lock(&mutex);
command = 0;
// in pause command we dont need to signal cond_var because we not in wait state now..
pthread_mutex_unlock(&mutex);
}
void SomeClass::Resume()
{
pthread_mutex_lock(&mutex);
command = 1;
pthread_cond_signal(&cond_var);
pthread_mutex_unlock(&mutex);
}

A kind of thread pool

I used to call CreateThread() for all my threads, then WaitForMultipleObjects(), and leave the routine.
To get somewhat faster code, I'd like to do a kind of thread pool. My thread pools are sometimes created, later used multiple times, and later destroyed (i.e., there is not a single pool created at the beginning of the program). Each thread in my thread pool calls the same routine with different parameters, the number of threads is constant, and they always need to be launched at the same time.
What I do is as follows :
DWORD WINAPI runFunction(LPVOID p) {
Thread* thread = (Thread*) p;
while(true) {
WaitForSingleObject(thread->AwakeEventHandle, INFINITE);
thread->run();
SetEvent(thread->SleepingEventHandle);
SuspendThread(thread->handle);
}
return 0;
}
void ExecuteThreads(std::vector<Thread*> &threads) {
HANDLE* waitingEvents = new HANDLE[threads.size()];
for (int i=0; i<threads.size(); i++) {
if (threads[i]->handle == NULL) {
threads[i]->AwakeEventHandle = CreateEvent(NULL, true, false, "Awake");
threads[i]->SleepingEventHandle = CreateEvent(NULL, true, false, "Sleeping");
threads[i]->handle = CreateThread(NULL, 0, runFunction, (void*) threads[i], CREATE_SUSPENDED, NULL);
}
ResumeThread(threads[i]->handle);
ResetEvent(threads[i]->SleepingEventHandle);
SetEvent(threads[i]->AwakeEventHandle);
waitingEvents[i] = threads[i]->SleepingEventHandle;
}
WaitForMultipleObjects( threads.size(), waitingEvents, TRUE, INFINITE);
}
My class Thread has a destructor which calls CloseHandle for the HANDLEs SleepingEventHandle and AwakeEventHandle, and for the thread handle. The function Thread::run() is pure virtual, and it's up to the coder to inherit the Thread for an actual run() implementation.
As is, the code doesn't work. One reason is that when I don't need this pool anymore, the destructors of the Threads are called, but runFunction cannot exit and this crashes (the pointer "thread" has been destroyed but is still used by the function). There are probably many other problems with my code.
How would you do it, in a simple manner? Is there an easy fix? What problems will I encounter with this code?
Thanks!
Why do you have to deal with such low level api functions? Have a look at boost::thread and boost::thread_group. Also there is a thread pool implementation that works with boost::thread.
Now, if your threads only work for a short period of time, your system will have considerable overhead from creating and signaling all those threads and events. PPL task parallelism or tbb::task are definitely the ways to go.