Synchronize Threads - InterlockedExchange - c++

I would like to check whether a thread is doing work. If it is, I wait for an event until the thread has finished its work. The thread sets that event at the end.
To check whether the thread is working I declared a volatile bool variable. It is true while the thread is running and false otherwise. At the end of the thread the variable is set to false.
Is it adequate to use a volatile bool variable, or do I have to use an atomic function?
BTW: Can someone please explain the InterlockedExchange function to me? I don't understand in which use case I would need it.
Update
I see that without my code it is hard to say whether a volatile bool variable is adequate. I wrote a test class which shows my problem.
class Testclass
{
public:
Testclass(void);
~Testclass(void);
void doThreadedWork();
void Work();
void StartWork();
void WaitUntilFinish();
private:
HANDLE hHasWork;
HANDLE hAbort;
HANDLE hFinished;
volatile bool m_bWorking;
};
//.cpp
#include "stdafx.h"
#include "Testclass.h"
CRITICAL_SECTION cs;
DWORD WINAPI myThread(LPVOID lpParameter)
{
Testclass* pTestclass = (Testclass*) lpParameter;
pTestclass->doThreadedWork();
return 0;
}
Testclass::Testclass(void)
{
InitializeCriticalSection(&cs);
DWORD myThreadID;
HANDLE myHandle = CreateThread(0, 0, myThread, this, 0, &myThreadID);
m_bWorking = false;
hHasWork = CreateEvent(NULL,TRUE,FALSE,NULL);
hAbort = CreateEvent(NULL,TRUE,FALSE,NULL);
hFinished = CreateEvent(NULL,FALSE,FALSE,NULL);
}
Testclass::~Testclass(void)
{
DeleteCriticalSection(&cs);
CloseHandle(hHasWork);
CloseHandle(hAbort);
CloseHandle(hFinished);
}
void Testclass::Work()
{
// do some work
m_bWorking = false;
SetEvent(hFinished);
}
void Testclass::StartWork()
{
EnterCriticalSection(&cs);
m_bWorking = true;
ResetEvent(hFinished);
SetEvent(hHasWork);
LeaveCriticalSection(&cs);
}
void Testclass::doThreadedWork()
{
HANDLE hEvents[2];
hEvents[0] = hHasWork;
hEvents[1] = hAbort;
while(true)
{
DWORD dwEvent = WaitForMultipleObjects(2, hEvents, FALSE, INFINITE);
if(WAIT_OBJECT_0 == dwEvent)
{
Work();
}
else
{
break;
}
}
}
void Testclass::WaitUntilFinish()
{
EnterCriticalSection(&cs);
if(!m_bWorking)
{
// if the thread is not working, do not wait and return
LeaveCriticalSection(&cs);
return;
}
WaitForSingleObject(hFinished,INFINITE);
LeaveCriticalSection(&cs);
}
It is not really clear to me whether m_bWorking has to be accessed in an atomic way or whether declaring it volatile is adequate.

There is a lot of background to cover for your question. We don't know, for example, what toolchain you are using, so I am going to answer it as a winapi question. I further assume you have something like this in mind:
volatile bool flag = false;
DWORD WINAPI WorkFn(void*) {
flag = true;
// work here
....
// done.
flag = false;
return 0;
}
int main() {
HANDLE th = CreateThread(...., &WorkFn, NULL, ..);
// wait for start of work.
while (!flag) {
// ?? # 1
}
// Seems thread is busy now. Time to wait for it to finish.
while (flag) {
// ?? # 2
}
}
There are many things wrong here. For starters, volatile does very little here. When flag = true happens it will eventually be visible to the other thread because it is backed by a global variable. This is so because it will at least make it into the cache, and the cache has ways to tell other processors that a given line (which is a range of addresses) is dirty. The only way it would not make it into the cache is if the compiler made a super crazy optimization in which flag stays in a CPU register. That could actually happen, but not in this particular code example.
So volatile tells the compiler to never keep the variable in a register. That is what it is: every time you see a volatile variable you can translate it as "never enregister this variable". Its use here is basically a paranoid move.
If this code is what you had in mind, then this looping-over-a-flag pattern is called a spinlock, and this one is a really poor one. It is almost never the right thing to do in a user-mode program.
Before we go into better approaches let me tackle your Interlocked question. What people usually mean is this pattern:
volatile long flag = 0;
DWORD WINAPI WorkFn(void*) {
InterlockedExchange(&flag, 1);
....
}
int main() {
...
while (InterlockedCompareExchange(&flag, 1, 1) == 0L) {
YieldProcessor();
}
...
}
Assume the ... means similar code as before. What the InterlockedExchange() is doing is forcing the write to memory to happen in a deterministic, "broadcast the change now", kind of way and the typical way to read it in the same "bypass the cache" way is via InterlockedCompareExchange().
One problem with them is that they generate more traffic on the system bus. That is, the bus is now being used to broadcast cache synchronization packets among the CPUs in the system.
std::atomic<bool> flag would be the modern, C++11 way to do the same, but still not what you really want to do.
I added the YieldProcessor() call there to point to the real problem. When you wait for a memory address to change you are using CPU resources that would be better used somewhere else, for example in the actual work (!!). If you actually yield the processor there is at least a chance that the OS will give it to WorkFn, but on a multicore machine it will quickly go back to polling the variable. (Strictly speaking, YieldProcessor() is only a pause hint to the CPU; SwitchToThread() or Sleep(0) is what hands the time slice back to the scheduler.) On a modern machine you will be checking this flag millions of times per second; with the yield, probably 200000 times per second. Terrible waste either way.
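For reference, the std::atomic<bool> variant mentioned above might look roughly like the following. This is only a sketch assuming a C++11 toolchain; payload and waitForWorker are made-up names, and std::thread stands in for CreateThread to keep it portable. It still polls, so the same waste applies.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
std::atomic<bool> flag{false};
int payload = 0; // ordinary data published by the worker

void workFn() {
    payload = 42;                                 // stands in for the real work
    flag.store(true, std::memory_order_release);  // publish: payload is ready
}

int waitForWorker() {
    std::thread worker(workFn);
    // Polling loop: well defined, but it burns CPU just like the
    // Interlocked version does.
    while (!flag.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    assert(flag.load());  // leaving the loop implies the flag is set
    int seen = payload;   // safe: this read is ordered after the acquire load
    worker.join();
    return seen;
}
```

The release store paired with the acquire load is what makes the plain payload variable safe to read after the loop.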
What you want to do here is to leverage Windows to do a zero-cost wait, or at least one with as low a cost as possible:
DWORD WINAPI WorkFn(void*) {
// work here
....
return 0;
}
int main() {
HANDLE th = CreateThread(...., &WorkFn, NULL, ..);
WaitForSingleObject(th, INFINITE);
// work is done!
CloseHandle(th);
}
When you return from the worker thread, the thread handle gets signaled and the wait is satisfied. While stuck in WaitForSingleObject you don't consume any CPU cycles. If you want to do a periodic activity in the main() function while you wait, you can replace INFINITE with 1000, which will release the main thread every second. In that case you need to check the return value of WaitForSingleObject to tell a timeout (WAIT_TIMEOUT) apart from the thread being done (WAIT_OBJECT_0).
If you need to actually know when work started, you need an additional waitable object, for example, a Windows event which is obtained via CreateEvent() and can be waited on using the same WaitForSingleObject.
Update [1/23/2016]
Now that we can see the code you have in mind: you don't need atomics, volatile works just fine. m_bWorking is protected by the cs mutex anyhow for the true case.
If I might suggest, you can use TryEnterCriticalSection and cs to accomplish the same without m_bWorking at all:
void Testclass::Work()
{
EnterCriticalSection(&cs);
// do some work
LeaveCriticalSection(&cs);
SetEvent(hFinished); // could be removed as well
}
void Testclass::StartWork()
{
ResetEvent(hFinished); // could be removed.
SetEvent(hHasWork);
}
void Testclass::WaitUntilFinish()
{
if (TryEnterCriticalSection(&cs)) {
// Not busy now.
LeaveCriticalSection(&cs);
return;
} else {
// busy doing work. If we use EnterCriticalSection(&cs)
// here we can even eliminate hFinished from the code.
}
...
}

For some reason, the Interlocked API does not include an "InterlockedGet" or "InterlockedSet" function. This is a strange omission, and the typical workaround is to cast through volatile.
You can use code like the following on Windows:
#include <intrin.h>
__inline int InterlockedIncrement(int *j)
{ // This is VS-specific
return _InterlockedIncrement((volatile LONG *) j);
}
__inline int InterlockedDecrement(int *j)
{ // This is VS-specific
return _InterlockedDecrement((volatile LONG *) j);
}
__inline static void InterlockedSet(int *val, int newval)
{
*((volatile int *)val) = newval;
}
__inline static int InterlockedGet(int *val)
{
return *((volatile int *)val);
}
Yes, it's ugly. But it's the best way to work around the deficiency if you're not using C++11. If you're using C++11, use std::atomic instead.
Note that this is Windows-specific code and should not be used on other platforms.

No, volatile bool will not be enough. You need an atomic bool, as you correctly suspect. Otherwise, you might never see your bool updated.
There is also no InterlockedExchange in C++ (the tags of your question), but there are compare_exchange_weak and compare_exchange_strong functions in C++11. Those are used to set the value of an object to a certain NewValue, provided its current value is TestValue, and to indicate the status of this attempt (was the change made or not). The benefit of those functions is that this is done in such a fashion that you are guaranteed that if two threads are trying to perform this operation, only one will succeed. This is very helpful when you need to take certain actions depending on the result of the operation.
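To make the "only one will succeed" guarantee concrete, here is a small sketch. The function name countWinners and the variable names are mine, purely for illustration: several threads race to change a shared value from 0 (the TestValue) to their own id (the NewValue).

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Returns how many of threadCount threads managed to change owner
// from 0 (TestValue) to their own id (NewValue).
int countWinners(int threadCount) {
    assert(threadCount > 0);
    std::atomic<int> owner{0};
    std::atomic<int> winners{0};
    std::vector<std::thread> threads;
    for (int id = 1; id <= threadCount; ++id) {
        threads.emplace_back([&owner, &winners, id] {
            int expected = 0; // TestValue
            // Succeeds only if owner is still 0. On failure, expected is
            // overwritten with the value that was actually found.
            if (owner.compare_exchange_strong(expected, id)) {
                winners.fetch_add(1);
            }
        });
    }
    for (std::thread& t : threads) {
        t.join();
    }
    return winners.load();
}
```

However many threads take part, countWinners returns 1: the first successful exchange changes the value, so every later attempt finds something other than 0 and fails.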

Related

Running a task in a separate thread which should be able to stop on request

I am trying to design an infinite (or a user-defined length) loop that would be independent of my GUI process. I know how to start that loop in a separate thread, so the GUI process is not blocked. However, I would like to have a possibility to interrupt the loop at a press of a button. The complete scenario may look like this:
GUI::startButton->myClass::runLoop... ---> starts a loop in a new thread
GUI::stopButton->myClass::terminateLoop ---> should be able to interrupt the started loop
The problem I have is figuring out how to provide the stop functionality. I am sure there is a way to achieve this in C++. I was looking at a number of multithreading related posts and articles, as well as some lectures on how to use async and futures. Most of the examples did not fit my intended use and/or were too complex for my current state of skills.
Example:
GUIClass.cpp
MyClass *myClass = new MyClass;
void MyWidget::on_pushButton_start_clicked()
{
myClass->start().detach();
}
void MyWidget::on_pushButton_stop_clicked()
{
myClass->stop(); // TBD: how to implement the stop functionality?
}
MyClass.cpp
std::thread MyClass::start()
{
return std::thread(&MyClass::runLoop, this);
}
void MyClass::runLoop()
{
for(int i = 0; i < 999999; i++)
{
// do some work
}
}
As far as I know, there is no standard way to terminate a std::thread from the outside. And even if it were possible, it would not be advisable, since it can leave your application in an undefined state.
It would be better to add a check to your MyClass::runLoop method that stops execution in a controlled way as soon as an external condition is fulfilled. This might, for example, be a control variable like this:
void MyClass::start()
{
if(_thread.joinable()) // If the thread is joinable...
{
// Stop and join the previous run before (re)starting the thread
_threadRunning = false;
_thread.join();
}
_threadRunning = true;
_thread = std::thread(&MyClass::runLoop, this);
}
void MyClass::runLoop()
{
for(int i = 0; i < MAX_ITERATION_COUNT; i++)
{
if(_threadRunning == false) { break; }
// do some work
}
}
Then you can end the thread with:
void MyClass::stopLoop()
{
_threadRunning = false;
}
start() returns void here because a std::thread cannot be copied; the object keeps ownership of its thread, so the caller must not detach() it.
_threadRunning should be a member variable of type std::atomic<bool>. On x86, x86_64, ARM and ARM64 the hardware will not tear a plain bool write, but unsynchronized access to a non-atomic variable from two threads is still a data race in C++11, and the compiler is allowed to hoist the read out of the loop. std::atomic<bool> also hints at the fact that the variable is used in a multithreading context.
Possible MyClass.h:
#include <atomic>
#include <thread>
class MyClass
{
public:
MyClass() : _threadRunning(false) {}
// A joinable std::thread must be joined before it is destroyed.
~MyClass() { stopLoop(); if(_thread.joinable()) { _thread.join(); } }
void start();
void runLoop();
void stopLoop();
private:
std::thread _thread;
std::atomic<bool> _threadRunning;
};
It might be important to note that, depending on the code in your loop, it might take a while before the thread really stops.
Therefore it might be wise to std::thread::join the thread before restarting it, to make sure only one thread runs at a time.

Tricky situation with race condition

I have this race condition with an audio playback class, where every time I start playback I set keepRunning to true, and to false when I stop.
The problem happens when I stop() immediately after I start: the keepRunning flag is set to false, then reset to true again.
I could put a delay in stop(), but I don't think that's a very good solution. Should I use a condition variable to make stop() wait until keepRunning is true?
How would you normally solve this problem?
#include <iostream>
#include <thread>
using namespace std;
class AudioPlayer
{
bool keepRunning;
thread thread_play;
public:
AudioPlayer(){ keepRunning = false; }
~AudioPlayer(){ stop(); }
void play()
{
stop();
// keepRunning = true; // A: this works OK
thread_play = thread(&AudioPlayer::_play, this);
}
void stop()
{
keepRunning = false;
if (thread_play.joinable()) thread_play.join();
}
void _play()
{
cout << "Playing: started\n";
keepRunning = true; // B: this causes problem
while(keepRunning)
{
this_thread::sleep_for(chrono::milliseconds(100));
}
cout << "Playing: stopped\n";
}
};
int main()
{
AudioPlayer ap;
ap.play();
ap.play();
ap.play();
return 0;
}
Output:
$ ./test
Playing: started
(pause indefinitely...)
Here is my suggestion, combining many comments from below as well:
1) Briefly synchronized the keepRunning flag with a mutex so that it cannot be modified while a previous thread is still changing state.
2) Changed the flag to atomic_bool, as it is also modified while the mutex is not used.
class AudioPlayer
{
thread thread_play;
public:
AudioPlayer(){ }
~AudioPlayer()
{
keepRunning = false;
if ( thread_play.joinable() ) // join() on a thread that was never started would throw
thread_play.join();
}
void play()
{
unique_lock<mutex> l(_mutex);
keepRunning = false;
if ( thread_play.joinable() )
thread_play.join();
keepRunning = true;
thread_play = thread(&AudioPlayer::_play, this);
}
void stop()
{
unique_lock<mutex> l(_mutex);
keepRunning = false;
}
private:
void _play()
{
cout << "Playing: started\n";
while ( keepRunning == true )
{
this_thread::sleep_for(chrono::milliseconds(10));
}
cout << "Playing: stopped\n";
}
atomic_bool keepRunning { false };
std::mutex _mutex;
};
int main()
{
AudioPlayer ap;
ap.play();
ap.play();
ap.play();
this_thread::sleep_for(chrono::milliseconds(100));
ap.stop();
return 0;
}
To answer the question directly.
Setting keepRunning = true at point A is synchronous in the main thread, but setting it at point B is asynchronous to the main thread.
Because it is asynchronous, the call to ap.stop() in the main thread (and the one in the destructor) might take place before point B is reached (by the asynchronous thread), so the last thread runs forever.
You should also make keepRunning atomic; that will make sure that the value is communicated between the threads correctly. There's no guarantee of when or if the sub-thread will 'see' the value set by the main thread without some synchronization. You could also use a std::mutex.
Other answers don't like .join() in stop(). I would say that's a design decision. You certainly need to make sure the thread has stopped before leaving main()(*) but that could take place in the destructor (as other answers suggest).
As a final note the more conventional design wouldn't keep re-creating the 'play' thread but would wake/sleep a single thread. There's an overhead of creating a thread and the 'classic' model treats this as a producer/consumer pattern.
#include <iostream>
#include <thread>
#include <atomic>
#include <chrono>
class AudioPlayer
{
std::atomic<bool> keepRunning;
std::thread thread_play;
public:
AudioPlayer():keepRunning(false){
}
~AudioPlayer(){ stop(); }
void play()
{
stop();
keepRunning = true; // A: this works OK
thread_play = std::thread(&AudioPlayer::_play, this);
}
void stop()
{
keepRunning=false;
if (thread_play.joinable()){
thread_play.join();
}
}
void _play()
{
std::cout<<"Playing: started\n";
while(keepRunning)
{
std::this_thread::sleep_for(std::chrono::milliseconds(100));
}
std::cout<<"Playing: stopped\n";
}
};
int main()
{
AudioPlayer ap;
ap.play();
ap.play();
ap.play();
ap.stop();
return 0;
}
(*) You can also detach() but that's not recommended.
First, what you have here is indeed the definition of a data race - one thread is writing to a non-atomic variable keepRunning and another is reading from it. So even if you uncomment the line in play, you'd still have a data race. To avoid that, make keepRunning a std::atomic<bool>.
Now, the fundamental problem is the lack of symmetry between play and stop - play does the actual work in a spawned thread, while stop does it in the main thread. To make the flow easier to reason about, increase symmetry:
set keepRunning in play, or
have play wait for the thread to be up and running and done with any setup (also eliminating the need for the if in stop).
As a side note, one way to handle cases where a flag is set and reset in possibly uneven order is to replace it with a counter. You then stall until you see the expected value, and only then apply the change (using CAS).
Ideally, you'd just set keepRunning before starting the thread (as in the commented out line in your play() function). That's the neatest solution, and skips the race completely.
If you want to be more fancy, you can also use a condition_variable and signal the playing thread with notify_one or notify_all, and in the loop check wait_for with a duration of 0. If it doesn't return cv_status::timeout then you should stop playing.
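A rough sketch of the condition_variable idea, with one deliberate change: instead of a zero-duration check it uses the timed wait as the loop's sleep and passes a predicate, so a spurious wakeup is not mistaken for a stop request. The class name StoppableLoop and all member names are made up for illustration; this is not the asker's AudioPlayer.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical class, for illustration only.
class StoppableLoop {
public:
    ~StoppableLoop() { stop(); }

    void start() {
        stop();                      // end a previous run, if any
        assert(!worker_.joinable());
        stopRequested_ = false;      // no worker exists right now, so no lock is needed
        worker_ = std::thread(&StoppableLoop::run, this);
    }

    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopRequested_ = true;
        }
        wakeup_.notify_all();        // wake the worker immediately
        if (worker_.joinable()) worker_.join();
    }

    bool isRunning() const { return worker_.joinable(); }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        // wait_for returns true once the predicate holds (stop requested)
        // and false when the 10 ms period simply ran out.
        while (!wakeup_.wait_for(lock, std::chrono::milliseconds(10),
                                 [this] { return stopRequested_; })) {
            lock.unlock();
            // ... play one chunk of audio here ...
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable wakeup_;
    bool stopRequested_ = false;     // guarded by mutex_ while a worker runs
    std::thread worker_;
};
```

Because the flag is set before the thread is created, stop() always wins even when it is called right after start(), and the worker wakes up at once instead of finishing its sleep.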
Don't make stop pause and wait for state to settle down. That would work here, but is a bad habit to get into for later.
As noted in the comment, it is undefined behavior to write to a variable while simultaneously reading from it. atomic<bool> solves this, but wouldn't fix your race on its own, it just makes the reads and writes well defined.
I modified your program a bit and it works now. Let's discuss problems first:
Problem 1: using plain bool variable in 2 threads
Here both threads update the variable, which can lead to a race condition because the outcome depends on which thread runs first, and it can even end up in undefined behaviour. Undefined behaviour might occur in particular when a write from one thread is interrupted by another. Here Snps brought up links to the following SO answers:
When do I really need to use atomic<bool> instead of bool?
trap representation
In addition I was searching if write can be interrupted for bool on x86 platforms and came across this answer:
Can a bool read/write operation be not atomic on x86?
Problem 2: Caching as compiler optimization
Another problem is that variables are allowed to be cached. It means that the «playing thread» might cache the value of keepRunning and thus never terminate, or terminate only after a considerable amount of time. In previous C++ versions (98, 2003) the volatile modifier was the only construct to mark variables so as to prevent caching optimizations, forcing the compiler to always read the variable from its actual memory location. Thus, once the «playing thread» enters the while loop, keepRunning might be cached and never re-read, or re-read only with considerable delay, no matter when stop() modifies it.
C++11 introduced the atomic template and its atomic_bool specialization, which make such variables non-cacheable and read/written in an uninterruptible manner, thus addressing Problems 1 & 2.
Side note: volatile and caching explained by Andrei Alexandrescu in the Dr. Dobbs article which addresses exactly this situation:
Caching variables in registers is a very valuable optimization that applies most of the time, so it would be a pity to waste it. C and C++ give you the chance to explicitly disable such caching. If you use the volatile modifier on a variable, the compiler won't cache that variable in registers — each access will hit the actual memory location of that variable.
Problem 3: stop was called before _play() function was even started
The problem here is that in a multi-threaded OS the scheduler grants each thread a time slice to run. As long as the thread can make progress and its time slice is not over, it continues to run. In the «main thread» all play() calls were executed before the «play threads» even started to run. Thus the object destruction, and with it the last stop(), took place before the _play() function started running. And that is where you set the variable keepRunning to true.
How I fixed this problem
We need to ensure that play() returns only once the _play() function has started running. A condition_variable is of help here. play() blocks until _play() notifies it that it has started executing.
Here is the code:
#include <iostream>
#include <thread>
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
using namespace std;
class AudioPlayer
{
atomic_bool keepRunning;
thread thread_play;
std::mutex mutex;
std::condition_variable play_started;
public:
AudioPlayer()
: keepRunning{false}
{}
~AudioPlayer(){ stop(); }
void play()
{
stop();
std::unique_lock<std::mutex> lock(mutex);
thread_play = thread(&AudioPlayer::_play, this);
// The predicate guards against spurious wakeups.
play_started.wait(lock, [this]{ return keepRunning.load(); });
}
void stop()
{
keepRunning = false;
cout << "stop called" << endl;
if (thread_play.joinable()) thread_play.join();
}
void _play()
{
cout << "Playing: started\n";
{
// Set the flag under the mutex so that play() cannot miss the notification.
std::lock_guard<std::mutex> lock(mutex);
keepRunning = true; // B: no longer a problem, play() waits for it
}
play_started.notify_one();
while(keepRunning)
{
this_thread::sleep_for(chrono::milliseconds(100));
}
cout << "Playing: stopped\n";
}
};
int main()
{
AudioPlayer ap;
ap.play();
ap.play();
ap.play();
return 0;
}
Your solution A is actually almost correct. It's still undefined behavior to have one thread read from non-atomic variable that another is writing to. So keepRunning must be made an atomic<bool>. Once you do that and in conjunction with your fix from A, your code will be fine. That is because stop now has a correct post condition that no thread will be active (in particular no _play call) after it exits.
Note that no mutex is necessary. However, play and stop are not themselves thread safe. As long as the client of AudioPlayer is not using the same instance of AudioPlayer in multiple threads though that shouldn't matter.

Mutex Safety with Interrupts (Embedded Firmware)

Edit: @Mike pointed out that my try_lock function in the code below is unsafe and that accessor creation can produce a race condition as well. The suggestions (from everyone) have convinced me that I'm going down the wrong path.
Original Question
The requirements for locking on an embedded microcontroller are different enough from multithreading that I haven't been able to convert multithreading examples to my embedded applications. Typically I don't have an OS or threads of any kind, just main and whatever interrupt functions are called by the hardware periodically.
It's pretty common that I need to fill up a buffer from an interrupt, but process it in main. I've created the IrqMutex class below to try to safely implement this. Each person trying to access the buffer is assigned a unique id through IrqMutexAccessor, then they each can try_lock() and unlock(). The idea of a blocking lock() function doesn't work from interrupts because unless you allow the interrupt to complete, no other code can execute so the unlock() code never runs. I do however use a blocking lock from the main() code occasionally.
However, I know that the double-check lock doesn't work without C++11 memory barriers (which aren't available on many embedded platforms). Honestly despite reading quite a bit about it, I don't really understand how/why the memory access reordering can cause a problem. I think that the use of volatile sig_atomic_t (possibly combined with the use of unique IDs) makes this different from the double-check lock. But I'm hoping someone can: confirm that the following code is correct, explain why it isn't safe, or offer a better way to accomplish this.
class IrqMutex {
friend class IrqMutexAccessor;
private:
std::sig_atomic_t accessorIdEnum;
volatile std::sig_atomic_t owner;
protected:
std::sig_atomic_t nextAccessor(void) { return ++accessorIdEnum; }
bool have_lock(std::sig_atomic_t accessorId) {
return (owner == accessorId);
}
bool try_lock(std::sig_atomic_t accessorId) {
// Only try to get a lock, while it isn't already owned.
while (owner == SIG_ATOMIC_MIN) {
// <-- If an interrupt occurs here, both attempts can get a lock at the same time.
// Try to take ownership of this Mutex.
owner = accessorId; // SET
// Double check that we are the owner.
if (owner == accessorId) return true;
// Someone else must have taken ownership between CHECK and SET.
// If they released it after CHECK, we'll loop back and try again.
// Otherwise someone else has a lock and we have failed.
}
// This shouldn't happen unless they called try_lock on something they already owned.
if (owner == accessorId) return true;
// If someone else owns it, we failed.
return false;
}
bool unlock(std::sig_atomic_t accessorId) {
// Double check that the owner called this function (not strictly required)
if (owner == accessorId) {
owner = SIG_ATOMIC_MIN;
return true;
}
// We still return true if the mutex was unlocked anyway.
return (owner == SIG_ATOMIC_MIN);
}
public:
IrqMutex(void) : accessorIdEnum(SIG_ATOMIC_MIN), owner(SIG_ATOMIC_MIN) {}
};
// This class is used to manage our unique accessorId.
class IrqMutexAccessor {
friend class IrqMutex;
private:
IrqMutex& mutex;
const std::sig_atomic_t accessorId;
public:
IrqMutexAccessor(IrqMutex& m) : mutex(m), accessorId(m.nextAccessor()) {}
bool have_lock(void) { return mutex.have_lock(accessorId); }
bool try_lock(void) { return mutex.try_lock(accessorId); }
bool unlock(void) { return mutex.unlock(accessorId); }
};
Because there is one processor, and no threading the mutex serves what I think is a subtly different purpose than normal. There are two main use cases I run into repeatedly.
1) The interrupt is a Producer and takes ownership of a free buffer and loads it with a packet of data. The interrupt/Producer may keep its ownership lock for a long time spanning multiple interrupt calls. The main function is the Consumer and takes ownership of a full buffer when it is ready to process it. The race condition rarely happens, but if the interrupt/Producer finishes with a packet and needs a new buffer, but they are all full, it will try to take the oldest buffer (this is a dropped packet event). If the main/Consumer started to read and process that oldest buffer at exactly the same time they would trample all over each other.
2) The interrupt is just a quick change or increment of something (like a counter). However, if we want to reset the counter or jump to some new value with a call from the main() code we don't want to try to write to the counter as it is changing. Here main actually does a blocking loop to obtain a lock, however I think it's almost impossible to have to actually wait here for more than two attempts. Once it has a lock, any calls to the counter interrupt will be skipped, but that's generally not a big deal for something like a counter. Then I update the counter value and unlock it so it can start incrementing again.
I realize these two samples are dumbed down a bit, but some version of these patterns occurs in many of the peripherals in every project I work on, and I'd like one piece of reusable code that can safely handle this across various embedded platforms. I included the C tag, because all of this is directly convertible to C code, and on some embedded compilers that's all that is available. So I'm trying to find a general method that is guaranteed to work in both C and C++.
struct ExampleCounter {
volatile long long int value;
IrqMutex mutex;
} exampleCounter;
struct ExampleBuffer {
volatile char data[256];
volatile size_t index;
IrqMutex mutex; // One mutex per buffer.
} exampleBuffers[2];
const volatile char * const REGISTER;
// This accessor shouldn't be created in an interrupt or a race condition can occur.
static IrqMutexAccessor myMutex(exampleCounter.mutex);
void __irqQuickFunction(void) {
// Obtain a lock, add the data then unlock all within one function call.
if (myMutex.try_lock()) {
exampleCounter.value++;
myMutex.unlock();
} else {
// If we failed to obtain a lock, we skipped this update this one time.
}
}
// These accessors shouldn't be created in an interrupt or a race condition can occur.
static IrqMutexAccessor myMutexes[2] = {
IrqMutexAccessor(exampleBuffers[0].mutex),
IrqMutexAccessor(exampleBuffers[1].mutex)
};
void __irqLongFunction(void) {
static size_t bufferIndex = 0;
// Check if we have a lock.
if (!myMutexes[bufferIndex].have_lock() and !myMutexes[bufferIndex].try_lock()) {
// If we can't get a lock try the other buffer
bufferIndex = (bufferIndex + 1) % 2;
// One buffer should always be available so the next line should always be successful.
if (!myMutexes[bufferIndex].try_lock()) return;
}
// ... at this point we know we have a lock ...
// Get data from the hardware and modify the buffer here.
const char c = *REGISTER;
exampleBuffers[bufferIndex].data[exampleBuffers[bufferIndex].index++] = c;
// We may keep the lock for multiple function calls until the end of packet.
static const char END_PACKET_SIGNAL = '\0';
if (c == END_PACKET_SIGNAL) {
// Unlock this buffer so it can be read from main.
myMutexes[bufferIndex].unlock();
// Switch to the other buffer for next time.
bufferIndex = (bufferIndex + 1) % 2;
}
}
int main(void) {
while (true) {
// Mutex for counter
static IrqMutexAccessor myCounterMutex(exampleCounter.mutex);
// Change counter value
if (EVERY_ONCE_IN_A_WHILE) {
// Skip any updates that occur while we are updating the counter.
while(!myCounterMutex.try_lock()) {
// Wait for the interrupt to release its lock.
}
// Set the counter to a new value.
exampleCounter.value = 500;
// Updates will start again as soon as we unlock it.
myCounterMutex.unlock();
}
// Mutexes for __irqLongFunction.
static IrqMutexAccessor myBufferMutexes[2] = {
IrqMutexAccessor(exampleBuffers[0].mutex),
IrqMutexAccessor(exampleBuffers[1].mutex)
};
// Process buffers from __irqLongFunction.
for (size_t i = 0; i < 2; i++) {
// Obtain a lock so we can read the data.
if (!myBufferMutexes[i].try_lock()) continue;
// Check that the buffer isn't empty.
if (exampleBuffers[i].index == 0) {
myBufferMutexes[i].unlock(); // Don't forget to unlock.
continue;
}
// ... read and do something with the data here ...
exampleBuffers[i].index = 0;
myBufferMutexes[i].unlock();
}
}
}
Also note that I used volatile on any variable that is read-by or written-by the interrupt routine (unless the variable was only accessed from the interrupt like the static bufferIndex value in __irqLongFunction). I've read that mutexes remove some of need for volatile in multithreaded code, but I don't think that applies here. Did I use the right amount of volatile? I used it on: ExampleBuffer[].data[256], ExampleBuffer[].index, and ExampleCounter.value.
I apologize for the long answer, but perhaps it is fitting for a long question.
To answer your first question, I would say that your implementation of IrqMutex is not safe. Let me try to explain where I see problems.
Function nextAccessor
std::sig_atomic_t nextAccessor(void) { return ++accessorIdEnum; }
This function has a race condition, because the increment operator is not atomic, even when it is applied to a sig_atomic_t. It involves 3 operations: reading the current value of accessorIdEnum, incrementing it, and writing the result back. If two IrqMutexAccessors are created at the same time, it's possible that they both get the same ID.
Function try_lock
The try_lock function also has a race condition. One thread (e.g. main) could go into the while loop, and then, before it takes ownership, another thread (e.g. an interrupt) can also go into the while loop and take ownership of the lock (returning true). Then the first thread can continue, moving on to owner = accessorId, and thus "also" take the lock. So two threads (or your main thread and an interrupt) can try_lock on an unowned mutex at the same time and both return true.
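For comparison only: on a toolchain that does provide <atomic> with a lock-free std::atomic<int> (an assumption that, as the question notes, does not hold on many embedded targets), the CHECK and SET can be fused into a single compare-and-swap. The class name AtomicIrqMutex and the kNoOwner marker are hypothetical. Unlike the original, this sketch returns false when the current owner calls try_lock again.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical "unowned" marker, standing in for SIG_ATOMIC_MIN.
constexpr int kNoOwner = 0;

class AtomicIrqMutex {
public:
    bool try_lock(int accessorId) {
        assert(accessorId != kNoOwner);
        int expected = kNoOwner;
        // CHECK and SET happen as one indivisible step, so an interrupt
        // cannot slip in between them.
        return owner_.compare_exchange_strong(expected, accessorId);
    }

    bool have_lock(int accessorId) const {
        return owner_.load() == accessorId;
    }

    bool unlock(int accessorId) {
        int expected = accessorId;
        // Only the current owner can release the lock.
        return owner_.compare_exchange_strong(expected, kNoOwner);
    }

private:
    std::atomic<int> owner_{kNoOwner};
};
```

Where no such atomics exist, disabling interrupts around the check and the set achieves the same indivisibility.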
Disabling interrupts by RAII
We can achieve some level of simplicity and encapsulation by using RAII for interrupt disabling, for example the following class:
class InterruptLock {
public:
InterruptLock() {
prevInterruptState = currentInterruptState();
disableInterrupts();
}
~InterruptLock() {
restoreInterrupts(prevInterruptState);
}
private:
int prevInterruptState; // Whatever type this should be for the platform
InterruptLock(const InterruptLock&); // Not copy-constructable
};
And I would recommend disabling interrupts to get the atomicity you need within the mutex implementation itself. For example something like:
bool try_lock(std::sig_atomic_t accessorId) {
InterruptLock lock;
if (owner == SIG_ATOMIC_MIN) {
owner = accessorId;
return true;
}
return false;
}
bool unlock(std::sig_atomic_t accessorId) {
InterruptLock lock;
if (owner == accessorId) {
owner = SIG_ATOMIC_MIN;
return true;
}
return false;
}
Depending on your platform, this might look different, but you get the idea.
As you said, this provides a platform to abstract away from the disabling and enabling interrupts in general code, and encapsulates it to this one class.
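To show why the lock saves and restores the previous state rather than blindly re-enabling interrupts, here is a hosted sketch with stand-in hooks. The global g_interruptsEnabled and the three hook bodies are assumptions for illustration only; on real hardware they would touch the CPU's interrupt-enable state:

```cpp
// Hosted stand-ins for the platform hooks (assumptions, not a real API).
static bool g_interruptsEnabled = true;
int currentInterruptState() { return g_interruptsEnabled ? 1 : 0; }
void disableInterrupts() { g_interruptsEnabled = false; }
void restoreInterrupts(int state) { g_interruptsEnabled = (state != 0); }

class InterruptLock {
public:
    InterruptLock() : prevInterruptState(currentInterruptState()) {
        disableInterrupts();
    }
    ~InterruptLock() { restoreInterrupts(prevInterruptState); }
private:
    int prevInterruptState;
    InterruptLock(const InterruptLock&);            // not copyable
    InterruptLock& operator=(const InterruptLock&); // not assignable
};

// Returns true if interrupts were still disabled after the inner lock
// had been released, i.e. nesting did not re-enable them too early.
bool nestedLocksStayDisabled() {
    InterruptLock outer;
    {
        InterruptLock inner;
    }
    return !g_interruptsEnabled;
}
```

With this, a function that takes the lock can safely be called from code that already holds it.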
Mutexes and Interrupts
Having said how I would consider implementing the mutex class, I would not actually use a mutex class for your use-cases. As you pointed out, mutexes don't really play well with interrupts, because an interrupt can't "block" on trying to acquire a mutex. For this reason, for code that directly exchanges data with an interrupt, I would instead strongly consider just directly disabling interrupts (for a very short time while the main "thread" touches the data).
So your counter might simply look like this:
volatile long long int exampleCounter;
void __irqQuickFunction(void) {
exampleCounter++;
}
...
// Change counter value
if (EVERY_ONCE_IN_A_WHILE) {
InterruptLock lock;
exampleCounter = 500;
}
In my mind, this is easier to read, easier to reason about, and won't "slip" when there's contention (ie miss timer beats).
Regarding the buffer use-case, I would strongly recommend against holding a lock for multiple interrupt cycles. A lock/mutex should be held for just the slightest moment required to "touch" a piece of memory - just long enough to read or write it. Get in, get out.
So this is how the buffering example might look:
struct ExampleBuffer {
char data[256];
} exampleBuffers[2];
ExampleBuffer* volatile bufferAwaitingConsumption = nullptr;
ExampleBuffer* volatile freeBuffer = &exampleBuffers[1];
const volatile char * const REGISTER;
void __irqLongFunction(void) {
static const char END_PACKET_SIGNAL = '\0';
static size_t index = 0;
static ExampleBuffer* receiveBuffer = &exampleBuffers[0];
// Get data from the hardware and modify the buffer here.
const char c = *REGISTER;
receiveBuffer->data[index++] = c;
// End of packet?
if (c == END_PACKET_SIGNAL) {
// Make the packet available to the consumer
bufferAwaitingConsumption = receiveBuffer;
// Move on to the next buffer
receiveBuffer = freeBuffer;
freeBuffer = nullptr;
index = 0;
}
}
int main(void) {
while (true) {
// Fetch packet from shared variable
ExampleBuffer* packet;
{
InterruptLock lock;
packet = bufferAwaitingConsumption;
bufferAwaitingConsumption = nullptr;
}
if (packet) {
// ... read and do something with the data here ...
// Once we're done with the buffer, we need to release it back to the producer
{
InterruptLock lock;
freeBuffer = packet;
}
}
}
}
This code is arguably easier to reason about, since there are only two memory locations shared between the interrupt and the main loop: one to pass packets from the interrupt to the main loop, and one to pass empty buffers back to the interrupt. We also only touch those variables under "lock", and only for the minimum time needed to "move" the value. (for simplicity I've skipped over the buffer overflow logic when the main loop takes too long to free the buffer).
It's true that in this case one may not even need the locks, since we're just reading and writing simple values, but the cost of disabling the interrupts is small, and the risk of making mistakes otherwise is not worth it in my opinion.
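If you do want to try the lock-free route, the "fetch and clear" step in the main loop is a single atomic exchange. A minimal sketch, assuming std::atomic on a pointer is lock-free on your platform; publish and takePacket are hypothetical helper names:

```cpp
#include <atomic>

struct ExampleBuffer {
    char data[256];
};

// Slot the producer publishes completed buffers into (nullptr when empty).
static std::atomic<ExampleBuffer*> bufferAwaitingConsumption(nullptr);

// Producer side: publish a filled buffer.
void publish(ExampleBuffer* filled) {
    bufferAwaitingConsumption.store(filled);
}

// Consumer side: read the pointer and clear the slot in one atomic step,
// so the producer can never publish between the read and the clear.
ExampleBuffer* takePacket() {
    return bufferAwaitingConsumption.exchange(nullptr);
}
```

I would still start with the InterruptLock version and only move to this if profiling shows the interrupt latency matters.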
Edit
As pointed out in the comments, the above solution was meant to only tackle the multithreading problem, and omitted overflow checking. Here is a more complete solution which should be robust under overflow conditions:
const size_t BUFFER_COUNT = 2;
struct ExampleBuffer {
char data[256];
ExampleBuffer* next;
} exampleBuffers[BUFFER_COUNT];
volatile size_t overflowCount = 0;
class BufferList {
public:
BufferList() : first(nullptr), last(nullptr) { }
// Atomic enqueue
void enqueue(ExampleBuffer* buffer) {
InterruptLock lock;
buffer->next = nullptr; // The new item is always the tail
if (last)
last->next = buffer;
else
first = buffer;
last = buffer; // Advance the tail in both cases
}
// Atomic dequeue (or returns null)
ExampleBuffer* dequeueOrNull() {
InterruptLock lock;
ExampleBuffer* result = first;
if (first) {
first = first->next;
if (!first)
last = nullptr;
}
return result;
}
private:
ExampleBuffer* first;
ExampleBuffer* last;
} freeBuffers, buffersAwaitingConsumption;
const volatile char * const REGISTER;
void __irqLongFunction(void) {
static const char END_PACKET_SIGNAL = '\0';
static size_t index = 0;
static ExampleBuffer* receiveBuffer = nullptr; // Taken from freeBuffers on first use, since main enqueues every buffer as free
// Recovery from overflow?
if (!receiveBuffer) {
// Try get another free buffer
receiveBuffer = freeBuffers.dequeueOrNull();
// Still no buffer?
if (!receiveBuffer) {
overflowCount++;
return;
}
}
// Get data from the hardware and modify the buffer here.
const char c = *REGISTER;
if (index < sizeof(receiveBuffer->data))
receiveBuffer->data[index++] = c;
// End of packet? (bytes that did not fit in the buffer were dropped above)
if (c == END_PACKET_SIGNAL) {
// Make the packet available to the consumer
buffersAwaitingConsumption.enqueue(receiveBuffer);
// Move on to the next free buffer
receiveBuffer = freeBuffers.dequeueOrNull();
index = 0;
}
}
size_t getAndResetOverflowCount() {
InterruptLock lock;
size_t result = overflowCount;
overflowCount = 0;
return result;
}
int main(void) {
// All buffers are free at the start
for (int i = 0; i < BUFFER_COUNT; i++)
freeBuffers.enqueue(&exampleBuffers[i]);
while (true) {
// Fetch packet from shared variable
ExampleBuffer* packet = buffersAwaitingConsumption.dequeueOrNull();
if (packet) {
// ... read and do something with the data here ...
// Once we're done with the buffer, we need to release it back to the producer
freeBuffers.enqueue(packet);
}
size_t overflowBytes = getAndResetOverflowCount();
if (overflowBytes) {
// ...
}
}
}
The key changes:
If the interrupt runs out of free buffers, it will recover
If the interrupt receives data while it doesn't have a receive buffer, it will communicate that to the main thread via getAndResetOverflowCount
If you keep getting buffer overflows, you can simply increase the buffer count
I've encapsulated the multithreaded access into a queue class implemented as a linked list (BufferList), which supports atomic dequeue and enqueue. The previous example also used queues, but of length 0-1 (either an item is enqueued or it isn't), and so the implementation of the queue was just a single variable. In the case of running out of free buffers, the receive queue could have 2 items, so I upgraded it to a proper queue rather than adding more shared variables.
If the interrupt is the producer and mainline code is the consumer, surely it's as simple as disabling the interrupt for the duration of the consume operation?
That's how I used to do it in my embedded microcontroller days.

notify thread about changes in variable (signals?)

I have main() and a thread in the same program.
There is a variable named "status" that can take several values.
When the variable changes, I need to notify the thread (the thread can't wait on the status variable, because it is already busy with a continuous task).
Is there an easy way to do so, similar to interrupts? How about signals?
the function inside the main:
int main()
{
char *status;
...
...
while (1)
{
switch (status)
{
case: status1 ...notify the thread
case: status2 ...notify the thread
case: status3 ...notify the thread
}
}
}
If someone could give me an example, that would be great!
Thanks!
Since you're already using the pthread library, you can use condition variables to tell the thread that there is data ready for processing. Take a look at this StackOverflow question for more information.
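As a rough illustration, here is the idea using std::condition_variable from standard C++ rather than the pthread calls themselves (the pattern is the same). StatusNotifier is a made-up helper, and the version counter is my own addition so the waiter can tell whether anything changed since it last looked:

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

class StatusNotifier { // hypothetical helper
public:
    StatusNotifier() : status(0), version(0) {}

    // Called by main() whenever the status changes.
    void set(int newStatus) {
        {
            std::lock_guard<std::mutex> guard(m);
            status = newStatus;
            ++version;
        }
        cv.notify_all();
    }

    // Called by the worker: waits at most `timeout` for a change newer than
    // `seenVersion`. On success it updates seenVersion and out and returns
    // true; on timeout it returns false and leaves both untouched.
    bool waitForChange(unsigned& seenVersion, int& out,
                       std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> guard(m);
        if (!cv.wait_for(guard, timeout,
                         [&] { return version != seenVersion; }))
            return false;
        seenVersion = version;
        out = status;
        return true;
    }

private:
    std::mutex m;
    std::condition_variable cv;
    int status;
    unsigned version;
};
```

The worker can call waitForChange with a short timeout between chunks of its regular work, so it never blocks for long.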
I understand that you do not want to wait indefinitely for this notification; however, notifying a thread in C++ is cooperative. You cannot just pause a thread, fiddle with its memory, and resume it.
Therefore, the first thing you have to understand is that the thread which has to process the signal/action you want to send must be willing to do so; in other words, it must explicitly check for the signal at some point.
There are multiple ways for a thread to check for a signal:
condition variable: they require waiting for the signal (which might be undesirable) but that wait can be bounded by a duration
action queue (aka channel): you create a queue of signals/actions and every so often the target thread checks for something to do; if there is nothing, it just goes on doing whatever it has to do; if there is something, you have to decide whether it should process everything or only the first N items. Beware of overflowing the queue.
just check the status variable directly every so often: it does not tell you how many times it changed (unless it keeps a history, but then we are back to the queue), but it allows you to amend your ways.
Given your requirements, I would think that the queue is probably the best idea among those three.
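A bare-bones version of such a queue could look like the sketch below. ActionQueue is a name I made up, actions are plain ints for simplicity, and the fixed capacity is there to address the overflow warning above:

```cpp
#include <cstddef>
#include <deque>
#include <mutex>

class ActionQueue { // hypothetical helper
public:
    explicit ActionQueue(std::size_t maxSize) : capacity(maxSize) {}

    // Producer (main): returns false instead of growing without bound.
    bool push(int action) {
        std::lock_guard<std::mutex> guard(m);
        if (items.size() >= capacity)
            return false;
        items.push_back(action);
        return true;
    }

    // Consumer (worker): never blocks; returns false when nothing is queued.
    bool tryPop(int& action) {
        std::lock_guard<std::mutex> guard(m);
        if (items.empty())
            return false;
        action = items.front();
        items.pop_front();
        return true;
    }

private:
    std::mutex m;
    std::deque<int> items;
    std::size_t capacity;
};
```

The worker calls tryPop whenever it reaches a convenient point in its own work and carries on immediately if it returns false.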
Maybe this example will be helpful to you.
#include <windows.h>
#include <atomic>
DWORD WINAPI sampleThread(LPVOID argument);
int main()
{
// Shared flag; atomic so the write in main is reliably seen by the thread
std::atomic<bool> status(false);
CreateThread(NULL, 0, sampleThread, &status, 0, NULL);
while (1)
{
//.............
status = true; // trigger thread
// ...
}
return 0;
}
DWORD WINAPI sampleThread(LPVOID argument)
{
std::atomic<bool>* syncPtr = static_cast<std::atomic<bool>*>(argument);
while (1)
{
if (!*syncPtr)
{
// do something
}
else
{
// do something else
}
}
}

c++ winapi threads

These days I'm trying to learn more things about threads in windows. I thought about making this practical application:
Let's say there are several threads started when a button "Start" is pressed. Assume these threads are intensive (they keep running / have always something to work on).
This app would also have a "Stop" button. When this button is pressed, all the threads should close in a nice way: free resources, abandon work, and return to the state they were in before the "Start" button was pressed.
Another requirement of the app is that the functions run by the threads shouldn't contain any instruction checking whether the "Stop" button was pressed. The function running in the thread shouldn't care about the stop button.
Language: C++
OS: Windows
Problems:
WrapperFunc(function, param)
{
// what to write here ?
// if i write this:
function(param);
// i cannot stop the function from executing
}
How should I construct the wrapper function so that I can stop the thread properly?
( without using TerminateThread or some other functions )
What if the programmer allocates some memory dynamically? How can I free it before closing
the thread?( note that when I press "Stop button" the thread is still processing data)
I thought about overloading the new operator or just imposing the usage of a predefined
function to be used when allocating memory dynamically. This, however, means
that the programmer who uses this api is constrained and it's not what I want.
Thank you
Edit: Skeleton to describe the functionality I'd like to achieve.
struct wrapper_data
{
void* (*function)(LPVOID);
LPVOID *params;
};
/*
this function should make sure that the threads stop properly
( free memory allocated dynamically etc )
*/
void* WrapperFunc(LPVOID *arg)
{
wrapper_data *data = (wrapper_data*) arg;
// what to write here ?
// if i write this:
data->function(data->params);
// i cannot stop the function from executing
delete data;
}
// will have exactly the same arguments as CreateThread
MyCreateThread(..., function, params, ...)
{
// this should create a thread that runs the wrapper function
wrapper_data *data = new wrapper_data;
data->function = function;
data->params = params;
CreateThread(..., WrapperFunc, (LPVOID) data, ...);
}
thread_function(LPVOID *data)
{
while(1)
{
//do stuff
}
}
// as you can see I want it to be completely invisible
// to the programmer who uses this
MyCreateThread(..., thread_function, (LPVOID) params,...);
One solution is to have some kind of signal that tells the threads to stop working. Often this can be a global boolean variable that is normally false, but when set to true it tells the threads to stop. As for the cleaning up, do it when the thread's main loop is done, before returning from the thread.
I.e. something like this:
volatile bool gStopThreads = false; // Defaults to false, threads should not stop
void thread_function()
{
while (!gStopThreads)
{
// Do some stuff
}
// All processing done, clean up after myself here
}
As for the cleaning up bit, if you keep the data inside a struct or a class, you can forcibly kill them from outside the threads and just either delete the instances if you allocated them dynamically or let the system handle it if created e.g. on the stack or as global objects. Of course, all data your thread allocates (including files, sockets etc.) must be placed in this structure or class.
A way of keeping the stopping functionality in the wrapper, is to have the actual main loop in the wrapper, together with the check for the stop-signal. Then in the main loop just call a doStuff-like function that does the actual processing. However, if it contains operations that might take time, you end up with the first problem again.
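A portable sketch of that wrapper idea, using std::atomic<bool> for the stop flag instead of volatile. Job, StepFunction and runUntilStopped are names I made up for illustration; the step function is assumed to do one short unit of work and return:

```cpp
#include <atomic>

typedef void (*StepFunction)(void* params); // one short unit of work

struct Job { // hypothetical structure
    std::atomic<bool> stop; // set to true by whoever wants the loop to end
    StepFunction step;
    void* params;
    Job(StepFunction s, void* p) : stop(false), step(s), params(p) {}
};

// The wrapper owns the loop and the stop check; the step function never
// looks at the flag. Returns the number of steps that were executed.
unsigned long runUntilStopped(Job& job) {
    unsigned long steps = 0;
    while (!job.stop.load()) {
        job.step(job.params);
        ++steps;
    }
    return steps;
}
```

The thread entry point would call runUntilStopped and do its cleanup after it returns. The limitation mentioned above still applies: the flag is only checked between steps, so a slow step delays the stop.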
See my answer to this similar question:
How do I guarantee fast shutdown of my win32 app?
Basically, you can use QueueUserAPC to queue a proc which throws an exception. The exception should bubble all the way up to a 'catch' in your thread proc.
As long as any libraries you're using are reasonably exception-aware and use RAII, this works remarkably well. I haven't successfully got this working with boost::threads, however, as it doesn't put suspended threads into an alertable wait state, so QueueUserAPC can't wake them.
If you don't want the "programmer" of the function that the thread executes to deal with the "stop" event, make the thread run a function of your own that handles the "stop" event and, as long as that event isn't signaled, calls the "programmer" function.
In other words, the "while(!event)" loop lives in a function that calls the "job" function.
Code Sample.
typedef void (*JobFunction)(LPVOID params); // The prototype of the function to execute inside the thread
struct structFunctionParams
{
int iCounter;
structFunctionParams()
{
iCounter = 0;
}
};
struct structJobParams
{
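volatile bool bStop; // Written by main, read by the worker; volatile so the loop re-reads it on every iteration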
JobFunction pFunction;
LPVOID pFunctionParams;
structJobParams()
{
bStop = false;
pFunction = NULL;
pFunctionParams = NULL;
}
};
DWORD WINAPI ThreadProcessJob(IN LPVOID pParams)
{
structJobParams* pJobParams = (structJobParams*)pParams;
while(!pJobParams->bStop)
{
// Execute the "programmer" function
pJobParams->pFunction(pJobParams->pFunctionParams);
}
return 0;
}
void ThreadFunction(LPVOID pParams)
{
// Do Something....
((structFunctionParams*)pParams)->iCounter ++;
}
int _tmain(int argc, _TCHAR* argv[])
{
structFunctionParams stFunctionParams;
structJobParams stJobParams;
stJobParams.pFunction = &ThreadFunction;
stJobParams.pFunctionParams = &stFunctionParams;
DWORD dwIdThread = 0;
HANDLE hThread = CreateThread(
NULL,
0,
ThreadProcessJob,
(LPVOID) &stJobParams, 0, &dwIdThread);
if(hThread)
{
// Give it 5 seconds to work
Sleep(5000);
stJobParams.bStop = true; // Signal to Stop
WaitForSingleObject(hThread, INFINITE); // Wait to finish
CloseHandle(hThread);
}
}