ARM real-time - How to avoid race conditions in interrupt service routines - concurrency

In an ARM Cortex-M3 bare-metal real-time microcontroller application, I need to ensure that the function dispatch() is called exactly once per interrupt. dispatch() uses global data and is hence not reentrant, so I also need to ensure that it does not get called from an ISR if it is already in the middle of running.
In pseudocode, I can achieve these goals like this:
atomic bool dispatcher_running;
atomic uint dispatch_count;

ISR 0:
    flag0 = true;
    dispatch_count++;
    if (!dispatcher_running)
        dispatch();

ISR 1:
    flag1 = true;
    dispatch_count++;
    if (!dispatcher_running)
        dispatch();

...

void dispatch() {
    if (dispatcher_running)    // Line A
        return;
    dispatcher_running = true; // Line B
    while (dispatch_count--) {
        // Actual work goes here
    }
    dispatcher_running = false;
}
The challenge, however, is that dispatch_count and dispatcher_running need to be atomic or barriered. Otherwise, race conditions could result. Even pure atomic isn't enough: An ISR could call dispatch(), which passes the check at Line A, and, before it gets to Line B, another ISR calls dispatch(); this would result in two instances of dispatch running simultaneously.
My questions are:
What type of atomic or barriers do I need to make this work?
Which ARM instructions (e.g. DMB) can be used to achieve them?
Is there a portable std C++ way to do something similar? I don't intend to use e.g. std::atomic, as this is bare metal. But I would like to have a similar C++ lib that does the equivalent, as I am prototyping parts of the application on a PC.
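For reference, this is roughly the shape of the PC prototype I have in mind, using std::atomic only as a stand-in for whatever the bare-metal equivalent ends up being, and with the default sequentially consistent ordering as a conservative starting point (the structure and the residual-window comment are my own notes, not a claim about the right ARM mapping):

#include <atomic>

std::atomic<bool>     dispatcher_running{false};
std::atomic<unsigned> dispatch_count{0};

void dispatch()
{
    // exchange() folds the check at Line A and the write at Line B into a
    // single atomic read-modify-write, so two callers can never both claim
    // the dispatcher.
    if (dispatcher_running.exchange(true))
        return;  // another instance is already draining the work

    // Only the caller that claimed dispatcher_running decrements the count,
    // so load-then-sub cannot underflow even while ISRs keep incrementing it.
    while (dispatch_count.load() != 0) {
        dispatch_count.fetch_sub(1);
        // Actual work goes here
    }

    // Note: work posted between the final check above and this store is only
    // picked up on the next interrupt; that window still needs thought.
    dispatcher_running.store(false);
}

On the Cortex-M3 I would expect the exchange to become an LDREX/STREX loop or a brief PRIMASK interrupt disable, but that mapping is exactly the part I'm asking about.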

Related

What lock-free primitives do people actually use to do lock-free audio processing in c++?

Anyone who has done a bit of low level audio programming has been cautioned about locking the audio thread. Here's a nice article on the subject.
But it's very unclear to me how to actually go about making a multithreaded audio processing application in c++ while strictly following this rule and ensuring thread safety. Assume that you're building something simple like a visualizer. You need to hand off audio data to a UI thread to process it and display it on a regular interval.
My first attempt at this would be to ping-pong between two buffers. buffer*_write_state is a boolean type assumed to be atomic and lock-free. buffer* is some kind of buffer with no expectation of being thread safe on its own and with some means of handling the case where one thread gets called at an insufficient rate (I don't mean to get into the complications of that here). For a generic-looking boolean type, the implementation looks like this:
// Write thread.
if (buffer1_write_state) {
    buffer1.write(data);
    if (buffer2_write_state) {
        buffer1_write_state = false;
    }
} else {
    buffer2.write(data);
    if (buffer1_write_state) {
        buffer2_write_state = false;
    }
}

// Read thread.
if (buffer1_write_state) {
    data = buffer2.read();
    buffer2.clear();
    buffer2_write_state = true;
} else if (buffer2_write_state) {
    data = buffer1.read();
    buffer1.clear();
    buffer1_write_state = true;
}
I've implemented this using std::atomic_flag as my boolean type and, as far as I can tell with my thread sanitizer, it is thread safe. std::atomic_flag is guaranteed to be lock-free by the standard. The point that confuses me is that to even do this, I need std::atomic_flag's test() function, which doesn't exist prior to C++20. The available, mutating test_and_set() and clear() functions don't do the job. The well-known alternative std::atomic is not guaranteed to be lock-free by the standard. I've heard that in most cases it isn't.
I've read a few threads that caution people against rolling their own attempts at a lock-free structure, and I'm happy to abide by that tip, but how do experts even build these things if the basic tools aren't guaranteed to be lock-free?
I've heard that in most cases it isn't.
You heard wrong.
std::atomic<bool> is lock_free on all "normal" C++ implementations, e.g. for ARM, x86, PowerPC, etc. Use it if atomic_flag's restrictive API sucks too much. Or std::atomic<int>, also pretty universally lock_free on targets that have lock-free anything.
(The only plausible exception would be an 8-bit machine that can't load/store/RMW a pair of bytes.)
Note that if you're targeting ARM, you should enable compiler options to let it know you don't care about ARM CPUs too old to support atomic operations. Otherwise, the compiler has to make slow code that uses library function calls in case it runs on ARMv4 or something. See std::atomic<bool> lock-free inconsistency on ARM (raspberry pi 3)
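If you do switch from atomic_flag to std::atomic<bool>, you can also have the build itself verify lock-freedom rather than relying on hearsay. A minimal sketch, reusing the write-state flags from the question (the C++17 is_always_lock_free check is the compile-time one; is_lock_free() is the C++11 runtime fallback):

#include <atomic>
#include <cassert>

std::atomic<bool> buffer1_write_state{true};
std::atomic<bool> buffer2_write_state{false};

// C++17: fail the build if atomic<bool> could ever fall back to a lock.
static_assert(std::atomic<bool>::is_always_lock_free,
              "atomic<bool> is not lock-free on this target");

void check_at_startup() {
    // C++11 fallback: a runtime check on the actual object.
    assert(buffer1_write_state.is_lock_free());
}

// The plain read that atomic_flag lacks before C++20 is simply load():
// if (buffer1_write_state.load(std::memory_order_acquire)) { ... }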

Changing a variable in an already running function C++

I am creating my first nacl app and am encountering an issue.
I need to stop a running while loop by changing its condition.
My code kind of looks like this:
int flag = 1;

static void Test1() {
    while (flag) {
        sleep(2);
    }
}
I want to change flag (flag = 0) in a safe way by calling another function to stop the infinite loop. How can I do this in C++?
You can use atomic_int for a variable that can be safely changed:
std::atomic_int flag{1};

static void Test1() {
    while (flag) {
        sleep(2);
    }
}
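To actually stop the loop, another thread just assigns to the same atomic. A minimal sketch of the whole thing; stop_test() and the std::thread driver are only illustrative, not part of the original code:

#include <atomic>
#include <thread>
#include <unistd.h>  // sleep()

std::atomic_int flag{1};

static void Test1() {
    while (flag) {   // atomic load on every iteration
        sleep(2);
    }
}

// Called from any other thread to break the loop.
static void stop_test() {
    flag = 0;        // atomic store; Test1 sees it on its next check
}

int main() {
    std::thread t(Test1);
    sleep(5);
    stop_test();
    t.join();        // Test1 returns within one sleep period
}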
You need to use a lock to make sure your writes are safe and won't corrupt the value. Your threading library should provide you with locks. If your application isn't multithreaded, don't even worry about adding in this protection.
An example using the pthread POSIX library (see https://docs.oracle.com/cd/E19683-01/806-6867/sync-12/index.html for a more intricate example):
#include <pthread.h>
pthread_mutex_t g_flag_lock = PTHREAD_MUTEX_INITIALIZER;
int g_flag;

void change_flag(int value) {
    pthread_mutex_lock(&g_flag_lock);
    g_flag = value;
    pthread_mutex_unlock(&g_flag_lock);
}
Generally speaking, you only need to lock when writing a value. Reading doesn't usually create issues (I can think of one instance in my professional career that I locked on a read because something funky was happening).
Essentially, pthread_mutex_lock(&g_flag_lock); checks to make sure no other thread has currently locked g_flag_lock. If one has, it waits until that thread unlocks it again, and then snags it for itself.
I should also note that it isn't wise to haphazardly use locks. You'll find yourself in a deadlock situation. When writing multithreaded applications, you really need to think about the architecture and the timing.
I would assume that the std::atomic types simply abstract this pattern. I can't say for sure though.

Is mutex mandatory to access extern variable from a different thread?

I am developing an application in Qt/C++. At some point, there are two threads: one is the UI thread and the other one is a background thread. I have to do some operation from the background thread based on the value of an extern variable of type bool. I am setting this value by clicking a button on the UI.
header.cpp
extern bool globalVar;
mainWindow.cpp
//main ui thread on button click
void setValue(bool val) {
    globalVar = val;
}
backgroundThread.cpp
while (1) {
    if (globalVar)
        // do some operation
    else
        // do some other operation
}
Here, writing to globalVar happens only when the user clicks the button whereas reading happens continuously.
So my question is :
In a situation like the one above, is a mutex mandatory?
If a read and a write happen at the same time, does this cause the application to crash?
If a read and a write happen at the same time, can globalVar end up with some value other than true or false?
Finally, does the OS provide any kind of locking mechanism that prevents different threads from reading and writing the same memory location at the same time?
The loop
while (1) {
    if (globalVar)
        // do some operation
    else
        // do some other operation
}
is busy waiting, which is extremely wasteful. Thus, you're probably better off with some classic synchronization that will wake the background thread (mostly) when there is something to be done. You should consider adapting this example of std::condition_variable.
Say you start with:
#include <thread>
#include <mutex>
#include <condition_variable>
std::mutex m;
std::condition_variable cv;
bool ready = false;
Your worker thread can then be something like this:
void worker_thread()
{
    while (true)
    {
        // Wait until main() sends data
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, []{ return ready; });
        ready = false;
        lk.unlock();

        // ... do the background work here ...
    }
}
The notifying thread should do something like this:
{
    std::lock_guard<std::mutex> lk(m);
    ready = true;
}
cv.notify_one();
Since it is just a single plain bool, I'd say a mutex is overkill; you should just go for an atomic integer instead. Loads and stores of such a small atomic are single instructions on mainstream CPUs, so no worries there, and it will be lock-free, which is always better if possible.
If it is something more complex, then by all means go for a mutex.
It won't crash from that alone, but you can get data corruption, which may crash the application.
The system will not manage that stuff for you, you do it manually, just make sure all access to the data goes through the mutex.
Edit:
Since you specify a number of times that you don't want a complex solution, you may opt for simply using a mutex instead of the bool. There is no need to protect the bool with a mutex, since you can use the mutex as a bool, and yes, you could go with an atomic, but that's what the mutex already does (plus some extra functionality in the case of recursive mutexes).
It also matters what your exact workload is, since your example doesn't make a lot of sense in practice. It would be helpful to know what those operations actually are.
So in your UI thread you could simply do val ? mutex.lock() : mutex.unlock(), and in your secondary thread you could use if (mutex.tryLock()) { doStuff; mutex.unlock(); } else { doOtherStuff; }. Now if the operation in the secondary thread takes too long and you happen to be changing the lock in the main thread, that will block the main thread until the secondary thread unlocks. You could use tryLock(timeout) in the main thread, depending on what you prefer: lock() will block until it succeeds, while tryLock(timeout) limits the blocking but the lock may fail. Also, take care not to unlock from a thread other than the one you locked with, and not to unlock an already unlocked mutex.
Depending on what you are actually doing, maybe an asynchronous event driven approach would be more appropriate. Do you really need that while(1)? How frequently do you perform those operations?
In a situation like the one above, is a mutex necessary?
A mutex is one tool that will work. What you actually need are three things:
a means of ensuring an atomic update (a bool will give you this as it's mandated to be an integral type by the standard)
a means of ensuring that the effects of a write made by one thread is actually visible in the other thread. This may sound counter-intuitive but the c++ memory model is single-threaded and optimisations (software and hardware) do not need to consider cross-thread communication, and...
a means of preventing the compiler (and CPU!!) from re-ordering the reads and writes.
The answer to the implied question is 'yes': you will need something that does all of these things (see below).
If a read and a write happen at the same time, does this cause the application to crash?
Not when it's a bool, but the program won't behave as you expect. In fact, because the program is now exhibiting undefined behaviour, you can no longer reason about its behaviour at all.
If a read and a write happen at the same time, is globalVar going to have some value other than true or false?
Not in this case, because it's an intrinsic (atomic) type.
Does the OS provide any kind of locking mechanism to prevent different threads from reading and writing the same memory location at the same time?
Not unless you specify one.
Your options are:
std::atomic<bool>
std::mutex
std::atomic_thread_fence
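The first of those options is usually all this case needs. A minimal sketch under the question's setup; doSomeOperation/doSomeOtherOperation are stand-ins for whatever the background thread really does, and the release/acquire orders are one reasonable choice, not the only one:

#include <atomic>

void doSomeOperation() {}        // stand-ins for the real work
void doSomeOtherOperation() {}

// Replaces "extern bool globalVar;"
std::atomic<bool> globalVar{false};

// UI thread, on button click.
void setValue(bool val) {
    globalVar.store(val, std::memory_order_release);
}

// Background thread loop.
void backgroundLoop() {
    while (true) {
        if (globalVar.load(std::memory_order_acquire))
            doSomeOperation();
        else
            doSomeOtherOperation();
    }
}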
Realistically speaking, as long as you use an integer type (not bool), make it volatile, and keep it inside its own cache line by properly aligning its storage, you don't need to do anything special at all.
In a situation like the one above, is a mutex necessary?
Only if you want to keep the value of the variable synchronized with other state.
If a read and a write happen at the same time, does this cause the application to crash?
According to the C++ standard, it's undefined behavior, so anything can happen: e.g. your application might not crash, but its state might be subtly corrupted. In real life, though, compilers often offer some sane implementation-defined behavior and you're fine unless your platform is really weird. Anything commonplace, like 32- and 64-bit Intel, PPC and ARM, will be fine.
If a read and a write happen at the same time, is globalVar going to have some value other than true or false?
globalVar can only have these two values, so it makes no sense to speak of any other values unless you're talking about its binary representation. Yes, it could happen that the binary representation is incorrect and not what the compiler would expect. That's why you shouldn't use a bool but a uint8_t instead.
I wouldn't love to see such a flag in a code review, but if a uint8_t flag is the simplest solution to whatever problem you're solving, I say go for it. The if (globalVar) test will treat zero as false and anything else as true, so temporary "gibberish" is OK and won't have any odd effects in practice. According to the standard, you'll be facing undefined behavior, of course.
Does the OS provide any kind of locking mechanism to prevent different threads from reading and writing the same memory location at the same time?
It's not the OS's job to do that.
Speaking of practice, though: on any reasonable platform, the use of a std::atomic_bool will have no overhead over the use of a naked uint8_t, so just use that and be done.

How to control thread lifetime using C++11 atomics

Following on from this question, I'd like to know what's the recommended approach we should take to replace the very common pattern we have in legacy code.
We have plenty of places where a primary thread is spawning one or more background worker threads and periodically pumping out some work for them to do, using a suitably synchronized queue. So the general pattern for a worker thread will look like this:
There will be an event HANDLE and a bool defined somewhere (usually as member variables) -
HANDLE hDoSomething = CreateEvent(NULL, FALSE, FALSE, NULL);
volatile bool bEndThread = false;
Then the worker thread function waits for the event to be signalled before doing work, but checks for a termination request inside the main loop -
unsigned int ThreadFunc(void *pParam)
{
    // typical legacy implementation of a worker thread
    while (true)
    {
        // wait for event
        WaitForSingleObject(hDoSomething, INFINITE);

        // check for termination request
        if (bEndThread) break;

        // ... do background work ...
    }

    // normal termination
    return 0;
}
The primary thread can then give some work to the background thread like this -
// ... put some work on a synchronized queue ...
// pulse worker thread to do the work
SetEvent(hDoSomething);
And it can finally terminate the worker thread like so -
// to terminate the worker thread
bEndThread = true;
SetEvent(hDoSomething);
// wait for worker thread to die
WaitForSingleObject(hWorkerThreadHandle, dwSomeSuitableTimeOut);
In some cases, we've used two events (one for work, one for termination) and WaitForMultipleObjects instead, but the general pattern is the same.
So, looking at replacing the volatile bool with a C++11 standard equivalent, is it as simple as replacing this
volatile bool bEndThread = false;
with this?
std::atomic<bool> bEndThread{false};
I'm sure it will work, but it doesn't seem enough. Also, it doesn't affect the case where we use two events and no bool.
Note, I'm not intending to replace all this legacy stuff with the PPL and/or Concurrency Runtime equivalents because although we use these for new development, the legacy codebase is end-of-life and just needs to be compatible with the latest development tools (the original question I linked above shows where my concern arose).
Can someone give me a rough example of C++11 standard code we could use for this simple thread management pattern to rewrite our legacy code without too much refactoring?
If it ain't broke, don't fix it (especially if this is a legacy code base).
VS-style volatile will be around for a few more years; given that MFC isn't dead, this won't be dead any time soon. A cursory Google search says you can control it with /volatile:ms.
Atomics might do the job of volatile, and for something as simple as a flag or counter there should be little performance overhead.
Be aware that many Windows native functions have different performance characteristics than their C++11 equivalents. For example, Windows timer queues and the multimedia timers have precision that is not possible to achieve with C++11: std::this_thread::sleep_for(std::chrono::milliseconds(5)) will typically sleep for about 15 ms (not 5 or 6). This can be worked around with a call to timeBeginPeriod. Another example is that unlocking a condition variable can be slow to respond. The interfaces to fix these aren't exposed through C++11 on Windows.
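For completeness, here is a rough C++11-only sketch of the same pattern: a condition_variable in place of the event, std::atomic<bool> in place of the volatile flag, and a std::queue standing in for the synchronized queue. It shows the shape of the rewrite, not a drop-in replacement for the legacy code:

#include <atomic>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

std::mutex              m;
std::condition_variable cv;
std::queue<int>         work;              // whatever the real work items are
std::atomic<bool>       bEndThread{false};

void ThreadFunc() {
    for (;;) {
        std::unique_lock<std::mutex> lk(m);
        // Equivalent of WaitForSingleObject: sleep until there is work
        // or a termination request.
        cv.wait(lk, [] { return bEndThread || !work.empty(); });
        if (bEndThread)
            break;
        int item = work.front();
        work.pop();
        lk.unlock();
        // ... do background work with item ...
        (void)item;
    }
}

int main() {
    std::thread worker(ThreadFunc);

    {   // Equivalent of queueing work + SetEvent.
        std::lock_guard<std::mutex> lk(m);
        work.push(42);
    }
    cv.notify_one();

    {   // Equivalent of bEndThread = true; SetEvent(...).
        // Setting the flag under the mutex avoids a lost wakeup.
        std::lock_guard<std::mutex> lk(m);
        bEndThread = true;
    }
    cv.notify_one();

    worker.join();                          // wait for the worker to die
    return 0;
}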

c++ Thread safe accumulator

I need to monitor internal traffic on a one-minute interval, so I decided to do something like this:
struct Flow {
    void send(Packet pck);
    static uint accumulator;
};

// Only a single thread calls send.
void Flow::send(Packet pck) {
    accumulator += pck.size();
    doWork();
}

// Only a single thread calls monitor. **Not the same thread that calls send!**
void Monitor::monitor() {
    // Start monitoring
    Flow::accumulator = 0;
    sleep(60);
    rate = Flow::accumulator / 60;
}
Without using atomic, is there a risk that the reset to 0 will not happen correctly?
My concern is that even atomic will not guarantee the reset, because if monitor sets it to 0 at the same time that an accumulation using the old value is in flight, the newly accumulated value will be based on the old value and not on the reset value.
In addition, I'm concerned about the atomic penalty: send is called for every packet.
Volatile doesn't help with multi-threading. You need to prevent simultaneous updates to the value of accumulator and updates at the same time that another thread is reading the value. If you have C++11 you can make accumulator atomic: std::atomic<uint> accumulator; Otherwise, you need to lock a mutex around all accesses to its value.
volatile is neither necessary nor sufficient for sharing data between threads, so don't use it.
If it might be accessed by more than one thread, then you must either:
make the accesses atomic, using the C++11 atomics library, or compiler-specific language extensions if that's not available, or
guard it with a mutex or similar lock, using the C++11 threading library, or some other library (Boost.Thread, POSIX threads, Intel TBB, Windows API, or numerous others) if that's not available.
Otherwise, you will have a data race, giving undefined behaviour.
If only one thread can access it, then you don't need to do anything special.
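For this particular accumulator, an atomic exchange also addresses the reset concern from the question: exchange(0) reads the old total and zeroes it in one indivisible step, so no increment can be lost between the read and the reset. A sketch under assumed names (Packet, the 1500-byte size and the sleep are placeholders):

#include <atomic>
#include <chrono>
#include <thread>

struct Packet {
    unsigned size() const { return 1500; }   // placeholder
};

std::atomic<unsigned> accumulator{0};

// Called for every packet, on a different thread from the monitor.
// Relaxed ordering is enough for a standalone counter.
void send(const Packet& pck) {
    accumulator.fetch_add(pck.size(), std::memory_order_relaxed);
}

// Monitor thread: one sampling interval.
void monitor() {
    std::this_thread::sleep_for(std::chrono::seconds(60));
    // Atomically grab the total so far and reset it to 0; increments that
    // arrive after this point count towards the next interval.
    unsigned total = accumulator.exchange(0, std::memory_order_relaxed);
    unsigned rate  = total / 60;
    (void)rate;  // report/record the rate here
}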