C++ Blocking Queue Segfault w/ Boost

I had a need for a Blocking Queue in C++ with timeout-capable offer(). The queue is intended for multiple producers, one consumer. Back when I was implementing, I didn't find any good existing queues that fit this need, so I coded it myself.
I'm seeing segfaults come out of the take() method on the queue, but they are intermittent. I've been looking over the code for issues but I'm not seeing anything that looks problematic.
I'm wondering if:
- there is an existing library that does this reliably that I should use (Boost or header-only preferred), or
- anyone sees any obvious flaw in my code that I need to fix.
Here is the header:
class BlockingQueue
{
public:
    BlockingQueue(unsigned int capacity) : capacity(capacity) { };
    bool offer(const MyType & myType, unsigned int timeoutMillis);
    MyType take();
    void put(const MyType & myType);
    unsigned int getCapacity();
    unsigned int getCount();
private:
    std::deque<MyType> queue;
    unsigned int capacity;
};
And the relevant implementations:
boost::condition_variable cond;
boost::mutex mut;

bool BlockingQueue::offer(const MyType & myType, unsigned int timeoutMillis)
{
    Timer timer;
    // boost::unique_lock is a scoped lock - its destructor will call unlock().
    // So no need for us to make that call here.
    boost::unique_lock<boost::mutex> lock(mut);
    // We use a while loop here because the monitor may have woken up because
    // another producer did a PulseAll. In that case, the queue may not have
    // room, so we need to re-check and re-wait if that is the case.
    // We use an external stopwatch to stop the madness if we have taken too long.
    while (queue.size() >= this->capacity)
    {
        int monitorTimeout = timeoutMillis - ((unsigned int) timer.getElapsedMilliSeconds());
        if (monitorTimeout <= 0)
        {
            return false;
        }
        if (!cond.timed_wait(lock, boost::posix_time::milliseconds(timeoutMillis)))
        {
            return false;
        }
    }
    cond.notify_all();
    queue.push_back(myType);
    return true;
}
void BlockingQueue::put(const MyType & myType)
{
    // boost::unique_lock is a scoped lock - its destructor will call unlock().
    // So no need for us to make that call here.
    boost::unique_lock<boost::mutex> lock(mut);
    // We use a while loop here because the monitor may have woken up because
    // another producer did a PulseAll. In that case, the queue may not have
    // room, so we need to re-check and re-wait if that is the case.
    // We use an external stopwatch to stop the madness if we have taken too long.
    while (queue.size() >= this->capacity)
    {
        cond.wait(lock);
    }
    cond.notify_all();
    queue.push_back(myType);
}
MyType BlockingQueue::take()
{
    // boost::unique_lock is a scoped lock - its destructor will call unlock().
    // So no need for us to make that call here.
    boost::unique_lock<boost::mutex> lock(mut);
    while (queue.size() == 0)
    {
        cond.wait(lock);
    }
    cond.notify_one();
    MyType myType = this->queue.front();
    this->queue.pop_front();
    return myType;
}

unsigned int BlockingQueue::getCapacity()
{
    return this->capacity;
}

unsigned int BlockingQueue::getCount()
{
    return this->queue.size();
}
And yes, I didn't implement the class using templates - that is next on the list :)
Any help is greatly appreciated. Threading issues can be really hard to pin down.
-Ben

Why are cond and mut globals? I would expect them to be members of your BlockingQueue object. I don't know what else is touching those things, but there may be an issue there.
I too have implemented a ThreadSafeQueue as part of a larger project:
https://github.com/cdesjardins/QueuePtr/blob/master/include/ThreadSafeQueue.h
It is a similar concept to yours, except that the enqueue (aka offer) functions are non-blocking, because there is basically no max capacity. To enforce a capacity I typically have a pool with N buffers added at system init time and a queue for message passing at run time. This also eliminates the need for memory allocation at run time, which I consider to be a good thing (I typically work on embedded applications).
The only difference between a pool and a queue is that a pool gets a bunch of buffers enqueued at system init time. So you have something like this:
ThreadSafeQueue<BufferDataType*> pool;
ThreadSafeQueue<BufferDataType*> queue;

void init()
{
    for (int i = 0; i < NUM_BUFS; i++)
    {
        pool.enqueue(new BufferDataType);
    }
}
Then when you want to send a message you do something like the following:
void producerA()
{
    BufferDataType *buf;
    if (pool.waitDequeue(buf, timeout) == true)
    {
        initBufWithMyData(buf);
        queue.enqueue(buf);
    }
}
This way the enqueue function is quick and easy, but if the pool is empty, you will block until someone puts a buffer back into the pool. The idea is that some other thread will be blocking on the queue and will return buffers to the pool when they have been processed, as follows:
void consumer()
{
    BufferDataType *buf;
    if (queue.waitDequeue(buf, timeout) == true)
    {
        processBufferData(buf);
        pool.enqueue(buf);
    }
}
Anyway, take a look at it; maybe it will help.

I suppose the problem in your code is that the deque is modified by several threads. Look:
- you wait for the condition signalled from another thread;
- then you immediately signal to the other threads that the deque is unlocked, just before you modify it;
- then you modify the deque while the other threads think the deque is already unlocked and start doing the same.
So, try to place all the cond.notify_*() calls after modifying the deque. I.e.:
void BlockingQueue::put(const MyType & myType)
{
    boost::unique_lock<boost::mutex> lock(mut);
    while (queue.size() >= this->capacity)
    {
        cond.wait(lock);
    }
    queue.push_back(myType); // <- modify first
    cond.notify_all();       // <- then say to others that deque is free
}
For a better understanding, I suggest reading about pthread_cond_wait().
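Applied to offer(), the same reordering might look like this (a sketch, not tested; note that the original computes monitorTimeout but then passes timeoutMillis to timed_wait, so every wait restarts the full timeout; that is worth fixing at the same time):
bool BlockingQueue::offer(const MyType & myType, unsigned int timeoutMillis)
{
    Timer timer;
    boost::unique_lock<boost::mutex> lock(mut);
    while (queue.size() >= this->capacity)
    {
        int monitorTimeout = timeoutMillis - ((unsigned int) timer.getElapsedMilliSeconds());
        if (monitorTimeout <= 0)
        {
            return false;
        }
        // Wait only for the time remaining, not the full timeout.
        if (!cond.timed_wait(lock, boost::posix_time::milliseconds(monitorTimeout)))
        {
            return false;
        }
    }
    queue.push_back(myType); // <- modify first
    cond.notify_all();       // <- then notify
    return true;
}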

Related

Thread pool stuck on wait condition

I'm encountering a hang in my C++ program using this thread pool class:
class ThreadPool {
    unsigned threadCount;
    std::vector<std::thread> threads;
    std::list<std::function<void(void)> > queue;
    std::atomic_int jobs_left;
    std::atomic_bool bailout;
    std::atomic_bool finished;
    std::condition_variable job_available_var;
    std::condition_variable wait_var;
    std::mutex wait_mutex;
    std::mutex queue_mutex;
    std::mutex mtx;

    void Task() {
        while (!bailout) {
            next_job()();
            --jobs_left;
            wait_var.notify_one();
        }
    }

    std::function<void(void)> next_job() {
        std::function<void(void)> res;
        std::unique_lock<std::mutex> job_lock(queue_mutex);
        // Wait for a job if we don't have any.
        job_available_var.wait(job_lock, [this]()->bool { return queue.size() || bailout; });
        // Get job from the queue
        mtx.lock();
        if (!bailout) {
            res = queue.front();
            queue.pop_front();
        } else {
            // If we're bailing out, 'inject' a job into the queue to keep jobs_left accurate.
            res = [] {};
            ++jobs_left;
        }
        mtx.unlock();
        return res;
    }

public:
    ThreadPool(int c)
        : threadCount(c)
        , threads(threadCount)
        , jobs_left(0)
        , bailout(false)
        , finished(false)
    {
        for (unsigned i = 0; i < threadCount; ++i)
            threads[i] = std::move(std::thread([this, i] { this->Task(); }));
    }

    ~ThreadPool() {
        JoinAll();
    }

    void AddJob(std::function<void(void)> job) {
        std::lock_guard<std::mutex> lock(queue_mutex);
        queue.emplace_back(job);
        ++jobs_left;
        job_available_var.notify_one();
    }

    void JoinAll(bool WaitForAll = true) {
        if (!finished) {
            if (WaitForAll) {
                WaitAll();
            }
            // note that we're done, and wake up any thread that's
            // waiting for a new job
            bailout = true;
            job_available_var.notify_all();
            for (auto& x : threads)
                if (x.joinable())
                    x.join();
            finished = true;
        }
    }

    void WaitAll() {
        std::unique_lock<std::mutex> lk(wait_mutex);
        if (jobs_left > 0) {
            wait_var.wait(lk, [this] { return this->jobs_left == 0; });
        }
        lk.unlock();
    }
};
gdb says (when stopping the blocked execution) that the hang is in (std::unique_lock&, ThreadPool::WaitAll()::{lambda()#1})+58>
I'm using g++ v5.3.0 with support for C++14 (-std=c++1y).
How can I avoid this problem?
Update
I've edited (rewrote) the class: https://github.com/edoz90/threadpool/blob/master/ThreadPool.h
The issue here is a race condition on your job count. You're using one mutex to protect the queue, and another to protect the count, which is semantically equivalent to the queue size. Clearly the second mutex is redundant (and improperly used), as is the jobs_left variable itself.
Every method that deals with the queue has to gain exclusive access to it (even JoinAll to read its size), so you should use the same queue_mutex in the three bits of code that tamper with it (JoinAll, AddJob and next_job).
Btw, splitting the code at next_job() is pretty awkward IMO. You would avoid calling a dummy function if you handled the worker thread body in a single function.
EDIT:
As other comments have already stated, you would probably be better off getting your eyes off the code and reconsidering the problem globally for a while.
The only thing you need to protect here is the job queue, so you need only one mutex.
Then there is the problem of waking up the various actors, which requires a condition variable since C++ basically does not give you any other useable synchronization object.
Here again you don't need more than one variable. Terminating the thread pool is equivalent to dequeueing the jobs without executing them, which can be done any which way, be it in the worker threads themselves (skipping execution if the termination flag is set) or in the JoinAll function (clearing the queue after gaining exclusive access).
Last but not least, you might want to invalidate AddJob once someone decided to close the pool, or else you could get stuck in the destructor while someone keeps feeding in new jobs.
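To illustrate the single-mutex layout, the worker body might look something like this (a sketch, not the author's code; it folds next_job() into the loop and assumes wait_var is signalled under the same queue_mutex that WaitAll() waits on):
void Task() {
    for (;;) {
        std::function<void(void)> job;
        {
            std::unique_lock<std::mutex> lock(queue_mutex);
            job_available_var.wait(lock, [this] { return !queue.empty() || bailout; });
            if (bailout) return;        // terminating: leave any remaining jobs to JoinAll
            job = std::move(queue.front());
            queue.pop_front();
        }
        job();                          // run the job outside the lock
        {
            std::lock_guard<std::mutex> lock(queue_mutex);
            if (--jobs_left == 0)
                wait_var.notify_all();  // wake anyone blocked in WaitAll()
        }
    }
}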
I think you need to keep it simple.
You seem to be using one mutex too many. There's queue_mutex, and you use that when you add and process jobs.
Now what's the need for another separate mutex when you are waiting on reading the queue?
Why can't you just use a condition variable with the same queue_mutex to read the queue in your WaitAll() method?
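That might look something like this (a sketch, assuming the workers decrement jobs_left and notify wait_var while holding queue_mutex):
void WaitAll() {
    std::unique_lock<std::mutex> lk(queue_mutex);
    wait_var.wait(lk, [this] { return this->jobs_left == 0; });
}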
Update
I would also recommend using a lock_guard instead of the unique_lock in your WaitAll. There really isn't a need to keep the queue_mutex locked beyond the WaitAll under exceptional conditions; if you exit the WaitAll exceptionally, it should be released regardless.
Update 2
Ignore my update above. Since you are using a condition variable, you can't use a lock_guard in the WaitAll. But if you are using a unique_lock, always go with the try_to_lock version, especially if you have more than a couple of control paths.

avoid busy waiting and mode switches between realtime and non realtime threading

I have the following problem:
We have a controller implemented with ros_control that runs on a real-time, Xenomai-patched Linux system. The control loop is executed by iteratively calling an update function. I need to communicate some of the internal state of the controller, and for this task I'm using LCM, developed at MIT. Regardless of the internal behaviour of LCM, the publication method breaks the real-time constraint, so I've implemented in C++11 a publication loop running on a separate thread. But that loop would publish at infinite frequency if I didn't synchronize the secondary thread with the controller, so I'm also using condition variables.
Here's an example for the controller:
MyClass mc;

// This is called just once
void init() {
    mc.init();
}

// Control loop function (e.g., called every 5 ms in RT)
void update(const ros::Time& time, const ros::Duration& period) {
    double value = time.toSec();
    mc.setValue(value);
}
And for the class which is trying to publish:
double myvalue;
std::mutex mutex;
std::condition_variable cond;
bool go = true;

void MyClass::init() {
    std::thread thread(&MyClass::body, this);
}

void MyClass::setValue(double value) {
    myvalue = value;
    {
        std::lock_guard<std::mutex> lk(mutex);
        go = true;
    }
    cond.notify_one();
}

void MyClass::body() {
    while (true) {
        std::unique_lock<std::mutex> lk(mutex);
        cond.wait(lk, [this] { return go; });
        publish(myvalue); // the dangerous call
        go = false;
        lk.unlock();
    }
}
This code produces mode switches (i.e., it breaks real time), probably because of the lock around the condition variable, which I use to synchronize the secondary thread with the main controller and which both threads contend for. If I do something like this:
void MyClass::body() {
    while (true) {
        if (go) {
            publish(myvalue);
            go = false;
        }
    }
}

void MyClass::setValue(double value) {
    myvalue = value;
    go = true;
}
I would not produce mode switches, but it would be unsafe and, most of all, the secondary thread would be busy waiting.
Is there a way to have non-blocking synchronization between the main thread and the secondary thread (i.e., having body do something only when setValue is called) which also avoids busy waiting?
Use a lock-free data structure.
In your case you don't even need a data structure; just use an atomic for go. No locks necessary. You might also look into using a semaphore instead of a condition variable, to avoid the now-unused lock. If you need a semaphore to avoid using a lock, you're going to end up using your base OS's semaphores, not C++11, since C++11 doesn't even have them.
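A minimal sketch of that atomic version, based on the busy-wait loop from the question (note that myvalue must also become atomic, otherwise it remains a data race):
#include <atomic>

std::atomic<double> myvalue{0.0};
std::atomic<bool> go{false};

void MyClass::setValue(double value) {
    myvalue.store(value, std::memory_order_relaxed);
    go.store(true, std::memory_order_release); // publish only after the value is written
}

void MyClass::body() {
    while (true) {
        if (go.exchange(false, std::memory_order_acquire)) {
            publish(myvalue.load(std::memory_order_relaxed));
        }
    }
}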
This isn't perfect, but it should reduce your busy-wait frequency with only occasional loss of responsiveness.
The idea is to use a naked condition-variable wakeup while passing the message through an atomic.
template<class T>
struct non_blocking_poke {
    std::atomic<T> message;
    std::atomic<bool> active;
    std::mutex m;
    std::condition_variable v;

    void poke(T t) {
        message = t;
        active = true;
        v.notify_one();
    }

    template<class Rep, class Period>
    T wait_for_poke(const std::chrono::duration<Rep, Period>& busy_time) {
        std::unique_lock<std::mutex> l(m);
        while (!v.wait_for(l, busy_time, [&]{ return active.load(); }))
        {}
        active = false;
        return message;
    }
};
The waiting thread wakes up every busy_time to see if it missed a message. However, it will usually get a message faster than that (there is a race window in which it can miss one). In addition, multiple messages can be sent without the receiver getting them. However, if a message is sent, then within about one busy_time period (1 second in the example below) the receiver will get that message or a later message.
non_blocking_poke<double> poker;

// in realtime thread:
poker.poke(3.14);

// in non-realtime thread:
while (true) {
    using namespace std::literals::chrono_literals;
    double d = poker.wait_for_poke(1s);
    std::cout << d << '\n';
}
In an industrial quality solution, you'll also want an abort flag or message to stop the loops.
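For example, extending the loop above with a hypothetical quit flag (a sketch):
std::atomic<bool> quit{false};

// whoever shuts the system down does:
//   quit = true;
//   poker.poke(0.0); // dummy value, just to wake the waiter promptly

// in non-realtime thread:
while (!quit) {
    using namespace std::literals::chrono_literals;
    double d = poker.wait_for_poke(1s);
    if (quit) break; // shutting down; discard the dummy value
    std::cout << d << '\n';
}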

Mutex Safety with Interrupts (Embedded Firmware)

Edit: @Mike pointed out that my try_lock function in the code below is unsafe and that accessor creation can produce a race condition as well. The suggestions (from everyone) have convinced me that I'm going down the wrong path.
Original Question
The requirements for locking on an embedded microcontroller are different enough from multithreading that I haven't been able to convert multithreading examples to my embedded applications. Typically I don't have an OS or threads of any kind, just main and whatever interrupt functions are called by the hardware periodically.
It's pretty common that I need to fill up a buffer from an interrupt, but process it in main. I've created the IrqMutex class below to try to safely implement this. Each person trying to access the buffer is assigned a unique id through IrqMutexAccessor, then they each can try_lock() and unlock(). The idea of a blocking lock() function doesn't work from interrupts because unless you allow the interrupt to complete, no other code can execute so the unlock() code never runs. I do however use a blocking lock from the main() code occasionally.
However, I know that the double-check lock doesn't work without C++11 memory barriers (which aren't available on many embedded platforms). Honestly despite reading quite a bit about it, I don't really understand how/why the memory access reordering can cause a problem. I think that the use of volatile sig_atomic_t (possibly combined with the use of unique IDs) makes this different from the double-check lock. But I'm hoping someone can: confirm that the following code is correct, explain why it isn't safe, or offer a better way to accomplish this.
class IrqMutex {
    friend class IrqMutexAccessor;

private:
    std::sig_atomic_t accessorIdEnum;
    volatile std::sig_atomic_t owner;

protected:
    std::sig_atomic_t nextAccessor(void) { return ++accessorIdEnum; }

    bool have_lock(std::sig_atomic_t accessorId) {
        return (owner == accessorId);
    }

    bool try_lock(std::sig_atomic_t accessorId) {
        // Only try to get a lock, while it isn't already owned.
        while (owner == SIG_ATOMIC_MIN) {
            // <-- If an interrupt occurs here, both attempts can get a lock at the same time.
            // Try to take ownership of this Mutex.
            owner = accessorId; // SET
            // Double check that we are the owner.
            if (owner == accessorId) return true;
            // Someone else must have taken ownership between CHECK and SET.
            // If they released it after CHECK, we'll loop back and try again.
            // Otherwise someone else has a lock and we have failed.
        }
        // This shouldn't happen unless they called try_lock on something they already owned.
        if (owner == accessorId) return true;
        // If someone else owns it, we failed.
        return false;
    }

    bool unlock(std::sig_atomic_t accessorId) {
        // Double check that the owner called this function (not strictly required)
        if (owner == accessorId) {
            owner = SIG_ATOMIC_MIN;
            return true;
        }
        // We still return true if the mutex was unlocked anyway.
        return (owner == SIG_ATOMIC_MIN);
    }

public:
    IrqMutex(void) : accessorIdEnum(SIG_ATOMIC_MIN), owner(SIG_ATOMIC_MIN) {}
};

// This class is used to manage our unique accessorId.
class IrqMutexAccessor {
    friend class IrqMutex;

private:
    IrqMutex& mutex;
    const std::sig_atomic_t accessorId;

public:
    IrqMutexAccessor(IrqMutex& m) : mutex(m), accessorId(m.nextAccessor()) {}
    bool have_lock(void) { return mutex.have_lock(accessorId); }
    bool try_lock(void) { return mutex.try_lock(accessorId); }
    bool unlock(void) { return mutex.unlock(accessorId); }
};
Because there is one processor and no threading, the mutex serves what I think is a subtly different purpose than normal. There are two main use cases I run into repeatedly.
The interrupt is a Producer and takes ownership of a free buffer and loads it with a packet of data. The interrupt/Producer may keep its ownership lock for a long time spanning multiple interrupt calls. The main function is the Consumer and takes ownership of a full buffer when it is ready to process it. The race condition rarely happens, but if the interrupt/Producer finishes with a packet and needs a new buffer, but they are all full it will try to take the oldest buffer (this is a dropped packet event). If the main/Consumer started to read and process that oldest buffer at exactly the same time they would trample all over each other.
The interrupt is just a quick change or increment of something (like a counter). However, if we want to reset the counter or jump to some new value with a call from the main() code, we don't want to try to write to the counter as it is changing. Here main actually does a blocking loop to obtain a lock; however, I think it's almost impossible to have to actually wait here for more than two attempts. Once it has a lock, any calls to the counter interrupt will be skipped, but that's generally not a big deal for something like a counter. Then I update the counter value and unlock it so it can start incrementing again.
I realize these two samples are dumbed down a bit, but some version of these patterns occurs in many of the peripherals in every project I work on, and I'd like one piece of reusable code that can safely handle this across various embedded platforms. I included the C tag because all of this is directly convertible to C code, and on some embedded compilers that's all that is available. So I'm trying to find a general method that is guaranteed to work in both C and C++.
struct ExampleCounter {
    volatile long long int value;
    IrqMutex mutex;
} exampleCounter;

struct ExampleBuffer {
    volatile char data[256];
    volatile size_t index;
    IrqMutex mutex; // One mutex per buffer.
} exampleBuffers[2];

const volatile char * const REGISTER;

// This accessor shouldn't be created in an interrupt or a race condition can occur.
static IrqMutexAccessor myMutex(exampleCounter.mutex);

void __irqQuickFunction(void) {
    // Obtain a lock, add the data then unlock all within one function call.
    if (myMutex.try_lock()) {
        exampleCounter.value++;
        myMutex.unlock();
    } else {
        // If we failed to obtain a lock, we skipped this update this one time.
    }
}

// These accessors shouldn't be created in an interrupt or a race condition can occur.
static IrqMutexAccessor myMutexes[2] = {
    IrqMutexAccessor(exampleBuffers[0].mutex),
    IrqMutexAccessor(exampleBuffers[1].mutex)
};

void __irqLongFunction(void) {
    static size_t bufferIndex = 0;
    // Check if we have a lock.
    if (!myMutexes[bufferIndex].have_lock() and !myMutexes[bufferIndex].try_lock()) {
        // If we can't get a lock try the other buffer
        bufferIndex = (bufferIndex + 1) % 2;
        // One buffer should always be available so the next line should always be successful.
        if (!myMutexes[bufferIndex].try_lock()) return;
    }
    // ... at this point we know we have a lock ...

    // Get data from the hardware and modify the buffer here.
    const char c = *REGISTER;
    exampleBuffers[bufferIndex].data[exampleBuffers[bufferIndex].index++] = c;

    // We may keep the lock for multiple function calls until the end of packet.
    static const char END_PACKET_SIGNAL = '\0';
    if (c == END_PACKET_SIGNAL) {
        // Unlock this buffer so it can be read from main.
        myMutexes[bufferIndex].unlock();
        // Switch to the other buffer for next time.
        bufferIndex = (bufferIndex + 1) % 2;
    }
}

int main(void) {
    while (true) {
        // Mutex for counter
        static IrqMutexAccessor myCounterMutex(exampleCounter.mutex);
        // Change counter value
        if (EVERY_ONCE_IN_A_WHILE) {
            // Skip any updates that occur while we are updating the counter.
            while (!myCounterMutex.try_lock()) {
                // Wait for the interrupt to release its lock.
            }
            // Set the counter to a new value.
            exampleCounter.value = 500;
            // Updates will start again as soon as we unlock it.
            myCounterMutex.unlock();
        }

        // Mutexes for __irqLongFunction.
        static IrqMutexAccessor myBufferMutexes[2] = {
            IrqMutexAccessor(exampleBuffers[0].mutex),
            IrqMutexAccessor(exampleBuffers[1].mutex)
        };
        // Process buffers from __irqLongFunction.
        for (size_t i = 0; i < 2; i++) {
            // Obtain a lock so we can read the data.
            if (!myBufferMutexes[i].try_lock()) continue;
            // Check that the buffer isn't empty.
            if (exampleBuffers[i].index == 0) {
                myBufferMutexes[i].unlock(); // Don't forget to unlock.
                continue;
            }
            // ... read and do something with the data here ...
            exampleBuffers[i].index = 0;
            myBufferMutexes[i].unlock();
        }
    }
}
Also note that I used volatile on any variable that is read by or written by the interrupt routine (unless the variable was only accessed from the interrupt, like the static bufferIndex value in __irqLongFunction). I've read that mutexes remove some of the need for volatile in multithreaded code, but I don't think that applies here. Did I use the right amount of volatile? I used it on: ExampleBuffer[].data[256], ExampleBuffer[].index, and ExampleCounter.value.
I apologize for the long answer, but perhaps it is fitting for a long question.
To answer your first question, I would say that your implementation of IrqMutex is not safe. Let me try to explain where I see problems.
Function nextAccessor
std::sig_atomic_t nextAccessor(void) { return ++accessorIdEnum; }
This function has a race condition, because the increment operator is not atomic, despite it being on an atomic value marked volatile. It involves 3 operations: reading the current value of accessorIdEnum, incrementing it, and writing the result back. If two IrqMutexAccessors are created at the same time, it's possible that they both get the same ID.
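For illustration, a fixed version using the InterruptLock RAII class introduced below might look like (a sketch):
std::sig_atomic_t nextAccessor(void) {
    InterruptLock lock; // makes the read-modify-write atomic with respect to interrupts
    return ++accessorIdEnum;
}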
Function try_lock
The try_lock function also has a race condition. One thread (eg main), could go into the while loop, and then before taking ownership, another thread (eg an interrupt) can also go into the while loop and take ownership of the lock (returning true). Then the first thread can continue, moving onto owner = accessorId, and thus "also" take the lock. So two threads (or your main thread and an interrupt) can try_lock on an unowned mutex at the same time and both return true.
Disabling interrupts by RAII
We can achieve some level of simplicity and encapsulation by using RAII for interrupt disabling, for example the following class:
class InterruptLock {
public:
    InterruptLock() {
        prevInterruptState = currentInterruptState();
        disableInterrupts();
    }

    ~InterruptLock() {
        restoreInterrupts(prevInterruptState);
    }

private:
    int prevInterruptState; // Whatever type this should be for the platform
    InterruptLock(const InterruptLock&); // Not copy-constructable
};
And I would recommend disabling interrupts to get the atomicity you need within the mutex implementation itself. For example something like:
bool try_lock(std::sig_atomic_t accessorId) {
    InterruptLock lock;
    if (owner == SIG_ATOMIC_MIN) {
        owner = accessorId;
        return true;
    }
    return false;
}

bool unlock(std::sig_atomic_t accessorId) {
    InterruptLock lock;
    if (owner == accessorId) {
        owner = SIG_ATOMIC_MIN;
        return true;
    }
    return false;
}
Depending on your platform, this might look different, but you get the idea.
As you said, this provides a platform that abstracts away the disabling and enabling of interrupts in general code, and encapsulates it in this one class.
Mutexes and Interrupts
Having said how I would consider implementing the mutex class, I would not actually use a mutex class for your use-cases. As you pointed out, mutexes don't really play well with interrupts, because an interrupt can't "block" on trying to acquire a mutex. For this reason, for code that directly exchanges data with an interrupt, I would instead strongly consider just directly disabling interrupts (for a very short time while the main "thread" touches the data).
So your counter might simply look like this:
volatile long long int exampleCounter;

void __irqQuickFunction(void) {
    exampleCounter++;
}

...

// Change counter value
if (EVERY_ONCE_IN_A_WHILE) {
    InterruptLock lock;
    exampleCounter = 500;
}
In my mind, this is easier to read, easier to reason about, and won't "slip" when there's contention (ie miss timer beats).
Regarding the buffer use-case, I would strongly recommend against holding a lock for multiple interrupt cycles. A lock/mutex should be held for just the slightest moment required to "touch" a piece of memory - just long enough to read or write it. Get in, get out.
So this is how the buffering example might look:
struct ExampleBuffer {
    char data[256];
} exampleBuffers[2];

ExampleBuffer* volatile bufferAwaitingConsumption = nullptr;
ExampleBuffer* volatile freeBuffer = &exampleBuffers[1];

const volatile char * const REGISTER;

void __irqLongFunction(void) {
    static const char END_PACKET_SIGNAL = '\0';
    static size_t index = 0;
    static ExampleBuffer* receiveBuffer = &exampleBuffers[0];

    // Get data from the hardware and modify the buffer here.
    const char c = *REGISTER;
    receiveBuffer->data[index++] = c;

    // End of packet?
    if (c == END_PACKET_SIGNAL) {
        // Make the packet available to the consumer
        bufferAwaitingConsumption = receiveBuffer;
        // Move on to the next buffer
        receiveBuffer = freeBuffer;
        freeBuffer = nullptr;
        index = 0;
    }
}

int main(void) {
    while (true) {
        // Fetch packet from shared variable
        ExampleBuffer* packet;
        {
            InterruptLock lock;
            packet = bufferAwaitingConsumption;
            bufferAwaitingConsumption = nullptr;
        }
        if (packet) {
            // ... read and do something with the data here ...
            // Once we're done with the buffer, we need to release it back to the producer
            {
                InterruptLock lock;
                freeBuffer = packet;
            }
        }
    }
}
This code is arguably easier to reason about, since there are only two memory locations shared between the interrupt and the main loop: one to pass packets from the interrupt to the main loop, and one to pass empty buffers back to the interrupt. We also only touch those variables under "lock", and only for the minimum time needed to "move" the value. (for simplicity I've skipped over the buffer overflow logic when the main loop takes too long to free the buffer).
It's true that in this case one may not even need the locks, since we're just reading and writing a simple value, but the cost of disabling the interrupts is not much, and the risk of making mistakes otherwise is not worth it in my opinion.
Edit
As pointed out in the comments, the above solution was meant to tackle only the multithreading problem, and omitted overflow checking. Here is a more complete solution, which should be robust under overflow conditions:
const size_t BUFFER_COUNT = 2;

struct ExampleBuffer {
    char data[256];
    ExampleBuffer* next;
} exampleBuffers[BUFFER_COUNT];

volatile size_t overflowCount = 0;

class BufferList {
public:
    BufferList() : first(nullptr), last(nullptr) { }

    // Atomic enqueue
    void enqueue(ExampleBuffer* buffer) {
        InterruptLock lock;
        buffer->next = nullptr;
        if (last) {
            last->next = buffer;
            last = buffer;
        } else {
            first = buffer;
            last = buffer;
        }
    }

    // Atomic dequeue (or returns null)
    ExampleBuffer* dequeueOrNull() {
        InterruptLock lock;
        ExampleBuffer* result = first;
        if (first) {
            first = first->next;
            if (!first)
                last = nullptr;
        }
        return result;
    }

private:
    ExampleBuffer* first;
    ExampleBuffer* last;
} freeBuffers, buffersAwaitingConsumption;
const volatile char * const REGISTER;

void __irqLongFunction(void) {
    static const char END_PACKET_SIGNAL = '\0';
    static size_t index = 0;
    static ExampleBuffer* receiveBuffer = &exampleBuffers[0];

    // Recovery from overflow?
    if (!receiveBuffer) {
        // Try get another free buffer
        receiveBuffer = freeBuffers.dequeueOrNull();
        // Still no buffer?
        if (!receiveBuffer) {
            overflowCount++;
            return;
        }
    }

    // Get data from the hardware and modify the buffer here.
    const char c = *REGISTER;
    if (index < sizeof(receiveBuffer->data))
        receiveBuffer->data[index++] = c;

    // End of packet, or out of space?
    if (c == END_PACKET_SIGNAL) {
        // Make the packet available to the consumer
        buffersAwaitingConsumption.enqueue(receiveBuffer);
        // Move on to the next free buffer
        receiveBuffer = freeBuffers.dequeueOrNull();
        index = 0;
    }
}

size_t getAndResetOverflowCount() {
    InterruptLock lock;
    size_t result = overflowCount;
    overflowCount = 0;
    return result;
}
int main(void) {
    // All buffers are free at the start
    for (int i = 0; i < BUFFER_COUNT; i++)
        freeBuffers.enqueue(&exampleBuffers[i]);

    while (true) {
        // Fetch packet from shared variable
        ExampleBuffer* packet = buffersAwaitingConsumption.dequeueOrNull();
        if (packet) {
            // ... read and do something with the data here ...
            // Once we're done with the buffer, we need to release it back to the producer
            freeBuffers.enqueue(packet);
        }
        size_t overflowBytes = getAndResetOverflowCount();
        if (overflowBytes) {
            // ...
        }
    }
}
The key changes:
If the interrupt runs out of free buffers, it will recover
If the interrupt receives data while it doesn't have a receive buffer, it will communicate that to the main thread via getAndResetOverflowCount
If you keep getting buffer overflows, you can simply increase the buffer count
I've encapsulated the multithreaded access into a queue class implemented as a linked list (BufferList), which supports atomic dequeue and enqueue. The previous example also used queues, but of length 0-1 (either an item is enqueued or it isn't), and so the implementation of the queue was just a single variable. In the case of running out of free buffers, the receive queue could have 2 items, so I upgraded it to a proper queue rather than adding more shared variables.
If the interrupt is the producer and mainline code is the consumer, surely it's as simple as disabling the interrupt for the duration of the consume operation?
That's how I used to do it in my embedded micro controller days.
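For example (a sketch reusing the InterruptLock idea from the answer above; buffer and bufferIndex stand in for whatever the interrupt fills):
volatile char buffer[256];
volatile size_t bufferIndex = 0;

void consumeAll(void) {
    char local[256];
    size_t count;
    {
        InterruptLock lock; // the producer interrupt cannot fire while we copy
        count = bufferIndex;
        for (size_t i = 0; i < count; i++)
            local[i] = buffer[i];
        bufferIndex = 0;
    }
    // ... process 'local' with interrupts enabled again ...
}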

Thread locks occuring using boost::thread. What's wrong with my condition variables?

I wrote a Link class for passing data between two nodes in a network. I've implemented it with two deques (one for data going from node 0 to node 1, and the other for data going from node 1 to node 0). I'm trying to multithread the application, but I'm getting threadlocks. I'm trying to prevent reading from and writing to the same deque at the same time. In reading more about how I originally implemented this, I think I'm using the condition variables incorrectly (and maybe shouldn't be using the boolean variables?). Should I have two mutexes, one for each deque? Please help if you can. Thanks!
class Link {
public:
    // other stuff...
    void push_back(int sourceNodeID, Data newData);
    void get(int destinationNodeID, std::vector<Data> &linkData);

private:
    // other stuff...
    std::vector<int> nodeIDs_;
    // qVector_ has two deques, one for Data from node 0 to node 1 and
    // one for Data from node 1 to node 0
    std::vector<std::deque<Data> > qVector_;
    void initialize(int nodeID0, int nodeID1);
    boost::mutex mutex_;
    std::vector<boost::shared_ptr<boost::condition_variable> > readingCV_;
    std::vector<boost::shared_ptr<boost::condition_variable> > writingCV_;
    std::vector<bool> writingData_;
    std::vector<bool> readingData_;
};
The push_back function:
void Link::push_back(int sourceNodeID, Data newData)
{
    int idx;
    if (sourceNodeID == nodeIDs_[0]) idx = 1;
    else
    {
        if (sourceNodeID == nodeIDs_[1]) idx = 0;
        else throw runtime_error("Link::push_back: Invalid node ID");
    }
    boost::unique_lock<boost::mutex> lock(mutex_);
    // pause to avoid multithreading collisions
    while (readingData_[idx]) readingCV_[idx]->wait(lock);
    writingData_[idx] = true;
    qVector_[idx].push_back(newData);
    writingData_[idx] = false;
    writingCV_[idx]->notify_all();
}
The get function:
void Link::get(int destinationNodeID,
               std::vector<Data> &linkData)
{
    int idx;
    if (destinationNodeID == nodeIDs_[0]) idx = 0;
    else
    {
        if (destinationNodeID == nodeIDs_[1]) idx = 1;
        else throw runtime_error("Link::get: Invalid node ID");
    }
    boost::unique_lock<boost::mutex> lock(mutex_);
    // pause to avoid multithreading collisions
    while (writingData_[idx]) writingCV_[idx]->wait(lock);
    readingData_[idx] = true;
    std::copy(qVector_[idx].begin(), qVector_[idx].end(), back_inserter(linkData));
    qVector_[idx].erase(qVector_[idx].begin(), qVector_[idx].end());
    readingData_[idx] = false;
    readingCV_[idx]->notify_all();
    return;
}
and here's initialize (in case it's helpful)
void Link::initialize(int nodeID0, int nodeID1)
{
    readingData_ = std::vector<bool>(2, false);
    writingData_ = std::vector<bool>(2, false);
    for (int i = 0; i < 2; ++i)
    {
        readingCV_.push_back(make_shared<boost::condition_variable>());
        writingCV_.push_back(make_shared<boost::condition_variable>());
    }
    nodeIDs_.reserve(2);
    nodeIDs_.push_back(nodeID0);
    nodeIDs_.push_back(nodeID1);
    qVector_.reserve(2);
    qVector_.push_back(std::deque<Data>());
    qVector_.push_back(std::deque<Data>());
}
I'm trying to multithread the application, but I'm getting threadlocks.
What is a "threadlock"? It's difficult to see what your code is trying to accomplish. Consider, first, your push_back() code, whose synchronized portion looks like this:
boost::unique_lock<boost::mutex> lock(mutex_);
while (readingData_[idx]) readingCV_[idx]->wait(lock);
writingData_[idx] = true;
qVector_[idx].push_back(newData);
writingData_[idx] = false;
writingCV_[idx]->notify_all();
Your writingData[idx] boolean starts off as false, and becomes true only momentarily while a thread has the mutex locked. By the time the mutex is released, it is false again. So for any other thread that has to wait to acquire the mutex, writingData[idx] will never be true.
But in your get() code, you have
boost::unique_lock<boost::mutex> lock(mutex_);
// pause to avoid multithreading collisions
while (writingData_[idx]) writingCV_[idx]->wait(lock);
By the time a thread gets the lock on the mutex, writingData[idx] is back to false and so the while loop (and wait on the CV) is never entered.
An exactly symmetric analysis applies to the readingData[idx] boolean, which also is always false outside the mutex lock.
So your condition variables are never waited on. You need to completely rethink your design.
Start with one mutex per queue (the deque is overkill for simply passing data), and for each queue associate a condition variable with the queue being non-empty. The get() method will thus wait until the queue is non-empty, which will be signalled in the push_back() method. Something like this (untested code):
template <typename Data>
class BasicQueue
{
public:
    void push( Data const& data )
    {
        boost::unique_lock<boost::mutex> _lock( mutex_ );
        queue_.push( data );
        not_empty_.notify_all();
    }

    void get ( Data& data )
    {
        boost::unique_lock<boost::mutex> _lock( mutex_ );
        while ( queue_.size() == 0 )
            not_empty_.wait( _lock ); // this releases the mutex
        // mutex is reacquired here, with queue_.size() > 0
        data = queue_.front();
        queue_.pop();
    }

private:
    std::queue<Data> queue_;
    boost::mutex mutex_;
    boost::condition_variable not_empty_;
};
Yes. You need two mutexes. Your deadlocks are almost certainly a result of contention on the single mutex. If you break into your running program with a debugger, you will see where the threads are hanging. Also, I don't see why you would need the bools. (EDIT: it may be possible to come up with a design that uses a single mutex, but it's simpler and safer to stick with one mutex per shared data structure.)
A rule of thumb would be to have one mutex per shared data structure you are trying to protect. That mutex guards the data structure against concurrent access and provides thread safety. In your case one mutex per deque. E.g.:
class SafeQueue
{
private:
    std::deque<Data> q_;
    boost::mutex m_;
    boost::condition_variable v_;

public:
    void push_back(Data newData)
    {
        boost::lock_guard<boost::mutex> lock(m_);
        q_.push_back(newData);
        // notify etc.
    }
    // ...
};
In terms of notification via condition variables see here:
Using condition variable in a producer-consumer situation
So there would also be one condition_variable per object which the producer would notify and the consumer would wait on. Now you can create two of these queues for communicating in both directions. Keep in mind that with only two threads you can still deadlock if both threads are blocked (waiting for data) and both queues are empty.
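Put together, the bidirectional Link could look something like this (a sketch; SafeQueue as above, with a hypothetical drainTo() that locks, copies everything out, and clears the queue in one call):
class Link
{
public:
    Link(int nodeID0, int nodeID1) : nodeID0_(nodeID0), nodeID1_(nodeID1) {}

    void push_back(int sourceNodeID, Data newData)
    {
        // Data from node 0 goes into the queue read by node 1, and vice versa.
        (sourceNodeID == nodeID0_ ? toNode1_ : toNode0_).push_back(newData);
    }

    void get(int destinationNodeID, std::vector<Data> &linkData)
    {
        SafeQueue &q = (destinationNodeID == nodeID0_ ? toNode0_ : toNode1_);
        q.drainTo(linkData); // hypothetical helper, see above
    }

private:
    int nodeID0_, nodeID1_;
    SafeQueue toNode0_, toNode1_; // each queue owns its own mutex and CV
};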

consumer/producer in c++

This is a classic consumer/producer problem, where some threads produce data while others read it. Both the producer and the consumers share a fixed-size buffer. If the buffer is empty, the consumers have to wait, and if it is full, the producer has to wait. I am using semaphores to keep track of full or empty queues. The producer decrements the free-slots semaphore, adds a value, and increments the filled-slots semaphore.
So I am trying to implement a program that gets some numbers from the generator function and then prints out the average of the numbers. By treating this as a producer-consumer problem, I am trying to save some time in the execution of the program. The generateNumber function causes some delay in the process, so I want to create a number of threads that generate numbers and put them into a queue. Then the "main thread", which is running the main function, has to read from the queue, find the sum, and then the average. So here is what I have so far:
#include <cstdio>
#include <cstdlib>
#include <time.h>
#include "Thread.h"
#include <queue>

int generateNumber() {
    int delayms = rand() / (float) RAND_MAX * 400.f + 200;
    int result = rand() / (float) RAND_MAX * 20;
    struct timespec ts;
    ts.tv_sec = 0;
    ts.tv_nsec = delayms * 1000000;
    nanosleep(&ts, NULL);
    return result;
}

struct threadarg {
    Semaphore filled(0);
    Semaphore empty(n);
    std::queue<int> q;
};

void* threadfunc(void *arg) {
    threadarg *targp = (threadarg *) arg;
    threadarg &targ = *targp;
    while (targ.empty.value() != 0) {
        int val = generateNumber();
        targ.empty.dec();
        q.push_back(val);
        targ.filled.inc();
    }
}

int main(int argc, char **argv) {
    Thread consumer, producer;
    // read the command line arguments
    if (argc != 2) {
        printf("usage: %s [nums to average]\n", argv[0]);
        exit(1);
    }
    int n = atoi(argv[1]);
    // Seed random number generator
    srand(time(NULL));
}
I am a bit confused now because I am not sure how to create multiple producer threads that generate numbers (if q is not full) while the consumer is reading from the queue (if q is not empty). I am not sure what to put in main to implement it.
Also, in "Thread.h" you can create a thread, a mutex, or a semaphore. The thread has the methods .run(threadFunc, arg), .join(), etc. A mutex can be locked or unlocked. The semaphore methods have all been used in my code.
Your queue is not synchronized, so multiple producers could call push_back at the same time, or at the same time the consumer is calling pop_front ... this will break.
The simple approach to making this work is to use a thread-safe queue, which can be a wrapper around the std::queue you already have, plus a mutex.
You can start by adding a mutex, and locking/unlocking it around each call you forward to std::queue - for a single consumer that should be sufficient, for multiple consumers you'd need to fuse front() and pop_front() into a single synchronized call.
To let the consumer block while the queue is empty, you can add a condition variable to your wrapper.
That should be enough that you can find the answer online - sample code below.
template <typename T> class SynchronizedQueue
{
    std::queue<T> queue_;
    std::mutex mutex_;
    std::condition_variable condvar_;

    typedef std::lock_guard<std::mutex> lock;
    typedef std::unique_lock<std::mutex> ulock;

public:
    void push(T const &val)
    {
        lock l(mutex_); // prevents multiple pushes corrupting queue_
        bool wake = queue_.empty(); // we may need to wake consumer
        queue_.push(val);
        if (wake) condvar_.notify_one();
    }

    T pop()
    {
        ulock u(mutex_);
        while (queue_.empty())
            condvar_.wait(u);
        // now queue_ is non-empty and we still have the lock
        T retval = queue_.front();
        queue_.pop();
        return retval;
    }
};
Replace std::mutex et al with whatever primitives your "Thread.h" gives you.
What I would do is this:
- Make a data class that hides your queue
- Create thread-safe accessor methods for saving a piece of data to the q, and removing a piece of data from the q (I would use a single mutex, or a critical section, for the accessors)
- Handle the case where a consumer doesn't have any data to work with (sleep)
- Handle the case where the q is becoming too full, and the producers need to slow down
- Let the threads go willy-nilly adding and removing as they produce / consume
Also, remember to add a sleep into each and every thread, or else you'll peg the CPU and not give the thread scheduler a good spot to switch contexts and share the CPU with other threads / processes. You don't need to, but it's a good practice.
When managing shared state like this, you need a condition variable and a mutex. The basic pattern is a function along the lines of:
ScopedLock l( theMutex );
while ( !conditionMet ) {
    theCondition.wait( theMutex );
}
doWhatever();
theCondition.notify();
In your case, I'd probably make the condition variable and the mutex members of the class implementing the queue. To write, the conditionMet would be !queue.full(), so you'd end up with something like:
ScopedLock l( queue.myMutex );
while ( queue.full() ) {
    queue.myCondition.wait();
}
queue.insert( whatever );
queue.myCondition.notify();
and to read:
ScopedLock l( queue.myMutex );
while ( queue.empty() ) {
    queue.myCondition.wait();
}
results = queue.extract();
queue.myCondition.notify();
return results;
Depending on the threading interface, there may be two notify functions: notify one (which wakes up a single thread) and notify all (which wakes up all of the waiting threads). In this case, you'll need notify all (or you'll need two condition variables, one for space to write and one for something to read, with each function waiting on one but notifying the other).
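A sketch of the two-condition-variable variant, using std::mutex and std::condition_variable for concreteness (names are illustrative):
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

template <typename T>
class BoundedQueue
{
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    void insert(T const& item)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        notFull_.wait(lock, [this] { return queue_.size() < capacity_; });
        queue_.push_back(item);
        notEmpty_.notify_one(); // a reader may be waiting
    }

    T extract()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        notEmpty_.wait(lock, [this] { return !queue_.empty(); });
        T item = queue_.front();
        queue_.pop_front();
        notFull_.notify_one(); // a writer may be waiting
        return item;
    }

private:
    std::size_t capacity_;
    std::deque<T> queue_;
    std::mutex mutex_;
    std::condition_variable notEmpty_, notFull_;
};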
Protect the queue accesses with a mutex, that should be it. A 'Computer Science 101' bounded producer-consumer queue needs two semaphores, (to manage the free/empty count and for producers/consumers to wait on, as you are already doing), and one mutex/futex/criticalSection to protect the queue.
I don't see how replacing the semaphores and mutex with condvars is any great help. What's the point? How do you implement a bounded producer-consumer queue with condvars that works on all platforms with multiple producers/consumers?
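For reference, here is a sketch of that classic two-semaphores-plus-mutex layout using POSIX semaphores (call initQueue() once before starting any threads; error handling omitted):
#include <semaphore.h>
#include <pthread.h>
#include <deque>

static std::deque<int> q;
static sem_t freeSlots, filledSlots;
static pthread_mutex_t qMutex = PTHREAD_MUTEX_INITIALIZER;

void initQueue(unsigned capacity) {
    sem_init(&freeSlots, 0, capacity); // all slots start free
    sem_init(&filledSlots, 0, 0);      // and none are filled
}

void produce(int item) {
    sem_wait(&freeSlots);              // block while the queue is full
    pthread_mutex_lock(&qMutex);
    q.push_back(item);
    pthread_mutex_unlock(&qMutex);
    sem_post(&filledSlots);            // one more item available
}

int consume() {
    sem_wait(&filledSlots);            // block while the queue is empty
    pthread_mutex_lock(&qMutex);
    int item = q.front();
    q.pop_front();
    pthread_mutex_unlock(&qMutex);
    sem_post(&freeSlots);              // one more slot free
    return item;
}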
#include <iostream>
#include <deque>
#include <mutex>
#include <chrono>
#include <condition_variable>
#include <thread>

using namespace std;

mutex mu, c_out;
condition_variable cv;

class Buffer
{
public:
    Buffer() {}

    void add(int ele)
    {
        unique_lock<mutex> ulock(mu);
        cv.wait(ulock, [this]() { return q.size() < _size; });
        q.push_back(ele);
        ulock.unlock();
        cv.notify_all();
        return;
    }

    int remove()
    {
        unique_lock<mutex> ulock(mu);
        cv.wait(ulock, [this]() { return q.size() > 0; });
        int v = q.back();
        q.pop_back();
        ulock.unlock();
        cv.notify_all();
        return v;
    }

    int calculateAvarage()
    {
        int total = 0;
        unique_lock<mutex> ulock(mu);
        cv.wait(ulock, [this]() { return q.size() > 0; });
        deque<int>::iterator it = q.begin();
        while (it != q.end())
        {
            total += *it;
            std::cout << ' ' << *it++;
        }
        return total / q.size();
    }

private:
    deque<int> q;
    const unsigned int _size = 10;
};

class Producer
{
public:
    Producer(Buffer *_bf = NULL)
    {
        this->bf = _bf;
    }

    void Produce()
    {
        while (true)
        {
            int num = rand() % 10;
            bf->add(num);
            c_out.lock();
            cout << "Produced:" << num << "avarage:" << bf->calculateAvarage() << endl;
            this_thread::sleep_for(chrono::microseconds(5000));
            c_out.unlock();
        }
    }

private:
    Buffer *bf;
};

class Consumer
{
public:
    Consumer(Buffer *_bf = NULL)
    {
        this->bf = _bf;
    }

    void Consume()
    {
        while (true)
        {
            int num = bf->remove();
            c_out.lock();
            cout << "Consumed:" << num << "avarage:" << bf->calculateAvarage() << endl;
            this_thread::sleep_for(chrono::milliseconds(5000));
            c_out.unlock();
        }
    }

private:
    Buffer *bf;
};

int main()
{
    Buffer b;
    Consumer c(&b);
    Producer p(&b);
    thread th1(&Producer::Produce, &p);
    thread th2(&Consumer::Consume, &c);
    th1.join();
    th2.join();
    return 0;
}
The Buffer class has a double-ended queue and a max buffer size of 10.
It has two functions, to add into the queue and to remove from the queue.
The Buffer class also has a calculateAvarage() function, which calculates the average each time an element is added or removed.
There are two more classes, Producer and Consumer, each holding a Buffer pointer.
Consume() locks the buffer, checks that the buffer size is not 0, removes from the buffer, notifies the producer, and unlocks.
Produce() locks the buffer, checks that the buffer size is not the max buffer size, adds to the buffer, notifies the consumer, and unlocks.