Producer-consumer based multi-threading for image processing - c++

I want to implement multi-threading based on the producer-consumer approach for an image processing task. In my case, the Producer thread should grab the images and put them into a container, whereas the Consumer thread should extract the images from the container. I think that I should use a queue to implement the container.
I want to use the following code as suggested in this SO answer. But I have become quite confused about how to implement the container and how to put the incoming image into it in the Producer thread.
PROBLEM: The image displayed by the first consumer thread does not contain the full data, and the second consumer thread never displays any image. Maybe there is a race condition or a locking problem that prevents the second thread from accessing the queue's data at all. I have already tried to use a mutex.
#include <vector>
#include <thread>
#include <memory>
#include <queue>
#include <opencv2/highgui.hpp>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
Mutex mu;
struct ThreadSafeContainer
queue<unsigned char*> safeContainer;
struct Producer
Producer(std::shared_ptr<ThreadSafeContainer> c) : container(c)
void run()
// grab image from camera
// store image in container
Mat image(400, 400, CV_8UC3, Scalar(10, 100,180) );
unsigned char *pt_src =;
std::shared_ptr<ThreadSafeContainer> container;
struct Consumer
Consumer(std::shared_ptr<ThreadSafeContainer> c) : container(c)
void run()
// read next image from container
if (!container->safeContainer.empty())
unsigned char *ptr_consumer_Image;
ptr_consumer_Image = container->safeContainer.front(); //The front of the queue contain the pointer to the image data
Mat image(400, 400, CV_8UC3); = ptr_consumer_Image;
imshow("consumer image", image);
std::shared_ptr<ThreadSafeContainer> container;
int main()
//Pointer object to the class containing a "container" which will help "Producer" and "Consumer" to put and take images
auto ptrObject_container = make_shared<ThreadSafeContainer>();
//Pointer object to the Producer...intialize the "container" variable of "Struct Producer" with the above created common "container"
auto ptrObject_producer = make_shared<Producer>(ptrObject_container);
//FIRST Pointer object to the Consumer...intialize the "container" variable of "Struct Consumer" with the above created common "container"
auto first_ptrObject_consumer = make_shared<Consumer>(ptrObject_container);
//SECOND Pointer object to the Consumer...intialize the "container" variable of "Struct Consumer" with the above created common "container"
auto second_ptrObject_consumer = make_shared<Consumer>(ptrObject_container);
//RUN producer thread
thread producerThread(&Producer::run, ptrObject_producer);
//RUN first thread of Consumer
thread first_consumerThread(&Consumer::run, first_ptrObject_consumer);
//RUN second thread of Consumer
thread second_consumerThread(&Consumer::run, second_ptrObject_consumer);
//JOIN all threads
return 0;

I don't see an actual question in your original question, so I'll give you the reference material I used to implement producer-consumer in my college course.
Slides 13 and 17 give good examples of producer-consumer
I made use of this in the lab which I have posted on my github here:
If you look in my you can see my implementation of the producer-consumer pattern.
Remember that with this pattern you can't switch the order of the wait statements, or else you can end up in deadlock.
Hope this is helpful.
Okay, so here is a summary of the consumer-producer pattern in my code linked above. The idea behind the producer consumer is to have a thread safe way of passing tasks from a "producer" thread to "consumer" worker threads. In the case of my example, the work to be done is to handle client requests. The producer thread (.serve()) monitors the incoming socket and passes the connection to consumer threads (.handle()) to handle the actual request as they come in. All of the code for this pattern is found in the file (with some declarations/imports in server.h).
For the sake of being brief, I am leaving out some detail. Be sure to go through each line and understand what is going on. Look up the library functions I am using and what the parameters mean. I'm giving you a lot of help here, but there is still plenty of work for you to do to gain a full understanding.
Like I mentioned above, the entire producer thread is found in the .serve() function. It does the following things
Initializes the semaphores. There are two versions here because of OS differences. I programmed on OS X, but had to turn in code on Linux. Since semaphores are tied to the OS, it is important to understand how to use semaphores in your particular setup.
It sets up the socket for the client to talk to. Not important for your application.
Creates the consumer threads.
Watches the client socket and uses the producer pattern to pass items to the consumers. This code is below
At the bottom of the .serve() function you can see the following code:
while ((client = accept(server_,(struct sockaddr *)&client_addr,&clientlen)) > 0) {
sem_wait(clients_.e); //buffer check
sem_post(clients_.n); //produce
First, you check the buffer semaphore "e" to ensure there is room in your queue to place the request. Second, acquire the semaphore "s" for the queue. Then add your task (In this case, a client connection) to the queue. Release the semaphore for the queue. Finally, signal to the consumers using semaphore "n".
In the .handle() method you really only care about the very beginning of the thread.
sem_wait(clients_.n); //consume
client = clients_.q->front();
sem_post(clients_.e); //buffer free
//Handles the client requests until they disconnect.
The consumer does similar actions to the producer, but in the opposite order. First the consumer waits for the producer to signal on the semaphore "n". Remember that since there are multiple consumers, it is unpredictable which consumer ends up acquiring this semaphore. They fight over it, but only one can move past this point per sem_post of that semaphore. Second, they acquire the queue semaphore like the producer does, pop the first item off the queue and release the semaphore. Finally, they signal on the buffer semaphore "e" that there is now more room in the buffer.
I know the semaphores have terrible names. They match my professor's slides since that's where I learned it. I think they stand for the following:
e for empty : this semaphore stops the producer from pushing more items on the queue if it is full.
s for semaphore : My least favorite. But my professor's style was to have a struct for each shared data struct. In this case "clients_" is the struct including all three semaphores and the queue. Basically this semaphore is there to ensure no two threads touch the same data structure at the same time.
n for number of items in the queue.
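To make the ordering of the waits and posts concrete, here is a minimal sketch of the same three-semaphore scheme in portable C++. It is not the code from the linked repository: the Semaphore class is a hand-rolled stand-in for sem_t built on std::mutex and std::condition_variable, and BoundedQueue, produce, consume and run_semaphore_demo are hypothetical names chosen for illustration. Only e, s and n follow the naming above.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>

// Hand-rolled counting semaphore (assumption: stands in for sem_t).
class Semaphore {
public:
    explicit Semaphore(std::size_t initial) : count_(initial) {}
    void wait() {  // plays the role of sem_wait
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
    void post() {  // plays the role of sem_post
        {
            std::lock_guard<std::mutex> lock(m_);
            ++count_;
        }
        cv_.notify_one();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::size_t count_;
};

// Shared state: the queue plus the three semaphores described above.
struct BoundedQueue {
    explicit BoundedQueue(std::size_t capacity) : e(capacity), s(1), n(0) {}
    Semaphore e;        // free slots left in the buffer
    Semaphore s;        // guards q (binary semaphore)
    Semaphore n;        // number of items currently queued
    std::queue<int> q;
};

void produce(BoundedQueue& b, int item) {
    b.e.wait();         // buffer check: blocks while the queue is full
    b.s.wait();
    b.q.push(item);
    b.s.post();
    b.n.post();         // produce: wakes exactly one consumer
}

int consume(BoundedQueue& b) {
    b.n.wait();         // consume: blocks until an item exists
    b.s.wait();
    int item = b.q.front();
    b.q.pop();
    b.s.post();
    b.e.post();         // buffer free
    return item;
}

// Hypothetical driver: one producer pushes 1..count, two consumers sum what they pop.
long run_semaphore_demo(int count) {
    assert(count >= 0);
    BoundedQueue b(4);
    std::mutex total_m;
    long total = 0;
    auto worker = [&](int n_items) {
        long local = 0;
        for (int i = 0; i < n_items; ++i) local += consume(b);
        std::lock_guard<std::mutex> lock(total_m);
        total += local;
    };
    std::thread c1(worker, count / 2);
    std::thread c2(worker, count - count / 2);
    for (int i = 1; i <= count; ++i) produce(b, i);
    c1.join();
    c2.join();
    return total;
}
```

Calling run_semaphore_demo(100) pushes 1 through 100 through a four-slot buffer and returns the sum the two consumers saw. Swapping the two waits in produce (taking s before e) can deadlock: a producer would hold the queue while waiting for space that a consumer can only free after taking s.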

OK, so to make it as simple as possible: you will need two threads, a mutex, a queue and two thread functions.
static DWORD WINAPI ThreadFunc_Prod(LPVOID lpParam);
static DWORD WINAPI ThreadFunc_Con(LPVOID lpParam);
HANDLE m_hThread[2];
queue<int> m_Q;
mutex m_M;
Add all needed stuff, these are just core parts you need
DWORD dwThreadId;
m_hThread[0] = CreateThread(NULL, 0, this->ThreadFunc_Prod, this, 0, &dwThreadId);
// same for 2nd thread
DWORD WINAPI Server::ThreadFunc_Prod(LPVOID lpParam)
cYourClass* o = (cYourClass*) lpParam;
int nData2Q = GetData(); // this is whatever you use to get your data
DWORD WINAPI Server::ThreadFunc_Con(LPVOID lpParam)
cYourClass* o = (cYourClass*) lpParam;
int res;
if (m_Q.empty())
// bad, no data, escape or wait or whatever, don't block context
res = m_Q.front();
// do your magic with res here
And at the end of main, don't forget to use WaitForMultipleObjects.
All possible examples can be found directly in MSDN, which has quite nice commentary on this.
OK, I believe the header is self-explanatory, so I will describe the source in a bit more detail. Somewhere in your source (it can even be the constructor) you create the threads. How you create a thread may differ between platforms, but the idea is the same. Note that on both Windows and POSIX a thread starts running as soon as it is created (unless you pass CREATE_SUSPENDED to CreateThread); joining only waits for it to finish. I assume you have a function somewhere that starts all your magic; let's call it MagicKicker().
With POSIX you can create the threads in the constructor and join them in MagicKicker(); on Windows, create them in MagicKicker().
Then you need to declare (in the header) the two functions in which the thread bodies are implemented, ThreadFunc_Prod and ThreadFunc_Con. The important trick is that you pass a pointer to your object to these functions (because thread functions are basically static), so you can easily access shared resources such as queues and mutexes.
These functions do the actual work. You already have everything you need in your code; just use this as the adding routine in the Producer:
int nData2Q = GetData(); // this is whatever you use to get your data
m_M.lock();              // lock the mutex so no other thread can touch the queue
m_Q.push(nData2Q);       // put the producer's data into the shared queue
m_M.unlock();            // unlock the mutex so the consumer can access the queue
And add this to your consumer:
int res;
m_M.lock();              // lock the mutex so the producer can't touch the queue meanwhile
if (m_Q.empty())         // check if there is something in the queue
{
    // nothing in your queue yet OR already
    m_M.unlock();        // release the mutex before leaving, or the producer blocks forever
    // skip this thread run; you can e.g. sleep for some time to let the queue build up
    continue;            // in case of a while wrap
    // return;           // use this instead if some framework runs the thread loop for you
}
else                     // there is actually something
{
    res = m_Q.front();   // get the oldest element of the queue
    m_Q.pop();           // delete this element from the queue
}
m_M.unlock();            // unlock the mutex so the producer can add new items to the queue
// do your magic with res here
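For readers who are not on Win32, the same polling pattern can be sketched with std::thread. This is only an illustration under assumptions: Worker is a hypothetical stand-in for cYourClass, the producer just pushes consecutive integers instead of calling GetData(), and std::lock_guard is used so the mutex is released on every exit path.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for cYourClass; m_Q and m_M match the names above.
class Worker {
public:
    // Runs one producer and one polling consumer; returns what the consumer got.
    std::vector<int> run(int count) {
        assert(count >= 0);
        std::vector<int> consumed;
        std::thread producer([this, count] {
            for (int i = 0; i < count; ++i) {
                std::lock_guard<std::mutex> lock(m_M);  // unlocks at end of scope
                m_Q.push(i);
            }
        });
        std::thread consumer([this, count, &consumed] {
            while (static_cast<int>(consumed.size()) < count) {
                bool got = false;
                int res = 0;
                {
                    std::lock_guard<std::mutex> lock(m_M);
                    if (!m_Q.empty()) {
                        res = m_Q.front();  // oldest element
                        m_Q.pop();
                        got = true;
                    }
                }  // mutex released here on both paths
                if (!got) {
                    // Queue empty: back off briefly without holding the mutex.
                    std::this_thread::sleep_for(std::chrono::milliseconds(1));
                    continue;
                }
                consumed.push_back(res);  // the "magic with res" would go here
            }
        });
        producer.join();
        consumer.join();
        return consumed;
    }

private:
    std::queue<int> m_Q;
    std::mutex m_M;
};
```

Worker().run(50) should return the values 0 to 49 in order, since there is a single producer and a single consumer. Polling with a sleep keeps the sketch close to the fragments above; a condition variable would avoid the wake-ups on an empty queue.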

The problem mentioned in my question was that the image displayed by the Consumer thread did not contain complete data. It contained several patches, which suggests that the Consumer could not get the full data produced by the Producer thread.
ANSWER: The reason is that the Mat image is declared inside the while loop of the Producer thread, while the queue only stores a raw pointer to its pixel data. That Mat instance is destroyed when the next iteration of the loop starts, so by the time the Consumer thread dereferences the pointer, the data it points to may no longer be valid.
SOLUTION: I should have done something like this, storing Mat objects in the queue instead of raw pointers:
struct ThreadSafeContainer
queue<Mat> safeContainer;
struct Producer
Producer(std::shared_ptr<ThreadSafeContainer> c) : container(c)
void run()
// grab image from camera
// store image in container
Mat image(400, 400, CV_8UC3, Scalar(10, 100,180) );
std::shared_ptr<ThreadSafeContainer> container;
struct Consumer
Consumer(std::shared_ptr<ThreadSafeContainer> c) : container(c)
void run()
// read next image from container
if (!container->safeContainer.empty())
Mat image= container->safeContainer.front(); //The front of the queue contain the image
imshow("consumer image", image);
std::shared_ptr<ThreadSafeContainer> container;
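The fragment above still has no locking, so here is a minimal, OpenCV-free sketch of a container that owns the images and synchronizes access. Everything in it is an assumption made for illustration: Image is a std::vector<unsigned char> standing in for cv::Mat, and push, pop, close and run_image_demo are hypothetical names. With a real cv::Mat, copying into the queue shares the reference-counted pixel buffer, so the data stays alive; if the producer reuses one Mat for every frame, pushing image.clone() is probably the safer choice.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

using Image = std::vector<unsigned char>;  // stand-in for cv::Mat (assumption)

class ThreadSafeContainer {
public:
    void push(Image img) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            q_.push(std::move(img));  // the queue now owns the pixel data
        }
        cv_.notify_one();
    }
    // Blocks until an image is available; returns false once closed and drained.
    bool pop(Image& out) {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }
    void close() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            closed_ = true;
        }
        cv_.notify_all();
    }
private:
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<Image> q_;
    bool closed_ = false;
};

// Hypothetical driver: one producer, two consumers that count complete images.
std::size_t run_image_demo(std::size_t frames) {
    const std::size_t kBytes = 400u * 400u * 3u;  // 400x400, 3 channels
    ThreadSafeContainer c;
    std::mutex count_mu;
    std::size_t complete = 0;
    auto consumer = [&] {
        Image img;
        while (c.pop(img)) {
            bool ok = img.size() == kBytes;
            for (unsigned char px : img) ok = ok && px == 100;
            std::lock_guard<std::mutex> lock(count_mu);
            if (ok) ++complete;
        }
    };
    std::thread c1(consumer);
    std::thread c2(consumer);
    for (std::size_t i = 0; i < frames; ++i) {
        Image image(kBytes, static_cast<unsigned char>(100));  // local to this iteration
        c.push(std::move(image));
    }
    c.close();
    c1.join();
    c2.join();
    return complete;
}
```

run_image_demo(8) should report 8 complete images, shared between the two consumers in whatever order the scheduler picks. Each image is handed to exactly one consumer, which also explains why a second consumer may see nothing when only one frame is ever produced.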


How bad it is to lock a mutex in an infinite loop or an update function

std::queue<double> some_q;
std::mutex mu_q;
/* an update function may be an event observer */
void UpdateFunc()
/* some other processing */
std::lock_guard lock{ mu_q };
while (!some_q.empty())
const auto& val = some_q.front();
/* update different states according to val */
/* some other processing */
/* some other thread might add some values after processing some other inputs */
void AddVal(...)
std::lock_guard lock{ mu_q };
For this case is it okay to handle the queue this way?
Or would it be better if I try to use a lock-free queue like the boost one?
How bad it is to lock a mutex in an infinite loop or an update function
It's pretty bad. Infinite loops actually make your program have undefined behavior unless it does one of the following:
make a call to a library I/O function
perform an access through a volatile glvalue
perform a synchronization operation or an atomic operation
Acquiring the mutex lock before entering the loop and just holding it does not count as performing a synchronization operation (in the loop). Also, while you hold the mutex, no one can add information to the queue, so while you process the information you extract, all threads wanting to add to the queue have to wait, and no other worker threads wanting to share the load can extract from the queue either. It's usually better to extract one task from the queue, release the lock and then work with what you got.
The common way is to use a condition_variable that lets other threads acquire the lock and then notify other threads waiting with the same condition_variable. The CPU will be pretty close to idle while waiting and wake up to do the work when needed.
Using your program as a base, it could look like this:
#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
std::queue<double> some_q;
std::mutex mu_q;
std::condition_variable cv_q; // the condition variable
bool stop_q = false; // something to signal the worker thread to quit
/* an update function may be an event observer */
void UpdateFunc() {
    while (true) {
        double val;
        {
            std::unique_lock lock{mu_q};
            // cv_q.wait lets others acquire the lock to work with the queue
            // while it waits to be notified.
            while (not stop_q && some_q.empty()) cv_q.wait(lock);
            if (stop_q) break; // time to quit
            val = std::move(some_q.front());
            some_q.pop(); // remove the element we just extracted
        } // lock released so others can use the queue
        // do time consuming work with "val" here
        std::cout << "got " << val << '\n';
    }
}
/* some other thread might add some values after processing some other inputs */
void AddVal(double val) {
    std::lock_guard lock{mu_q};
    some_q.push(val);
    cv_q.notify_one(); // notify someone that there's a new value to work with
}
void StopQ() { // a function to set the queue in shutdown mode
    std::lock_guard lock{mu_q};
    stop_q = true;
    cv_q.notify_all(); // notify all that it's time to stop
}
int main() {
    auto th = std::thread(UpdateFunc);
    // simulate some events coming with some time apart
    // (the values and delays here are arbitrary, for illustration only)
    for (double v : {1.0, 2.0, 3.0}) {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        AddVal(v);
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    StopQ();
    th.join();
}
If you really want to process everything that is currently in the queue, then extract everything first and then release the lock, then work with what you extracted. Extracting everything from the queue is done quickly by just swapping in another std::queue. Example:
#include <atomic>
std::atomic<bool> stop_q{}; // needs to be atomic in this version
void UpdateFunc() {
    while (not stop_q) {
        std::queue<double> work; // this will be used to swap with some_q
        {
            std::unique_lock lock{mu_q};
            // cv_q.wait lets others acquire the lock to work with the queue
            // while it waits to be notified.
            while (not stop_q && some_q.empty()) cv_q.wait(lock);
            std::swap(work, some_q); // extract everything from the queue at once
        } // lock released so others can use the queue
        // do time consuming work here
        while (not stop_q && not work.empty()) {
            auto val = std::move(work.front());
            work.pop();
            std::cout << "got " << val << '\n';
        }
    }
}
You can use it the way you currently are, assuming the lock is used properly across all threads. However, you may run into some frustration deciding how you want to call UpdateFunc().
Are you going to be using a callback?
Are you going to be using an ISR?
Are you going to be polling?
If you use a third-party library, it often trivializes thread synchronization and queues.
For example, if you are using CMSIS-RTOS v2, it is fairly straightforward to get multiple threads to pass information to each other. You could have multiple producers and a single consumer.
The single consumer can sit in a forever loop where it waits to receive a message before performing its work:
when timeout is set to osWaitForever the function will wait for an
infinite time until the message is retrieved (i.e. wait semantics).
// Two producers
// One consumer which will run only once something enters the queue
tl;dr: You are safe to proceed, but using a library will likely make your synchronization problems easier.
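The RTOS calls themselves are not reproduced here. As a rough desktop analogue of the two-producers, one-consumer layout, the following sketch uses a small blocking queue in standard C++. MessageQueue, put, get, kWaitForever and run_message_demo are hypothetical names for this illustration and are not the CMSIS API; a negative timeout is used here to mean "wait forever".

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-in for an RTOS message queue (not the CMSIS API).
class MessageQueue {
public:
    void put(int msg) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            q_.push(msg);
        }
        cv_.notify_one();
    }
    // Waits up to `timeout` for a message; a negative timeout waits forever.
    // Returns false if the timeout expired with the queue still empty.
    bool get(int& msg, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mu_);
        auto ready = [this] { return !q_.empty(); };
        if (timeout.count() < 0) {
            cv_.wait(lock, ready);
        } else if (!cv_.wait_for(lock, timeout, ready)) {
            return false;
        }
        msg = q_.front();
        q_.pop();
        return true;
    }
private:
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<int> q_;
};

const std::chrono::milliseconds kWaitForever(-1);

// Two producer threads; the calling thread acts as the single consumer.
int run_message_demo(int per_producer) {
    assert(per_producer >= 0);
    MessageQueue mq;
    auto producer = [&mq, per_producer](int base) {
        for (int i = 0; i < per_producer; ++i) mq.put(base + i);
    };
    std::thread p1(producer, 0);
    std::thread p2(producer, 1000);
    int received = 0;
    int msg = 0;
    // The consumer sleeps inside get() until something enters the queue.
    while (received < 2 * per_producer && mq.get(msg, kWaitForever)) ++received;
    p1.join();
    p2.join();
    // Everything was consumed, so a short timed get should now time out.
    bool extra = mq.get(msg, std::chrono::milliseconds(10));
    return extra ? -1 : received;
}
```

run_message_demo(20) should return 40. The consumer never polls: it is blocked inside get() whenever the queue is empty, which is the behaviour the wait-forever timeout describes.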

C++ Threading using 2 Containers

I have the following problem. I use a vector that gets filled with values from a temperature sensor. This function runs in one thread. Then I have another thread, which runs once every second, responsible for publishing all the values into a database. The publishing thread locks the vector using a mutex, so the function that fills it with values gets blocked. However, while the publishing thread is using the vector, I want to use another vector to save the temperature values so that I don't lose any values while the data is being published. How do I get around this problem? I thought about using a pointer that points to the containers and switching it to the other container once the first one gets locked, so I can keep saving values, but I don't quite know how.
I tried to add a minimal reproducible example; I hope it explains my situation.
void publish(std::vector<temperature> &inputVector)
//this function would publish the values into a database
//via mqtt and also runs in a thread.
int main()
std::vector<temperature> testVector;
std::vector<temperature> testVector2;
//I am repeatedly saving values into the vector.
//I want to do this in a thread but if the vector locked by a mutex
//i want to switch over to the other vector
Assuming you are using std::mutex, you can use mutex::try_lock on the producer side. Something like this:
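if (myMutex.try_lock()) {
    // locking succeeded - move all queued values and push the new value
    std::move(testVector2.begin(), testVector2.end(), std::back_inserter(testVector));
    testVector2.clear();
    testVector.push_back(value); // "value" is the new sensor reading
    myMutex.unlock();
} else {
    // locking failed - queue the value
    testVector2.push_back(value);
}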
Of course publish() needs to lock the mutex, too.
void publish(std::vector<temperature> &inputVector)
std::lock_guard<std::mutex> lock(myMutex);
//this function would publish the values into a database
//via mqtt and also runs in a thread.
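Put together, the try_lock idea might look like the sketch below. The class and function names (TemperatureLog, store, publish, run_try_lock_demo) are hypothetical, temperature is assumed to be a double, and the "database write" is just a callback; the demo holds the mutex from a second thread on purpose so the backlog path is exercised.

```cpp
#include <algorithm>
#include <cassert>
#include <future>
#include <iterator>
#include <mutex>
#include <thread>
#include <vector>

using temperature = double;  // assumption: the real type is not shown in the question

class TemperatureLog {
public:
    // Called from the producer thread only; never blocks on the mutex.
    void store(temperature value) {
        std::unique_lock<std::mutex> lock(mutex_, std::try_to_lock);
        if (lock.owns_lock()) {
            // Locking succeeded: flush the backlog first, then add the new value.
            std::move(backlog_.begin(), backlog_.end(), std::back_inserter(shared_));
            backlog_.clear();
            shared_.push_back(value);
        } else {
            // Publisher is busy: keep the value in the producer-only backlog.
            backlog_.push_back(value);
        }
    }
    // Called from the publisher thread; holds the mutex while `sink` runs.
    template <class F>
    void publish(F&& sink) {
        std::lock_guard<std::mutex> lock(mutex_);
        sink(shared_);
        shared_.clear();
    }
private:
    std::mutex mutex_;
    std::vector<temperature> shared_;   // guarded by mutex_
    std::vector<temperature> backlog_;  // touched by the producer thread only
};

// Hypothetical driver that forces the "mutex busy" path deterministically.
std::vector<temperature> run_try_lock_demo() {
    TemperatureLog log;
    std::vector<temperature> published;
    std::promise<void> inside;
    std::promise<void> release;
    std::future<void> release_f = release.get_future();
    log.store(1.0);
    std::thread publisher([&] {
        log.publish([&](const std::vector<temperature>& v) {
            published = v;
            inside.set_value();
            release_f.wait();  // simulate a slow database write under the lock
        });
    });
    inside.get_future().wait();
    log.store(2.0);  // mutex is held by the publisher: goes to the backlog
    log.store(3.0);
    release.set_value();
    publisher.join();
    log.store(4.0);  // mutex is free again: backlog is flushed ahead of 4.0
    log.publish([&](const std::vector<temperature>& v) {
        published.insert(published.end(), v.begin(), v.end());
    });
    return published;
}
```

The demo should yield 1, 2, 3, 4 in that order, so no reading is lost while the publisher holds the lock. One caveat: values in the backlog only reach the shared vector the next time store() manages to take the lock.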
This seems like the perfect opportunity for an additional (shared) buffer or queue, that's protected by the lock.
main would be essentially as it is now, pushing your new values into the shared buffer.
The other thread would, when it can, lock that buffer and take the new values from it. This should be very fast.
Then, it does not need to lock the shared buffer while doing its database things (which take longer), as it's only working on its own vector during that procedure.
Here's some pseudo-code:
std::mutex pendingTempsMutex;
std::vector<temperature> pendingTemps;
void thread2()
std::vector<temperature> temps;
while (1)
// Get new temps if we have any
std::scoped_lock l(pendingTempsMutex);
if (!temps.empty())
void thread1()
while (1)
std::scoped_lock l(pendingTempsMutex);
Or, if getValue() blocks:
temperature newValue = testSensor.getValue();
std::scoped_lock l(pendingTempsMutex);
Usually you'd use a std::queue for pendingTemps though. I don't think it really matters in this example, because you're always consuming everything in thread 2, but it's more conventional and can be more efficient in some scenarios. It can't lose you much as it's backed by a std::deque. But you can measure/test to see what's best for you.
This solution is pretty much what you already proposed/explored in the question, except that the producer shouldn't be in charge of managing the second vector.
You can improve it by having thread2 wait to be "informed" that there are new values, with a condition variable, otherwise you're going to be doing a lot of busy-waiting. I leave that as an exercise to the reader ;) There should be an example and discussion in your multi-threaded programming book.
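For what it's worth, one possible shape of that exercise is sketched below. It is an illustration under assumptions, not a reference solution: temperature is taken to be a double, and TempBuffer, add, finish, take and run_swap_demo are hypothetical names. The consumer swaps the whole pending vector out under the lock and does its slow work afterwards.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

using temperature = double;  // assumption

class TempBuffer {
public:
    void add(temperature t) {  // producer side (thread1)
        {
            std::lock_guard<std::mutex> l(pendingTempsMutex_);
            pendingTemps_.push_back(t);
        }
        pendingTempsCv_.notify_one();
    }
    void finish() {  // tells the consumer that no more values will arrive
        {
            std::lock_guard<std::mutex> l(pendingTempsMutex_);
            done_ = true;
        }
        pendingTempsCv_.notify_all();
    }
    // Blocks until values are pending or finish() was called, then swaps
    // everything pending into `out`. Returns false when nothing is left.
    bool take(std::vector<temperature>& out) {
        out.clear();
        std::unique_lock<std::mutex> l(pendingTempsMutex_);
        pendingTempsCv_.wait(l, [this] { return done_ || !pendingTemps_.empty(); });
        out.swap(pendingTemps_);  // O(1): only internal pointers are exchanged
        return !out.empty();
    }
private:
    std::mutex pendingTempsMutex_;
    std::condition_variable pendingTempsCv_;
    std::vector<temperature> pendingTemps_;
    bool done_ = false;
};

// Hypothetical driver: counts how many values the consumer thread received.
std::size_t run_swap_demo(std::size_t n) {
    TempBuffer buf;
    std::size_t published = 0;
    std::thread thread2([&] {
        std::vector<temperature> temps;
        while (buf.take(temps)) {
            // The slow database work would happen here, with the lock released.
            published += temps.size();
        }
    });
    for (std::size_t i = 0; i < n; ++i) buf.add(20.0 + static_cast<double>(i));
    buf.finish();
    thread2.join();
    return published;
}
```

run_swap_demo(1000) should return 1000 regardless of how the values happen to be batched. The producer only ever holds the mutex for a push_back, so it is never blocked for the duration of a publish.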

Is this implementation of inter-process Producer Consumer correct and safe against process crash?

I am developing a message queue between two processes on Windows.
I would like to support multiple producers and one consumer.
The queue must not be corrupted by the crash of one of the processes; that is, the other processes are not affected by the crash, and when the crashed process is restarted it can continue communication (with the new, updated state).
Assume that the event objects in these snippets are wrappers for named Windows Auto Reset Events and mutex objects are wrappers for named Windows mutex (I used the C++ non-interprocess mutex type as a placeholder).
This is the producer side:
void producer()
for (;;)
// Multiple producers modify _writeOffset so must be given exclusive access
unique_lock<mutex> excludeProducers(_producerMutex);
// A snapshot of the readOffset is sufficient because we use _notFullEvent.
long readOffset = InterlockedCompareExchange(&_readOffset, 0, 0);
// while is required because _notFullEvent.Wait might return because it was abandoned
while (IsFull(readOffset, _writeOffset))
readOffset = InterlockedCompareExchange(&_readOffset, 0, 0);
// use a mutex to protect the resource from the consumer
unique_lock<mutex> lockResource(_resourceMutex);
// update the state
InterlockedExchange(&_writeOffset, IncrementOffset(_writeOffset));
Similarly, this is the consumer side:
void consumer()
for (;;)
long writeOffset = InterlockedCompareExchange(&_writeOffset, 0, 0);
while (IsEmpty(_readOffset, writeOffset))
writeOffset = InterlockedCompareExchange(&_writeOffset, 0, 0);
unique_lock<mutex> lockResource(_resourceMutex);
InterlockedExchange(&_readOffset, IncrementOffset(_readOffset));
Are there any race conditions in this implementation?
Is it indeed protected against crashes as required?
P.S. The queue meets the requirements if the state of the queue is protected. If the crash occurred within the process(i) or consume(i) the contents of those slots might be corrupted and other means will be used to detect and maybe even correct corruption of those. Those means are out of the scope of this question.
There is indeed a race condition in this implementation.
Thank you @VTT for pointing it out.
@VTT wrote that if the producer dies right before _notEmptyEvent.Set(); then the consumer may get stuck forever.
Well, maybe not forever, because when the producer is resumed it will add an item and wake up the consumer again. But the state has indeed been corrupted. If, for instance this happens QUEUE_SIZE times, the producer will see that the queue is full (IsFull() will return true) and it will wait. This is a deadlock.
I am considering the following solution to this, adding the commented code on the producer side. A similar addition should be made on the consumer side:
void producer()
for (;;)
// Multiple producers modify _writeOffset so must be given exclusive access
unique_lock<mutex> excludeProducers(_producerMutex);
// A snapshot of the readOffset is sufficient because we use _notFullEvent.
long readOffset = InterlockedCompareExchange(&_readOffset, 0, 0);
// ====================== Added begin
if (!IsEmpty(readOffset, _writeOffset))
// ======================= end Added
// while is required because _notFullEvent.Wait might return because it was abandoned
while (IsFull(readOffset, _writeOffset))
This will cause the producer to wake up the consumer whenever it gets the chance to run, if indeed the queue is now not empty.
This is looking more like a solution based on condition variables, which would have been my preferred pattern, were it not for the unfortunate fact that on Windows, condition variables are not named and therefore cannot be shared between processes.
If this solution is voted correct, I will edit the original post with the complete code.
So there are a few problems with the code posted in the question:
As already noted, there's a marginal race condition; if the queue were to become full, and all the active producers crashed before setting _notFullEvent, your code would deadlock. Your answer correctly resolves that problem by setting the event at the start of the loop rather than the end.
You're over-locking; there's typically little point in having multiple producers if only one of them is going to be producing at a time. This prohibits writing directly into shared memory, you'll need a local cache. (It isn't impossible to have multiple producers writing directly into different slots in the shared memory, but it would make robustness much more difficult to achieve.)
Similarly, you typically need to be able to produce and consume simultaneously, and your code doesn't allow this.
Here's how I'd do it, using a single mutex (shared by both consumer and producer threads) and two auto-reset event objects.
void consumer(void)
for (;;)
if (!IsFull(*read_offset, *write_offset))
// Queue is not full, make sure at least one producer is awake
while (IsEmpty(*read_offset, *write_offset))
// Queue is empty, wait for producer to add a message
WaitForSingleObject(notEmptyEvent, INFINITE);
*read_offset = IncrementOffset(*read_offset);
void producer(void)
for (;;)
if (!IsEmpty(*read_offset, *write_offset))
// Queue is not empty, make sure consumer is awake
if (!IsFull(*read_offset, *write_offset))
// Queue is not full, make sure at least one other producer is awake
while (IsFull(*read_offset, *write_offset))
// Queue is full, wait for consumer to remove a message
WaitForSingleObject(notFullEvent, INFINITE);
*write_offset = IncrementOffset(*write_offset);
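The helpers IsEmpty, IsFull and IncrementOffset are never shown in this thread. One plausible definition, assuming a ring buffer of QUEUE_SIZE slots that always leaves one slot unused so that "full" and "empty" can be told apart, is sketched below. The value 8 and the function fill_until_full are assumptions for illustration.

```cpp
#include <cassert>

const long QUEUE_SIZE = 8;  // assumed number of slots in shared memory

// Advances an offset by one slot, wrapping around at the end of the buffer.
inline long IncrementOffset(long offset) {
    return (offset + 1) % QUEUE_SIZE;
}

// Empty: the reader has caught up with the writer.
inline bool IsEmpty(long readOffset, long writeOffset) {
    return readOffset == writeOffset;
}

// Full: writing one more item would make the offsets equal, which would be
// indistinguishable from "empty", so one slot is deliberately left unused.
inline bool IsFull(long readOffset, long writeOffset) {
    return IncrementOffset(writeOffset) == readOffset;
}

// Hypothetical check: how many items fit before the queue reports full.
long fill_until_full() {
    long readOffset = 0;
    long writeOffset = 0;
    long count = 0;
    while (!IsFull(readOffset, writeOffset)) {
        writeOffset = IncrementOffset(writeOffset);
        ++count;
    }
    assert(!IsEmpty(readOffset, writeOffset));
    return count;
}
```

With these definitions the usable capacity is QUEUE_SIZE - 1. Each offset is written by one side only (producers own the write offset, the consumer owns the read offset), so a process that crashes before storing its new offset simply leaves the queue in its previous consistent state.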

multithreaded program producer/consumer [boost]

I'm playing with the Boost library and C++. I want to create a multithreaded program that contains a producer, a consumer, and a stack. The producer fills the stack and the consumer removes items (int) from the stack. Everything works (pop, push, mutex), but when I call pop/push within a thread, I don't get any effect.
I made this simple code:
#include "stdafx.h"
#include <stack>
#include <iostream>
#include <algorithm>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>
#include <boost/date_time.hpp>
#include <boost/signals2/mutex.hpp>
#include <ctime>
using namespace std;
/*
 * This class represents a stack which is protected by a mutex.
 * Pop and push are executed by one thread at a time.
 */
class ProtectedStack{
private :
stack<int> m_Stack;
boost::signals2::mutex m;
public :
ProtectedStack(const ProtectedStack & p){
void push(int x){
void pop(){
int size(){
return m_Stack.size();
bool isEmpty(){
return m_Stack.empty();
int top(){
/* The producer is the class that fills the stack. It encapsulates the thread object. */
class Producer{
Producer(int number ){
//create thread here but don't start here
void fillStack (ProtectedStack& s ) {
int object = 3; //random value
//cout<<"push object\n";
void produce (ProtectedStack & s){
//call fill within a thread
m_Thread = boost::thread(&Producer::fillStack,this, s);
private :
int m_Number;
boost::thread m_Thread;
/* The consumer will consume the products produced by the producer */
class Consumer {
private :
int m_Number;
boost::thread m_Thread;
Consumer(int n){
m_Number = n;
void remove(ProtectedStack &s ) {
if(s.isEmpty()){ // if the stack is empty sleep and wait for the producer to fill the stack
//cout<<"stack is empty\n";
boost::posix_time::seconds workTime(1);
s.pop(); //pop it
//cout<<"pop object\n";
void consume (ProtectedStack & s){
//call remove within a thread
m_Thread = boost::thread(&Consumer::remove, this, s);
int main(int argc, char* argv[])
ProtectedStack s;
Producer p(0);
Producer p2(1);
cout<<"size after production "<<s.size()<<endl;
Consumer c(0);
Consumer c2(1);
cout<<"size after consumption "<<s.size()<<endl;
return 0;
After I run that in VC++ 2010 / Win7
I got:
Could you please help me understand why, when I call the fillStack function from main, I get an effect, but when I call it from a thread nothing happens?
Thank you
Your example code suffers from a couple synchronization issues as noted by others:
Missing locks on calls to some of the members of ProtectedStack.
Main thread could exit without allowing worker threads to join.
The producer and consumer do not loop as you would expect. Producers should always (when they can) be producing, and consumers should keep consuming as new elements are pushed onto the stack.
cout's on the main thread may very well be performed before the producers or consumers have had a chance to work yet.
I would recommend looking at using a condition variable for synchronization between your producers and consumers. Take a look at the producer/consumer example here:
It is a rather new feature in the standard library as of C++11 and supported as of VS2012. Before VS2012, you would either need boost or to use Win32 calls.
Using a condition variable to tackle a producer/consumer problem is nice because it almost enforces the use of a mutex to lock shared data, and it provides a signaling mechanism to let consumers know something is ready to be consumed so they don't have to spin (which is always a trade-off between the responsiveness of the consumer and the CPU usage of polling the queue). The wait itself is also atomic with respect to releasing the mutex, which prevents threads from missing a signal that there is something to consume, as explained here:
To give a brief run-down of how a condition variable takes care of this...
A producer does all time-consuming activities on its thread without owning the mutex.
The producer locks the mutex, adds the item it produced to a global data structure (probably a queue of some sort), lets go of the mutex and signals a single consumer to go -- in that order.
A consumer that is waiting on the condition variable re-acquires the mutex automatically, removes the item out of the queue and does some processing on it. During this time, the producer is already working on producing a new item but has to wait until the consumer is done before it can queue the item up.
This would have the following impact on your code:
No more need for ProtectedStack, a normal stack/queue data structure will do.
No need for boost if you are using a new enough compiler - removing build dependencies is always a nice thing.
I get the feeling that threading is rather new to you so I can only offer the advice to look at how others have solved synchronization issues as it is very difficult to wrap your mind around. Confusion about what is going on in an environment with multiple threads and shared data typically leads to issues like deadlocks down the road.
The major problem with your code is that your threads are not synchronized.
Remember that by default thread execution isn't ordered or sequenced, so consumer threads can actually finish (and in your particular case do finish) before any producer thread produces any data.
To make sure consumers run after the producers have finished their work, you need to call thread::join() on the producer threads; it blocks the main thread until the producers exit:
// Start producers
p.m_Thread.join(); // Wait p to complete
p2.m_Thread.join(); // Wait p2 to complete
// Start consumers
This will do the trick, but it is probably not what you want for a typical producer-consumer use case.
To get a more useful result, you need to fix the consumer function.
Your consumer function doesn't actually wait for produced data: it just exits if the stack is empty, so it never consumes anything if no data has been produced yet.
It should look like this:
void remove(ProtectedStack &s)
// Place your actual exit condition here,
// e.g. count of consumed elements or some event
// raised by producers meaning no more data available etc.
// For testing/educational purpose it can be just while(true)
// Sleeping for a whole second is too long, use milliseconds instead
boost::posix_time::milliseconds workTime(1);
Another problem is wrong thread constructor usage:
m_Thread = boost::thread(&Producer::fillStack, this, s);
Quote from Boost.Thread documentation:
Thread Constructor with arguments
template <class F,class A1,class A2,...>
thread(F f,A1 a1,A2 a2,...);
F and each An must by copyable or movable.
As if thread(boost::bind(f,a1,a2,...)). Consequently, f and each an are copied into
internal storage for access by the new thread.
This means that each of your threads receives its own copy of s, and all modifications are applied to those thread-local copies rather than to s. It's the same as passing an object to a function by value. You need to pass s by reference instead, using boost::ref:
void produce(ProtectedStack& s) {
    m_Thread = boost::thread(&Producer::fillStack, this, boost::ref(s));
}
void consume(ProtectedStack& s) {
    m_Thread = boost::thread(&Consumer::remove, this, boost::ref(s));
}
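To see the copy-versus-reference difference in isolation, here is a minimal sketch using std::thread and std::ref, which follow the same copy-into-internal-storage rule. The function names are invented for the example. One difference I'm assuming from the standard library's behaviour: std::thread refuses to compile when a plain lvalue is passed for a non-const reference parameter, so the "copy" case below uses a by-value function instead:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Takes its argument by value, so it only ever sees a private copy.
void fillCopy(std::vector<int> v) { v.push_back(42); }

// Takes a reference, so it modifies the caller's object.
void fillShared(std::vector<int>& v) { v.push_back(42); }

// Returns the caller's vector size after a thread worked on a copy.
std::size_t sizeAfterCopy() {
    std::vector<int> data;
    std::thread t(fillCopy, data);  // `data` is copied into the thread
    t.join();
    assert(data.empty());           // the thread only touched its own copy
    return data.size();
}

// Returns the caller's vector size after a thread worked on a reference.
std::size_t sizeAfterRef() {
    std::vector<int> data;
    std::thread t(fillShared, std::ref(data));  // wrapper carries a reference
    t.join();
    return data.size();
}
```

sizeAfterCopy() leaves the caller's vector untouched, while sizeAfterRef() sees the element pushed by the thread.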
Another issue is your mutex usage, which could be better:
Why do you use the mutex from the Signals2 library? Just use boost::mutex from Boost.Thread and remove the unneeded dependency on Signals2.
Use the RAII wrapper boost::lock_guard instead of direct lock/unlock calls.
As other people mentioned, every member function of ProtectedStack must take the lock.
boost::mutex m;

void push(int x) {
    boost::lock_guard<boost::mutex> lock(m);
    m_Stack.push(x);
}

void pop() {
    boost::lock_guard<boost::mutex> lock(m);
    if(!m_Stack.empty()) m_Stack.pop();
}

int size() {
    boost::lock_guard<boost::mutex> lock(m);
    return m_Stack.size();
}

bool isEmpty() {
    boost::lock_guard<boost::mutex> lock(m);
    return m_Stack.empty();
}

int top() {
    boost::lock_guard<boost::mutex> lock(m);
    return m_Stack.top();
}
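One caveat with locking each member separately: a caller that does isEmpty(), then top(), then pop() can still be interleaved with another consumer between the calls. A common way around that, sketched here with std::mutex and invented names (LockedStack, tryPop, popAllConcurrently), is to offer a single operation that checks, reads and removes under one lock:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <stack>
#include <thread>
#include <vector>

class LockedStack {
public:
    void push(int x) {
        std::lock_guard<std::mutex> lock(m_);
        stack_.push(x);
    }

    // Checks, reads and removes the top element under a single lock.
    // Returns false (and leaves `out` untouched) if the stack is empty.
    bool tryPop(int& out) {
        std::lock_guard<std::mutex> lock(m_);
        if (stack_.empty()) return false;
        out = stack_.top();
        stack_.pop();
        return true;
    }

private:
    std::mutex m_;
    std::stack<int> stack_;
};

// Pushes 1..n, lets `workers` threads pop concurrently, returns the sum popped.
long popAllConcurrently(int n, int workers) {
    LockedStack s;
    for (int i = 1; i <= n; ++i) s.push(i);

    std::vector<long> partial(static_cast<std::size_t>(workers), 0L);
    std::vector<std::thread> threads;
    for (int w = 0; w < workers; ++w) {
        threads.emplace_back([&s, &partial, w] {
            int v;
            while (s.tryPop(v)) partial[w] += v;  // each worker owns one slot
        });
    }
    for (auto& t : threads) t.join();

    int leftover;
    assert(!s.tryPop(leftover));  // every element was popped exactly once

    long total = 0;
    for (long p : partial) total += p;
    return total;
}
```

If every element is popped exactly once, the sum for n = 1000 comes out as 500500 regardless of how the workers interleave.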
You're not checking that the producing thread has executed before you try to consume. You're also not locking around size/empty/top... that's not safe if the container's being updated.

C++ multithreading, simple consumer / producer threads, LIFO, notification, counter

I am new to multi-thread programming, I want to implement the following functionality.
There are 2 threads, producer and consumer.
Consumer only processes the latest value, i.e., last in first out (LIFO).
Producer sometimes generates new values at a faster rate than consumer can
process them. For example, producer may generate 2 new values in 1
millisecond, but it takes consumer approximately 5 milliseconds to process one.
If consumer receives a new value in the middle of processing an old
value, there is no need to interrupt. In other words, consumer will finish the current
execution first, then start an execution on the latest value.
Here is my design process, please correct me if I am wrong.
There is no need for a queue, since only the latest value is
processed by consumer.
Are notifications sent from the producer queued automatically?
I will use a counter instead.
ConsumerThread() checks the counter at the end, to make sure producer
hasn't generated a new value.
But what happens if producer generates a new value just before consumer
goes to sleep(), but after it checks the counter?
Here is some pseudo code.
boost::mutex mutex;
double x;

void ProducerThread() {
    {
        boost::mutex::scoped_lock lock(mutex);
        x = rand();
        counter++;
    }
    notify(); // wake up consumer thread
}

void ConsumerThread() {
    counter = 0; // reset counter, only process the latest value
    ... do something which takes 5 milli-seconds ...
    if (counter > 0) {
        ... execute this function again, not too sure how to implement this ...
    } else {
        ... what happen if producer generates a new value here??? ...
        sleep();
    }
}
If I understood your question correctly, for your particular application, the consumer only needs to process the latest available value provided by the producer. In other words, it's acceptable for values to get dropped because the consumer cannot keep up with the producer.
If that's the case, then I agree that you can get away without a queue and use a counter. However, the shared counter and value variables will need to be accessed atomically.
You can use boost::condition_variable to signal notifications to the consumer that a new value is ready. Here is a complete example; I'll let the comments do the explaining.
#include <cstdlib>
#include <iostream>
#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>
#include <boost/thread/locks.hpp>
#include <boost/date_time/posix_time/posix_time_types.hpp>

boost::mutex mutex;
boost::condition_variable condvar;
typedef boost::unique_lock<boost::mutex> LockType;

// Variables that are shared between producer and consumer.
double value = 0;
int count = 0;

void producer()
{
    while (true)
    {
        {
            // value and counter must both be updated atomically
            // using a mutex lock
            LockType lock(mutex);
            value = std::rand();
            ++count;

            // Notify the consumer that a new value is ready.
            condvar.notify_one();
        }

        // Simulate exaggerated 2ms delay (200 ms here)
        boost::this_thread::sleep(boost::posix_time::milliseconds(200));
    }
}

void consumer()
{
    // Local copies of 'count' and 'value' variables. We want to do the
    // work using local copies so that they don't get clobbered by
    // the producer when it updates.
    int currentCount = 0;
    double currentValue = 0;

    while (true)
    {
        {
            // Acquire the mutex before accessing 'count' and 'value' variables.
            LockType lock(mutex); // mutex is locked while in this scope
            while (count == currentCount)
            {
                // Wait for producer to signal that there is a new value.
                // While we are waiting, Boost releases the mutex so that
                // other threads may acquire it.
                condvar.wait(lock);
            }

            // `lock` is automatically re-acquired when we come out of
            // condvar.wait(lock). So it's safe to access the 'value'
            // variable at this point.
            currentValue = value; // Grab a copy of the latest value
            currentCount = count; // and count while we hold the lock.
        }

        // Now that we are out of the mutex lock scope, we work with our
        // local copy of `value`. The producer can keep on clobbering the
        // 'value' variable all it wants, but it won't affect us here
        // because we are now using `currentValue`.
        std::cout << "value = " << currentValue << "\n";

        // Simulate exaggerated 5ms delay (500 ms here)
        boost::this_thread::sleep(boost::posix_time::milliseconds(500));
    }
}

int main()
{
    boost::thread c(&consumer);
    boost::thread p(&producer);
    c.join();
    p.join();
}
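For comparison, here is a sketch of the same latest-value idea using the C++11 standard library, wrapped in a small class and given a way to stop. LatestValue, publish, close, waitNext and runLatestOnly are names I invented, and the sketch assumes exactly one consumer (the "last seen" counter lives inside the object):

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

class LatestValue {
public:
    // Overwrites the stored value and bumps the counter under the lock.
    void publish(double v) {
        {
            std::lock_guard<std::mutex> lock(m_);
            value_ = v;
            ++count_;
        }
        cv_.notify_one();
    }

    // Tells the consumer that no further values will be published.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        cv_.notify_one();
    }

    // Blocks until a value newer than the last one seen is available.
    // Returns false once closed with nothing new pending.
    bool waitNext(double& out) {
        std::unique_lock<std::mutex> lock(m_);
        // The predicate is re-checked under the lock, so a publish() that
        // lands just before the wait cannot be missed.
        cv_.wait(lock, [this] { return count_ != seen_ || closed_; });
        if (count_ == seen_) return false;
        seen_ = count_;
        out = value_;
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    double value_ = 0;
    long count_ = 0;
    long seen_ = 0;  // single-consumer assumption
    bool closed_ = false;
};

// Publishes 1..produced and returns the values the consumer actually handled.
std::vector<double> runLatestOnly(int produced) {
    assert(produced > 0);
    LatestValue slot;
    std::vector<double> processed;
    std::thread consumer([&] {
        double v;
        while (slot.waitNext(v)) processed.push_back(v);
    });
    std::thread producer([&] {
        for (int i = 1; i <= produced; ++i) slot.publish(i);
        slot.close();
    });
    producer.join();
    consumer.join();
    return processed;
}
```

How many values get dropped depends on scheduling, but the handled values are always increasing and the final published value is never lost, which is the answer to the "what if a value arrives just before sleeping" worry in the question.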
I was thinking about this question recently, and realized that this solution, while it may work, is not optimal. Your producer is using all that CPU just to throw away half of the computed values.
I suggest that you reconsider your design and go with a bounded blocking queue between the producer and consumer. Such a queue should have the following characteristics:
The queue has a fixed size (bounded)
If the consumer wants to pop the next item, but the queue is empty, the operation will be blocked until notified by the producer that an item is available.
The producer can check if there's room to push another item and block until the space becomes available.
With this type of queue, you can effectively throttle down the producer so that it doesn't outpace the consumer. It also ensures that the producer doesn't waste CPU resources computing values that will be thrown away.
Libraries such as TBB and PPL provide implementations of concurrent queues. If you want to attempt to roll your own using std::queue (or boost::circular_buffer) and boost::condition_variable, check out this blogger's example.
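If you do roll your own, a bounded blocking queue with the characteristics listed above might look roughly like this. It is a sketch built on std::queue and std::condition_variable (not the blogger's code, and not TBB or PPL); BoundedQueue, BoundedRun and runBounded are hypothetical names, and highWaterMark() exists only so the demo can check that the bound holds:

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Blocks while the queue is full, which throttles the producer.
    void push(T item) {
        std::unique_lock<std::mutex> lock(m_);
        notFull_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(std::move(item));
        if (items_.size() > highWater_) highWater_ = items_.size();
        lock.unlock();
        notEmpty_.notify_one();
    }

    // Blocks while the queue is empty.
    T pop() {
        std::unique_lock<std::mutex> lock(m_);
        notEmpty_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop();
        lock.unlock();
        notFull_.notify_one();
        return item;
    }

    // Largest size the queue ever reached (demo instrumentation only).
    std::size_t highWaterMark() {
        std::lock_guard<std::mutex> lock(m_);
        return highWater_;
    }

private:
    const std::size_t capacity_;
    std::mutex m_;
    std::condition_variable notFull_;
    std::condition_variable notEmpty_;
    std::queue<T> items_;
    std::size_t highWater_ = 0;
};

struct BoundedRun {
    long sum;
    std::size_t maxDepth;
};

// Sends 1..n through a queue of the given capacity.
BoundedRun runBounded(int n, std::size_t capacity) {
    BoundedQueue<int> q(capacity);
    long sum = 0;
    std::thread consumer([&] {
        for (int i = 0; i < n; ++i) sum += q.pop();
    });
    std::thread producer([&] {
        for (int i = 1; i <= n; ++i) q.push(i);
    });
    producer.join();
    consumer.join();
    return {sum, q.highWaterMark()};
}
```

With runBounded(1000, 4) every value arrives (the sum is 500500) and the queue never holds more than four items, so a fast producer simply waits for the consumer instead of computing values that get thrown away.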
The short answer is that you're almost certainly wrong.
With a producer/consumer, you pretty much need a queue between the two threads. There are basically two alternatives: either your code will simply lose tasks (which usually equals not working at all), or else your producer thread will need to block until the consumer thread is idle before it can produce an item, which effectively translates to single threading.
For the moment, I'm going to assume that the value you get back from rand is supposed to represent the task to be executed (i.e., is the value produced by the producer and consumed by the consumer). In that case, I'd write the code something like this:
void producer() {
    for (int i=0; i<100; i++)
        queue.insert(random()); // queue.insert blocks if queue is full
    queue.insert(-1.0); // Tell consumer to exit
}

void consumer() {
    double value;
    while ((value = queue.get()) != -1) // queue.get blocks if queue is empty
        process(value); // placeholder for the real work
}
This relegates nearly all the interlocking to the queue. The rest of the code for both threads pretty much ignores threading issues entirely.
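The snippet above assumes a queue object with blocking insert and get. To make the sentinel idea runnable, here is a minimal sketch with a hypothetical BlockingQueue built on the C++11 standard library. Unlike the queue described in the comments, this one is unbounded, so insert never blocks; kStop and runWithSentinel are also names made up for the example:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical minimal queue: get() blocks while empty; insert() never
// blocks because this sketch is unbounded.
class BlockingQueue {
public:
    void insert(double v) {
        {
            std::lock_guard<std::mutex> lock(m_);
            items_.push(v);
        }
        cv_.notify_one();
    }

    double get() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !items_.empty(); });
        double v = items_.front();
        items_.pop();
        return v;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<double> items_;
};

const double kStop = -1.0;  // sentinel; real tasks must never equal it

// Sends `tasks` non-negative values followed by the sentinel.
// Returns how many real tasks the consumer saw before stopping.
int runWithSentinel(int tasks) {
    BlockingQueue queue;
    int processed = 0;
    std::thread consumer([&] {
        double value;
        while ((value = queue.get()) != kStop) {
            assert(value >= 0);  // only the sentinel is negative
            ++processed;         // stand-in for real processing
        }
    });
    std::thread producer([&] {
        for (int i = 0; i < tasks; ++i) queue.insert(static_cast<double>(i));
        queue.insert(kStop);  // tell consumer to exit
    });
    producer.join();
    consumer.join();
    return processed;
}
```

The sentinel has to be a value that can never be a real task, and with several consumers you would need one sentinel per consumer.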
Implementing a pipeline is actually quite tricky if you are doing it from the ground up. For example, you'd have to use a condition variable to avoid the kind of race condition you described in your question, avoid busy waiting when implementing the mechanism for "waking up" the consumer, and so on. Even using a "queue" of just 1 element won't save you from some of these complexities.
It's usually much better to use specialized libraries that were developed and extensively tested specifically for this purpose. If you can live with a Visual C++-specific solution, take a look at the Parallel Patterns Library and the concept of Pipelines.