C++ multithreading, simple consumer / producer threads, LIFO, notification, counter

I am new to multithreaded programming, and I want to implement the following functionality.
There are 2 threads, a producer and a consumer.
The consumer only processes the latest value, i.e., last in, first out (LIFO).
The producer sometimes generates new values at a faster rate than the consumer can
process them. For example, the producer may generate 2 new values in 1
millisecond, but it takes the consumer approximately 5 milliseconds to process one.
If the consumer receives a new value in the middle of processing an old
value, there is no need to interrupt. In other words, the consumer will finish the current
execution first, then start an execution on the latest value.
Here is my design process; please correct me if I am wrong.
There is no need for a queue, since only the latest value is
processed by the consumer.
Are notifications sent from the producer queued automatically?
I will use a counter instead.
ConsumerThread() checks the counter at the end, to make sure the producer
hasn't generated a new value.
But what happens if the producer generates a new value just before the consumer
goes to sleep(), but after it checks the counter?
Here is some pseudo code.
boost::mutex mutex;
double x;
int counter = 0; // incremented by the producer, reset by the consumer
void ProducerThread()
{
{
boost::scoped_lock lock(mutex);
x = rand();
counter++;
}
notify(); // wake up consumer thread
}
void ConsumerThread()
{
counter = 0; // reset counter, only process the latest value
... do something which takes 5 milli-seconds ...
if (counter > 0)
{
... execute this function again, not too sure how to implement this ...
}
else
{
... what happens if the producer generates a new value here? ...
sleep();
}
}
Thanks.

If I understood your question correctly, for your particular application, the consumer only needs to process the latest available value provided by the producer. In other words, it's acceptable for values to get dropped because the consumer cannot keep up with the producer.
If that's the case, then I agree that you can get away without a queue and use a counter. However, the shared counter and value variables will need to be accessed atomically.
You can use boost::condition_variable to signal notifications to the consumer that a new value is ready. Here is a complete example; I'll let the comments do the explaining.
#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>
#include <boost/thread/locks.hpp>
#include <boost/date_time/posix_time/posix_time_types.hpp>
#include <cstdlib> // std::rand
#include <iostream> // std::cout
boost::mutex mutex;
boost::condition_variable condvar;
typedef boost::unique_lock<boost::mutex> LockType;
// Variables that are shared between producer and consumer.
double value = 0;
int count = 0;
void producer()
{
while (true)
{
{
// 'value' and 'count' must be updated together while
// holding the mutex lock
LockType lock(mutex);
value = std::rand();
++count;
// Notify the consumer that a new value is ready.
condvar.notify_one();
}
// Simulate the producer's delay (2 ms, exaggerated to 200 ms)
boost::this_thread::sleep(boost::posix_time::milliseconds(200));
}
}
void consumer()
{
// Local copies of 'count' and 'value' variables. We want to do the
// work using local copies so that they don't get clobbered by
// the producer when it updates.
int currentCount = 0;
double currentValue = 0;
while (true)
{
{
// Acquire the mutex before accessing 'count' and 'value' variables.
LockType lock(mutex); // mutex is locked while in this scope
while (count == currentCount)
{
// Wait for producer to signal that there is a new value.
// While we are waiting, Boost releases the mutex so that
// other threads may acquire it.
condvar.wait(lock);
}
// `lock` is automatically re-acquired when we come out of
// condvar.wait(lock). So it's safe to access the 'value'
// variable at this point.
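currentValue = value; // Grab a copy of the latest value
// while we hold the lock.
currentCount = count; // Remember which update we have seen, so
// the next wait blocks until a newer one.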
}
// Now that we are out of the mutex lock scope, we work with our
// local copy of `value`. The producer can keep on clobbering the
// 'value' variable all it wants, but it won't affect us here
// because we are now using `currentValue`.
std::cout << "value = " << currentValue << "\n";
// Simulate the consumer's processing time (5 ms, exaggerated to 500 ms)
boost::this_thread::sleep(boost::posix_time::milliseconds(500));
}
}
int main()
{
boost::thread c(&consumer);
boost::thread p(&producer);
c.join();
p.join();
}
ADDENDUM
I was thinking about this question recently, and realized that this solution, while it may work, is not optimal. Your producer is using all that CPU just to throw away half of the computed values.
I suggest that you reconsider your design and go with a bounded blocking queue between the producer and consumer. Such a queue should have the following characteristics:
Thread-safe
The queue has a fixed size (bounded)
If the consumer wants to pop the next item, but the queue is empty, the operation will be blocked until notified by the producer that an item is available.
The producer can check if there's room to push another item and block until the space becomes available.
With this type of queue, you can effectively throttle down the producer so that it doesn't outpace the consumer. It also ensures that the producer doesn't waste CPU resources computing values that will be thrown away.
Libraries such as TBB and PPL provide implementations of concurrent queues. If you want to attempt to roll your own using std::queue (or boost::circular_buffer) and boost::condition_variable, check out this blogger's example.
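As a rough illustration of the queue described above, here is a minimal sketch using only the C++11 standard library instead of Boost. The names BoundedQueue and sumThroughQueue are made up for this example, and it assumes a single mutex with two condition variables is acceptable; treat it as a starting point rather than a tuned implementation.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical bounded blocking queue; the names are illustrative only.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Blocks while the queue is full, which throttles the producer.
    void push(T item) {
        std::unique_lock<std::mutex> lock(mutex_);
        notFull_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(std::move(item));
        notEmpty_.notify_one();
    }

    // Blocks while the queue is empty.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        notEmpty_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop();
        notFull_.notify_one();
        return item;
    }

private:
    const std::size_t capacity_;
    std::queue<T> items_;
    std::mutex mutex_;
    std::condition_variable notFull_;
    std::condition_variable notEmpty_;
};

// Pushes 1..n through a queue of the given capacity and returns the sum
// seen by the consumer thread.
inline long sumThroughQueue(int n, std::size_t capacity) {
    BoundedQueue<int> queue(capacity);
    long sum = 0;
    std::thread consumer([&] {
        for (int i = 0; i < n; ++i) sum += queue.pop();
    });
    for (int i = 1; i <= n; ++i) queue.push(i);
    consumer.join(); // join makes the consumer's writes to sum visible here
    return sum;
}
```

With a small capacity the producer spends most of its time blocked in push, which is exactly the throttling effect described above: no value is computed unless there is room for it.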

The short answer is that you're almost certainly wrong.
With a producer/consumer, you pretty much need a queue between the two threads. There are basically two alternatives: either your code will simply lose tasks (which usually equals not working at all), or else your producer thread will need to block until the consumer thread is idle before it can produce an item, which effectively translates to single threading.
For the moment, I'm going to assume that the value you get back from rand is supposed to represent the task to be executed (i.e., is the value produced by the producer and consumed by the consumer). In that case, I'd write the code something like this:
void producer() {
for (int i=0; i<100; i++)
queue.insert(random()); // queue.insert blocks if queue is full
queue.insert(-1.0); // Tell consumer to exit
}
void consumer() {
double value;
while ((value = queue.get()) != -1) // queue.get blocks if queue is empty
process(value);
}
This relegates nearly all the interlocking to the queue. The rest of the code for both threads pretty much ignores threading issues entirely.

Implementing a pipeline is actually quite tricky if you are doing it from the ground up. For example, you'd have to use a condition variable to avoid the kind of race condition you described in your question, avoid busy waiting when implementing the mechanism for "waking up" the consumer, and so on. Even using a "queue" of just 1 element won't save you from some of these complexities.
It's usually much better to use specialized libraries that were developed and extensively tested specifically for this purpose. If you can live with a Visual C++-specific solution, take a look at the Parallel Patterns Library and the concept of Pipelines.

Related

How bad it is to lock a mutex in an infinite loop or an update function

std::queue<double> some_q;
std::mutex mu_q;
/* an update function may be an event observer */
void UpdateFunc()
{
/* some other processing */
std::lock_guard lock{ mu_q };
while (!some_q.empty())
{
const auto& val = some_q.front();
/* update different states according to val */
some_q.pop();
}
/* some other processing */
}
/* some other thread might add some values after processing some other inputs */
void AddVal(...)
{
std::lock_guard lock{ mu_q };
some_q.push(...);
}
For this case is it okay to handle the queue this way?
Or would it be better if I try to use a lock-free queue like the boost one?

How bad it is to lock a mutex in an infinite loop or an update function
It's pretty bad. Infinite loops actually make your program have undefined behavior unless it does one of the following:
terminate
make a call to a library I/O function
perform an access through a volatile glvalue
perform a synchronization operation or an atomic operation
Acquiring the mutex lock before entering the loop and just holding it does not count as performing a synchronization operation (in the loop). Also, while you hold the mutex, no one can add information to the queue, so while you process the information you extract, all threads wanting to add to the queue will have to wait, and no other worker threads wanting to share the load can extract from the queue either. It's usually better to extract one task from the queue, release the lock and then work with what you got.
The common way is to use a condition_variable: waiting on it releases the lock so that other threads can acquire it, and those threads then notify the threads waiting on the same condition_variable. The CPU will be pretty close to idle while waiting and will wake up to do the work when needed.
Using your program as a base, it could look like this:
#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
std::queue<double> some_q;
std::mutex mu_q;
std::condition_variable cv_q; // the condition variable
bool stop_q = false; // something to signal the worker thread to quit
/* an update function may be an event observer */
void UpdateFunc() {
while(true) {
double val;
{
std::unique_lock lock{mu_q};
// cv_q.wait lets others acquire the lock to work with the queue
// while it waits to be notified.
while (not stop_q && some_q.empty()) cv_q.wait(lock);
if(stop_q) break; // time to quit
val = std::move(some_q.front());
some_q.pop();
} // lock released so others can use the queue
// do time consuming work with "val" here
std::cout << "got " << val << '\n';
}
}
/* some other thread might add some values after processing some other inputs */
void AddVal(double val) {
std::lock_guard lock{mu_q};
some_q.push(val);
cv_q.notify_one(); // notify someone that there's a new value to work with
}
void StopQ() { // a function to set the queue in shutdown mode
std::lock_guard lock{mu_q};
stop_q = true;
cv_q.notify_all(); // notify all that it's time to stop
}
int main() {
auto th = std::thread(UpdateFunc);
// simulate some events coming with some time apart
std::this_thread::sleep_for(std::chrono::seconds(1));
AddVal(1.2);
std::this_thread::sleep_for(std::chrono::seconds(1));
AddVal(3.4);
std::this_thread::sleep_for(std::chrono::seconds(1));
AddVal(5.6);
std::this_thread::sleep_for(std::chrono::seconds(1));
StopQ();
th.join();
}
If you really want to process everything that is currently in the queue, then extract everything first and then release the lock, then work with what you extracted. Extracting everything from the queue is done quickly by just swapping in another std::queue. Example:
#include <atomic>
std::atomic<bool> stop_q{}; // needs to be atomic in this version
void UpdateFunc() {
while(not stop_q) {
std::queue<double> work; // this will be used to swap with some_q
{
std::unique_lock lock{mu_q};
// cv_q.wait lets others acquire the lock to work with the queue
// while it waits to be notified.
while (not stop_q && some_q.empty()) cv_q.wait(lock);
std::swap(work, some_q); // extract everything from the queue at once
} // lock released so others can use the queue
// do time consuming work here
while(not stop_q && not work.empty()) {
auto val = std::move(work.front());
work.pop();
std::cout << "got " << val << '\n';
}
}
}

You can use it like you currently are, assuming proper use of the lock across all threads. However, you may run into some frustrations about how you want to call UpdateFunc().
Are you going to be using a callback?
Are you going to be using an ISR?
Are you going to be polling?
If you use a third-party library, it often trivializes thread synchronization and queues.
For example, if you are using CMSIS-RTOS (v2), it is a fairly straightforward process to get multiple threads to pass information to each other. You could have multiple producers and a single consumer.
The single consumer can sit in a forever loop where it waits to receive a message before performing its work:
when timeout is set to osWaitForever the function will wait for an
infinite time until the message is retrieved (i.e. wait semantics).
// Two producers
osMessageQueuePut(X,Y,Z,timeout=0)
osMessageQueuePut(X,Y,Z,timeout=0)
// One consumer which will run only once something enters the queue
osMessageQueueGet(X,Y,Z,osWaitForever)
TL;DR: You are safe to proceed, but using a library will likely make your synchronization problems easier.

An odd use of condition variable with local mutex

Poring through the legacy code of an old and large project, I found an odd method of creating a thread-safe queue, something like this:
template < typename _Msg>
class WaitQue: public QWaitCondition
{
public:
typedef _Msg DataType;
void wakeOne(const DataType& msg)
{
QMutexLocker lock_(&mx);
que.push(msg);
QWaitCondition::wakeOne();
}
void wait(DataType& msg)
{
/// wait if empty.
{
QMutex wx; // WHAT?
QMutexLocker cvlock_(&wx);
if (que.empty())
QWaitCondition::wait(&wx);
}
{
QMutexLocker _wlock(&mx);
msg = que.front();
que.pop();
}
}
unsigned long size() {
QMutexLocker lock_(&mx);
return que.size();
}
private:
std::queue<DataType> que;
QMutex mx;
};
wakeOne is used from some threads as a kind of "posting" function, and wait is called from other threads and waits indefinitely until a message appears in the queue. In some cases the roles of the threads reverse at different stages, using separate queues.
Is it even legal to use a QMutex this way, by creating a local one? I kind of understand why someone could do that to dodge a deadlock while reading the size of que, but how does it even work? Is there a simpler and more idiomatic way to achieve this behavior?

It's legal to have a local mutex like this, but it normally makes no sense.
As you've worked out, in this case it is wrong. You should be using the member:
void wait(DataType& msg)
{
QMutexLocker cvlock_(&mx);
while (que.empty())
QWaitCondition::wait(&mx);
msg = que.front();
que.pop();
}
Notice also that you must use while instead of if around the call to QWaitCondition::wait. One reason is the possibility of spurious wake-ups (the Qt docs aren't clear on this). More importantly, waking up and then reacquiring the mutex is not an atomic operation, so you must recheck the queue for emptiness. It could be this last case where you previously were getting deadlocks/UB.
Consider the scenario of an empty queue and a caller (thread 1) of wait that blocks in QWaitCondition::wait. Then thread 2 comes along, adds an item to the queue, and calls wakeOne. Thread 1 gets woken up and tries to reacquire the mutex. However, thread 3 comes along in your implementation of wait, takes the mutex before thread 1, sees the queue isn't empty, processes the single item and moves on, releasing the mutex. Then thread 1, which has been woken up, finally acquires the mutex, returns from QWaitCondition::wait and tries to process... an empty queue. Yikes.
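For comparison, the same idea can be sketched with the standard library rather than Qt. This is only an illustration: the class name WaitQueue and the method names post and wait are chosen for this example and are not part of the legacy code. The predicate overload of std::condition_variable::wait performs the while-loop recheck described above.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <utility>

// Illustrative standard-library counterpart of the WaitQue class above.
template <typename Msg>
class WaitQueue {
public:
    // Add a message and wake one waiting thread.
    void post(Msg msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(msg));
        }
        ready_.notify_one();
    }

    // Block until a message is available, then remove and return it.
    Msg wait() {
        // The same mutex guards both the queue and the wait.
        std::unique_lock<std::mutex> lock(mutex_);
        // Equivalent to: while (queue_.empty()) ready_.wait(lock);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        assert(!queue_.empty()); // guaranteed by the predicate recheck
        Msg msg = std::move(queue_.front());
        queue_.pop();
        return msg;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return queue_.size();
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Msg> queue_;
};
```

Because the emptiness check and the pop happen under the same mutex, the "thread 3 steals the item" scenario simply sends thread 1 back to sleep instead of letting it touch an empty queue.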

Is this implementation of inter-process Producer Consumer correct and safe against process crash?

I am developing a message queue between two processes on Windows.
I would like to support multiple producers and one consumer.
The queue must not be corrupted by the crash of one of the processes; that is, the other processes are not affected by the crash, and when the crashed process is restarted it can continue communication (with the new, updated state).
Assume that the event objects in these snippets are wrappers for named Windows Auto Reset Events and mutex objects are wrappers for named Windows mutex (I used the C++ non-interprocess mutex type as a placeholder).
This is the producer side:
void producer()
{
for (;;)
{
// Multiple producers modify _writeOffset so must be given exclusive access
unique_lock<mutex> excludeProducers(_producerMutex);
// A snapshot of the readOffset is sufficient because we use _notFullEvent.
long readOffset = InterlockedCompareExchange(&_readOffset, 0, 0);
// while is required because _notFullEvent.Wait might return because it was abandoned
while (IsFull(readOffset, _writeOffset))
{
_notFullEvent.Wait(INFINITE);
readOffset = InterlockedCompareExchange(&_readOffset, 0, 0);
}
// use a mutex to protect the resource from the consumer
{
unique_lock<mutex> lockResource(_resourceMutex);
produce(_writeOffset);
}
// update the state
InterlockedExchange(&_writeOffset, IncrementOffset(_writeOffset));
_notEmptyEvent.Set();
}
}
Similarly, this is the consumer side:
void consumer()
{
for (;;)
{
long writeOffset = InterlockedCompareExchange(&_writeOffset, 0, 0);
while (IsEmpty(_readOffset, writeOffset))
{
_notEmptyEvent.Wait(INFINITE);
writeOffset = InterlockedCompareExchange(&_writeOffset, 0, 0);
}
{
unique_lock<mutex> lockResource(_resourceMutex);
consume(_readOffset);
}
InterlockedExchange(&_readOffset, IncrementOffset(_readOffset));
_notFullEvent.Set();
}
}
Are there any race conditions in this implementation?
Is it indeed protected against crashes as required?
P.S. The queue meets the requirements if the state of the queue is protected. If the crash occurs within produce(i) or consume(i), the contents of those slots might be corrupted, and other means will be used to detect and maybe even correct the corruption. Those means are out of the scope of this question.
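The snippets above call IsFull, IsEmpty and IncrementOffset without showing them. One common way to write such helpers for a ring buffer is sketched below. This is an assumption about what they might look like, not the actual code: the constant kQueueSize and the convention of leaving one slot unused (so that "full" and "empty" are distinguishable) are choices made for this example.

```cpp
#include <cassert>

// Hypothetical slot count; the question only implies some fixed queue size.
constexpr long kQueueSize = 8;

// Advance an offset by one slot, wrapping at the end of the buffer.
inline long IncrementOffset(long offset) {
    assert(offset >= 0 && offset < kQueueSize);
    return (offset + 1) % kQueueSize;
}

// Empty: the reader has caught up with the writer.
inline bool IsEmpty(long readOffset, long writeOffset) {
    return readOffset == writeOffset;
}

// Full: advancing the writer would make it collide with the reader.
// One slot is deliberately left unused so that full and empty differ.
inline bool IsFull(long readOffset, long writeOffset) {
    return IncrementOffset(writeOffset) == readOffset;
}
```

Under this convention the queue holds at most kQueueSize - 1 items, and both predicates depend only on the two offsets, which is why a snapshot of the other side's offset is enough for each check.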

There is indeed a race condition in this implementation.
Thank you @VTT for pointing it out.
@VTT wrote that if the producer dies right before _notEmptyEvent.Set(); then the consumer may get stuck forever.
Well, maybe not forever, because when the producer is resumed it will add an item and wake up the consumer again. But the state has indeed been corrupted. If, for instance, this happens QUEUE_SIZE times, the producer will see that the queue is full (IsFull() will return true) and it will wait. This is a deadlock.
I am considering the following solution to this, adding the commented code on the producer side. A similar addition should be made on the consumer side:
void producer()
{
for (;;)
{
// Multiple producers modify _writeOffset so must be given exclusive access
unique_lock<mutex> excludeProducers(_producerMutex);
// A snapshot of the readOffset is sufficient because we use _notFullEvent.
long readOffset = InterlockedCompareExchange(&_readOffset, 0, 0);
// ====================== Added begin
if (!IsEmpty(readOffset, _writeOffset))
{
_notEmptyEvent.Set();
}
// ======================= end Added
// while is required because _notFullEvent.Wait might return because it was abandoned
while (IsFull(readOffset, _writeOffset))
This will cause the producer to wake up the consumer whenever it gets the chance to run, if indeed the queue is now not empty.
This is looking more like a solution based on condition variables, which would have been my preferred pattern, were it not for the unfortunate fact that on Windows, condition variables are not named and therefore cannot be shared between processes.
If this solution is voted correct, I will edit the original post with the complete code.

So there are a few problems with the code posted in the question:
As already noted, there's a marginal race condition; if the queue were to become full, and all the active producers crashed before setting _notFullEvent, your code would deadlock. Your answer correctly resolves that problem by setting the event at the start of the loop rather than the end.
You're over-locking; there's typically little point in having multiple producers if only one of them is going to be producing at a time. Fixing this rules out writing directly into shared memory; you'll need a local cache. (It isn't impossible to have multiple producers writing directly into different slots in the shared memory, but it would make robustness much more difficult to achieve.)
Similarly, you typically need to be able to produce and consume simultaneously, and your code doesn't allow this.
Here's how I'd do it, using a single mutex (shared by both consumer and producer threads) and two auto-reset event objects.
void consumer(void)
{
claim_mutex();
for (;;)
{
if (!IsFull(*read_offset, *write_offset))
{
// Queue is not full, make sure at least one producer is awake
SetEvent(notFullEvent);
}
while (IsEmpty(*read_offset, *write_offset))
{
// Queue is empty, wait for producer to add a message
release_mutex();
WaitForSingleObject(notEmptyEvent, INFINITE);
claim_mutex();
}
release_mutex();
consume(*read_offset);
claim_mutex();
*read_offset = IncrementOffset(*read_offset);
}
}
void producer(void)
{
claim_mutex();
for (;;)
{
if (!IsEmpty(*read_offset, *write_offset))
{
// Queue is not empty, make sure consumer is awake
SetEvent(notEmptyEvent);
}
if (!IsFull(*read_offset, *write_offset))
{
// Queue is not full, make sure at least one other producer is awake
SetEvent(notFullEvent);
}
release_mutex();
produce_in_local_cache();
claim_mutex();
while (IsFull(*read_offset, *write_offset))
{
// Queue is full, wait for consumer to remove a message
release_mutex();
WaitForSingleObject(notFullEvent, INFINITE);
claim_mutex();
}
copy_from_local_cache_to_shared_memory(*write_offset);
*write_offset = IncrementOffset(*write_offset);
}
}

C++11 lockfree single producer single consumer: how to avoid busy wait

I'm trying to implement a class that uses two threads: one for the producer and one for the consumer. The current implementation does not use locks:
#include <boost/lockfree/spsc_queue.hpp>
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>
using Queue =
boost::lockfree::spsc_queue<
int,
boost::lockfree::capacity<1024>>;
class Worker
{
public:
Worker() : working_(false), done_(false) {}
~Worker() {
done_ = true; // exit even if the work has not been completed
worker_.join();
}
void enqueue(int value) {
queue_.push(value);
if (!working_) {
working_ = true;
worker_ = std::thread([this]{ work(); });
}
}
void work() {
int value;
while (!done_ && queue_.pop(value)) {
std::cout << value << std::endl;
}
working_ = false;
}
private:
std::atomic<bool> working_;
std::atomic<bool> done_;
Queue queue_;
std::thread worker_;
};
The application needs to enqueue work items for a certain amount of time and then sleep waiting for an event. This is a minimal main that simulates the behavior:
int main()
{
Worker w;
for (int i = 0; i < 1000; ++i)
w.enqueue(i);
std::this_thread::sleep_for(std::chrono::seconds(1));
for (int i = 0; i < 1000; ++i)
w.enqueue(i);
std::this_thread::sleep_for(std::chrono::seconds(1));
}
I'm pretty sure that my implementation is buggy: what if the worker thread completes and, before it executes working_ = false, another enqueue comes? Is it possible to make my code thread-safe without using locks?
The solution requires:
a fast enqueue
the destructor has to quit even if the queue is not empty
no busy wait, because there are long periods of time in which the worker thread is idle
no locks if possible
Edit
I did another implementation of the Worker class, based on your suggestions. Here is my second attempt:
class Worker
{
public:
Worker()
: working_(ATOMIC_FLAG_INIT), done_(false) { }
~Worker() {
// exit even if the work has not been completed
done_ = true;
if (worker_.joinable())
worker_.join();
}
bool enqueue(int value) {
bool enqueued = queue_.push(value);
if (!working_.test_and_set()) {
if (worker_.joinable())
worker_.join();
worker_ = std::thread([this]{ work(); });
}
return enqueued;
}
void work() {
int value;
while (!done_ && queue_.pop(value)) {
std::cout << value << std::endl;
}
working_.clear();
while (!done_ && queue_.pop(value)) {
std::cout << value << std::endl;
}
}
private:
std::atomic_flag working_;
std::atomic<bool> done_;
Queue queue_;
std::thread worker_;
};
I introduced worker_.join() inside the enqueue method. This can hurt performance, but only in very rare cases (when the queue becomes empty and another enqueue comes before the thread exits). The working_ variable is now an atomic_flag that is set in enqueue and cleared in work. The additional while after working_.clear() is needed because if another value is pushed before the clear but after the first while, the value would otherwise not be processed.
Is this implementation correct?
I did some tests and the implementation seems to work.
OT: Is it better to put this as an edit, or an answer?

what if the worker thread completes and before executing working_ = false, another enqueue comes?
Then the value will be pushed to the queue but will not be processed until another value is enqueued after the flag is set. You (or your users) may decide whether that is acceptable. This can be avoided using locks, but they're against your requirements.
The code may fail if the running thread is about to finish and has set working_ = false but hasn't stopped running before the next value is enqueued. In that case your code will call operator= on the running thread, which results in a call to std::terminate according to the linked documentation.
Adding worker_.join() before assigning the worker to a new thread should prevent that.
Another problem is that queue_.push may fail if the queue is full because it has a fixed size. Currently you just ignore the case and the value will not be added to the full queue. If you wait for queue to have space, you don't get fast enqueue (in the edge case). You could take the bool returned by push (which tells if it was successful) and return it from enqueue. That way the caller may decide whether it wants to wait or discard the value.
Or use a non-fixed-size queue. Boost has this to say about that choice:
Can be used to completely disable dynamic memory allocations during push in order to ensure lockfree behavior.
If the data structure is configured as fixed-sized, the internal nodes are stored inside an array and they are addressed
by array indexing. This limits the possible size of the queue to the number of elements that can be addressed by the index
type (usually 2**16-2), but on platforms that lack double-width compare-and-exchange instructions, this is the best way
to achieve lock-freedom.

Your worker thread needs more than 2 states.
Not running
Doing tasks
Idle shutdown
Shutdown
If you force shut down, it skips idle shutdown. If you run out of tasks, it transitions to idle shutdown. In idle shutdown, it empties the task queue, then goes into shutting down.
Shutdown is set, then you walk off the end of your worker task.
The producer first puts things on the queue. Then it checks the worker state. If Shutdown or Idle shutdown, first join it (and transition it to not running) then launch a new worker. If not running, just launch a new worker.
If the producer wants to launch a new worker, it first makes sure that we are in the not running state (otherwise, logic error). We then transition to the Doing tasks state, and then we launch the worker thread.
If the producer wants to shut down the helper task, it sets the done flag. It then checks the worker state. If it is anything besides not running, it joins it.
This can result in a worker thread that is launched for no good reason.
There are a few cases where the above can block, but there were a few before as well.
Then, we write a formal or semi-formal proof that the above cannot lose messages, because when writing lock free code you aren't done until you have a proof.
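To make the state bookkeeping above a little more concrete, here is a small sketch of the four states and the transitions between them using std::atomic. The names WorkerState, transition, markLaunched and needsJoin are invented for this example. It only models the state changes, not the queue or the thread, and it is certainly not the proof the answer asks for.

```cpp
#include <atomic>
#include <cassert>

// The four states listed above.
enum class WorkerState { NotRunning, DoingTasks, IdleShutdown, Shutdown };

// Attempt one transition; returns false if the current state is not `from`.
inline bool transition(std::atomic<WorkerState>& state,
                       WorkerState from, WorkerState to) {
    return state.compare_exchange_strong(from, to);
}

// Producer side: launching is only legal from NotRunning; anything else
// is treated as a logic error, as the answer describes.
inline void markLaunched(std::atomic<WorkerState>& state) {
    bool ok = transition(state, WorkerState::NotRunning,
                         WorkerState::DoingTasks);
    assert(ok && "worker must be joined before it is relaunched");
    (void)ok; // silence unused-variable warnings when NDEBUG is defined
}

// Producer side: true when the old worker has to be joined (and the state
// reset to NotRunning) before a new worker may be launched.
inline bool needsJoin(const std::atomic<WorkerState>& state) {
    WorkerState s = state.load();
    return s == WorkerState::IdleShutdown || s == WorkerState::Shutdown;
}
```

The worker itself would call transition(state, DoingTasks, IdleShutdown) when it runs out of tasks, drain the queue once more, and then move to Shutdown before returning.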

This is my solution to the question. I don't much like answering my own question, but I think showing actual code may help others.
#include <boost/lockfree/spsc_queue.hpp>
#include <atomic>
#include <iostream>
#include <thread>
// I used this semaphore class: https://gist.github.com/yohhoy/2156481
#include "binsem.hpp"
using Queue =
boost::lockfree::spsc_queue<
int,
boost::lockfree::capacity<1024>>;
class Worker
{
public:
// the worker thread starts in the constructor
Worker()
: working_(ATOMIC_FLAG_INIT), done_(false), semaphore_(0)
, worker_([this]{ work(); })
{ }
~Worker() {
// exit even if the work has not been completed
done_ = true;
semaphore_.signal();
worker_.join();
}
bool enqueue(int value) {
bool enqueued = queue_.push(value);
if (!working_.test_and_set())
// signal to the worker thread to wake up
semaphore_.signal();
return enqueued;
}
void work() {
int value;
// the worker thread stays alive until done_ is set
while (!done_) {
// sleep until the start signal arrives
semaphore_.wait();
while (!done_ && queue_.pop(value)) {
// perform actual work
std::cout << value << std::endl;
}
working_.clear();
while (!done_ && queue_.pop(value)) {
// perform actual work
std::cout << value << std::endl;
}
}
}
private:
std::atomic_flag working_;
std::atomic<bool> done_;
binsem semaphore_;
Queue queue_;
std::thread worker_;
};
I tried @Cameron's suggestion to not shut down the thread and to add a semaphore. The semaphore is actually used only in the first enqueue and in the last work. This is not lock-free, but only in these two cases.
I did some performance comparisons between my previous version (see my edited question) and this one. There are no significant differences when there are not many starts and stops. However, enqueue is 10 times faster when it has to signal the worker thread instead of starting a new thread. This is a rare case, so it is not very important, but it is an improvement anyway.
This implementation satisfies:
lock-free in the common case (when enqueue and work are busy);
no busy wait when there are no enqueues for a long time;
the destructor exits as soon as possible;
correctness?? :)
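The binsem header itself is not shown here, so for readers without it, the following is a guess at a minimal binary semaphore with the same signal()/wait() interface, built from a mutex and a condition variable. The class name BinarySemaphore and the tryWait helper are made up for this sketch; the gist's actual implementation may differ.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the binsem class; only the signal()/wait()
// calls used by Worker are modelled.
class BinarySemaphore {
public:
    explicit BinarySemaphore(int initial) : available_(initial != 0) {
        assert(initial == 0 || initial == 1);
    }

    // Make the token available and wake one waiter. Repeated signals
    // collapse into a single token, which is what "binary" means here.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            available_ = true;
        }
        cv_.notify_one();
    }

    // Sleep (no busy wait) until the token is available, then take it.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return available_; });
        available_ = false;
    }

    // Non-blocking variant, handy for tests.
    bool tryWait() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!available_) return false;
        available_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool available_;
};
```

This makes the "not lock-free, but only in these two cases" remark visible: the mutex is touched only when the worker goes to sleep or is woken, never on the hot push/pop path.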

Very partial answer: I think all those atomics, semaphores and states are a back-communication channel, from "the thread" to "the Worker". Why not use another queue for that? At the very least, thinking about it will help you around the problem.

multithreaded program producer/consumer [boost]

I'm playing with the Boost library and C++. I want to create a multithreaded program that contains a producer, a consumer, and a stack. The producer fills the stack, and the consumer removes items (int) from the stack. Everything works (pop, push, mutex), but when I call pop/push within a thread, I don't get any effect.
I made this simple code:
#include "stdafx.h"
#include <stack>
#include <iostream>
#include <algorithm>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>
#include <boost/date_time.hpp>
#include <boost/signals2/mutex.hpp>
#include <ctime>
using namespace std;
/*
* This class represents a stack which is protected by a mutex.
* Pop and push are executed by one thread at a time.
*/
class ProtectedStack{
private :
stack<int> m_Stack;
boost::signals2::mutex m;
public :
ProtectedStack(){
}
ProtectedStack(const ProtectedStack & p){
}
void push(int x){
m.lock();
m_Stack.push(x);
m.unlock();
}
void pop(){
m.lock();
//return m_Stack.top();
if(!m_Stack.empty())
m_Stack.pop();
m.unlock();
}
int size(){
return m_Stack.size();
}
bool isEmpty(){
return m_Stack.empty();
}
int top(){
return m_Stack.top();
}
};
/*
* The producer is the class that fills the stack. It encapsulates the thread object.
*/
class Producer{
public:
Producer(int number ){
//create thread here but don't start here
m_Number=number;
}
void fillStack (ProtectedStack& s ) {
int object = 3; //random value
s.push(object);
//cout<<"push object\n";
}
void produce (ProtectedStack & s){
//call fill within a thread
m_Thread = boost::thread(&Producer::fillStack,this, s);
}
private :
int m_Number;
boost::thread m_Thread;
};
/* The consumer will consume the products produced by the producer */
class Consumer {
private :
int m_Number;
boost::thread m_Thread;
public:
Consumer(int n){
m_Number = n;
}
void remove(ProtectedStack &s ) {
if(s.isEmpty()){ // if the stack is empty sleep and wait for the producer to fill the stack
//cout<<"stack is empty\n";
boost::posix_time::seconds workTime(1);
boost::this_thread::sleep(workTime);
}
else{
s.pop(); //pop it
//cout<<"pop object\n";
}
}
void consume (ProtectedStack & s){
//call remove within a thread
m_Thread = boost::thread(&Consumer::remove, this, s);
}
};
int main(int argc, char* argv[])
{
ProtectedStack s;
Producer p(0);
p.produce(s);
Producer p2(1);
p2.produce(s);
cout<<"size after production "<<s.size()<<endl;
Consumer c(0);
c.consume(s);
Consumer c2(1);
c2.consume(s);
cout<<"size after consumption "<<s.size()<<endl;
getchar();
return 0;
}
After I ran this in VC++ 2010 on Windows 7,
I got:
0
0
Could you please help me understand why calling the fillStack function from main has an effect, but calling it from a thread does nothing?
Thank you.

Your example code suffers from several synchronization issues, as others have noted:
Some members of ProtectedStack (size, isEmpty, top) are called without taking the lock.
The main thread can exit without joining the worker threads.
The producer and consumer do not loop as you would expect. Producers should keep producing whenever they can, and consumers should keep consuming as new elements are pushed onto the stack.
The cout calls on the main thread may well run before the producers or consumers have had a chance to do any work.
I would recommend using a condition variable for synchronization between your producers and consumers. Take a look at the producer/consumer example here: http://en.cppreference.com/w/cpp/thread/condition_variable
It was added to the standard library in C++11 and is supported as of VS2012. Before VS2012, you would need either Boost or Win32 calls.
Using a condition variable to tackle a producer/consumer problem is nice because it all but enforces the use of a mutex to lock shared data, and it provides a signaling mechanism that lets consumers know something is ready to be consumed, so they don't have to spin (polling is always a trade-off between the responsiveness of the consumer and the CPU time spent checking the queue). The wait also releases the mutex and blocks as one atomic step, which prevents a thread from missing the signal that there is something to consume, as explained here: https://en.wikipedia.org/wiki/Sleeping_barber_problem
To give a brief run-down of how a condition variable takes care of this...
A producer does all of its time-consuming work on its own thread without owning the mutex.
The producer locks the mutex, adds the item it produced to a shared data structure (probably a queue of some sort), releases the mutex, and signals a single consumer to go, in that order.
A consumer that is waiting on the condition variable re-acquires the mutex automatically when it wakes, removes the item from the queue, and processes it. Meanwhile, the producer is already working on the next item; it only has to wait for as long as the consumer holds the mutex before it can queue that item.
This would have the following impact on your code:
No more need for ProtectedStack, a normal stack/queue data structure will do.
No need for boost if you are using a new enough compiler - removing build dependencies is always a nice thing.
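The run-down above can be sketched with the C++11 standard library roughly as follows. This is a minimal sketch, not the code from the cppreference example: the names Shared, producer, consumer and run_demo are hypothetical, it assumes a single producer and a single consumer, and the "done" flag is an assumed way of telling the consumer to stop.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <stack>
#include <thread>

// Shared state: a plain std::stack guarded by one mutex and one condition variable.
struct Shared {
    std::stack<int> items;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;  // assumption: the producer sets this after its last push
};

// Pushes 0..count-1, signalling one waiting consumer after each push.
void producer(Shared& s, int count) {
    for (int i = 0; i < count; ++i) {
        {
            std::lock_guard<std::mutex> lock(s.m);
            s.items.push(i);
        }  // mutex released here, before the notification
        s.cv.notify_one();
    }
    {
        std::lock_guard<std::mutex> lock(s.m);
        s.done = true;
    }
    s.cv.notify_all();
}

// Pops until the producer is done and the stack is empty; returns the sum of
// the consumed values.
int consumer(Shared& s) {
    int sum = 0;
    std::unique_lock<std::mutex> lock(s.m);
    for (;;) {
        // wait() releases the mutex while blocked and re-acquires it before
        // returning; the predicate handles spurious wake-ups and the case
        // where the item was pushed before this thread started waiting.
        s.cv.wait(lock, [&s] { return !s.items.empty() || s.done; });
        if (s.items.empty()) break;  // done and nothing left to consume
        int value = s.items.top();
        s.items.pop();
        lock.unlock();  // process the item without holding the mutex
        sum += value;
        lock.lock();
    }
    return sum;
}

// Hypothetical driver: starts one consumer and one producer, joins both,
// and returns what the consumer computed.
int run_demo(int count) {
    assert(count >= 0);
    Shared s;
    int sum = 0;
    std::thread c([&] { sum = consumer(s); });
    std::thread p(producer, std::ref(s), count);
    p.join();
    c.join();
    return sum;
}
```

Calling run_demo(100) from main should return 4950, the sum of 0 through 99, regardless of how the two threads interleave, because both threads are joined before the result is read.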
I get the feeling that threading is rather new to you, so my advice is to study how others have solved synchronization problems, since they are very difficult to reason about. Confusion about what is going on when multiple threads share data typically leads to issues like deadlocks down the road.

The major problem with your code is that your threads are not synchronized.
Remember that, by default, thread execution is neither ordered nor sequenced, so consumer threads can finish (and in your particular case do finish) before any producer thread has produced any data.
To make sure the consumers run after the producers have finished their work, call thread::join() on the producer threads; it blocks the main thread until the producers exit:
// Start producers
...
// m_Thread is private in the posted classes; make it accessible or add a public join() wrapper
p.m_Thread.join(); // Wait for p to complete
p2.m_Thread.join(); // Wait for p2 to complete
// Start consumers
...
This will do the trick, but it is probably not a good fit for a typical producer-consumer use case.
For a more useful design, you need to fix the consumer function.
Your consumer function doesn't actually wait for produced data: if the stack is empty, it sleeps once and then returns, so it never consumes anything if no data has been produced yet.
It should look like this:
void remove(ProtectedStack &s)
{
    // Place your actual exit condition here,
    // e.g. a count of consumed elements or some event
    // raised by the producers meaning no more data is available.
    // For testing/educational purposes it can be just while(true)
    while(!_some_exit_condition_)
    {
        if(s.isEmpty())
        {
            // A one-second sleep is too long; use milliseconds instead
            boost::posix_time::milliseconds workTime(1);
            boost::this_thread::sleep(workTime);
        }
        else
        {
            s.pop();
        }
    }
}
Another problem is incorrect use of the thread constructor:
m_Thread = boost::thread(&Producer::fillStack, this, s);
Quote from Boost.Thread documentation:
Thread Constructor with arguments
template <class F,class A1,class A2,...>
thread(F f,A1 a1,A2 a2,...);
Preconditions:
F and each An must by copyable or movable.
Effects:
As if thread(boost::bind(f,a1,a2,...)). Consequently, f and each an are copied into
internal storage for access by the new thread.
This means that each of your threads receives its own copy of s, and all modifications are applied to those thread-local copies rather than to s. It is the same as passing an object to a function by value. You need to pass s by reference instead, using boost::ref:
void produce(ProtectedStack& s)
{
m_Thread = boost::thread(&Producer::fillStack, this, boost::ref(s));
}
void consume(ProtectedStack& s)
{
m_Thread = boost::thread(&Consumer::remove, this, boost::ref(s));
}
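The copy-versus-reference behaviour can be reproduced in a few lines. The sketch below uses std::thread, std::bind and std::ref as stand-ins for their Boost counterparts, on the assumption that they follow the same "as if bind" copying rule quoted above; append_one, size_after_copy, size_after_ref and check_copy_vs_ref are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Worker that modifies whatever vector it is handed.
void append_one(std::vector<int>& v) { v.push_back(1); }

// The vector is copied into the bind object, so the thread appends to the
// copy and the caller's vector stays empty.
std::size_t size_after_copy() {
    std::vector<int> v;
    std::thread t(std::bind(append_one, v));
    t.join();
    return v.size();
}

// std::ref stores a reference wrapper instead of a copy, so the thread
// appends to the caller's vector.
std::size_t size_after_ref() {
    std::vector<int> v;
    std::thread t(std::bind(append_one, std::ref(v)));
    t.join();
    return v.size();
}

// Usage check: only the std::ref version changes the original.
void check_copy_vs_ref() {
    assert(size_after_copy() == 0);
    assert(size_after_ref() == 1);
}
```

This mirrors what happens to s in the posted code: without the reference wrapper, push and pop run on a private copy of the stack, which is why main always prints a size of 0.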
Another issue is your mutex usage, which could be better.
Why do you use the mutex from the Signals2 library? Just use boost::mutex from Boost.Thread and remove the unneeded dependency on Signals2.
Use the RAII wrapper boost::lock_guard instead of direct lock/unlock calls.
As other people mentioned, you should protect all members of ProtectedStack with the lock.
Sample:
boost::mutex m;
void push(int x)
{
    boost::lock_guard<boost::mutex> lock(m);
    m_Stack.push(x);
}
void pop()
{
    boost::lock_guard<boost::mutex> lock(m);
    if(!m_Stack.empty()) m_Stack.pop();
}
int size()
{
    boost::lock_guard<boost::mutex> lock(m);
    return m_Stack.size();
}
bool isEmpty()
{
    boost::lock_guard<boost::mutex> lock(m);
    return m_Stack.empty();
}
int top()
{
    boost::lock_guard<boost::mutex> lock(m);
    return m_Stack.top();
}
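One caveat with locking each member separately: a caller that checks isEmpty() and then calls top() or pop() takes the lock twice, and another thread can change the stack in between. A possible way around this, sketched here with std::mutex in place of boost::mutex, is a single operation that tests, reads and pops under one lock. The try_pop member and check_try_pop function are hypothetical additions, not part of the original class.

```cpp
#include <cassert>
#include <mutex>
#include <stack>

class ProtectedStack {
public:
    void push(int x) {
        std::lock_guard<std::mutex> lock(m_);
        stack_.push(x);
    }

    // Hypothetical helper: if the stack is non-empty, copies the top element
    // into out, pops it, and returns true. Returns false and leaves out
    // untouched if the stack is empty. The whole sequence runs under one lock.
    bool try_pop(int& out) {
        std::lock_guard<std::mutex> lock(m_);
        if (stack_.empty()) return false;
        out = stack_.top();
        stack_.pop();
        return true;
    }

private:
    std::stack<int> stack_;
    std::mutex m_;
};

// Usage check: elements come back in LIFO order, then try_pop reports empty.
void check_try_pop() {
    ProtectedStack s;
    int v = -1;
    assert(!s.try_pop(v));
    s.push(1);
    s.push(2);
    assert(s.try_pop(v) && v == 2);
    assert(s.try_pop(v) && v == 1);
    assert(!s.try_pop(v));
}
```

With this shape, the consumer loop can call try_pop directly and sleep only when it returns false, instead of pairing isEmpty() with pop().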

You're not checking that the producing thread has executed before you try to consume. You're also not locking around size/empty/top... that's not safe if the container's being updated.
