What are some use cases for memory_order_relaxed - c++

The C++ memory model has relaxed atomics, which place no ordering guarantees on surrounding memory operations. Apart from the mailbox example in C, which I found here:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1525.htm
and which is based on the motivating example in this paper:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2153.pdf
I am curious about other use cases for this type of synchronization mechanism.

A simple example that I see in my work frequently is a stats counter. If you
want to count the number of times an event happens but don't need any sort of
synchronization across threads aside from making the increment safe, using
memory_order_relaxed makes sense.
static std::atomic<size_t> g_event_count_;
void HandleEvent() {
// Increment the global count. This operation is safe and correct even
// if there are other threads concurrently running HandleEvent or
// PrintStats.
g_event_count_.fetch_add(1, std::memory_order_relaxed);
[...]
}
void PrintStats() {
// Snapshot the "current" value of the counter. "Current" is in scare
// quotes because the value may change while this function is running.
// But unlike a plain old size_t, reading from std::atomic<size_t> is
// safe.
const size_t event_count =
g_event_count_.load(std::memory_order_relaxed);
// Use event_count in a report.
[...]
}
In both cases, there is no need to use a stronger memory order. On some
platforms, doing so could have negative performance impact.
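A minimal runnable sketch of the counter pattern above. The helper name `count_events` is an assumption made for illustration and is not part of the original answer. It starts a few threads, lets each perform relaxed increments, and reads the total after joining them. The joins, not the memory order, are what guarantee that the final load sees every increment.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: each of `num_threads` threads bumps a shared
// counter `per_thread` times using relaxed increments.
std::size_t count_events(std::size_t num_threads, std::size_t per_thread) {
    std::atomic<std::size_t> counter{0};
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (std::size_t i = 0; i < per_thread; ++i) {
                // Atomic read-modify-write: no increment is lost, but no
                // ordering with other memory operations is implied.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    // join() synchronizes with the completion of each thread, so this
    // load observes all increments.
    return counter.load(std::memory_order_relaxed);
}
```

Calling `count_events(4, 10000)` should yield exactly 40000, because fetch_add stays atomic regardless of the memory order chosen.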

The event reader in this case could be connected to an X11 socket, where the frequency of events depends on user actions (resizing a window, typing, etc.). If the GUI thread's event dispatcher checks for events at regular intervals (e.g. due to timer events in the user application), we don't want to needlessly block the event reader thread by acquiring the lock on the shared event queue when we know it is empty. We can simply check whether anything has been queued by using the 'dataReady' atomic. This is also known as the "double-checked locking" pattern.
namespace {
std::mutex mutex;
std::atomic_bool dataReady(false);
std::atomic_bool done(false);
std::deque<int> events; // shared event queue, protected by mutex
}
void eventReaderThread()
{
static int eventId = 0;
std::chrono::milliseconds ms(100);
while (true) {
std::this_thread::sleep_for(ms);
mutex.lock();
eventId++; // populate event queue, e.g. from pending messages on a socket
events.push_back(eventId);
dataReady.store(true, std::memory_order_release);
mutex.unlock();
if (eventId == 10) {
done.store(true, std::memory_order_release);
break;
}
}
}
void guiThread()
{
while (!done.load(std::memory_order_acquire)) {
if (dataReady.load(std::memory_order_acquire)) { // Double-checked locking pattern
mutex.lock();
std::cout << events.front() << std::endl;
events.pop_front();
// If guiThread() loops again and eventReaderThread() has not added new events yet,
// we will see the value set via this memory_order_relaxed store.
// If eventReaderThread() has added new events, we will see that as well due to the
// normal release->acquire pairing.
// relaxed docs say: "guarantee atomicity and modification order consistency"
dataReady.store(false, std::memory_order_relaxed);
mutex.unlock();
}
}
}
int main()
{
std::thread producerThread(eventReaderThread);
std::thread consumerThread(guiThread);
producerThread.join();
consumerThread.join();
}

Related

Minimal mutexes for std::queue producer/consumer

I have two threads that work the producer and consumer sides of a std::queue. The queue isn't often full, so I'd like to avoid the consumer grabbing the mutex that is guarding mutating the queue.
Is it okay to call empty() outside the mutex then only grab the mutex if there is something in the queue?
For example:
struct MyData{
int a;
int b;
};
class SpeedyAccess{
public:
void AddDataFromThread1(MyData data){
const std::lock_guard<std::mutex> queueMutexLock(queueAccess);
workQueue.push(data);
}
void CheckFromThread2(){
if(!workQueue.empty()) // Un-protected access...is this dangerous?
{
queueAccess.lock();
MyData data = workQueue.front();
workQueue.pop();
queueAccess.unlock();
ExpensiveComputation(data);
}
}
private:
void ExpensiveComputation(MyData& data);
std::queue<MyData> workQueue;
std::mutex queueAccess;
};
Thread 2 does the check and isn't particularly time-critical, but will get called a lot (500/sec?). Thread 1 is very time critical, a lot of stuff needs to run there, but isn't called as frequently (max 20/sec).
If I add a mutex guard around empty(), if the queue is empty when thread 2 comes, it won't hold the mutex for long, so might not be a big hit. However, since it gets called so frequently, it might occasionally happen at the same time something is trying to get put on the back....will this cause a substantial amount of waiting in thread 1?
As written in the comments above, you should call empty() only under a lock.
But I believe there is a better way to do it.
You can use a std::condition_variable together with a std::mutex, to achieve synchronization of access to the queue, without locking the mutex more than you must.
However, when using std::condition_variable you must be aware that it suffers from spurious wakeups. You can read about them in the Wikipedia article "Spurious wakeup".
You can see some code examples here:
Condition variable examples.
The correct way to use a std::condition_variable is demonstrated below (with some comments).
This is just a minimal example to show the principle.
#include <chrono>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <queue>
#include <iostream>
using MyData = int;
std::mutex mtx;
std::condition_variable cond_var;
std::queue<MyData> q;
void producer()
{
MyData produced_val = 0;
while (true)
{
std::this_thread::sleep_for(std::chrono::milliseconds(1000)); // simulate some pause between productions
++produced_val;
std::cout << "produced: " << produced_val << std::endl;
{
// Access the Q under the lock:
std::unique_lock<std::mutex> lck(mtx);
q.push(produced_val);
cond_var.notify_all(); // Notifying under the lock is not required, but it might be more efficient (see @DavidSchwartz's comment below).
}
}
}
void consumer()
{
while (true)
{
MyData consumed_val;
{
// Access the Q under the lock:
std::unique_lock<std::mutex> lck(mtx);
// NOTE: The following call will re-lock the mutex only when the condition_variable causes a wakeup
// (due to `notify` or a spurious wakeup).
// Then it will check if the Q is empty.
// If empty it will release the lock and continue to wait.
// If not empty, the lock will be kept until out of scope.
// See the documentation for std::condition_variable.
cond_var.wait(lck, []() { return !q.empty(); }); // will loop internally to handle spurious wakeups
consumed_val = q.front();
q.pop();
}
std::cout << "consumed: " << consumed_val << std::endl;
std::this_thread::sleep_for(std::chrono::milliseconds(200)); // simulate some calculation
}
}
int main()
{
std::thread p(producer);
std::thread c(consumer);
p.join(); c.join(); // both threads loop forever here, so these calls block; they remind us what is needed.
return 0;
}
Some notes:
In your real code, none of the threads should run forever. You should have some mechanism to notify them to gracefully exit.
The global variables (mtx,q etc.) are better to be members of some context class, or passed to the producer() and consumer() as parameters.
This example assumes for simplicity that the producer's production rate is always low relatively to the consumer's rate. In your real code you can make it more general, by making the consumer extract all elements in the Q each time the condition_variable is signaled.
You can "play" with the sleep_for times for the producer and consumer to test various timing cases.
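Two of the notes above (a graceful exit mechanism, and a consumer that extracts everything in the queue per wakeup) can be sketched together. This is only an illustration under assumptions: the class name `DrainQueue` and its member names are hypothetical and do not come from the answer.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical wrapper: a queue whose consumer takes every pending
// element per wakeup and can be told to stop.
class DrainQueue {
public:
    void push(int value) {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            items_.push(value);
        }
        cond_.notify_one();
    }

    // Ask waiting consumers to return once the queue is empty.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            stopped_ = true;
        }
        cond_.notify_all();
    }

    // Blocks until there is data or stop() was called. Returns false only
    // when stopped and nothing is left; otherwise hands all pending
    // elements to `out` with a single lock acquisition.
    bool wait_and_drain(std::queue<int>& out) {
        std::unique_lock<std::mutex> lock(mtx_);
        cond_.wait(lock, [this] { return stopped_ || !items_.empty(); });
        if (items_.empty()) return false;
        out = std::queue<int>();
        out.swap(items_);  // items_ is now empty, out holds the batch
        return true;
    }

private:
    std::mutex mtx_;
    std::condition_variable cond_;
    std::queue<int> items_;
    bool stopped_ = false;
};
```

A consumer thread would loop on `while (q.wait_and_drain(batch)) { ... }` and process the batch outside the lock, so the producer is blocked only for the duration of the swap.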

Using std::condition_variable with atomic<bool>

There are several questions on SO dealing with atomic, and others that deal with std::condition_variable. But my question is whether my use below is correct.
Three threads, one ctrl thread that does preparation work before unpausing the two other threads. The ctrl thread also is able to pause the worker threads (sender/receiver) while they are in their tight send/receive loops.
The idea with using the atomic is to make the tight loops faster in case the boolean for pausing is not set.
class SomeClass
{
public:
//...
// Disregard that data is public...
std::condition_variable cv; // UDP threads will wait on this cv until allowed
// to run by ctrl thread.
std::mutex cv_m;
std::atomic<bool> pause_test_threads;
};
void do_pause_test_threads(SomeClass *someclass)
{
if (!someclass->pause_test_threads)
{
// Even though we use an atomic, mutex must be held during
// modification. See documentation of condition variable
// notify_all/wait. Mutex does not need to be held for the actual
// notify call.
std::lock_guard<std::mutex> lk(someclass->cv_m);
someclass->pause_test_threads = true;
}
}
void unpause_test_threads(SomeClass *someclass)
{
if (someclass->pause_test_threads)
{
{
// Even though we use an atomic, mutex must be held during
// modification. See documentation of condition variable
// notify_all/wait. Mutex does not need to be held for the actual
// notify call.
std::lock_guard<std::mutex> lk(someclass->cv_m);
someclass->pause_test_threads = false;
}
someclass->cv.notify_all(); // Allow send/receive threads to run.
}
}
void wait_to_start(SomeClass *someclass)
{
std::unique_lock<std::mutex> lk(someclass->cv_m); // RAII, no need for unlock.
auto not_paused = [someclass](){return someclass->pause_test_threads == false;};
someclass->cv.wait(lk, not_paused);
}
void ctrl_thread(SomeClass *someclass)
{
// Do startup work
// ...
unpause_test_threads(someclass);
for (;;)
{
// ... check for end-program etc, if so, break;
if (lost ctrl connection to other endpoint)
{
do_pause_test_threads(someclass);
}
else
{
unpause_test_threads(someclass);
}
sleep(SLEEP_INTERVAL);
}
unpause_test_threads(someclass);
}
void sender_thread(SomeClass *someclass)
{
wait_to_start(someclass);
...
for (;;)
{
// ... check for end-program etc, if so, break;
if (someclass->pause_test_threads) wait_to_start(someclass);
...
}
}
void receiver_thread(SomeClass *someclass)
{
wait_to_start(someclass);
...
for (;;)
{
// ... check for end-program etc, if so, break;
if (someclass->pause_test_threads) wait_to_start(someclass);
...
}
}
I looked through your code manipulating the condition variable and the atomic, and it seems correct and should not cause problems.
Why you should protect writes to the shared variable even if it is atomic:
There could be a problem if the write to the shared variable happens between a waiter checking it in the predicate and the waiter going to sleep on the condition variable. Consider the following sequence:
1. The waiting thread wakes spuriously, acquires the mutex, checks the predicate and finds it false, so it must wait on the cv again.
2. The controlling thread changes the shared variable so that the predicate is now true.
3. The controlling thread sends a notification, which nobody receives because no thread is currently waiting on the condition variable.
4. The waiting thread waits on the condition variable. Since the notification was already sent, it waits until the next spurious wakeup or the next notification from the controlling thread, potentially indefinitely.
Reading shared atomic variables without locking is generally safe, unless it introduces TOCTOU problems.
In your case you are reading shared variable to avoid unnecessary locking and then checking it again after lock (in conditional wait call). It is a valid optimisation, called double-checked locking and I do not see any potential problems here.
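The pattern described here (unlocked atomic read as a fast path, write and re-check under the mutex) might be packaged as follows. This is a sketch under assumptions: `PauseGate` and its member names are hypothetical and are not taken from the question's code.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>

// Hypothetical helper class illustrating the double-checked pause flag.
class PauseGate {
public:
    void pause() {
        std::lock_guard<std::mutex> lock(mtx_);
        paused_.store(true);
    }

    void resume() {
        {
            // The write happens under the mutex so it cannot slip in
            // between a waiter's predicate check and its wait.
            std::lock_guard<std::mutex> lock(mtx_);
            paused_.store(false);
        }
        cond_.notify_all();
    }

    // Fast path: a single atomic load when not paused.
    // Slow path: re-check under the mutex inside wait().
    void wait_if_paused() {
        if (!paused_.load()) return;
        std::unique_lock<std::mutex> lock(mtx_);
        cond_.wait(lock, [this] { return !paused_.load(); });
    }

    bool is_paused() const { return paused_.load(); }

private:
    std::mutex mtx_;
    std::condition_variable cond_;
    std::atomic<bool> paused_{false};
};
```

A worker loop would call `gate.wait_if_paused()` once per iteration. Whether the fast path actually pays off depends on the workload and should be measured.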
You might want to check whether atomic<bool> is lock-free. Otherwise you will end up with even more locking than you would have without it.
In general, you want to treat the fact that the variable is atomic independently of how it works with a condition variable.
If all code that interacts with the condition variable follows the usual pattern of locking the mutex before query/modification, and the code interacting with the condition variable does not rely on code that does not interact with the condition variable, it will continue to be correct even if the guarded variable is an atomic.
From a quick read of your pseudo-code, this appears to be correct. However, pseudo-code is often a poor substitute for real code for multi-threaded code.
The "optimization" of only waiting on the condition variable (and locking the mutex) when an atomic read says you might want to may or may not be an optimization. You need to profile throughput.
Atomic data doesn't need additional synchronization; it is the basis of lock-free algorithms and data structures.
void do_pause_test_threads(SomeClass *someclass)
{
if (!someclass->pause_test_threads)
{
/// your pause_test_threads might be changed here by other thread
/// so you have to acquire mutex before checking and changing
/// or use atomic methods - compare_exchange_weak/strong,
/// but not all together
std::lock_guard<std::mutex> lk(someclass->cv_m);
someclass->pause_test_threads = true;
}
}
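The compare_exchange alternative mentioned in the comment above could look like the following sketch. `try_set_paused` is a hypothetical name. It covers only the pausing direction, where no waiter has to be woken; clearing the flag for threads blocked in cv.wait still needs the mutex, as the other answers explain.

```cpp
#include <atomic>

// Hypothetical helper: atomically flips the flag from false to true.
// Returns true only for the caller that actually made the transition, so
// the check and the modification cannot be separated by another thread.
bool try_set_paused(std::atomic<bool>& paused) {
    bool expected = false;
    return paused.compare_exchange_strong(expected, true);
}
```

If two threads race to pause, exactly one call returns true and the other returns false while leaving the flag set.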

Why is there no wait function for condition_variable which does not relock the mutex

Consider the following example.
std::mutex mtx;
std::condition_variable cv;
void f()
{
{
std::unique_lock<std::mutex> lock( mtx );
cv.wait( lock ); // 1
}
std::cout << "f()\n";
}
void g()
{
std::this_thread::sleep_for( std::chrono::seconds( 1 ) );
cv.notify_one();
}
int main()
{
std::thread t1{ f };
std::thread t2{ g };
t2.join();
t1.join();
}
g() "knows" that f() is waiting in the scenario I would like to discuss.
According to cppreference.com there is no need for g() to lock the mutex before calling notify_one. Now in the line marked "1" cv will release the mutex and relock it once the notification is sent. The destructor of lock releases it again immediately after that. This seems to be superfluous especially since locking is expensive. (I know in certain scenarios the mutex needs to be locked. But this is not the case here.)
Why does condition_variable have no function "wait_nolock" which does not relock the mutex once the notification arrives? If the answer is that pthreads does not provide such functionality: why can't pthreads be extended to provide it? Is there an alternative for realizing the desired behavior?
You misunderstand what your code does.
Your code on line // 1 is free to not block at all. condition_variables can (and will!) have spurious wakeups -- they can wake up for no good reason at all.
You are responsible for checking if the wakeup is spurious.
Using a condition_variable properly requires 3 things:
A condition_variable
A mutex
Some data guarded by the mutex
The data guarded by the mutex is modified (under the mutex). Then (with the mutex possibly disengaged), the condition_variable is notified.
On the other end, you lock the mutex, then wait on the condition variable. When you wake up, your mutex is relocked, and you test if the wakeup is spurious by looking at the data guarded by the mutex. If it is a valid wakeup, you process and proceed.
If it wasn't a valid wakeup, you go back to waiting.
In your case, you don't have any data guarded, you cannot distinguish spurious wakeups from real ones, and your design is incomplete.
Not surprisingly with the incomplete design you don't see the reason why the mutex is relocked: it is relocked so you can safely check the data to see if the wakeup was spurious or not.
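As an illustration of the three ingredients listed above, here is a sketch of the question's f()/g() pair with a guarded flag added. The function name `notify_with_guarded_flag` and the variable `ready` are assumptions made for this example.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns true once the waiter has observed the guarded flag.
bool notify_with_guarded_flag() {
    std::mutex mtx;
    std::condition_variable cv;
    bool ready = false;     // the data guarded by mtx
    bool observed = false;  // written by the waiter, read after join()

    std::thread waiter([&] {
        std::unique_lock<std::mutex> lock(mtx);
        // Re-checks `ready` after every wakeup, so spurious wakeups and
        // a notification sent before the wait began are both handled.
        cv.wait(lock, [&] { return ready; });
        observed = ready;
    });

    {
        std::lock_guard<std::mutex> lock(mtx);
        ready = true;       // modify the data under the mutex
    }
    cv.notify_one();        // notify with the mutex released

    waiter.join();
    return observed;
}
```

Because the predicate is evaluated under the mutex before blocking, the sketch does not depend on the waiter having started to wait before the notification is sent.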
If you want to know why condition variables are designed that way, it is probably because this design is more efficient than the "reliable" one (for whatever reason), and rather than exposing higher level primitives, C++ exposed the lower level, more efficient primitives.
Building a higher level abstraction on top of this isn't hard, but there are design decisions. Here is one built on top of std::experimental::optional:
template<class T>
struct data_passer {
std::experimental::optional<T> data;
bool abort_flag = false;
std::mutex guard;
std::condition_variable signal;
void send( T t ) {
{
std::unique_lock<std::mutex> _(guard);
data = std::move(t);
}
signal.notify_one();
}
void abort() {
{
std::unique_lock<std::mutex> _(guard);
abort_flag = true;
}
signal.notify_all();
}
std::experimental::optional<T> get() {
std::unique_lock<std::mutex> _(guard);
signal.wait( _, [this]()->bool{
return data || abort_flag;
});
if (abort_flag) return {};
T retval = std::move(*data);
data = {};
return retval;
}
};
Now, each send can cause a get to succeed at the other end. If more than one send occurs, only the latest one is consumed by a get. If and when abort_flag is set, instead get() immediately returns {};
The above supports multiple consumers and producers.
An example of how the above might be used is a source of preview state (say, a UI thread), and one or more preview renderers (which are not fast enough to be run in the UI thread).
The preview state dumps a preview state into the data_passer<preview_state> willy-nilly. The renderers compete and one of them grabs it. Then they render it, and pass it back (through whatever mechanism).
If the preview states come faster than the renderers consume them, only the most recent one is of interest, so the earlier ones are discarded. But existing previews aren't aborted just because a new state shows up.
Questions were asked below about race conditions.
If the data being communicated is atomic, can't we do without the mutex on the "send" side?
So something like this:
template<class T>
struct data_passer {
std::atomic<std::experimental::optional<T>> data;
std::atomic<bool> abort_flag = false;
std::mutex guard;
std::condition_variable signal;
void send( T t ) {
data = std::move(t); // 1a
signal.notify_one(); // 1b
}
void abort() {
abort_flag = true; // 1a
signal.notify_all(); // 1b
}
std::experimental::optional<T> get() {
std::unique_lock<std::mutex> _(guard); // 2a
signal.wait( _, [this]()->bool{ // 2b
return data.load() || abort_flag.load(); // 2c
});
if (abort_flag.load()) return {};
T retval = std::move(*data.load());
// data = std::experimental::nullopt; // doesn't make sense
return retval;
}
};
The above fails to work.
We start with the listening thread. It does step 2a, then waits (2b). It evaluates the condition at step 2c, but doesn't return from the lambda yet.
The broadcasting thread then does step 1a (setting the data), then signals the condition variable. At this moment, nobody is waiting on the condition variable (the code in the lambda doesn't count!).
The listening thread then finishes the lambda, and returns "spurious wakeup". It then blocks on the condition variable, and never notices that data was sent.
The std::mutex used while waiting on the condition variable must guard the write to the data "passed" by the condition variable (whatever test you do to determine if the wakeup was spurious), and the read (in the lambda), or the possibility of "lost signals" exists. (At least in a simple implementation: more complex implementations can create lock-free paths for "common cases" and only use the mutex in a double-check. This is beyond the scope of this question.)
Using atomic variables does not get around this problem, because the two operations of "determine if the message was spurious" and "rewait in the condition variable" must be atomic with regards to the "spuriousness" of the message.

multithreaded program producer/consumer [boost]

I'm playing with the Boost library and C++. I want to create a multithreaded program that contains a producer, a consumer, and a stack. The producer fills the stack, and the consumer removes items (int) from the stack. Everything works (pop, push, mutex), but when I call pop/push within a thread, I don't get any effect.
I made this simple code:
#include "stdafx.h"
#include <stack>
#include <iostream>
#include <algorithm>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>
#include <boost/date_time.hpp>
#include <boost/signals2/mutex.hpp>
#include <ctime>
using namespace std;
/*
* This class represents a stack which is protected by a mutex.
* Pop and push are executed by one thread at a time.
*/
class ProtectedStack{
private :
stack<int> m_Stack;
boost::signals2::mutex m;
public :
ProtectedStack(){
}
ProtectedStack(const ProtectedStack & p){
}
void push(int x){
m.lock();
m_Stack.push(x);
m.unlock();
}
void pop(){
m.lock();
//return m_Stack.top();
if(!m_Stack.empty())
m_Stack.pop();
m.unlock();
}
int size(){
return m_Stack.size();
}
bool isEmpty(){
return m_Stack.empty();
}
int top(){
return m_Stack.top();
}
};
/*
*The producer is the class that fills the stack. It encapsulate the thread object
*/
class Producer{
public:
Producer(int number ){
//create thread here but don't start here
m_Number=number;
}
void fillStack (ProtectedStack& s ) {
int object = 3; //random value
s.push(object);
//cout<<"push object\n";
}
void produce (ProtectedStack & s){
//call fill within a thread
m_Thread = boost::thread(&Producer::fillStack,this, s);
}
private :
int m_Number;
boost::thread m_Thread;
};
/* The consumer will consume the products produced by the producer */
class Consumer {
private :
int m_Number;
boost::thread m_Thread;
public:
Consumer(int n){
m_Number = n;
}
void remove(ProtectedStack &s ) {
if(s.isEmpty()){ // if the stack is empty sleep and wait for the producer to fill the stack
//cout<<"stack is empty\n";
boost::posix_time::seconds workTime(1);
boost::this_thread::sleep(workTime);
}
else{
s.pop(); //pop it
//cout<<"pop object\n";
}
}
void consume (ProtectedStack & s){
//call remove within a thread
m_Thread = boost::thread(&Consumer::remove, this, s);
}
};
int main(int argc, char* argv[])
{
ProtectedStack s;
Producer p(0);
p.produce(s);
Producer p2(1);
p2.produce(s);
cout<<"size after production "<<s.size()<<endl;
Consumer c(0);
c.consume(s);
Consumer c2(1);
c2.consume(s);
cout<<"size after consumption "<<s.size()<<endl;
getchar();
return 0;
}
After I ran that in VC++ 2010 / Win7
I got:
0
0
Could you please help me understand why calling the fillStack function from main has an effect, but calling it from a thread does nothing?
Thank you
Your example code suffers from a couple synchronization issues as noted by others:
Missing locks on calls to some of the members of ProtectedStack.
Main thread could exit without allowing worker threads to join.
The producer and consumer do not loop as you would expect. Producers should always (when they can) be producing, and consumers should keep consuming as new elements are pushed onto the stack.
cout's on the main thread may very well be performed before the producers or consumers have had a chance to work yet.
I would recommend looking at using a condition variable for synchronization between your producers and consumers. Take a look at the producer/consumer example here: http://en.cppreference.com/w/cpp/thread/condition_variable
It is a rather new feature in the standard library as of C++11 and supported as of VS2012. Before VS2012, you would either need boost or to use Win32 calls.
Using a condition variable to tackle a producer/consumer problem is nice because it almost enforces the use of a mutex to lock shared data, and it provides a signaling mechanism to let consumers know something is ready to be consumed so they don't have to spin (which is always a trade-off between the responsiveness of the consumer and the CPU usage of polling the queue). The wait also releases the mutex and blocks atomically, which prevents threads from missing a signal that there is something to consume, as explained here: https://en.wikipedia.org/wiki/Sleeping_barber_problem
To give a brief run-down of how a condition variable takes care of this...
A producer does all time-consuming activities on its thread without owning the mutex.
The producer locks the mutex, adds the item it produced to a global data structure (probably a queue of some sort), lets go of the mutex and signals a single consumer to go -- in that order.
A consumer that is waiting on the condition variable re-acquires the mutex automatically, removes the item out of the queue and does some processing on it. During this time, the producer is already working on producing a new item but has to wait until the consumer is done before it can queue the item up.
This would have the following impact on your code:
No more need for ProtectedStack, a normal stack/queue data structure will do.
No need for boost if you are using a new enough compiler - removing build dependencies is always a nice thing.
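A sketch of what the run-down above might look like with only the standard library and a plain std::stack. The function name `run_producers_and_consumer` and its parameters are assumptions made for illustration; a real program would use a proper shutdown signal rather than a known item count.

```cpp
#include <condition_variable>
#include <mutex>
#include <stack>
#include <thread>
#include <vector>

// Hypothetical demo: `producers` threads each push `per_producer` ints;
// one consumer pops until it has seen them all. Returns the number popped.
int run_producers_and_consumer(int producers, int per_producer) {
    std::mutex mtx;
    std::condition_variable cv;
    std::stack<int> items;  // plain container, guarded by mtx
    const int expected = producers * per_producer;
    int consumed = 0;       // touched only by the consumer until join()

    std::thread consumer([&] {
        while (consumed < expected) {
            std::unique_lock<std::mutex> lock(mtx);
            // Sleeps without spinning; the mutex is re-acquired on wakeup.
            cv.wait(lock, [&] { return !items.empty(); });
            items.pop();
            ++consumed;
        }
    });

    std::vector<std::thread> workers;
    for (int p = 0; p < producers; ++p) {
        workers.emplace_back([&, p] {
            for (int i = 0; i < per_producer; ++i) {
                {
                    std::lock_guard<std::mutex> lock(mtx);
                    items.push(p * per_producer + i);
                }
                cv.notify_one();  // signal after releasing the mutex
            }
        });
    }
    for (auto& w : workers) w.join();
    consumer.join();
    return consumed;
}
```

With two producers pushing 50 items each, the function returns 100 regardless of how the threads interleave.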
I get the feeling that threading is rather new to you so I can only offer the advice to look at how others have solved synchronization issues as it is very difficult to wrap your mind around. Confusion about what is going on in an environment with multiple threads and shared data typically leads to issues like deadlocks down the road.
The major problem with your code is that your threads are not synchronized.
Remember that by default threads execution isn't ordered and isn't sequenced, so consumer threads actually can be (and in your particular case are) finished before any producer thread produces any data.
To make sure the consumers run only after the producers have finished their work, you need to call thread::join() on the producer threads; it blocks the main thread until the producers exit:
// Start producers
...
p.m_Thread.join(); // Wait p to complete
p2.m_Thread.join(); // Wait p2 to complete
// Start consumers
...
This will do the trick, but it is probably not a good fit for a typical producer-consumer use case.
To handle the more useful case you need to fix the consumer function.
Your consumer function doesn't actually wait for produced data: if the stack is empty it sleeps once and exits, so it never consumes anything when no data has been produced yet.
It should look like this:
void remove(ProtectedStack &s)
{
// Place your actual exit condition here,
// e.g. count of consumed elements or some event
// raised by producers meaning no more data available etc.
// For testing/educational purpose it can be just while(true)
while(!_some_exit_condition_)
{
if(s.isEmpty())
{
// Second sleeping is too big, use milliseconds instead
boost::posix_time::milliseconds workTime(1);
boost::this_thread::sleep(workTime);
}
else
{
s.pop();
}
}
}
Another problem is wrong thread constructor usage:
m_Thread = boost::thread(&Producer::fillStack, this, s);
Quote from Boost.Thread documentation:
Thread Constructor with arguments
template <class F,class A1,class A2,...>
thread(F f,A1 a1,A2 a2,...);
Preconditions:
F and each An must by copyable or movable.
Effects:
As if thread(boost::bind(f,a1,a2,...)). Consequently, f and each an are copied into
internal storage for access by the new thread.
This means that each of your threads receives its own copy of s, and all modifications are applied to those thread-local copies rather than to s. It is the same as passing an object to a function by value. You need to pass s by reference instead, using boost::ref:
void produce(ProtectedStack& s)
{
m_Thread = boost::thread(&Producer::fillStack, this, boost::ref(s));
}
void consume(ProtectedStack& s)
{
m_Thread = boost::thread(&Consumer::remove, this, boost::ref(s));
}
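The same idea carries over to std::thread, shown here as a sketch with hypothetical names (`append_value`, `push_via_ref`). One difference worth noting: std::thread refuses to compile when a copied argument would have to bind to a non-const lvalue reference parameter, so forgetting std::ref is a build error there rather than a silent copy.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Worker that modifies a container owned by the caller.
void append_value(std::vector<int>& out, int v) { out.push_back(v); }

// Hypothetical demo: returns the size of the caller's vector after the
// thread has run.
std::size_t push_via_ref() {
    std::vector<int> shared;
    // std::ref wraps `shared` in a reference_wrapper, so the thread works
    // on the original object instead of a private copy.
    std::thread t(append_value, std::ref(shared), 3);
    t.join();
    return shared.size();
}
```

After the join, `shared` contains the pushed element, which is the effect the question expected from its producer threads.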
Another issue is your mutex usage. It's not the best possible.
Why do you use the mutex from the Signals2 library? Just use boost::mutex from Boost.Thread and remove the unneeded dependency on Signals2.
Use the RAII wrapper boost::lock_guard instead of direct lock/unlock calls.
As other people mentioned, you should protect all members of ProtectedStack with the lock.
Sample:
boost::mutex m;
void push(int x)
{
boost::lock_guard<boost::mutex> lock(m);
m_Stack.push(x);
}
void pop()
{
boost::lock_guard<boost::mutex> lock(m);
if(!m_Stack.empty()) m_Stack.pop();
}
int size()
{
boost::lock_guard<boost::mutex> lock(m);
return m_Stack.size();
}
bool isEmpty()
{
boost::lock_guard<boost::mutex> lock(m);
return m_Stack.empty();
}
int top()
{
boost::lock_guard<boost::mutex> lock(m);
return m_Stack.top();
}
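Even with every member locked, a caller that runs isEmpty(), top() and pop() as three separate calls can be interleaved with another consumer between them. One way to close that gap is a single combined operation. The sketch below uses the standard library (C++17 for std::optional) instead of Boost, and the names `LockedStack` and `try_pop` are assumptions, not part of the answer.

```cpp
#include <mutex>
#include <optional>
#include <stack>

// Hypothetical variant of the protected stack with a combined pop.
class LockedStack {
public:
    void push(int x) {
        std::lock_guard<std::mutex> lock(m_);
        stack_.push(x);
    }

    // Checks for emptiness, reads the top and removes it under a single
    // lock, so no other thread can pop in between.
    std::optional<int> try_pop() {
        std::lock_guard<std::mutex> lock(m_);
        if (stack_.empty()) return std::nullopt;
        int value = stack_.top();
        stack_.pop();
        return value;
    }

private:
    std::mutex m_;
    std::stack<int> stack_;
};
```

A consumer can then write `if (auto v = s.try_pop()) { use(*v); }` without a separate emptiness check.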
You're not checking that the producing thread has executed before you try to consume. You're also not locking around size/empty/top... that's not safe if the container's being updated.

C++ multithreading, simple consumer / producer threads, LIFO, notification, counter

I am new to multi-thread programming, I want to implement the following functionality.
There are 2 threads, producer and consumer.
Consumer only processes the latest value, i.e., last in first out (LIFO).
Producer sometimes generates new value at a faster rate than consumer can
process. For example, producer may generate 2 new value in 1
milli-second, but it approximately takes consumer 5 milli-seconds to process.
If consumer receives a new value in the middle of processing an old
value, there is no need to interrupt. In other words, consumer will finish current
execution first, then start an execution on the latest value.
Here is my design process, please correct me if I am wrong.
There is no need for a queue, since only the latest value is
processed by consumer.
Is notification sent from producer being queued automatically???
I will use a counter instead.
ConsumerThread() check the counter at the end, to make sure producer
doesn't generate new value.
But what happen if producer generates a new value just before consumer
goes to sleep(), but after check the counter???
Here is some pseudo code.
boost::mutex mutex;
double x;
void ProducerThread()
{
{
boost::scoped_lock lock(mutex);
x = rand();
counter++;
}
notify(); // wake up consumer thread
}
void ConsumerThread()
{
counter = 0; // reset counter, only process the latest value
... do something which takes 5 milli-seconds ...
if (counter > 0)
{
... execute this function again, not too sure how to implement this ...
}
else
{
... what happen if producer generates a new value here??? ...
sleep();
}
}
Thanks.
If I understood your question correctly, for your particular application, the consumer only needs to process the latest available value provided by the producer. In other words, it's acceptable for values to get dropped because the consumer cannot keep up with the producer.
If that's the case, then I agree that you can get away without a queue and use a counter. However, the shared counter and value variables will need to be accessed atomically.
You can use boost::condition_variable to signal notifications to the consumer that a new value is ready. Here is a complete example; I'll let the comments do the explaining.
#include <cstdlib>
#include <iostream>
#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>
#include <boost/thread/locks.hpp>
#include <boost/date_time/posix_time/posix_time_types.hpp>
boost::mutex mutex;
boost::condition_variable condvar;
typedef boost::unique_lock<boost::mutex> LockType;
// Variables that are shared between producer and consumer.
double value = 0;
int count = 0;
void producer()
{
while (true)
{
{
// value and counter must both be updated atomically
// using a mutex lock
LockType lock(mutex);
value = std::rand();
++count;
// Notify the consumer that a new value is ready.
condvar.notify_one();
}
// Simulate exaggerated 2ms delay
boost::this_thread::sleep(boost::posix_time::milliseconds(200));
}
}
void consumer()
{
// Local copies of 'count' and 'value' variables. We want to do the
// work using local copies so that they don't get clobbered by
// the producer when it updates.
int currentCount = 0;
double currentValue = 0;
while (true)
{
{
// Acquire the mutex before accessing 'count' and 'value' variables.
LockType lock(mutex); // mutex is locked while in this scope
while (count == currentCount)
{
// Wait for producer to signal that there is a new value.
// While we are waiting, Boost releases the mutex so that
// other threads may acquire it.
condvar.wait(lock);
}
// `lock` is automatically re-acquired when we come out of
// condvar.wait(lock). So it's safe to access the 'value'
// variable at this point.
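currentValue = value; // Grab a copy of the latest value
// while we hold the lock.
currentCount = count; // Remember which update we consumed so the
// next iteration waits for a newer one.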
}
// Now that we are out of the mutex lock scope, we work with our
// local copy of `value`. The producer can keep on clobbering the
// 'value' variable all it wants, but it won't affect us here
// because we are now using `currentValue`.
std::cout << "value = " << currentValue << "\n";
// Simulate exaggerated 5ms delay
boost::this_thread::sleep(boost::posix_time::milliseconds(500));
}
}
int main()
{
boost::thread c(&consumer);
boost::thread p(&producer);
c.join();
p.join();
}
ADDENDUM
I was thinking about this question recently, and realized that this solution, while it may work, is not optimal. Your producer is using all that CPU just to throw away half of the computed values.
I suggest that you reconsider your design and go with a bounded blocking queue between the producer and consumer. Such a queue should have the following characteristics:
Thread-safe
The queue has a fixed size (bounded)
If the consumer wants to pop the next item, but the queue is empty, the operation will be blocked until notified by the producer that an item is available.
The producer can check if there's room to push another item and block until the space becomes available.
With this type of queue, you can effectively throttle down the producer so that it doesn't outpace the consumer. It also ensures that the producer doesn't waste CPU resources computing values that will be thrown away.
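The characteristics listed above can be sketched with a mutex and two condition variables. This is an illustration under assumptions, not a tuned implementation: the class name `BoundedQueue` and its members are hypothetical, and it uses std:: rather than boost:: primitives.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>

// Hypothetical bounded blocking queue; names are illustrative only.
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Blocks while the queue is full, which throttles the producer.
    void push(double value) {
        {
            std::unique_lock<std::mutex> lock(mtx_);
            not_full_.wait(lock, [this] { return items_.size() < capacity_; });
            items_.push(value);
        }
        not_empty_.notify_one();
    }

    // Blocks while the queue is empty.
    double pop() {
        double value;
        {
            std::unique_lock<std::mutex> lock(mtx_);
            not_empty_.wait(lock, [this] { return !items_.empty(); });
            value = items_.front();
            items_.pop();
        }
        not_full_.notify_one();
        return value;
    }

private:
    const std::size_t capacity_;
    std::mutex mtx_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    std::queue<double> items_;
};
```

With a capacity of 1 this degenerates into a hand-off where the producer waits for the consumer on every item; larger capacities let the producer run ahead by a bounded amount.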
Libraries such as TBB and PPL provide implementations of concurrent queues. If you want to attempt to roll your own using std::queue (or boost::circular_buffer) and boost::condition_variable, check out this blogger's example.
The short answer is that you're almost certainly wrong.
With a producer/consumer, you pretty much need a queue between the two threads. There are basically two alternatives: either your code will simply lose tasks (which usually equals not working at all), or else your producer thread will need to block until the consumer thread is idle before it can produce an item, which effectively translates to single threading.
For the moment, I'm going to assume that the value you get back from rand is supposed to represent the task to be executed (i.e., is the value produced by the producer and consumed by the consumer). In that case, I'd write the code something like this:
void producer() {
for (int i=0; i<100; i++)
queue.insert(random()); // queue.insert blocks if queue is full
queue.insert(-1.0); // Tell consumer to exit
}
void consumer() {
double value;
while ((value = queue.get()) != -1) // queue.get blocks if queue is empty
process(value);
}
This relegates nearly all the interlocking to the queue. The rest of the code for both threads pretty much ignores threading issues entirely.
Implementing a pipeline is actually quite tricky if you are doing it ground-up. For example, you'd have to use condition variable to avoid the kind of race condition you described in your question, avoid busy waiting when implementing the mechanism for "waking up" the consumer etc... Even using a "queue" of just 1 element won't save you from some of these complexities.
It's usually much better to use specialized libraries that were developed and extensively tested specifically for this purpose. If you can live with Visual C++ specific solution, take a look at Parallel Patterns Library, and the concept of Pipelines.