C++ Templated Producer-Consumer BlockingQueue, unbounded buffer: How do I end elegantly?

I wrote a BlockingQueue to let two threads communicate. You could say it follows the Producer-Consumer pattern, with an unbounded buffer. I implemented it using a Critical Section and a Semaphore, like this:
#pragma once

#include "Semaphore.h"
#include "Guard.h"
#include <queue>

namespace DRA{
namespace CommonCpp{

template<class Element>
class BlockingQueue{
    CCriticalSection m_csQueue;
    CSemaphore m_semElementCount;
    std::queue<Element> m_Queue;

    //Forbid copy and assignment
    BlockingQueue( const BlockingQueue& );
    BlockingQueue& operator=( const BlockingQueue& );
public:
    BlockingQueue( unsigned int maxSize );
    ~BlockingQueue();
    Element Pop();
    void Push( Element newElement );
};

}
}

//Template definitions

template<class Element>
DRA::CommonCpp::BlockingQueue<Element>::BlockingQueue( unsigned int maxSize ):
    m_csQueue( "BlockingQueue::m_csQueue" ),
    m_semElementCount( 0, maxSize ){
}

template<class Element>
DRA::CommonCpp::BlockingQueue<Element>::~BlockingQueue(){
    //TODO What can I do here?
}

template<class Element>
void DRA::CommonCpp::BlockingQueue<Element>::Push( Element newElement ){
    {//RAII block
        CGuard g( m_csQueue );
        m_Queue.push( newElement );
    }
    m_semElementCount.Signal();
}

template<class Element>
Element DRA::CommonCpp::BlockingQueue<Element>::Pop(){
    m_semElementCount.Wait();
    Element popped;
    {//RAII block
        CGuard g( m_csQueue );
        popped = m_Queue.front();
        m_Queue.pop();
    }
    return popped;
}
CGuard is a RAII wrapper for a CCriticalSection: it enters the critical section on construction and leaves it on destruction. CSemaphore is a wrapper for a Windows semaphore.
So far, so good, the threads are communicating perfectly. However, when the producer thread stops producing and ends, and the consumer thread has consumed everything, the consumer thread stays forever hung on a Pop() call.
How can I tell the consumer to end elegantly? I thought of sending a special empty Element, but it seems too sloppy.

You would be better off using an event instead of a semaphore. When adding, take the lock on the critical section and check the element count (store whether the queue was empty in a local variable, say bIsEmpty). Push onto the queue, and then, if the queue WAS empty, call SetEvent.
In the pop method, take the lock first, then check whether the queue is empty; if it is, call WaitForSingleObject - as soon as WFSO returns you know there is at least one item in the queue.
Check this article

Does your Semaphore implementation have a timed wait function available? On Windows, that would be WaitForSingleObject() specifying a timeout. If so, Pop() could be implemented like this:
// Pseudo code
bool Pop(Element& r, Timeout timeout)
{
    if(sem.wait(timeout))
    {
        lock(m_csQueue);       // the queue still needs the critical section
        r = m_Queue.front();
        m_Queue.pop();
        unlock(m_csQueue);
        return true;
    }
    return false;              // timed out; caller can check a stop flag and retry
}
This way, Pop() still blocks, but it can easily be interrupted. Even with fairly short timeouts this won't consume a significant amount of CPU (more than absolutely necessary, yes, and it can introduce additional context switching, so note those caveats).
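For illustration, a sketch of the consumer loop that would go with this timed Pop(). The stopRequested flag, the Item type, and the Process() handler are hypothetical, not part of the original code; std::atomic is used for brevity, though the Windows Interlocked functions would work too:
#include <atomic>

std::atomic<bool> stopRequested{false};   // hypothetical flag, set at shutdown

void ConsumerLoop( BlockingQueue<Item>& queue ){
    Item item;
    while( !stopRequested ){
        // Wake up at least every 100 ms so the thread can notice shutdown.
        if( queue.Pop( item, 100 /*ms*/ ) ){
            Process( item );              // hypothetical handler
        }
    }
}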

You need a way of telling the consumer to stop. This could be a special element in the queue, say a simple wrapper structure around the Element, or a flag - a member variable of the queue class (in which case you want to make sure the flag is dealt with atomically - look up the Windows Interlocked* functions). Then you need to check that condition in the consumer every time it wakes up. Finally, in the destructor, set that stop condition and signal the semaphore.
One issue remains - what to return from the consumer's pop(). I'd go for a boolean return value and an argument of type Element& to copy result into on success.
Edit:
Something like this:
bool Queue::Pop( Element& result ) {
    sema.Wait();
    if ( stop_condition ) return false;
    critical_section.Enter();
    result = m_queue.front();
    m_queue.pop();
    critical_section.Leave();
    return true;
}
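For completeness, a sketch of the shutdown side this answer describes. The Stop() name is illustrative, and stop_condition is assumed to be a LONG member so the Interlocked functions apply:
void Queue::Stop() {
    // Atomically raise the stop flag, then wake a blocked consumer.
    InterlockedExchange( &stop_condition, 1 );
    sema.Signal();   // call once per blocked consumer thread
}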

Change pop to return a boost::optional (or do it like the standard library does with top/pop to separate the tasks) and then signal m_semElementCount one last time on destruction.
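A minimal sketch of that idea against the question's code, assuming the destructor does one extra m_semElementCount.Signal() so a blocked consumer wakes up to an empty queue (Pop's declaration in the class would change accordingly):
#include <boost/optional.hpp>

template<class Element>
boost::optional<Element> DRA::CommonCpp::BlockingQueue<Element>::Pop(){
    m_semElementCount.Wait();
    CGuard g( m_csQueue );
    if( m_Queue.empty() )        // woken by the destructor's extra Signal()
        return boost::none;
    Element popped = m_Queue.front();
    m_Queue.pop();
    return popped;
}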

Related

How to test my blocking queue actually blocks

I have a blocking queue (it would be really hard for me to change its implementation), and I want to test that it actually blocks. In particular, the pop method must block if the queue is empty and unblock as soon as a push is performed. See the following pseudo C++11 code for the test:
BlockingQueue queue; // empty queue
thread pushThread([&]
{
    sleep(large_delay);
    queue.push();
});
queue.pop();
Obviously it is not perfect, because it may happen that the whole pushThread runs and terminates before pop is called, even if the delay is large; and the larger the delay, the longer I have to wait for the test to finish.
How can I properly ensure that pop is executed before push is called, and that it blocks until push returns?
I do not believe this is possible without adding some extra state and interfaces to your BlockingQueue.
Proof goes something like this. You want to wait until the reading thread is blocked on pop. But there is no way to distinguish between that and the thread being about to execute the pop. This remains true no matter what you put just before or after the call to pop itself.
If you really want to fix this with 100% reliability, you need to add some state inside the queue, guarded by the queue's mutex, that means "someone is waiting". The pop call then has to update that state just before it atomically releases the mutex and goes to sleep on the internal condition variable. The push thread can obtain the mutex and wait until "someone is waiting". To avoid a busy loop here, you will want to use the condition variable again.
All of this machinery is nearly as complicated as the queue itself, so maybe you will want to test it, too... This sort of multi-threaded code is where concepts like "code coverage" -- and arguably even unit testing itself -- break down a bit. There are just too many possible interleavings of operations.
In practice, I would probably go with your original approach of sleeping.
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <initializer_list>
#include <mutex>
#include <utility>

template<class T>
struct async_queue {
    T pop() {
        auto l = lock();
        ++wait_count;                 // reader announces it is about to wait
        cv.wait( l, [&]{ return !data.empty(); } );
        --wait_count;
        auto r = std::move(data.front());
        data.pop_front();
        return r;
    }
    void push(T in) {
        {
            auto l = lock();
            data.push_back( std::move(in) );
        }
        cv.notify_one();
    }
    void push_many(std::initializer_list<T> in) {
        {
            auto l = lock();
            for (auto&& x : in)
                data.push_back( x );
        }
        cv.notify_all();
    }
    std::size_t readers_waiting() {
        return wait_count;
    }
    std::size_t data_waiting() const {
        auto l = lock();
        return data.size();
    }
private:
    std::deque<T> data;               // std::queue lacks the front()/pop_front() access used above
    std::condition_variable cv;
    mutable std::mutex m;
    std::atomic<std::size_t> wait_count{0};
    std::unique_lock<std::mutex> lock() const { return std::unique_lock<std::mutex>(m); }
};
or somesuch.
In the push thread, busy-wait on readers_waiting until it reaches 1.
At that point the reader has incremented wait_count under the lock and is inside cv.wait, which has released the lock. Do a push.
In theory an infinitely slow reader thread could have entered cv.wait and still be evaluating the first lambda by the time you call push, but an infinitely slow reader thread is no different from a blocked one...
This does, however, deal with slow thread startup and the like.
Using readers_waiting and data_waiting for anything other than debugging is usually code smell.
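A sketch of the resulting test, using the async_queue above (the 1 ms polling interval is an arbitrary choice):
#include <chrono>
#include <thread>

int main() {
    async_queue<int> q;                     // the queue defined above
    std::thread reader([&]{ q.pop(); });    // should block: queue is empty

    // Spin until the reader is provably inside pop().
    while (q.readers_waiting() < 1)
        std::this_thread::sleep_for(std::chrono::milliseconds(1));

    q.push(42);                             // unblocks the reader
    reader.join();
}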
You can use a std::condition_variable to accomplish this. The help page on cppreference.com shows a very nice consumer-producer example, which should be exactly what you are looking for: http://en.cppreference.com/w/cpp/thread/condition_variable
EDIT: Actually, the German version of cppreference.com has an even better example :-) http://de.cppreference.com/w/cpp/thread/condition_variable

Determining safety of deleting concurrent queue

I'm programming a lock-free single-consumer single-producer growable queue in C++ for a real-time system. The internal queue works, but it needs to be growable. The producer thread is real-time, so every operation needs to be deterministic (no waits, locks, or memory allocations), while the consumer thread isn't.
The idea is therefore that the consumer thread occasionally grows the queue if need be. The implementation is such that the consumer end cannot grow in place; the actual queue is therefore wrapped inside an object which dispatches calls, and growth is implemented by swapping the reference to the internal queue for a new one, while keeping the old one around in case the producer thread is still using it.
The problem, however, is that I cannot figure out how to prove when the producer thread stops using the old queue, so that it is safe to delete without resorting to locks. Here is a pseudo-representation of the code:
#include <atomic>

template<typename T>
class queue
{
public:
    queue()
        : old(nullptr)
    {
        current.store(nullptr);
        grow();
    }

    bool produce(const T & data)
    {
        qimpl * q = current.load();
        return q->produce(data);
    }

    bool consume(T & data)
    {
        // the queue has grown? if so, a new and an old queue exist. consume the old first.
        if (old)
        {
            // here is the problem: we never really know when the producer thread stops using
            // the old queue and starts using the new one. it could be concurrently halfway
            // through inserting items right now, while the following consume call fails.
            // thus, it is not yet safe to delete the old queue.
            // we know, however, that it will take at most one call to produce() after we
            // called grow() before the producer thread starts using the new queue.
            if (old->consume(data))
            {
                return true;
            }
            else
            {
                delete old;
                old = nullptr;
            }
        }
        if (current.load()->consume(data))
        {
            return true;
        }
        return false;
    }

    // consumer only as well
    void grow()
    {
        old = current.load();
        current.store(new qimpl());
    }

private:
    class qimpl
    {
    public:
        bool produce(const T & data);
        bool consume(T & data);
    };

    std::atomic<qimpl *> current;
    qimpl * old;
};
Note that ATOMIC_POINTER_LOCK_FREE == 2 is assumed for this code. The only provable condition I see is that if grow() is called, the next produce() call will use the new internal queue. Thus, if an atomic count inside produce() were incremented on each call, it would be safe to delete the old queue at N + 1, where N is the count at the time of the grow() call. The issue, however, is that you would then need to atomically swap the new pointer and store the count, which does not seem possible.
Any ideas are welcome; for reference, this is how the system would work:
#include <chrono>
#include <cstdlib>
#include <thread>

queue<int> q;

void consumer()
{
    while (true)
    {
        int data;
        if (q.consume(data))
        {
            // ..
        }
    }
}

void producer()
{
    while (true)
    {
        q.produce(std::rand());
    }
}

int main()
{
    std::thread p(producer); std::thread c(consumer);
    p.detach(); c.detach();
    std::this_thread::sleep_for(std::chrono::milliseconds(1000));
}
EDIT:
Okay, the problem is solved now. It dawned on me that the old queue is provably outdated when an item is pushed to the new queue. Thus the snippet now looks like this:
bool pop(T & data)
{
    if (old)
    {
        if (old->consume(data))
        {
            return true;
        }
    }
    // note that if the old queue is empty and the new one has an enqueued element, we can
    // conclusively prove that it is safe to delete the old queue, since it is (a) empty and
    // (b) the producer's thread state is updated such that it uses all the new entities
    // and will never use the old again.
    // if we successfully dequeue an element, we can delete the old queue (if it exists).
    if (current.load()->consume(data))
    {
        if (old)
        {
            delete old;
            old = nullptr;
        }
        return true;
    }
    return false;
}
I don't fully understand the usage of grow() in your algorithm, but it seems you need some sort of Read-Copy-Update (RCU) mechanism to safely delete the queue when it is no longer needed.
This article describes different flavors of this mechanism in the context of Linux, but you can google RCU flavors suitable for other platforms.
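To make the idea concrete, here is a hedged sketch of one such mechanism. It is closer to a hazard pointer than to full RCU, the retire_guard name is invented, and it assumes exactly one producer and one consumer, with grow() called only by the consumer:
#include <atomic>

// Sketch: the producer publishes which internal queue it last wrote to,
// so the consumer can tell when the old queue is provably retired.
template<typename Q>
struct retire_guard {
    std::atomic<Q*> last_used{nullptr};

    // Producer side: call after every produce() on q.
    void note_use(Q* q) { last_used.store(q, std::memory_order_release); }

    // Consumer side: once the producer has written to `current`,
    // it can never touch the old queue again.
    bool old_is_retired(Q* current) const {
        return last_used.load(std::memory_order_acquire) == current;
    }
};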

C++ thread-safe bounded queue returning object for original thread to delete - 1 writer - 1 reader

The goal is to have a writer thread and a reader thread but only the writer news and deletes the action object. There is only one reader and one writer.
something like:
template<typename T, std::size_t MAX>
class TSQ
{
public:
    // blocks if there are MAX items in queue
    // returns used Object to be deleted or 0 if none exist
    T * push(T * added); // added will be processed by reader

    // blocks if there are no objects in queue
    // returns item pushed from writer for deletion
    T * pop(T * used); // used will be freed by writer

private:
    // stuff here
};
Or, better, if the delete and return can be encapsulated:
template<typename T, std::size_t MAX>
class TSQ
{
public:
    // blocks if there are MAX items in queue
    void push(T * added); // added will be processed by reader

    // blocks if there are no objects in queue
    // returns item pushed from writer for deletion
    T& pop();

private:
    // stuff here
};
where the writer thread has a loop like:
my_object *action;
while (1) {
    // create action
    delete my_queue.push(action);
}
and the reader has a loop like:
my_object * action=0;
while(1) {
    action=my_queue.pop(action);
    // do stuff with action
}
The reason to have the writer delete the action item is performance.
Is there an optimal way to do this?
Bonus points if MAX=0 is specialized to be unbounded (not required, just tidy)
I'm not looking for the full code, just the data structure and general approach
This is an instance of the producer-consumer problem. A popular way to solve it is to use a lock-free queue.
Also, the first practical change you might want to make is to add a sleep(0) to the production/consumption loops, so each iteration gives up the rest of its time slice and you won't end up using 100% of a CPU core.
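In portable C++11 the same hint can be expressed with std::this_thread::yield(); a minimal sketch with a hypothetical done flag:
#include <atomic>
#include <thread>

std::atomic<bool> done{false};   // hypothetical shutdown flag

void producer_loop() {
    while (!done) {
        // ... produce one item ...
        std::this_thread::yield();   // give up the rest of the time slice
    }
}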
The most common solution to this problem is to pass values, not pointers.
You can pass shared_ptrs through this queue; the queue then doesn't need to know how to free memory for you.
If you use something like Lamport's ring buffer for a single-producer single-consumer blocking queue, a std::vector under the hood is a natural fit, since it calls the destructors of its elements automatically:
template<typename T, std::size_t MAX>
class TSQ
{
public:
    // blocks if there are MAX items in queue
    void push(T added); // added will be processed by reader

    // blocks if there are no objects in queue
    T pop();

private:
    std::vector<T> _content;
    size_t _push_index;
    size_t _pop_index;
    ...
};
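For reference, a hedged sketch of how the Lamport-style indices could work with atomics. This is a simplification: it busy-waits instead of blocking, requires a default-constructible T, and a production version would also pad against false sharing:
#include <atomic>
#include <cstddef>
#include <utility>
#include <vector>

// Minimal Lamport-style SPSC ring buffer. Capacity is MAX + 1 so that a
// full buffer is distinguishable from an empty one.
template<typename T, std::size_t MAX>
class spsc_ring {
public:
    spsc_ring() : _buf(MAX + 1) {}

    void push(T v) {                       // producer thread only
        std::size_t w = _write.load(std::memory_order_relaxed);
        std::size_t next = (w + 1) % (MAX + 1);
        while (next == _read.load(std::memory_order_acquire)) {}  // full: spin
        _buf[w] = std::move(v);
        _write.store(next, std::memory_order_release);
    }

    T pop() {                              // consumer thread only
        std::size_t r = _read.load(std::memory_order_relaxed);
        while (r == _write.load(std::memory_order_acquire)) {}    // empty: spin
        T v = std::move(_buf[r]);
        _read.store((r + 1) % (MAX + 1), std::memory_order_release);
        return v;
    }

private:
    std::vector<T> _buf;
    std::atomic<std::size_t> _read{0}, _write{0};
};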

Race condition in a concurrent queue

I am currently trying to write a concurrent queue, but I have some segfaults that I can't explain. My queue implementation is essentially given by the first listing on this site:
http://www.justsoftwaresolutions.co.uk/threading/implementing-a-thread-safe-queue-using-condition-variables.html
The site says that there is a race condition if objects are removed from the queue in parallel, but I just don't see why there is one. Could anyone explain it to me?
Edit: This is the code:
#include <queue>
#include <boost/thread/mutex.hpp>

template<typename Data>
class concurrent_queue
{
private:
    std::queue<Data> the_queue;
    mutable boost::mutex the_mutex;
public:
    void push(const Data& data)
    {
        boost::mutex::scoped_lock lock(the_mutex);
        the_queue.push(data);
    }

    bool empty() const
    {
        boost::mutex::scoped_lock lock(the_mutex);
        return the_queue.empty();
    }

    Data& front()
    {
        boost::mutex::scoped_lock lock(the_mutex);
        return the_queue.front();
    }

    Data const& front() const
    {
        boost::mutex::scoped_lock lock(the_mutex);
        return the_queue.front();
    }

    void pop()
    {
        boost::mutex::scoped_lock lock(the_mutex);
        the_queue.pop();
    }
};
What if the queue is empty by the time you attempt to read an item from it?
Think of this user code:
while(!q.empty()) //here you check q is not empty
{
    //since q is not empty, you enter the loop
    //BUT before executing the next statement in this loop body,
    //the OS transfers control to the other thread,
    //which removes items from q, making it empty!!
    //then this thread executes the following statement!
    auto item = q.front(); //what would it do (given q is empty)?
}
If you use empty and find the queue is not empty, another thread may have popped the item making it empty before you use the result.
Similarly for front, you may read the front item, and it could be popped by another thread by the time you use the item.
The answers from @parkydr and @Nawaz are correct, but here's some more food for thought:
What are you trying to achieve?
The reason to have a thread-safe queue is sometimes (I dare not say often) mistaken. In many cases you want to lock "outside" the queue, in the context where the queue is just an implementation detail.
One reason, however, for thread-safe queues is producer-consumer situations, where 1-N nodes push data and 1-M nodes pop from the queue regardless of what they get. All elements in the queue are treated equally; the consumers just pop without knowing what they get and start working on the data. In situations like that, your interface should not expose a T& front(). You should never return a reference if you're not sure there's an item there (and in parallel situations, you can never be certain without external locks).
I would recommend using unique_ptrs (or shared_ptrs, of course) and exposing only race-free functions (I'm leaving out const functions for brevity). Using std::unique_ptr requires C++11, but you can use boost::shared_ptr for the same functionality if C++11 isn't an option for you:
// Returns the first item, or an empty unique_ptr
std::unique_ptr< T > pop( );
// Returns the first item if it exists. Otherwise, waits at most <timeout> for
// a value to be pushed. Returns an empty unique_ptr if the timeout was reached.
std::unique_ptr< T > pop( {implementation-specific-type} timeout );
void push( std::unique_ptr< T >&& ptr );
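For illustration, a sketch of the non-blocking pop() declared above, assuming the queue stores std::unique_ptr<T> internally (std::queue<std::unique_ptr<T>> the_queue) and uses the boost::mutex from the question's code:
std::unique_ptr<T> pop( )
{
    boost::mutex::scoped_lock lock(the_mutex);
    if (the_queue.empty())
        return std::unique_ptr<T>();              // empty queue: no throwing front()
    std::unique_ptr<T> p = std::move(the_queue.front());
    the_queue.pop();
    return p;                                     // check + pop under one lock: race-free
}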
Functions such as empty() and front() are natural victims of race conditions, since they cannot atomically perform the task you (think you) want. empty() will sometimes return a value which is already incorrect by the time you receive it, and front() would have to throw if the queue is empty.
I think the answers about why the empty() function is useless/dangerous are clear. If you want a blocking queue, remove it.
Instead, add a condition variable (boost::condition, IIRC). The functions to push/pop then look like this:
void push(T data)
{
    scoped_lock lock(mutex);
    queue.push(data);
    condition_var.notify_one();
}

data pop()
{
    scoped_lock lock(mutex);
    while(queue.empty())
        condition_var.wait(lock);
    data result = queue.front(); // std::queue::pop() returns void, so read front() first
    queue.pop();
    return result;
}
Note that this is pseudo-ish code, but I'm confident you can figure it out. That said, the suggestion to use unique_ptr (or auto_ptr for C++98) to avoid copying the actual data is a good idea, but that's a completely separate issue.

Is this implementation of a Blocking Queue safe?

I'm trying to implement a queue which blocks on the Pop operation if it's empty, and unblocks as soon as a new element is pushed. I'm afraid I might have some race condition; I tried to look at some other implementations, but most of what I found was done in .NET, and the few C++ ones I found depended too much on other library classes.
template <class Element>
class BlockingQueue{
    DRA::CommonCpp::CCriticalSection m_csQueue;
    DRA::CommonCpp::CEvent m_eElementPushed;
    std::queue<Element> m_Queue;
public:
    void Push( Element newElement ){
        CGuard g( m_csQueue );
        m_Queue.push( newElement );
        m_eElementPushed.set();
    }

    Element Pop(){
        bool wait;
        {//RAII block
            CGuard g( m_csQueue );
            wait = m_Queue.empty();
        }
        if( wait )
            m_eElementPushed.wait();
        Element first;
        {//RAII block
            CGuard g( m_csQueue );
            first = m_Queue.front();
            m_Queue.pop();
        }
        return first;
    }
};
Some explanations are due:
CCriticalSection is a wrapper for a Windows critical section; its Enter and Leave methods are private, and CGuard is its only friend
CGuard is a RAII wrapper for CCriticalSection: it enters the critical section in its constructor and leaves it in its destructor
CEvent is a wrapper for a Windows event; wait uses the WaitForSingleObject function
I don't mind that Elements are passed around by value, they are small objects
I can't use Boost, just Windows stuff (as I have already been doing with CEvent and CGuard)
I'm afraid there might be some weird race condition scenario when using Pop(). What do you guys think?
UPDATE: Since I'm working on Visual Studio 2010, I ended up using the unbounded_buffer class provided by the Concurrency Runtime. Of course, I wrapped it in a class using the Pointer to Implementation idiom (Cheshire Cat) in case we decide to change the implementation or need to port this class to another environment
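For reference, a hedged sketch of what that wrapper might look like, with the pimpl layer omitted for brevity (unbounded_buffer, send, and receive come from VS2010's <agents.h>):
#include <agents.h>

// Sketch: the Concurrency Runtime's unbounded_buffer as a blocking queue.
// receive() blocks until a message is available.
template <class Element>
class BlockingQueue{
    Concurrency::unbounded_buffer<Element> m_buffer;
public:
    void Push( Element newElement ){ Concurrency::send( m_buffer, newElement ); }
    Element Pop(){ return Concurrency::receive( m_buffer ); }
};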
It’s not thread safe:
bool wait;
{//RAII block
    CGuard g( m_csQueue );
    wait = m_Queue.empty();
}
/// BOOM! Other thread ninja-Pop()s an item.
if( wait )
    m_eElementPushed.wait();
Notice the location of the BOOM comment. In fact, other locations are also thinkable (after the if). In either case, the subsequent front and pop calls will fail.
Condition variables should be helpful if you are targeting newer Windows versions (the native CONDITION_VARIABLE is available since Windows Vista). They typically make implementing blocking queues simpler.
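A hedged sketch of that approach with the raw Win32 API, replacing the question's wrapper classes for clarity (no shutdown handling; see the earlier answers for stop-flag ideas):
#include <windows.h>
#include <queue>

// Sketch: blocking Pop using a Vista+ CONDITION_VARIABLE.
template <class Element>
class CvBlockingQueue{
    CRITICAL_SECTION m_cs;
    CONDITION_VARIABLE m_cv;
    std::queue<Element> m_queue;
public:
    CvBlockingQueue(){ InitializeCriticalSection(&m_cs); InitializeConditionVariable(&m_cv); }
    ~CvBlockingQueue(){ DeleteCriticalSection(&m_cs); }

    void Push( Element e ){
        EnterCriticalSection(&m_cs);
        m_queue.push(e);
        LeaveCriticalSection(&m_cs);
        WakeConditionVariable(&m_cv);
    }

    Element Pop(){
        EnterCriticalSection(&m_cs);
        while( m_queue.empty() )    // loop guards against spurious wakeups
            SleepConditionVariableCS(&m_cv, &m_cs, INFINITE);
        Element e = m_queue.front();
        m_queue.pop();
        LeaveCriticalSection(&m_cs);
        return e;
    }
};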
See here for design of a similar queue using Boost - even if you cannot use Boost or condition variables, the general guidance and follow-up discussion there should be useful.