Determining safety of deleting concurrent queue - c++

I'm programming a lock-free single-consumer single-producer growable queue in C++ for a real-time system. The internal queue works but it needs to be growable. The producer thread is real-time thus any operation needs to be deterministic (so no waits, locks, memory allocations), while the consumer thread isn't.
Thus the idea is that the consumer thread occasionally grows the size of the queue if need be. The implementation of the queue is such that the consumer-end cannot grow. Therefore the actual queue is indirectly wrapped inside an object which dispatches calls, and the actual growth is implemented by swapping the reference to the internal queue to a new, while keeping the old one hanging around in case the producer thread is using it.
The problem however is, that I cannot figure out how to prove when the producer thread stops using the old queue and it therefore is safe to delete without having to resort to locks. Here is a pseudo-representation of the code:
template<typename T>
class queue
{
public:
queue()
: old(nullptr)
{
current.store(nullptr);
grow();
}
bool produce(const T & data)
{
qimpl * q = current.load();
return q->produce(data);
}
bool consume(T & data)
{
// the queue has grown? if so, a new and an old queue exists. consume the old firstly.
if (old)
{
// here is the problem. we never really know when the producer thread stops using
// the old queue and starts using the new. it could be concurrently halfway-through inserting items
// now, while the following consume call fails meanwhile.
// thus, it is not safe yet to delete the old queue.
// we know however, that it will take at most one call to produce() after we called grow()
// before the producer thread starts using the new queue.
if (old->consume(data))
{
return true;
}
else
{
delete old;
old = nullptr;
}
}
if (current.load()->consume(data))
{
return true;
}
return false;
}
// consumer only as well
void grow()
{
old = current.load();
current.store(new qimlp());
}
private:
class qimpl
{
public:
bool produce(const T & data);
bool consume(const T & data);
};
std::atomic<qimpl *> current;
qimpl * old;
};
Note that ATOMIC_POINTER_LOCK_FREE == 2 is a condition for the code to compile. The only provable condition I see, is, that if grow() is called, the next produce() call will use the new internal queue. Thus if an atomic count inside produce is incremented each call, then its safe to delete the old queue at N + 1, where N is the count at the time of the grow() call. The issue, however, is that you then need to atomically swap the new pointer and store the count which seems not possible.
Any ideas is welcome, and for reference, this is how the system would work:
queue<int> q;
void consumer()
{
while (true)
{
int data;
if (q.consume(data))
{
// ..
}
}
}
void producer()
{
while (true)
{
q.produce(std::rand());
}
}
int main()
{
std::thread p(producer); std::thread c(consumer);
p.detach(); c.detach();
std::this_thread::sleep_for(std::chrono::milliseconds(1000));
}
EDIT:
Okay, the problem is solved now. It dawned on me, that the old queue is provably outdated when an item is pushed to the new queue. Thus the snippet now looks like this:
bool pop(T & data)
{
if (old)
{
if (old->consume(data))
{
return true;
}
}
// note that if the old queue is empty, and the new has an enqueued element, we can conclusively
// prove that it is safe to delete the old queue since it is (a) empty and (b) the thread state
// for the producer is updated such that it uses all the new entities and will never use the old again.
// if we successfully dequeue an element, we can delete the old (if it exists).
if (current.load()->consume(data))
{
if (old)
{
delete old;
old = nullptr;
}
return true;
}
return false;
}

I don't fully understand the usage of grow() in your algorithm, but it seems you need some sort of Read-Copy-Update (RCU) mechanism for safetly deleting queue when it is not needed.
This article describes different flavors of this mechanism related to Linux, but you can google RCU flavors, suitable for other platforms.

Related

Set the limit in Queue

I want to set a limit on my Queue. You can find below the implementation of the Queue class.
So, in short, I want to write in Queue in one thread until the limit and then wait for the free space. And the second thread read the Queue and do some operations with the data that it received.
int main()
{
//loop that adds new elements to the Queue.
thread one(buildQueue, input, Queue);
loop{
obj = Queue.pop()
func(obj) //do some math
}
}
So the problems is that the Queue builds until the end, but I want to set only 10 elements, for example. And program should work like this:
Check whether free space in Queue is available.
If there is no space - wait.
Write in the Queue until the limit.
Class Queue
template <typename T> class Queue{
private:
const unsigned int MAX = 5;
std::deque<T> newQueue;
std::mutex d_mutex;
std::condition_variable d_condition;
public:
void push(T const& value)
{
{
std::unique_lock<std::mutex> lock(this->d_mutex);
newQueue.push_front(value);
}
this->d_condition.notify_one();
}
T pop()
{
std::unique_lock<std::mutex> lock(this->d_mutex);
this->d_condition.wait(lock, [=]{ return !this->newQueue.empty(); });
T rc(std::move(this->newQueue.back()));
this->newQueue.pop_back();
return rc;
}
unsigned int size()
{
return newQueue.size();
}
unsigned int maxQueueSize()
{
return this->MAX;
}
};
I'm pretty new in threads program so that I could misunderstand the conception. That is why different hints are highly appreciated.
You should have researched the Queue class in the MSDN website. It provides extensive information revolving methods included with Queue. However, to answer your question specifically, to set a queue with a specific capacity, it would be the following:
Queue(int capacity)
where it is Type System::int32 capacity is the initial number of elements in the queue. Then, your queue will be filled until the limit. The issue is, the queue will not "stop" once its filled. It will start to allocate more memory as that is its nature so in your threaded (or multithreaded by the sounds of it), you must make sure to take care of the realoc-deallocation of the queue memory based on timing. You should be able to determine the milliseconds needed to fill your queue with the desired capacity and read the queue, meanwhile clearing the queue. Likewise, you can copy the queue contents to a 1D array and do a full queue clear using MyQueue->Clear()without having to read queue elements 1 by one (if timing and code complexity is an issue).

Synchronization technique to wait till all objects have been processed

In this code, I am first creating a thread that keeps running always. Then I am creating objects and adding them one by one to a queue. The thread picks up object from queue one by one processes them and deletes them.
class MyClass
{
public:
MyClass();
~MyClass();
Process();
};
std::queue<class MyClass*> MyClassObjQueue;
void ThreadFunctionToProcessAndDeleteObjectsFromQueue()
{
while(1)
{
// Get and Process and then Delete Objects one by one from MyClassObjQueue.
}
}
void main()
{
CreateThread (ThreadFunctionToProcessAndDeleteObjectsFromQueue);
int N = GetNumberOfObjects(); // Call some function that gets value of number of objects
// Create objects and queue them
for (int i=0; i<N; i++)
{
try
{
MyClass* obj = NULL;
obj = new MyClass;
MyClassObjQueue.push(obj);
}
catch(std::bad_alloc&)
{
if(obj)
delete obj;
}
}
// Wait till all objects have been processed and destroyed (HOW ???)
}
PROBLEM:
I am not sure how to wait till all objects have been processed before I quit. One way is to keep on checking size of queue periodically by using while(1) loop and Sleep. But I think it's novice way to do the things. I really want to do it in elegant way by using thread synchronization objects (e.g. semaphore etc.) so that synchronization function will wait for all objects to finish. But not sure how to do that. Any input will be appreciated.
(Note: I've not used synchronization objects to add/delete from queue in the code above. This is only to keep the code simple & readable. I know STL containers are not thread safe)

C++ Blocking Queue Segfault w/ Boost

I had a need for a Blocking Queue in C++ with timeout-capable offer(). The queue is intended for multiple producers, one consumer. Back when I was implementing, I didn't find any good existing queues that fit this need, so I coded it myself.
I'm seeing segfaults come out of the take() method on the queue, but they are intermittent. I've been looking over the code for issues but I'm not seeing anything that looks problematic.
I'm wondering if:
There is an existing library that does this reliably that I should
use (boost or header-only preferred).
Anyone sees any obvious flaw in my code that I need to fix.
Here is the header:
class BlockingQueue
{
public:
BlockingQueue(unsigned int capacity) : capacity(capacity) { };
bool offer(const MyType & myType, unsigned int timeoutMillis);
MyType take();
void put(const MyType & myType);
unsigned int getCapacity();
unsigned int getCount();
private:
std::deque<MyType> queue;
unsigned int capacity;
};
And the relevant implementations:
boost::condition_variable cond;
boost::mutex mut;
bool BlockingQueue::offer(const MyType & myType, unsigned int timeoutMillis)
{
Timer timer;
// boost::unique_lock is a scoped lock - its destructor will call unlock().
// So no need for us to make that call here.
boost::unique_lock<boost::mutex> lock(mut);
// We use a while loop here because the monitor may have woken up because
// another producer did a PulseAll. In that case, the queue may not have
// room, so we need to re-check and re-wait if that is the case.
// We use an external stopwatch to stop the madness if we have taken too long.
while (queue.size() >= this->capacity)
{
int monitorTimeout = timeoutMillis - ((unsigned int) timer.getElapsedMilliSeconds());
if (monitorTimeout <= 0)
{
return false;
}
if (!cond.timed_wait(lock, boost::posix_time::milliseconds(timeoutMillis)))
{
return false;
}
}
cond.notify_all();
queue.push_back(myType);
return true;
}
void BlockingQueue::put(const MyType & myType)
{
// boost::unique_lock is a scoped lock - its destructor will call unlock().
// So no need for us to make that call here.
boost::unique_lock<boost::mutex> lock(mut);
// We use a while loop here because the monitor may have woken up because
// another producer did a PulseAll. In that case, the queue may not have
// room, so we need to re-check and re-wait if that is the case.
// We use an external stopwatch to stop the madness if we have taken too long.
while (queue.size() >= this->capacity)
{
cond.wait(lock);
}
cond.notify_all();
queue.push_back(myType);
}
MyType BlockingQueue::take()
{
// boost::unique_lock is a scoped lock - its destructor will call unlock().
// So no need for us to make that call here.
boost::unique_lock<boost::mutex> lock(mut);
while (queue.size() == 0)
{
cond.wait(lock);
}
cond.notify_one();
MyType myType = this->queue.front();
this->queue.pop_front();
return myType;
}
unsigned int BlockingQueue::getCapacity()
{
return this->capacity;
}
unsigned int BlockingQueue::getCount()
{
return this->queue.size();
}
And yes, I didn't implement the class using templates - that is next on the list :)
Any help is greatly appreciated. Threading issues can be really hard to pin down.
-Ben
Why are cond, and mut globals? I would expect them to be members of your BlockingQueue object. I don't know what else is touching those things, but there may be an issue there.
I too have implemented a ThreadSafeQueue as part of a larger project:
https://github.com/cdesjardins/QueuePtr/blob/master/include/ThreadSafeQueue.h
It is a similar concept to yours, except the enqueue (aka offer) functions are non-blocking because there is basically no max capacity. To enforce a capacity I typically have a pool with N buffers added at system init time, and a Queue for message passing at run time, this also eliminates the need for memory allocation at run time which I consider to be a good thing (I typically work on embedded applications).
The only difference between a pool, and a queue is that a pool gets a bunch of buffers enqueued at system init time. So you have something like this:
ThreadSafeQueue<BufferDataType*> pool;
ThreadSafeQueue<BufferDataType*> queue;
void init()
{
for (int i = 0; i < NUM_BUFS; i++)
{
pool.enqueue(new BufferDataType);
}
}
Then when you want send a message you do something like the following:
void producerA()
{
BufferDataType *buf;
if (pool.waitDequeue(buf, timeout) == true)
{
initBufWithMyData(buf);
queue.enqueue(buf);
}
}
This way the enqueue function is quick and easy, but if the pool is empty, then you will block until someone puts a buffer back into the pool. The idea being that some other thread will be blocking on the queue and will return buffers to the pool when they have been processed as follows:
void consumer()
{
BufferDataType *buf;
if (queue.waitDequeue(buf, timeout) == true)
{
processBufferData(buf);
pool.enqueue(buf);
}
}
Anyways take a look at it, maybe it will help.
I suppose the problem in your code is modifying the deque by several threads. Look:
you're waiting for codition from another thread;
and then immediately sending a signal to other threads that deque is unlocked just before you want to modify it;
then you modifying the deque while other threads are thinking deque is allready unlocked and starting doing the same.
So, try to place all the cond.notify_*() after modifying the deque. I.e.:
void BlockingQueue::put(const MyType & myType)
{
boost::unique_lock<boost::mutex> lock(mut);
while (queue.size() >= this->capacity)
{
cond.wait(lock);
}
queue.push_back(myType); // <- modify first
cond.notify_all(); // <- then say to others that deque is free
}
For better understanding I suggest to read about the pthread_cond_wait().

multithreaded program producer/consumer [boost]

I'm playing with boost library and C++. I want to create a multithreaded program that contains a producer, conumer, and a stack. The procuder fills the stack, the consumer remove items (int) from the stack. everything work (pop, push, mutex) But when i call the pop/push winthin a thread, i don't get any effect
i made this simple code :
#include "stdafx.h"
#include <stack>
#include <iostream>
#include <algorithm>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>
#include <boost/date_time.hpp>
#include <boost/signals2/mutex.hpp>
#include <ctime>
using namespace std;
/ *
* this class reprents a stack which is proteced by mutex
* Pop and push are executed by one thread each time.
*/
class ProtectedStack{
private :
stack<int> m_Stack;
boost::signals2::mutex m;
public :
ProtectedStack(){
}
ProtectedStack(const ProtectedStack & p){
}
void push(int x){
m.lock();
m_Stack.push(x);
m.unlock();
}
void pop(){
m.lock();
//return m_Stack.top();
if(!m_Stack.empty())
m_Stack.pop();
m.unlock();
}
int size(){
return m_Stack.size();
}
bool isEmpty(){
return m_Stack.empty();
}
int top(){
return m_Stack.top();
}
};
/*
*The producer is the class that fills the stack. It encapsulate the thread object
*/
class Producer{
public:
Producer(int number ){
//create thread here but don't start here
m_Number=number;
}
void fillStack (ProtectedStack& s ) {
int object = 3; //random value
s.push(object);
//cout<<"push object\n";
}
void produce (ProtectedStack & s){
//call fill within a thread
m_Thread = boost::thread(&Producer::fillStack,this, s);
}
private :
int m_Number;
boost::thread m_Thread;
};
/* The consumer will consume the products produced by the producer */
class Consumer {
private :
int m_Number;
boost::thread m_Thread;
public:
Consumer(int n){
m_Number = n;
}
void remove(ProtectedStack &s ) {
if(s.isEmpty()){ // if the stack is empty sleep and wait for the producer to fill the stack
//cout<<"stack is empty\n";
boost::posix_time::seconds workTime(1);
boost::this_thread::sleep(workTime);
}
else{
s.pop(); //pop it
//cout<<"pop object\n";
}
}
void consume (ProtectedStack & s){
//call remove within a thread
m_Thread = boost::thread(&Consumer::remove, this, s);
}
};
int main(int argc, char* argv[])
{
ProtectedStack s;
Producer p(0);
p.produce(s);
Producer p2(1);
p2.produce(s);
cout<<"size after production "<<s.size()<<endl;
Consumer c(0);
c.consume(s);
Consumer c2(1);
c2.consume(s);
cout<<"size after consumption "<<s.size()<<endl;
getchar();
return 0;
}
After i run that in VC++ 2010 / win7
i got :
0
0
Could you please help me understand why when i call fillStack function from the main i got an effect but when i call it from a thread nothing happens?
Thank you
Your example code suffers from a couple synchronization issues as noted by others:
Missing locks on calls to some of the members of ProtectedStack.
Main thread could exit without allowing worker threads to join.
The producer and consumer do not loop as you would expect. Producers should always (when they can) be producing, and consumers should keep consuming as new elements are pushed onto the stack.
cout's on the main thread may very well be performed before the producers or consumers have had a chance to work yet.
I would recommend looking at using a condition variable for synchronization between your producers and consumers. Take a look at the producer/consumer example here: http://en.cppreference.com/w/cpp/thread/condition_variable
It is a rather new feature in the standard library as of C++11 and supported as of VS2012. Before VS2012, you would either need boost or to use Win32 calls.
Using a condition variable to tackle a producer/consumer problem is nice because it almost enforces the use of a mutex to lock shared data and it provides a signaling mechanism to let consumers know something is ready to be consumed so they don't have so spin (which is always a trade off between the responsiveness of the consumer and CPU usage polling the queue). It also does so being atomic itself which prevents the possibility of threads missing a signal that there is something to consume as explained here: https://en.wikipedia.org/wiki/Sleeping_barber_problem
To give a brief run-down of how a condition variable takes care of this...
A producer does all time consuming activities on its thread without the owning the mutex.
The producer locks the mutex, adds the item it produced to a global data structure (probably a queue of some sort), lets go of the mutex and signals a single consumer to go -- in that order.
A consumer that is waiting on the condition variable re-acquires the mutex automatically, removes the item out of the queue and does some processing on it. During this time, the producer is already working on producing a new item but has to wait until the consumer is done before it can queue the item up.
This would have the following impact on your code:
No more need for ProtectedStack, a normal stack/queue data structure will do.
No need for boost if you are using a new enough compiler - removing build dependencies is always a nice thing.
I get the feeling that threading is rather new to you so I can only offer the advice to look at how others have solved synchronization issues as it is very difficult to wrap your mind around. Confusion about what is going on in an environment with multiple threads and shared data typically leads to issues like deadlocks down the road.
The major problem with your code is that your threads are not synchronized.
Remember that by default threads execution isn't ordered and isn't sequenced, so consumer threads actually can be (and in your particular case are) finished before any producer thread produces any data.
To make sure consumers will be run after producers finished its work you need to use thread::join() function on producer threads, it will stop main thread execution until producers exit:
// Start producers
...
p.m_Thread.join(); // Wait p to complete
p2.m_Thread.join(); // Wait p2 to complete
// Start consumers
...
This will do the trick, but probably this is not good for typical producer-consumer use case.
To achieve more useful case you need to fix consumer function.
Your consumer function actually doesn't wait for produced data, it will just exit if stack is empty and never consume any data if no data were produced yet.
It shall be like this:
void remove(ProtectedStack &s)
{
// Place your actual exit condition here,
// e.g. count of consumed elements or some event
// raised by producers meaning no more data available etc.
// For testing/educational purpose it can be just while(true)
while(!_some_exit_condition_)
{
if(s.isEmpty())
{
// Second sleeping is too big, use milliseconds instead
boost::posix_time::milliseconds workTime(1);
boost::this_thread::sleep(workTime);
}
else
{
s.pop();
}
}
}
Another problem is wrong thread constructor usage:
m_Thread = boost::thread(&Producer::fillStack, this, s);
Quote from Boost.Thread documentation:
Thread Constructor with arguments
template <class F,class A1,class A2,...>
thread(F f,A1 a1,A2 a2,...);
Preconditions:
F and each An must by copyable or movable.
Effects:
As if thread(boost::bind(f,a1,a2,...)). Consequently, f and each an are copied into
internal storage for access by the new thread.
This means that each your thread receives its own copy of s and all modifications aren't applied to s but to local thread copies. It's the same case when you pass object to function argument by value. You need to pass s object by reference instead - using boost::ref:
void produce(ProtectedStack& s)
{
m_Thread = boost::thread(&Producer::fillStack, this, boost::ref(s));
}
void consume(ProtectedStack& s)
{
m_Thread = boost::thread(&Consumer::remove, this, boost::ref(s));
}
Another issues is about your mutex usage. It's not the best possible.
Why do you use mutex from Signals2 library? Just use boost::mutex from Boost.Thread and remove uneeded dependency to Signals2 library.
Use RAII wrapper boost::lock_guard instead of direct lock/unlock calls.
As other people mentioned, you shall protect with lock all members of ProtectedStack.
Sample:
boost::mutex m;
void push(int x)
{
boost::lock_guard<boost::mutex> lock(m);
m_Stack.push(x);
}
void pop()
{
boost::lock_guard<boost::mutex> lock(m);
if(!m_Stack.empty()) m_Stack.pop();
}
int size()
{
boost::lock_guard<boost::mutex> lock(m);
return m_Stack.size();
}
bool isEmpty()
{
boost::lock_guard<boost::mutex> lock(m);
return m_Stack.empty();
}
int top()
{
boost::lock_guard<boost::mutex> lock(m);
return m_Stack.top();
}
You're not checking that the producing thread has executed before you try to consume. You're also not locking around size/empty/top... that's not safe if the container's being updated.

C++ Templated Producer-Consumer BlockingQueue, unbounded buffer: How do I end elegantly?

I wrote a BlockingQueue in order to communicate two threads. You could say it follows the Producer-Consumer pattern, with an unbounded buffer. Therefore, I implemented it using a Critical Section and a Semaphore, like this:
#pragma once
#include "Semaphore.h"
#include "Guard.h"
#include <queue>
namespace DRA{
namespace CommonCpp{
template<class Element>
class BlockingQueue{
CCriticalSection m_csQueue;
CSemaphore m_semElementCount;
std::queue<Element> m_Queue;
//Forbid copy and assignment
BlockingQueue( const BlockingQueue& );
BlockingQueue& operator=( const BlockingQueue& );
public:
BlockingQueue( unsigned int maxSize );
~BlockingQueue();
Element Pop();
void Push( Element newElement );
};
}
}
//Template definitions
template<class Element>
DRA::CommonCpp::BlockingQueue<Element>::BlockingQueue( unsigned int maxSize ):
m_csQueue( "BlockingQueue::m_csQueue" ),
m_semElementCount( 0, maxSize ){
}
template<class Element>
DRA::CommonCpp::BlockingQueue<Element>::~BlockingQueue(){
//TODO What can I do here?
}
template<class Element>
void DRA::CommonCpp::BlockingQueue<Element>::Push( Element newElement ){
{//RAII block
CGuard g( m_csQueue );
m_Queue.push( newElement );
}
m_semElementCount.Signal();
}
template<class Element>
Element DRA::CommonCpp::BlockingQueue<Element>::Pop(){
m_semElementCount.Wait();
Element popped;
{//RAII block
CGuard g( m_csQueue );
popped = m_Queue.front();
m_Queue.pop();
}
return popped;
}
CGuard is a RAII wrapper for a CCriticalSection, it enters it on construction and leaves it on destruction. CSemaphore is a wrapper for a Windows semaphore.
So far, so good, the threads are communicating perfectly. However, when the producer thread stops producing and ends, and the consumer thread has consumed everything, the consumer thread stays forever hung on a Pop() call.
How can I tell the consumer to end elegantly? I thought of sending a special empty Element, but it seems too sloppy.
You better use events instead of Semaphore. While adding, take lock on CS, and check element count (store into bIsEmpty local variable). Add into queue, then check if number of elements WAS empty, SetEvent!
On pop method, lock first, then check if it is empty, then WaitForSingleObject - as soon as WFSO returns you get that there is at least one item in queue.
Check this article
Does your Semaphore implementation have a timed wait function available? On Windows, that would be WaitForSingleObject() specifying a timeout. If so, Pop() could be implemented like this:
// Pseudo code
bool Pop(Element& r, timeout)
{
if(sem.wait(timeout))
{
r = m_Queue.front();
m_Queue.pop();
}
return false;
}
This way, the Pop() is still blocking, though it can be easily interrupted. Even with very short timeouts this won't consume significant amounts of CPU (more than absolutely necessary, yes -- and potentially introduce additional context switching -- so note those caveats).
You need a way of telling the consumer to stop. This could be a special element in the queue, say a simple wrapper structure around the Element, or a flag - a member variable of the queue class (in which case you want to make sure the flag is dealt with atomically - lookup windows "interlocked" functions). Then you need to check that condition in the consumer every time it wakes up. Finally, in the destructor, set that stop condition and signal the semaphore.
One issue remains - what to return from the consumer's pop(). I'd go for a boolean return value and an argument of type Element& to copy result into on success.
Edit:
Something like this:
bool Queue::Pop( Element& result ) {
sema.Wait();
if ( stop_condition ) return false;
critical_section.Enter();
result = m_queue.front();
m_queue.pop;
critical_section.Leave();
return true;
}
Change pop to return a boost optional (or do it like the standard library does with top/pop to separate the tasks) and then signal m_semElementCount one last time on destruction.