MPSC Queue Design Issue (thread cannot join) - c++

I have the following MpscQueue implementation.
EDIT: I added an is_running atomic, but the problem still persists.
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <queue>

template<typename T>
class MpscQueue {
public:
    MpscQueue() = default;
    MpscQueue(MpscQueue&&) = delete;

    bool wait_and_pop(T& val, std::atomic<bool>& is_running) {
        std::unique_lock<std::mutex> lock(mutex);
        cond_var.wait(lock,
            [this, &is_running]{ return queue.size() > 0 || !is_running; });
        if (!is_running) return false;
        val = std::move(queue.front());
        queue.pop();
        return true;
    }

    template<typename U>
    void push(U&& val) {
        auto const is_empty = [&]{
            auto const lock = std::unique_lock(mutex);
            auto const res = queue.empty();
            queue.push(std::forward<U>(val));
            return res;
        }();
        if (is_empty) cond_var.notify_one();
    }

private:
    std::queue<T> queue;
    std::mutex mutex;
    std::condition_variable cond_var;
};
I am attempting to pop a value like this
// At some point earlier
MpscQueue<Message> mailbox;
std::atomic<bool> is_running{true}; // Is set to false at a later time

void run_once() {
    Message m;
    mailbox.wait_and_pop(m, is_running);
    // process_message(std::move(m));
}
The run_once function above is passed to a std::thread constructor. My issue is that when I attempt to join that thread, it gets stuck in the condition variable's wait. What would be the best way to solve this? I tried passing an atomic by reference as a parameter into wait_and_pop, but the wait did not seem to notice the update, and it also did not seem like a smart implementation decision.
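One likely direction, in line with the related answers below: the waiting thread only re-checks its predicate when the condition variable is notified, so flipping is_running without a notification (and without touching the queue's mutex) can leave the consumer blocked forever. A minimal sketch of that idea, where request_stop() and the stopped member are hypothetical additions rather than part of the original class:

// Hypothetical additions to MpscQueue: set a stop flag under the same mutex
// the waiter uses, then notify, so the blocked wait_and_pop() wakes up.
void request_stop() {
    {
        std::lock_guard<std::mutex> lock(mutex);
        stopped = true;            // bool member, protected by 'mutex'
    }
    cond_var.notify_all();         // wake every blocked wait_and_pop()
}

bool wait_and_pop(T& val) {
    std::unique_lock<std::mutex> lock(mutex);
    cond_var.wait(lock, [this]{ return !queue.empty() || stopped; });
    if (queue.empty()) return false;   // woken by request_stop()
    val = std::move(queue.front());
    queue.pop();
    return true;
}
// The joining thread would then call mailbox.request_stop() before join().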

Related

Would it be good to create a wrapper class when using condition_variable?

I see a common pattern when using condition_variable to let one thread wait for another thread to finish some work:
Define a condition, a mutex, and a condition_variable:
bool workDone = false;
std::mutex mutex;
std::condition_variable cv;
In one thread, do some work. And when the work is done, update the condition under lock, and notify the condition_variable:
std::unique_lock<std::mutex> lock(mutex);
workDone = true;
lock.unlock();
cv.notify_all();
In another thread that needs to wait for the work to be done, create a lock and wait on the condition_variable:
std::unique_lock<std::mutex> lock(mutex);
cv.wait(lock, []() { return workDone; });
In each of the 3 parts, we need multiple lines of code. We can create a class like the one below to wrap up the code above:
template<class T>
class CVWaiter
{
private:
    T m_val;
    std::mutex m_mutex;
    std::condition_variable m_cv;
public:
    CVWaiter(_In_ const T& val) : m_val(val)
    {
    }
    void Notify(_In_ std::function<void(T& val)> update)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        update(m_val);
        lock.unlock();
        m_cv.notify_all();
    }
    void Wait(_In_ std::function<bool(const T& val)> condition)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cv.wait(lock, [this, condition]() { return condition(m_val); });
    }
};
With this class, the 3 parts on the top can be respectively written as:
CVWaiter<bool> workDoneWaiter(false);
workDoneWaiter.Notify([](bool& val) { val = true; });
workDoneWaiter.Wait([](const bool& val) { return val; });
Is it a correct way to implement the wrapper class? Is it a good practice to use the wrapper class for scenarios like this? Is there already anything that can achieve the same in a simpler way in STL?
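For illustration, here is a minimal sketch of the wrapper being used across two threads. It assumes the CVWaiter class above is in scope (with <mutex>, <condition_variable>, and <functional> included, and with the _In_ SAL annotations either available on MSVC or defined away); the sleep is just a stand-in for real work.

#include <chrono>
#include <iostream>
#include <thread>

int main()
{
    CVWaiter<bool> workDoneWaiter(false);

    std::thread worker([&]{
        std::this_thread::sleep_for(std::chrono::milliseconds(100)); // pretend to work
        workDoneWaiter.Notify([](bool& val) { val = true; });        // publish "done"
    });

    workDoneWaiter.Wait([](const bool& val) { return val; });        // block until done
    std::cout << "work finished\n";
    worker.join();
    return 0;
}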

Thread Pool: Block destruction until all work is done

I have the following thread pool implementation:
template<typename... event_args>
class thread_pool{
public:
using handler_type = std::function<void(event_args...)>;
thread_pool(handler_type&& handler, std::size_t N = 4, bool finish_before_exit = true) : _handler(std::forward<handler_type&&>(handler)),_workers(N),_running(true),_finish_work_before_exit(finish_before_exit)
{
for(auto&& worker: _workers)
{
//worker function
worker = std::thread([this]()
{
while (_running)
{
//wait for work
std::unique_lock<std::mutex> _lk{_wait_mutex};
_cv.wait(_lk, [this]{
return !_events.empty() || !_running;
});
//_lk is held again here once wait() returns
//check to see why we woke up
if (!_events.empty()) {//was it new work
std::unique_lock<std::mutex> _readlk(_queue_mutex);
auto data = _events.front();
_events.pop();
_readlk.unlock();
invoke(std::move(_handler), std::move(data));
_cv.notify_all();
}else if(!_running){//was it a signal to exit
break;
}
//or was it spurious and we should just ignore it
}
});
//end worker function
}
}
~thread_pool()
{
if(_finish_work_before_exit)
{//block destruction until all work is done
std::condition_variable _work_remains;
std::mutex _wr;
std::unique_lock<std::mutex> lk{_wr};
_work_remains.wait(lk,[this](){
return _events.empty();
});
}
_running=false;
//let all workers know to exit
_cv.notify_all();
//attempt to join all workers
for(auto&& _worker: _workers)
{
if(_worker.joinable())
{
_worker.join();
}
}
}
handler_type& handler()
{
return _handler;
}
void propagate(event_args&&... args)
{
//lock before push
std::unique_lock<std::mutex> _lk(_queue_mutex);
{
_events.emplace(std::make_tuple(args...));
}
_lk.unlock();//explicit unlock
_cv.notify_one();//let worker know that data is available
}
private:
bool _finish_work_before_exit;
handler_type _handler;
std::queue<std::tuple<event_args...>> _events;
std::vector<std::thread> _workers;
std::atomic_bool _running;
std::condition_variable _cv;
std::mutex _wait_mutex;
std::mutex _queue_mutex;
//helpers used to unpack tuple into function call
template<typename Func, typename Tuple, std::size_t... I>
auto invoke_(Func&& func, Tuple&& t, std::index_sequence<I...>)
{
return func(std::get<I>(std::forward<Tuple&&>(t))...);
}
template<typename Func, typename Tuple, typename Indicies = std::make_index_sequence<std::tuple_size<Tuple>::value>>
auto invoke(Func&& func, Tuple&& t)
{
return invoke_(std::forward<Func&&>(func), std::forward<Tuple&&>(t), Indicies());
}
};
I recently added this section to the destructor:
if(_finish_work_before_exit)
{//block destruction until all work is done
std::condition_variable _work_remains;
std::mutex _wr;
std::unique_lock<std::mutex> lk{_wr};
_work_remains.wait(lk,[this](){
return _events.empty();
});
}
The intent was to have the destructor block until the work queue was fully consumed.
But it seems to put the program into deadlock. All of the work does get completed, but the wait does not seem to end when the work is done.
Consider this example main:
std::mutex writemtx;
thread_pool<int> pool{
[&](int i){
std::unique_lock<std::mutex> lk{writemtx};
std::cout<<i<<" : "<<std::this_thread::get_id()<<std::endl;
},
8//threads
};
for (int i=0; i<8192; ++i) {
pool.propagate(std::move(i));
}
How can I have the destructor wait for the completion of the work without causing deadlock?
The reason your code is deadlocked is that _work_remains is a condition variable that is not "notified" by any part of your code. You would need to make it a class attribute and have it notified by any thread that picks up the last event from _events.
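A rough sketch of that suggestion (the member name _events_drained is illustrative, not from the original code) could look like this:

// New member alongside _cv: notified whenever a worker empties the queue.
std::condition_variable _events_drained;

// Worker loop, after locking _queue_mutex to pop an event:
std::unique_lock<std::mutex> _readlk(_queue_mutex);
auto data = _events.front();
_events.pop();
bool drained = _events.empty();
_readlk.unlock();
invoke(std::move(_handler), std::move(data));
if (drained)
    _events_drained.notify_all();   // wake a destructor waiting for the queue to drain

// Destructor, waiting on the same mutex that guards _events:
if (_finish_work_before_exit)
{
    std::unique_lock<std::mutex> lk{_queue_mutex};
    _events_drained.wait(lk, [this]{ return _events.empty(); });
}
_running = false;  // then notify _cv and join the workers, as before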

C++ Thread safe queue shutdown

I'm using this class for a producer-consumer setup in C++:
#pragma once
#include <queue>
#include <mutex>
#include <condition_variable>
#include <memory>
#include <atomic>
template <typename T> class SafeQueue
{
public:
SafeQueue() :
_shutdown(false)
{
}
void Enqueue(T item)
{
std::unique_lock<std::mutex> lock(_queue_mutex);
bool was_empty = _queue.empty();
_queue.push(std::move(item));
lock.unlock();
if (was_empty)
_condition_variable.notify_one();
}
bool Dequeue(T& item)
{
std::unique_lock<std::mutex> lock(_queue_mutex);
while (!_shutdown && _queue.empty())
_condition_variable.wait(lock);
if(!_shutdown)
{
item = std::move(_queue.front());
_queue.pop();
return true;
}
return false;
}
bool IsEmpty()
{
std::lock_guard<std::mutex> lock(_queue_mutex);
return _queue.empty();
}
void Shutdown()
{
_shutdown = true;
_condition_variable.notify_all();
}
private:
std::mutex _queue_mutex;
std::condition_variable _condition_variable;
std::queue<T> _queue;
std::atomic<bool> _shutdown;
};
And I use it like this:
class Producer
{
public:
Producer() :
_running(true),
_t(std::bind(&Producer::ProduceThread, this))
{ }
~Producer()
{
_running = false;
_incoming_packets.Shutdown();
_t.join();
}
SafeQueue<Packet> _incoming_packets;
private:
void ProduceThread()
{
while(_running)
{
Packet p = GetNewPacket();
_incoming_packets.Enqueue(p);
}
}
std::atomic<bool> _running;
std::thread _t;
};
class Consumer
{
Consumer(Producer* producer) :
_producer(producer),
_t(std::bind(&Consumer::WorkerThread, this))
{ }
~Consumer()
{
_t.join();
}
private:
void WorkerThread()
{
Packet p;
while(_producer->_incoming_packets.Dequeue(p))
ProcessPacket(p);
}
std::thread _t;
Producer* _producer;
};
This works most of the time. But once in a while, when I delete the producer (causing its destructor to call SafeQueue::Shutdown), the _t.join() blocks forever.
My guess is that the problem is here (in SafeQueue::Dequeue):
while (!_shutdown && _queue.empty())
_condition_variable.wait(lock);
SafeQueue::Shutdown from thread #1 gets called while thread #2 finished checking _shutdown but before it executed _condition_variable.wait(lock), so it "misses" the notify_all(). Can this happen?
If that's the problem, what's the best way to solve it?
Since the SafeQueue object is owned by the producer, deleting the producer causes a race condition between the consumer being notified and the SafeQueue being deleted out from under it when ~Producer completes.
I suggest having the shared resource being owned by neither the producer nor consumer, but passed as a reference to the constructor of each.
Change the Producer and Consumer constructors:
Producer( SafeQueue<Packet> & queue ) :
_running(false), _incoming_packets(queue) {}
Consumer( SafeQueue<Packet> & queue ) :
_running(false), _incoming_packets(queue) {}
Use your instances this way:
SafeQueue<Packet> queue;
Producer producer(queue);
Consumer consumer(queue);
...do stuff...
queue.shutdown();
This also resolves a poor design issue you have in the Consumer class being so tightly coupled to the Producer class.
Also, it's probably a bad idea to kill and join threads in a destructor, as you do for ~Producer. Better to add a Shutdown() method to each thread class, and call them explicitly:
producer.shutdown();
consumer.shutdown();
queue.shutdown();
Shutdown order doesn't really matter, unless you are concerned about losing unprocessed packets that are still in the queue when you stop the consumer.
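A minimal sketch of what such a Shutdown() method could look like for the producer (illustrative only; it just signals, and the threads are still joined separately once Dequeue() has returned false):

// Hypothetical addition, not part of the original class.
void Producer::Shutdown()
{
    _running = false;       // ProduceThread leaves its while (_running) loop
}

// Typical call site, matching the order suggested above:
producer.Shutdown();        // stop producing new packets
consumer.Shutdown();        // whatever signalling the consumer needs (possibly nothing)
queue.Shutdown();           // wakes the consumer; Dequeue() then returns false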
In your SafeQueue::Dequeue, you are probably using std::condition_variable the wrong way... Change this:
bool Dequeue(T& item)
{
std::unique_lock<std::mutex> lock(_queue_mutex);
while (!_shutdown && _queue.empty())
_condition_variable.wait(lock);
if(!_shutdown)
{
item = std::move(_queue.front());
_queue.pop();
return true;
}
return false;
}
to
bool Dequeue(T& item)
{
std::unique_lock<std::mutex> lock(_queue_mutex);
_condition_variable.wait(lock, [this]{ return _shutdown || !_queue.empty(); });
if(!_shutdown)
{
item = std::move(_queue.front());
_queue.pop();
return true;
}
return false;
}
Secondly, the declaration order of the data members in Consumer isn't right with regard to its constructor:
class Consumer
{
Consumer(Producer* producer) :
_producer(producer),
_t(std::bind(&Consumer::WorkerThread, this))
{ }
......
// _t will be constructed first, regardless of your constructor initializer list
// Meaning, the thread can even start running using an unintialized _producer
std::thread _t;
Producer* _producer;
};
It should be reordered to:
class Consumer
{
Consumer(Producer* producer) :
_producer(producer),
_t(std::bind(&Consumer::WorkerThread, this))
{ }
......
Producer* _producer;
std::thread _t;
};
Another part of your problem is covered by CAB's answer

Boost thread disabling

I'm trying to implement a blocking queue (it's a kind of educational task). The main parts are the following:
template <typename T>
class Blocking_queue
{
public:
    std::queue<T> _queue;
    boost::mutex _mutex;
    boost::condition_variable _cvar;
    void Put(T& object);
    T Get();
    void Disable();
};

template<typename T>
void Blocking_queue<T>::Put(T& object)
{
    boost::mutex::scoped_lock lock(_mutex);
    _queue.push(object);
    lock.unlock();
    _cvar.notify_one();
}

template<typename T>
T Blocking_queue<T>::Get()
{
    boost::mutex::scoped_lock lock(_mutex);
    while (_queue.empty())
    {
        _cvar.wait(lock);
    }
    T last_el = _queue.front();
    _queue.pop();
    return last_el;
}

template<typename T>
void Blocking_queue<T>::Disable()
{
}
And I need to implement a function Disable() that "releases" all waiting threads (as written in the task). The problem is that I don't fully understand what "releasing" means in these terms, and what methods I should apply. So my idea is the following: when Disable() is called, we should call some method for the current thread in this place (inside the loop):
while (_queue.empty())
{
    // here
    _cvar.wait(lock);
}
which will release the current thread. Am I right? Thanks.
"releasing all threads that are waiting" is an operation that is hardly useful. What do you want to do with this operation?
What is useful, is to shutdown the queue, thus every thread waiting on the queue will be unblocked and every thread that is going to call Get() will return immediately. To implement such a behaviour, simply add a shutdown flag to the queue and wait for "not empty or shutdown":
template<typename T>
void Blocking_queue<T>::Disable()
{
    boost::mutex::scoped_lock lock(_mutex);
    _shutdown = true;
    _cvar.notify_all();
}
To indicate to the caller of Get() that there is no data, you could return a pair with an additional bool, or throw a special exception. There is no way to return null, since not every type T has a null value.
template<typename T>
std::pair<bool, T> Blocking_queue<T>::Get()
{
    boost::mutex::scoped_lock lock(_mutex);
    while (_queue.empty() && !_shutdown)
        _cvar.wait(lock);
    if (_shutdown)
        return std::make_pair(false, T());
    T last_el = _queue.front();
    _queue.pop();
    return std::make_pair(true, last_el);
}

pthread_mutex_lock __pthread_mutex_lock_full: Assertion failed with robust and 0x4000000

I'm working on a server-side project, which is supposed to accept more than 100 client connections.
It's a multithreaded program using boost::thread. In some places I'm using boost::lock_guard<boost::mutex> to lock shared member data. There is also a BlockingQueue<ConnectionPtr> which contains the input connections. The implementation of the BlockingQueue:
template <typename DataType>
class BlockingQueue : private boost::noncopyable
{
public:
BlockingQueue()
: nblocked(0), stopped(false)
{
}
~BlockingQueue()
{
Stop(true);
}
void Push(const DataType& item)
{
boost::mutex::scoped_lock lock(mutex);
queue.push(item);
lock.unlock();
cond.notify_one(); // cond.notify_all();
}
bool Empty() const
{
boost::mutex::scoped_lock lock(mutex);
return queue.empty();
}
std::size_t Count() const
{
boost::mutex::scoped_lock lock(mutex);
return queue.size();
}
bool TryPop(DataType& poppedItem)
{
boost::mutex::scoped_lock lock(mutex);
if (queue.empty())
return false;
poppedItem = queue.front();
queue.pop();
return true;
}
DataType WaitPop()
{
boost::mutex::scoped_lock lock(mutex);
++nblocked;
while (!stopped && queue.empty()) // Or: if (queue.empty())
cond.wait(lock);
--nblocked;
if (stopped)
{
cond.notify_all(); // Tell Stop() that this thread has left
BOOST_THROW_EXCEPTION(BlockingQueueTerminatedException());
}
DataType tmp(queue.front());
queue.pop();
return tmp;
}
void Stop(bool wait)
{
boost::mutex::scoped_lock lock(mutex);
stopped = true;
cond.notify_all();
if (wait) // Wait till all blocked threads on the waiting queue to leave BlockingQueue::WaitPop()
{
while (nblocked)
cond.wait(lock);
}
}
private:
std::queue<DataType> queue;
mutable boost::mutex mutex;
boost::condition_variable_any cond;
unsigned int nblocked;
bool stopped;
};
For each Connection, there is a ConcurrentQueue<StreamPtr>, which contains the input Streams. The implementation of the ConcurrentQueue:
template <typename DataType>
class ConcurrentQueue : private boost::noncopyable
{
public:
void Push(const DataType& item)
{
boost::mutex::scoped_lock lock(mutex);
queue.push(item);
}
bool Empty() const
{
boost::mutex::scoped_lock lock(mutex);
return queue.empty();
}
bool TryPop(DataType& poppedItem)
{
boost::mutex::scoped_lock lock(mutex);
if (queue.empty())
return false;
poppedItem = queue.front();
queue.pop();
return true;
}
private:
std::queue<DataType> queue;
mutable boost::mutex mutex;
};
When debugging the program, it's okay. But in load testing with 50, 100, or more client connections, it sometimes aborts with:
pthread_mutex_lock.c:321: __pthread_mutex_lock_full: Assertion `robust || (oldval & 0x40000000) == 0' failed.
I have no idea what happened, and it cannot be reproduced every time.
I googled a lot, but no luck. Please advise.
Thanks.
Peter
0x40000000 is FUTEX_OWNER_DIED - which has the following docs in the futex.h header:
/*
* The kernel signals via this bit that a thread holding a futex
* has exited without unlocking the futex. The kernel also does
* a FUTEX_WAKE on such futexes, after setting the bit, to wake
* up any possible waiters:
*/
#define FUTEX_OWNER_DIED 0x40000000
So the assertion seems to be an indication that a thread that's holding the lock is exiting for some reason - is there a way that a thread object might be destroyed while it's holding a lock?
Another thing to check is if you have some sort of memory corruption somewhere. Valgrind might be a tool that can help you with that.
I had a similar issue and found this post. It may be useful for some of you: in my case I was just missing the init.
pthread_mutex_init(&_mutexChangeMapEvent, NULL);
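For completeness, a minimal sketch of the init/destroy pairing that answer refers to (the variable name is taken from the snippet above; error handling omitted):

#include <pthread.h>

pthread_mutex_t _mutexChangeMapEvent;

/* before first use, e.g. in the owning object's constructor */
pthread_mutex_init(&_mutexChangeMapEvent, NULL);

/* ... lock/unlock as usual ... */
pthread_mutex_lock(&_mutexChangeMapEvent);
pthread_mutex_unlock(&_mutexChangeMapEvent);

/* once no thread holds or waits on the mutex any longer */
pthread_mutex_destroy(&_mutexChangeMapEvent);

/* for a global/static mutex, static initialization also works: */
/* pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER; */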