C++ Thread safe queue shutdown

I'm using this class for producer-consumer setup in C++:
#pragma once
#include <queue>
#include <mutex>
#include <condition_variable>
#include <memory>
#include <atomic>
template <typename T> class SafeQueue
{
public:
SafeQueue() :
_shutdown(false)
{
}
void Enqueue(T item)
{
std::unique_lock<std::mutex> lock(_queue_mutex);
bool was_empty = _queue.empty();
_queue.push(std::move(item));
lock.unlock();
if (was_empty)
_condition_variable.notify_one();
}
bool Dequeue(T& item)
{
std::unique_lock<std::mutex> lock(_queue_mutex);
while (!_shutdown && _queue.empty())
_condition_variable.wait(lock);
if(!_shutdown)
{
item = std::move(_queue.front());
_queue.pop();
return true;
}
return false;
}
bool IsEmpty()
{
std::lock_guard<std::mutex> lock(_queue_mutex);
return _queue.empty();
}
void Shutdown()
{
_shutdown = true;
_condition_variable.notify_all();
}
private:
std::mutex _queue_mutex;
std::condition_variable _condition_variable;
std::queue<T> _queue;
std::atomic<bool> _shutdown;
};
And I use it like this:
class Producer
{
public:
Producer() :
_running(true),
_t(std::bind(&Producer::ProduceThread, this))
{ }
~Producer()
{
_running = false;
_incoming_packets.Shutdown();
_t.join();
}
SafeQueue<Packet> _incoming_packets;
private:
void ProduceThread()
{
while(_running)
{
Packet p = GetNewPacket();
_incoming_packets.Enqueue(p);
}
}
std::atomic<bool> _running;
std::thread _t;
};
class Consumer
{
Consumer(Producer* producer) :
_producer(producer),
_t(std::bind(&Consumer::WorkerThread, this))
{ }
~Consumer()
{
_t.join();
}
private:
void WorkerThread()
{
Packet p;
while(_producer->_incoming_packets.Dequeue(p))
ProcessPacket(p);
}
std::thread _t;
Producer* _producer;
};
This works most of the time. But once in a while, when I delete the producer (causing its destructor to call SafeQueue::Shutdown), the _t.join() blocks forever.
My guess is that the problem is here (in SafeQueue::Dequeue):
while (!_shutdown && _queue.empty())
_condition_variable.wait(lock);
If SafeQueue::Shutdown is called from thread #1 after thread #2 has finished checking _shutdown but before it has executed _condition_variable.wait(lock), thread #2 "misses" the notify_all(). Can this happen?
If that's the problem, what's the best way to solve it?

Since the SafeQueue object is owned by the producer, deleting the producer causes a race condition between the consumer being notified and the SafeQueue being deleted out from under it when ~Producer completes.
I suggest having the shared resource owned by neither the producer nor the consumer, but passed by reference to the constructor of each.
Change the Producer and Consumer constructors:
Producer( SafeQueue<Packet> & queue ) :
_running(false), _incoming_packets(queue) {}
Consumer( SafeQueue<Packet> & queue ) :
_running(false), _incoming_packets(queue) {}
Use your instances this way:
SafeQueue<Packet> queue;
Producer producer(queue);
Consumer consumer(queue);
...do stuff...
queue.Shutdown();
This also removes the tight coupling between the Consumer and Producer classes, which is a design problem in its own right.
Also, it's probably a bad idea to stop and join threads in a destructor, as you do in ~Producer. It is better to add a Shutdown() method to each thread class and call them explicitly:
producer.Shutdown();
consumer.Shutdown();
queue.Shutdown();
Shutdown order doesn't really matter, unless you are concerned about losing unprocessed packets that are still in the queue when you stop the consumer.
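As a rough sketch of this layout (not the asker's actual classes): the queue is a local object that outlives both threads, and shutdown is requested explicitly instead of from a destructor. `PacketQueue`, `run_pipeline`, and `Packet` as an `int` are stand-ins invented for the example. Unlike the original `Dequeue`, this one drains any remaining packets before returning false, which is one way to avoid losing unprocessed packets.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

using Packet = int;  // stand-in for the real Packet type (assumption)

// Simplified queue owned by neither the producer nor the consumer.
class PacketQueue {
public:
    void Enqueue(Packet p) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(p);
        }
        cv_.notify_one();
    }
    // Returns false once the queue has been shut down and is empty.
    bool Dequeue(Packet& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return shutdown_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = q_.front();
        q_.pop();
        return true;
    }
    void Shutdown() {
        {
            std::lock_guard<std::mutex> lock(m_);
            shutdown_ = true;
        }
        cv_.notify_all();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Packet> q_;
    bool shutdown_ = false;  // guarded by m_
};

// The queue is declared first, so it is destroyed only after both threads are joined.
int run_pipeline(int packet_count) {
    PacketQueue queue;
    int processed = 0;
    std::thread consumer([&] {
        Packet p;
        while (queue.Dequeue(p)) ++processed;
    });
    std::thread producer([&] {
        for (int i = 0; i < packet_count; ++i) queue.Enqueue(i);
    });
    producer.join();   // stop producing first
    queue.Shutdown();  // then let the consumer drain the queue and exit
    consumer.join();
    return processed;  // safe to read: the consumer thread has been joined
}
```

Calling `run_pipeline(100)` should process all 100 packets, because the producer is joined before the queue is shut down.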

In your SafeQueue::Dequeue, you are probably using std::condition_variable the wrong way... Change this:
bool Dequeue(T& item)
{
std::unique_lock<std::mutex> lock(_queue_mutex);
while (!_shutdown && _queue.empty())
_condition_variable.wait(lock);
if(!_shutdown)
{
item = std::move(_queue.front());
_queue.pop();
return true;
}
return false;
}
to
bool Dequeue(T& item)
{
std::unique_lock<std::mutex> lock(_queue_mutex);
_condition_variable.wait(lock, [this]{ return _shutdown || !_queue.empty(); });
if(!_shutdown)
{
item = std::move(_queue.front());
_queue.pop();
return true;
}
return false;
}
Secondly, the declaration order of the data members in Consumer doesn't match the order in its constructor's initializer list:
class Consumer
{
Consumer(Producer* producer) :
_producer(producer),
_t(std::bind(&Consumer::WorkerThread, this))
{ }
......
// _t will be constructed first, regardless of your constructor initializer list
// Meaning, the thread can even start running using an uninitialized _producer
std::thread _t;
Producer* _producer;
};
It should be reordered to:
class Consumer
{
Consumer(Producer* producer) :
_producer(producer),
_t(std::bind(&Consumer::WorkerThread, this))
{ }
......
Producer* _producer;
std::thread _t;
};
Another part of your problem is covered by CAB's answer.
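On the race the question asks about: yes, that window exists as long as Shutdown() sets _shutdown without holding _queue_mutex, and switching to the predicate overload of wait() does not close it by itself. A minimal sketch of one common fix follows, assuming the flag becomes a plain bool guarded by the mutex; `consumer_wakes_on_shutdown` is a hypothetical helper added only to exercise the class.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

template <typename T>
class SafeQueue {
public:
    void Enqueue(T item) {
        {
            std::lock_guard<std::mutex> lock(_queue_mutex);
            _queue.push(std::move(item));
        }
        // Always notify; this is simpler and stays correct with several consumers.
        _condition_variable.notify_one();
    }
    bool Dequeue(T& item) {
        std::unique_lock<std::mutex> lock(_queue_mutex);
        _condition_variable.wait(lock, [this] { return _shutdown || !_queue.empty(); });
        if (_shutdown) return false;
        item = std::move(_queue.front());
        _queue.pop();
        return true;
    }
    void Shutdown() {
        {
            // While we hold the mutex, a consumer is either before its predicate
            // check (and will then see _shutdown == true) or already blocked in
            // wait() (and will receive the notification). It cannot be in between.
            std::lock_guard<std::mutex> lock(_queue_mutex);
            _shutdown = true;
        }
        _condition_variable.notify_all();
    }
private:
    std::mutex _queue_mutex;
    std::condition_variable _condition_variable;
    std::queue<T> _queue;
    bool _shutdown = false;  // guarded by _queue_mutex
};

// Hypothetical helper: true if a consumer blocked in Dequeue() returns after Shutdown().
bool consumer_wakes_on_shutdown() {
    SafeQueue<int> queue;
    bool got_item = true;
    std::thread consumer([&] {
        int value = 0;
        got_item = queue.Dequeue(value);
    });
    queue.Shutdown();
    consumer.join();  // would block forever if the wake-up were lost
    return !got_item;
}
```

The flag no longer needs to be atomic here, because every read and write happens under the same mutex.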

Related

Thread-safe reference-counted queue C++

I'm struggling to implement a thread-safe reference-counted queue. The idea is that I have a number of tasks that each maintain a shared_ptr to a task manager that owns the queue. Here is a minimal implementation that should encounter that same issue:
#include <condition_variable>
#include <deque>
#include <functional>
#include <iostream>
#include <memory>
#include <mutex>
#include <thread>
namespace {
class TaskManager;
struct Task {
std::function<void()> f;
std::shared_ptr<TaskManager> manager;
};
class Queue {
public:
Queue()
: _queue()
, _mutex()
, _cv()
, _running(true)
, _thread([this]() { sweepQueue(); })
{
}
~Queue() { close(); }
void close() noexcept
{
try {
{
std::lock_guard<std::mutex> lock(_mutex);
if (!_running) {
return;
}
_running = false;
}
_cv.notify_one();
_thread.join();
} catch (...) {
std::cerr << "An error occurred while closing the queue\n";
}
}
void push(Task&& task)
{
std::unique_lock<std::mutex> lock(_mutex);
_queue.emplace_back(std::move(task));
lock.unlock();
_cv.notify_one();
}
private:
void sweepQueue() noexcept
{
while (true) {
try {
std::unique_lock<std::mutex> lock(_mutex);
_cv.wait(lock, [this] { return !_running || !_queue.empty(); });
if (!_running && _queue.empty()) {
return;
}
if (!_queue.empty()) {
const auto task = _queue.front();
_queue.pop_front();
task.f();
}
} catch (...) {
std::cerr << "An error occurred while sweeping the queue\n";
}
}
}
std::deque<Task> _queue;
std::mutex _mutex;
std::condition_variable _cv;
bool _running;
std::thread _thread;
};
class TaskManager : public std::enable_shared_from_this<TaskManager> {
public:
void addTask(std::function<void()> f)
{
_queue.push({ f, shared_from_this() });
}
private:
Queue _queue;
};
} // anonymous namespace
int main(void)
{
const auto manager = std::make_shared<TaskManager>();
manager->addTask([]() { std::cout << "Hello world\n"; });
}
The problem I find is that on rare occasions, the queue will try to invoke its own destructor within the sweepQueue method. Upon further inspection, it seems that the reference count on the TaskManager hits zero once the last task is dequeued. How can I safely maintain the reference count without invoking the destructor?
Update: The example does not clarify the need for the std::shared_ptr<TaskManager> within Task. Here is an example use case that should illustrate the need for this seemingly unnecessary ownership cycle.
std::unique_ptr<Task> task;
{
const auto manager = std::make_shared<TaskManager>();
task = std::make_unique<Task>(someFunc, manager);
}
// Guarantees manager is not destroyed while task is still in scope.
The ownership hierarchy here is TaskManager owns Queue and Queue owns Tasks. Tasks maintaining a shared pointer to TaskManager create an ownership cycle which does not seem to serve a useful purpose here.
This ownership cycle is the root of the problem here. The Queue is owned by the TaskManager, so the Queue can hold a plain pointer to the TaskManager and pass that pointer to each Task in sweepQueue. You do not need std::shared_ptr<TaskManager> in Task at all.
I'd refactor the queue from the thread first.
But to fix your problem:
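struct am_I_alive {
// Constructed only by lifetime_tracker::track_lifetime().
explicit am_I_alive(std::weak_ptr<void> ptr) : m_ptr(std::move(ptr)) {}
// True while the tracked object still exists.
explicit operator bool() const { return !m_ptr.expired(); }
private:
std::weak_ptr<void> m_ptr;
};
struct lifetime_tracker {
am_I_alive track_lifetime() {
if (!m_ptr) m_ptr = std::make_shared<bool>(true);
return am_I_alive{m_ptr};
}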
lifetime_tracker() = default;
lifetime_tracker(lifetime_tracker const&) {} // do nothing, don't copy
lifetime_tracker& operator=(lifetime_tracker const&){ return *this; }
private:
std::shared_ptr<void> m_ptr;
};
This is a little utility to detect whether we have been deleted. It is useful in any code that calls an arbitrary callback whose side effects could include delete(this).
Privately inherit your Queue from it.
Then split popping the task from running it.
std::optional<Task> get_task() {
std::unique_lock<std::mutex> lock(_mutex);
_cv.wait(lock, [this] { return !_running || !_queue.empty(); });
if (!_running && _queue.empty()) {
return {}; // end
}
auto task = _queue.front();
_queue.pop_front();
return task;
}
void sweepQueue() noexcept
{
while (true) {
try {
auto task = get_task();
if (!task) return;
// we are alive here
auto alive = track_lifetime();
try {
(*task).f();
} catch(...) {
std::cerr << "An error occurred while running a task\n";
}
task={};
// we could be deleted here
if (!alive)
return; // this was deleted, get out of here
} catch (...) {
std::cerr << "An error occurred while sweeping the queue\n";
}
}
}
and now you are safe.
After that you need to deal with the thread problem.
The thread problem is that you need your code to destroy the thread from within the thread it is running. At the same time, you also need to guarantee that the thread has terminated before main ends.
These are not compatible.
To fix that, you need to create a thread owning pool that doesn't have your "keep alive" semantics, and get your thread from there.
These threads don't delete themselves; instead, they return themselves to that pool for reuse by another client.
At shutdown, the pool joins those threads to ensure no code is still running elsewhere when main ends.
To write such a pool without your inverted dependency mess, split the queue part of your code off. This queue owns no thread.
template<class T>
struct threadsafe_queue {
void push(T);
std::optional<T> pop(); // returns empty if the queue is aborted
void abort();
~threadsafe_queue();
private:
std::mutex m;
std::condition_variable v;
std::deque<T> data;
bool aborted = false;
};
Then a simple thread pool:
struct thread_pool {
template<class F>
std::future<std::result_of_t<F&()>> enqueue( F&& f );
template<class F>
std::future<std::result_of_t<F&()>> thread_off_now( F&& f ); // starts a thread if there aren't any free
void abort();
void start_thread( std::size_t n = 1 );
std::size_t count_threads() const;
~thread_pool();
private:
threadsafe_queue< std::function<void()> > tasks;
std::vector< std::thread > threads;
static void thread_loop( thread_pool* pool );
};
Make a thread pool singleton. Get the threads for your queue from the thread_off_now method, which guarantees you a thread that can be recycled when you are done with it, and whose lifetime is handled by someone else.
But really, you should instead be thinking with ownership in mind. The idea that tasks and task queues mutually own each other is a mess.
If someone disposes of a task queue, it is probably a good idea to abandon its tasks instead of magically and silently keeping them alive.
Which is what my simple thread pool does.
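The threadsafe_queue declared above is only an interface. One possible way to fill it in is sketched below; the choices that abort() discards pending elements and that push() after abort() is silently dropped are assumptions of this sketch, not something the answer specifies.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

template <class T>
struct threadsafe_queue {
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(m);
            if (aborted) return;  // assumption: pushes after abort() are dropped
            data.push_back(std::move(value));
        }
        v.notify_one();
    }
    // Blocks until an element is available or the queue has been aborted.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(m);
        v.wait(lock, [this] { return aborted || !data.empty(); });
        if (aborted) return std::nullopt;
        T value = std::move(data.front());
        data.pop_front();
        return value;
    }
    void abort() {
        {
            std::lock_guard<std::mutex> lock(m);
            aborted = true;
            data.clear();  // abandon whatever is still pending
        }
        v.notify_all();  // wake every thread blocked in pop()
    }
    // Note: threads still blocked in pop() must be joined by the owner
    // before the queue itself is destroyed.
    ~threadsafe_queue() { abort(); }
private:
    std::mutex m;
    std::condition_variable v;
    std::deque<T> data;
    bool aborted = false;
};
```

A worker loop built on this would typically look like `while (auto task = queue.pop()) (*task)();`, which ends as soon as abort() is called.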

Terminating an std::thread which runs in endless loop

How can I terminate my spun-off thread in the destructor of Bar (without having to wait until the thread wakes up from its sleep)?
class Bar {
public:
Bar() : thread(&Bar::foo, this) {
}
~Bar() { /* terminate thread here */ }
...
void foo() {
while (true) {
std::this_thread::sleep_for(
std::chrono::seconds(LONG_PERIOD));
//do stuff//
}
}
private:
std::thread thread;
};
You could use a std::condition_variable:
class Bar {
public:
Bar() : t_(&Bar::foo, this) { }
~Bar() {
{
// Lock mutex to avoid race condition (see Mark B comment).
std::unique_lock<std::mutex> lk(m_);
// Update keep_ and notify the thread.
keep_ = false;
} // Unlock the mutex (see std::unique_lock)
cv_.notify_one();
t_.join(); // Wait for the thread to finish
}
void foo() {
std::unique_lock<std::mutex> lk(m_);
while (keep_) {
if (cv_.wait_for(lk, std::chrono::seconds(LONG_PERIOD)) == std::cv_status::no_timeout) {
continue; // On notify, just continue (keep_ is updated).
}
// Do whatever the thread needs to do...
}
}
private:
// m_, cv_ and keep_ are declared before t_ so they exist before the thread starts.
std::mutex m_;
std::condition_variable cv_;
bool keep_{true};
std::thread t_;
};
This should give you a general idea of what you can do:
You use a bool to control the loop (with read and write access protected by a std::mutex);
You use a std::condition_variable to wake up the thread so you don't have to wait for LONG_PERIOD.
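A variant of the same idea, sketched with the predicate overload of wait_for so that spurious wake-ups are handled by the library. The class name `PeriodicWorker`, the tick counter, and the helper `shutdown_latency_ms` are made up for this example.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

class PeriodicWorker {  // hypothetical name
public:
    explicit PeriodicWorker(std::chrono::milliseconds period)
        : period_(period), thread_(&PeriodicWorker::run, this) {}
    ~PeriodicWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_one();
        thread_.join();  // returns promptly, without waiting for the period to elapse
    }
private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        // wait_for returns the predicate's value: true means stop was requested,
        // false means the period elapsed and it is time to do the work.
        while (!cv_.wait_for(lock, period_, [this] { return stop_; })) {
            ++ticks_;  // placeholder for the periodic work
        }
    }
    // thread_ is declared last so every member it uses is constructed first.
    const std::chrono::milliseconds period_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool stop_ = false;
    int ticks_ = 0;
    std::thread thread_;
};

// Hypothetical helper: how long it takes to destroy a worker with a one-hour period.
long long shutdown_latency_ms() {
    const auto start = std::chrono::steady_clock::now();
    {
        PeriodicWorker worker(std::chrono::hours(1));
    }  // the destructor runs here
    const auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}
```

Because the predicate is checked under the mutex before the thread blocks, a stop request issued before the thread even starts waiting is not missed.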

detached thread crashing on exiting

I am using a simple thread pool as below-
template<typename T>
class thread_safe_queue // thread safe worker queue.
{
private:
std::atomic<bool> finish;
mutable std::mutex mut;
std::queue<T> data_queue;
std::condition_variable data_cond;
public:
thread_safe_queue() : finish{ false }
{}
~thread_safe_queue()
{}
void setDone()
{
finish.store(true);
data_cond.notify_one();
}
void push(T new_value)
{
std::lock_guard<std::mutex> lk(mut);
data_queue.push(std::move(new_value));
data_cond.notify_one();
}
void wait_and_pop(T& value)
{
std::unique_lock<std::mutex> lk(mut);
data_cond.wait(lk, [this]
{
return false == data_queue.empty();
});
if (finish.load() == true)
return;
value = std::move(data_queue.front());
data_queue.pop();
}
bool empty() const
{
std::lock_guard<std::mutex> lk(mut);
return data_queue.empty();
}
};
//Thread Pool
class ThreadPool
{
private:
std::atomic<bool> done;
unsigned thread_count;
std::vector<std::thread> threads;
public:
explicit ThreadPool(unsigned count = 1);
ThreadPool(const ThreadPool & other) = delete;
ThreadPool& operator = (const ThreadPool & other) = delete;
~ThreadPool()
{
done.store(true);
work_queue.setDone();
// If the threads are not detached and this is uncommented, the worker threads wait forever.
//for (auto &th : threads)
//{
// if (th.joinable())
// th.join();
// }
}
void init()
{
try
{
thread_count = std::min(thread_count, std::thread::hardware_concurrency());
for (unsigned i = 0; i < thread_count; ++i)
{
threads.emplace_back(std::move(std::thread(&ThreadPool::workerThread, this)));
threads.back().detach();
// The problem: if I don't detach the thread, it waits forever on the condition.
// If I comment out the detach line and uncomment the join loop in ~ThreadPool, the main thread waits forever.
}
}
catch (...)
{
done.store(true);
throw;
}
}
void workerThread()
{
while (true)
{
std::function<void()> task;
work_queue.wait_and_pop(task);
if (done == true)
break;
task();
}
}
void submit(std::function<void(void)> fn)
{
work_queue.push(fn);
}
};
The usage is like :
struct start
{
public:
ThreadPool::ThreadPool m_NotifPool;
ThreadPool::ThreadPool m_SnapPool;
start()
{
m_NotifPool.init();
m_SnapPool.init();
}
};
int main()
{
start s;
return 0;
}
I am running this code on Visual Studio 2013. The problem occurs when the main thread exits: the program crashes with an exception.
What am I doing wrong? How do I stop the worker threads properly? I have spent quite some time on this but still cannot figure out what the issue is.
Thanks for your help in advance.
I am not familiar with threads in C++, but I have worked with threading in C. In C, when you create child threads from the main thread, you have to keep the main thread waiting until the children finish. If main exits first, the threads become zombies. I don't think C throws an exception in that case, but these zombie threads may be why you are getting an exception. Try blocking main until the child threads finish and see if that works.
When main exits, detached threads are allowed to continue running, however, object s is destroyed. So, as your threads attempt to access members of object s, you are running into UB.
See accepted answer of this question for more details about your issue : What happens to a detached thread when main() exits?
A rule of thumb would be not to detach threads from main, but to signal the thread pool that the app is ending and join all threads. Or do as suggested in the answer to What happens to a detached thread when main() exits?
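A sketch of the "signal and join" approach. In the question's code the wait predicate only checks whether the queue is non-empty and setDone() calls notify_one(), so a worker woken with an empty queue goes straight back to sleep, which would explain why join() hangs. The version below is a simplified stand-in, not the asker's ThreadPool; `JoiningPool` and `run_tasks` are names invented for this example.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class JoiningPool {  // hypothetical name
public:
    explicit JoiningPool(unsigned count) {
        for (unsigned i = 0; i < count; ++i)
            threads_.emplace_back(&JoiningPool::worker, this);
    }
    ~JoiningPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cond_.notify_all();  // wake every worker, not just one
        for (auto& t : threads_) t.join();
    }
    void submit(std::function<void()> fn) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(fn));
        }
        cond_.notify_one();
    }
private:
    void worker() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                // The predicate also checks done_, so a shutdown wake-up is not ignored.
                cond_.wait(lock, [this] { return done_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // done_ is set and nothing is left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock
        }
    }
    std::mutex mutex_;
    std::condition_variable cond_;
    std::queue<std::function<void()>> tasks_;
    bool done_ = false;                 // guarded by mutex_
    std::vector<std::thread> threads_;  // declared last: workers use the members above
};

// Hypothetical helper: submits n counting tasks and returns how many ran.
int run_tasks(int n) {
    std::atomic<int> counter{0};
    {
        JoiningPool pool(2);
        for (int i = 0; i < n; ++i) pool.submit([&counter] { ++counter; });
    }  // the destructor drains the queue and joins the workers
    return counter.load();
}
```

With joined threads, the pool object is guaranteed to outlive its workers, so the workers never touch a destroyed queue.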

Multithreaded Tasking System fails: list iterator not dereferencable

Okay, this might be a bit off-topic for Stack Overflow, but I'll try to keep it as short as possible.
I have a thread which takes tasks out of a list and executes them. It is as simple as this (the Worker class has its own thread, which runs doTask: m_thread(&Worker::doTask, this)):
void Worker::doTask()
{
while (m_running)
{
auto task = m_tasks.pop_front();
task->execute();
if (task->isContinuous())
m_tasks.push_pack(task);
}
}
The list itself is/should be threadsafe:
Header:
class TaskQueue
{
public:
void push_pack(std::shared_ptr<Task> t);
std::shared_ptr<Task> pop_front();
private:
std::list<std::shared_ptr<Task>> m_tasks;
std::condition_variable m_cond;
std::mutex m_mutex;
};
Implementation of the important parts:
void TaskQueue::push_pack(std::shared_ptr<Task> t)
{
m_tasks.push_back(t);
//notify that there is one more task, so one thread can work now
m_cond.notify_one();
}
std::shared_ptr<Task> TaskQueue::pop_front()
{
//regular lock so no one else accesses this area now
std::unique_lock<std::mutex> lock(m_mutex);
while (m_tasks.size() == 0)
m_cond.wait(lock);
auto task = m_tasks.front();
m_tasks.pop_front();
return task;
}
Last but not least, the tasks:
class Task
{
public:
virtual ~Task()
{
}
virtual void execute() = 0;
virtual bool isContinuous()
{
return false;
};
};
So if I try to add this Task:
class NetworkRequestTask:public Task
{
public:
NetworkRequestTask(TaskQueue &q);
~NetworkRequestTask();
void execute() override;
bool isContinuous() override;
private:
TaskQueue &m_tasks;
};
Impl:
NetworkRequestTask::NetworkRequestTask(TaskQueue& q): m_tasks(q)
{
}
NetworkRequestTask::~NetworkRequestTask()
{
}
void NetworkRequestTask::execute()
{
while(dosomething)
{
//do something here
}
}
bool NetworkRequestTask::isContinuous()
{
return true;
}
Main:
int main(int argc, char* argv[])
{
TaskQueue tasks;
tasks.push_pack(std::make_shared<NetworkRequestTask>(tasks));
}
it gets into a bad state:
Expression: list iterator not dereferencable
I am confused. This only happens if I override isContinuous, and it only happens with this task. If I pass the queue by reference to another continuous task, it does not get into that bad state.
So what is going wrong here and, more importantly, what have I done wrong?
As suggested in the comments, I already tried locking the push_pack method, which did not change the behaviour. (You can swap the spin lock for a regular mutex; it doesn't matter.)
void TaskQueue::push_pack(std::shared_ptr<Task> t)
{
std::lock_guard<SpinLock> lock(m_spin);
m_tasks.push_back(t);
//notify that there is one more task, so one thread can work now
m_cond.notify_one();
}

pthread_mutex_lock __pthread_mutex_lock_full: Assertion failed with robust and 0x40000000

I'm working on a server-side project, which is supposed to accept more than 100 client connections.
It's a multithreaded program using boost::thread. In some places I'm using boost::lock_guard<boost::mutex> to lock the shared member data. There is also a BlockingQueue<ConnectionPtr> which contains the input connections. The implementation of the BlockingQueue:
template <typename DataType>
class BlockingQueue : private boost::noncopyable
{
public:
BlockingQueue()
: nblocked(0), stopped(false)
{
}
~BlockingQueue()
{
Stop(true);
}
void Push(const DataType& item)
{
boost::mutex::scoped_lock lock(mutex);
queue.push(item);
lock.unlock();
cond.notify_one(); // cond.notify_all();
}
bool Empty() const
{
boost::mutex::scoped_lock lock(mutex);
return queue.empty();
}
std::size_t Count() const
{
boost::mutex::scoped_lock lock(mutex);
return queue.size();
}
bool TryPop(DataType& poppedItem)
{
boost::mutex::scoped_lock lock(mutex);
if (queue.empty())
return false;
poppedItem = queue.front();
queue.pop();
return true;
}
DataType WaitPop()
{
boost::mutex::scoped_lock lock(mutex);
++nblocked;
while (!stopped && queue.empty()) // Or: if (queue.empty())
cond.wait(lock);
--nblocked;
if (stopped)
{
cond.notify_all(); // Tell Stop() that this thread has left
BOOST_THROW_EXCEPTION(BlockingQueueTerminatedException());
}
DataType tmp(queue.front());
queue.pop();
return tmp;
}
void Stop(bool wait)
{
boost::mutex::scoped_lock lock(mutex);
stopped = true;
cond.notify_all();
if (wait) // Wait till all blocked threads on the waiting queue to leave BlockingQueue::WaitPop()
{
while (nblocked)
cond.wait(lock);
}
}
private:
std::queue<DataType> queue;
mutable boost::mutex mutex;
boost::condition_variable_any cond;
unsigned int nblocked;
bool stopped;
};
For each Connection, there is a ConcurrentQueue<StreamPtr>, which contains the input Streams. The implementation of the ConcurrentQueue:
template <typename DataType>
class ConcurrentQueue : private boost::noncopyable
{
public:
void Push(const DataType& item)
{
boost::mutex::scoped_lock lock(mutex);
queue.push(item);
}
bool Empty() const
{
boost::mutex::scoped_lock lock(mutex);
return queue.empty();
}
bool TryPop(DataType& poppedItem)
{
boost::mutex::scoped_lock lock(mutex);
if (queue.empty())
return false;
poppedItem = queue.front();
queue.pop();
return true;
}
private:
std::queue<DataType> queue;
mutable boost::mutex mutex;
};
When debugging the program, everything is fine. But in load testing with 50, 100, or more client connections, it sometimes aborts with
pthread_mutex_lock.c:321: __pthread_mutex_lock_full: Assertion `robust || (oldval & 0x40000000) == 0' failed.
I have no idea what happened, and it cannot be reproduced every time.
I googled a lot, but no luck. Please advise.
Thanks.
Peter
0x40000000 is FUTEX_OWNER_DIED - which has the following docs in the futex.h header:
/*
* The kernel signals via this bit that a thread holding a futex
* has exited without unlocking the futex. The kernel also does
* a FUTEX_WAKE on such futexes, after setting the bit, to wake
* up any possible waiters:
*/
#define FUTEX_OWNER_DIED 0x40000000
So the assertion seems to indicate that a thread holding the lock is exiting for some reason. Is there a way that a thread object might be destroyed while it's holding a lock?
Another thing to check is if you have some sort of memory corruption somewhere. Valgrind might be a tool that can help you with that.
I had a similar issue and found this post. It may be useful for some of you: in my case I was just missing the init.
pthread_mutex_init(&_mutexChangeMapEvent, NULL);
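If you use raw pthread mutexes from C++, one way to make a missing init impossible is to tie it to a constructor. The wrapper below is only a sketch for POSIX systems; `PosixMutex` is a hypothetical name, and in new code std::mutex or boost::mutex already do this for you.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical RAII wrapper: the mutex is always initialized before any lock() call.
class PosixMutex {
public:
    PosixMutex() {
        const int rc = pthread_mutex_init(&mutex_, nullptr);  // default attributes
        assert(rc == 0);
        (void)rc;  // avoid an unused-variable warning when NDEBUG is defined
    }
    ~PosixMutex() { pthread_mutex_destroy(&mutex_); }
    PosixMutex(const PosixMutex&) = delete;
    PosixMutex& operator=(const PosixMutex&) = delete;
    // Both return 0 on success, like the underlying pthread calls.
    int lock() { return pthread_mutex_lock(&mutex_); }
    int unlock() { return pthread_mutex_unlock(&mutex_); }
private:
    pthread_mutex_t mutex_;
};
```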