std::async thread pool continue execution without blocking - c++

I have a thread pool in which each thread should wait for new tasks and process them asynchronously (processing takes a long time). However, the following code does not behave that way. When I create the thread pool, the threads successfully execute the first task they are given. The process() function reaches return 0; while the threads are still computing, but it never returns to main(). It stays at the v.wait(l, [&] {return !tasks.empty(); }); line, that is, it keeps waiting for new tasks to be pushed into the tasks queue, and that never happens. I've read that this is related to the std::future destructor: if I am not mistaken, when process() reaches the return, the std::future destructor is called and waits until all the threads end, but they never end!
Here's the code:
static int callings = 0;
class ThreadPool
{
private:
std::queue<int> tasks;
std::mutex m;
std::vector<std::future<void>> finished;
std::condition_variable v;
public:
void push_task(int arg) {
std::unique_lock<std::mutex> l(m);
tasks.push(arg);
v.notify_one(); // wake a thread to work on the task
}
void read_tasks() {
while (true) {
std::unique_lock<std::mutex> l(m);
if (tasks.empty()) {
//waits till new task
v.wait(l, [&] {return !tasks.empty(); }); //after completing the first task, the program stays here forever
}
int task = tasks.front(); // read task
tasks.pop(); //delete task
//run the task
std::this_thread::sleep_for(std::chrono::milliseconds(5 * 1000)); //simulate computation
}//while true
}
void create_thread_pool(int m_threads_count) {
for (int t_i = 0; t_i < m_threads_count; t_i++) {
finished.push_back(std::async(std::launch::async,[this] { read_tasks(); }));
printf("Thread %d is doing work...\n", t_i);
}
}
}; //ThreadPool
int process(){
ThreadPool pool;
if(callings == 0)
{
pool.create_thread_pool(4);
}
//give some task to do...
pool.push_task(callings);
callings++;
return 0; //point reached but never returning to main
}
int main(){
while(true){
// do things...
process();
// do more things...
// this does not execute, how to solve this?
}
return 0;
}
How can I return to main() while the threads keep waiting for new tasks without blocking?
Thanks in advance
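One possible way out, sketched under assumptions: in the code above the pool is a local variable of process(), so it is destroyed on every return, and a std::future obtained from std::async with std::launch::async blocks in its destructor until its task finishes. The sketch below instead keeps a single pool alive for the whole program and gives the workers a stop flag so they can be joined at exit. The names shared_pool(), stop_ and completed() are hypothetical and not part of the original code, and the 5 ms sleep only stands in for the real computation.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class ThreadPool {
public:
    explicit ThreadPool(int thread_count) {
        for (int i = 0; i < thread_count; ++i)
            workers_.emplace_back([this] { read_tasks(); });
    }
    // Ask the workers to stop once the queue is drained, then join them.
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> l(m_);
            stop_ = true;
        }
        v_.notify_all();
        for (auto& w : workers_) w.join();
    }
    void push_task(int arg) {
        {
            std::lock_guard<std::mutex> l(m_);
            tasks_.push(arg);
        }
        v_.notify_one();  // wake one worker
    }
    int completed() const { return completed_.load(); }

private:
    void read_tasks() {
        while (true) {
            int task;
            {
                std::unique_lock<std::mutex> l(m_);
                v_.wait(l, [this] { return stop_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stop requested and nothing left
                task = tasks_.front();
                tasks_.pop();
            }  // lock released here, so other workers can take tasks meanwhile
            (void)task;
            std::this_thread::sleep_for(std::chrono::milliseconds(5));  // simulated work
            ++completed_;
        }
    }
    std::queue<int> tasks_;
    std::mutex m_;
    std::condition_variable v_;
    bool stop_ = false;
    std::atomic<int> completed_{0};
    std::vector<std::thread> workers_;
};

// Hypothetical accessor: one pool for the whole program, created on first use.
ThreadPool& shared_pool() {
    static ThreadPool pool(4);
    return pool;
}

int process(int call_number) {
    shared_pool().push_task(call_number);
    return 0;  // returns immediately; the workers keep running
}
```

With this layout main() can call process() in a loop and continue with other work between calls. Note also that the task runs outside the mutex, whereas the original read_tasks() sleeps while still holding the lock, which serializes the workers.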

Related

Why having jobs and waits for them in other thread cause whole application to wait?

I've got a thread pool
struct ThreadPool
{
using Task = std::packaged_task<void()>;
explicit ThreadPool(size_t workersCount)
{
workers.reserve(workersCount);
for(uint32_t i = 0u; i < workersCount; ++i) {
workers.emplace_back([=]() {
while(true) {
Task result;
{
std::unique_lock<std::mutex> locker(mutex);
condition.wait(locker, [=]() { return stop || !tasks.empty(); });
if(stop && tasks.empty()) {
break;
}
result = std::move(tasks.front());
tasks.pop();
}
result();
}
});
}
}
~ThreadPool() noexcept
{
stop = true;
condition.notify_all();
for(auto& worker : workers) {
worker.join();
} workers.clear();
}
template<typename T>
inline auto Enqueue(T task)->std::future<decltype(task())>
{
auto package = std::packaged_task<decltype(task())()>(std::move(task));
auto result = package.get_future();
{
std::unique_lock<std::mutex> locker(mutex);
tasks.emplace(std::move(package));
}
condition.notify_one();
return result;
}
std::vector<std::thread> workers;
std::queue<Task> tasks;
std::mutex mutex;
std::condition_variable condition;
std::atomic_bool stop = false;
};
and this example
//just for the example this is a global
static ThreadPool pool{4};
struct DoSomethingStruct
{
void DoSomething()
{
std::vector<std::future<void>> futures;
for(uint32_t i = 0; i < 10; ++i) {
futures.push_back(pool.Enqueue([this, i]() {
ints.push_back(i);
}));
}
for(const auto& future : futures) {
future.wait();
}
}
std::vector<int> ints;
};
int main()
{
DoSomethingStruct dss;
std::vector<std::future<void>> futures;
for(uint32_t i = 0; i < 10; ++i) {
futures.push_back(pool.Enqueue([&dss, i]() {
dss.DoSomething();
}));
}
for(const auto& future : futures) {
future.wait();
}
return 0;
}
When I run the application it never ends.
The example above is not a real use case. I am wondering why it does not wait for the 10 futures in DoSomethingStruct::DoSomething() and then, in main, for the 10 other jobs.
I wanted to do something similar to what this guy did https://wickedengine.net/2018/11/24/simple-job-system-using-standard-c/, but with futures and mutex and condition variable.
Why is that? What did I do wrong?
First, your pool creates 4 working threads. Then, in main, you add some tasks into the pool queue, which call dss.DoSomething();.
The workers then start executing these tasks. Inside, they first enqueue some more tasks, and then, they start waiting for their futures forever. These waits never end since there are no threads that could start resolving the next enqueued tasks.
Creating a thread pool with the ability to enqueue tasks from within processed tasks is not trivial. Basically, what you would need is to suspend the current task here instead of waiting. There is no native mechanism for this in C++ (at least, until C++20 coroutines).
As a workaround, you can use OpenMP or Intel TBB, which both provide the described functionality. For example, in OpenMP, you can suspend the current task and wait for its sub-tasks to complete with #pragma omp taskwait.
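A minimal sketch of the OpenMP route mentioned above, assuming a compiler with OpenMP support (for example GCC with -fopenmp). The function names do_something and run_nested are hypothetical. Unlike the original example, each sub-task writes to its own preallocated slot instead of calling push_back on a shared vector, which would be a data race.

```cpp
#include <vector>

// Fills `out` with 0..n-1 using one child task per element, then waits for them.
void do_something(std::vector<int>& out, int n) {
    out.assign(n, -1);
    for (int i = 0; i < n; ++i) {
        #pragma omp task shared(out) firstprivate(i)
        out[i] = i;  // each task writes a distinct element
    }
    // Wait for the child tasks; the thread may run other tasks while it waits.
    #pragma omp taskwait
}

// Launches `outer` tasks, each of which spawns `inner` sub-tasks.
int run_nested(int outer, int inner) {
    std::vector<std::vector<int>> results(outer);
    #pragma omp parallel
    #pragma omp single
    {
        for (int k = 0; k < outer; ++k) {
            #pragma omp task shared(results) firstprivate(k, inner)
            do_something(results[k], inner);
        }
        #pragma omp taskwait
    }
    int sum = 0;
    for (const auto& r : results)
        for (int v : r) sum += v;
    return sum;
}
```

If the code is compiled without OpenMP enabled, the pragmas are ignored and the same functions simply run serially, so the result does not depend on the flag.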

C++ conditional wait race condition

Suppose that I have a program with a worker thread that squares numbers from a queue. The problem is that if the work is too light (takes too little time), the worker finishes the work and notifies the main thread before the main thread has even started waiting for the worker to finish.
My simple program looks as follows:
#include <atomic>
#include <condition_variable>
#include <queue>
#include <thread>
std::atomic<bool> should_end;
std::condition_variable work_to_do;
std::mutex work_to_do_lock;
std::condition_variable fn_done;
std::mutex fn_done_lock;
std::mutex data_lock;
std::queue<int> work;
std::vector<int> result;
void worker() {
while(true) {
if(should_end) return;
data_lock.lock();
if(work.size() > 0) {
int front = work.front();
work.pop();
if (work.size() == 0){
fn_done.notify_one();
}
data_lock.unlock();
result.push_back(front * front);
} else {
data_lock.unlock();
// nothing to do, so we just wait
std::unique_lock<std::mutex> lck(work_to_do_lock);
work_to_do.wait(lck);
}
}
}
int main() {
should_end = false;
std::thread t(worker); // start worker
data_lock.lock();
const int N = 10;
for(int i = 0; i <= N; i++) {
work.push(i);
}
data_lock.unlock();
work_to_do.notify_one(); // notify the worker that there is work to do
//if the worker is quick, it signals done here already
std::unique_lock<std::mutex> lck(fn_done_lock);
fn_done.wait(lck);
for(auto elem : result) {
printf("result = %d \n", elem);
}
work_to_do.notify_one(); //notify the worker so we can shut it down
should_end = true;
t.join();
return 0;
}
Using the condition-variable notification itself as the flag that the job is done is fundamentally flawed. First and foremost, std::condition_variable can have spurious wakeups, so it should not be used this way. Use the queue size as the actual condition for the end of work, check and modify it under the same mutex in all threads, and use that same mutex for the condition variable. You may then use std::condition_variable to wait until the work is done, but only after checking the queue size; if the work is already done at that moment, do not wait at all. Otherwise, either check the queue size in a loop (because of spurious wakeups) and wait while it is still not empty, or use std::condition_variable::wait() with a predicate, which performs that loop internally.
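The advice above can be sketched as follows. This is one possible arrangement, not the only one: it uses a single mutex for all shared state and waits with predicates. As a small variation on the answer, it tracks a pending counter (a name introduced here) instead of the raw queue size, because an empty queue does not yet mean the last result has been stored. The function name square_all is also hypothetical.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

std::vector<int> square_all(const std::vector<int>& input) {
    std::mutex m;                       // protects everything below
    std::condition_variable work_cv;    // signalled when work arrives or on shutdown
    std::condition_variable done_cv;    // signalled when pending drops to zero
    std::queue<int> work;
    std::vector<int> result;
    bool should_end = false;
    std::size_t pending = 0;            // items pushed but not yet processed

    std::thread worker([&] {
        std::unique_lock<std::mutex> lk(m);
        while (true) {
            work_cv.wait(lk, [&] { return should_end || !work.empty(); });
            if (work.empty()) return;   // shutdown requested and nothing left
            int front = work.front();
            work.pop();
            lk.unlock();
            int squared = front * front;  // do the work without holding the lock
            lk.lock();
            result.push_back(squared);
            if (--pending == 0) done_cv.notify_one();
        }
    });

    {
        std::lock_guard<std::mutex> lk(m);
        for (int v : input) {
            work.push(v);
            ++pending;
        }
    }
    work_cv.notify_one();

    {
        std::unique_lock<std::mutex> lk(m);
        // The predicate is checked before blocking, so a fast worker that has
        // already finished cannot cause a missed notification.
        done_cv.wait(lk, [&] { return pending == 0; });
        should_end = true;
    }
    work_cv.notify_one();
    worker.join();
    return result;
}
```

Because the main thread checks pending under the mutex before it ever blocks, it does not matter whether the worker finishes before or after the main thread reaches the wait.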

QtConcurrent: why releaseThread and reserveThread cause deadlock?

In Qt 4.7 Reference for QThreadPool, we find:
void QThreadPool::releaseThread()
Releases a thread previously reserved by a call to reserveThread().
Note: Calling this function without previously reserving a thread
temporarily increases maxThreadCount(). This is useful when a thread
goes to sleep waiting for more work, allowing other threads to
continue. Be sure to call reserveThread() when done waiting, so that
the thread pool can correctly maintain the activeThreadCount().
See also reserveThread().
void QThreadPool::reserveThread()
Reserves one thread, disregarding activeThreadCount() and
maxThreadCount().
Once you are done with the thread, call releaseThread() to allow it to
be reused.
Note: This function will always increase the number of active threads.
This means that by using this function, it is possible for
activeThreadCount() to return a value greater than maxThreadCount().
See also releaseThread().
I want to use releaseThread() to make it possible to use nested concurrent map, but in the following code, it hangs in waitForFinished():
#include <QApplication>
#include <QMainWindow>
#include <QtConcurrentMap>
#include <QtConcurrentRun>
#include <QFuture>
#include <QThreadPool>
#include <QtTest/QTest>
#include <QFutureSynchronizer>
struct Task2 { // only calculation
typedef void result_type;
void operator()(int count) {
int k = 0;
for (int i = 0; i < count * 10; ++i) {
for (int j = 0; j < count * 10; ++j) {
k++;
}
}
assert(k >= 0);
}
};
struct Task1 { // will launch some other concurrent map
typedef void result_type;
void operator()(int count) {
QVector<int> vec;
for (int i = 0; i < 5; ++i) {
vec.push_back(i+count);
}
Task2 task;
QFuture<void> f = QtConcurrent::map(vec.begin(), vec.end(), task);
{
// with out releaseThread before wait, it will hang directly
QThreadPool::globalInstance()->releaseThread();
f.waitForFinished(); // BUG: may hang there
QThreadPool::globalInstance()->reserveThread();
}
}
};
int main() {
QThreadPool* gtpool = QThreadPool::globalInstance();
gtpool->setExpiryTimeout(50);
int count = 0;
for (;;) {
QVector<int> vec;
for (int i = 0; i < 40 ; i++) {
vec.push_back(i);
}
// launch a task with nested map
Task1 task; // Task1 will have nested concurrent map
QFuture<void> f = QtConcurrent::map(vec.begin(), vec.end(),task);
f.waitForFinished(); // BUG: may hang there
count++;
// waiting most of thread in thread pool expire
while (QThreadPool::globalInstance()->activeThreadCount() > 0) {
QTest::qSleep(50);
}
// launch a task only calculation
Task2 task2;
QFuture<void> f2 = QtConcurrent::map(vec.begin(), vec.end(), task2);
f2.waitForFinished(); // BUG: may hang there
qDebug() << count;
}
return 0;
}
This code will not run forever; it will hang after many loops (1~10000), with all threads waiting on a condition variable.
My questions are:
Why does it hang?
Can I fix it and keep the nested concurrent map?
dev env:
Linux version 2.6.32-696.18.7.el6.x86_64; Qt4.7.4; GCC 3.4.5
Windows 7; Qt4.7.4; mingw 4.4.0
The program hangs because of a race condition in QThreadPool that shows up when you work with expiryTimeout. Here is the analysis in detail:
The problem in QThreadPool - source
When starting a task, QThreadPool did something along the lines of:
QMutexLocker locker(&mutex);
taskQueue.append(task); // Place the task on the task queue
if (waitingThreads > 0) {
// there are already running idle thread. They are waiting on the 'runnableReady'
// QWaitCondition. Wake one up them up.
waitingThreads--;
runnableReady.wakeOne();
} else if (runningThreadCount < maxThreadCount) {
startNewThread(task);
}
And the thread's main loop looks like this:
void QThreadPoolThread::run()
{
QMutexLocker locker(&manager->mutex);
while (true) {
/* ... */
if (manager->taskQueue.isEmpty()) {
// no pending task, wait for one.
bool expired = !manager->runnableReady.wait(locker.mutex(),
manager->expiryTimeout);
if (expired) {
manager->runningThreadCount--;
return;
} else {
continue;
}
}
QRunnable *r = manager->taskQueue.takeFirst();
// run the task
locker.unlock();
r->run();
locker.relock();
}
}
The idea is that the thread will wait for a given amount of second for
a task, but if no task was added in a given amount of time, the thread
expires and is terminated. The problem here is that we rely on the
return value of runnableReady. If there is a task that is scheduled at
exactly the same time as the thread expires, then the thread will see
false and will expire. But the main thread will not restart any other
thread. That might let the application hang as the task will never be
run.
The quick workaround is to use a long expiry timeout (the default is 30000 ms) and to remove the while loop that waits for the threads to expire.
Here is the modified main function; with it the program runs smoothly on Windows 7, with 4 threads used by default:
int main() {
QThreadPool* gtpool = QThreadPool::globalInstance();
//gtpool->setExpiryTimeout(50); <-- don't set the expiry Timeout, use the default one.
qDebug() << gtpool->maxThreadCount();
int count = 0;
for (;;) {
QVector<int> vec;
for (int i = 0; i < 40 ; i++) {
vec.push_back(i);
}
// launch a task with nested map
Task1 task; // Task1 will have nested concurrent map
QFuture<void> f = QtConcurrent::map(vec.begin(), vec.end(),task);
f.waitForFinished(); // BUG: may hang there
count++;
/*
// waiting most of thread in thread pool expire
while (QThreadPool::globalInstance()->activeThreadCount() > 0)
{
QTest::qSleep(50);
}
*/
// launch a task only calculation
Task2 task2;
QFuture<void> f2 = QtConcurrent::map(vec.begin(), vec.end(), task2);
f2.waitForFinished(); // BUG: may hang there
qDebug() << count ;
}
return 0;
}
@tungIt's answer is good enough. I found the Qt bug report and the fix commit, just for reference:
https://bugreports.qt.io/browse/QTBUG-3786
https://github.com/qt/qtbase/commit/a9b6a78e54670a70b96c122b10ad7bd64d166514#diff-6d5794cef91df41c39b5e7cc6b71d041
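The expiry race described in the analysis above can be illustrated with standard C++ primitives. This is not Qt's code and not necessarily what the linked commit does; it is a sketch of the general principle that a worker should decide to expire based on the queue state checked under the lock, not on the timeout result alone. The class name ExpiringQueue and its methods are hypothetical.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>

class ExpiringQueue {
public:
    void push(int task) {
        {
            std::lock_guard<std::mutex> lk(m_);
            tasks_.push_back(task);
        }
        ready_.notify_one();
    }

    // Returns true and stores a task in `out`, or returns false if the queue
    // is still empty once the timeout has elapsed (the caller may then expire).
    bool pop_or_expire(std::chrono::milliseconds timeout, int& out) {
        std::unique_lock<std::mutex> lk(m_);
        // wait_for with a predicate returns the predicate's final value, so a
        // task queued at the very moment the timeout fires is still picked up
        // instead of being left behind by an expiring worker.
        if (!ready_.wait_for(lk, timeout, [this] { return !tasks_.empty(); }))
            return false;
        out = tasks_.front();
        tasks_.pop_front();
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable ready_;
    std::deque<int> tasks_;
};
```

A worker loop built on this would call pop_or_expire() repeatedly and exit only when it returns false, at which point the queue is known to be empty while the lock is held.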

Thread join hangs

Thread join is hanging in case of single producer and multiple consumer case.
I am attaching the codebase below:
1) This is the Consumer Thread
class ConsumerThread
{
wqueue<WorkItem*>& m_queue;
public:
ConsumerThread(wqueue<WorkItem*>& queue) : m_queue(queue) {}
std::thread start() {
return std::thread( [=] {runThr();} );
}
void runThr() {
// Remove 1 item at a time and process it. Blocks if no items are
// available to process.
for (int i = 0;; i++) {
printf("thread %lu, loop %d - waiting for item...\n",
std::this_thread::get_id(), i);
WorkItem* item = (WorkItem*)m_queue.remove();
printf("thread %lu, loop %d - got one item\n",
std::this_thread::get_id(), i);
printf("thread %lu, loop %d - item: message - %s, number - %d\n",
std::this_thread::get_id(), i, item->getMessage(),
item->getNumber());
delete item;
}
}
};
2) This is Work Item
class WorkItem
{
std::string m_message;
int m_number;
public:
WorkItem(const char* message, int number)
: m_message(message), m_number(number) {}
~WorkItem() {}
const char* getMessage() { return m_message.c_str(); }
int getNumber() { return m_number; }
};
3) This class holds the queue into which the producer pushes WorkItems and from which the consumers take them.
template <typename T> class wqueue
{
std::list<T> m_queue;
std::mutex m_mutex;
std::condition_variable m_condv;
public:
wqueue() {}
~wqueue() {}
void add(T item) {
m_mutex.lock();
m_queue.push_back(item);
m_condv.notify_one();
m_mutex.unlock();
}
T remove() {
std::unique_lock<std::mutex> lk(m_mutex);
while(m_queue.size() == 0)
m_condv.wait(lk);
T item = m_queue.front();
m_queue.pop_front();
return item;
}
int size() {
m_mutex.lock();
int size = m_queue.size();
m_mutex.unlock();
return size;
}
};
4) This is the class containing the main function
int main(int argc, char* argv[])
{
// Process command line arguments
if ( argc != 2 ) {
printf("usage: %s <iterations>\n", argv[0]);
exit(-1);
}
int iterations = atoi(argv[1]);
// Create the queue and consumer (worker) threads
wqueue<WorkItem*> queue;
ConsumerThread* thread1 = new ConsumerThread(queue);
ConsumerThread* thread2 = new ConsumerThread(queue);
std::thread t1 = thread1->start();
std::thread t2 = thread2->start();
t1.join();
t2.join();
// Add items to the queue
WorkItem* item;
for (int i = 0; i < iterations; i++) {
item = new WorkItem("abc", 123);
queue.add(item);
item = new WorkItem("def", 456);
queue.add(item);
item = new WorkItem("ghi", 789);
queue.add(item);
}
The t1.join() and t2.join() calls in section 4 hang.
Your consumer thread has no terminating condition so it runs forever:
for (int i = 0;; i++) // never ends
Joining a thread won't magically make it break out of its loop, you need to set an ended flag or something.
Also when the wqueue is empty all threads trying to remove() an element will block:
while(m_queue.size() == 0)
m_condv.wait(lk);
You try to join() the threads before putting anything in the queue.
There is nothing wrong with the behaviour, calling join() on a thread object will simply wait until the thread finishes before continuing. Your problem is rather that your threads don't terminate, which is a whole different issue.
In particular in a producer-consumer setup, both peers typically sit and wait for work. Unless you explicitly tell them not to wait for work any longer, they will sit there forever! If you in turn wait for them to finish, you will also wait forever, which is your problem. You need to signal them to stop looping and additionally you might have to interrupt them from waiting for work.
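The points made in the answers above (a terminating condition for the consumers, a way to wake them from the wait, and joining only after producing) can be sketched as follows. The names closable_queue, close() and consume_all are hypothetical, and the queue holds plain ints rather than WorkItem pointers to keep the sketch short.

```cpp
#include <atomic>
#include <condition_variable>
#include <list>
#include <mutex>
#include <thread>

template <typename T>
class closable_queue {
    std::list<T> m_queue;
    std::mutex m_mutex;
    std::condition_variable m_condv;
    bool m_closed = false;

public:
    void add(T item) {
        {
            std::lock_guard<std::mutex> lk(m_mutex);
            m_queue.push_back(item);
        }
        m_condv.notify_one();
    }
    // Tell the consumers that no more items will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lk(m_mutex);
            m_closed = true;
        }
        m_condv.notify_all();  // wake every consumer blocked in remove()
    }
    // Returns false once the queue is closed and fully drained.
    bool remove(T& out) {
        std::unique_lock<std::mutex> lk(m_mutex);
        m_condv.wait(lk, [this] { return m_closed || !m_queue.empty(); });
        if (m_queue.empty()) return false;
        out = m_queue.front();
        m_queue.pop_front();
        return true;
    }
};

// Produces `iterations` items, lets two consumers drain them, then joins.
int consume_all(int iterations) {
    closable_queue<int> queue;
    std::atomic<int> processed{0};
    auto consumer = [&] {
        int item;
        while (queue.remove(item)) ++processed;  // loop ends after close()
    };
    std::thread t1(consumer);
    std::thread t2(consumer);
    for (int i = 0; i < iterations; ++i) queue.add(i);
    queue.close();  // signal the end of work before joining
    t1.join();
    t2.join();
    return processed.load();
}
```

The order in consume_all is the important part: the items are added first, the queue is closed, and only then are the threads joined.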

Shutdown boost threads correctly

I have x boost threads that work at the same time. One producer thread fills a synchronised queue with calculation tasks. The consumer threads pop tasks from the queue and calculate them.
Image Source: https://www.quantnet.com/threads/c-multithreading-in-boost.10028/
The user may quit the program during this process, so I need to shut down my threads properly. My current approach does not seem to work, since exceptions are thrown. The intent is that on system shutdown all processes should be killed and stop their current task no matter what they are doing. Could you please show me how you would kill those threads?
Thread Initialisation:
for (int i = 0; i < numberOfThreads; i++)
{
std::thread* thread = new std::thread(&MyManager::worker, this);
mThreads.push_back(thread);
}
Thread Destruction:
void MyManager::shutdown()
{
for (int i = 0; i < numberOfThreads; i++)
{
mThreads.at(i)->join();
delete mThreads.at(i);
}
mThreads.clear();
}
Worker:
void MyManager::worker()
{
while (true)
{
int current = waitingList.pop();
Object * p = objects.at(current);
p->calculateMesh(); //this task is internally locked by a mutex
try
{
boost::this_thread::interruption_point();
}
catch (const boost::thread_interrupted&)
{
// Thread interruption request received, break the loop
std::cout << "- Thread interrupted. Exiting thread." << std::endl;
break;
}
}
}
Synchronised Queue:
#include <queue>
#include <thread>
#include <mutex>
#include <condition_variable>
template <typename T>
class ThreadSafeQueue
{
public:
T pop()
{
std::unique_lock<std::mutex> mlock(mutex_);
while (queue_.empty())
{
cond_.wait(mlock);
}
auto item = queue_.front();
queue_.pop();
return item;
}
void push(const T& item)
{
std::unique_lock<std::mutex> mlock(mutex_);
queue_.push(item);
mlock.unlock();
cond_.notify_one();
}
int sizeIndicator()
{
std::unique_lock<std::mutex> mlock(mutex_);
return queue_.size();
}
private:
bool isEmpty() {
std::unique_lock<std::mutex> mlock(mutex_);
return queue_.empty();
}
std::queue<T> queue_;
std::mutex mutex_;
std::condition_variable cond_;
};
The thrown error call stack:
... std::_Mtx_lockX(_Mtx_internal_imp_t * * _Mtx) Line 68 C++
... std::_Mutex_base::lock() Line 42 C++
... std::unique_lock<std::mutex>::unique_lock<std::mutex>(std::mutex & _Mtx) Line 220 C++
... ThreadSafeQueue<int>::pop() Line 13 C++
... MyManager::worker() Line 178 C++
From my experience on working with threads in both Boost and Java, trying to shut down threads externally is always messy. I've never been able to really get that to work cleanly.
The best I've gotten is to have a boolean value available to all the consumer threads that is set to true. When you set it to false, the threads will simply return on their own. In your case, that could easily be put into the while loop you have.
On top of that, you're going to need some synchronization so that you can wait for the threads to return before you delete them, otherwise you can get some hard to define behavior.
An example from a past project of mine:
Thread creation
barrier = new boost::barrier(numOfThreads + 1);
threads = new detail::updater_thread*[numOfThreads];
for (unsigned int t = 0; t < numOfThreads; t++) {
//This object is just a wrapper class for the boost thread.
threads[t] = new detail::updater_thread(barrier, this);
}
Thread destruction
for (unsigned int i = 0; i < numOfThreads; i++) {
threads[i]->requestStop();//Notify all threads to stop.
}
barrier->wait();//The update request will allow the threads to get the message to shutdown.
for (unsigned int i = 0; i < numOfThreads; i++) {
threads[i]->waitForStop();//Wait for all threads to stop.
delete threads[i];//Now we are safe to clean up.
}
Some methods that may be of interest from the thread wrapper.
//Constructor
updater_thread::updater_thread(boost::barrier * barrier)
{
this->barrier = barrier;
running = true;
thread = boost::thread(&updater_thread::run, this);
}
void updater_thread::run() {
while (running) {
barrier->wait();
if (!running) break;
//Do stuff
barrier->wait();
}
}
void updater_thread::requestStop() {
running = false;
}
void updater_thread::waitForStop() {
thread.join();
}
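The flag-based shutdown described in this answer can also be sketched with only the standard library, without a barrier. This is an illustration under assumptions, not the answerer's project code: the class name Manager and the processed() counter are hypothetical, and items still queued at shutdown are deliberately dropped, matching the question's requirement to stop no matter what.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class Manager {
public:
    explicit Manager(int number_of_threads) {
        for (int i = 0; i < number_of_threads; ++i)
            threads_.emplace_back([this] { worker(); });
    }
    ~Manager() { shutdown(); }

    void push(int item) {
        {
            std::lock_guard<std::mutex> lk(mutex_);
            queue_.push(item);
        }
        cond_.notify_one();
    }

    // Clear the flag, wake everyone, then wait for the workers to return.
    void shutdown() {
        running_ = false;
        cond_.notify_all();
        for (auto& t : threads_)
            if (t.joinable()) t.join();
        threads_.clear();
    }

    int processed() const { return processed_.load(); }

private:
    void worker() {
        while (running_) {
            int item;
            {
                std::unique_lock<std::mutex> lk(mutex_);
                // The timeout bounds how long a worker can sleep if the flag
                // changes between its predicate check and the actual wait.
                if (!cond_.wait_for(lk, std::chrono::milliseconds(50),
                                    [this] { return !running_ || !queue_.empty(); }))
                    continue;  // timed out with nothing to do; re-check the flag
                if (!running_) break;
                item = queue_.front();
                queue_.pop();
            }
            (void)item;  // the real calculation would go here
            ++processed_;
        }
    }

    std::queue<int> queue_;
    std::mutex mutex_;
    std::condition_variable cond_;
    std::atomic<bool> running_{true};
    std::atomic<int> processed_{0};
    std::vector<std::thread> threads_;
};
```

Because the threads are only deleted (here: destroyed with the vector) after join() has returned, no worker can still be locking the queue's mutex when the objects go away.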
Try moving the try block up (as in the sample below). If your thread is waiting for data (inside waitingList.pop()), it may be waiting inside the condition variable's wait(). In Boost.Thread that is an interruption point, so it may throw when the thread is interrupted.
void MyManager::worker()
{
while (true)
{
try
{
int current = waitingList.pop();
Object * p = objects.at(current);
p->calculateMesh(); //this task is internally locked by a mutex
boost::this_thread::interruption_point();
}
catch (const boost::thread_interrupted&)
{
// Thread interruption request received, break the loop
std::cout << "- Thread interrupted. Exiting thread." << std::endl;
break;
}
}
}
Maybe you are catching the wrong exception class?
Which would mean it does not get caught.
Not too familiar with threads but is it the mix of std::threads and boost::threads that is causing this?
Try catching the lowest parent exception.
I think this is a classic problem of reader/writer threads working on a common buffer. One of the safest ways of solving this problem is to use mutexes and signals. (I am not able to post the code here. Please send me an email and I will send the code to you.)