I want to call a function in parallel in C++ which waits for some time and then performs some task. But I don't want the execution flow to wait for that function. I considered using pthreads in a simple way, but then I still have to wait for the thread to join!
void A_Function()
{
/* Call a function which waits for some time and then perform some tasks */
/* Do not wait for the above function to return and continue performing the background tasks */
}
Note: if I do not perform the background tasks while calling the function in parallel, then in the next cycle the function does not give me correct output.
Thanks in advance.
Use the std::future returned by std::async to track the task. Wait on the future at the head of your function to ensure the background task has completed before the next iteration, since you stated that the next iteration depends on it.
In the example below, I make the background task a simple atomic increment of a counter, and the foreground task just returns the counter value. This is for illustrative purposes only!
#include <iostream>
#include <future>
#include <thread>
class Foo {
public:
Foo() : counter_(0) {}
std::pair<int, std::future<void>> a_function(std::future<void>& f) {
// Ensure that the background task from the previous iteration
// has completed
f.wait();
// Set the task for the next iteration
std::future<void> fut = std::async(std::launch::async,
&Foo::background_task, this);
// Do some work
int value = counter_.load();
// Return the result and the future for the next iteration
return std::make_pair(value, std::move(fut));
}
void background_task() {
++counter_;
}
private:
std::atomic<int> counter_;
};
int main() {
// Bootstrap the procedure with some empty task...
std::future<void> bleak = std::async(std::launch::deferred, [](){});
Foo foo;
// Iterate...
for (size_t i = 0; i < 10; ++i) {
// Call the function
std::pair<int, std::future<void>> result = foo.a_function(bleak);
// Set the future for the next iteration
bleak = std::move(result.second);
// Do something with the result
std::cout << result.first << "\n";
}
}
I'm new to C++ and trying to get my head around multithreading. I've got the basics covered. Now imagine this situation:
I have, say, N tasks that I want to have completed as soon as possible. That's easy: just start N threads and lean back. But I'm not sure whether this will work for N = 200 or more.
So I'd like to say: I have N tasks, and I want to start a limited number of M worker threads. How do I schedule a task to be issued to a new thread once one of the previous threads has finished?
Or is all this taken care of by the OS or runtime, so I need not worry at all, even if N gets really big?
No, you don't want to create 200 threads. While it would likely work just fine, creating a thread involves significant processing overhead. Rather, you want a "task queue" system, where a pool of worker threads (generally equal in size to the number of CPU cores) draws from a shared queue of things that need to be done. Intel TBB contains a commonly used task-queue implementation, but there are others as well.
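For a feel of the shape such a system takes, here is a minimal task-queue sketch (illustrative only, assuming C++11; a real library like TBB additionally handles work stealing, exceptions, and graceful shutdown):
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>
class SimpleTaskQueue {
public:
    explicit SimpleTaskQueue(unsigned workers) {
        for (unsigned i = 0; i < workers; ++i)
            threads_.emplace_back([this] { work(); });
    }
    ~SimpleTaskQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_all();
        for (auto& t : threads_) t.join();
    }
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(task));
        }
        cv_.notify_one();
    }
private:
    void work() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
                if (queue_.empty()) return; // done_ set and queue drained
                task = std::move(queue_.front());
                queue_.pop();
            }
            task(); // run outside the lock
        }
    }
    std::vector<std::thread> threads_;
    std::queue<std::function<void()>> queue_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool done_ = false;
};
Usage is then simply pool.post([]{ /* work */ }); from any thread.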
std::thread::hardware_concurrency may be useful to decide how many threads you want. If it returns anything but 0, it is the number of concurrent threads that can run simultaneously. It's often the number of CPU cores multiplied by the number of hardware threads each core can run: 12 cores with 2 hyperthreads per core makes 24. Exceeding this number will likely just slow everything down.
You can create a pool of threads standing by to grab work on your command since creating threads is somewhat expensive. If you have 1000000 tasks to deal with, you want the 24 threads (in this example) to be up all the time.
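For example (the fallback of 4 is an arbitrary guess for the case where the implementation cannot report a count):
#include <thread>
unsigned pick_thread_count() {
    unsigned n = std::thread::hardware_concurrency();
    return n != 0 ? n : 4; // 0 means "unknown"; fall back to a guess
}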
This is a very common scenario though and since C++17 there is an addition to many of the standard algorithms, like std::for_each, to make them execute according to execution policies. If you want it to execute in parallel, it'll use a built-in thread pool (most likely) to finish the task.
Example:
#include <algorithm>
#include <execution>
#include <vector>
struct Task {
    int data_to_work_on; // placeholder for the real input type
    int result;          // placeholder for the real output type
};
int main() {
    std::vector<Task> tasks;
    // Note: with some toolchains (e.g. GCC's libstdc++) the parallel
    // execution policies require linking against Intel TBB (-ltbb).
    std::for_each(std::execution::par, tasks.begin(), tasks.end(), [](Task& t) {
        t.result = t.data_to_work_on * t.data_to_work_on; // work on task `t` here
    });
    // all tasks done, check the result in each.
}
I have N tasks, and I want to start a limited number of M worker threads.
How do I schedule a task to be issued to a new thread once
one of the previous threads has finished?
Set your thread pool size, M, taking into account the number of threads available in your system (hardware_concurrency).
Use a counting_semaphore to make sure you don't launch a task if there is not an available thread pool slot.
Loop through your N tasks, acquiring a thread pool slot, running the task, and releasing the thread pool slot. Notice that, since tasks are launched asynchronously, you will be able to have M tasks running in parallel.
#include <future> // async
#include <iostream> // cout
#include <semaphore> // counting_semaphore
#include <vector>
// Note: hardware_concurrency() can return 0 ("unknown"); a real program
// should guard against that before sizing the pool.
static const size_t THREAD_POOL_SIZE_DEFAULT{ std::thread::hardware_concurrency() };
static const size_t THREAD_POOL_SIZE_MAX{ std::thread::hardware_concurrency() * 2 };
static const size_t NUM_TASKS_DEFAULT{ 20 };
template <typename F>
void run_tasks(
F&& f,
size_t thread_pool_size = THREAD_POOL_SIZE_DEFAULT,
size_t num_tasks = NUM_TASKS_DEFAULT)
{
thread_pool_size = std::min(thread_pool_size, THREAD_POOL_SIZE_MAX);
std::counting_semaphore task_slots(thread_pool_size);
auto futures{ std::vector<std::future<void>>(num_tasks) };
auto task_results{ std::vector<int>(num_tasks) };
// We can run thread_pool_size tasks in parallel
// If all task slots are busy, we have to wait for a task to finish
for (size_t i{ 0 }; i < num_tasks; ++i)
{
// Wait for a task slot to be free
task_slots.acquire();
futures[i] = std::async(
std::launch::async,
[i, &f, &task_result = task_results[i], &task_slots]() {
// Execute task
task_result = std::forward<F>(f)(i);
// Release the task slot
task_slots.release();
}
);
}
// Wait for all the tasks to finish
for (auto& future : futures) { future.get(); }
for (auto& result : task_results) { std::cout << result << " "; }
}
int main()
{
run_tasks([](int i) { return i * i; }, 4, 20);
}
This is my take on a thread pool (not extensively debugged yet). In main, it starts a thread pool with the maximum number of threads the hardware allows (the thing Ted Lyngmo was referring to).
There are quite a few things involved, since this thread pool also allows callers to get back the results of asynchronously started calls:
std::shared_future (to return a result to caller if needed)
std::packaged_task (to hold a call)
std::condition_variable (to communicate that stuff has entered the queue, or to signal all threads should stop)
std::mutex/std::unique_lock (to protect the queue of calls)
std::thread (of course)
use of lambdas
#include <cassert>
#include <condition_variable>
#include <exception>
#include <iostream>
#include <mutex>
#include <future>
#include <thread>
#include <vector>
#include <queue>
//=====================================================================================================================================
namespace details
{
// task_itf is something the threadpool can call to start a scheduled function call
// independent of argument and/or return value types
class task_itf
{
public:
virtual ~task_itf() = default;
virtual void execute() = 0;
};
//-------------------------------------------------------------------------------------------------------------------------------------
// A task is a container for a function call (+ arguments) and a future,
// but is already specialized for the return value type of the function call
// which the future also needs
//
template<typename retval_t>
class task final :
public task_itf
{
public:
template<typename lambda_t>
explicit task(lambda_t&& lambda) :
m_task(lambda)
{
}
std::future<retval_t> get_future()
{
return m_task.get_future();
}
std::shared_future<retval_t> get_shared_future()
{
return std::shared_future<retval_t>(m_task.get_future());
}
virtual void execute() override
{
m_task();
}
private:
std::packaged_task<retval_t()> m_task;
};
class stop_exception :
public std::exception
{
};
}
//-------------------------------------------------------------------------------------------------------------------------------------
// actual thread_pool class
class thread_pool_t
{
public:
// construct a thread_pool with specified number of threads.
explicit thread_pool_t(const std::size_t size) :
m_stop{ false }
{
std::condition_variable signal_started;
std::mutex start_mutex;
std::size_t number_of_threads_started{ 0u };
for (std::size_t n = 0; n < size; ++n)
{
// emplace the thread directly into the vector, no need to move or copy
m_threads.emplace_back([&]()
{
{
// update the counter under the same mutex the waiter uses,
// so the notification cannot be missed
std::unique_lock<std::mutex> lock{ start_mutex };
++number_of_threads_started;
}
signal_started.notify_all();
thread_loop();
});
}
// wait for all threads to have started.
std::unique_lock<std::mutex> lock{ start_mutex };
signal_started.wait(lock, [&] { return number_of_threads_started == size; });
}
// destructor signals all threads to stop as soon as they are done.
// then waits for them to stop.
~thread_pool_t()
{
{
std::unique_lock<std::mutex> lock(m_queue_mutex);
m_stop = true;
}
m_wakeup.notify_all();
for (auto& thread : m_threads)
{
thread.join();
}
}
// pass a function asynchronously to the threadpool
// this function returns a future so the calling thread
// may synchronize with a result if it so wishes.
template<typename lambda_t>
auto async(lambda_t&& lambda)
{
using retval_t = decltype(lambda());
auto task = std::make_shared<details::task<retval_t>>(lambda);
queue_task(task);
return task->get_shared_future();
}
// let the threadpool run the function but wait for
// the threadpool thread to finish
template<typename lambda_t>
auto sync(lambda_t&& lambda)
{
auto ft = async(lambda);
return ft.get();
}
void synchronize()
{
sync([] {});
}
private:
void queue_task(const std::shared_ptr<details::task_itf>& task_ptr)
{
{
std::unique_lock<std::mutex> lock(m_queue_mutex);
m_queue.push(task_ptr);
}
// signal only one thread, first waiting thread to wakeup will run the next task.
m_wakeup.notify_one();
}
std::shared_ptr<details::task_itf> get_next_task()
{
// the predicate captures 'this', so it must not be static
auto pred = [this] { return (m_stop || !m_queue.empty()); };
std::unique_lock<std::mutex> lock(m_queue_mutex);
m_wakeup.wait(lock, pred);
if (m_stop)
{
// use exception to break out of the mainloop
throw details::stop_exception();
}
auto task = m_queue.front();
m_queue.pop();
return task;
}
void thread_loop()
{
try
{
while (auto task = get_next_task())
{
task->execute();
}
}
catch (const details::stop_exception&)
{
}
}
std::vector<std::thread> m_threads;
std::mutex m_queue_mutex;
std::queue<std::shared_ptr<details::task_itf>> m_queue;
std::condition_variable m_wakeup;
bool m_stop;
};
//-----------------------------------------------------------------------------
int main()
{
thread_pool_t thread_pool{ std::thread::hardware_concurrency() };
for (int i = 0; i < 200; i++)
{
// just schedule asynchronous calls, returned futures are not used in this example
thread_pool.async([i]
{
std::cout << i << " ";
});
}
// this threadpool will not by default wait until all work is finished
// but stops processing when destructed.
// a call to synchronize blocks until the empty task is executed; note that
// with multiple workers this only guarantees earlier tasks have been
// dequeued, not necessarily that they have all finished.
thread_pool.synchronize();
std::cout << "\nDone...\n";
return 0;
}
Having played a little with the current implementation of the Coroutine TS in Clang, I stumbled upon the asio stackless coroutine implementation. It is described as Portable Stackless Coroutines in One* Header.
Dealing mostly with asynchronous code, I wanted to try them as well.
The coroutine block inside the main function shall await the result asynchronously set by the thread spawned in function foo. However, I am uncertain how to let execution continue at point <1> (after the yield expression) once the thread has set the value.
With the Coroutine TS I would resume via the coroutine_handle; however, boost::asio::coroutine does not seem to be callable.
Is this even possible using boost::asio::coroutine?
#include <thread>
#include <chrono>
#include <boost/asio/coroutine.hpp>
#include <boost/asio/yield.hpp>
#include <cstdio>
using namespace std::chrono_literals;
using coroutine = boost::asio::coroutine;
void foo(coroutine & coro, int & result) {
std::thread([&](){
std::this_thread::sleep_for(1s);
result = 3;
// how to resume at <1>?
}).detach();
}
int main(int, const char**) {
coroutine coro;
int result;
reenter(coro) {
// Wait for result
yield foo(coro, result);
// <1>
std::printf("%d\n", result);
}
std::thread([](){
std::this_thread::sleep_for(2s);
}).join();
return 0;
}
Thanks for your help
First off, stackless coroutines are better described as resumable functions. The problem you're currently having is caused by doing this in main: if you extract your logic into a separate functor, it becomes possible:
class task; // Forward declare both because they should know about each other
void foo(task &task, int &result);
// Common practice is to subclass coro
class task : coroutine {
// All reused variables should not be local or they will be
// re-initialized
int result;
public:
void start() {
// In order to actually begin, we need to "invoke ourselves"
(*this)();
}
// Actual task implementation
void operator()() {
// Reenter actually manages the jumps defined by yield
// If it's executed for the first time, it will just run from the start
// If it reenters (aka, yield has caused it to stop and we re-execute)
// it will jump to the right place for you
reenter(this) {
// Yield will store the current location; when reenter
// is run a second time, it will jump past yield for you
yield foo(*this, result);
std::printf("%d\n", result);
}
}
};
// Our longer task
void foo(task & t, int & result) {
std::thread([&](){
std::this_thread::sleep_for(1s);
result = 3;
// The result is done, reenter the task which will go to just after yield
// Keep in mind this will now run on the current thread
t();
}).detach();
}
int main(int, const char**) {
task t;
// This will start the task
t.start();
std::thread([](){
std::this_thread::sleep_for(2s);
}).join();
return 0;
}
Note that it's not possible to yield from sub functions. This is a limitation of stackless coroutines.
How it works:
yield stores a unique identifier to jump to inside the coroutine
yield will run the expression you put after it; this should be an asynchronous call, or there would be little benefit
after running, it will break out of the reenter block.
Now "start" is done, and you start another thread to wait for. Meanwhile, foo's thread finishes its sleep and calls your task again. Now:
the reenter block will read the state of your coroutine, to find it has to jump past the foo call
your task will resume, print the result and drop out of the function, returning to the foo thread.
foo thread is now done and main is likely still waiting for the 2nd thread.
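If the macros feel magical: a boost::asio::coroutine is essentially just an integer recording the last resume point, and reenter/yield expand to (roughly) a switch over that integer. A simplified hand-written equivalent, for intuition only (this is not the actual macro expansion, and start_async_op is a hypothetical placeholder):
#include <cstdio>
class manual_task {
public:
    void operator()() {
        switch (state_) {
        case 0:
            state_ = 1;       // remember where to resume (what yield does)
            start_async_op(); // kick off the async work...
            return;           // ...and fall out of the function
        case 1:
            std::printf("%d\n", result_); // ...resumed here later
            return;
        }
    }
    int result_ = 0;
private:
    void start_async_op(); // hypothetical; arranges for (*this)() to be
                           // called again once the result is ready
    int state_ = 0;
};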
In C++11, I have a ThreadPool object which manages a number of threads that are enqueued via a single lambda function. I know how many rows of data I have to work on, so I know ahead of time that I will need to queue N jobs. What I am not sure about is how to tell when all of those jobs are finished, so I can move on to the next step.
This is the code to manage the ThreadPool:
#include <cstdlib>
#include <vector>
#include <deque>
#include <functional> // std::function
#include <iostream>
#include <atomic>
#include <thread>
#include <mutex>
#include <condition_variable>
class ThreadPool;
class Worker {
public:
Worker(ThreadPool &s) : pool(s) { }
void operator()();
private:
ThreadPool &pool;
};
class ThreadPool {
public:
ThreadPool(size_t);
template<class F>
void enqueue(F f);
~ThreadPool();
void joinAll();
int taskSize();
private:
friend class Worker;
// the task queue
std::deque< std::function<void()> > tasks;
// keep track of threads
std::vector< std::thread > workers;
// sync
std::mutex queue_mutex;
std::condition_variable condition;
bool stop;
};
void Worker::operator()()
{
std::function<void()> task;
while(true)
{
{ // acquire lock
std::unique_lock<std::mutex>
lock(pool.queue_mutex);
// look for a work item
while ( !pool.stop && pool.tasks.empty() ) {
// if there are none wait for notification
pool.condition.wait(lock);
}
if ( pool.stop ) {// exit if the pool is stopped
return;
}
// get the task from the queue
task = pool.tasks.front();
pool.tasks.pop_front();
} // release lock
// execute the task
task();
}
}
// the constructor just launches some amount of workers
ThreadPool::ThreadPool(size_t threads)
: stop(false)
{
for (size_t i = 0;i<threads;++i) {
workers.push_back(std::thread(Worker(*this)));
}
//workers.
//tasks.
}
// the destructor joins all threads
ThreadPool::~ThreadPool()
{
// stop all threads (set the flag under the lock to avoid a data
// race with the workers reading it)
{
std::unique_lock<std::mutex> lock(queue_mutex);
stop = true;
}
condition.notify_all();
// join them
for ( size_t i = 0;i<workers.size();++i) {
workers[i].join();
}
}
void ThreadPool::joinAll() {
// join them
for ( size_t i = 0;i<workers.size();++i) {
workers[i].join();
}
}
int ThreadPool::taskSize() {
return tasks.size();
}
// add new work item to the pool
template<class F>
void ThreadPool::enqueue(F f)
{
{ // acquire lock
std::unique_lock<std::mutex> lock(queue_mutex);
// add the task
tasks.push_back(std::function<void()>(f));
} // release lock
// wake up one thread
condition.notify_one();
}
And then I distribute my job among threads like this:
ThreadPool pool(4);
/* ... */
for (int y = 0; y < N; y++) {
pool.enqueue([this, y] {
this->ProcessRow(y);
});
}
// wait until all threads are finished
std::this_thread::sleep_for( std::chrono::milliseconds(100) );
Waiting for 100 milliseconds works only because I know those jobs can complete in less time than 100 ms, but obviously it's not the best approach. Once it has completed N rows of processing, it needs to go through another 1000 or so generations of the same thing. Obviously, I want to begin the next generation as soon as I can.
I know there must be some way to add code into my ThreadPool so that I can do something like this:
while ( pool.isBusy() ) {
std::this_thread::sleep_for( std::chrono::milliseconds(1) );
}
I've been working on this for a couple of nights now and I find it hard to find good examples of how to do this. So, what would be the proper way to implement my isBusy() method?
I got it!
First of all, I introduced a few extra members to the ThreadPool class:
class ThreadPool {
/* ... existing code ... */
/* plus the following */
std::atomic<int> njobs_pending{ 0 };
std::mutex main_mutex;
std::condition_variable main_condition;
};
Now I can do better than checking some status every X amount of time; I can block the main loop until no more jobs are pending:
void ThreadPool::waitUntilCompleted() {
std::unique_lock<std::mutex> lock(main_mutex);
// wait on a predicate so we do not block forever if the jobs
// already finished before we got here
main_condition.wait(lock, [this] { return njobs_pending == 0; });
}
As long as I manage what's pending with the following bookkeeping code, at the head of the ThreadPool::enqueue() function:
njobs_pending++;
and right after I run the task in the Worker::operator()() function:
if ( --pool.njobs_pending == 0 ) {
// lock briefly so the notification cannot slip between the
// waiter's predicate check and its wait
std::lock_guard<std::mutex> lock(pool.main_mutex);
pool.main_condition.notify_one();
}
Then the main thread can enqueue whatever tasks are necessary and then sit and wait until all calculations are completed with:
for (int y = 0; y < N; y++) {
pool.enqueue([this, y] {
this->ProcessRow(y);
});
}
pool.waitUntilCompleted();
You may need to create an internal structure of threads, each associated with a bool flag.
class ThreadPool {
private:
// This Structure Will Keep Track Of Each Thread's Progress
struct ThreadInfo {
std::thread thread;
std::atomic<bool> isDone{ false }; // Read From Other Threads, So Atomic
explicit ThreadInfo( std::thread&& threadIn ) :
thread( std::move( threadIn ) )
{}
// std::atomic Is Not Movable, So Provide A Move Constructor By
// Hand To Keep The Vector Happy
ThreadInfo( ThreadInfo&& other ) noexcept :
thread( std::move( other.thread ) ), isDone( other.isDone.load() )
{}
}; // ThreadInfo
// This Vector Should Be Populated In The Constructor Initially And
// Updated Anytime You Would Add A New Task.
// This Should Also Replace // std::vector<std::thread> workers
std::vector<ThreadInfo> workers;
public:
// The rest of your class would appear to be the same, but you need a
// way to test if a particular thread is currently active. When the
// thread is done this bool flag would report as being true;
// This will only return or report if a particular thread is done or not
// You would have to set this variable's flag for a particular thread to
// true when it completes its task, otherwise it will always be false
// from moment of creation. I did not add in any bounds checking to keep
// it simple which should be taken into consideration.
bool isBusy( unsigned idx ) const {
return !workers[idx].isDone; // Busy Means Not Yet Done
}
};
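The flag only flips if the task itself sets it on completion. Stripped of the class plumbing, the idea is the following (do_the_real_work is a hypothetical placeholder):
// Sketch: Wrap The Real Work So A Completion Flag Flips When It Finishes
std::atomic<bool> done{ false };
std::thread t( [&done]() {
    do_the_real_work(); // Hypothetical Task
    done = true;        // Atomic Store, Safe To Poll From Another Thread
} );
// ... Elsewhere: if ( done ) { /* Task Completed */ }
Note that if such flags live in a vector that may reallocate, captured references to them can dangle, so reserve the vector up front or store the flags behind stable pointers.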
If you have N jobs and the calling thread has to wait for all of them to finish, the most efficient way is to create a variable somewhere that an atomic operation sets to N before the jobs are scheduled, and that each job atomically decrements when its computation is done. You can then use an atomic instruction to test whether the variable is zero.
Alternatively, use a locked decrement with a wait handle that is signalled when the variable reaches zero.
I just have to say, I do not like this idea you are asking for:
while ( pool.isBusy() ) {
std::this_thread::sleep_for( std::chrono::milliseconds(1) );
}
It just does not fit well: the sleep will almost never be exactly 1 ms, and it uses resources needlessly. The better way is to decrement the variable atomically and atomically test whether everything is done; the last job, on seeing the counter reach zero, signals an event. If you must wait, wait on that event with WaitForSingleObject: you wake up once, after completion, not many times.
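A portable sketch of that counting idea, using std::condition_variable where the answer mentions a Win32 event and WaitForSingleObject:
#include <atomic>
#include <condition_variable>
#include <mutex>
std::atomic<int> jobs_left;
std::mutex done_mutex;
std::condition_variable all_done;
// Before scheduling: jobs_left = N;
// At the end of every job:
void job_finished() {
    if (--jobs_left == 0) {
        std::lock_guard<std::mutex> lock(done_mutex); // avoid a missed wakeup
        all_done.notify_one();
    }
}
// In the waiting thread (the WaitForSingleObject equivalent):
void wait_for_all() {
    std::unique_lock<std::mutex> lock(done_mutex);
    all_done.wait(lock, [] { return jobs_left == 0; });
}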
I have a vector of Timer Objects. Each Timer Object launches an std::thread that simulates a growing period. I am using a Command pattern.
What is happening is that each Timer gets executed one after another, but what I really want is for one to be executed, then, once it is finished, the next one, and so on, while not interfering with the main execution of the program.
class Timer
{
public:
bool _bTimerStarted;
bool _bTimerCompleted;
int _timerDuration;
virtual ~Timer() { }
virtual void execute()=0;
virtual void runTimer()=0;
inline void setDuration(int _s) { _timerDuration = _s; };
inline int getDuration() { return _timerDuration; };
inline bool isTimerComplete() { return _bTimerCompleted; };
};
class GrowingTimer : public Timer
{
public:
void execute()
{
//std::cout << "Timer execute..." << std::endl;
_bTimerStarted = false;
_bTimerCompleted = false;
//std::thread t1(&GrowingTimer::runTimer, this); //Launch a thread
//t1.detach();
runTimer();
}
void runTimer()
{
//std::cout << "Timer runTimer..." << std::endl;
_bTimerStarted = true;
auto start = std::chrono::high_resolution_clock::now();
std::this_thread::sleep_until(start + std::chrono::seconds(20));
_bTimerCompleted = true;
std::cout << "Growing Timer Finished..." << std::endl;
}
};
class Timers
{
std::vector<Timer*> _timers;
struct ExecuteTimer
{
void operator()(Timer* _timer) { _timer->execute(); }
};
public:
void add_timer(Timer& _timer) { _timers.push_back(&_timer); }
void execute()
{
//std::for_each(_timers.begin(), _timers.end(), ExecuteTimer());
for (int i=0; i < _timers.size(); i++)
{
Timer* _t = _timers.at(i);
_t->execute();
//while ( ! _t->isTimerComplete())
//{
//}
}
}
};
Executing the above like:
Timers _timer;
GrowingTimer _g, _g1;
_g.setDuration(BROCCOLI::growTimeSeconds);
_g1.setDuration(BROCCOLI::growTimeSeconds);
_timer.add_timer(_g);
_timer.add_timer(_g1);
start_timers();
}
void start_timers()
{
_timer.execute();
}
In Timers::execute I am trying a few different ways to execute the first timer and not execute the next until I can somehow signal that it is done.
UPDATE:
I am now doing this to execute everything:
Timers _timer;
GrowingTimer _g, _g1;
_g.setDuration(BROCCOLI::growTimeSeconds);
_g1.setDuration(BROCCOLI::growTimeSeconds);
_timer.add_timer(_g);
_timer.add_timer(_g1);
//start_timers();
std::thread t1(&Broccoli::start_timers, this); //Launch a thread
t1.detach();
}
void start_timers()
{
_timer.execute();
}
The first timer completes (I see the "completed" cout), but it crashes at _t->execute(); inside the for loop with an EXC_BAD_ACCESS. I added a cout to check the size of the vector and it is 2, so both timers are inside. I do see this in the console:
this Timers * 0xbfffd998
_timers std::__1::vector<Timer *, std::__1::allocator<Timer *> >
If I change the detach() to join(), everything completes without the crash, but it blocks execution of my app until those timers finish.
Why are you using threads here? Timers::execute() calls execute on a timer, then waits for it to finish, then calls execute on the next, and so forth. Why don't you just call the timer function directly in Timers::execute() rather than spawning a thread and then waiting for it?
Threads allow you to write code that executes concurrently. What you want is serial execution, so threads are the wrong tool.
Update: In the updated code you run start_timers on a background thread, which is good. However, by detaching that thread you leave the thread running past the end of the scope. This means that the timer objects _g and _g1 and even the Timers object _timers are potentially destroyed before the thread has completed. Given the time-consuming nature of the timers thread, and the fact that you used detach rather than join in order to avoid your code blocking, this is certainly the cause of your problem.
If you run code on a thread then you need to ensure that all objects accessed by that thread have a long-enough lifetime that they are still valid when the thread accesses them. For detached threads this is especially hard to achieve, so detached threads are not recommended.
One option is to create an object containing _timers, _g and _g1 alongside the thread t1, and have its destructor join with the thread. All you need to do then is ensure that the object lives until the point where it is safe to wait for the timers to complete.
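A rough sketch of that option (the TimerRunner name and structure are mine, not from the original code):
// Sketch: an owner whose destructor joins, so the timer objects
// are guaranteed to outlive the background thread.
struct TimerRunner {
    GrowingTimer g, g1;
    Timers timers;
    std::thread worker;
    void start() {
        // set durations etc. before starting
        timers.add_timer(g);
        timers.add_timer(g1);
        worker = std::thread([this] { timers.execute(); });
    }
    ~TimerRunner() {
        if (worker.joinable())
            worker.join(); // g, g1 and timers are destroyed only after this
    }
};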
If you don't want to interfere with the execution of the program, you could do something like @Joel said, but also add a thread to the Timers class which would execute the timers in the vector.
You could include a unique_ptr to the thread in GrowingTimer instead of creating it as a local object in execute and calling detach. You can still create the thread in execute, but you would do it with a unique_ptr::reset call.
Then use join instead of isTimerComplete (add a join function to the Timer base class). The isTimerComplete polling mechanism will be extremely inefficient because it will basically use up that thread's entire time slice continually polling, whereas join will block until the other thread is complete.
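A sketch of that unique_ptr approach (only the changed parts of GrowingTimer are shown; it assumes a virtual join() has been added to the Timer base class as suggested):
class GrowingTimer : public Timer
{
public:
    void execute()
    {
        _bTimerStarted = false;
        _bTimerCompleted = false;
        // reset launches the thread; no detach, so it can be joined later
        _thread.reset(new std::thread(&GrowingTimer::runTimer, this));
    }
    void join() // assumes Timer declares a virtual join()
    {
        if (_thread && _thread->joinable())
            _thread->join(); // blocks until runTimer finishes
    }
private:
    std::unique_ptr<std::thread> _thread;
};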
An example of join:
#include <iostream>
#include <chrono>
#include <thread>
using namespace std;
void threadMain()
{
this_thread::sleep_for(chrono::seconds(5));
cout << "Done sleeping\n";
}
int main()
{
thread t(threadMain);
for (int i = 0; i < 10; ++i)
{
cout << i << "\n";
}
t.join();
cout << "Press Enter to exit\n";
cin.get();
return 0;
}
Note how the main thread keeps running while the other thread does its thing. Note that Anthony's answer is right in that you don't really seem to need more than one background thread that executes tasks sequentially, rather than starting a thread and waiting for it to finish before starting a new one.
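Sketched with the question's own classes, that single background thread might look like this (assuming the timers outlive the thread, as discussed above):
Timers _timer;
// ... add and configure the timers as before ...
std::thread worker([&_timer] {
    _timer.execute(); // runs each timer to completion, one after another
});
// the main thread continues immediately; join only when the results are needed
worker.join();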
Is there a way of running a function back on the main thread?
So, if I called a function via std::async that downloaded a file and then parsed the data, could it then call a callback function which would run on my main UI thread and update the UI?
I know threads are equal in the default C++ implementation, so would I have to create a shared pointer to my main thread? How would I do this, and pass the async function not only the shared pointer to the main thread but also a pointer to the function I want to run on it, and then run it on that main thread?
I have been reading C++ Concurrency in Action and chapter four (AKA "The Chapter I Just Finished") describes a solution.
The Short Version
Have a shared std::deque<std::packaged_task<void()>> (or a similar sort of message/task queue). Your std::async-launched functions can push tasks to the queue, and your GUI thread can process them during its loop.
There Isn't Really a Long Version, but Here Is an Example
Shared Data
std::deque<std::packaged_task<void()>> tasks;
std::mutex tasks_mutex;
std::atomic<bool> gui_running;
The std::async Function
void one_off()
{
std::packaged_task<void()> task(FUNCTION TO RUN ON GUI THREAD); //!!
std::future<void> result = task.get_future();
{
std::lock_guard<std::mutex> lock(tasks_mutex);
tasks.push_back(std::move(task));
}
// wait on result
result.get();
}
The GUI Thread
void gui_thread()
{
while (gui_running) {
// process messages
{
std::unique_lock<std::mutex> lock(tasks_mutex);
while (!tasks.empty()) {
auto task(std::move(tasks.front()));
tasks.pop_front();
// unlock during the task
lock.unlock();
task();
lock.lock();
}
}
// "do gui work"
std::this_thread::sleep_for(std::chrono::milliseconds(1000));
}
}
Notes:
I am (always) learning, so there is a decent chance that my code is not great. The concept is at least sound though.
The destructor of the return value from std::async (a std::future<>) will block until the operation launched with std::async completes (see std::async), so waiting on the result of a task (as I do in my example) in one_off might not be a brilliant idea.
You may want to (I would, at least) create your own threadsafe MessageQueue type to improve code readability/maintainability/blah blah blah (a rough sketch follows after these notes).
I swear there was one more thing I wanted to point out, but it escapes me right now.
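For example, the shared deque and mutex above could be wrapped up like this (a sketch of that MessageQueue idea, nothing more):
#include <deque>
#include <future>
#include <mutex>
class MessageQueue {
public:
    void push(std::packaged_task<void()> task) {
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push_back(std::move(task));
    }
    // Returns false if there was nothing to pop.
    bool try_pop(std::packaged_task<void()>& task) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (tasks_.empty()) return false;
        task = std::move(tasks_.front());
        tasks_.pop_front();
        return true;
    }
private:
    std::deque<std::packaged_task<void()>> tasks_;
    std::mutex mutex_;
};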
Full Example
#include <atomic>
#include <chrono>
#include <deque>
#include <iostream>
#include <mutex>
#include <future>
#include <thread>
// shared stuff:
std::deque<std::packaged_task<void()>> tasks;
std::mutex tasks_mutex;
std::atomic<bool> gui_running;
void message()
{
std::cout << std::this_thread::get_id() << std::endl;
}
void one_off()
{
std::packaged_task<void()> task(message);
std::future<void> result = task.get_future();
{
std::lock_guard<std::mutex> lock(tasks_mutex);
tasks.push_back(std::move(task));
}
// wait on result
result.get();
}
void gui_thread()
{
std::cout << "gui thread: "; message();
while (gui_running) {
// process messages
{
std::unique_lock<std::mutex> lock(tasks_mutex);
while (!tasks.empty()) {
auto task(std::move(tasks.front()));
tasks.pop_front();
// unlock during the task
lock.unlock();
task();
lock.lock();
}
}
// "do gui work"
std::this_thread::sleep_for(std::chrono::milliseconds(1000));
}
}
int main()
{
gui_running = true;
std::cout << "main thread: "; message();
std::thread gt(gui_thread);
for (unsigned i = 0; i < 5; ++i) {
// note:
// these will be launched sequentially because result's
// destructor will block until one_off completes
auto result = std::async(std::launch::async, one_off);
// maybe do something with result if it is not void
}
// the for loop will not complete until all the tasks have been
// processed by gui_thread
// ...
// cleanup
gui_running = false;
gt.join();
}
Dat Output
$ ./messages
main thread: 140299226687296
gui thread: 140299210073856
140299210073856
140299210073856
140299210073856
140299210073856
140299210073856
Are you looking for std::launch::deferred? Passing this parameter to std::async makes the task execute on the calling thread when the get() function is called for the first time.
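A small illustration of that behavior (note that with std::launch::deferred nothing runs until get() or wait() is called):
#include <future>
#include <iostream>
#include <thread>
int main() {
    auto fut = std::async(std::launch::deferred, [] {
        std::cout << "runs on thread " << std::this_thread::get_id() << "\n";
        return 42;
    });
    // Nothing has run yet; the lambda executes on *this* thread, right here:
    std::cout << "value: " << fut.get() << "\n";
}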