Displaying results as soon as they are ready with std::async

Displaying results as soon as they are ready with std::async - c++

I'm trying to discover asynchronous programming in C++. Here's a toy example I've been using:
#include <iostream>
#include <future>
#include <vector>
#include <chrono>
#include <thread>
#include <random>
// For simplicity
using namespace std;
int called_from_async(int m, int n)
{
this_thread::sleep_for(chrono::milliseconds(rand() % 1000));
return m * n;
}
void test()
{
int m = 12;
int n = 42;
vector<future<int>> results;
for(int i = 0; i < 10; i++)
{
for(int j = 0; j < 10; j++)
{
results.push_back(async(launch::async, called_from_async, i, j));
}
}
for(auto& f : results)
{
cout << f.get() << endl;
}
}
Now, the example is not really interesting, but it raises a question that is, to me, interesting. Let's say I want to display results as they "arrive" (I don't know what will be ready first, since the delay is random), how should I do it?
What I'm doing here is obviously wrong, since I wait for all the tasks in the order in which I created them - so I'll wait for the first to finish even if it's longer than the others.
I thought about the following idea: for each future, using wait_for on a small time and if it's ready, display the value. But I feel weird doing that:
while (any_of(results.begin(), results.end(), [](const future<int>& f){
return f.wait_for(chrono::seconds(0)) != future_status::ready;
}))
{
cout << "Loop" << endl;
for(auto& f : results)
{
auto result = f.wait_for(std::chrono::milliseconds(20));
if (result == future_status::ready)
cout << f.get() << endl;
}
}
This brings another issue: we'd call get several times on some futures, which is illegal:
terminate called after throwing an instance of 'std::future_error' what(): std::future_error: No associated state
So I don't really know what to do here, please suggest!

Use valid() to skip the futures for which you have already called get().
bool all_ready;
do {
all_ready = true;
for(auto& f : results) {
if (f.valid()) {
auto result = f.wait_for(std::chrono::milliseconds(20));
if (result == future_status::ready) {
cout << f.get() << endl;
}
else {
all_ready = false;
}
}
}
}
while (!all_ready);

Related

Let main thread wait async threads complete

I'm new to c++ and don't know how to let main thread wait for all async threads done. I refered this but makes void consume() not parallel.
#include <iostream>
#include <vector>
#include <unistd.h> // sleep
#include <future>
using namespace std;
class Myclass {
private:
std::vector<int> resources;
std::vector<int> res;
std::mutex resMutex;
std::vector<std::future<void>> m_futures;
public:
Myclass() {
for (int i = 0; i < 10; i++) resources.push_back(i); // add task
res.reserve(resources.size());
}
void consume() {
for (int i = 0; i < resources.size(); i++) {
m_futures.push_back(std::async(std::launch::async, &Myclass::work, this, resources[i]));
// m_futures.back().wait();
}
}
void work(int x) {
sleep(1); // Simulation time-consuming
std::lock_guard<std::mutex> lock(resMutex);
res.push_back(x);
printf("%d be added.---done by %d.\n", x, std::this_thread::get_id());
}
std::vector<int> &getRes() { return res;}
};
int main() {
Myclass obj;
obj.consume();
auto res = obj.getRes();
cout << "Done. res.size = " << res.size() << endl;
for (int i : res) cout << i << " ";
cout <<"main thread over\n";
}
Main thread ends up when res = 0. I want obj.getRes() be be executed when all results be added into res.
Done. res.size = 0
main thread over
4 be added.---done by 6.
9 be added.---done by 11...

You had the right idea with the commented out line: m_futures.back().wait();, you just have it in the wrong place.
As you note, launching a std::async and then waiting for its result right after, forces the entire thing to execute in series and makes the async pointless.
Instead you want two functions: One, like your consume() that launches all the async's, and then another that loops over the futures and calls wait (or get, whatever suits your needs) on them - and then call that from main.
This lets them all run in parallel, while still making main wait for the final result.

Addition to #Frodyne 's answer,
consume() function calls are parallel, and main thread waits for the all consume() s have their work done;
void set_wait(void)
{
for (int i = 0; i < resources.size(); i++) {
m_futures[i].wait();
}
}
And call it here
void consume() {
for (int i = 0; i < resources.size(); i++) {
m_futures.push_back(std::async(std::launch::async, &Myclass::work, this, resources[i]));
// Calling wait() here makes no sense
}
set_wait(); // Waits for all threads do work
}
I created new function for convenience.

You can use std::future:wait after you add task to m_futures. Example.
void consume() {
for (int i = 0; i < resources.size(); i++) {
m_futures.push_back(std::async(std::launch::async, &Myclass::work, this, resources[i]));
//m_futures.back().wait();
}
for(auto& f: m_futures) f.wait();
}

boost::thread resource temporarily not available

I have a very similar problem to this. Unfortunately, I am not allowed to comment on it so please excuse me for opening up another topic for this. My code is running a two-stage calculation iteratively which in principle looks like this:
while(!finishing_condition_met)
{
boost::thread_group executionGrp1;
for(int w = 0; w < numThreads; w++)
{
boost::thread * curThread = new boost::thread(&Class::operation1, this, boost::ref(argument1), ...);
executionGrp1.add_thread(curThread);
}
executionGrp1.join_all();
boost::thread_group executionGrp2;
for(int w = 0; w < numThreads; w++)
{
boost::thread * curThread = new boost::thread(&Class::operation2, this, boost::ref(argument1), ...);
executionGrp2.add_thread(curThread);
}
executionGrp2.join_all();
update_finished_criterion();
}
Since numThreads is significantly smaller than what the kernel would allow (it is set to hardware concurrency which is 56 on the current machine), I was surprised see this error. Does join_all() not take care of the finished threads?
The thread_pool-approach suggested in the other post seems interesting but I am not exactly sure how to adapt it such that I can rerun everything within the loop multiple times while still waiting for the first stage to finish before starting the second stage.
Any suggestions are welcome! Thanks in advance.
EDIT: This is how I can cause this error in a minimalistic fashion. AFAIK, this is the standard way to implement parallel sections. Am I missing something?
#include "boost/thread.hpp"
#include "boost/chrono.hpp"
#include <iostream>
#include <algorithm>
#include <ctime>
using namespace std;
int numThreads = boost::thread::hardware_concurrency();
void wait(int seconds) {
boost::this_thread::sleep_for(boost::chrono::milliseconds(seconds));
return;
}
int subthread(int i) {
wait(i/numThreads);
return 1;
}
void threads(int nT) {
boost::thread_group exeGrp;
for (int i=0;i<nT;i++) {
boost::thread * curThread = new boost::thread(&subthread, i);
exeGrp.add_thread(curThread);
}
exeGrp.join_all();
}
int main() {
for (int a=0;a<numThreads;a++) {
cout << "Starting " << numThreads << " threads [" << a << "/" << numThreads << "]" << endl;
threads(numThreads);
}
cout << "done" << endl;
}
Output when running code

How to apply a concurrent solution to a Producer-Consumer like situation

I have a XML file with a sequence of nodes. Each node represents an element that I need to parse and add in a sorted list (the order must be the same of the nodes found in the file).
At the moment I am using a sequential solution:
struct Graphic
{
bool parse()
{
// parsing...
return parse_outcome;
}
};
vector<unique_ptr<Graphic>> graphics;
void producer()
{
for (size_t i = 0; i < N_GRAPHICS; i++)
{
auto g = new Graphic();
if (g->parse())
graphics.emplace_back(g);
else
delete g;
}
}
So, only if the graphic (that actually is an instance of a class derived from Graphic, a Line, a Rectangle and so on, that is why the new) can be properly parse, it will be added to my data structure.
Since I only care about the order in which thes graphics are added to my list, I though to call the parse method asynchronously, such that the producer has the task of read each node from the file and add this graphic to the data structure, while the consumer has the task of parse each graphic whenever a new graphic is ready to be parsed.
Now I have several consumer threads (created in the main) and my code looks like the following:
queue<pair<Graphic*, size_t>> q;
mutex m;
atomic<size_t> n_elements;
void producer()
{
for (size_t i = 0; i < N_GRAPHICS; i++)
{
auto g = new Graphic();
graphics.emplace_back(g);
q.emplace(make_pair(g, i));
}
n_elements = graphics.size();
}
void consumer()
{
pair<Graphic*, size_t> item;
while (true)
{
{
std::unique_lock<std::mutex> lk(m);
if (n_elements == 0)
return;
n_elements--;
item = q.front();
q.pop();
}
if (!item.first->parse())
{
// here I should remove the item from the vector
assert(graphics[item.second].get() == item.first);
delete item.first;
graphics[item.second] = nullptr;
}
}
}
I run the producer first of all in my main, so that when the first consumer starts the queue is already completely full.
int main()
{
producer();
vector<thread> threads;
for (auto i = 0; i < N_THREADS; i++)
threads.emplace_back(consumer);
for (auto& t : threads)
t.join();
return 0;
}
The concurrent version seems to be at least twice as faster as the original one.
The full code has been uploaded here.
Now I am wondering:
Are there any (synchronization) errors in my code?
Is there a way to achieve the same result faster (or better)?
Also, I noticed that on my computer I get the best result (in terms of elapsed time) if I set the number of thread equals to 8. More (or less) threads give me worst results. Why?

Blockquote
There isn't synchronization errors, but I think that the memory managing could be better, since your code leaked if parse() throws an exception.
There isn't synchronization errors, but I think that your memory managing could be better, since you will have leaks if parse() throw an exception.
Blockquote
Is there a way to achieve the same result faster (or better)?
Probably. You could use a simple implementation of a thread pool and a lambda that do the parse() for you.
The code below illustrate this approach. I use the threadpool implementation
here
#include <iostream>
#include <stdexcept>
#include <vector>
#include <memory>
#include <chrono>
#include <utility>
#include <cassert>
#include <ThreadPool.h>
using namespace std;
using namespace std::chrono;
#define N_GRAPHICS (1000*1000*1)
#define N_THREADS 8
struct Graphic;
using GPtr = std::unique_ptr<Graphic>;
static vector<GPtr> graphics;
struct Graphic
{
Graphic()
: status(false)
{
}
bool parse()
{
// waste time
try
{
throw runtime_error("");
}
catch (runtime_error)
{
}
status = true;
//return false;
return true;
}
bool status;
};
int main()
{
auto start = system_clock::now();
auto producer_unit = []()-> GPtr {
std::unique_ptr<Graphic> g(new Graphic);
if(!g->parse()){
g.reset(); // if g don't parse, return nullptr
}
return g;
};
using ResultPool = std::vector<std::future<GPtr>>;
ResultPool results;
// ThreadPool pool(thread::hardware_concurrency());
ThreadPool pool(N_THREADS);
for(int i = 0; i <N_GRAPHICS; ++i){
// Running async task
results.emplace_back(pool.enqueue(producer_unit));
}
for(auto &t : results){
auto value = t.get();
if(value){
graphics.emplace_back(std::move(value));
}
}
auto duration = duration_cast<milliseconds>(system_clock::now() - start);
cout << "Elapsed: " << duration.count() << endl;
for (size_t i = 0; i < graphics.size(); i++)
{
if (!graphics[i]->status)
{
cerr << "Assertion failed! (" << i << ")" << endl;
break;
}
}
cin.get();
return 0;
}
It is a bit faster (1s) on my machine, more readable, and removes the necessity of shared datas (synchronization is evil, avoid it or hide it in a reliable and efficient way).

Array of Threads

I'm trying to create an array of threads and give each of them a function but not working
Reader *readers = new Reader[10];
thread *ts = new thread[10];
for (int i = 0; i<10; i++){
readers[i].setAll(q, "t" + i);
ts[i] = thread(readers[i].run); //Error: "cannot be referenced. its a deleted function"
}
Run function:
void Reader::run(){
log(this->source);
log.log("starting");
while (flag){
this->m = this->q.get(this->source);
log.log("retrieved " + m.toString());
if (m.getData().compare("stop") && m.getTarget().compare(this->source))
{
log.log("stopping");
this->setFlag(false);//stop the current thread if the message is "stop"
}
else{
if (!m.getTarget().compare(source)){//if the message isnt from the current thread then put the message back
string dest = m.getSource();
m.setSource(this->source);
log.log("putting back [" + dest + "->" + m.getTarget() + ":" + " " + m.getData() + "] as " + m.toString());
q.put(m);
log.log("done putting back " + m.toString());
}
}
}
}
I'm actually trying to do the following code:
thread t0(readers[0].run);
thread t1(readers[1].run);
etc...
but it also gives me the same error seen below:

If you're using c++11 you might want to take advantage of the nice memory management and binding features:
(edit: updated code in response to comments)
#include <iostream>
#include <thread>
#include <memory>
#include <vector>
using namespace std;
struct runner
{
void run() {
cout << "hello" << endl;
}
};
int main()
{
vector<unique_ptr<runner>> runners;
for(size_t i = 0 ; i < 10 ; ++i) {
runners.emplace_back(new runner);
}
vector<thread> threads;
for(const auto& r : runners) {
threads.emplace_back(&runner::run, r.get());
}
for(auto& t : threads) {
t.join();
}
return 0;
}

While readers[i].run looks like it should bind an object to a member function to make a callable object, sadly it doesn't. Instead, you need to pass a pointer to the member function and a pointer (or reference wrapper) to the object as separate arguments:
thread(&Reader::run, &readers[i]); // or std::ref(readers[i])
or you might find it nicer to wrap the function call in a lambda:
thread([=]{readers[i].run();})

C++ Syncing threads in most elegant way

I am try to solve the following problem, I know there are multiple solutions but I'm looking for the most elegant way (less code) to solve it.
I've 4 threads, 3 of them try to write a unique value (0,1,or 2) to a volatile integer variable in an infinite loop, the forth thread try to read the value of this variable and print the value to the stdout also in an infinite loop.
I'd like to sync between the thread so the thread that writes 0 will be run and then the "print" thread and then the thread that writes 1 and then again the print thread, an so on...
So that finally what I expect to see at the output of the "print" thread is a sequence of zeros and then sequence of 1 and then 2 and then 0 and so on...
What is the most elegant and easy way to sync between these threads.
This is the program code:
volatile int value;
int thid[4];
int main() {
HANDLE handle[4];
for (int ii=0;ii<4;ii++) {
thid[ii]=ii;
handle[ii] = (HANDLE) CreateThread( NULL, 0, (LPTHREAD_START_ROUTINE) ThreadProc, &thid[ii], 0, NULL);
}
return 0;
}
void WINAPI ThreadProc( LPVOID param ) {
int h=*((int*)param);
switch (h) {
case 3:
while(true) {
cout << value << endl;
}
break;
default:
while(true) {
// setting a unique value to the volatile variable
value=h;
}
break;
}
}

your problem can be solved with the producer consumer pattern.
I got inspired from Wikipedia so here is the link if you want some more details.
https://en.wikipedia.org/wiki/Producer%E2%80%93consumer_problem
I used a random number generator to generate the volatile variable but you can change that part.
Here is the code: it can be improved in terms of style (using C++11 for random numbers) but it produces what you expect.
#include <iostream>
#include <sstream>
#include <vector>
#include <stack>
#include <thread>
#include <mutex>
#include <atomic>
#include <condition_variable>
#include <chrono>
#include <stdlib.h> /* srand, rand */
using namespace std;
//random number generation
std::mutex mutRand;//mutex for random number generation (given that the random generator is not thread safe).
int GenerateNumber()
{
std::lock_guard<std::mutex> lk(mutRand);
return rand() % 3;
}
// print function for "thread safe" printing using a stringstream
void print(ostream& s) { cout << s.rdbuf(); cout.flush(); s.clear(); }
// Constants
//
const int num_producers = 3; //the three producers of random numbers
const int num_consumers = 1; //the only consumer
const int producer_delay_to_produce = 10; // in miliseconds
const int consumer_delay_to_consume = 30; // in miliseconds
const int consumer_max_wait_time = 200; // in miliseconds - max time that a consumer can wait for a product to be produced.
const int max_production = 1; // When producers has produced this quantity they will stop to produce
const int max_products = 1; // Maximum number of products that can be stored
//
// Variables
//
atomic<int> num_producers_working(0); // When there's no producer working the consumers will stop, and the program will stop.
stack<int> products; // The products stack, here we will store our products
mutex xmutex; // Our mutex, without this mutex our program will cry
condition_variable is_not_full; // to indicate that our stack is not full between the thread operations
condition_variable is_not_empty; // to indicate that our stack is not empty between the thread operations
//
// Functions
//
// Produce function, producer_id will produce a product
void produce(int producer_id)
{
while (true)
{
unique_lock<mutex> lock(xmutex);
int product;
is_not_full.wait(lock, [] { return products.size() != max_products; });
product = GenerateNumber();
products.push(product);
print(stringstream() << "Producer " << producer_id << " produced " << product << "\n");
is_not_empty.notify_all();
}
}
// Consume function, consumer_id will consume a product
void consume(int consumer_id)
{
while (true)
{
unique_lock<mutex> lock(xmutex);
int product;
if(is_not_empty.wait_for(lock, chrono::milliseconds(consumer_max_wait_time),
[] { return products.size() > 0; }))
{
product = products.top();
products.pop();
print(stringstream() << "Consumer " << consumer_id << " consumed " << product << "\n");
is_not_full.notify_all();
}
}
}
// Producer function, this is the body of a producer thread
void producer(int id)
{
++num_producers_working;
for(int i = 0; i < max_production; ++i)
{
produce(id);
this_thread::sleep_for(chrono::milliseconds(producer_delay_to_produce));
}
print(stringstream() << "Producer " << id << " has exited\n");
--num_producers_working;
}
// Consumer function, this is the body of a consumer thread
void consumer(int id)
{
// Wait until there is any producer working
while(num_producers_working == 0) this_thread::yield();
while(num_producers_working != 0 || products.size() > 0)
{
consume(id);
this_thread::sleep_for(chrono::milliseconds(consumer_delay_to_consume));
}
print(stringstream() << "Consumer " << id << " has exited\n");
}
//
// Main
//
int main()
{
vector<thread> producers_and_consumers;
// Create producers
for(int i = 0; i < num_producers; ++i)
producers_and_consumers.push_back(thread(producer, i));
// Create consumers
for(int i = 0; i < num_consumers; ++i)
producers_and_consumers.push_back(thread(consumer, i));
// Wait for consumers and producers to finish
for(auto& t : producers_and_consumers)
t.join();
return 0;
}
Hope that helps, tell me if you need more info or if you disagree with something :-)
And Good Bastille Day to all French people!

If you want to synchronise the threads, then using a sync object to hold each of the threads in a "ping-pong" or "tick-tock" pattern.
In C++ 11 you can use condition variables, the example here shows something similar to what you are asking for.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Displaying results as soon as they are ready with std::async - c++

Related

Let main thread wait async threads complete

boost::thread resource temporarily not available

How to apply a concurrent solution to a Producer-Consumer like situation

Array of Threads

C++ Syncing threads in most elegant way

Categories

Resources