Adding a series of numbers [1-->5000] with threads in C++?

I want to add a series of numbers [1-->5000] with threads. But the result is not correct.
The goal is only to understand threading well, because I am a beginner.
I tried this:
void thread_function(int i, int (*S))
{
    (*S) = (*S) + i;
}
int main()
{
    std::vector<std::thread> vecto_Array;
    int i = 0, Som = 0;
    for(i = 1; i <= 5000; i++)
    {
        vecto_Array.emplace_back([&](){ thread_function(i, &Som); });
    }
    for(auto& t: vecto_Array)
    {
        t.join();
    }
    std::cout << Som << std::endl;
}
And I tried this:
int thread_function(int i)
{
    return i;
}
int main()
{
    std::vector<std::thread> vecto_Array;
    int i = 0, Som = 0;
    for(i = 1; i <= 5000; i++)
    {
        vecto_Array.emplace_back([&](){ Som = Som + thread_function(i); });
    }
    for(auto& t: vecto_Array)
    {
        t.join();
    }
    std::cout << Som << std::endl;
}
The result is always wrong. Why?
I solved the problem as follows:
void thread_function(int (*i), int (*S))
{
    (*S) = (*S) + (*i);
    (*i)++;
}
int main()
{
    std::vector<std::thread> vecto_Array;
    int i = 0, j = 0, Som = 0;
    for(i = 1; i <= 5000; i++)
    {
        vecto_Array.emplace_back([&](){ thread_function(&j, &Som); });
    }
    for(auto& t: vecto_Array)
    {
        t.join();
    }
    std::cout << Som << std::endl;
}
But can anyone explain to me why it did not work when I used the loop variable i?

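Your attempt #1 has a race condition. See What is a race condition? The lambda captures i by reference, so each thread reads i while the loop in main is still changing it, and all threads update Som without any synchronization.
Your attempt #2 has the same race condition: i is still captured by reference and Som is still updated without synchronization. Also keep in mind what the standard says about the thread function:
Any return value from the function is ignored.
(see: https://en.cppreference.com/w/cpp/thread/thread/thread )
Your attempt #3 has a race condition as well: every thread reads and writes both j and Som without synchronization, so getting the right result is a matter of luck.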
Concurrent programming is an advanced topic. What you need is a book or tutorial. I first learned it from Bartosz Milewski's course: https://www.youtube.com/watch?v=80ifzK3b8QQ&list=PL1835A90FC78FF8BE&index=1
but be warned that it will likely take years before you become comfortable with concurrency. I am still not. As a beginner, what you probably need is std::async (see Milewski's tutorial or search the web). OpenMP ( https://en.wikipedia.org/wiki/OpenMP ), which could be called "parallelization for the masses", has an even gentler learning curve.

Related

Let main thread wait async threads complete

I'm new to C++ and don't know how to make the main thread wait until all async threads are done. I referred to this, but it makes void consume() not parallel.
#include <cstdio> // printf
#include <iostream>
#include <vector>
#include <unistd.h> // sleep
#include <future>
#include <mutex>
#include <thread>
using namespace std;
class Myclass {
private:
std::vector<int> resources;
std::vector<int> res;
std::mutex resMutex;
std::vector<std::future<void>> m_futures;
public:
Myclass() {
for (int i = 0; i < 10; i++) resources.push_back(i); // add task
res.reserve(resources.size());
}
void consume() {
for (int i = 0; i < resources.size(); i++) {
m_futures.push_back(std::async(std::launch::async, &Myclass::work, this, resources[i]));
// m_futures.back().wait();
}
}
void work(int x) {
sleep(1); // Simulation time-consuming
std::lock_guard<std::mutex> lock(resMutex);
res.push_back(x);
printf("%d be added.---done by %d.\n", x, std::this_thread::get_id());
}
std::vector<int> &getRes() { return res;}
};
int main() {
Myclass obj;
obj.consume();
auto res = obj.getRes();
cout << "Done. res.size = " << res.size() << endl;
for (int i : res) cout << i << " ";
cout <<"main thread over\n";
}
The main thread finishes while res is still empty. I want obj.getRes() to be executed only after all results have been added to res.
Done. res.size = 0
main thread over
4 be added.---done by 6.
9 be added.---done by 11...
You had the right idea with the commented-out line m_futures.back().wait(); you just have it in the wrong place.
As you note, launching a std::async and then waiting for its result right away forces the entire thing to execute serially and makes the async pointless.
Instead you want two functions: one, like your consume(), that launches all the asyncs, and another that loops over the futures and calls wait (or get, whatever suits your needs) on them. Then call the second one from main.
This lets them all run in parallel, while still making main wait for the final result.
In addition to @Frodyne's answer:
The work() calls run in parallel, and the main thread waits until all of them have finished:
void set_wait()
{
    for (auto& f : m_futures) {
        f.wait();
    }
}
And call it here:
void consume() {
    for (int i = 0; i < resources.size(); i++) {
        m_futures.push_back(std::async(std::launch::async, &Myclass::work, this, resources[i]));
        // Calling wait() here makes no sense
    }
    set_wait(); // Waits until all tasks have finished
}
I created a new function for convenience.
You can call std::future::wait on each future after you have added all the tasks to m_futures. Example:
void consume() {
    for (int i = 0; i < resources.size(); i++) {
        m_futures.push_back(std::async(std::launch::async, &Myclass::work, this, resources[i]));
        //m_futures.back().wait();
    }
    for(auto& f: m_futures) f.wait();
}

Displaying results as soon as they are ready with std::async

I'm trying to discover asynchronous programming in C++. Here's a toy example I've been using:
#include <iostream>
#include <future>
#include <vector>
#include <chrono>
#include <thread>
#include <random>
// For simplicity
using namespace std;
int called_from_async(int m, int n)
{
this_thread::sleep_for(chrono::milliseconds(rand() % 1000));
return m * n;
}
void test()
{
int m = 12;
int n = 42;
vector<future<int>> results;
for(int i = 0; i < 10; i++)
{
for(int j = 0; j < 10; j++)
{
results.push_back(async(launch::async, called_from_async, i, j));
}
}
for(auto& f : results)
{
cout << f.get() << endl;
}
}
Now, the example is not really interesting, but it raises a question that is, to me, interesting. Let's say I want to display results as they "arrive" (I don't know what will be ready first, since the delay is random), how should I do it?
What I'm doing here is obviously wrong, since I wait for all the tasks in the order in which I created them - so I'll wait for the first to finish even if it's longer than the others.
I thought about the following idea: for each future, call wait_for with a short timeout and, if it is ready, display the value. But it feels weird to do that:
while (any_of(results.begin(), results.end(), [](const future<int>& f){
return f.wait_for(chrono::seconds(0)) != future_status::ready;
}))
{
cout << "Loop" << endl;
for(auto& f : results)
{
auto result = f.wait_for(std::chrono::milliseconds(20));
if (result == future_status::ready)
cout << f.get() << endl;
}
}
This brings another issue: we'd call get several times on some futures, which is illegal:
terminate called after throwing an instance of 'std::future_error' what(): std::future_error: No associated state
So I don't really know what to do here, please suggest!
Use valid() to skip the futures for which you have already called get().
bool all_ready;
do {
    all_ready = true;
    for(auto& f : results) {
        if (f.valid()) {
            auto result = f.wait_for(std::chrono::milliseconds(20));
            if (result == future_status::ready) {
                cout << f.get() << endl;
            }
            else {
                all_ready = false;
            }
        }
    }
}
while (!all_ready);

c++ threading, duplicate/missing threads

I'm trying to write a program that concurrently adds and removes items from a "storehouse". I have a "Monitor" class that handles the "storehouse" operations:
class Monitor
{
private:
    mutex m;
    condition_variable cv;
    vector<Storage> S;
    int counter = 0;
    bool busy = false;
public:
    void add(Computer c, int index) {
        unique_lock <mutex> lock(m);
        if (busy)
            cout << "Thread " << index << ": waiting for !busy " << endl;
        cv.wait(lock, [&] { return !busy; });
        busy = true;
        cout << "Thread " << index << ": Request: add " << c.CPUFrequency << endl;
        for (int i = 0; i < counter; i++) {
            if (S[i].f == c.CPUFrequency) {
                S[i].n++;
                busy = false; cv.notify_one();
                return;
            }
        }
        Storage s;
        s.f = c.CPUFrequency;
        s.n = 1;
        // put the new item in a sorted position
        S.push_back(s);
        counter++;
        busy = false; cv.notify_one();
    }
};
The threads are created like this:
void doThreadStuff(vector<Computer> P, vector <Storage> R, Monitor &S)
{
int Pcount = P.size();
vector<thread> myThreads;
myThreads.reserve(Pcount);
for (atomic<size_t> i = 0; i < Pcount; i++)
{
int index = i;
Computer c = P[index];
myThreads.emplace_back([&] { S.add(c, index); });
}
for (size_t i = 0; i < Pcount; i++)
{
myThreads[i].join();
}
// printing results
}
Running the program produced the following results:
I'm familiar with race conditions, but this doesn't look like one to me. My bet would be on something reference related, because in the results we can see that for every "missing thread" (threads 1, 3, 10, 25) I get "duplicate threads" (threads 2, 9, 24, 28).
I have tried to create local variables in functions and loops but it changed nothing.
I have heard about threads sharing memory regions, but my previous work should have produced similar results, so I don't think that's the case here, but feel free to prove me wrong.
I'm using Visual Studio 2017
Here you capture local variables by reference in a loop. They are destroyed at the end of every iteration, so the threads are left with dangling references, which causes undefined behavior:
for (atomic<size_t> i = 0; i < Pcount; i++)
{
    int index = i;
    Computer c = P[index];
    myThreads.emplace_back([&] { S.add(c, index); });
}
You should capture index and c by value:
myThreads.emplace_back([&S, index, c] { S.add(c, index); });
Another approach would be to pass S, index and c as arguments instead of capturing them, by defining the following non-capturing lambda, th_func:
auto th_func = [](Monitor &S, int index, Computer c){ S.add(c, index); };
This way you have to explicitly wrap the arguments that must be passed by reference to the thread's callable object with std::reference_wrapper by means of the function template std::ref(). In your case, only S:
for (atomic<size_t> i = 0; i < Pcount; i++) {
    int index = i;
    Computer c = P[index];
    myThreads.emplace_back(th_func, std::ref(S), index, c);
}
Failing to wrap with std::reference_wrapper the arguments that must be passed by reference will result in a compile-time error. That is, the following won't compile:
myThreads.emplace_back(th_func, S, index, c); // <-- it should be std::ref(S)
See also this question.

boost::thread resource temporarily not available

I have a very similar problem to this. Unfortunately, I am not allowed to comment on it so please excuse me for opening up another topic for this. My code is running a two-stage calculation iteratively which in principle looks like this:
while(!finishing_condition_met)
{
boost::thread_group executionGrp1;
for(int w = 0; w < numThreads; w++)
{
boost::thread * curThread = new boost::thread(&Class::operation1, this, boost::ref(argument1), ...);
executionGrp1.add_thread(curThread);
}
executionGrp1.join_all();
boost::thread_group executionGrp2;
for(int w = 0; w < numThreads; w++)
{
boost::thread * curThread = new boost::thread(&Class::operation2, this, boost::ref(argument1), ...);
executionGrp2.add_thread(curThread);
}
executionGrp2.join_all();
update_finished_criterion();
}
Since numThreads is significantly smaller than what the kernel would allow (it is set to the hardware concurrency, which is 56 on the current machine), I was surprised to see this error. Does join_all() not take care of the finished threads?
The thread_pool-approach suggested in the other post seems interesting but I am not exactly sure how to adapt it such that I can rerun everything within the loop multiple times while still waiting for the first stage to finish before starting the second stage.
Any suggestions are welcome! Thanks in advance.
EDIT: This is how I can cause this error in a minimalistic fashion. AFAIK, this is the standard way to implement parallel sections. Am I missing something?
#include "boost/thread.hpp"
#include "boost/chrono.hpp"
#include <iostream>
#include <algorithm>
#include <ctime>
using namespace std;
int numThreads = boost::thread::hardware_concurrency();
void wait(int seconds) {
boost::this_thread::sleep_for(boost::chrono::milliseconds(seconds));
return;
}
int subthread(int i) {
wait(i/numThreads);
return 1;
}
void threads(int nT) {
boost::thread_group exeGrp;
for (int i=0;i<nT;i++) {
boost::thread * curThread = new boost::thread(&subthread, i);
exeGrp.add_thread(curThread);
}
exeGrp.join_all();
}
int main() {
for (int a=0;a<numThreads;a++) {
cout << "Starting " << numThreads << " threads [" << a << "/" << numThreads << "]" << endl;
threads(numThreads);
}
cout << "done" << endl;
}
Output when running code

(C++/Qt) thread runs too much

In the example program below, my intention was to print the same number five times per second, but at runtime the numbers are printed too many times, sometimes around 10 or so. What is the problem in my code?
QTextStream qout(stdout);
class my_thread : public QThread
{
public:
int n;
my_thread()
{
n = 0;
}
void run()
{
while(n < 10)
{
qout << n++ << endl;
sleep(1);
}
}
};
int main()
{
enum { N_THREADS = 5 }; // N_THREADS = 2 with no problem...
std::array<my_thread, N_THREADS> thread_array;
for (std::array<my_thread, N_THREADS>::iterator it = thread_array.begin(); it != thread_array.end(); ++it)
{
it->start();
}
for (std::array<my_thread, N_THREADS>::iterator it = thread_array.begin(); it != thread_array.end(); ++it)
{
it->wait();
}
return 0;
}
EDIT
I found some interesting behaviour: my program runs just as expected when N_THREADS is 1 or 2, and it starts misbehaving as soon as N_THREADS is 3 or more. Why is this so?