Parallel programming for tasks in main function - c++

Is it possible to define two tasks and let them run in parallel in C++? I found something about parallel functions, but nothing about parallel tasks inside the main function, like this:
#include <iostream>

int main()
{
    // task 1
    int a = 0;
    for (int i = 0; i < 150; i++) {
        a++;
        std::cout << a << std::endl;
        // do more stuff
    }
    // task 2
    int b = 0;
    for (int i = 0; i < 150; i++) {
        b++;
        std::cout << b << std::endl;
        // do more stuff
    }
}
Race conditions etc. can't occur.
Thanks for helping!

Yes, you can run the two in parallel, using std::async and lambdas. Here is an example:
#include <future>
#include <iostream>

int main()
{
    auto f1 = std::async(std::launch::async, [](){
        // task 1
        int a = 0;
        for (int i = 0; i < 10; i++) {
            a++;
            std::cout << a << std::endl;
            // do more stuff
        }
    });
    auto f2 = std::async(std::launch::async, [](){
        // task 2
        int b = 100;
        for (int i = 0; i < 10; i++) {
            b++;
            std::cout << b << std::endl;
            // do more stuff
        }
    });
    f1.wait();
    f2.wait();
}
(The console output from this will be interleaved, because both tasks write to std::cout at the same time. To keep it readable, guard access to the console with a mutex or a similar mechanism.)

Related

Multiple Nested For-loops without knowing the number of for-loops

How would I write n nested for loops without knowing what n is?
For example, how would I write this using recursion or another method:
for (int i = 0; cond1; i++){
for (int j = 0; cond2; j++){
for (int k = 0; cond3; k++){
...
for (int l = 0; cond_N; l++){
if (.....) break;
}
}
}
}
Here, there are n loops with some condition (not necessarily the same condition for each variable) and I'm not sure how to transform this into code using recursion without knowing what n is. Thanks!

Is this what you're trying to do? cond provides the N different conditions (i.e. a bool function that depends on the loop variable) and foo introduces the recursion:
#include <iostream>
bool cond(int cond_which, int current_loop_variable) {
switch (cond_which) {
case 0:
return current_loop_variable < 5;
case 1:
return current_loop_variable < 3;
/* more... can be hella complicated, related with states, whatever */
default:
return false;
}
}
void foo(int we_may_call_it_the_meta_loop_variable) {
for (int i = 0; cond(we_may_call_it_the_meta_loop_variable, i); ++i) {
foo(we_may_call_it_the_meta_loop_variable + 1);
std::cout << "in loop " << we_may_call_it_the_meta_loop_variable << ", i = " << i << std::endl;
}
}
int main() {
foo(0);
return 0;
}
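This recursion terminates: once the meta loop variable reaches a level that has no case in cond, the default branch returns false, so that level's loop body never runs and foo returns immediately.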

gcov not working with pthreads, WSL-2, Clion

Background:
This seems to work fine if I don't use any threads or just spawn 1 thread, which makes this all the more confusing.
Clion Project here
Problem:
I set up a basic example project that starts 2 threads and does some printing to the console from main thread, thread 2, and thread 3.
#include <iostream>
#include <thread>
void thread1()
{
for(int i = 0; i < 10000; i++)
{
std::cout << "thread1" << std::endl;
}
}
void thread2()
{
for(int i = 0; i < 10000; i++)
{
std::cout << "thread2" << std::endl;
}
}
int main()
{
std::cout << "Hello, World!" << std::endl;
std::thread threadObj(thread1);
std::thread threadObj2(thread2);
for(int i = 0; i < 10000; i++)
{
std::cout<<"MainThread"<<std::endl;
}
threadObj.join();
std::cout<<"Exit of Main function"<<std::endl;
return 0;
}
Compiling using:
--coverage -pthread -g -std=gnu++2a
When I run in CLion using "Run 'EvalTest' with Coverage", I get the following error:
Could not find code coverage data
So it's not producing the gcov files needed, but it works fine if I comment out the following line of code:
int main()
{
std::cout << "Hello, World!" << std::endl;
std::thread threadObj(thread1);
// std::thread threadObj2(thread2);
for(int i = 0; i < 10000; i++)
{
std::cout<<"MainThread"<<std::endl;
}
threadObj.join();
std::cout<<"Exit of Main function"<<std::endl;
return 0;
}

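Both threads need to be joined: call threadObj.join() and threadObj2.join(). If a std::thread is still joinable when it is destroyed, its destructor calls std::terminate, so the program aborts at the end of main, most likely before the coverage data is written out. The code then looks like: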
int main()
{
std::cout << "Hello, World!" << std::endl;
std::thread threadObj(thread1);
std::thread threadObj2(thread2);
for(int i = 0; i < 10000; i++)
{
std::cout<<"MainThread"<<std::endl;
}
threadObj.join();
threadObj2.join(); // both threads must be joined for gcov to work properly
std::cout<<"Exit of Main function"<<std::endl;
return 0;
}

How do I correctly use std::mutex in C++ without deadlocks and/or races?

I am trying to debug a program that I am trying to run in parallel. I am at a loss for why I have both deadlocks and race conditions when I attempt to compile and run the code in C++. Here is all the relevant code that I have written thus far.
// define job struct here
// define mutex, condition variable, deque, and atomic here
std::deque<job> jobList;
std::mutex jobMutex;
std::condition_variable jobCondition;
std::atomic<int> numberThreadsRunning;
void addJobs(...insert parameters here...)
{
job current = {...insert parameters here...};
jobMutex.lock();
std::cout << "We have successfully acquired the mutex." << std::endl;
jobList.push_back(current);
jobCondition.notify_one();
jobMutex.unlock();
std::cout << "We have successfully unlocked the mutex." << std::endl;
}
void work(void) {
job* current;
numberThreadsRunning++;
while (true) {
std::unique_lock<std::mutex> lock(jobMutex);
if (jobList.empty()) {
numberThreadsRunning--;
jobCondition.wait(lock);
numberThreadsRunning++;
}
current = &jobList.at(0);
jobList.pop_front();
jobMutex.unlock();
std::cout << "We are now going to start a job." << std::endl;
////Call an expensive function for the current job that we want to run in parallel.
////This could either complete the job, or spawn more jobs, by calling addJobs.
////This recursive behavior typically results in there being thousands of jobs.
std::cout << "We have successfully completed a job." << std::endl;
}
numberThreadsRunning--;
std::cout << "There are now " << numberThreadsRunning << " threads running." << std::endl;
}
int main( int argc, char *argv[] ) {
//Initialize everything and add first job to the deque.
std::thread jobThreads[n];
for (int i = 0; i < n; i++) {
jobThreads[i] = std::thread(work);
}
for (int i = 0; i < n; i++) {
jobThreads[i].join();
}
}
The code compiles, but depending on random factors, it will either deadlock at the very end or have a segmentation fault in the middle while the queue is still quite large. Does anyone know more about why this is happening?
...
EDIT:
I have edited this question to include additional information and a more complete example. While I certainly don't want to bore you with the thousands of lines of code I actually have (an image rendering package), I believe this example better represents the type of problem I am facing. The example given in the answer by Alan Birtles only works on very simple job structure with very simple functionality. In the actual job struct, there are multiple pointers to different vectors and matrices, and therefore we need pointers to the job struct, otherwise the compiler would fail to compile because the constructor function was "implicitly deleted".
I believe the error I am facing has to do with the way I am locking and unlocking the threads. I know that the pointers are also causing some issues, but they probably have to stay. The function thisFunction() represents the function that needs to be run in parallel.
#include <queue>
#include <deque>
#include <vector>
#include <thread>
#include <chrono>
#include <mutex>
#include <condition_variable>
#include <atomic>
#include <iostream>
#include <cstdlib>
#include <cmath>
struct job {
std::vector<std::vector<int>> &matrix;
int num;
};
bool closed = false;
std::deque<job> jobList;
std::mutex jobMutex;
std::condition_variable jobCondition;
std::atomic<int> numberThreadsRunning;
std::atomic<int> numJobs;
struct tcout
{
tcout() :lock(mutex) {}
template < typename T >
tcout& operator<< (T&& t)
{
std::cout << t;
return *this;
}
static std::mutex mutex;
std::unique_lock< std::mutex > lock;
};
std::mutex tcout::mutex;
std::vector<std::vector<int>> multiply4x4(
std::vector<std::vector<int>> &A,
std::vector<std::vector<int>> &B) {
//Only deals with 4x4 matrices
std::vector<std::vector<int>> C(4, std::vector<int>(4, 0));
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
for (int k = 0; k < 4; k++) {
C.at(i).at(j) = C.at(i).at(j) + A.at(i).at(k) * B.at(k).at(j);
}
}
}
return C;
}
void addJobs()
{
numJobs++;
std::vector<std::vector<int>> matrix(4, std::vector<int>(4, -1)); //Create random 4x4 matrix
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
matrix.at(i).at(j) = rand() % 10 + 1;
}
}
job current = { matrix, numJobs };
std::unique_lock<std::mutex> lock(jobMutex);
std::cout << "The matrix for job " << current.num << " is: \n";
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
std::cout << matrix.at(i).at(j) << "\t";
}
std::cout << "\n";
}
jobList.push_back(current);
jobCondition.notify_one();
lock.unlock();
}
void thisFunction(std::vector<std::vector<int>> &matrix, int num)
{
std::this_thread::sleep_for(std::chrono::milliseconds(rand() * 500 / RAND_MAX));
std::vector<std::vector<int>> product = matrix;
std::unique_lock<std::mutex> lk(jobMutex);
std::cout << "The imported matrix for job " << num << " is: \n";
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
std::cout << product.at(i).at(j) << "\t";
}
std::cout << "\n";
}
lk.unlock();
int power;
if (num % 2 == 1) {
power = 3;
} else if (num % 2 == 0) {
power = 2;
addJobs();
}
for (int k = 1; k < power; k++) {
product = multiply4x4(product, matrix);
}
std::unique_lock<std::mutex> lock(jobMutex);
std::cout << "The matrix for job " << num << " to the power of " << power << " is: \n";
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
std::cout << product.at(i).at(j) << "\t";
}
std::cout << "\n";
}
lock.unlock();
}
void work(void) {
job *current;
numberThreadsRunning++;
while (true) {
std::unique_lock<std::mutex> lock(jobMutex);
if (jobList.empty()) {
numberThreadsRunning--;
jobCondition.wait(lock, [] {return !jobList.empty() || closed; });
numberThreadsRunning++;
}
if (jobList.empty())
{
break;
}
current = &jobList.front();
job newcurrent = {current->matrix, current->num};
current = &newcurrent;
jobList.pop_front();
lock.unlock();
thisFunction(current->matrix, current->num);
tcout() << "job " << current->num << " complete\n";
}
numberThreadsRunning--;
}
int main(int argc, char *argv[]) {
const size_t n = 1;
numJobs = 0;
std::thread jobThreads[n];
std::vector<int> buffer;
for (int i = 0; i < n; i++) {
jobThreads[i] = std::thread(work);
}
for (int i = 0; i < 100; i++)
{
addJobs();
}
{
std::unique_lock<std::mutex> lock(jobMutex);
closed = true;
jobCondition.notify_all();
}
for (int i = 0; i < n; i++) {
jobThreads[i].join();
}
}

Here is a fully working example:
#include <deque>
#include <thread>
#include <chrono>
#include <cstdlib>
#include <mutex>
#include <condition_variable>
#include <atomic>
#include <iostream>
struct job { int num; };
bool closed = false;
std::deque<job> jobList;
std::mutex jobMutex;
std::condition_variable jobCondition;
std::atomic<int> numberThreadsRunning;
struct tcout
{
tcout() :lock(mutex) {}
template < typename T >
tcout& operator<< (T&& t)
{
std::cout << t;
return *this;
}
static std::mutex mutex;
std::unique_lock< std::mutex > lock;
};
std::mutex tcout::mutex;
void addJobs()
{
static int num = 0;
job current = { num++ };
std::unique_lock<std::mutex> lock(jobMutex);
jobList.push_back(current);
jobCondition.notify_one();
lock.unlock();
}
void work(void) {
job current;
numberThreadsRunning++;
while (true) {
std::unique_lock<std::mutex> lock(jobMutex);
if (jobList.empty()) {
numberThreadsRunning--;
jobCondition.wait(lock, [] {return !jobList.empty() || closed; });
numberThreadsRunning++;
}
if (jobList.empty())
{
break;
}
current = jobList.front();
jobList.pop_front();
lock.unlock();
std::this_thread::sleep_for(std::chrono::milliseconds(rand() % 500)); // rand() * 500 can overflow int
tcout() << "job " << current.num << " complete\n";
}
numberThreadsRunning--;
}
int main(int argc, char *argv[]) {
const size_t n = 4;
std::thread jobThreads[n];
for (int i = 0; i < n; i++) {
jobThreads[i] = std::thread(work);
}
for (int i = 0; i < 100; i++)
{
addJobs();
}
{
std::unique_lock<std::mutex> lock(jobMutex);
closed = true;
jobCondition.notify_all();
}
for (int i = 0; i < n; i++) {
jobThreads[i].join();
}
}
I've made the following changes:
Never call lock() or unlock() directly on a std::mutex; always use std::unique_lock (or a similar RAII class). You were calling jobMutex.unlock() in work() on a mutex you had locked with std::unique_lock; the std::unique_lock then unlocked it a second time in its destructor, which is undefined behaviour. In addJobs you weren't using std::unique_lock at all, so if an exception had been thrown the mutex would have remained locked.
You need to pass a predicate to jobCondition.wait, otherwise a spurious wakeup could cause the wait to return while jobList is still empty.
I've added a closed variable so that the program exits when there's no more work to do.
I've added a definition of job.
In work() you take a pointer to an item on the queue and then pop it off the queue; as the item no longer exists, the pointer is dangling. You need to copy the item before popping the queue. If you want to avoid the copy, either make your job structure movable or change your queue to store std::unique_ptr<job> or std::shared_ptr<job>.
I've also added a thread-safe version of std::cout. This isn't strictly necessary, but it stops your output lines overlapping each other. Ideally you should use a proper thread-safe logging library instead, as locking a mutex for every print is expensive and, if you have enough prints, will make your program practically single-threaded.

Replace job* current; with job current; and current = &jobList.at(0); with current = jobList.at(0);. Otherwise you end up with a pointer to an element of jobList that no longer exists after jobList.pop_front().
Replace if (jobList.empty()) with while (jobList.empty()) to handle spurious wakeups.

C++ : Passing threadID to function anomaly

I implemented a concurrent queue with two methods: add (enqueue) & remove (dequeue).
To test my implementation using 2 threads, I generated 10 (NUMBER_OF_OPERATIONS) random numbers between 0 and 1 in a method called getRandom(). This allows me to create different distribution of add and remove operations.
The doWork method splits up the work done by the number of threads.
PROBLEM: The threadID that I am passing in from the main function does not match the threadID that the doWork method receives. Here are some sample runs:
(Output 1 and Output 2 are not reproduced here.)
#define NUMBER_OF_THREADS 2
#define NUMBER_OF_OPERATIONS 10
int main () {
BoundedQueue<int> bQ;
std::vector<double> temp = getRandom();
double* randomNumbers = &temp[0];
std::thread myThreads[NUMBER_OF_THREADS];
for(int i = 0; i < NUMBER_OF_THREADS; i++) {
cout << "Thread " << i << " created.\n";
myThreads[i] = std::thread ( [&] { bQ.doWork(randomNumbers, i); });
}
cout << "Main Thread\n";
for(int i = 0; i < NUMBER_OF_THREADS; i++) {
if(myThreads[i].joinable()) myThreads[i].join();
}
return 0;
}
template <class T> void BoundedQueue<T>::doWork (double randomNumbers[], int threadID) {
cout << "Thread ID is " << threadID << "\n";
srand(time(NULL));
int split = NUMBER_OF_OPERATIONS / NUMBER_OF_THREADS;
for (int i = threadID * split; i < (threadID * split) + split; i++) {
if(randomNumbers[i] <= 0.5) {
int numToAdd = rand() % 10 + 1;
add(numToAdd);
}
else {
int numRemoved = remove();
}
}
}

In this line you're capturing i by reference:
myThreads[i] = std::thread ( [&] { bQ.doWork(randomNumbers, i); });
This means that when the other thread runs the lambda, it'll get the latest value of i, not the value when it was created. Capture it by value instead:
myThreads[i] = std::thread ( [&, i] { bQ.doWork(randomNumbers, i); });
What's worse, because i is read and written from different threads without synchronization, your current code has undefined behaviour. On top of that, i may already have gone out of scope on the main thread before the other thread reads it. The fix above resolves all of these issues.

Threads: Fine with 10, Crash with 10000

Can someone please explain why the following code crashes:
int a(int x)
{
int s = 0;
for(int i = 0; i < 100; i++)
s += i;
return s;
}
int main()
{
unsigned int thread_no = 10000;
vector<thread> t(thread_no);
for(int i = 0; i < 10; i++)
t[i] = std::thread(a, 10);
for(thread& t_now : t)
t_now.join();
cout << "OK" << endl;
cin.get();
}
But WORKS with 10 threads? I am new to multithreading and simply don't understand what is happening?!

This creates a vector of 10,000 default-constructed threads, none of which has been started:
unsigned int thread_no = 10000;
vector<thread> t(thread_no);
You're running into the difference between "capacity" and "size". You didn't just create a vector large enough to house 10,000 threads, you created a vector of 10,000 threads.
See the following (http://ideone.com/i7LBQ6)
#include <iostream>
#include <vector>
struct Foo {
Foo() { std::cout << "Foo()\n"; }
};
int main() {
std::vector<Foo> f(8);
std::cout << "f.capacity() = " << f.capacity() << ", size() = " << f.size() << '\n';
}
You only initialize 10 of the elements as running threads:
for(int i = 0; i < 10; i++)
t[i] = std::thread(a, 10);
So your for loop is going to see 10 initialized threads and then 9,990 un-started threads.
for(thread& t_now : t)
t_now.join();
You might want to try using t.reserve(thread_no); and t.emplace_back(a, 10); instead.
Here's a complete example, with the names changed for clarity.
#include <iostream>
#include <thread>
#include <vector>

int threadFn(int iterations)
{
    int s = 0;
    for (int i = 0; i < iterations; i++)
        s += i;
    return s;
}

int main()
{
    enum {
        MaximumThreadCapacity = 10000,
        DesiredInitialThreads = 10,
        ThreadLoopIterations = 100,
    };
    std::vector<std::thread> threads;
    threads.reserve(MaximumThreadCapacity);
    for (int i = 0; i < DesiredInitialThreads; i++)
        threads.emplace_back(threadFn, ThreadLoopIterations);
    std::cout << threads.size() << " threads spun up\n";
    for (auto& t : threads) {
        t.join();
    }
    std::cout << "threads joined\n";
}
---- EDIT ----
Specifically, the crash you are getting is the attempt to join a non-running thread, http://ideone.com/OuLMyQ
#include <thread>
int main() {
std::thread t;
t.join();
return 0;
}
stderr
terminate called after throwing an instance of 'std::system_error'
what(): Invalid argument
I point this out because you should be aware there is a race condition even with a valid thread, if you do
if (t.joinable())
t.join();
it's possible for 't' to become non-joinable between the test and the action, for instance if another thread joins or detaches it in the meantime. You should put t.join() in a try {} block. See http://en.cppreference.com/w/cpp/thread/thread/join
Complete example:
#include <iostream>
#include <system_error>
#include <thread>
#include <vector>

int threadFn(int iterations)
{
    int s = 0;
    for (int i = 0; i < iterations; i++)
        s += i;
    return s;
}

int main()
{
    enum {
        MaximumThreadCapacity = 10000,
        DesiredInitialThreads = 10,
        ThreadLoopIterations = 100,
    };
    std::vector<std::thread> threads;
    threads.reserve(MaximumThreadCapacity);
    for (int i = 0; i < DesiredInitialThreads; i++)
        threads.emplace_back(threadFn, ThreadLoopIterations);
    std::cout << threads.size() << " threads spun up\n";
    for (auto& t : threads) {
        try {
            if (t.joinable())
                t.join();
        } catch (const std::system_error& e) {
            // std::error_code is not an integer, so it cannot be used in a
            // switch; compare it against std::errc values instead.
            if (e.code() == std::errc::invalid_argument ||
                e.code() == std::errc::no_such_process) {
                continue;  // the thread was not joinable after all; skip it
            }
            if (e.code() == std::errc::resource_deadlock_would_occur) {
                std::cerr << "deadlock during join - wth!\n";
            } else {
                std::cout << "error during join: " << e.what() << '\n';
            }
            return e.code().value();
        }
    }
    std::cout << "threads joined\n";
}

You create a vector that has 10000 elements in it, then populate only the first ten, and then wait for all the threads in the vector to join. Your program crashes because the other 9990 were never started.
for(int i = 0; i < 10; i++) // Wrong
for(int i = 0; i < thread_no; i++) // Correct
