Threads: Fine with 10, Crash with 10000 - c++

Can someone please explain why the following code crashes:
int a(int x)
{
int s = 0;
for(int i = 0; i < 100; i++)
s += i;
return s;
}
int main()
{
unsigned int thread_no = 10000;
vector<thread> t(thread_no);
for(int i = 0; i < 10; i++)
t[i] = std::thread(a, 10);
for(thread& t_now : t)
t_now.join();
cout << "OK" << endl;
cin.get();
}
But WORKS with 10 threads? I am new to multithreading and simply don't understand what is happening?!

This creates a vector of 10,000 default-initialized threads:
unsigned int thread_no = 10000;
vector<thread> t(thread_no);
You're running into the difference between "capacity" and "size". You didn't just create a vector large enough to house 10,000 threads, you created a vector of 10,000 threads.
See the following (http://ideone.com/i7LBQ6)
#include <iostream>
#include <vector>
struct Foo {
Foo() { std::cout << "Foo()\n"; }
};
int main() {
std::vector<Foo> f(8);
std::cout << "f.capacity() = " << f.capacity() << ", size() = " << f.size() << '\n';
}
You only initialize 10 of the elements as running threads
for(int i = 0; i < 10; i++)
t[i] = std::thread(a, 10);
So your for loop is going to see 10 initialized threads and then 9,990 un-started threads.
for(thread& t_now : t)
t_now.join();
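To see why the loop above blows up, it helps to check which elements are actually joinable. The following is only a sketch: `countJoinable` and `startSomeAndJoin` are hypothetical helper names introduced here, not part of the original code. It relies on the fact that a default-constructed std::thread does not represent a thread of execution, so joinable() returns false for it.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: counts the elements that represent a started thread.
std::size_t countJoinable(const std::vector<std::thread>& threads)
{
    std::size_t n = 0;
    for (const std::thread& t : threads)
        if (t.joinable())
            ++n;
    return n;
}

// Hypothetical helper: creates `total` default-constructed thread objects,
// starts only the first `started` of them, and joins just the joinable ones.
// Returns how many elements were joinable before joining.
std::size_t startSomeAndJoin(std::size_t total, std::size_t started)
{
    std::vector<std::thread> threads(total);       // `total` empty thread objects
    for (std::size_t i = 0; i < started && i < total; ++i)
        threads[i] = std::thread([] {});           // move-assign a real thread
    const std::size_t joinable = countJoinable(threads);
    for (std::thread& t : threads)
        if (t.joinable())                          // skip the never-started ones
            t.join();
    return joinable;
}
```

With 10,000 elements and 10 started threads, `startSomeAndJoin(10000, 10)` would report 10 joinable elements; the other 9,990 are the ones whose join() throws in the original code.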
You might want to try using t.reserve(thread_no); and t.emplace_back(a, 10);
Here's a complete example with renaming.
int threadFn(int iterations)
{
int s = 0;
for(int i = 0; i < iterations; i++)
s += i;
return s;
}
int main()
{
enum {
MaximumThreadCapacity = 10000,
DesiredInitialThreads = 10,
ThreadLoopIterations = 100,
};
vector<thread> threads;
threads.reserve(MaximumThreadCapacity);
for(int i = 0; i < DesiredInitialThreads; i++)
threads.emplace_back(threadFn, ThreadLoopIterations);
std::cout << threads.size() << " threads spun up\n";
for(auto& t : threads) {
t.join();
}
std::cout << "threads joined\n";
}
---- EDIT ----
Specifically, the crash you are getting is the attempt to join a non-running thread, http://ideone.com/OuLMyQ
#include <thread>
int main() {
std::thread t;
t.join();
return 0;
}
stderr
terminate called after throwing an instance of 'std::system_error'
what(): Invalid argument
I point this out because you should be aware there is a race condition even with a valid thread, if you do
if (t.joinable())
t.join();
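it's possible for 't' to become non-joinable between the test and the action if another thread can join or detach the same std::thread object at the same time. If that can happen in your program, put the t.join() in a try {} clause. See http://en.cppreference.com/w/cpp/thread/thread/join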
Complete example:
int threadFn(int iterations)
{
int s = 0;
for(int i = 0; i < iterations; i++)
s += i;
return s;
}
int main()
{
enum {
MaximumThreadCapacity = 10000,
DesiredInitialThreads = 10,
ThreadLoopIterations = 100,
};
vector<thread> threads;
threads.reserve(MaximumThreadCapacity);
for(int i = 0; i < DesiredInitialThreads; i++)
threads.emplace_back(threadFn, ThreadLoopIterations);
std::cout << threads.size() << " threads spun up\n";
for(auto& t : threads) {
try {
if(t.joinable())
t.join();
} catch (const std::system_error& e) {
// std::error_code cannot be used in a switch, so compare against std::errc instead.
if (e.code() == std::errc::invalid_argument ||
e.code() == std::errc::no_such_process)
continue;
if (e.code() == std::errc::resource_deadlock_would_occur)
std::cerr << "deadlock during join - wth!\n";
else
std::cerr << "error during join: " << e.what() << '\n';
// main() returns int, so return the numeric error value.
return e.code().value();
}
}
std::cout << "threads joined\n";
}

You create a vector that has 10000 elements in it, then populate only the first ten, and then wait for all the threads inside the vector to join. Your program crashes because the other 9990 were never started, so joining them fails.
for(int i = 0; i < 10; i++) // Wrong
for(int i = 0; i < thread_no; i++) // Correct

Related

Atomics and Multi-Threading in C++

So I'm messing around with atomic and thread and made this program that reads and writes to an array based on whether an atomic is engaged.
When I run it, however, the outputs seem to vary.
EDIT: by vary, I mean "k" and "out" are not always identical when the program runs.
#include <atomic>
#include <iostream>
#include <thread>
#include <vector>
std::atomic<bool> busy (false);
std::atomic_int out (0);
void do_thing(std::vector<int>* inputs) {
while(!busy) {
std::this_thread::yield();
}
int next = inputs->back();
inputs->pop_back();
out.store(out.load(std::memory_order_relaxed) + next, std::memory_order_relaxed);
}
int main(void) {
std::vector<int> inputs;
for(int i = 0; i < 100; i++) inputs.push_back(rand() % 10);
// base
int k = 0;
for(auto& el : inputs) k += el;
std::cout << k << std::endl;
// threaded
try {
std::vector<std::thread> threads;
for(int i = 0; i < 100; i++) threads.push_back(std::thread(do_thing, &inputs));
busy = true;
for (auto& th : threads) th.join();
std::cout << out.load(std::memory_order_relaxed);
std::cout << std::endl;
}
catch (const std::exception& ex) {
std::cout << ex.what() << std::endl;
}
return 0;
}
Could anyone point out what is happening?
I have a feeling this could be done better with a mutex, but was curious about why it happens anyhow.
Cheers,
K
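A likely explanation, offered as a sketch rather than a definitive diagnosis: every thread calls back() and pop_back() on the same vector with no synchronization, which is a data race, and the separate load and store on `out` is not one atomic read-modify-write, so updates can be lost. The version below assumes a mutex (`inputs_mutex`, a name introduced here) around the vector and uses fetch_add for the sum. `threadedSum` is a hypothetical function name, and the `busy` start gate is left out for brevity.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Sums `inputs` with one thread per element and returns the total.
int threadedSum(std::vector<int> inputs)
{
    std::mutex inputs_mutex;                 // guards every access to `inputs`
    std::atomic<int> out{0};
    const std::size_t count = inputs.size(); // fixed before any thread pops

    std::vector<std::thread> threads;
    threads.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        threads.emplace_back([&] {
            int next;
            {
                // Only one thread at a time may read and pop the vector.
                std::lock_guard<std::mutex> lock(inputs_mutex);
                next = inputs.back();
                inputs.pop_back();
            }
            // A single atomic read-modify-write, so no update is lost.
            out.fetch_add(next, std::memory_order_relaxed);
        });
    }
    for (std::thread& t : threads)
        t.join();                            // join makes all additions visible
    return out.load();
}
```

With this shape the threaded total should match the plain loop over the same values.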

Parallel programming for tasks in main function - c++

Is it possible to define two tasks and let them run in parallel in C++? I found something about parallel functions, but nothing about parallel tasks in the main function like this:
int main() {
// task 1
int a = 0;
for(int i = 0; i < 150; i++){
a++;
std::cout << a << std::endl;
// do more stuff
}
// task 2
int b = 0;
for(int i = 0; i < 150; i++){
b++;
std::cout << b << std::endl;
// do more stuff
}
}
Race conditions etc. can't occur.
Thanks for helping!
Yes, you can run the two in parallel, using std::async and lambdas. Here is an example:
int main()
{
auto f1 = std::async(std::launch::async, [](){
// task 1
int a = 0;
for(int i = 0; i < 10; i++){
a++;
std::cout << a << std::endl;
// do more stuff
}
});
auto f2 = std::async(std::launch::async, [](){
// task 2
int b = 100;
for(int i = 0; i < 10; i++){
b++;
std::cout << b << std::endl;
// do more stuff
}
});
f1.wait();
f2.wait();
}
(You'll get interleaved console output from this, because access to the console needs to be guarded with a mutex or a similar mechanism.)

How do I correctly use std::mutex in C++ without deadlocks and/or races?

I am trying to debug a program that I am trying to run in parallel. I am at a loss for why I have both deadlocks and race conditions when I attempt to compile and run the code in C++. Here is all the relevant code that I have written thus far.
// define job struct here
// define mutex, condition variable, deque, and atomic here
std::deque<job> jobList;
std::mutex jobMutex;
std::condition_variable jobCondition;
std::atomic<int> numberThreadsRunning;
void addJobs(...insert parameters here...)
{
job current = {...insert parameters here...};
jobMutex.lock();
std::cout << "We have successfully acquired the mutex." << std::endl;
jobList.push_back(current);
jobCondition.notify_one();
jobMutex.unlock();
std::cout << "We have successfully unlocked the mutex." << std::endl;
}
void work(void) {
job* current;
numberThreadsRunning++;
while (true) {
std::unique_lock<std::mutex> lock(jobMutex);
if (jobList.empty()) {
numberThreadsRunning--;
jobCondition.wait(lock);
numberThreadsRunning++;
}
current = &jobList.at(0);
jobList.pop_front();
jobMutex.unlock();
std::cout << "We are now going to start a job." << std::endl;
////Call an expensive function for the current job that we want to run in parallel.
////This could either complete the job, or spawn more jobs, by calling addJobs.
////This recursive behavior typically results in there being thousands of jobs.
std::cout << "We have successfully completed a job." << std::endl;
}
numberThreadsRunning--;
std::cout << "There are now " << numberThreadsRunning << " threads running." << std::endl;
}
int main( int argc, char *argv[] ) {
//Initialize everything and add first job to the deque.
std::thread jobThreads[n];
for (int i = 0; i < n; i++) {
jobThreads[i] = std::thread(work);
}
for (int i = 0; i < n; i++) {
jobThreads[i].join();
}
}
The code compiles, but depending on random factors, it will either deadlock at the very end or have a segmentation fault in the middle while the queue is still quite large. Does anyone know more about why this is happening?
...
EDIT:
I have edited this question to include additional information and a more complete example. While I certainly don't want to bore you with the thousands of lines of code I actually have (an image rendering package), I believe this example better represents the type of problem I am facing. The example given in the answer by Alan Birtles only works on very simple job structure with very simple functionality. In the actual job struct, there are multiple pointers to different vectors and matrices, and therefore we need pointers to the job struct, otherwise the compiler would fail to compile because the constructor function was "implicitly deleted".
I believe the error I am facing has to do with the way I am locking and unlocking the threads. I know that the pointers are also causing some issues, but they probably have to stay. The function thisFunction() represents the function that needs to be run in parallel.
#include <queue>
#include <deque>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <atomic>
#include <iostream>
#include <cmath>
struct job {
std::vector<std::vector<int>> &matrix;
int num;
};
bool closed = false;
std::deque<job> jobList;
std::mutex jobMutex;
std::condition_variable jobCondition;
std::atomic<int> numberThreadsRunning;
std::atomic<int> numJobs;
struct tcout
{
tcout() :lock(mutex) {}
template < typename T >
tcout& operator<< (T&& t)
{
std::cout << t;
return *this;
}
static std::mutex mutex;
std::unique_lock< std::mutex > lock;
};
std::mutex tcout::mutex;
std::vector<std::vector<int>> multiply4x4(
std::vector<std::vector<int>> &A,
std::vector<std::vector<int>> &B) {
//Only deals with 4x4 matrices
std::vector<std::vector<int>> C(4, std::vector<int>(4, 0));
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
for (int k = 0; k < 4; k++) {
C.at(i).at(j) = C.at(i).at(j) + A.at(i).at(k) * B.at(k).at(j);
}
}
}
return C;
}
void addJobs()
{
numJobs++;
std::vector<std::vector<int>> matrix(4, std::vector<int>(4, -1)); //Create random 4x4 matrix
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
matrix.at(i).at(j) = rand() % 10 + 1;
}
}
job current = { matrix, numJobs };
std::unique_lock<std::mutex> lock(jobMutex);
std::cout << "The matrix for job " << current.num << " is: \n";
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
std::cout << matrix.at(i).at(j) << "\t";
}
std::cout << "\n";
}
jobList.push_back(current);
jobCondition.notify_one();
lock.unlock();
}
void thisFunction(std::vector<std::vector<int>> &matrix, int num)
{
std::this_thread::sleep_for(std::chrono::milliseconds(rand() * 500 / RAND_MAX));
std::vector<std::vector<int>> product = matrix;
std::unique_lock<std::mutex> lk(jobMutex);
std::cout << "The imported matrix for job " << num << " is: \n";
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
std::cout << product.at(i).at(j) << "\t";
}
std::cout << "\n";
}
lk.unlock();
int power;
if (num % 2 == 1) {
power = 3;
} else if (num % 2 == 0) {
power = 2;
addJobs();
}
for (int k = 1; k < power; k++) {
product = multiply4x4(product, matrix);
}
std::unique_lock<std::mutex> lock(jobMutex);
std::cout << "The matrix for job " << num << " to the power of " << power << " is: \n";
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
std::cout << product.at(i).at(j) << "\t";
}
std::cout << "\n";
}
lock.unlock();
}
void work(void) {
job *current;
numberThreadsRunning++;
while (true) {
std::unique_lock<std::mutex> lock(jobMutex);
if (jobList.empty()) {
numberThreadsRunning--;
jobCondition.wait(lock, [] {return !jobList.empty() || closed; });
numberThreadsRunning++;
}
if (jobList.empty())
{
break;
}
current = &jobList.front();
job newcurrent = {current->matrix, current->num};
current = &newcurrent;
jobList.pop_front();
lock.unlock();
thisFunction(current->matrix, current->num);
tcout() << "job " << current->num << " complete\n";
}
numberThreadsRunning--;
}
int main(int argc, char *argv[]) {
const size_t n = 1;
numJobs = 0;
std::thread jobThreads[n];
std::vector<int> buffer;
for (int i = 0; i < n; i++) {
jobThreads[i] = std::thread(work);
}
for (int i = 0; i < 100; i++)
{
addJobs();
}
{
std::unique_lock<std::mutex> lock(jobMutex);
closed = true;
jobCondition.notify_all();
}
for (int i = 0; i < n; i++) {
jobThreads[i].join();
}
}
Here is a fully working example:
#include <chrono>
#include <cstdlib>
#include <deque>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <atomic>
#include <iostream>
struct job { int num; };
bool closed = false;
std::deque<job> jobList;
std::mutex jobMutex;
std::condition_variable jobCondition;
std::atomic<int> numberThreadsRunning;
struct tcout
{
tcout() :lock(mutex) {}
template < typename T >
tcout& operator<< (T&& t)
{
std::cout << t;
return *this;
}
static std::mutex mutex;
std::unique_lock< std::mutex > lock;
};
std::mutex tcout::mutex;
void addJobs()
{
static int num = 0;
job current = { num++ };
std::unique_lock<std::mutex> lock(jobMutex);
jobList.push_back(current);
jobCondition.notify_one();
lock.unlock();
}
void work(void) {
job current;
numberThreadsRunning++;
while (true) {
std::unique_lock<std::mutex> lock(jobMutex);
if (jobList.empty()) {
numberThreadsRunning--;
jobCondition.wait(lock, [] {return !jobList.empty() || closed; });
numberThreadsRunning++;
}
if (jobList.empty())
{
break;
}
current = jobList.front();
jobList.pop_front();
lock.unlock();
// rand() * 500 can overflow int when RAND_MAX is large, so use the remainder instead.
std::this_thread::sleep_for(std::chrono::milliseconds(rand() % 500));
tcout() << "job " << current.num << " complete\n";
}
numberThreadsRunning--;
}
int main(int argc, char *argv[]) {
const size_t n = 4;
std::thread jobThreads[n];
for (int i = 0; i < n; i++) {
jobThreads[i] = std::thread(work);
}
for (int i = 0; i < 100; i++)
{
addJobs();
}
{
std::unique_lock<std::mutex> lock(jobMutex);
closed = true;
jobCondition.notify_all();
}
for (int i = 0; i < n; i++) {
jobThreads[i].join();
}
}
I've made the following changes:
Never call lock() or unlock() on a std::mutex directly; always use std::unique_lock (or a similar RAII class such as std::lock_guard). You were calling jobMutex.unlock() in work() on a mutex that you had locked through std::unique_lock, so the std::unique_lock would then unlock it a second time, leading to undefined behaviour. In addJobs you weren't using std::unique_lock at all, so if an exception were thrown the mutex would remain locked.
You need to use a predicate for jobCondition.wait otherwise a spurious wakeup could cause the wait to return while jobList is still empty.
I've added a closed variable to make the program exit when there's no more work to do
I've added a definition of job
In work you take a pointer to an item on the queue then pop it off the queue, as the item no longer exists the pointer is dangling. You need to copy the item before popping the queue. If you want to avoid the copy either make your job structure movable or change your queue to store std::unique_ptr<job> or std::shared_ptr<job>
I've also added a thread safe version of std::cout. This isn't strictly necessary, but it stops your output lines overlapping each other. Ideally you should use a proper thread safe logging library instead, as locking a mutex for every print is expensive and, if you have enough prints, will make your program practically single threaded.
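The std::unique_ptr<job> option mentioned in the list above could look roughly like this. It is only a sketch: `Job`, `JobQueue` and `tryPop` are hypothetical names, and the payload is a stand-in for the real job struct. The point is the order of operations: move the pointer out of the front element first, then pop.

```cpp
#include <deque>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical job type with a heavy payload, for illustration only.
struct Job {
    std::vector<std::vector<int>> matrix;
    int num = 0;
};

class JobQueue {
public:
    void push(std::unique_ptr<Job> j)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        jobs_.push_back(std::move(j));
    }

    // Returns the next job, or nullptr if the queue is empty.
    std::unique_ptr<Job> tryPop()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (jobs_.empty())
            return nullptr;
        // Take ownership first; only the pointer moves, the Job is not copied.
        std::unique_ptr<Job> j = std::move(jobs_.front());
        jobs_.pop_front();   // removes the now-empty slot, the Job lives on in j
        return j;
    }

private:
    std::mutex mutex_;
    std::deque<std::unique_ptr<Job>> jobs_;
};
```

Because the worker owns the returned pointer, nothing it holds refers to storage inside the deque, so later pushes and pops cannot invalidate it.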
Replace job* current; with job current; and then current = jobList.at(0);. Otherwise you end up with a pointer to an element of jobList that does not exist after jobList.pop_front().
Replace if (jobList.empty()) with while(jobList.empty()) to handle spurious wakeups.

C++ : Passing threadID to function anomaly

I implemented a concurrent queue with two methods: add (enqueue) & remove (dequeue).
To test my implementation using 2 threads, I generated 10 (NUMBER_OF_OPERATIONS) random numbers between 0 and 1 in a method called getRandom(). This allows me to create different distribution of add and remove operations.
The doWork method splits up the work done by the number of threads.
PROBLEM: The threadID that I am passing in from the main function does not match the threadID that the doWork method receives. (The sample runs "Output 1" and "Output 2" are not reproduced here.)
#define NUMBER_OF_THREADS 2
#define NUMBER_OF_OPERATIONS 10
int main () {
BoundedQueue<int> bQ;
std::vector<double> temp = getRandom();
double* randomNumbers = &temp[0];
std::thread myThreads[NUMBER_OF_THREADS];
for(int i = 0; i < NUMBER_OF_THREADS; i++) {
cout << "Thread " << i << " created.\n";
myThreads[i] = std::thread ( [&] { bQ.doWork(randomNumbers, i); });
}
cout << "Main Thread\n";
for(int i = 0; i < NUMBER_OF_THREADS; i++) {
if(myThreads[i].joinable()) myThreads[i].join();
}
return 0;
}
template <class T> void BoundedQueue<T>::doWork (double randomNumbers[], int threadID) {
cout << "Thread ID is " << threadID << "\n";
srand(time(NULL));
int split = NUMBER_OF_OPERATIONS / NUMBER_OF_THREADS;
for (int i = threadID * split; i < (threadID * split) + split; i++) {
if(randomNumbers[i] <= 0.5) {
int numToAdd = rand() % 10 + 1;
add(numToAdd);
}
else {
int numRemoved = remove();
}
}
}
In this line you're capturing i by reference:
myThreads[i] = std::thread ( [&] { bQ.doWork(randomNumbers, i); });
This means that when the other thread runs the lambda, it'll get the latest value of i, not the value when it was created. Capture it by value instead:
myThreads[i] = std::thread ( [&, i] { bQ.doWork(randomNumbers, i); });
What's worse, as you've got unsynchronized reads and writes to i, your current code has undefined behaviour. On top of that, i may have gone out of scope on the main thread before the other thread reads it. Capturing i by value fixes all of these issues.
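To make the by-value capture concrete, here is a small sketch that is independent of BoundedQueue. `recordIds` is a hypothetical function introduced for illustration: each thread writes the id it received into its own slot, so any mismatch would show up in the result.

```cpp
#include <thread>
#include <vector>

// Starts `count` threads; thread i stores the id it was given in slot i.
std::vector<int> recordIds(int count)
{
    std::vector<int> seen(count, -1);
    std::vector<std::thread> threads;
    threads.reserve(count);
    for (int i = 0; i < count; i++) {
        // `seen` is captured by reference (it outlives the threads),
        // `i` is captured by value, so each lambda keeps its own copy.
        threads.emplace_back([&seen, i] { seen[i] = i; });
    }
    for (std::thread& t : threads)
        t.join();
    return seen;
}
```

Each thread touches a different element of `seen`, so no locking is needed here. With `[&]` instead, every lambda would read the loop variable itself, which may already have changed or been destroyed.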

a thread related issue

I am learning to use threads, and I find that I can use the following:
mutex mx;
void func(int id)
{
mx.lock();
cout << "hey , thread:"<<id << "!" << endl;
mx.unlock();
}
int main(){
vector<thread> threads;
for(int i = 0 ; i < 5 ; i++)
threads.emplace_back(thread(func , i));
for(thread & t : threads)
t.join();
return 0;
}
while I can't do this in main():
for(int i = 0 ; i < 5 ; i ++)
{
thread t(func , i);
threads.emplace_back(t);
}
Can anyone explain this a little?
std::thread is not copyable, so you need to move the object:
thread t(func, i);
threads.push_back(std::move(t));
emplace_back also works, but push_back is idiomatic in this case. And of course #include <utility> for std::move.
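Both variants side by side, as a sketch. `runAll` is a hypothetical function introduced here, and the lambda stands in for func; the static_assert only documents why the copy in the question does not compile.

```cpp
#include <atomic>
#include <thread>
#include <type_traits>
#include <utility>
#include <vector>

// std::thread can be moved but not copied, which is why passing a named
// thread object to emplace_back or push_back without std::move fails.
static_assert(!std::is_copy_constructible<std::thread>::value,
              "std::thread is move-only");

// Starts `count` threads that add their index to a sum, plus one extra
// thread that adds 100, and returns the total.
int runAll(int count)
{
    std::atomic<int> sum{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < count; ++i) {
        std::thread t([&sum, i] { sum += i; });
        threads.push_back(std::move(t));   // t is left empty; the vector owns the thread
    }
    // Alternative: construct the thread in place from its arguments.
    threads.emplace_back([&sum] { sum += 100; });
    for (std::thread& t : threads)
        t.join();
    return sum;
}
```

Passing the constructor arguments straight to emplace_back avoids the named temporary altogether.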