I'm trying to change a vector in a different thread, but the value of the vector is not changed. I thought that using std::ref would fix the issue, but it didn't work.
This is the code that starts the threads:
printf("tmp size: %d\n", tmp_size);
printf("before change");
printArray(tmp);
std::thread threads[1];
for(int i = 0; i < 1; i++){
threads[i] = std::thread(callback, std::ref(tmp));
}
for(int i = 0; i < 1; i++){
threads[i].join();
}
printf("after join: ");
printArray(tmp);
This is the callback:
void callback(std::vector<uint64_t> tmp){
tmp[0] = 1;
printf("inside callback");
printArray(tmp);
}
and the output is:
tmp size: 2
before change 0 0
inside callback 1 0
after join: 0 0
I was expecting that after the thread changes the vector, the values after the join would match those inside the callback: 1 0. Isn't it passed by reference?
You are passing a reference to std::thread, but the callback takes its parameter by value, so it receives a copy of the vector that the reference refers to. Modifying that copy has no effect on the original vector. The callback needs to take its parameter by reference. Here's a demonstration of how to do it correctly:
#include <cstdint>
#include <cstdio>
#include <functional>
#include <thread>
#include <vector>
void callback(std::vector<uint64_t> &tmp)
{
tmp[0] += 1;
}
int main()
{
std::thread threads[1];
std::vector<uint64_t> tmp;
tmp.push_back(1);
for(int i = 0; i < 1; i++)
threads[i] = std::thread(callback, std::ref(tmp));
for(int i = 0; i < 1; i++)
threads[i].join();
printf("%d\n", (int) tmp[0]);
}
If you wanted the callback to change the vector, you would have to pass it by pointer or reference.
Your callback code has made a copy of it instead.
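To make the difference concrete, here is a minimal sketch that runs the same one-line change through a by-value, a by-reference and a by-pointer callback. The function names (set_first_by_value, run_by_ref and so on) are made up for this illustration and are not from the question.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// By value: the thread works on its own copy, so the caller sees no change.
void set_first_by_value(std::vector<std::uint64_t> v) {
    assert(!v.empty());
    v[0] = 1;
}

// By reference: the caller must wrap the argument in std::ref.
void set_first_by_ref(std::vector<std::uint64_t>& v) {
    assert(!v.empty());
    v[0] = 1;
}

// By pointer: the caller passes the address of the vector.
void set_first_by_ptr(std::vector<std::uint64_t>* v) {
    assert(v != nullptr && !v->empty());
    (*v)[0] = 1;
}

// Each helper starts one thread, joins it, and returns tmp[0] afterwards.
std::uint64_t run_by_value() {
    std::vector<std::uint64_t> tmp(2, 0);
    std::thread t(set_first_by_value, std::ref(tmp));  // still copied at the call
    t.join();
    return tmp[0];
}

std::uint64_t run_by_ref() {
    std::vector<std::uint64_t> tmp(2, 0);
    std::thread t(set_first_by_ref, std::ref(tmp));
    t.join();
    return tmp[0];
}

std::uint64_t run_by_ptr() {
    std::vector<std::uint64_t> tmp(2, 0);
    std::thread t(set_first_by_ptr, &tmp);
    t.join();
    return tmp[0];
}
```

Calling run_by_value() leaves the first element at 0, while run_by_ref() and run_by_ptr() both return 1. In every case the vector has to outlive the thread, which the join() before the return takes care of here.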
Another option that can sometimes be more thread-safe is to "move" the vector into the thread and then move it back out when the thread finishes, like so:
#include <thread>
#include <future>
#include <vector>
#include <iostream>
std::vector<int> addtovec(std::vector<int> vec, int add) {
for(auto &x: vec) {
x += add;
}
return vec;
}
std::ostream& operator<<(std::ostream& os, const std::vector<int> &v) {
os << '{';
bool comma = false;
for(const auto &x: v) {
if(comma) os << ',';
comma = true;
os << x;
}
os << '}';
return os;
}
int main() {
std::vector<int> a{1,2,3,9,8,7};
std::cout << "before: " << a << std::endl;
auto future = std::async(addtovec, std::move(a), 5);
std::cout << "after move: " << a << std::endl;
a = future.get();
std::cout << "after get: " << a << std::endl;
return 0;
}
So I'm messing around with atomic and thread and made this program that reads and writes to an array based on whether an atomic is engaged.
When it runs, however, the output seems to vary.
EDIT: by vary, I mean "k" and "out" are not identical when I run it.
#include <atomic>
#include <iostream>
#include <thread>
#include <vector>
std::atomic<bool> busy (false);
std::atomic_int out (0);
void do_thing(std::vector<int>* inputs) {
while(!busy) {
std::this_thread::yield();
}
int next = inputs->back();
inputs->pop_back();
out.store(out.load(std::memory_order_relaxed) + next, std::memory_order_relaxed);
}
int main(void) {
std::vector<int> inputs;
for(int i = 0; i < 100; i++) inputs.push_back(rand() % 10);
// base
int k = 0;
for(auto& el : inputs) k += el;
std::cout << k << std::endl;
// threaded
try {
std::vector<std::thread> threads;
for(int i = 0; i < 100; i++) threads.push_back(std::thread(do_thing, &inputs));
busy = true;
for (auto& th : threads) th.join();
std::cout << out.load(std::memory_order_relaxed);
std::cout << std::endl;
}
catch (const std::exception& ex) {
std::cout << ex.what() << std::endl;
}
return 0;
}
Could anyone point out what is happening?
I have a feeling this could be done better with a mutex, but was curious about why it happens anyhow.
Cheers,
K
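One likely explanation, offered as a reading of the code above rather than a tested diagnosis: the atomic flag only releases the threads, it does not serialize them. All 100 threads then call back() and pop_back() on the same vector at the same time, and the separate load and store on out is not a single atomic read-modify-write, so updates can be lost. Below is a minimal sketch of the mutex version the question hints at. The function name threaded_sum is made up for this illustration, and it assumes one thread per input element as in the original.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Sums the inputs by letting each thread pop exactly one element.
// The mutex protects both the shared vector and the running total.
int threaded_sum(std::vector<int> inputs) {
    std::mutex m;
    int out = 0;
    const std::size_t count = inputs.size();

    std::vector<std::thread> threads;
    threads.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        threads.emplace_back([&] {
            std::lock_guard<std::mutex> guard(m);
            assert(!inputs.empty());  // one thread per element, so never empty here
            out += inputs.back();
            inputs.pop_back();
        });
    }
    for (auto& th : threads) th.join();
    return out;
}
```

With the lock held for the whole pop-and-add step, threaded_sum({1, 2, 3, 4}) gives 10 on every run. The threads no longer do anything in parallel, which is fine for an experiment but means the mutex version is about correctness, not speed.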
I got stuck in many problems where I was trying to store values in 2D vectors.
So I have written this simple code.
I am just storing and printing my values :
int main()
{
vector<vector<int>> vec;
vector<int> row{1,3,5,7,9,12,34,56};
int i,n,m,rs,vs;
rs=row.size();
cout<<"rs = "<<rs<<endl;
for(i=0;i<(rs/2);i++)
{
vec[i].push_back(row.at(i));
vec[i].push_back(row.at(i+4));
}
vs=vec.size();
cout<<vs<<endl;
for(n=0;n<vs;n++)
{
for(m=0;m<2;m++)
{
cout<<vec[n][m]<<" ";
}
cout<<endl;
}
return 0;
}
First, you should read "Why is 'using namespace std;' considered bad practice?".
Declare variables when you use them and not at the beginning of your program.
The vector vec is empty at the beginning. In the loop
for(i=0;i<(rs/2);i++)
{
vec[i].push_back(row.at(i));
vec[i].push_back(row.at(i+4));
}
you are taking a reference to the i-th element in vec with
vec[i]
but this element does not exist. This is undefined behavior and can result in a segmentation fault. You can fix it by constructing the vector with the needed elements:
#include <iostream>
#include <vector>
int main()
{
std::vector<int> row{1,3,5,7,9,12,34,56};
int rs = row.size();
std::vector<std::vector<int>> vec(rs / 2);
std::cout << "rs = " << rs << '\n';
for(int i = 0; i < rs / 2; ++i)
{
vec[i].push_back(row.at(i));
vec[i].push_back(row.at(i + 4));
}
int vs = vec.size();
std::cout << vs << '\n';
for(int n = 0; n < vs; ++n)
{
for(int m = 0; m < 2; ++m)
{
std::cout << vec[n][m] << " ";
}
std::cout << '\n';
}
return 0;
}
In this example the line
std::vector<std::vector<int>> vec(rs / 2);
constructs a vector containing rs / 2 default-constructed elements. Alternatively, you can start with an empty vector and push back elements in the loop:
#include <iostream>
#include <vector>
int main()
{
std::vector<int> row{1,3,5,7,9,12,34,56};
int rs=row.size();
std::vector<std::vector<int>> vec;
std::cout << "rs = " << rs << '\n';
for(int i = 0; i < rs / 2; ++i)
{
vec.push_back({row.at(i), row.at(i+4)});
//
// is similar to:
// vec.push_back({});
// vec.back().push_back(row.at(i));
// vec.back().push_back(row.at(i+4));
//
// is similar to:
// vec.push_back({});
// vec[i].push_back(row.at(i));
// vec[i].push_back(row.at(i+4));
}
int vs = vec.size();
std::cout << vs << '\n';
for(int n = 0; n < vs; ++n)
{
for(int m = 0; m < 2; ++m)
{
std::cout << vec[n][m] << " ";
}
std::cout << '\n';
}
return 0;
}
I recommend the first solution. It's better to create all the elements of the outer vector up front and work with them than to grow the vector in each loop iteration.
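A middle ground between the two solutions, sketched here as a variation rather than something from the answer: keep the push_back style but call reserve first, so the outer vector is allocated once. The helper name pair_halves is made up for this example, and it pairs element i with element i + half instead of hard-coding the offset 4.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pairs row[i] with row[i + half], one pair per inner vector.
std::vector<std::vector<int>> pair_halves(const std::vector<int>& row) {
    const std::size_t half = row.size() / 2;

    std::vector<std::vector<int>> vec;
    vec.reserve(half);  // single allocation for the outer vector
    for (std::size_t i = 0; i < half; ++i) {
        vec.push_back({row.at(i), row.at(i + half)});
    }
    assert(vec.size() == half);
    return vec;
}
```

For the row used above, pair_halves({1,3,5,7,9,12,34,56}) produces the four pairs {1,9}, {3,12}, {5,34} and {7,56}. Each inner vector still needs its own allocation; reserve only removes the repeated growth of the outer one.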
I am trying to debug a program that I am trying to run in parallel. I am at a loss for why I have both deadlocks and race conditions when I attempt to compile and run the code in C++. Here is all the relevant code that I have written thus far.
// define job struct here
// define mutex, condition variable, deque, and atomic here
std::deque<job> jobList;
std::mutex jobMutex;
std::condition_variable jobCondition;
std::atomic<int> numberThreadsRunning;
void addJobs(...insert parameters here...)
{
job current = {...insert parameters here...};
jobMutex.lock();
std::cout << "We have successfully acquired the mutex." << std::endl;
jobList.push_back(current);
jobCondition.notify_one();
jobMutex.unlock();
std::cout << "We have successfully unlocked the mutex." << std::endl;
}
void work(void) {
job* current;
numberThreadsRunning++;
while (true) {
std::unique_lock<std::mutex> lock(jobMutex);
if (jobList.empty()) {
numberThreadsRunning--;
jobCondition.wait(lock);
numberThreadsRunning++;
}
current = &jobList.at(0);
jobList.pop_front();
jobMutex.unlock();
std::cout << "We are now going to start a job." << std::endl;
////Call an expensive function for the current job that we want to run in parallel.
////This could either complete the job, or spawn more jobs, by calling addJobs.
////This recursive behavior typically results in there being thousands of jobs.
std::cout << "We have successfully completed a job." << std::endl;
}
numberThreadsRunning--;
std::cout << "There are now " << numberThreadsRunning << " threads running." << std::endl;
}
int main( int argc, char *argv[] ) {
//Initialize everything and add first job to the deque.
std::thread jobThreads[n];
for (int i = 0; i < n; i++) {
jobThreads[i] = std::thread(work);
}
for (int i = 0; i < n; i++) {
jobThreads[i].join();
}
}
The code compiles, but depending on random factors, it will either deadlock at the very end or have a segmentation fault in the middle while the queue is still quite large. Does anyone know more about why this is happening?
...
EDIT:
I have edited this question to include additional information and a more complete example. While I certainly don't want to bore you with the thousands of lines of code I actually have (an image rendering package), I believe this example better represents the type of problem I am facing. The example given in the answer by Alan Birtles only works on very simple job structure with very simple functionality. In the actual job struct, there are multiple pointers to different vectors and matrices, and therefore we need pointers to the job struct, otherwise the compiler would fail to compile because the constructor function was "implicitly deleted".
I believe the error I am facing has to do with the way I am locking and unlocking the threads. I know that the pointers are also causing some issues, but they probably have to stay. The function thisFunction() represents the function that needs to be run in parallel.
#include <queue>
#include <deque>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <atomic>
#include <iostream>
#include <cmath>
struct job {
std::vector<std::vector<int>> &matrix;
int num;
};
bool closed = false;
std::deque<job> jobList;
std::mutex jobMutex;
std::condition_variable jobCondition;
std::atomic<int> numberThreadsRunning;
std::atomic<int> numJobs;
struct tcout
{
tcout() :lock(mutex) {}
template < typename T >
tcout& operator<< (T&& t)
{
std::cout << t;
return *this;
}
static std::mutex mutex;
std::unique_lock< std::mutex > lock;
};
std::mutex tcout::mutex;
std::vector<std::vector<int>> multiply4x4(
std::vector<std::vector<int>> &A,
std::vector<std::vector<int>> &B) {
//Only deals with 4x4 matrices
std::vector<std::vector<int>> C(4, std::vector<int>(4, 0));
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
for (int k = 0; k < 4; k++) {
C.at(i).at(j) = C.at(i).at(j) + A.at(i).at(k) * B.at(k).at(j);
}
}
}
return C;
}
void addJobs()
{
numJobs++;
std::vector<std::vector<int>> matrix(4, std::vector<int>(4, -1)); //Create random 4x4 matrix
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
matrix.at(i).at(j) = rand() % 10 + 1;
}
}
job current = { matrix, numJobs };
std::unique_lock<std::mutex> lock(jobMutex);
std::cout << "The matrix for job " << current.num << " is: \n";
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
std::cout << matrix.at(i).at(j) << "\t";
}
std::cout << "\n";
}
jobList.push_back(current);
jobCondition.notify_one();
lock.unlock();
}
void thisFunction(std::vector<std::vector<int>> &matrix, int num)
{
std::this_thread::sleep_for(std::chrono::milliseconds(rand() * 500 / RAND_MAX));
std::vector<std::vector<int>> product = matrix;
std::unique_lock<std::mutex> lk(jobMutex);
std::cout << "The imported matrix for job " << num << " is: \n";
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
std::cout << product.at(i).at(j) << "\t";
}
std::cout << "\n";
}
lk.unlock();
int power;
if (num % 2 == 1) {
power = 3;
} else if (num % 2 == 0) {
power = 2;
addJobs();
}
for (int k = 1; k < power; k++) {
product = multiply4x4(product, matrix);
}
std::unique_lock<std::mutex> lock(jobMutex);
std::cout << "The matrix for job " << num << " to the power of " << power << " is: \n";
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
std::cout << product.at(i).at(j) << "\t";
}
std::cout << "\n";
}
lock.unlock();
}
void work(void) {
job *current;
numberThreadsRunning++;
while (true) {
std::unique_lock<std::mutex> lock(jobMutex);
if (jobList.empty()) {
numberThreadsRunning--;
jobCondition.wait(lock, [] {return !jobList.empty() || closed; });
numberThreadsRunning++;
}
if (jobList.empty())
{
break;
}
current = &jobList.front();
job newcurrent = {current->matrix, current->num};
current = &newcurrent;
jobList.pop_front();
lock.unlock();
thisFunction(current->matrix, current->num);
tcout() << "job " << current->num << " complete\n";
}
numberThreadsRunning--;
}
int main(int argc, char *argv[]) {
const size_t n = 1;
numJobs = 0;
std::thread jobThreads[n];
std::vector<int> buffer;
for (int i = 0; i < n; i++) {
jobThreads[i] = std::thread(work);
}
for (int i = 0; i < 100; i++)
{
addJobs();
}
{
std::unique_lock<std::mutex> lock(jobMutex);
closed = true;
jobCondition.notify_all();
}
for (int i = 0; i < n; i++) {
jobThreads[i].join();
}
}
Here is a fully working example:
#include <queue>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <atomic>
#include <iostream>
struct job { int num; };
bool closed = false;
std::deque<job> jobList;
std::mutex jobMutex;
std::condition_variable jobCondition;
std::atomic<int> numberThreadsRunning;
struct tcout
{
tcout() :lock(mutex) {}
template < typename T >
tcout& operator<< (T&& t)
{
std::cout << t;
return *this;
}
static std::mutex mutex;
std::unique_lock< std::mutex > lock;
};
std::mutex tcout::mutex;
void addJobs()
{
static int num = 0;
job current = { num++ };
std::unique_lock<std::mutex> lock(jobMutex);
jobList.push_back(current);
jobCondition.notify_one();
lock.unlock();
}
void work(void) {
job current;
numberThreadsRunning++;
while (true) {
std::unique_lock<std::mutex> lock(jobMutex);
if (jobList.empty()) {
numberThreadsRunning--;
jobCondition.wait(lock, [] {return !jobList.empty() || closed; });
numberThreadsRunning++;
}
if (jobList.empty())
{
break;
}
current = jobList.front();
jobList.pop_front();
lock.unlock();
std::this_thread::sleep_for(std::chrono::milliseconds(rand() * 500 / RAND_MAX));
tcout() << "job " << current.num << " complete\n";
}
numberThreadsRunning--;
}
int main(int argc, char *argv[]) {
const size_t n = 4;
std::thread jobThreads[n];
for (int i = 0; i < n; i++) {
jobThreads[i] = std::thread(work);
}
for (int i = 0; i < 100; i++)
{
addJobs();
}
{
std::unique_lock<std::mutex> lock(jobMutex);
closed = true;
jobCondition.notify_all();
}
for (int i = 0; i < n; i++) {
jobThreads[i].join();
}
}
I've made the following changes:
Never call lock() or unlock() directly on a std::mutex; always use std::unique_lock (or a similar class). You were calling jobMutex.unlock() in work() on the mutex you had locked with std::unique_lock; std::unique_lock would then call unlock a second time, leading to undefined behaviour. In addJobs, where you weren't using std::unique_lock at all, the mutex would remain locked if an exception was thrown.
You need to use a predicate for jobCondition.wait; otherwise a spurious wakeup could cause the wait to return while jobList is still empty.
I've added a closed variable to make the program exit when there's no more work to do.
I've added a definition of job.
In work you take a pointer to an item on the queue and then pop it off the queue; as the item no longer exists, the pointer is dangling. You need to copy the item before popping the queue. If you want to avoid the copy, either make your job structure movable or change your queue to store std::unique_ptr<job> or std::shared_ptr<job>.
I've also added a thread-safe version of std::cout. This isn't strictly necessary but stops your output lines overlapping each other. Ideally you should use a proper thread-safe logging library instead, as locking a mutex for every print is expensive and, if you have enough prints, will make your program practically single-threaded.
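The "make your job structure movable" suggestion can be sketched as follows. This is an illustration under assumptions, not the asker's real job type: the job here owns its matrix by value (a reference member would dangle once the function that built the matrix returns), and job_queue is a hypothetical wrapper around the deque, mutex and condition variable used above. The sketch also assumes nothing is pushed after close().

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// The job owns its data, so it can be moved out of the queue safely.
struct job {
    std::vector<std::vector<int>> matrix;
    int num;
};

class job_queue {
public:
    void push(job j) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_);  // assumption of this sketch: no push after close()
            jobs_.push_back(std::move(j));
        }
        cv_.notify_one();
    }

    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until a job is available or the queue is closed.
    // Returns false once the queue is closed and empty.
    bool pop(job& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !jobs_.empty() || closed_; });
        if (jobs_.empty()) return false;
        out = std::move(jobs_.front());  // take the job before popping it
        jobs_.pop_front();
        return true;
    }

private:
    std::deque<job> jobs_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool closed_ = false;
};
```

A worker loop then reduces to `job current; while (queue.pop(current)) { /* run current */ }`, with no raw pointers into the deque and no manual unlock calls. If jobs can spawn further jobs, as in the question, the close() condition needs more thought than this sketch gives it.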
Replace job* current; with job current; and then current = jobList.at(0);. Otherwise you end up with a pointer to an element of jobList that does not exist after jobList.pop_front().
Replace if (jobList.empty()) with while(jobList.empty()) to handle spurious wakeups.
I am sorting a vector using a swap function.
When I use the loop:
for (int i = 0; i < vec.size(); i++)
code runs fine but when I use:
for (auto const &i:vec)
it crashes!
Error in `./run': free(): invalid next size (fast): 0x0000000001df0c20
#include <iostream>
#include <cstdlib>
#include <cstdio>
#include <ctime>
#include <vector>
#include <array>
template <class T>
void myswap(T &a,
T &b)
{
T temp = a;
a = b;
b = temp;
}
int main() {
const int N = 5;
std::vector<int> vec = {112,32,11,4,7};
std::cout << "\nInit\n";
for (const auto &i:vec)
{
std::cout << i << "\t";
}
int j;
for (auto const &i:vec)
//for (int i = 0; i < vec.size(); i++)
{
j = i;
while (j > 0 && vec[j] < vec[j-1])
{
myswap(vec[j], vec[j-1]);
j--;
}
}
std::cout << "\n\nFinal\n";
for (const auto &i:vec)
{
std::cout << i << "\t";
}
std::cout << "\n";
return 0;
}
The answer is already in the comments to the question (a range-based for loop iterates over the values, not the indices), but for illustration, try this:
std::vector<std::vector<int>> v;
int j;
for(auto const& i : v)
{
j = i;
}
You will quickly discover that this piece of code does not compile; a good compiler will show you an error like this one (from GCC):
error: cannot convert 'const std::vector<int>' to 'int' in assignment
What you now could do would be the following:
std::vector<std::vector<int>> v;
for(auto const& i : v)
{
std::vector<int> const& vv = v; // assign a reference
std::vector<int> vvv = v; // make a COPY(!)
}
Pretty self-explaining, isn't it?
Note that auto const and const auto mean exactly the same thing, so that is not the cause. The problem is that the range-based loop variable holds an element of vec, not an index, while the while loop uses j as an index. Loop over the indices instead:
#include <iostream>
#include <cstdlib>
#include <cstdio>
#include <ctime>
#include <vector>
#include <array>
template <class T>
void myswap(T &a,
T &b)
{
T temp = a;
a = b;
b = temp;
}
int main() {
const int N = 5;
std::vector<int> vec = {112,32,11,4,7};
std::cout << "\nInit\n";
for (const auto &i:vec)
{
std::cout << i << "\t";
}
// i is an index here, so j = i is always a valid position in vec
for (std::size_t i = 0; i < vec.size(); i++)
{
std::size_t j = i;
while (j > 0 && vec[j] < vec[j-1])
{
myswap(vec[j], vec[j-1]);
j--;
}
}
std::cout << "\n\nFinal\n";
for (const auto &i:vec)
{
std::cout << i << "\t";
}
std::cout << "\n";
return 0;
}
From the way you use swap, j is intended to be a valid index into vec. However, i is not an index; it is a value. When you assign j = i in the first iteration, j contains 112, which is clearly out of bounds.
A range-based for loop works by letting i refer to vec[0], vec[1], vec[2], ... in turn, one element per iteration.
Read the documentation on range-based for loops to learn more.
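The two loop styles can be put side by side in a short sketch. The function names insertion_sort and sum_values are made up for this illustration: the first needs positions and therefore uses an index loop, the second only needs the values and can use the range-based form.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Needs indices: j walks backwards through positions in vec.
void insertion_sort(std::vector<int>& vec) {
    for (std::size_t i = 1; i < vec.size(); ++i) {
        for (std::size_t j = i; j > 0 && vec[j] < vec[j - 1]; --j) {
            std::swap(vec[j], vec[j - 1]);
        }
    }
    assert(std::is_sorted(vec.begin(), vec.end()));
}

// Needs only values: x is each element in turn, never an index.
int sum_values(const std::vector<int>& vec) {
    int total = 0;
    for (const auto& x : vec) {
        total += x;
    }
    return total;
}
```

For the vector from the question, {112,32,11,4,7}, insertion_sort leaves {4,7,11,32,112} and sum_values returns 166.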
Can someone please explain why the following code crashes:
int a(int x)
{
int s = 0;
for(int i = 0; i < 100; i++)
s += i;
return s;
}
int main()
{
unsigned int thread_no = 10000;
vector<thread> t(thread_no);
for(int i = 0; i < 10; i++)
t[i] = std::thread(a, 10);
for(thread& t_now : t)
t_now.join();
cout << "OK" << endl;
cin.get();
}
But WORKS with 10 threads? I am new to multithreading and simply don't understand what is happening?!
This creates a vector of 10,000 default-initialized threads:
unsigned int thread_no = 10000;
vector<thread> t(thread_no);
You're running into the difference between "capacity" and "size". You didn't just create a vector large enough to house 10,000 threads, you created a vector of 10,000 threads.
See the following (http://ideone.com/i7LBQ6)
#include <iostream>
#include <vector>
struct Foo {
Foo() { std::cout << "Foo()\n"; }
};
int main() {
std::vector<Foo> f(8);
std::cout << "f.capacity() = " << f.capacity() << ", size() = " << f.size() << '\n';
}
You only initialize 10 of the elements as running threads
for(int i = 0; i < 10; i++)
t[i] = std::thread(a, 10);
So your for loop is going to see 10 initialized threads and then 9,990 un-started threads.
for(thread& t_now : t)
t_now.join();
You might want to try using t.reserve(thread_no); and t.emplace_back(a, 10);
Here's a complete example with renaming.
#include <iostream>
#include <thread>
#include <vector>
int threadFn(int iterations)
{
int s = 0;
for(int i = 0; i < iterations; i++)
s += i;
return s;
}
int main()
{
enum {
MaximumThreadCapacity = 10000,
DesiredInitialThreads = 10,
ThreadLoopIterations = 100,
};
std::vector<std::thread> threads;
// reserve only sets the capacity; the vector still has size 0
threads.reserve(MaximumThreadCapacity);
for(int i = 0; i < DesiredInitialThreads; i++)
threads.emplace_back(threadFn, ThreadLoopIterations);
std::cout << threads.size() << " threads spun up\n";
for(auto& t : threads) {
t.join();
}
std::cout << "threads joined\n";
}
---- EDIT ----
Specifically, the crash you are getting is the attempt to join a non-running thread, http://ideone.com/OuLMyQ
#include <thread>
int main() {
std::thread t;
t.join();
return 0;
}
stderr
terminate called after throwing an instance of 'std::system_error'
what(): Invalid argument
I point this out because you should be aware that a check alone does not help if another thread can join or detach the same std::thread object. If you do
if (t.joinable())
t.join();
then 't' could become non-joinable between the test and the action. The simplest fix is to let only one thread own and join each std::thread; failing that, put the t.join() in a try {} block. See http://en.cppreference.com/w/cpp/thread/thread/join
Complete example:
#include <iostream>
#include <system_error>
#include <thread>
#include <vector>
int threadFn(int iterations)
{
int s = 0;
for(int i = 0; i < iterations; i++)
s += i;
return s;
}
int main()
{
enum {
MaximumThreadCapacity = 10000,
DesiredInitialThreads = 10,
ThreadLoopIterations = 100,
};
std::vector<std::thread> threads;
threads.reserve(MaximumThreadCapacity);
for(int i = 0; i < DesiredInitialThreads; i++)
threads.emplace_back(threadFn, ThreadLoopIterations);
std::cout << threads.size() << " threads spun up\n";
for(auto& t : threads) {
try {
if(t.joinable())
t.join();
} catch (const std::system_error& e) {
// std::error_code cannot be used in a switch, so compare against std::errc instead
if (e.code() == std::errc::invalid_argument ||
e.code() == std::errc::no_such_process)
continue; // the thread was not joinable; skip it
if (e.code() == std::errc::resource_deadlock_would_occur)
std::cerr << "deadlock during join - wth!\n";
else
std::cerr << "error during join: " << e.what() << '\n';
return e.code().value();
}
}
std::cout << "threads joined\n";
}
You create a vector that has 10000 elements in it, populate only the first ten, and then wait for all the threads in the vector to join. Your program crashes because the other 9990 threads were never started.
for(int i = 0; i < 10; i++) // Wrong
for(int i = 0; i < thread_no; i++) // Correct
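If the pre-sized vector is kept and only some of its elements are ever started, another possibility is to join only the threads that are joinable. This is a minimal sketch under the assumption that a single thread owns the vector; the helper name join_started is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Joins every started thread in the vector and returns how many were joined.
// Default-constructed (never started) std::thread objects are skipped.
std::size_t join_started(std::vector<std::thread>& threads) {
    std::size_t joined = 0;
    for (auto& t : threads) {
        if (t.joinable()) {
            t.join();
            assert(!t.joinable());  // a joined thread is no longer joinable
            ++joined;
        }
    }
    return joined;
}
```

With a vector of 10000 elements of which 10 were assigned running threads, join_started returns 10 and leaves the other 9990 untouched. Calling it a second time returns 0.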