Improper usage of mutex c++ - c++

I have a problem with my code. This function pushes a product into a queue.
void producent(bool &cont,std::queue<std::string> &queue,std::mutex &mtx, int &milliseconds)
{
while (cont)
{
mtx.lock();
if (queue.size() >= MAX_QUEUE_SIZE)
{
mtx.unlock();
std::cerr << "buffor full " << std::endl;
}
else
{
std::string product = generate();
std::cerr << "producent: " << product << " " << std::endl;
queue.push(product);
mtx.unlock();
}
std::this_thread::sleep_for(std::chrono::milliseconds(milliseconds));
}
}
This function generates a string of 10 characters, which the producent function pushes into the queue.
std::string generate() {
std::string temp;
temp.resize(10);
for (int i = 0; i < 10; i++) {
temp[i] = rand() % ('z' - 'a' + 1) + 'a';
}
return temp;
}
My question is: why, when I create two threads like this:
std::thread prod(producent, std::ref(wykonuj),std::ref(kolejka), std::ref(kolejka_mtx),std::ref(t));
std::thread prod1(producent, std::ref(wykonuj), std::ref(kolejka), std::ref(kolejka_mtx), std::ref(t));
both of them give me the same result? For example, the output is:
producent: qweasdzxca
producent: qweasdzxca
I wanted the outputs to be different, which is why I used a mutex, but it didn't work. Can someone give me some advice?

rand is not guaranteed to share a seed between threads. On implementations that keep a separate seed per thread (Microsoft's C runtime is one example), every thread starts from the same default seed unless you explicitly set it differently in each thread via srand().
Hence, generate invoked by both threads will produce the same string.
The docs suggest rand_r as the thread-safe version. The C standard does not require rand to be thread-safe, although many modern implementations make both functions thread-safe.

Assuming your implementation has a thread-safe rand() (probably an unwise assumption), both threads are using the same initial random seed (the default of 1, in this case), and thus producing the same sequence. Rather than doing that, embrace the C++ <random> offerings, including its uniform distributions.
#include <algorithm> // std::generate_n
#include <iterator> // std::back_inserter
#include <random>
#include <string>
std::string generate(int n=10)
{
std::mt19937 prng{ std::random_device{}() };
std::uniform_int_distribution<int> dist('a', 'z');
std::string result;
std::generate_n(std::back_inserter(result), n, [&]() { return dist(prng); });
return result;
}
Executed 10x on 10x threads, this produced:
ysudtdcaeq
hwpeyiyyav
dlsdshltyo
pkfafhooxr
nmoxerbqpy
ydauzdvoaj
brjqjgxrgg
ezdsmbhygb
fpdgbkxfut
elywaokbyv
That, or something similar, should produce what you seek.
Note: the above will not work as-expected on platforms where a..z is non-contiguous. If you're on such a beast (typically OS/400 or OS/390 EBCDIC), an alternate solution is required.
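For the non-contiguous case mentioned in the note, one possible alternative is to index into an explicit alphabet instead of relying on the numeric values of 'a' and 'z'. The following is only a sketch: the name generate_portable is made up for illustration, and the thread_local engine is my own assumption about how you might avoid reseeding from std::random_device on every call.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical variant of generate(): picks letters by index from an explicit
// alphabet, so it does not assume that 'a'..'z' have contiguous codes.
std::string generate_portable(std::size_t n = 10)
{
    static const std::string alphabet = "abcdefghijklmnopqrstuvwxyz";
    // One engine per thread, seeded once on first use in that thread.
    thread_local std::mt19937 prng{ std::random_device{}() };
    std::uniform_int_distribution<std::size_t> dist(0, alphabet.size() - 1);

    std::string result;
    result.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        result.push_back(alphabet[dist(prng)]);
    return result;
}
```

Because each thread owns its engine, no lock is needed around the generator itself; the queue still needs its mutex.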

Related

Generating and printing random numbers using threads in C++

I'm currently working on a project which requires the use of threads. However, before tackling the project, I want to create a simple exercise for myself to test my understanding of threads.
What I have are 2x functions; one for infinitely generating random numbers and the other for printing the output of this function.
The value of this random number will be continuously updated via a pointer.
From my understanding, I will need a mutex to prevent undefined behavior when reading and writing values to this pointer. I would also need to detach the random number generator function from the main function.
However, I'm having issues trying to build the project in Visual Studio Code, which I suspect is due to a flaw in my logic.
#include <bits/stdc++.h>
#include <iostream>
#include <thread>
#include <mutex>
std::mutex global_mu;
void generateRandomNum(int min, int max, int *number)
{
while (true) {
global_mu.lock();
std::random_device rd;
std::mt19937 rng(rd());
std::uniform_int_distribution<int> uni(min, max);
*number = uni(rng);
global_mu.unlock();
}
}
int main()
{
int *int_pointer;
int number = 0;
int_pointer = &number;
std::thread t1(generateRandomNum, 0, 3000, int_pointer);
t1.detach();
while(true) {
global_mu.lock();
std::cout << int_pointer << std::endl;
global_mu.unlock();
}
}
This looks wrong:
std::cout << int_pointer << std::endl;
You're trying to print the value of the pointer instead of printing the value of the int variable to which it points. You either should do this:
std::cout << *int_pointer << std::endl;
or this:
std::cout << number << std::endl;
This also looks like it maybe does not do what you want:
while (true) {
...
std::random_device rd;
std::mt19937 rng(rd());
std::uniform_int_distribution<int> uni(min, max);
*number = uni(rng);
...
}
You are constructing and initializing a new random number generator for each iteration of the loop. You probably should move the PRNG out of the loop:
std::random_device rd;
std::mt19937 rng(rd());
std::uniform_int_distribution<int> uni(min, max);
while (true) {
...
*number = uni(rng);
...
}
Finally, you probably should not ever do this:
while(true) {
global_mu.lock();
...
global_mu.unlock();
}
What's the very next thing that the thread does after it calls unlock()? The next thing it does is, it re-locks the same mutex again.
I don't want to get too technical, but the problem in this situation is that the thread most likely to acquire the mutex is the one that just released it, not the one that has been waiting for a long time. Whichever thread acquires the mutex first is going to starve the other thread.
The way out of the starvation problem is to hold the mutex for the least amount of time necessary. For example:
void generateRandomNum(int min, int max, int *number)
{
std::random_device rd;
std::mt19937 rng(rd());
std::uniform_int_distribution<int> uni(min, max);
while (true) {
int temp = uni(rng);
global_mu.lock();
*number = temp;
global_mu.unlock();
}
}
int main()
{
int *int_pointer;
int number = 0;
int_pointer = &number;
std::thread t1(generateRandomNum, 0, 3000, int_pointer);
t1.detach();
while(true) {
int temp;
global_mu.lock();
temp = number;
global_mu.unlock();
std::cout << temp << std::endl;
}
}
If this feels like you're writing a lot of extra lines, you're right. Multi-threading is hard to get right. And, in order to get high performance from a multi-threaded program, you are going to have to write extra lines of code, and maybe even make the program do more work per CPU than a single threaded program would do.
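The same idea can be written with std::lock_guard, so the unlock can't be forgotten. This is a rough sketch rather than a drop-in replacement: the stop flag, the helper names and the joinable thread are my own additions (the original detaches the thread and loops forever), and the shared value starts at 0, so the range check in the usage below assumes min is 0.

```cpp
#include <atomic>
#include <mutex>
#include <random>
#include <thread>

std::mutex number_mu;                     // protects shared_number
int shared_number = 0;
std::atomic<bool> stop_requested{false};  // hypothetical stop flag

void generate_numbers(int min, int max)
{
    std::random_device rd;
    std::mt19937 rng(rd());
    std::uniform_int_distribution<int> uni(min, max);
    while (!stop_requested) {
        int temp = uni(rng);  // do the work outside the lock
        std::lock_guard<std::mutex> guard(number_mu);
        shared_number = temp; // guard unlocks at the end of this iteration
    }
}

int read_number()
{
    std::lock_guard<std::mutex> guard(number_mu);
    return shared_number;
}

// Starts the generator, samples it a few times, then stops and joins it.
int sample_last_of(int samples, int min, int max)
{
    stop_requested = false;
    std::thread t(generate_numbers, min, max);
    int last = min;
    for (int i = 0; i < samples; ++i)
        last = read_number();
    stop_requested = true;
    t.join();
    return last;
}
```

Calling sample_last_of(100, 0, 3000) returns some value between 0 and 3000; joining instead of detaching means the program can exit cleanly.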

Threading returns unexpected result - c++

I'm learning about threads for homework, and I've tried to implement threading on a simple program I've made. Without threading the program works perfectly, but when I thread the two random number generator functions, it returns incorrect results. The result always seems to be '42' for both number generators, not sure why this would be the case.
Also for context, I'm just starting with threads so I understand this program doesn't need multithreading. I'm doing it just for learning purposes.
Thanks for any help!
// struct for vector to use
struct readings {
std::string name;
int data;
};
// random generator for heat value - stores in vector of struct
void gen_heat(std::vector<readings>& storage) {
readings h = {"Heat", rand() % 100 + 1};
storage.insert(storage.begin(), h);
}
// random generator for light value - stores in vector of struct
void gen_light(std::vector<readings>& storage) {
readings l = {"Light", rand() % 100 + 1};
storage.insert(storage.begin(), l);
}
int main() {
// vector of readings struct
std::vector<readings> storage;
srand(time(NULL));
// initialising threads of random generators
std::thread H(gen_heat, std::ref(storage));
std::thread L(gen_light, std::ref(storage));
// waiting for both to finish
H.join();
L.join();
// print values in vec of struct
for (const auto& e : storage) {
std::cout << "Type: " << e.name << std::endl
<< "Numbers: " << e.data << std::endl;
}
// send to another function
smartsensor(storage);
return 0;
}
Since you have several threads accessing a shared resource, in this case the vector of readings, and some of them are modifying it, you need to make access to that resource exclusive. There are many ways of synchronizing the access; one of them, simple enough and without resorting to mutexes, is a binary semaphore (since C++20). You basically:
own the access to the resource by acquiring the semaphore,
use the resource, and then,
release the semaphore so others can access the resource.
If a thread A tries to acquire the semaphore while another thread B is using the resource, thread A will block until the resource is freed.
Notice the semaphore is initialized to 1, indicating the resource is free. Once a thread acquires the semaphore, the count goes down to 0, and no other thread can acquire it until the count goes back to 1 (which happens after a release).
#include <cstdlib> // rand, srand
#include <ctime> // time
#include <iostream> // cout
#include <semaphore>
#include <string>
#include <thread>
#include <vector>
std::binary_semaphore readings_sem{1};
// struct for vector to use
struct readings {
std::string name;
int data;
};
// random generator for heat value - stores in vector of struct
void gen_heat(std::vector<readings>& storage) {
for (auto i{0}; i < 5; ++i) {
readings_sem.acquire();
readings h = {"Heat", rand() % 100 + 1};
storage.insert(storage.begin(), h);
readings_sem.release();
}
}
// random generator for light value - stores in vector of struct
void gen_light(std::vector<readings>& storage) {
for (auto i{0}; i < 5; ++i) {
readings_sem.acquire();
readings l = {"Light", rand() % 100 + 1};
storage.insert(storage.begin(), l);
readings_sem.release();
}
}
int main() {
// vector of readings struct
std::vector<readings> storage;
srand(time(NULL));
// initialising threads of random generators
std::thread H(gen_heat, std::ref(storage));
std::thread L(gen_light, std::ref(storage));
// waiting for both to finish
H.join();
L.join();
// print values in vec of struct
for (const auto& e : storage) {
std::cout << "Type: " << e.name << std::endl
<< "Numbers: " << e.data << std::endl;
}
}
// Outputs (something like):
//
// Type: Heat
// Numbers: 5
// Type: Light
// Numbers: 83
// Type: Light
// Numbers: 40
// ...
[Update on Ben Voigt's comment]
The acquisition and release of the resource can be encapsulated using RAII (Resource Acquisition Is Initialization), which the standard library already supports through lock guards. For example:
Both threads still try and acquire a mutex to get access to the vector of readings resource.
But they acquire it by just creating a lock guard.
Once the lock guard goes out of scope and is destroyed, the mutex is released.
#include <mutex> // lock_guard
std::mutex mtx{};
// random generator for heat value - stores in vector of struct
void gen_heat(std::vector<readings>& storage) {
for (auto i{0}; i < 5; ++i) {
std::lock_guard<std::mutex> lg{ mtx };
readings h = {"Heat", rand() % 100 + 1};
storage.insert(storage.begin(), h);
}
}
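If you also want to stop depending on how rand() behaves across threads, the lock guard approach combines naturally with a per-thread engine from <random>. The sketch below is an assumption about how that could look, not the original demo: gen_readings and collect_readings are hypothetical names, and a single helper stands in for both gen_heat and gen_light.

```cpp
#include <functional>
#include <mutex>
#include <random>
#include <string>
#include <thread>
#include <vector>

struct readings {
    std::string name;
    int data;
};

std::mutex storage_mtx;  // protects the shared vector

// Hypothetical helper: inserts `count` readings named `name`, each in 1..100.
void gen_readings(std::vector<readings>& storage, const std::string& name, int count)
{
    // Engine local to this thread, so no generator state is shared.
    std::mt19937 rng{ std::random_device{}() };
    std::uniform_int_distribution<int> dist(1, 100);
    for (int i = 0; i < count; ++i) {
        int value = dist(rng);  // generate outside the lock
        std::lock_guard<std::mutex> lg{ storage_mtx };
        storage.insert(storage.begin(), readings{name, value});
    }  // lg is destroyed here, releasing the mutex
}

std::vector<readings> collect_readings(int per_sensor)
{
    std::vector<readings> storage;
    std::thread H(gen_readings, std::ref(storage), std::string("Heat"), per_sensor);
    std::thread L(gen_readings, std::ref(storage), std::string("Light"), per_sensor);
    H.join();
    L.join();
    return storage;
}
```

With per_sensor set to 5 this yields ten readings in total, interleaved in whatever order the threads happened to run.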

If statement passes only when preceded by debug cout line (multi-threading in C)

I created this code to use for solving CPU intensive tasks real-time and potentially as a base for a game engine in the future. For it I created a system where there is an array of ints each thread modifies to signal whether they are done with their current task.
The problem occurs when running it with more than 4 threads. When using 6 threads or more, the "if (threadone_private == threadcount)" stops working UNLESS I add this debug line "cout << threadone_private << endl;" before it.
I cannot comprehend why this debug line makes any difference on whether the if conditional functions as expected, neither why it works without it when using 4 threads or less.
For this code I'm using:
#include <GL/glew.h>
#include <GLFW/glfw3.h>
#include <iostream>
#include <thread>
#include <atomic>
#include <vector>
#include <string>
#include <fstream>
#include <sstream>
using namespace std;
Right now this code only counts up to 60 trillion, in asynchronous steps of 3 billion, really fast.
Here are the relevant parts of the code:
int thread_done[6] = { 0,0,0,0,0,0 };
atomic<long long int> testvar1 = 0;
atomic<long long int> testvar2 = 0;
atomic<long long int> testvar3 = 0;
atomic<long long int> testvar4 = 0;
atomic<long long int> testvar5 = 0;
atomic<long long int> testvar6 = 0;
void task1(long long int testvar, int thread_number)
{
int continue_work = 1;
for (; ; ) {
while (continue_work == 1) {
for (int i = 1; i < 3000000001; i++) {
testvar++;
}
thread_done[thread_number] = 1;
if (thread_number==0) {
testvar1 = testvar;
}
if (thread_number == 1) {
testvar2 = testvar;
}
if (thread_number == 2) {
testvar3 = testvar;
}
if (thread_number == 3) {
testvar4 = testvar;
}
if (thread_number == 4) {
testvar5 = testvar;
}
if (thread_number == 5) {
testvar6 = testvar;
}
continue_work = 0;
}
if (thread_done[thread_number] == 0) {
continue_work = 1;
}
}
}
And here is the relevant part of the main thread:
int main() {
long long int testvar = 0;
int threadcount = 6;
int threadone_private = 0;
thread thread_1(task1, testvar, 0);
thread thread_2(task1, testvar, 1);
thread thread_3(task1, testvar, 2);
thread thread_4(task1, testvar, 3);
thread thread_5(task1, testvar, 4);
thread thread_6(task1, testvar, 5);
for (; ; ) {
if (threadcount == 0) {
for (int i = 1; i < 3000001; i++) {
testvar++;
}
cout << testvar << endl;
}
else {
while (testvar < 60000000000000) {
threadone_private = thread_done[0] + thread_done[1] + thread_done[2] + thread_done[3] + thread_done[4] + thread_done[5];
cout << threadone_private << endl;
if (threadone_private == threadcount) {
testvar = testvar1 + testvar2 + testvar3 + testvar4 + testvar5 + testvar6;
cout << testvar << endl;
thread_done[0] = 0;
thread_done[1] = 0;
thread_done[2] = 0;
thread_done[3] = 0;
thread_done[4] = 0;
thread_done[5] = 0;
}
}
}
}
}
I expected that since each worker thread only modifies one int in the thread_done array, and since the main thread only ever reads it until all worker threads are waiting, this if (threadone_private == threadcount) should be bulletproof. Apparently I'm missing something important that goes wrong whenever I change this:
threadone_private = thread_done[0] + thread_done[1] + thread_done[2] + thread_done[3] + thread_done[4] + thread_done[5];
cout << threadone_private << endl;
if (threadone_private == threadcount) {
To this:
threadone_private = thread_done[0] + thread_done[1] + thread_done[2] + thread_done[3] + thread_done[4] + thread_done[5];
//cout << threadone_private << endl;
if (threadone_private == threadcount) {
Disclaimer: Concurrent code is quite complicated and easy to get wrong, so it's generally a good idea to use higher level abstractions. There are a whole lot of details that are easy to get wrong without ever noticing. You should think very carefully about doing such low-level programming if you're not an expert. Sadly C++ lacks good built-in high level concurrent constructs, but there are libraries out there that handle this.
It's unclear to me what the whole code is supposed to do. As far as I can see, whether the code ever stops relies purely on timing, even if you did the synchronization correctly, and that is completely non-deterministic. Your threads could execute in such a way that the thread_done entries are never all set at the same time.
But apart from that there is at least one correctness issue: You're reading and writing to int thread_done[6] = { 0,0,0,0,0,0 }; without synchronization. This is undefined behavior so the compiler can do what it wants.
What probably happens is that the compiler sees that it can cache the values it reads from thread_done (and therefore threadone_private), since the loop does not write to them while the condition is false, so the values cannot (legally) change. The external call to std::cout means it can't be sure that the values aren't changed behind its back, so it has to read them anew each iteration (std::cout also uses locks in most implementations, which causes synchronization and again limits what the compiler can assume).
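One way to make the done flags well-defined is to turn them into atomics, so every read in the polling loop really is a fresh read. This is a minimal sketch under my own assumptions: the names done_flags, worker, count_done and run_round are invented for illustration, the worker body is a placeholder, and the threads are created per round just to keep the example short.

```cpp
#include <array>
#include <atomic>
#include <thread>
#include <vector>

constexpr int kThreads = 6;                         // assumed worker count
std::array<std::atomic<int>, kThreads> done_flags;  // static storage, starts zeroed

void worker(int id)
{
    // ... the real work would go here ...
    done_flags[id].store(1);  // publish "done" to other threads
}

int count_done()
{
    int sum = 0;
    for (auto& f : done_flags) sum += f.load();  // atomic loads cannot be cached away
    return sum;
}

// Runs one round: resets the flags, starts the workers, polls until all report.
int run_round()
{
    for (auto& f : done_flags) f.store(0);
    std::vector<std::thread> threads;
    for (int i = 0; i < kThreads; ++i) threads.emplace_back(worker, i);
    while (count_done() != kThreads)
        std::this_thread::yield();  // still polling, but without a data race
    for (auto& t : threads) t.join();
    return count_done();
}
```

This removes the undefined behavior, though polling still burns CPU; blocking on a condition variable is usually kinder to the machine.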
I cannot see any std::mutex, std::condition_variable or variants of std::lock in your code. Sharing plain (non-atomic) data between threads without any of those will never succeed reliably, because whenever multiple threads access the same data and at least one of them modifies it, you need to make sure only one thread (including your main thread) has access to that data at any given time.
Edit: I noticed you use atomic. I do not have any experience with this, however I know using mutexes works reliably.
Therefore, you need to lock every access (read or write) to that data with a mutex like this:
//somewhere
std::mutex myMutex;
std::condition_variable myCondition;
int workersDone = 0;
/* main thread */
createWorkerThread1();
createWorkerThread2();
{
std::unique_lock<std::mutex> lock(myMutex); //waits until mutex is locked.
while(workersDone != 2) {
myCondition.wait(lock); //the mutex is unlocked while waiting
}
std::cout << "the data is ready now" << std::endl;
} //the lock is destroyed, unlocking the mutex
/* Worker thread */
while(true) {
{
std::unique_lock<std::mutex> lock(myMutex); //waits until mutex is locked
if(read_or_modify_a_piece_of_shared_data() == DATA_FINISHED) {
break; //lock leaves the scope, unlocks the mutex
}
}
prepare_everything_for_the_next_piece_of_shared_data(); //DO NOT access data here
}
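//data is processed
{
std::lock_guard<std::mutex> lock(myMutex); //workersDone is shared, so modify it under the mutex
++workersDone;
}
myCondition.notify_one(); //no mutex needed for the notify itself. This wakes up the waiting thread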
I hope this gives you an idea on how to use mutexes and condition variables to gain thread safety.
Disclaimer: 100% pseudo code ;)
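For reference, here is roughly how the pseudo code above could look as compilable C++. Treat it as a sketch: the names worker_task and wait_for_workers are made up, the worker body is a placeholder, and the counter is incremented under the mutex so that the waiting thread never misses an update.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

std::mutex done_mutex;
std::condition_variable done_cv;
int workers_done = 0;  // protected by done_mutex

void worker_task()
{
    // ... process data here, without holding the mutex ...
    {
        std::lock_guard<std::mutex> lock(done_mutex);
        ++workers_done;  // shared counter, so modify it under the mutex
    }
    done_cv.notify_one();  // wake the waiting thread so it re-checks the count
}

// Starts worker_count workers and blocks until all of them have reported.
int wait_for_workers(int worker_count)
{
    {
        std::lock_guard<std::mutex> lock(done_mutex);
        workers_done = 0;
    }
    std::vector<std::thread> workers;
    for (int i = 0; i < worker_count; ++i) workers.emplace_back(worker_task);
    int seen = 0;
    {
        std::unique_lock<std::mutex> lock(done_mutex);
        // The predicate form handles spurious wakeups; the mutex is released while waiting.
        done_cv.wait(lock, [&] { return workers_done == worker_count; });
        seen = workers_done;
    }
    for (auto& w : workers) w.join();
    return seen;
}
```

The predicate overload of wait is equivalent to the while loop in the pseudo code, just shorter.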

C++ Reusing a vector of threads that call the same function

I would like to reuse a vector of threads that call the same function several times with different parameters. There is no writing (with the exception of an atomic parameter), so no need for a mutex. To depict the idea, I created a basic example of a parallelized code that finds the maximum value of a vector. There are clearly better ways to find the max of a vector, but for the sake of the explanation and to avoid getting into further details of the real code I am writing, I am going with this silly example.
The code finds the maximum number of a vector by calling a function pFind that checks whether the vector contains the number k (k is initialized with an upper bound). If it does, the execution stops, otherwise k is reduced by one and the process repeats.
The code below generates a vector of threads that parallelize the search for k in the vector. The issue is that, for every value of k, the vector of threads is regenerated and each time the new threads are joined.
Generating the vector of threads and joining them every time comes with an overhead that I want to avoid.
I am wondering if there is a way of generating a vector (a pool) of threads only once and reusing them for the new executions. Any other speedup tips would be appreciated.
void pFind(
vector<int>& a,
int n,
std::atomic<bool>& flag,
int k,
int numTh,
int val
) {
int i = k;
while (i < n) {
if (a[i] == val) {
flag = true;
break;
} else
i += numTh;
}
}
int main() {
std::atomic<bool> flag;
flag = false;
int numTh = 8;
int val = 1000;
int pos = 0;
while (!flag) {
vector<thread>threads;
for (int i = 0; i < numTh; i++){
thread th(&pFind, std::ref(a), size, std::ref(flag), i, numTh, val);
threads.push_back(std::move(th));
}
for (thread& th : threads)
th.join();
if (flag)
break;
val--;
}
cout << val << "\n";
return 0;
}
There is no way to assign a different execution function (closure) to a std::thread after construction. This is generally true of all thread abstractions, though often implementations try to memoize or cache lower-level abstractions internally to make thread fork and join fast so just constructing new threads is viable. There is a debate in systems programming circles about whether creating a new thread should be incredibly lightweight or whether clients should be written to not fork threads as frequently. (Given this has been ongoing for a very long time, it should be clear there are a lot of tradeoffs involved.)
There are a lot of other abstractions which try to do what you really want. They have names such as "threadpools," "task executors" (or just "executors"), and "futures." All of them tend to map onto threads by creating some set of threads, often related to the number of hardware cores in the system, and then having each of those threads loop and look for requests.
As the comments indicated, the main way you would do this yourself is to have threads with a top-level loop that accepts execution requests, processes them, and then posts the results. To do this you will need to use other synchronization methods such as mutexes and condition variables. It is generally faster to do things this way if there are a lot of requests and requests are not incredibly large.
As much as standard C++ concurrency support is a good thing, it is also rather significantly lacking for real world high performance work. Something like Intel's TBB is far more of an industrial strength solution.
By piecing together some code from different online searches, I got the following to work, but it is not as fast as the approach that regenerates the threads at each iteration of the while loop.
Perhaps someone can comment on this approach.
The following class describes the thread pool
class ThreadPool {
public:
ThreadPool(int threads) : shutdown_(false){
threads_.reserve(threads);
for (int i = 0; i < threads; ++i)
threads_.emplace_back(std::bind(&ThreadPool::threadEntry, this, i));
}
~ThreadPool(){
{
// Unblock any threads and tell them to stop
std::unique_lock<std::mutex>l(lock_);
shutdown_ = true;
condVar_.notify_all();
}
// Wait for all threads to stop
std::cerr << "Joining threads" << std::endl;
for (auto & thread : threads_) thread.join();
}
void doJob(std::function<void(void)>func){
// Place a job on the queue and unblock a thread
std::unique_lock<std::mutex>l(lock_);
jobs_.emplace(std::move(func));
condVar_.notify_one();
}
void threadEntry(int i){
std::function<void(void)>job;
while (1){
{
std::unique_lock<std::mutex>l(lock_);
while (!shutdown_ && jobs_.empty()) condVar_.wait(l);
if (jobs_.empty()){
// No jobs to do and we are shutting down
std::cerr << "Thread " << i << " terminates" << std::endl;
return;
}
std::cerr << "Thread " << i << " does a job" << std::endl;
job = std::move(jobs_.front());
jobs_.pop();
}
// Do the job without holding any locks
job();
}
}
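private:
// Member declarations were missing from the snippet; these types are inferred from how the members are used above
std::mutex lock_;
std::condition_variable condVar_;
bool shutdown_;
std::queue<std::function<void(void)>> jobs_;
std::vector<std::thread> threads_;
};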
Here is the rest of the code
void pFind(
vector<int>& a,
int n,
std::atomic<bool>& flag,
int k,
int numTh,
int val,
std::atomic<int>& completed) {
int i = k;
while (i < n) {
if (a[i] == val) {
flag = true;
break;
} else
i += numTh;
}
completed++;
}
int main() {
std::atomic<bool> flag;
flag = false;
int numTh = 8;
int val = 1000;
int pos = 0;
std::atomic<int> completed;
completed=0;
ThreadPool p(numTh);
while (!flag) {
for (int i = 0; i < numTh; i++) {
p.doJob(std::bind(pFind, std::ref(a), size, std::ref(flag), i, numTh, val, std::ref(completed)));
}
while (completed < numTh) {}
if (flag) {
break;
} else {
completed = 0;
val--;
}
}
cout << val << "\n";
return 0;
}
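One likely reason this is slow is the empty while (completed < numTh) {} loop, which keeps a core busy while the pool threads are trying to run. A possible replacement is a small counter that blocks on a condition variable. This is a sketch under assumptions: CompletionCounter and run_rounds are hypothetical names, not part of the ThreadPool above, and plain threads stand in for pool jobs to keep the example self-contained.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: lets one thread sleep until `expected` jobs have reported.
class CompletionCounter {
public:
    void job_done() {
        {
            std::lock_guard<std::mutex> l(m_);
            ++count_;
        }
        cv_.notify_all();
    }
    // Blocks until `expected` jobs have reported, then resets for the next round.
    void wait_and_reset(int expected) {
        std::unique_lock<std::mutex> l(m_);
        cv_.wait(l, [&] { return count_ >= expected; });
        count_ = 0;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_ = 0;
};

// Demo: each round starts some jobs and sleeps until all of them have finished.
int run_rounds(int rounds, int jobs_per_round)
{
    CompletionCounter counter;
    int completed_rounds = 0;
    for (int r = 0; r < rounds; ++r) {
        std::vector<std::thread> jobs;
        for (int j = 0; j < jobs_per_round; ++j)
            jobs.emplace_back([&counter] { counter.job_done(); });
        counter.wait_and_reset(jobs_per_round);  // no spinning here
        ++completed_rounds;
        for (auto& t : jobs) t.join();
    }
    return completed_rounds;
}
```

In the pool version, each job would call job_done() as its last step (in place of completed++), and main would call wait_and_reset(numTh) instead of spinning.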
Your code has a race condition if flag is a plain bool: bool is not an atomic type and is therefore not safe for multiple threads to write to concurrently. You need to use std::atomic_bool (that is, std::atomic<bool>) or std::atomic_flag.
To answer your question, you're recreating the threads vector each iteration of the loop, which you can avoid by moving its declaration outside the loop body. Reusing the threads themselves is a much more complex topic that's hard to get right or describe concisely.
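vector<thread> threads;
threads.reserve(numTh);
while (!flag) {
for (int i = 0; i < numTh; ++i)
// std::ref is required: std::thread copies its arguments, and std::atomic is not copyable
threads.emplace_back(pFind, std::ref(a), size, std::ref(flag), i, numTh, val);
for (auto &th : threads)
th.join();
threads.clear();
if (!flag)
val--; // not found: try the next lower value
}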

Parallel execution doesn't update my variable

I want to write a program where random numbers are created and I keep track of the greatest of them. Two threads are going to run in parallel. However, my best variable is stuck at its initial value. Why?
[EDIT]
I updated the code after Joachim's answer, but I am not getting the correct answer at every run! What am I missing?
#include <iostream> // std::cout
#include <thread> // std::thread
#include <mutex> // std::mutex
#include <random>
std::default_random_engine generator((unsigned int)time(0));
int random(int n) {
std::uniform_int_distribution<int> distribution(0, n);
return distribution(generator);
}
std::mutex mtx; // mutex for critical section
void update_cur_best(int& cur_best, int a, int b) {
// critical section (exclusive access to std::cout signaled by locking mtx):
mtx.lock();
if(a > b)
cur_best = a;
else
cur_best = b;
mtx.unlock();
}
void run(int max, int& best) {
for(int i = 0; i < 15; ++i) {
int a = random(max); int b = random(max);
update_cur_best(best, a, b);
mtx.lock();
std::cout << "|" << a << "| |" << b << "|" << std::endl;
mtx.unlock();
}
}
int main ()
{
int best = 0;
std::thread th1 (run, 100, std::ref(best));
std::thread th2 (run, 100, std::ref(best));
th1.join();
th2.join();
std::cout << "best = " << best << std::endl;
return 0;
}
Sample output:
|4| |21|
|80| |75|
|93| |95|
|4| |28|
|52| |92|
|96| |12|
|83| |8|
|4| |33|
|28| |35|
|59| |52|
|20| |73|
|60| |96|
|61| |34|
|67| |79|
|67| |95|
|54| |57|
|20| |75|
|40| |30|
|16| |32|
|25| |100|
|33| |36|
|69| |26|
|94| |46|
|15| |57|
|50| |68|
|9| |56|
|46| |70|
|65| |65|
|76| |73|
|16| |29|
best = 29
I am getting 29, which is not the maximum!
As an answer to the updated question: in update_cur_best, the value of best is overwritten on each iteration. In the end, its value will simply be the greater of the most recent a, b pair generated. What you want to do is update it only when the current a or b is greater than best. (I'm not sure why you generate two random values on each iteration.)
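A version of update_cur_best that only ever raises the stored value could look like the sketch below. The lock guard and the helper names feed and best_of_two_threads are my own choices, and the demo feeds fixed values instead of random ones so the result is predictable.

```cpp
#include <algorithm>
#include <functional>
#include <mutex>
#include <thread>

std::mutex best_mtx;  // protects the shared best value

void update_cur_best(int& cur_best, int a, int b)
{
    std::lock_guard<std::mutex> lock(best_mtx);
    // Keep the old value unless a or b beats it.
    cur_best = std::max({cur_best, a, b});
}

// Hypothetical stand-in for run(): feeds 15 deterministic pairs.
void feed(int& best, int start)
{
    for (int i = 0; i < 15; ++i)
        update_cur_best(best, start + i, start - i);
}

int best_of_two_threads()
{
    int best = 0;
    std::thread th1(feed, std::ref(best), 10);  // largest value fed: 24
    std::thread th2(feed, std::ref(best), 50);  // largest value fed: 64
    th1.join();
    th2.join();
    return best;
}
```

Here best_of_two_threads() returns 64 regardless of how the two threads interleave, because a smaller pair can no longer overwrite a larger stored value.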
It's because you can't pass references directly to the thread constructor: the arguments are copied, and it's those copies that are passed to your thread function. You have to use std::ref to wrap the reference.
E.g.
std::thread th1 (run, 100, std::ref(best));
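As a small self-contained illustration of the std::ref point (add_ten and run_with_ref are hypothetical names, not from the question):

```cpp
#include <functional>
#include <thread>

void add_ten(int& target) { target += 10; }

int run_with_ref()
{
    int value = 1;
    // std::ref wraps `value` so the thread function receives a real reference.
    std::thread th(add_ten, std::ref(value));
    th.join();  // join before reading value, so the write is visible
    return value;
}
```

After the join, run_with_ref() returns 11, showing that the thread modified the caller's variable rather than a copy.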