pthread_cond_wait strange behaviour - c++

I'm trying to solve the Dining Philosophers problem in C++.
The code is compiled with g++ -lpthread.
The entire solution is in my philosophers GitHub repository. The repository contains two cpp files: main.cpp and philosopher.cpp. main.cpp creates a mutex variable, a semaphore, 5 condition variables, 5 forks, and starts the philosophers. The semaphore is used only to synchronize the start of the philosophers. The other parameters are passed to the philosophers to solve the problem. philosopher.cpp contains the solution for the given problem, but after a few steps a deadlock occurs.
The deadlock occurs when philosopher 0 is eating and philosopher 1 (next to him) wants to take his forks. Philosopher 1 then takes the mutex and won't give it back until philosopher 0 puts his forks down. Philosopher 0 can't put his forks down because of the taken mutex, so we have a deadlock. The problem is in the Philosopher::take_fork method: the call to pthread_cond_wait(a, b) isn't releasing mutex b, and I can't figure out why.
// Taking a fork. If either the left or right fork is taken, wait.
void Philosopher::take_fork(){
    pthread_mutex_lock(&mon);
    std::cout << "Philosopher " << id << " is waiting on forks" << std::endl;
    while(!fork[id] || !fork[(id + 1) % N])
        pthread_cond_wait(cond + id, &mon);
    fork[id] = fork[(id + 1) % N] = false;
    std::cout << "Philosopher " << id << " is eating" << std::endl;
    pthread_mutex_unlock(&mon);
}
Please refer to this code for the rest.

Your call to pthread_cond_wait() is fine, so the problem must be elsewhere. You have three bugs that I can see:
Firstly, in main() you are only initialising the first condition variable in the array. You need to initialise all N condition variables:
for(int i = 0; i < N; i++) {
    fork[i] = true;
    pthread_cond_init(&cond[i], NULL);
}
pthread_mutex_init(&mon, NULL);
Secondly, in put_fork() you have an incorrect calculation for one of the condition variables to signal:
pthread_cond_signal(cond + (id-1)%N); /* incorrect */
When id is equal to zero, (id - 1) % N is equal to -1, so this will try to signal cond - 1, which does not point at a condition variable (it's possible that this pointer actually corrupts your mutex, since it might well be placed directly before cond on the stack). The calculation you actually want is:
pthread_cond_signal(cond + (id + N - 1) % N);
The third bug isn't the cause of your deadlock, but you shouldn't call srand(time(NULL)) every time you call rand() - just call it once, at the start of main().
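A minimal sketch of that pattern (the philosopher setup is elided; only the placement of the seeding matters here):
#include <cstdlib>
#include <ctime>
int main() {
    srand(time(NULL)); // seed the random generator exactly once
    // ... create the mutex, condition variables and philosopher threads as before;
    // the philosophers then call rand() without ever reseeding.
}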

Related

OpenMP integer copied after tasks finish

I do not know if this is documented anywhere (if so, I would love a reference to it), but I have found some unexpected behaviour when using OpenMP. I have a simple program below to illustrate the issue. Here, in point form, is what I expect the program to do:
I want to have 2 threads
They both share an integer
The first thread increments the integer
The second thread reads the integer
After incrementing once, an external process must tell the first thread to continue incrementing (via a mutex lock)
The second thread is in charge of unlocking this mutex
As you will see, the counter which is shared between the threads is not updated properly for the second thread. However, if I turn the counter into an integer reference instead, I get the expected result. Here is a simple code example:
#include <mutex>
#include <thread>
#include <chrono>
#include <iostream>
#include <omp.h>

using namespace std;
using std::this_thread::sleep_for;
using std::chrono::milliseconds;

const int sleep_amount = 2000;

int main() {
    int counter = 0; // if I comment this and uncomment the 2 lines below, I get the expected results
    /* int c = 0; */
    /* int &counter = c; */

    omp_lock_t mut;
    omp_init_lock(&mut);

    int counter_1, counter_2;

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task default(shared)
        // The first task just increments the counter 3 times
        {
            while (counter < 3) {
                omp_set_lock(&mut);
                counter += 1;
                cout << "increasing: " << counter << endl;
            }
        }

        #pragma omp task default(shared)
        {
            sleep_for(milliseconds(sleep_amount));
            // While sleeping, counter is increased to 1 in the first task
            counter_1 = counter;
            cout << "counter_1: " << counter << endl;
            omp_unset_lock(&mut);

            sleep_for(milliseconds(sleep_amount));
            // While sleeping, counter is increased to 2 in the first task
            counter_2 = counter;
            cout << "counter_2: " << counter << endl;
            omp_unset_lock(&mut);
            // Release one last time to increment the counter to 3
        }
    }

    omp_destroy_lock(&mut);

    cout << "expected: 1, actual: " << counter_1 << endl;
    cout << "expected: 2, actual: " << counter_2 << endl;
    cout << "expected: 3, actual: " << counter << endl;
}
Here is my output:
increasing: 1
counter_1: 0
increasing: 2
counter_2: 0
increasing: 3
expected: 1, actual: 0
expected: 2, actual: 0
expected: 3, actual: 3
gcc version: 9.4.0
Additional discoveries:
If I use OpenMP 'sections' instead of 'tasks', I get the expected result as well. The problem seems to be with 'tasks' specifically.
If I use POSIX semaphores, this problem also persists.
It is not permitted to unlock a mutex from another thread; doing so causes undefined behaviour. The general solution in this case is to use semaphores. Wait conditions can also help (for real-world use cases). To quote the OpenMP documentation (note that this constraint is shared by nearly all mutex implementations, including pthreads):
A program that accesses a lock that is not in the locked state or that is not owned by the task that contains the call through either routine is non-conforming.
A program that accesses a lock that is not in the uninitialized state through either routine is non-conforming.
Moreover, the two tasks can be executed on the same thread or on different threads. You should not assume anything about their scheduling unless you tell OpenMP to do so with dependencies. Here, it is completely compliant for a runtime to execute the tasks serially. You need OpenMP sections to make multiple threads execute different blocks of code. Besides, it is generally considered bad practice to use locks in tasks, as the runtime scheduler is not aware of them.
Finally, you do not need a lock in this case: an atomic operation is sufficient. Fortunately, OpenMP supports atomic operations (as does C++).
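For instance, a minimal sketch of an atomic increment and an atomic read with OpenMP (illustrative, not a rewrite of the full program above):
#include <omp.h>
#include <cstdio>
int main() {
    int counter = 0;
    #pragma omp parallel num_threads(2)
    {
        #pragma omp atomic
        counter += 1;       // atomic read-modify-write, no lock required

        int snapshot;
        #pragma omp atomic read
        snapshot = counter; // atomic read of the shared counter
        std::printf("snapshot: %d\n", snapshot);
    }
    std::printf("final: %d\n", counter); // always 2 with two threads
}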
Additional notes
Note that locks guarantee the consistency of memory accesses between threads thanks to memory barriers. Indeed, an unlock operation on a mutex causes a release memory barrier that makes writes visible to other threads. A lock taken by another thread performs an acquire memory barrier that forces reads to happen after the lock. When locks/unlocks are not used correctly, memory accesses are no longer safe, and this can cause, for example, a variable not appearing updated from another thread's point of view. More generally, this tends to create race conditions. So, put shortly: don't do that.
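To illustrate the release/acquire pairing in plain C++ (a minimal sketch, independent of the OpenMP program above):
#include <atomic>
#include <iostream>
#include <thread>
int data = 0;
std::atomic<bool> ready{false};
int main() {
    std::thread producer([] {
        data = 42;                                     // plain write
        ready.store(true, std::memory_order_release);  // release: publishes the write above
    });
    std::thread consumer([] {
        while (!ready.load(std::memory_order_acquire)) // acquire: pairs with the release
            ;                                          // spin until the flag is published
        std::cout << data << std::endl;                // guaranteed to print 42
    });
    producer.join();
    consumer.join();
}
Unlocking a mutex that another thread locked breaks exactly this pairing, which is why the behaviour is undefined.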

How to achieve that a sub-thread in multithreading ends first, and the primary thread continues to execute

I am trying to implement the following: the primary thread creates multiple sub-threads and blocks, and is woken up to continue execution when any of the sub-threads ends.
The following code is my attempt to use std::future in C++11:
std::pair<size_t, size_t> fun(size_t i, size_t j)
{
    std::this_thread::sleep_for(std::chrono::seconds(i * j));
    return { i, j };
}

int main()
{
    std::shared_future<std::pair<size_t, size_t>> ret;
    std::pair<size_t, size_t> temp;
    ret = std::async(fun, 10, 9);
    ret = std::async(fun, 5, 4);
    ret = std::async(fun, 2, 1);
    temp = ret.get();
    std::cout << temp.first << "\t" << temp.second << "\n";
    return 0;
}
As for the result, I hoped the program would directly output "2 1" after (2 * 1) seconds and end the primary thread, but in my attempt the program has to wait for the first sub-thread to sleep for (10 * 9) seconds before outputting "2 1" and ending the primary thread.
Your code has a few problems:
With the way you call std::async, you are not guaranteed to get any threads at all. You need to pass std::launch::async as the first argument to achieve the effect you want. See the docs: https://en.cppreference.com/w/cpp/thread/async
std::async returns a future, which you store in ret. But then in the next line, you overwrite the value in ret. That causes the future returned from your first thread to be destroyed. When futures returned from std::async are about to be destroyed, they block the current thread until they are completed. See the "Notes" section of the std::async docs.
What you are trying to achieve is unfortunately surprisingly difficult in C++ without additional libraries. There is no simple way to say "wait until one of these threads is done". If you want to use only the STL, you have to use a condition variable to signal the main thread when the first of your sub-threads is ready.
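Here is a minimal sketch of that approach (the names fun, done and result are illustrative): each worker publishes its result under a mutex and notifies the main thread, which wakes up as soon as the first worker finishes.
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <iostream>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>
int main() {
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
    std::pair<size_t, size_t> result;

    auto fun = [&](size_t i, size_t j) {
        std::this_thread::sleep_for(std::chrono::seconds(i * j));
        std::lock_guard<std::mutex> lock(m);
        if (!done) {            // only the first thread to finish publishes its result
            result = {i, j};
            done = true;
            cv.notify_one();
        }
    };

    std::vector<std::thread> threads;
    threads.emplace_back(fun, 10, 9);
    threads.emplace_back(fun, 5, 4);
    threads.emplace_back(fun, 2, 1);

    {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return done; });  // returns after about 2 seconds
        std::cout << result.first << "\t" << result.second << "\n";
    }
    for (auto& t : threads) t.join();  // the slower threads must still be joined before exit
}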

How do I force threads to take turn concatenating to a string?

Basically, I have 2 text files, each containing a bunch of lines that are all 1 character long. Each character in one file is a letter or a zero; if the character is a zero, I need to look at the other file to see what is supposed to be there. My goal is to start two threads, each reading a separate file, and add each character to a string.
File 1:
t
0
i
s
0
0
0
t
e
0
t
File 2:
0
h
0
0
i
s
a
0
0
s
0
So the expected output of this should be 'thisisatest'.
I'm currently able to run the two threads and have each of them read their respective files, and I know I need to use a mutex lock() and unlock() to make sure only one thread is adding to the string at a time, but I'm having trouble figuring out how to implement it.
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>
using namespace std;

mutex m;
int i = 0;
string s = "";

void readFile(string fileName) {
    ifstream file;
    char a;
    file.open(fileName);
    if(!file) {
        cout << "Failed to open file." << endl;
        exit(1);
    }
    while(file >> a) {
        if(a == '0') {
            // zero: the letter for this position is in the other file
        } else {
            s += a;
        }
    }
}
int main() {
    thread p1(readFile, "Person1");
    thread p2(readFile, "Person2");
    p1.join();
    p2.join();
    cout << s << endl;
    return 0;
}
I have tried placing the m.lock() just inside the while() loop and having the m.unlock() nested in the if() statement, but it did not work. Currently my code just outputs file1 with no zeros and file2 with no zeros, concatenated (in no particular order, since there's no way to predict which thread completes first).
I want the program to look at the text file, check the character on the current line, and if it's a letter, concatenate it to the string s; if it's a zero, pause this thread and let the other thread check its line.
You need to ensure the two threads run in sync, taking turns reading one line at a time. When a 0 is read, skip the turn; otherwise use the value.
For that you can use:
A variable shared between the worker threads, to keep track of turns;
A condition variable to notify threads of turn change;
A mutex to make the condition variable work.
Here's a working example demonstrating the turn-taking approach:
#include <iostream>
#include <condition_variable>
#include <mutex>
#include <thread>

int main() {
    std::mutex mtx;
    std::condition_variable cond;
    int turn = 0;

    auto task = [&](int myturn, int turns) {
        std::unique_lock<std::mutex> lock(mtx);
        while (turn < 9) {
            cond.wait(lock, [&] { return turn % turns == myturn; });
            std::cout << "Task " << myturn << std::endl;
            turn++;
            cond.notify_all();
        }
    };

    std::thread p1(task, 0, 2);
    std::thread p2(task, 1, 2);
    p1.join();
    p2.join();
    std::cout << "Done" << std::endl;
}
Output:
Task 0
Task 1
Task 0
Task 1
Task 0
Task 1
Task 0
Task 1
Task 0
Task 1
Done
Consider that the index position in the string where each letter must go is predetermined and easily calculated from the data.
The thread which reads the second file:
0
h
0
0
i
s
knows that it is not responsible for the characters at str[0], str[2] and str[3], but is responsible for str[1], str[4] and str[5].
If we add a mutex and a condition variable, the algorithm is straightforward.
index = 0
while reading a line from the file succeeds: {
if the line isn't "0": {
lock(mutex)
while length(str) < index: {
wait(condition, mutex)
}
assert(length(str) == index)
add line[0] to end of str
unlock(mutex)
broadcast(condition)
}
index++
}
Basically, for each character that the thread needs to write, it knows the index. It waits for the string to get that long first, which the other thread(s) will do. Whenever a thread adds a character, it broadcasts the condition variable, to wake up another thread which wants to put a character at the new index.
The assert check should never go off, unless the data is bad (tells two or more threads to place a character at the same index). Also, if all threads hit a 0 line at the same index, of course, this will deadlock; every thread will be waiting for another thread to put a character at that index.
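A sketch of that pseudocode in C++ (the file names are placeholders; like the pseudocode, it assumes the data never assigns two threads the same index):
#include <condition_variable>
#include <fstream>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>
std::mutex m;
std::condition_variable cond;
std::string str;

void readFile(const std::string& fileName) {
    std::ifstream file(fileName);
    std::string line;
    size_t index = 0;
    while (std::getline(file, line)) {
        if (line != "0") {
            std::unique_lock<std::mutex> lock(m);
            // wait until the string is exactly `index` characters long
            cond.wait(lock, [&] { return str.length() >= index; });
            str += line[0];    // this thread owns position `index`
            lock.unlock();
            cond.notify_all(); // wake threads waiting for the new length
        }
        index++;
    }
}

int main() {
    std::thread p1(readFile, "Person1");
    std::thread p2(readFile, "Person2");
    p1.join();
    p2.join();
    std::cout << str << std::endl; // "thisisatest" for the sample files
}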
Another solution is possible using a synchronization object called a barrier. This problem is perfect for barriers, because what we have is a group of threads working through some tuples of data in parallel. For each tuple, exactly one thread must take action.
The algorithm is something like this:
// initialization:
init(barrier, 2) // number of threads
// each thread:
while able to read line from file: {
if line is not "0":
append line[0] to str
wait(barrier)
}
What wait(barrier) does is delay execution until 2 threads call it (because we initialized it to 2). When this happens, all threads are released. Then the barrier resets itself for the next wait, whereupon it will wait for 2 threads again.
Thus, the execution is serialized: the threads execute the loop body in lock step as they march through the file. That thread which reads a character instead of 0 adds it to the string. The other threads don't touch the string; they proceed straight to the barrier wait, so there is no data race.
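Here is a sketch of the barrier version using C++20 std::barrier (a pthread_barrier_t works the same way; it assumes both files have the same number of lines, as in the sample data):
#include <barrier>
#include <fstream>
#include <iostream>
#include <string>
#include <thread>
std::barrier sync_point(2); // 2 participating threads
std::string str;

void readFile(const std::string& fileName) {
    std::ifstream file(fileName);
    std::string line;
    while (std::getline(file, line)) {
        if (line != "0")
            str += line[0];            // exactly one thread writes per round
        sync_point.arrive_and_wait();  // march through the files in lock step
    }
}

int main() {
    std::thread p1(readFile, "Person1");
    std::thread p2(readFile, "Person2");
    p1.join();
    p2.join();
    std::cout << str << std::endl;
}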

Windows API mutex re-entry issue despite making threads Sleep()

I'm trying to get my head around Windows API threads and thread control. I briefly worked with threads in Java, so I know the basic concepts, but something that worked in Java seems to only work halfway in C++.
What I am trying to do is as follows: 2 threads in one process, sharing a common resource (in this case, a pair of global variables int a, b;).
The first thread should acquire a mutex, use rand() to generate pairs of numbers from 0 to 100 until it gets a pair such that b == 2 * a, then release the mutex.
The second thread should then acquire the mutex and check whether the b == 2 * a condition holds for the given values (printing something like "incorrect" in case it does not), then release the mutex so the first thread can take it back. This process of generating and checking pairs of numbers should be repeated quite a few times, say 500/1000 times.
My code is as follows:
Main:
#define INIT_SEED time(NULL)
#define NUMBER_OF_CHECKS 250
int a = 0;
int b = 1;
HANDLE mutexHandle = CreateMutex(NULL, FALSE, NULL);
int main()
{
    HANDLE thread1Handle = CreateThread(NULL, NULL, Thread1Behaviour, NULL, NULL, NULL);
    Sleep(50);
    HANDLE thread2Handle = CreateThread(NULL, NULL, Thread2Behaviour, NULL, NULL, NULL);
    WaitForSingleObject(thread1Handle, INFINITE);
    WaitForSingleObject(thread2Handle, INFINITE);
    return 0;
}
Thread 1 behavior:
DWORD WINAPI Thread1Behaviour( LPVOID _ )
{
    srand(INIT_SEED);
    for (int i = 0; i < NUMBER_OF_CHECKS; i++)
    {
        WaitForSingleObject(mutexHandle, INFINITE);
        do
        {
            b = rand() % 100;
            a = rand() % 100;
        }
        while (b != 2 * a);
        cout << i << ".\t" << b << " " << a << endl;
        ReleaseMutex(mutexHandle);
        Sleep(50);
    }
    return 0;
}
Thread 2 behavior:
DWORD WINAPI Thread2Behaviour( LPVOID _ )
{
    for (int i = 0; i < NUMBER_OF_CHECKS; i++)
    {
        WaitForSingleObject(mutexHandle, INFINITE);
        if (b == 2 * a)
            cout << i << ".\t" << b << "\t=\t2 * " << a << endl;
        else
            cout << i << ".\t" << b << "\t=\t2 * " << a << "\tINCORRECT!!!" << endl;
        ReleaseMutex(mutexHandle);
        Sleep(50);
    }
    return 0;
}
The implementation is simple enough (I skipped the handle validity checks to keep the code short; if a bad handle could be the cause I can add them in, but I imagine that with a bad handle everything would just crash and burn rather than run with incorrect outputs).
I remember from working with threads in Java that I used to sleep for some time to make sure the same thread does not reacquire the mutex. When I run the above code it mostly works as intended; however, when the number of checks is big enough, sometimes the first thread gets the mutex 2 times in a row, leading to output like this:
1. 92 46
2. 66 33
1. 66 = 2 * 33
Which means that at the end, the second thread will end up checking the same pair several times:
249. 80 40
248. 80 = 2 * 40
249. 80 = 2 * 40
I have tried changing the sleep timer to values between 0 and 250, but this remains the case no matter how large the sleep period is. With at least 250 it seems to work about half the time.
Also, if I remove the cout in the first thread, the problem becomes 2-3 times worse, with more botched synchronizations.
And one more thing I noticed: for a given configuration of sleep timer and cout/no cout in thread 1, the number of times the mutex is immediately reacquired is the same, so this is completely reproducible (at least for me).
Using logic, I got 2 conflicting conclusions:
Since it MOSTLY works as intended, it might be a synchronization problem, with the way threads "rush" for the mutex as soon as it is available
Since I am able to reproduce this issue in a pretty "deterministic" way, it might mean that it is an issue in the logic of the code
But the above can't both be true at once, so what exactly is the problem here?
EDIT: To clarify the question: I know that a mutex is not technically meant to enforce order of execution, but in this case, why does it not work as intended and what would be the fix?
Thanks in advance!

Using pthreads to process sections of an array/vector

Assume we have an array or vector of length 256 (can be more or less) and the number of pthreads to generate is 4 (can be more or less).
I need to figure out how to assign each pthread to process a section of the vector.
So the following code dispatches the multiple threads.
for(int i = 0; i < thread_count; i++)
{
    int *arg = (int *) malloc(sizeof(*arg));
    *arg = i;
    thread_err = pthread_create(&(threads[i]), NULL, &multiThread_Handler, arg);
    if (thread_err != 0)
        printf("\nCan't create thread :[%s]", strerror(thread_err));
}
As you can tell from the above code, each thread passes an argument value to its start routine. In the case of four threads, the argument values range from 0 to 3; for 5 threads, 0 to 4; and so forth.
Now the starting function does the following:
void* multiThread_Handler(void *arg)
{
    int thread_index = *((int *)arg);
    free(arg); // the argument was malloc'd by the dispatch loop above

    unsigned int start_index = (thread_index*(list_size/thread_count));
    unsigned int end_index = ((thread_index+1)*(list_size/thread_count));

    std::cout << "Start Index: " << start_index << std::endl;
    std::cout << "End Index: " << end_index << std::endl;
    std::cout << "i: " << thread_index << std::endl;

    for(unsigned int i = start_index; i < end_index; i++)
    {
        std::cout << "Processing array element at: " << i << std::endl;
    }
    return NULL;
}
So in the above code, the thread whose argument is 0 should process section 0 - 63 (in the case of an array size of 256 and a thread count of 4), the thread whose argument is 1 should process section 64 - 127, and so forth, with the last thread processing 192 - 255.
Each of these four sections should be processed in parallel.
Also, the pthread_join() calls are present in the original main code to make sure each thread finishes before the main thread terminates.
The problem is that the value of i in the above for-loop takes on suspiciously large values. I'm not sure why this occurs, since I am fairly new to pthreads.
Sometimes it works perfectly fine, and other times the value of i becomes so large that it causes the program to either abort or hit a segmentation fault.
The problem is indeed a data race caused by lack of synchronization. And the shared variable being used (and modified) by multiple threads is std::cout.
When using streams such as std::cout concurrently, you need to synchronize all operations on the stream with a mutex. Otherwise, depending on the platform and your luck, you might get output from multiple threads mixed together (which can sometimes look like printed values larger than you expect), the program might crash, or you might hit other sorts of undefined behavior.
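A sketch of that fix applied to the question's loop (list_size and thread_count stand in for the globals from the question):
#include <pthread.h>
#include <cstdlib>
#include <iostream>
const unsigned int list_size = 256;
const unsigned int thread_count = 4;
pthread_mutex_t cout_mutex = PTHREAD_MUTEX_INITIALIZER; // guards every use of std::cout

void* multiThread_Handler(void *arg) {
    int thread_index = *((int *)arg);
    free(arg); // the argument was malloc'd by the creating thread

    unsigned int start_index = thread_index * (list_size / thread_count);
    unsigned int end_index = (thread_index + 1) * (list_size / thread_count);

    for (unsigned int i = start_index; i < end_index; i++) {
        pthread_mutex_lock(&cout_mutex);   // only one thread prints at a time
        std::cout << "Processing array element at: " << i << std::endl;
        pthread_mutex_unlock(&cout_mutex);
    }
    return NULL;
}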
// Incorrect Code
unsigned int start_index = (thread_index*(list_size/thread_count));
unsigned int end_index = ((thread_index+1)*(list_size/thread_count));
The above critical region in your program is wrong: no synchronization mechanism is used, so there is a data race. This leads to wrong calculations of the start_index and end_index counters (random garbage values), and hence the for-loop variable i goes for a toss. You should use the following code to synchronize the critical region of your program.
// Correct Code
s = pthread_mutex_lock(&mutexhandle);
start_index = (thread_index*(list_size/thread_count));
end_index = ((thread_index+1)*(list_size/thread_count));
s = pthread_mutex_unlock(&mutexhandle);