Can multiple threads wait on the same condition variable? - c++

I'm trying to understand how to better use condition variables, and I have the following code.
Behavior.
The expected behavior of the code is that:
Each thread prints "thread n waiting"
The program waits until the user presses enter
When the user presses enter, notify_one is called once for each thread
All the threads print "thread n ready.", and exit
The observed behavior of the code is that:
Each thread prints "thread n waiting" (Expected)
The program waits until the user presses enter (Expected)
When the user presses enter, notify_one is called once for each thread (Expected)
One of the threads prints "thread n ready", but then the code hangs. (???)
Question.
Why does the code hang? And how can I have multiple threads wait on the same condition variable?
Code
#include <condition_variable>
#include <cstdio>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>
#include <vector>
int main() {
using namespace std::literals::string_literals;
auto m = std::mutex();
auto lock = std::unique_lock(m);
auto cv = std::condition_variable();
auto wait_then_print = [&](int id) {
return [&, id]() {
auto id_str = std::to_string(id);
std::cout << ("thread " + id_str + " waiting.\n");
cv.wait(lock);
// If I add this line in, the code gives me a system error:
// lock.unlock();
std::cout << ("thread " + id_str + " ready.\n");
};
};
auto threads = std::vector<std::thread>(16);
int counter = 0;
for(auto& t : threads)
t = std::thread(wait_then_print(counter++));
std::cout << "Press enter to continue.\n";
std::getchar();
for(int i = 0; i < counter; i++) {
cv.notify_one();
std::cout << "Notified one.\n";
}
for(auto& t : threads)
t.join();
}
Output
thread 1 waiting.
thread 0 waiting.
thread 2 waiting.
thread 3 waiting.
thread 4 waiting.
thread 5 waiting.
thread 6 waiting.
thread 7 waiting.
thread 8 waiting.
thread 9 waiting.
thread 11 waiting.
thread 10 waiting.
thread 12 waiting.
thread 13 waiting.
thread 14 waiting.
thread 15 waiting.
Press enter to continue.
Notified one.
Notified one.
thread 1 ready.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.
Notified one.

This is undefined behavior.
In order to wait on a condition variable, the condition variable must be waited on by the same exact thread that originally locked the mutex. You cannot lock the mutex in one execution thread, and then wait on the condition variable in another thread.
auto lock = std::unique_lock(m);
This lock is obtained in the main execution thread. Afterwards, the main execution thread creates all these multiple execution threads. Each one of these execution threads executes the following:
cv.wait(lock)
The mutex lock was not acquired by the execution thread that calls wait() here, therefore this is undefined behavior.
A closer look at what you are attempting to do here suggests that you will likely get your intended results if you simply move
auto lock = std::unique_lock(m);
inside the lambda that gets executed by each new execution thread.
You also need to use notify_all() instead of calling notify_one() multiple times, because notifications are not queued: a notify_one() issued before its intended thread has actually blocked in wait() is simply lost. Remember that wait() atomically unlocks the mutex and waits on the condition variable, and that wait() returns only after the thread has successfully relocked the mutex after being notified.
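Putting both suggestions together, a fixed version might look like the following sketch. The ready flag and the predicate passed to wait() are an extra guard I've added, not part of the original answer, so that a notification issued before a thread has started waiting cannot be lost:
#include <condition_variable>
#include <cstdio>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>
#include <vector>
int main() {
    auto m = std::mutex();
    auto cv = std::condition_variable();
    bool ready = false; // guarded by m
    auto wait_then_print = [&](int id) {
        return [&, id]() {
            auto id_str = std::to_string(id);
            std::cout << ("thread " + id_str + " waiting.\n");
            auto lock = std::unique_lock(m);      // lock in the thread that waits
            cv.wait(lock, [&] { return ready; }); // wait() releases m while blocked
            std::cout << ("thread " + id_str + " ready.\n");
        };
    };
    auto threads = std::vector<std::thread>(16);
    int counter = 0;
    for (auto& t : threads)
        t = std::thread(wait_then_print(counter++));
    std::cout << "Press enter to continue.\n";
    std::getchar();
    {
        auto guard = std::lock_guard(m);
        ready = true;
    }
    cv.notify_all(); // a single notify_all wakes every waiter
    for (auto& t : threads)
        t.join();
}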

Related

Do spinlocks guarantee context switching when compared to mutexes

Consider the following code snippet
int index = 0;
av::utils::Lock lock(av::utils::Lock::EStrategy::eMutex); // Uses a mutex or a spin lock based on specified strategy.
void fun()
{
for (int i = 0; i < 100; ++i)
{
lock.aquire();
++index;
std::cout << "thread " << std::this_thread::get_id() << " index = " << index << std::endl;
std::this_thread::sleep_for(std::chrono::milliseconds(500));
lock.release();
}
}
int main()
{
std::thread t1(fun);
std::thread t2(fun);
t1.join();
t2.join();
}
The output I get when a mutex is used for synchronization is that thread 1 executes completely first, followed by thread 2.
When using a spinlock (implemented with std::atomic_flag), the execution of the two threads is interleaved (one iteration of thread 1 followed by one iteration of thread 2). The latter happens irrespective of the delay I add inside the iteration.
I understand that a mutex only guarantees mutual exclusion and not the order of execution. My question is: if I want the two threads to execute in an interleaved manner, is using a spinlock a recommended strategy or not?
The output that I get with a mutex ... is first thread 1 [runs through the whole loop] followed by thread 2.
That's because of how your loop uses the lock: the very last thing the loop body does is unlock the lock, and the very first thing the next iteration does is lock it again.
The other thread can be blocked, effectively sleeping, waiting for the mutex. When your thread 1 releases the lock, the OS scheduler may still be running its algorithms, trying to figure out how to respond to that, when thread 1 comes 'round and locks the lock again.
It's like a race to lock the mutex, and thread 1 is on the starting line when the gun goes off, while thread 2 is sitting on the bench, tying its shoes.
While using a spinlock...the order of execution between the threads which is interleaved
That's because the "blocked" thread isn't really blocked. It's still actively running on a different processor while it waits. It has a much better chance at winning the lock when the first thread releases it.
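For reference, the kind of spinlock the question describes can be sketched with std::atomic_flag (av::utils::Lock is the asker's own wrapper and isn't shown, so treat this as an assumed stand-in). The key point is that acquire() keeps the waiting thread running on a CPU, which is why it can grab the lock the instant the other thread releases it:
#include <atomic>
// Minimal spinlock sketch; an assumed equivalent of the asker's spin-lock strategy.
class SpinLock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void acquire() {
        while (flag.test_and_set(std::memory_order_acquire))
            ; // busy-wait: the thread stays runnable instead of blocking
    }
    void release() {
        flag.clear(std::memory_order_release);
    }
};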

Synchronizing producer/consumer with a barrier

I have a producer thread which produces work for three consumer threads. When work has been produced, the producer thread waits until the consumer threads have finished handling the work. The producer thread then goes on handling the results.
#include <boost/thread/barrier.hpp>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>
std::condition_variable cond;
std::mutex mutex;
boost::barrier barrier(4);
std::vector<std::thread> workers;
std::queue<unsigned int> work;
std::queue<unsigned int> results;
void worker();
int main()
{
// 1 producer and 3 consumers
for(unsigned int i = 0; i < 3; i++)
workers.push_back(std::thread(worker));
// Wait here so the three workers can get to cond.wait();
barrier.wait();
std::unique_lock<std::mutex> lock(mutex);
while(true)
{
// Generate work
std::cout << "gen" << std::endl;
for(unsigned int i = 0; i < 10; i++)
work.push(i);
cond.notify_all();
lock.unlock();
barrier.wait();
// Handle the results
while(results.size() > 0)
results.pop();
lock.lock();
}
return 0;
}
void worker()
{
while(true)
{
std::unique_lock<std::mutex> lock(mutex);
while(results.size() == 0)
{
lock.unlock();
barrier.wait();
lock.lock();
cond.wait(lock);
}
// Get work
unsigned int next = work.front();
work.pop();
// Store the result
results.push(next);
lock.unlock();
}
}
The problem is that I need to make sure that all consumer threads have entered cond.wait(lock) before the producer thread starts its next iteration:
All 4 threads have reached the barrier. The barrier gets released and the threads can continue.
The producer thread locks the mutex before all consumer threads have reached cond.wait(lock). Thus at least one consumer thread is blocked by lock.lock().
The producer thread starts its next iteration, creates work and notifies the consumers. Since at least one consumer thread has not yet reached cond.wait(lock) the notify_all() will be missed by at least one consumer thread. These threads now wait for the next notify_all() - which will never arrive.
The next time the barrier is reached, at least one consumer thread is still waiting for the next notify_all(). Thus the barrier will not be released and a deadlock occurs.
How can I resolve this situation?
A condition_variable should be used together with a flag, to help guard against spurious wake-ups. This same flag can also be used to check if the thread should wait at all or just go straight to work.
Add a bool go_to_work = false; then we simply add it as a predicate in the call to wait and make sure we set/unset it from the main thread.
In main thread before calling notify_all we set our bool
go_to_work=true;
cond.notify_all();
In our worker thread we add the predicate to our wait call
cond.wait(lock, [](){ return go_to_work; });
Lastly, in our main thread we want to set the flag back to false after all work has been done.
barrier.wait();
lock.lock(); // We need to lock the mutex before modifying the bool
go_to_work=false;
lock.unlock();
//Handle result...
Now if a thread reaches the wait call after the main thread has set go_to_work=true it will not wait at all and simply go ahead and do the work. As a bonus this also guards against spurious wake-ups.
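Put together, a minimal version of the fixed program might look like this. It is a sketch only: it keeps the question's boost::barrier, runs a fixed number of rounds so it terminates, and rearranges the worker loop around the go_to_work predicate, so it is not the asker's original structure:
#include <boost/thread/barrier.hpp>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>
std::condition_variable cond;
std::mutex mutex;
boost::barrier barrier(4); // 1 producer + 3 consumers
std::queue<unsigned int> work;
std::queue<unsigned int> results;
bool go_to_work = false;   // guarded by mutex
void worker()
{
    for (int round = 0; round < 3; ++round) // bounded so the example terminates
    {
        std::unique_lock<std::mutex> lock(mutex);
        cond.wait(lock, [] { return go_to_work; }); // cannot miss the notification
        while (!work.empty())                       // take whatever work is left
        {
            results.push(work.front());
            work.pop();
        }
        lock.unlock();
        barrier.wait(); // tell the producer this round is done
    }
}
int main()
{
    std::vector<std::thread> workers;
    for (int i = 0; i < 3; ++i)
        workers.emplace_back(worker);
    for (int round = 0; round < 3; ++round)
    {
        {
            std::lock_guard<std::mutex> lock(mutex);
            for (unsigned int i = 0; i < 10; ++i)
                work.push(i);
            go_to_work = true; // set the flag before notifying
        }
        cond.notify_all();
        barrier.wait();        // wait until the consumers are done
        std::lock_guard<std::mutex> lock(mutex);
        go_to_work = false;    // reset the flag only after the barrier, as above
        std::cout << "round " << round << ": " << results.size() << " results\n";
        while (!results.empty())
            results.pop();
    }
    for (auto& t : workers)
        t.join();
}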

What if the condition_variable::wait_for delay parameter changes during the wait process?

I prefer using condition_variable::wait_for as a timer over a chrono-based timer, since I can override the wait condition by specifying a predicate. So I wrote a test program to check what happens if the delay duration changes once the wait has already started. condition_variable::wait_for seems to ignore the change and instead comes out of the wait entirely. Why is this? Can I ever change the delay in the middle of a wait?
#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>
using namespace std;
enum STATE {START,STOP,NONE};
STATE state ;
condition_variable cv;
mutex mu;
chrono::milliseconds delay;
void cv_wait_thread()
{
{
unique_lock<mutex> lock(mu);
cv.wait(lock, [](){ return state == START; });
}
chrono::time_point<chrono::high_resolution_clock> start_time = chrono::high_resolution_clock::now();
{
unique_lock<mutex> lock(mu);
cv.wait_for(lock, delay);
}
auto diff = chrono::duration_cast<chrono::milliseconds>(chrono::high_resolution_clock::now() - start_time);
cout << "Conditional_wait: "<< diff.count() << " milliseconds" << endl;
}
int main()
{
thread thread1([](){ cv_wait_thread(); });
state = NONE;
this_thread::sleep_for(chrono::seconds(1));
delay = chrono::milliseconds(3000);
state = START;
cv.notify_all(); // ask thread to sleep for 3 sec
this_thread::sleep_for(chrono::milliseconds(2000)); // let cv_wait thread start the wait process for at least 2 sec already
delay = chrono::milliseconds(5000); // ask thread to correct and sleep for 5 sec instead
cv.notify_all();
thread1.join(); // thread prints 2000 milli
}
As you can see in the specification of the wait_for() method, the delay interval is specified as a const parameter:
template< class Rep, class Period >
std::cv_status wait_for( std::unique_lock<std::mutex>& lock,
const std::chrono::duration<Rep, Period>& rel_time);
Furthermore, there's nothing in the specification of wait_for() that indicates that any changes to the parameter will be observable by the thread that's executing the wait_for().
In fact, the only sequencing here comes as an indirect result of releasing and re-acquiring the underlying mutex. Note, however, that this will not occur until after the specified timeout expires. As such, any changes made to the delay parameter by another thread will not be sequenced until that time, and will not be observable by the thread that's executing the wait_for() call.
Note that if the thread in question were doing something other than locking the mutex and executing wait_for(), then unless the main execution thread explicitly does something to sequence its change to delay with respect to the other thread, the other thread is never guaranteed to observe the new value of delay at all.
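If an adjustable timeout is what's actually wanted, one common approach (a sketch of my own, not something the answer prescribes) is to keep the deadline in shared state protected by the mutex, wait with wait_until, and re-read the deadline after every wake-up, notifying the waiter whenever the deadline changes:
#include <chrono>
#include <condition_variable>
#include <mutex>
std::mutex mu;
std::condition_variable cv;
std::chrono::steady_clock::time_point deadline; // guarded by mu
bool cancelled = false;                         // guarded by mu
// Waits until the current deadline, picking up any changes other threads make,
// as long as they modify `deadline` under the mutex and then notify.
void wait_until_deadline()
{
    std::unique_lock<std::mutex> lock(mu);
    while (!cancelled)
    {
        auto d = deadline;                      // re-read the shared deadline
        if (std::chrono::steady_clock::now() >= d)
            break;                              // deadline reached
        cv.wait_until(lock, d);                 // wakes early if notified
    }
}
// Called from another thread to extend or shorten the wait.
void set_deadline(std::chrono::steady_clock::time_point d)
{
    {
        std::lock_guard<std::mutex> lock(mu);
        deadline = d;
    }
    cv.notify_all(); // wake the waiter so it re-reads the new deadline
}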

weird pthread_mutex_t behavior

Consider the next piece of code -
#include <iostream>
#include <pthread.h>
using namespace std;
int sharedIndex = 10;
pthread_mutex_t mutex;
void* foo(void* arg)
{
while(sharedIndex >= 0)
{
pthread_mutex_lock(&mutex);
cout << sharedIndex << endl;
sharedIndex--;
pthread_mutex_unlock(&mutex);
}
return NULL;
}
int main() {
pthread_t p1;
pthread_t p2;
pthread_t p3;
pthread_create(&p1, NULL, foo, NULL);
pthread_create(&p2, NULL, foo, NULL);
pthread_create(&p3, NULL, foo, NULL);
pthread_join(p1, NULL);
pthread_join(p2, NULL);
pthread_join(p3, NULL);
return 0;
}
I simply created three pthreads and gave them all the same function foo, in hope that every thread, at its turn, will print and decrement the sharedIndex.
But this is the output -
10
9
8
7
6
5
4
3
2
1
0
-1
-2
I don't understand why the process doesn't stop when sharedIndex reaches 0.
sharedIndex is protected by a mutex. How come it's accessed after it became 0? Aren't the threads supposed to directly skip to return NULL;?
EDIT
In addition, it seems that only the first thread decrements sharedIndex.
Why doesn't every thread decrement the shared resource in its turn?
Here's the output after a fix -
Current thread: 140594495477504
10
Current thread: 140594495477504
9
Current thread: 140594495477504
8
Current thread: 140594495477504
7
Current thread: 140594495477504
6
Current thread: 140594495477504
5
Current thread: 140594495477504
4
Current thread: 140594495477504
3
Current thread: 140594495477504
2
Current thread: 140594495477504
1
Current thread: 140594495477504
0
Current thread: 140594495477504
Current thread: 140594478692096
Current thread: 140594487084800
I want all of the threads to decrement the shared resource - meaning that on every context switch, a different thread accesses the resource and does its thing.
This program's behaviour is undefined.
You have not initialized the mutex. You need to either call pthread_mutex_init or statically initialize it:
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
You read this variable outside the critical section:
while(sharedIndex >= 0)
That means you could read a garbage value while another thread is updating it. You should not read the shared variable until you have locked the mutex and have exclusive access to it.
Edit:
it seems that only the first thread decrements the sharedIndex
That's because of the undefined behaviour. Fix the problems above and you should see other threads run.
With your current code the compiler is allowed to assume that the sharedIndex is never updated by other threads, so it doesn't bother re-reading it, but just lets the first thread run ten times, then the other two threads run once each.
Meaning that on every context switch, a different thread accesses the resource and does its thing.
There is no guarantee that pthread mutexes behave fairly. If you want to guarantee a round-robin behaviour where each thread runs in turn then you will need to impose that yourself, e.g. by having another shared variable (and maybe a condition variable) that says which thread's turn it is to run, and blocking the other threads until it is their turn.
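For illustration, the turn-taking idea described above could look roughly like this (a sketch of my own, not part of the original answer): a turn variable guarded by the mutex, plus a condition variable that each thread uses to wait until it is its turn:
#include <iostream>
#include <pthread.h>
const int NUM_THREADS = 3;
int sharedIndex = 10;
int turn = 0; // whose turn it is (0 .. NUM_THREADS-1), guarded by the mutex
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
void* foo(void* arg)
{
    int id = *static_cast<int*>(arg);
    pthread_mutex_lock(&mutex);
    while (sharedIndex >= 0)
    {
        while (turn != id && sharedIndex >= 0) // block until it is our turn
            pthread_cond_wait(&cond, &mutex);
        if (sharedIndex >= 0)
        {
            std::cout << "thread " << id << ": " << sharedIndex-- << std::endl;
            turn = (id + 1) % NUM_THREADS;     // pass the turn on
            pthread_cond_broadcast(&cond);
        }
    }
    pthread_mutex_unlock(&mutex);
    return NULL;
}
int main()
{
    pthread_t threads[NUM_THREADS];
    int ids[NUM_THREADS] = {0, 1, 2};
    for (int i = 0; i < NUM_THREADS; ++i)
        pthread_create(&threads[i], NULL, foo, &ids[i]);
    for (int i = 0; i < NUM_THREADS; ++i)
        pthread_join(threads[i], NULL);
    return 0;
}
Each thread now decrements only when it is its turn, which produces strictly interleaved output at the cost of serializing the threads completely.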
The threads will be hanging out on pthread_mutex_lock(&mutex); waiting to get the lock. Once a thread decrements to 0 and releases the lock, the next thread waiting at the lock will then go about its business (making the value -1), and the same for the next thread (making the value -2).
You need to alter your logic on checking value and locking the mutex.
int sharedIndex = 10;
pthread_mutex_t mutex;
void* foo(void* arg)
{
while(sharedIndex >= 0)
{
pthread_mutex_lock(&mutex);
cout << sharedIndex << endl;
sharedIndex--;
pthread_mutex_unlock(&mutex);
}
return NULL;
}
According to this code, sharedIndex is a resource shared by all the threads.
Thus each access to it (both read and write) should be protected by the mutex.
Otherwise, consider the situation where all the threads read sharedIndex simultaneously while its value is 1:
all of them then enter the while loop and each one decrements sharedIndex, eventually driving it down to -2.
EDIT
Possible fix (as one of the possible options):
bool is_positive;
do
{
pthread_mutex_lock(&mutex);
is_positive = (sharedIndex >= 0);
if (is_positive)
{
cout << sharedIndex << endl;
sharedIndex--;
}
pthread_mutex_unlock(&mutex);
}while(is_positive);
EDIT2
Note that you must initialize the mutex:
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

How do I make threads run sequentially instead of concurrently?

For example, I want each thread to wait to start running until the previous one has completed. Is there a flag, something like thread.isRunning()?
#include <iostream>
#include <vector>
#include <thread>
using namespace std;
void hello() {
cout << "thread id: " << this_thread::get_id() << endl;
}
int main() {
vector<thread> threads;
for (int i = 0; i < 5; ++i)
threads.push_back(thread(hello));
for (thread& thr : threads)
thr.join();
cin.get();
return 0;
}
I know that the threads are meant to run concurrently, but what if I want to control the order?
There is no thread.isRunning(). You need some synchronization primitive to do it.
Consider std::condition_variable for example.
One approachable way is to use std::async. As std::async is currently defined, the shared state of an operation launched by std::async can cause the returned std::future's destructor to block until the operation is complete. This can limit composability and result in code that appears to run in parallel but in reality runs sequentially.
{
std::async(std::launch::async, []{ hello(); });
std::async(std::launch::async, []{ hello(); }); // does not run until hello() completes
}
If we need the second thread to start running only after the first one has completed, is a thread really needed?
As a solution, you could set a global flag, set its value in the first thread, and check the flag before starting the second thread.
You can't simply control the order by saying "First thread 1, then thread 2, ..."; you will need to make use of synchronization (e.g. std::mutex and condition variables such as std::condition_variable_any).
You can create events so as to block one thread until a certain event has happened.
See cppreference for an overview of the threading mechanisms in C++11.
You will need to use a semaphore or a lock.
If you initialize the semaphore to the value 0:
Call wait after thread.start() and call signal/release at the end of the thread's execution function (e.g. the run function in Java, an OnExit function, etc.).
That way the main thread will keep waiting until the thread in the loop has completed its execution.
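In C++ terms, the same idea could be sketched with std::binary_semaphore (an illustration of the suggestion above, assuming a C++20 compiler; the semaphore starts at 0, so main blocks after launching each thread until that thread signals it is done):
#include <iostream>
#include <semaphore>
#include <thread>
#include <vector>
std::binary_semaphore done{0}; // starts at 0: main blocks until a thread signals
void hello() {
    std::cout << "thread id: " << std::this_thread::get_id() << std::endl;
    done.release(); // signal: this thread has finished its work
}
int main() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 5; ++i) {
        threads.push_back(std::thread(hello));
        done.acquire(); // wait until the thread just started has finished
    }
    for (std::thread& t : threads)
        t.join();
}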
Task-based parallelism can achieve this, but C++ does not currently offer a task model as part of its threading libraries. If you have TBB or PPL you can use their task-based facilities.
I think you can achieve this by using std::mutex and std::condition_variable from C++11. To run the threads sequentially, an array of booleans is used: when a thread is done doing its work, it writes true at a specific index of the array.
For example:
#include <chrono>
#include <condition_variable>
#include <cstdlib>
#include <iostream>
#include <mutex>
#include <thread>
using namespace std;
mutex mtx;
condition_variable cv;
bool ids[10] = { false };
void shared_method(int id) {
unique_lock<mutex> lock(mtx);
if (id != 0) {
while (!ids[id - 1]) {
cv.wait(lock);
}
}
int delay = rand() % 4;
cout << "Thread " << id << " will finish in " << delay << " seconds." << endl;
this_thread::sleep_for(chrono::seconds(delay));
ids[id] = true;
cv.notify_all();
}
void test_condition_variable() {
thread threads[10];
for (int i = 0; i < 10; ++i) {
threads[i] = thread(shared_method, i);
}
for (thread &t : threads) {
t.join();
}
}
Output:
Thread 0 will finish in 3 seconds.
Thread 1 will finish in 1 seconds.
Thread 2 will finish in 1 seconds.
Thread 3 will finish in 2 seconds.
Thread 4 will finish in 2 seconds.
Thread 5 will finish in 0 seconds.
Thread 6 will finish in 0 seconds.
Thread 7 will finish in 2 seconds.
Thread 8 will finish in 3 seconds.
Thread 9 will finish in 1 seconds.