Purpose of condition_variable - C++

Application without std::condition_variable:
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <queue>
#include <chrono>
std::mutex mutex;
std::queue<int> queue;
int counter;
void loadData()
{
    while(true)
    {
        std::unique_lock<std::mutex> lock(mutex);
        queue.push(++counter);
        lock.unlock();
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
}

void writeData()
{
    while(true)
    {
        std::lock_guard<std::mutex> lock(mutex);
        while(queue.size() > 0)
        {
            std::cout << queue.front() << std::endl;
            queue.pop();
        }
    }
}

int main()
{
    std::thread thread1(loadData);
    std::thread thread2(writeData);
    thread1.join();
    thread2.join();
    return 0;
}
Application with std::condition_variable:
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <queue>
#include <chrono>
std::mutex mutex;
std::queue<int> queue;
std::condition_variable condition_variable;
int counter;
void loadData()
{
    while(true)
    {
        std::unique_lock<std::mutex> lock(mutex);
        queue.push(++counter);
        lock.unlock();
        condition_variable.notify_one();
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
}

void writeData()
{
    while(true)
    {
        std::unique_lock<std::mutex> lock(mutex);
        condition_variable.wait(lock, [](){return !queue.empty();});
        std::cout << queue.front() << std::endl;
        queue.pop();
    }
}

int main()
{
    std::thread thread1(loadData);
    std::thread thread2(writeData);
    thread1.join();
    thread2.join();
    return 0;
}
If I am right, this means that the second version of this application is unsafe, because queue.empty() is called without any synchronization; no lock is held around it. And that is my question: should we use condition variables if they cause problems like the one mentioned above?

Your first example busy-waits: there is a thread pounding on the lock, checking the queue, then releasing the lock. This both increases contention on the mutex and wastes up to an entire CPU when nothing is being processed.
The second example has the waiting thread mostly sleeping. It only wakes up when there is data ready, or on a "spurious wakeup" (which the standard permits).
When it wakes up, it reacquires the mutex and checks the predicate. If the predicate fails, it releases the lock and waits on the condition variable again.
It is safe, because the predicate is guaranteed to run while holding the mutex you acquired and passed to the wait function.
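The same lock discipline extends to bounded waits. As a sketch (my variation on the second example above, not code from the question), wait_for evaluates the same predicate under the lock and additionally returns false on timeout:

// Sketch: a bounded-wait consumer (hypothetical writeDataBounded, reusing
// the globals from the example). The predicate still runs under the mutex.
void writeDataBounded()
{
    while(true)
    {
        std::unique_lock<std::mutex> lock(mutex);
        if (condition_variable.wait_for(lock, std::chrono::milliseconds(100),
                                        [](){return !queue.empty();}))
        {
            std::cout << queue.front() << std::endl;
            queue.pop();
        }
        // on timeout, the loop simply re-locks and checks again
    }
}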

The second code is safe because the call to wait(lock, pred) is equivalent to (directly from the standard):
while (!pred())
    wait(lock);
And a call to wait(lock) releases (unlocks) lock, and reacquires (locks) it on notification.
In your case, this is equivalent to:
auto pred = [](){return !queue.empty();};
std::unique_lock<std::mutex> lock(mutex); // acquire
while (!pred()) { // OK, we are locked
    condition_variable.wait(lock); // release
    // if you get here, the lock has been re-acquired
}
So all the calls to your pred are made with lock held. There is no issue here, as long as all other operations on queue are also guarded.
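For example, a hypothetical extra accessor (not in the question's code) would have to take the same mutex, or that guarantee is lost:

// Hypothetical helper: any other thread inspecting the queue must
// lock the same mutex that guards push/front/pop.
std::size_t queuedCount()
{
    std::lock_guard<std::mutex> lock(mutex);
    return queue.size();
}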

Related

C++ Async Threads

I'm very confused with std::thread and std::async. Here is my code:
#include <vector>
#include <thread>
#include <mutex>

std::vector<int> myvector = {};
std::mutex mu;
int result;

void samplefunc(int a, int& result){
    result = 2 * a;
}

void func1(){
    while(true){
        // defer_lock: the explicit lock()/unlock() calls below manage the mutex
        // (calling lock() on an already-locked std::unique_lock would throw)
        std::unique_lock locker(mu, std::defer_lock);
        locker.lock();
        std::vector<int> temp = myvector;
        locker.unlock();
        if(myvector.size() > 0){
            samplefunc(myvector.at(0), result);
            myvector.clear();
        }else{
            samplefunc(0, result);
        }
    }
}

void func2(){
    while(true){
        // do complex things; suppose after complex calculations we have a result dummy of type int
        int dummy = 0;
        std::unique_lock locker(mu, std::defer_lock);
        locker.lock();
        myvector.push_back(dummy);
        locker.unlock();
    }
}

int main(){
    std::thread t1(func1);
    std::thread t2(func2);
    t1.join();
    t2.join();
}
What I want to do is very simple. We have two threads which should run in parallel (i.e. they don't have to wait on each other). However, thread t1 is supposed to change its behaviour if thread t2 puts some integer in a shared vector. Does the code given above achieve this, or should we use std::async, std::future, etc?
If the shared data is properly synchronized, it is fine for a modification from one thread to affect a computation using that data in a separate thread.
But you haven't locked myvector sufficiently. The lock has to surround all reading steps on the shared data. In func1, you release the lock too soon.
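A minimal sketch of func1 rewritten along those lines (names taken from the question; the exact policy for consuming the vector is my assumption, mirroring the original code):

// Sketch: hold the lock across every read and write of myvector.
void func1(){
    while(true){
        int front = 0;
        bool hasData = false;
        {
            std::unique_lock<std::mutex> locker(mu);
            if(!myvector.empty()){
                front = myvector.at(0);
                myvector.clear();
                hasData = true;
            }
        } // lock released here, after all shared reads are done
        if(hasData){
            samplefunc(front, result);
        }else{
            samplefunc(0, result);
        }
    }
}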

Would adding an empty lock scope before notifying the condition variable prevent lost notifications?

Consider a predicate that checks for flags updated outside of a condition variable mutex lock (since I want to abstract away from the wait implementation). Would adding an empty lock scope before notifying the condition variable prevent lost notifications? Here is a minimal example.
#include <condition_variable>
#include <mutex>
#include <atomic>
#include <functional>
#include <thread>
class TaskScheduler {
public:
    static void someMethod(std::function<bool(void)>&& pred) {
        _wait(std::move(pred));
    }
    static void notify() {
        {
            std::lock_guard<std::mutex> lock(waitMutex);
        }
        waitCV.notify_all();
    }
private:
    static std::mutex waitMutex;
    static std::condition_variable waitCV;
    static void _wait(std::function<bool(void)>&& pred) {
        std::unique_lock<std::mutex> lock(waitMutex);
        waitCV.wait(lock, [&](){return pred();});
    }
};

std::mutex TaskScheduler::waitMutex;
std::condition_variable TaskScheduler::waitCV;

std::atomic<bool> waiting{false};
std::atomic<bool> atomicFlag{false};

void thread1() {
    TaskScheduler::someMethod([&](){waiting = true; return atomicFlag.load();});
}

void thread2() {
    // this is called while thread1 is waiting
    atomicFlag = true;
    TaskScheduler::notify();
}

int main(int argc, char **argv) {
    std::thread th1([&](){thread1();});
    while (!waiting) {}
    std::thread th2([&](){thread2();});
    th2.join();
    th1.join();
}
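To make the worry concrete, this is the interleaving the empty lock scope is meant to rule out (a sketch of the timing as I understand it, not output from the program):

// Without the empty lock scope in notify():
//
//   thread1 (waiter)                      thread2 (notifier)
//   ----------------                      ------------------
//   lock(waitMutex)
//   pred() -> atomicFlag is false
//                                         atomicFlag = true
//                                         waitCV.notify_all()  // nobody is blocked yet
//   block in wait(), releasing waitMutex  // notification already gone -> lost wakeup
//
// With the empty scope, thread2 must acquire waitMutex first, so it can
// only notify after thread1 has either blocked inside wait() (and will be
// woken) or not yet evaluated pred (and will then see atomicFlag == true).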

Wake up of the thread is time consuming

#ifndef THREADPOOL_H
#define THREADPOOL_H

#include <iostream>
#include <deque>
#include <functional>
#include <thread>
#include <condition_variable>
#include <mutex>
#include <atomic>
#include <vector>

// thread pool
class ThreadPool
{
public:
    ThreadPool(unsigned int n = std::thread::hardware_concurrency())
        : busy()
        , processed()
        , stop()
    {
        for (unsigned int i = 0; i < n; ++i)
            workers.emplace_back(std::bind(&ThreadPool::thread_proc, this));
    }

    template<class F> void enqueue(F&& f)
    {
        std::unique_lock<std::mutex> lock(queue_mutex);
        tasks.emplace_back(std::forward<F>(f));
        cv_task.notify_one();
    }

    void waitFinished()
    {
        std::unique_lock<std::mutex> lock(queue_mutex);
        cv_finished.wait(lock, [this](){ return tasks.empty() && (busy == 0); });
    }

    ~ThreadPool()
    {
        // set stop-condition
        std::unique_lock<std::mutex> latch(queue_mutex);
        stop = true;
        cv_task.notify_all();
        latch.unlock();

        // all threads terminate, then we're done.
        for (auto& t : workers)
            t.join();
    }

    unsigned int getProcessed() const { return processed; }

private:
    std::vector<std::thread> workers;
    std::deque<std::function<void()>> tasks;
    std::mutex queue_mutex;
    std::condition_variable cv_task;
    std::condition_variable cv_finished;
    unsigned int busy;
    std::atomic_uint processed;
    bool stop;

    void thread_proc()
    {
        while (true)
        {
            std::unique_lock<std::mutex> latch(queue_mutex);
            cv_task.wait(latch, [this](){ return stop || !tasks.empty(); });
            if (!tasks.empty())
            {
                // got work. set busy.
                ++busy;
                // pull from queue
                auto fn = tasks.front();
                tasks.pop_front();
                // release lock. run async
                latch.unlock();
                // run function outside context
                fn();
                ++processed;
                latch.lock();
                --busy;
                cv_finished.notify_one();
            }
            else if (stop)
            {
                break;
            }
        }
    }
};

#endif // THREADPOOL_H
I have the above thread pool implementation using a latch. However, every time I add a task through the enqueue call, the overhead is quite large; it takes about 100 microseconds.
How can I improve the performance of the threadpool?
Your code looks fine. The comments above in your question about compiling with release optimizations turned on are probably correct, and that may be all you need to do.
Disclaimer: always measure code first with appropriate tools to identify where the bottlenecks are before attempting to improve its performance. Otherwise, you might not get the improvements you seek.
That said, here are a couple of potential micro-optimizations.
Change this in your thread_proc function
while (true)
{
    std::unique_lock<std::mutex> latch(queue_mutex);
    cv_task.wait(latch, [this](){ return stop || !tasks.empty(); });
    if (!tasks.empty())
To this:
std::unique_lock<std::mutex> latch(queue_mutex);
while (!stop)
{
    cv_task.wait(latch, [this](){ return stop || !tasks.empty(); });
    while (!tasks.empty() && !stop)
And then remove the else if (stop) block at the end of the function.
The main impact this has is that it avoids the extra "unlock" and "lock" on queue_mutex as a result of latch going out of scope on each iteration of the while loop. Changing if (!tasks.empty()) to while (!tasks.empty()) might save a cycle or two as well, by letting the currently executing thread, which already has the quantum, keep the lock and try to dequeue the next work item.
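Put together, the reworked loop would look something like this (a sketch applying both suggestions to the class above):

void thread_proc()
{
    std::unique_lock<std::mutex> latch(queue_mutex);
    while (!stop)
    {
        cv_task.wait(latch, [this](){ return stop || !tasks.empty(); });
        while (!tasks.empty() && !stop)
        {
            // got work. set busy.
            ++busy;
            auto fn = tasks.front();
            tasks.pop_front();
            // run the task without holding the lock
            latch.unlock();
            fn();
            ++processed;
            latch.lock();
            --busy;
            cv_finished.notify_one();
        }
    }
    // latch unlocks when it goes out of scope
}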
One final thing. I'm always of the opinion that the notify should be outside the lock. That way, there's no lock contention when the other thread is woken up by the thread that just updated the queue. But I've never actually measured this assumption, so take it with a grain of salt:
template<class F> void enqueue(F&& f)
{
    queue_mutex.lock();
    tasks.emplace_back(std::forward<F>(f));
    queue_mutex.unlock();
    cv_task.notify_one();
}

Modify shared state and notify std::condition_variable if std::mutex::lock throws

I encountered a problem and I'm not sure how to deal with it.
#include <iostream>
#include <thread>
#include <mutex> // needed for std::mutex / std::lock_guard
#include <condition_variable>
#include <chrono>

std::condition_variable CV;
std::mutex m;
std::size_t i{0};

void set_value() try
{
    std::this_thread::sleep_for(std::chrono::seconds{2});
    {
        std::lock_guard<std::mutex> lock{m};
        i = 20;
    }
    CV.notify_one();
}
catch(...){
    //what to do?
}

int main()
{
    std::thread t{set_value};
    t.detach();
    std::unique_lock<std::mutex> lock{m};
    CV.wait(lock, []{ return i != 0; });
    std::cout << "i has changed to " << i << std::endl;
}
This of course works fine, but how should I handle the case when the std::lock_guard constructor (that is, std::mutex::lock) throws an exception?
At first I was thinking of creating a global std::atomic<bool> mutex_lock_throwed{ false }; that I could set to true inside the catch block. Then I could notify_one():
catch(...){
    mutex_lock_throwed.store(true);
    CV.notify_one();
}
and change the predicate for the wait function to
[]{ return i != 0 || mutex_lock_throwed.load(); }
This actually worked very well, but then I read this on cppreference:
Even if the shared variable is atomic, it must be modified under the mutex in order to correctly publish the modification to the waiting thread.
As you can see, that is not possible if locking the mutex throws. So what should be the correct way to handle this?

C++ thread deadlock on waiting condition

Trying to expand on my two previous questions, Move operations for a class with a thread as member variable and Call function inside a lambda passed to a thread,
I'm not understanding why the thread doing a wait_for is sometimes not being notified, which results in a deadlock. cppreference says this about condition variables (http://en.cppreference.com/w/cpp/thread/condition_variable/notify_one):
The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock.
MCVE; the commented line explains what changes if I hold the lock, but I don't understand why:
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <iostream>

using namespace std;

class worker {
public:
    template <class Fn, class... Args>
    explicit worker(Fn func, Args... args) {
        t = std::thread(
            // capture func by value so it outlives the constructor
            [func, this](Args... cargs) -> void {
                std::unique_lock<std::mutex> lock(mtx);
                while (true) {
                    cond.wait(lock, [this]() -> bool { return ready; });
                    if (terminate) {
                        break;
                    }
                    func(cargs...);
                    ready = false;
                }
            },
            std::move(args)...);
    }

    ~worker() {
        terminate = true;
        if (t.joinable()) {
            run_once();
            t.join();
        }
    }

    void run_once() {
        // If I don't hold this mutex, the thread is never notified of ready being true.
        std::unique_lock<std::mutex> lock(mtx);
        ready = true;
        cout << "ready run once " << ready << endl;
        cond.notify_all();
    }

    bool done() { return (!ready.load()); }

private:
    std::thread t;
    std::atomic<bool> terminate{false};
    std::atomic<bool> ready{false};
    std::mutex mtx;
    std::condition_variable cond;
};

// main.cpp
void foo() {
    worker t([]() -> void { cout << "Bark" << endl; });
    t.run_once();
    while (!t.done()) {
    }
}

int main() {
    while (true) {
        foo();
    }
    return 0;
}
The problem is not memory visibility: "ready" is atomic, so the other thread is guaranteed to see the new value eventually. The problem is a lost wakeup. The waiting thread evaluates the predicate while holding the mutex, and releases the mutex atomically as it blocks inside wait. If you set "ready" and notify without holding the mutex, the store and the notification can land exactly in the window after the waiter has evaluated the predicate (seeing false) but before it has actually blocked; the notification then wakes nobody, and the waiter sleeps forever on a predicate that is already true. Holding the mutex while setting "ready" closes that window, because the update-and-notify sequence can then only run while the waiter is either not yet checking the predicate or already blocked:
{
    std::unique_lock<std::mutex> lock(mtx);
    ready = true;
}