C++ conditional wait race condition - c++

Suppose that I have a program with a worker thread that squares numbers from a queue. The problem is that if the work is too light (takes too short a time), the worker finishes the work and notifies the main thread before the main thread has even had time to start waiting for the worker to finish.
My simple program looks as follows:
#include <atomic>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>
std::atomic<bool> should_end;
std::condition_variable work_to_do;
std::mutex work_to_do_lock;
std::condition_variable fn_done;
std::mutex fn_done_lock;
std::mutex data_lock;
std::queue<int> work;
std::vector<int> result;
void worker() {
while(true) {
if(should_end) return;
data_lock.lock();
if(work.size() > 0) {
int front = work.front();
work.pop();
if (work.size() == 0){
fn_done.notify_one();
}
data_lock.unlock();
result.push_back(front * front);
} else {
data_lock.unlock();
// nothing to do, so we just wait
std::unique_lock<std::mutex> lck(work_to_do_lock);
work_to_do.wait(lck);
}
}
}
int main() {
should_end = false;
std::thread t(worker); // start worker
data_lock.lock();
const int N = 10;
for(int i = 0; i <= N; i++) {
work.push(i);
}
data_lock.unlock();
work_to_do.notify_one(); // notify the worker that there is work to do
//if the worker is quick, it signals done here already
std::unique_lock<std::mutex> lck(fn_done_lock);
fn_done.wait(lck);
for(auto elem : result) {
printf("result = %d \n", elem);
}
work_to_do.notify_one(); //notify the worker so we can shut it down
should_end = true;
t.join();
return 0;
}

Using the notification itself as the flag that the job is done is fundamentally flawed. First and foremost, std::condition_variable can have spurious wakeups, so a bare wait() proves nothing; and, as you observed, a notification sent before the waiter starts waiting is simply lost. Use your queue size as the actual condition for end of work: check and modify it under the same mutex in all threads, and use that same mutex with the condition variable. Then you can use std::condition_variable to wait until the work is done, but only after checking the queue size; if the work is already done at that moment, you do not wait at all. Otherwise you check the queue size in a loop (because of spurious wakeups) and wait while it is still not empty, or you use the std::condition_variable::wait() overload that takes a predicate, which has the loop internally.
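A minimal sketch of that advice, under a few assumptions of mine: the helper name square_all is made up, everything shared lives behind one mutex, and I track a pending counter rather than the raw queue size, so that "done" means the result has actually been stored and not merely popped. Treat it as an illustration, not as the only way to do it.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical helper: squares 0..n on one worker thread and returns the results.
std::vector<int> square_all(int n) {
    std::mutex m;                     // one mutex guards all shared state below
    std::condition_variable work_cv;  // signalled when work arrives or shutdown is requested
    std::condition_variable done_cv;  // signalled when every item has been processed
    std::queue<int> work;
    std::vector<int> result;
    int pending = 0;                  // items pushed but not yet stored in result
    bool should_end = false;

    std::thread t([&] {
        std::unique_lock<std::mutex> lck(m);
        for (;;) {
            // The predicate is checked before sleeping, so an early notify is never lost.
            work_cv.wait(lck, [&] { return should_end || !work.empty(); });
            if (work.empty()) return;       // only reachable once should_end is set
            int front = work.front();
            work.pop();
            lck.unlock();
            int squared = front * front;    // the "work", done without holding the lock
            lck.lock();
            result.push_back(squared);
            if (--pending == 0) done_cv.notify_one();
        }
    });

    {
        std::lock_guard<std::mutex> lck(m);
        for (int i = 0; i <= n; ++i) { work.push(i); ++pending; }
    }
    work_cv.notify_one();

    {
        // If the worker already finished, the predicate is true and we never block.
        std::unique_lock<std::mutex> lck(m);
        done_cv.wait(lck, [&] { return pending == 0; });
    }
    {
        std::lock_guard<std::mutex> lck(m);
        should_end = true;
    }
    work_cv.notify_one();
    t.join();
    return result;
}
```

Calling square_all(10) from main and printing the returned vector corresponds to the original program. The key point is that both waits test real state under the mutex, so it no longer matters whether the notification or the wait happens first.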

Related

How to say to std::thread to stop?

I have two questions.
1) I want to launch a function with an infinite loop that works like a server, checking for messages in a separate thread. However, I want to be able to stop it from the parent thread whenever I want. I'm confused about how to use std::future or std::condition_variable in this case. Or is it better to create a global variable and set it to true/false from the parent thread?
2) I'd like to have something like this. Why does this example crash at run time?
#include <iostream>
#include <chrono>
#include <thread>
#include <future>
std::mutex mu;
bool stopServer = false;
bool serverFunction()
{
while (true)
{
// checking for messages...
// processing messages
std::this_thread::sleep_for(std::chrono::seconds(1));
mu.lock();
if (stopServer)
break;
mu.unlock();
}
std::cout << "Exiting func..." << std::endl;
return true;
}
int main()
{
std::thread serverThread(serverFunction);
// some stuff
system("pause");
mu.lock();
stopServer = true;
mu.unlock();
serverThread.join();
}
Why does this example crash at run time?
When you break out of the loop in your thread, you leave the mutex locked, so the parent thread may block forever if it uses that mutex again.
You should use std::unique_lock or something similar to avoid problems like that.
You leave your mutex locked. Don't lock mutexes manually in 999/1000 cases.
In this case, you can use std::unique_lock<std::mutex> to create a RAII lock-holder that will avoid this problem. Simply create it in a scope, and have the lock area end at the end of the scope.
{
std::unique_lock<std::mutex> lock(mu);
stopServer = true;
}
in main and
{
std::unique_lock<std::mutex> lock(mu);
if (stopServer)
break;
}
in serverFunction.
Now in this case your mutex is pointless. Remove it. Replace bool stopServer with std::atomic<bool> stopServer, and remove all references to mutex and mu from your code.
An atomic variable can safely be read/written to from different threads.
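As a rough sketch of the atomic version (the iterations counter and the run_server_for driver are my own additions, there only to make the loop observable and the example self-contained):

```cpp
#include <atomic>
#include <chrono>
#include <thread>

std::atomic<bool> stopServer{false};
std::atomic<int> iterations{0};  // hypothetical counter, not in the original code

bool serverFunction() {
    while (!stopServer.load()) {
        // checking for and processing messages would go here
        ++iterations;
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    return true;
}

// Hypothetical driver: lets the server run for a while, then stops and joins it.
int run_server_for(std::chrono::milliseconds d) {
    stopServer = false;
    iterations = 0;
    std::thread serverThread(serverFunction);
    std::this_thread::sleep_for(d);
    stopServer = true;    // no mutex needed: the flag itself is atomic
    serverThread.join();
    return iterations.load();
}
```

There is no lock left to forget, so the original failure mode cannot occur. The loop still polls, though, which is what the rest of this answer addresses.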
However, your code is still busy-waiting. The right way to handle a server processing messages is a condition variable guarding the message queue. You then stop it by front-queuing a stop server message (or a flag) in the message queue.
This results in a server thread that doesn't wake up and pointlessly spin nearly as often. Instead, it blocks on the condition variable (with some spurious wakeups, but rare) and only really wakes up when there are new messages or it is told to shut down.
#include <boost/optional.hpp>
#include <condition_variable>
#include <deque>
#include <mutex>
template<class T>
struct cross_thread_queue {
void push( T t ) {
{
auto l = lock();
data.push_back(std::move(t));
}
cv.notify_one();
}
boost::optional<T> pop() {
auto l = lock();
cv.wait( l, [&]{ return halt || !data.empty(); } );
if (halt) return {};
T r = data.front();
data.pop_front();
return std::move(r); // returning to optional<T>, so we'll explicitly `move` here.
}
void terminate() {
{
auto l = lock();
data.clear();
halt = true;
}
cv.notify_all();
}
private:
std::mutex m;
std::unique_lock<std::mutex> lock() {
return std::unique_lock<std::mutex>(m);
}
bool halt = false;
std::deque<T> data;
std::condition_variable cv;
};
We use boost::optional for the return type of pop -- if the queue is halted, pop returns an empty optional. Otherwise, it blocks until there is data.
You can replace this with anything optional-like, even a std::pair<bool, T> where the first element says if there is anything to return, or a std::unique_ptr<T>, or a std::experimental::optional, or a myriad of other choices.
cross_thread_queue<int> queue;
bool serverFunction()
{
while (auto message = queue.pop()) {
// processing *message
std::cout << "Processing " << *message << std::endl;
}
std::cout << "Exiting func..." << std::endl;
return true;
}
int main()
{
std::thread serverThread(serverFunction);
// some stuff
queue.push(42);
system("pause");
queue.terminate();
serverThread.join();
}

C++ Fork Join Parallelism Blocking

Suppose you wish to run a section in parallel, then merge back into the main thread, then go back to a parallel section, and so on. Similar to the childhood game red light, green light.
I've given an example of what I'm trying to do: I use a condition variable to block the threads at the start, then want to start them all in parallel, and then block them again at the end so their results can be printed serially. The *= operation could be a much larger operation spanning many seconds. Reusing the threads is also important. Using a task queue might be too heavy.
I need to use some kind of blocking construct that isn't just a plain busy loop, because I know how to solve this problem with busy loops.
In English:
Thread 1 creates 10 threads that are blocked
Thread 1 signals all threads to start (without blocking each other)
Thread 2-11 process their exclusive memory
Thread 1 is waiting until 2-11 are complete (can use an atomic to count here)
Thread 2-11 complete, each can notify for 1 to check its condition if necessary
Thread 1 checks its condition and prints the array
Thread 1 resignals 2-11 to process again, continuing from 2
Example code (Naive adapted from example on cplusplus.com):
// condition_variable example
#include <iostream> // std::cout
#include <thread> // std::thread
#include <mutex> // std::mutex, std::unique_lock
#include <condition_variable> // std::condition_variable
#include <atomic>
std::mutex mtx;
std::condition_variable cv;
bool ready = false;
std::atomic<int> count(0);
bool end = false;
int a[10];
void doublea (int id) {
while(!end) {
std::unique_lock<std::mutex> lck(mtx);
while (!ready) cv.wait(lck);
a[id] *= 2;
count.fetch_add(1);
}
}
void go() {
std::unique_lock<std::mutex> lck(mtx);
ready = true;
cv.notify_all();
ready = false; // Naive
while (count.load() < 10) sleep(1);
for(int i = 0; i < 10; i++) {
std::cout << a[i] << std::endl;
}
ready = true;
cv.notify_all();
ready = false;
while (count.load() < 10) sleep(1);
for(int i = 0; i < 10; i++) {
std::cout << a[i] << std::endl;
}
end = true;
cv.notify_all();
}
int main () {
std::thread threads[10];
// spawn 10 threads:
for (int i=0; i<10; ++i) {
a[i] = 0;
threads[i] = std::thread(doublea,i);
}
std::cout << "10 threads ready to race...\n";
go(); // go!
return 0;
}
This is not trivial to implement efficiently. Moreover, it does not make much sense unless you are learning the subject. A condition variable is not a good choice here because it does not scale well.
I suggest you look at how mature run-time libraries implement fork-join parallelism and learn from them, or use them in your app. See http://www.openmprtl.org/, http://opentbb.org/, https://www.cilkplus.org/ - all of these are open source.
OpenMP is the closest model to what you are looking for, and it has the most efficient implementation of fork-join barriers. It has its disadvantages, though, because it is designed for HPC and lacks dynamic composability. TBB and Cilk work best for nested parallelism and for use in modules and libraries that may be called from within external parallel regions.
You can use a barrier or a condition variable to start all threads. Thread one can then wait until all threads have finished their work (by calling join on each thread, which blocks) and then print their data in one for loop.
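If you do want to reuse the threads across phases with plain C++11, one possible sketch uses a generation counter next to the condition variable. The name run_rounds, the generation/remaining bookkeeping, and the initial value of 1 (so the doubling is visible) are my assumptions, not something from the question, and this makes no claim to scale as well as the libraries above.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: runs `rounds` fork-join phases over `n` reusable threads.
// Returns a snapshot of the array taken in the serial section after each phase.
std::vector<std::vector<int>> run_rounds(int n, int rounds) {
    std::mutex mtx;
    std::condition_variable start_cv;  // main -> workers: a new phase has begun
    std::condition_variable done_cv;   // workers -> main: the phase is complete
    int generation = 0;                // incremented once per phase
    int remaining = 0;                 // workers still busy in the current phase
    bool end = false;
    std::vector<int> a(n, 1);
    std::vector<std::vector<int>> snapshots;

    auto worker = [&](int id) {
        int seen = 0;                  // last generation this worker handled
        for (;;) {
            {
                std::unique_lock<std::mutex> lck(mtx);
                start_cv.wait(lck, [&] { return end || generation != seen; });
                if (end) return;
                seen = generation;
            }
            a[id] *= 2;                // exclusive slot, so no lock is held here
            {
                std::lock_guard<std::mutex> lck(mtx);
                if (--remaining == 0) done_cv.notify_one();
            }
        }
    };

    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) threads.emplace_back(worker, i);

    for (int r = 0; r < rounds; ++r) {
        std::unique_lock<std::mutex> lck(mtx);
        remaining = n;
        ++generation;
        start_cv.notify_all();
        done_cv.wait(lck, [&] { return remaining == 0; });
        snapshots.push_back(a);        // serial section: all workers are parked
    }
    {
        std::lock_guard<std::mutex> lck(mtx);
        end = true;
    }
    start_cv.notify_all();
    for (auto& t : threads) t.join();
    return snapshots;
}
```

Each worker remembers the last generation it handled, so resetting a shared ready flag (the line marked "Naive" in the question) is not needed, and a worker cannot run the same phase twice.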

Shutdown boost threads correctly

I have x boost threads that work at the same time. One producer thread fills a synchronised queue with calculation tasks. The consumer threads pop tasks and calculate them.
Image Source: https://www.quantnet.com/threads/c-multithreading-in-boost.10028/
The user may quit the program during this process, so I need to shut down my threads properly. My current approach does not seem to work, since exceptions are thrown. The intent is that on system shutdown all processes are killed and stop their current task no matter what they are doing. Could you please show me how you would kill those threads?
Thread Initialisation:
for (int i = 0; i < numberOfThreads; i++)
{
std::thread* thread = new std::thread(&MyManager::worker, this);
mThreads.push_back(thread);
}
Thread Destruction:
void MyManager::shutdown()
{
for (int i = 0; i < numberOfThreads; i++)
{
mThreads.at(i)->join();
delete mThreads.at(i);
}
mThreads.clear();
}
Worker:
void MyManager::worker()
{
while (true)
{
int current = waitingList.pop();
Object * p = objects.at(current);
p->calculateMesh(); //this task is internally locked by a mutex
try
{
boost::this_thread::interruption_point();
}
catch (const boost::thread_interrupted&)
{
// Thread interruption request received, break the loop
std::cout << "- Thread interrupted. Exiting thread." << std::endl;
break;
}
}
}
Synchronised Queue:
#include <queue>
#include <thread>
#include <mutex>
#include <condition_variable>
template <typename T>
class ThreadSafeQueue
{
public:
T pop()
{
std::unique_lock<std::mutex> mlock(mutex_);
while (queue_.empty())
{
cond_.wait(mlock);
}
auto item = queue_.front();
queue_.pop();
return item;
}
void push(const T& item)
{
std::unique_lock<std::mutex> mlock(mutex_);
queue_.push(item);
mlock.unlock();
cond_.notify_one();
}
int sizeIndicator()
{
std::unique_lock<std::mutex> mlock(mutex_);
return queue_.size();
}
private:
bool isEmpty() {
std::unique_lock<std::mutex> mlock(mutex_);
return queue_.empty();
}
std::queue<T> queue_;
std::mutex mutex_;
std::condition_variable cond_;
};
The thrown error call stack:
... std::_Mtx_lockX(_Mtx_internal_imp_t * * _Mtx) Line 68 C++
... std::_Mutex_base::lock() Line 42 C++
... std::unique_lock<std::mutex>::unique_lock<std::mutex>(std::mutex & _Mtx) Line 220 C++
... ThreadSafeQueue<int>::pop() Line 13 C++
... MyManager::worker() Line 178 C++
From my experience on working with threads in both Boost and Java, trying to shut down threads externally is always messy. I've never been able to really get that to work cleanly.
The best I've gotten is to have a boolean value available to all the consumer threads that is set to true. When you set it to false, the threads will simply return on their own. In your case, that could easily be put into the while loop you have.
On top of that, you're going to need some synchronization so that you can wait for the threads to return before you delete them; otherwise you can get behavior that is hard to diagnose.
An example from a past project of mine:
Thread creation
barrier = new boost::barrier(numOfThreads + 1);
threads = new detail::updater_thread*[numOfThreads];
for (unsigned int t = 0; t < numOfThreads; t++) {
//This object is just a wrapper class for the boost thread.
threads[t] = new detail::updater_thread(barrier, this);
}
Thread destruction
for (unsigned int i = 0; i < numOfThreads; i++) {
threads[i]->requestStop();//Notify all threads to stop.
}
barrier->wait();//The update request will allow the threads to get the message to shutdown.
for (unsigned int i = 0; i < numOfThreads; i++) {
threads[i]->waitForStop();//Wait for all threads to stop.
delete threads[i];//Now we are safe to clean up.
}
Some methods that may be of interest from the thread wrapper.
//Constructor
updater_thread::updater_thread(boost::barrier * barrier)
{
this->barrier = barrier;
running = true;
thread = boost::thread(&updater_thread::run, this);
}
void updater_thread::run() {
while (running) {
barrier->wait();
if (!running) break;
//Do stuff
barrier->wait();
}
}
void updater_thread::requestStop() {
running = false;
}
void updater_thread::waitForStop() {
thread.join();
}
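The same flag idea can also be sketched with the standard library only, by letting the queue itself carry the stop flag so that threads blocked in pop() wake up. StoppableQueue and run_workers are hypothetical names of mine, and the counter merely stands in for calculateMesh(); this is an illustration under those assumptions, not the code from my project.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Sketch of a queue whose blocking pop() can be released by a shutdown request.
template <typename T>
class StoppableQueue {  // hypothetical name
public:
    // Returns false once stop() has been called; otherwise blocks until an item exists.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return stopped_ || !queue_.empty(); });
        if (stopped_) return false;
        out = std::move(queue_.front());
        queue_.pop();
        return true;
    }
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(item));
        }
        cond_.notify_one();
    }
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopped_ = true;
        }
        cond_.notify_all();  // wake every consumer blocked in pop()
    }
private:
    std::queue<T> queue_;
    std::mutex mutex_;
    std::condition_variable cond_;
    bool stopped_ = false;
};

// Hypothetical driver: starts workers, feeds tasks, shuts down, and
// returns how many tasks were processed before the stop took effect.
int run_workers(int numberOfThreads, int numberOfTasks) {
    StoppableQueue<int> waitingList;
    std::mutex count_mutex;
    int processed = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < numberOfThreads; ++i) {
        threads.emplace_back([&] {
            int current;
            while (waitingList.pop(current)) {
                std::lock_guard<std::mutex> lock(count_mutex);
                ++processed;  // stands in for calculateMesh()
            }
        });
    }
    for (int i = 0; i < numberOfTasks; ++i) waitingList.push(i);
    waitingList.stop();  // tasks still queued are abandoned, as the question asks
    for (auto& t : threads) t.join();
    return processed;
}
```

A task that is already running still finishes before its thread exits; only the waiting and the not-yet-started tasks are cut short. Holding the threads by value in a std::vector also removes the new/delete pairs.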
Try moving the 'try' up (as in the sample below). If your thread is waiting for data (inside waitingList.pop()), it may be waiting inside the condition variable's wait(). That is an 'interruption point', so it may throw when the thread gets interrupted.
void MyManager::worker()
{
while (true)
{
try
{
int current = waitingList.pop();
Object * p = objects.at(current);
p->calculateMesh(); //this task is internally locked by a mutex
boost::this_thread::interruption_point();
}
catch (const boost::thread_interrupted&)
{
// Thread interruption request received, break the loop
std::cout << "- Thread interrupted. Exiting thread." << std::endl;
break;
}
}
}
Maybe you are catching the wrong exception class?
Which would mean it does not get caught.
Not too familiar with threads but is it the mix of std::threads and boost::threads that is causing this?
Try catching the lowest parent exception.
I think this is a classic problem of reader/writer threads working on a common buffer. One of the most secure ways to solve it is to use mutexes and signals. (I am not able to post the code here. Please send me an email and I will send the code to you.)

C++11 thread to modify std::list

I'll post my code, and then tell you what I think it's doing.
#include <thread>
#include <mutex>
#include <list>
#include <iostream>
using namespace std;
...
//List of threads and ints
list<thread> threads;
list<int> intList;
//Whether or not a thread is running
bool running(false);
//Counters
int busy(0), counter(0);
//Add 10000 elements to the list
for (int i = 0; i < 10000; ++i){
//push back an int
intList.push_back(i);
counter++;
//If the thread is running, make a note of it and continue
if (running){
busy++;
continue;
}
//If we haven't yet added 10 elements before a reset, continue
if (counter < 10)
continue;
//If we've added more than 10 ints, and there's no active thread,
//reset the counter and launch
counter = 0;
threads.push_back(std::thread([&]
//These iterators are function args
(list<int>::iterator begin, list<int>::iterator end){
//mutex for the running bool
mutex m;
m.lock();
running = true;
m.unlock();
//Remove either 10 elements or every element till the end
int removed(0);
while (removed < 10 && begin != end){
begin = intList.erase(begin);
removed++;
}
//unlock the running bool
m.lock();
running = false;
m.unlock();
//Pass into the thread func the current beginning and end of the list
}, intList.begin(), intList.end()));
}
for (auto& thread : threads){
thread.join();
}
What I think this code is doing is adding 10000 elements to the end of a list. For every 10 we add, launch a (single) thread that deletes the first 10 elements of the list (at the time the thread was launched).
I don't expect this to remove every list element, I was just interested in seeing if I could add to the end of a list while removing elements from the beginning. In Visual Studio I get a "list iterators incompatible" error quite often, but I figure the problem is cross platform.
What's wrong with my thinking? I know it's something
EDIT:
So I see now that this code is very incorrect. Really I just want one auxiliary thread active at a time to delete elements, which is why I thought calling erase was OK. However, I don't know how to declare a thread without joining it, and if I wait for that then I don't really see the point of doing any of this.
Should I declare my thread before the loop and have it wait for a signal from the main thread?
To clarify, my goal here is to do the following: I want to grab keyboard presses on one thread and store them in a list, and every so often log them to a file on a separate thread while removing the things I've logged. Since I don't want to spend a lot of time writing to the disk, I'd like to write in discrete chunks (of 10).
Thanks to Christophe, and everyone else. Here's my code now... I may be using lock_guard incorrectly.
#include <thread>
#include <mutex>
#include <list>
#include <iostream>
#include <atomic>
using namespace std;
...
atomic<bool> running(false);
list<int> intList;
int busy(0), counter(0);
mutex m;
thread * t(nullptr);
for (int i = 0; i < 100000; ++i){
//Would a lock_guard here be inappropriate?
m.lock();
intList.push_back(i);
m.unlock();
counter++;
if (running){
busy++;
continue;
}
if (counter < 10)
continue;
counter = 0;
if (t){
t->join();
delete t;
}
t = new thread([&](){
running = true;
int removed(0);
while (removed < 10){
lock_guard<mutex> lock(m);
if (intList.size())
intList.erase(intList.begin());
removed++;
}
running = false;
});
}
if (t){
t->join();
delete t;
}
Your code won't work because:
your mutex is local to each thread (each thread has its own copy, used only by itself: no chance of inter-thread synchronisation!)
intList is not thread-safe, but you access it from several threads, causing race conditions and undefined behaviour.
the begin and end iterators that you pass to your threads at creation might no longer be valid during execution.
Here are some improvements (look at the commented lines):
atomic<bool> running(false); // <=== atomic (to avoid unnecessary use of mutex)
int busy(0), counter(0);
mutex l; // define the mutex here, so that it will be the same for all threads
for (int i = 0; i < 10000; ++i){
l.lock(); // <===you need to protect each access to the list
intList.push_back(i);
l.unlock(); // <===and unlock
counter++;
if (running){
busy++;
continue;
}
if (counter < 10)
continue;
counter = 0;
threads.push_back(std::thread([&]
(){ //<====No iterator args as they might be outdated during execution of threads!!
running = true; // <=== no longer surrounded from lock/unlock as it is now atomic
int removed(0);
while (removed < 10){
l.lock(); // <====you really need to protect access to the list
if (intList.size()) // <=== check if elements exist NOW
intList.erase(intList.begin()); // <===use current data, not a prehistoric outdated local begin !!
l.unlock(); // <====end of protected section
removed++;
}
running = false; // <=== no longer surrounded from lock/unlock as it is now atomic
})); //<===No other arguments
}
...
By the way, I'd suggest that you have a look at lock_guard<mutex> for the locks, as it ensures the unlock in all circumstances (especially when there are exceptions or other surprises like that).
Edit: I've avoided the lock protection of running with a mutex, by making it atomic<bool>.
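For the stated goal (collect key presses on one thread, log them in chunks of 10 on another), a single long-lived consumer may be simpler than launching a thread per chunk. The following is only a sketch under my own assumptions: drain_in_chunks is a made-up helper, ints stand in for key presses, recording the chunk size stands in for the file write, and chunk is assumed to be at least 1.

```cpp
#include <algorithm>
#include <condition_variable>
#include <cstddef>
#include <iterator>
#include <list>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: a producer appends `total` ints while one consumer thread
// removes them in chunks of up to `chunk` (assumed >= 1). Returns each chunk's size.
std::vector<std::size_t> drain_in_chunks(int total, std::size_t chunk) {
    std::list<int> intList;
    std::mutex m;
    std::condition_variable cv;
    bool finished = false;
    std::vector<std::size_t> chunks;  // touched only by the consumer until join()

    std::thread consumer([&] {
        for (;;) {
            std::list<int> batch;
            {
                std::unique_lock<std::mutex> lock(m);
                cv.wait(lock, [&] { return finished || intList.size() >= chunk; });
                if (intList.empty()) return;  // finished and nothing left to log
                auto last = intList.begin();
                std::advance(last, std::min(chunk, intList.size()));
                // splice moves the nodes into batch without copying the elements
                batch.splice(batch.begin(), intList, intList.begin(), last);
            }
            chunks.push_back(batch.size());   // slow I/O would happen here, unlocked
        }
    });

    for (int i = 0; i < total; ++i) {
        {
            std::lock_guard<std::mutex> lock(m);  // released at the end of this scope
            intList.push_back(i);
        }
        cv.notify_one();
    }
    {
        std::lock_guard<std::mutex> lock(m);
        finished = true;
    }
    cv.notify_one();
    consumer.join();
    return chunks;
}
```

The mutex is held only while elements are moved between lists, so the producer is never blocked by the disk write. This also answers the lock_guard question in the updated code: putting one in a small scope around push_back is a reasonable use of it.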

Stop infinite looping thread from main

I am relatively new to threads, and I'm still learning best techniques and the C++11 thread library. Right now I'm in the middle of implementing a worker thread which infinitely loops, performing some work. Ideally, the main thread would want to stop the loop from time to time to sync with the information that the worker thread is producing, and then start it again. My idea initially was this:
// Code run by worker thread
void thread() {
while(run_) {
// Do lots of work
}
}
// Code run by main thread
void start() {
if ( run_ ) return;
run_ = true;
// Start thread
}
void stop() {
if ( !run_ ) return;
run_ = false;
// Join thread
}
// Somewhere else
volatile bool run_ = false;
I was not completely sure about this, so I started researching, and I discovered that volatile is actually not required for synchronization and is in fact generally harmful. I also discovered this answer, which describes a process nearly identical to the one I thought about. In the answer's comments, however, this solution is described as broken, as volatile does not guarantee that different processor cores readily (if ever) communicate changes to the volatile values.
My question is this then: Should I use an atomic flag, or something else entirely? What exactly is the property that is lacking in volatile and that is then provided by whatever construct is needed to solve my problem effectively?
Have you looked at mutexes? They are made to lock threads out of shared data and so avoid conflicts. Is that what you're looking for?
I think you want to use barrier synchronization with std::mutex?
Also take a look at Boost.Thread for a relatively high-level threading library.
Take a look at this code sample from the link:
#include <iostream>
#include <map>
#include <string>
#include <chrono>
#include <thread>
#include <mutex>
std::map<std::string, std::string> g_pages;
std::mutex g_pages_mutex;
void save_page(const std::string &url)
{
// simulate a long page fetch
std::this_thread::sleep_for(std::chrono::seconds(2));
std::string result = "fake content";
g_pages_mutex.lock();
g_pages[url] = result;
g_pages_mutex.unlock();
}
int main()
{
std::thread t1(save_page, "http://foo");
std::thread t2(save_page, "http://bar");
t1.join();
t2.join();
g_pages_mutex.lock(); // not necessary as the threads are joined, but good style
for (const auto &pair : g_pages) {
std::cout << pair.first << " => " << pair.second << '\n';
}
g_pages_mutex.unlock();
}
I would suggest using std::mutex and std::condition_variable to solve the problem. Here's an example of how it can work with C++11:
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>
using namespace std;
int main()
{
mutex m;
condition_variable cv;
// Tells, if the worker should stop its work
bool done = false;
// Zero means, it can be filled by the worker thread.
// Non-zero means, it can be consumed by the main thread.
int result = 0;
// run worker thread
auto t = thread{ [&]{
auto bound = 1000;
for (;;) // ever
{
auto sum = 0;
for ( auto i = 0; i != bound; ++i )
sum += i;
++bound;
auto lock = unique_lock<mutex>( m );
// wait until we can safely write the result
cv.wait( lock, [&]{ return result == 0; });
// write the result
result = sum;
// wake up the consuming thread
cv.notify_one();
// exit the loop, if flag is set. This must be
// done with mutex protection. Hence this is not
// in the for-condition expression.
if ( done )
break;
}
} };
// the main threads loop
for ( auto i = 0; i != 20; ++i )
{
auto r = 0;
{
// lock the mutex
auto lock = unique_lock<mutex>( m );
// wait until we can safely read the result
cv.wait( lock, [&]{ return result != 0; } );
// read the result
r = result;
// set result to zero so the worker can
// continue to produce new results.
result = 0;
// wake up the producer
cv.notify_one();
// the lock is released here (the end of the scope)
}
// do time consuming io at the side.
cout << r << endl;
}
// tell the worker to stop
{
auto lock = unique_lock<mutex>( m );
result = 0;
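done = true;
// wake the worker in case it is blocked waiting for result == 0;
// without this it could sleep forever and join() would hang
cv.notify_one();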
// again the lock is released here
}
// wait for the worker to finish.
t.join();
cout << "Finished." << endl;
}
You could do the same with std::atomic by essentially implementing spin locks. Spin locks can be slower than mutexes. So I repeat the advice from the Boost website:
Do not use spinlocks unless you are certain that you understand the consequences.
I believe that mutexes and condition variables are the way to go in your case.