I am learning to write multithreaded applications. So far I run into trouble any time I want my threads to access even the simplest shared resources, despite using a mutex.
For example, consider this code:
using namespace std;
mutex mu;
std::vector<string> ob;
void addSomeAValues(){
mu.lock();
for(int a=0; a<10; a++){
ob.push_back("A" + std::to_string(a));
usleep(300);
}
mu.unlock();
}
void addSomeBValues(){
mu.lock();
for(int b=0; b<10; b++){
ob.push_back("B" + std::to_string(b));
usleep(300);
}
mu.unlock();
}
int main() {
std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
thread t0(addSomeAValues);
thread t1(addSomeBValues);
std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
t0.join();
t1.join();
//Display the results
cout << "Code Run Complete; results: \n";
for(auto k : ob){
cout << k <<endl;
}
//Code running complete, report the time it took
typedef std::chrono::duration<int,std::milli> millisecs_t;
millisecs_t duration(std::chrono::duration_cast<millisecs_t>(end-start));
std::cout << duration.count() << " milliseconds.\n";
return 0;
}
When I run the program, it behaves unpredictably. Sometimes the values A0-9 and B0-9 are printed to the console with no problem, sometimes there is a segmentation fault with a crash report, and sometimes only A0-3 and B0-5 are printed.
If I am missing a core synchronization issue, please help.
Edit: after a lot of useful feedback I changed the code to:
#include <iostream>
#include <string>
#include <vector>
#include <mutex>
#include <unistd.h>
#include <thread>
#include <chrono>
using namespace std;
mutex mu;
std::vector<string> ob;
void addSomeAValues(){
for(int a=0; a<10; a++){
mu.lock();
ob.push_back("A" + std::to_string(a));
mu.unlock();
usleep(300);
}
}
void addSomeBValues(){
for(int b=0; b<10; b++){
mu.lock();
ob.push_back("B" + std::to_string(b));
mu.unlock();
usleep(300);
}
}
int main() {
std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now() ;
thread t0(addSomeAValues);
thread t1(addSomeBValues);
std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now() ;
t0.join();
t1.join();
//Display the results
cout << "Code Run Complete; results: \n";
for(auto k : ob){
cout << k <<endl;
}
//Code running complete, report the time it took
typedef std::chrono::duration<int,std::milli> millisecs_t ;
millisecs_t duration( std::chrono::duration_cast<millisecs_t>(end-start) ) ;
std::cout << duration.count() << " milliseconds.\n" ;
return 0;
}
However, I sometimes get the following output:
*** Error in `/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment':
double free or corruption (fasttop): 0x00007f19fc000920 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x80a46)[0x7f1a0687da46]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402dd4]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402930]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402a8d]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402637
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402278]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x4019cf]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x4041e3]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x404133]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x404088]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb29f0)[0x7f1a06e8d9f0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7f8e)[0x7f1a060c6f8e]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f1a068f6e1d]
Update & Solution
With the problem I was experiencing (namely, unpredictable execution of the program with intermittent dumps of corruption complaints), all was solved by including -lpthread as part of my Eclipse build (under project settings).
I am using C++11. It's odd, at least to me, that the program would compile without issuing a complaint that I have not yet linked against pthread.
So to anyone using C++11, std::thread, and Linux: make sure you link against pthread, otherwise your program's runtime behavior will be VERY unpredictable and buggy.
If you're going to use threads, I'd advise doing the job at least a little differently.
Right now, one thread gets the mutex, does all it's going to do (including sleeping for 3000 microseconds), then quits. Then the other thread does essentially the same thing. This being the case, threads have accomplished essentially nothing positive and a fair amount of negative (synchronization code and such).
Your current code is also unsafe with respect to exceptions -- if an exception were to be thrown inside one of your thread functions, the mutex wouldn't be unlocked, even though that thread could no longer execute.
Finally, right now, you're exposing a mutex, and leaving it to all code that accesses the associated resource to use the mutex correctly. I'd prefer to centralize the mutex locking so it's exception safe, and most of the code can ignore it completely.
// use std::lock_guard, if available.
class lock {
mutex &m;
public:
lock(mutex &m) : m(m) { m.lock(); }
~lock() { m.unlock(); }
};
class synched_vec {
mutex m;
std::vector<string> data;
public:
void push_back(std::string const &s) {
lock l(m);
data.push_back(s);
}
} ob;
void addSomeAValues(){
for(int a=0; a<10; a++){
ob.push_back("A" + std::to_string(a));
usleep(300);
}
}
This also means that if (for example) you decide to use a lock-free (or minimal locking) structure in the future, you should only have to modify the synched_vec, not all the rest of the code that uses it. Likewise, by keeping all the mutex handling in one place, it's much easier to get the code right, and if you do find a bug, much easier to ensure you've fixed it (rather than looking through all the client code).
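As the comment in the code above suggests, std::lock_guard can replace the hand-written lock class. The following is only a minimal sketch of that variant, not the answer's exact code: the snapshot() member and the fill_from_two_threads() driver are hypothetical names added here for illustration.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Same idea as synched_vec above, but using the standard RAII guard.
class synched_vec {
    mutable std::mutex m;               // mutable so const members can lock it
    std::vector<std::string> data;
public:
    void push_back(std::string const &s) {
        std::lock_guard<std::mutex> guard(m);   // unlocked when guard leaves scope
        data.push_back(s);
    }
    // Hypothetical helper (not in the original): copy the contents under the lock.
    std::vector<std::string> snapshot() const {
        std::lock_guard<std::mutex> guard(m);
        return data;
    }
};

// Hypothetical driver: two threads append ten values each; returns the final size.
std::size_t fill_from_two_threads() {
    synched_vec ob;
    auto add = [&ob](char prefix) {
        for (int i = 0; i < 10; ++i)
            ob.push_back(prefix + std::to_string(i));
    };
    std::thread t0(add, 'A');
    std::thread t1(add, 'B');
    t0.join();
    t1.join();
    return ob.snapshot().size();
}
```

Calling fill_from_two_threads() from main should always yield 20 elements, although the order in which the A and B values interleave is not fixed.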
The code in the question runs without any segmentation faults (with adding headers and replacing the sleep with a sleep for my system).
There are two problems with the code though, that could cause unexpected results:
Each thread holds the mutex for its entire execution. This prevents the other thread from running, so the two threads do not run in parallel! In your case, you should lock only while you are accessing the vector.
Your end time point is taken right after creating the threads, not after they have finished executing. Both threads are done only once both have been joined.
Working, compilable code with headers, a chrono-based sleep, and the two errors fixed:
#include <chrono>
#include <mutex>
#include <string>
#include <vector>
#include <thread>
#include <iostream>
std::mutex mu;
std::vector<std::string> ob;
void addSomeAValues(){
for(int a=0; a<10; a++){
mu.lock();
ob.push_back("A" + std::to_string(a));
mu.unlock();
std::this_thread::sleep_for(std::chrono::milliseconds(300));
}
}
void addSomeBValues(){
for(int b=0; b<10; b++){
mu.lock();
ob.push_back("B" + std::to_string(b));
mu.unlock();
std::this_thread::sleep_for(std::chrono::milliseconds(300));
}
}
int main() {
std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
std::thread t0(addSomeAValues);
std::thread t1(addSomeBValues);
t0.join();
t1.join();
std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
//Display the results
std::cout << "Code Run Complete; results: \n";
for(auto k : ob){
std::cout << k << std::endl;
}
//Code running complete, report the time it took
typedef std::chrono::duration<int,std::milli> millisecs_t;
millisecs_t duration(std::chrono::duration_cast<millisecs_t>(end-start));
std::cout << duration.count() << " milliseconds.\n";
return 0;
}
'''The original post has been edited'''
How can I make a thread pool for two for loops in C++? I need to run the start_thread function 22 times for each number between 0 and 6. And I will have a flexible number of threads available depending on the machine I am using. How can I create a pool to allocate the free threads to the next of the nested loop?
for (int t=0; t <22; t++){
for(int p=0; p<6; p++){
thread th1(start_thread, p);
thread th2(start_thread, p);
th1.join();
th2.join();
}
}
Not really certain about what you want, but maybe it's something like this.
for (int t=0; t <22; t++){
std::vector<std::thread> th;
for(int p=0; p<6; p++){
th.emplace_back(std::thread(start_thread, p));
}
for(int p=0; p<6; p++){
th[p].join();
}
}
(or maybe permute the two loops)
Edit: if you want to control the number of threads:
#include <iostream>
#include <thread>
#include <vector>
void
start_thread(int t, int p)
{
std::cout << "th " << t << ' ' << p << '\n';
}
void
join_all(std::vector<std::thread> &th)
{
for(auto &e: th)
{
e.join();
}
th.clear();
}
int
main()
{
std::size_t max_threads=std::thread::hardware_concurrency();
std::vector<std::thread> th;
for(int t=0; t <22; ++t)
{
for(int p=0; p<6; ++p)
{
th.emplace_back(std::thread(start_thread, t, p));
if(th.size()==max_threads)
{
join_all(th);
}
}
}
join_all(th);
return 0;
}
If you don't want dependency on a third-party library, this is pretty simple.
Just create a number of threads you like and let them pick a "job" from some queue.
For example:
#include <iostream>
#include <mutex>
#include <chrono>
#include <vector>
#include <thread>
#include <queue>
void work(int p)
{
// do the "work"
std::this_thread::sleep_for(std::chrono::milliseconds(200));
std::cout << p << std::endl;
}
std::mutex m;
std::queue<int> jobs;
void worker()
{
while (true)
{
int job(0);
// sync access to the jobs queue
{
std::lock_guard<std::mutex> l(m);
if (jobs.empty())
return;
job = jobs.front();
jobs.pop();
}
work(job);
}
}
int main()
{
// queue all jobs
for (int t = 0; t < 22; t++) {
for (int p = 0; p < 6; p++) {
jobs.push(p);
}
}
// create reasonable number of threads
static const int n = std::thread::hardware_concurrency();
std::vector<std::thread> threads;
for (int i = 0; i < n; ++i)
threads.emplace_back(std::thread(worker));
// wait for all of them to finish
for (int i = 0; i < n; ++i)
threads[i].join();
}
[ADDED] Obviously, you don't want global variables in your production code; this is simply a demo solution.
Stop trying to code and draw out what you need to do and the pieces you need to have in order to do it.
You need one queue to hold the jobs, one mutex to protect the queue so the threads don't smurf it up with simultaneous accesses, and N threads.
Each thread function is a loop that
1. grabs the mutex,
2. gets a job from the queue,
3. releases the mutex, and
4. processes the job.
In this case I'd keep things simple by exiting the loop and the thread when there are no more jobs in the queue in step 2. In production you'd have the thread block and wait on the queue so it's still available to service jobs added later.
Wrap that up in a class with a function that allows you to add jobs to the queue, a function to start N threads, and a function to join on all of the running threads.
main defines an instance of the class, feeds in the jobs, starts the thread pool and then blocks on join until everyone's done.
Once you've beaten the design into something you have high confidence does what you need it to do, then you start writing code. Write code, especially multi-threaded code, without a plan and you're in for a lot of debugging and re-writing that usually exceeds the time spent on design by a significant margin.
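The design described above could be sketched roughly as follows. This is one possible reading, not a definitive implementation: the class name job_pool, its member names, and the run_all_jobs() driver are hypothetical, and each job is just a counter increment standing in for start_thread(p).

```cpp
#include <atomic>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical class following the design above: one queue, one mutex, N threads.
class job_pool {
    std::mutex m;                               // protects jobs
    std::queue<std::function<void()>> jobs;
    std::vector<std::thread> threads;

    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::lock_guard<std::mutex> guard(m);  // grab the mutex
                if (jobs.empty()) return;              // no more jobs: thread exits
                job = std::move(jobs.front());         // get a job from the queue
                jobs.pop();
            }                                          // mutex released here
            job();                                     // process the job
        }
    }
public:
    // Add all jobs before calling start(); workers exit once the queue is empty.
    void add(std::function<void()> job) {
        std::lock_guard<std::mutex> guard(m);
        jobs.push(std::move(job));
    }
    void start(unsigned n) {
        if (n == 0) n = 1;   // hardware_concurrency() may return 0
        for (unsigned i = 0; i < n; ++i)
            threads.emplace_back(&job_pool::run, this);
    }
    void join() {
        for (auto &t : threads) t.join();
        threads.clear();
    }
};

// Hypothetical driver: queues the 22*6 jobs from the question and counts how many ran.
int run_all_jobs() {
    std::atomic<int> done{0};
    job_pool pool;
    for (int t = 0; t < 22; ++t)
        for (int p = 0; p < 6; ++p)
            pool.add([&done] { ++done; });   // stand-in for start_thread(p)
    pool.start(std::thread::hardware_concurrency());
    pool.join();
    return done.load();
}
```

Because the workers exit as soon as the queue is empty, this sketch only suits the "all jobs known up front" case; a pool that must accept jobs later would block on a condition variable instead.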
Since C++17 you can use one of the execution policies for many of the algorithms in the standard library. This can greatly simplify going over a number of work packages. What goes on behind the curtains is usually that it picks threads from a built-in thread pool and distributes work to them efficiently. It usually uses just enough™ threads on both Linux and Windows, and it'll use all the CPU you've got left (0% idle on all cores when the CPUs have started spinning at max frequency), strangely without making either Linux or Windows "sluggish".
Here I've used the execution policy std::execution::parallel_policy (indicated by the std::execution::par constant). If you can prepare the work that needs to be done and put it in a container, like a std::vector, it'll be really easy.
#include <algorithm>
#include <chrono>
#include <execution> // std::execution::par
#include <iostream>
// #include <thread> // not needed to run with execution policies
#include <vector>
struct work_package {
work_package() : payload(co) { ++co; }
int payload;
static int co;
};
int work_package::co = 10;
int main() {
std::vector<work_package> wps(22*6); // 132 work packages
for(const auto& wp : wps) std::cout << wp.payload << '\n'; // prints 10 to 141
// work on the work packages
std::for_each(std::execution::par, wps.begin(), wps.end(), [](auto& wp) {
// Probably in a thread - As long as you do not write to the same work package
// from different threads, you don't need synchronization here.
// do some work with the work package
++wp.payload;
});
for(const auto& wp : wps) std::cout << wp.payload << '\n'; // prints 11 to 142
}
With g++ you may need to install tbb (The Threading Building Blocks) that you also need to link with: -ltbb.
apt install libtbb-dev on Ubuntu.
dnf install tbb-devel.x86_64 on Fedora.
Other distributions may call it something different.
Visual Studio (2017 and later) links with the proper library automatically (also tbb, if I'm not mistaken).
#include <iostream>
#include <mutex>
#include <condition_variable>
#include <thread>
std::mutex lock_bar_;
std::mutex lock_foo_;
int n = 3;
void foo() {
for (int i = 0; i < n; i++) {
lock_foo_.lock();
// printFoo() outputs "foo". Do not change or remove this line.
std::cout << "1\n";
lock_bar_.unlock();
}
}
void bar() {
for (int i = 0; i < n; i++) {
lock_bar_.lock();
// printBar() outputs "bar". Do not change or remove this line.
std::cout << "2\n";
lock_foo_.unlock();
}
}
int main(){
lock_bar_.lock();
std::thread t1{foo};
std::thread t2{bar};
t1.join(); // line 1
std::cout << "333\n"; // line 2
t2.join(); // line 3
std::cout << "3\n"; // line 4
}
The result is
1
2
1
2
1
2
333
3
or
1
2
1
2
1
333
2
3
My question is: why can this program run without deadlock?
How does join() actually work?
When the program executes line 1, according to cppreference https://en.cppreference.com/w/cpp/thread/thread/join
"Blocks the current thread until the thread identified by *this finishes its execution."
My understanding is that the main thread should stop and wait until thread t1 is finished, then execute line 2 and the rest.
But the program seems to execute line 1 and line 3 together: when thread t1 is finished, it runs line 2; when thread t2 is finished, it executes line 4.
I am confused about join().
If anyone can help, it would be much appreciated.
First edit:
Ignore the original program.
The new program is:
#include <iostream>
#include <mutex>
#include <condition_variable>
#include <thread>
int n = 10;
bool first = true;
std::condition_variable cv1;
std::condition_variable cv2;
std::mutex m;
void foo() {
std::unique_lock<std::mutex> ul(m, std::defer_lock);
for (int i = 0; i < n; i++) {
ul.lock();
cv1.wait(ul, [&]()->bool {return first;} );
std::cout << "1\n";
// printFoo() outputs "foo". Do not change or remove this line.
first = !first;
ul.unlock();
cv2.notify_all();
}
}
void bar() {
std::unique_lock<std::mutex> ul(m, std::defer_lock);
for (int i = 0; i < n; i++) {
ul.lock();
cv2.wait(ul, [&]()->bool {return !first;} );
// printBar() outputs "bar". Do not change or remove this line.
std::cout << "2\n";
first = !first;
ul.unlock();
cv1.notify_all();
}
}
int main(){
std::thread t1{foo};
std::thread t2{bar};
t1.join();
std::cout << "3\n";
t2.join();
}
The same questions apply.
Your threads do very little work. Depending on your os and number of cpu cores, threads will only switch at a fixed interval. There is a reasonable chance that after t1.join returns t2 has already finished executing (your first output).
If you add some sleeps to the loops in your threads you should see your second output every time as t2 will still be executing when t1.join returns.
Note that unlocking a mutex from a thread that didn't originally lock the mutex has undefined behaviour: https://en.cppreference.com/w/cpp/thread/mutex/unlock
You have made the wrong assumption that mutexes can be locked and unlocked from different threads.
A mutex locked by one thread cannot be unlocked by another thread. The whole lock/unlock process is per thread.
lock_bar_.unlock();
This line in your first function has no meaning. See ReleaseMutex in Windows (I guess it works that way in other OSes). It releases the mutex previously locked by the current thread, not by any other thread.
The answer is that when std::thread t2{bar} is created, the thread enters the scheduler's queue and starts running right away. That is why t2 is also executed before its join() is reached. join() does not mean start(); it only waits for the thread to finish.
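To illustrate that a std::thread begins executing at construction and that join() only waits, here is a small sketch. The function name runs_before_join() is hypothetical, and the busy-wait is for demonstration only.

```cpp
#include <atomic>
#include <thread>

// Returns true if the thread's function ran before join() was ever called.
bool runs_before_join() {
    std::atomic<bool> ran{false};
    std::thread t([&ran] { ran = true; });   // the thread starts running here
    // Busy-wait (demonstration only) until the thread has done its work.
    while (!ran) std::this_thread::yield();
    bool observed = ran;   // already true, although join() has not been called yet
    t.join();              // only waits for t to finish; it does not start it
    return observed;
}
```

The loop terminates only because the thread runs without any call to join(), which is exactly the behavior seen in the question: t2 is already printing while main is still blocked in t1.join().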
I'm trying to unit test an atomic library (I am aware that an atomic library is not well suited to unit testing, but I still want to give it a try).
For this, I want to let X parallel threads increment a counter and evaluate the resulting value (it should be X).
The code is below. The problem is that it never breaks: the Counter always nicely ends up being 2000 (see below). I also notice that each cout line is printed as a whole (instead of being mingled, which I remember seeing with other multithreaded couts).
My question is: why doesn't this break? Or how can I make it break?
#include <iostream>
#include <thread>
#include <vector>
#include <mutex>
#include <condition_variable>
std::mutex m;
std::condition_variable cv;
bool start = false;
int Counter = 0;
void Inc() {
// Wait until test says start
std::unique_lock<std::mutex> lk(m);
cv.wait(lk, [] {return start; });
std::cout << "Incrementing in thread " << std::this_thread::get_id() << std::endl;
Counter++;
}
int main()
{
std::vector<std::thread> threads;
for (int i = 0; i < 2000; ++i) {
threads.push_back(std::thread(Inc));
}
// signal the threads to start
{
std::lock_guard<std::mutex> lk(m);
start = true;
}
cv.notify_all();
for (auto& thread : threads) {
thread.join();
}
// Now check whether value is right
std::cout << "Counter: " << Counter << std::endl;
}
The results look like this (but with 2000 lines):
Incrementing in thread 130960
Incrementing in thread 130948
Incrementing in thread 130944
Incrementing in thread 130932
Incrementing in thread 130928
Incrementing in thread 130916
Incrementing in thread 130912
Incrementing in thread 130900
Incrementing in thread 130896
Counter: 2000
Any help would be appreciated
UPDATE: After reducing the number of threads to 4 but incrementing a million times in a for loop (as suggested by #tkausl), the cout lines with the thread ids still appear to be sequential.
UPDATE2: It turns out that the lock had to be unlocked (lk.unlock()) to prevent each thread from having exclusive access. An additional yield in the for loop made the race condition more visible.
cv.wait(lk, [] {return start; }); only returns with the lk acquired. So it's exclusive. You might want to unlock lk right after:
void Inc() {
// Wait until test says start
std::unique_lock<std::mutex> lk(m);
cv.wait(lk, [] {return start; });
lk.unlock();
Counter++;
}
And you must remove std::cout, because it potentially introduces synchronization.
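For contrast with the unsynchronized Counter++ (a data race on a plain int, which is undefined behavior), the sketch below shows the counter as a std::atomic<int>, where the total should always come out exact. The helper name count_with_atomic() and its parameters are hypothetical; the thread and iteration counts follow the numbers mentioned in the update.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical test helper: `threads` threads each perform `per_thread` increments.
int count_with_atomic(int threads, int per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&counter, per_thread] {
            for (int k = 0; k < per_thread; ++k)
                ++counter;   // atomic read-modify-write, so no updates are lost
        });
    }
    for (auto &t : pool) t.join();
    return counter.load();
}
```

With 4 threads and a million increments each, count_with_atomic(4, 1000000) returns 4000000, whereas the same loop on a plain int would typically lose increments.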
I was reading some literature on C++11 threads and tried the following code:
#include <iostream>
#include <thread>
using namespace std;
class background_task{
int data;
int flag;
public:
background_task(int val):data(val),flag(data%2){}
void operator()(void){
int count = 0;
while(count < 100)
{
if(flag)
cout <<'\n'<<data++;
else
cout <<'\n'<<data--;
count++;
}
}
};
int main(int argc , char** argv){
std::thread T1 {background_task(2)};
std::thread T2 {background_task(3)};
T1.join();
T2.join();
return 0;
}
The output doesn't make sense, given that I am running two threads: each should be printing at almost the same time rather than waiting for the other to finish. Instead, each thread finishes and then the next thread starts, as if they ran synchronously. Am I missing something here?
It's probably because creating a new thread takes some time, and the first thread finishes before the next one begins.
You also have the choice to detach or join a thread:
t1.detach(); // don't care about t1 finishing
t1.join();   // or: wait for t1 to finish
Your operating system need not start the threads at the same time; it need not start them on different cores; it need not provide equal time to each thread. I really don't believe the standard mandates anything of the sort, but I haven't checked the standard to cite the right parts to verify.
You may be able to (no promises!) get the behavior you desire by changing your code to the following. This code is "encouraging" the OS to give more time to both threads, and hopefully allows for both threads to be fully constructed before one of them finishes.
#include <chrono>
#include <iostream>
#include <thread>
class background_task {
public:
background_task(int val) : data(val), flag(data % 2) {}
void operator()() {
int count = 0;
while (count < 100) {
std::this_thread::sleep_for(std::chrono::milliseconds(50));
if (flag)
std::cout << '\n' << data++;
else
std::cout << '\n' << data--;
count++;
}
}
private:
int data;
int flag;
};
int main() {
std::thread T1{background_task(2)};
std::thread T2{background_task(3)};
T1.join();
T2.join();
return 0;
}
Try the code below; I modified your earlier code to show the result:
#include <iostream>
#include <thread>
using namespace std;
class background_task{
int data;
int flag;
public:
background_task(int val):data(val),flag(data%2){}
void operator()(void){
int count = 0;
while(count < 10000000)
{
if(flag)
cout <<'\n'<<"Yes";
else
cout <<'\n'<<" "<<"No";
count++;
}
}
};
int main(int argc , char** argv){
std::thread T1 {background_task(2)};
std::thread T2 {background_task(3)};
T1.join();
T2.join();
return 0;
}
By the time the second thread starts, the first thread is already done processing, hence you saw what you saw.
In addition to Amir Rasti's answer, I think it's worth mentioning the scheduler.
If you use a while(1) instead, you will see that the output isn't exactly parallel even though the two threads are running "in parallel". The scheduler (part of the operating system) will give each thread time to run, but the time can vary. So it can be that one thread prints 100 characters before the scheduler lets the other thread print again.
while(count < 10000)
The loop may finish before the next thread starts; you can see the difference if you increase the loop count or insert a sleep inside the loop.
I am relatively new to threads, and I'm still learning best techniques and the C++11 thread library. Right now I'm in the middle of implementing a worker thread which infinitely loops, performing some work. Ideally, the main thread would want to stop the loop from time to time to sync with the information that the worker thread is producing, and then start it again. My idea initially was this:
// Code run by worker thread
void thread() {
while(run_) {
// Do lots of work
}
}
// Code run by main thread
void start() {
if ( run_ ) return;
run_ = true;
// Start thread
}
void stop() {
if ( !run_ ) return;
run_ = false;
// Join thread
}
// Somewhere else
volatile bool run_ = false;
I was not completely sure about this so I started researching, and I discovered that volatile is actually not required for synchronization and is in fact generally harmful. Also, I discovered this answer, which describes a process nearly identical to the one I thought about. In the answer's comments however, this solution is described as broken, as volatile does not guarantee that different processor cores readily (if ever) communicate changes on the volatile values.
My question is this then: Should I use an atomic flag, or something else entirely? What exactly is the property that is lacking in volatile and that is then provided by whatever construct is needed to solve my problem effectively?
Have you looked into mutexes? They're made to lock threads, avoiding conflicts on the shared data. Is that what you're looking for?
I think you want to use barrier synchronization using std::mutex?
Also take a look at boost thread, for a relatively high level threading library
Take a look at this code sample from the link:
#include <iostream>
#include <map>
#include <string>
#include <chrono>
#include <thread>
#include <mutex>
std::map<std::string, std::string> g_pages;
std::mutex g_pages_mutex;
void save_page(const std::string &url)
{
// simulate a long page fetch
std::this_thread::sleep_for(std::chrono::seconds(2));
std::string result = "fake content";
g_pages_mutex.lock();
g_pages[url] = result;
g_pages_mutex.unlock();
}
int main()
{
std::thread t1(save_page, "http://foo");
std::thread t2(save_page, "http://bar");
t1.join();
t2.join();
g_pages_mutex.lock(); // not necessary as the threads are joined, but good style
for (const auto &pair : g_pages) {
std::cout << pair.first << " => " << pair.second << '\n';
}
g_pages_mutex.unlock();
}
I would suggest to use std::mutex and std::condition_variable to solve the problem. Here's an example how it can work with C++11:
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>
using namespace std;
int main()
{
mutex m;
condition_variable cv;
// Tells, if the worker should stop its work
bool done = false;
// Zero means, it can be filled by the worker thread.
// Non-zero means, it can be consumed by the main thread.
int result = 0;
// run worker thread
auto t = thread{ [&]{
auto bound = 1000;
for (;;) // ever
{
auto sum = 0;
for ( auto i = 0; i != bound; ++i )
sum += i;
++bound;
auto lock = unique_lock<mutex>( m );
// wait until we can safely write the result
cv.wait( lock, [&]{ return result == 0; });
// write the result
result = sum;
// wake up the consuming thread
cv.notify_one();
// exit the loop, if flag is set. This must be
// done with mutex protection. Hence this is not
// in the for-condition expression.
if ( done )
break;
}
} };
// the main threads loop
for ( auto i = 0; i != 20; ++i )
{
auto r = 0;
{
// lock the mutex
auto lock = unique_lock<mutex>( m );
// wait until we can safely read the result
cv.wait( lock, [&]{ return result != 0; } );
// read the result
r = result;
// set result to zero so the worker can
// continue to produce new results.
result = 0;
// wake up the producer
cv.notify_one();
// the lock is released here (the end of the scope)
}
// do time consuming io at the side.
cout << r << endl;
}
// tell the worker to stop
{
auto lock = unique_lock<mutex>( m );
result = 0;
done = true;
// again the lock is released here
}
// wake the worker in case it is blocked waiting for result == 0
cv.notify_one();
// wait for the worker to finish.
t.join();
cout << "Finished." << endl;
}
You could do the same with std::atomics by essentially implementing spin locks. Spin locks can be slower than mutexes. So I repeat the advice on the Boost website:
Do not use spinlocks unless you are certain that you understand the consequences.
I believe that mutexes and condition variables are the way to go in your case.
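For the start/stop flag alone, the question's run_ could be a std::atomic<bool> instead of a volatile bool. Unlike volatile, an atomic makes concurrent reads and writes of the flag well defined, and join() synchronizes with the completion of the worker, so the main thread can read what the worker produced after stop() returns. The following is only a sketch under those assumptions: the worker class, the iterations_ counter standing in for the real work, and the run_worker_once() driver are hypothetical, and start() and stop() are assumed to be called from the main thread only, as in the question.

```cpp
#include <atomic>
#include <thread>

// Hypothetical wrapper around the start()/stop() pair from the question.
class worker {
    std::atomic<bool> run_{false};
    std::atomic<long> iterations_{0};   // stand-in for the real work
    std::thread thread_;

    void loop() {
        while (run_.load()) {
            ++iterations_;               // "do lots of work"
            std::this_thread::yield();
        }
    }
public:
    void start() {
        if (run_.exchange(true)) return;      // already running
        thread_ = std::thread(&worker::loop, this);
    }
    void stop() {
        if (!run_.exchange(false)) return;    // not running
        thread_.join();                       // worker has fully finished after this
    }
    long iterations() const { return iterations_.load(); }
    ~worker() { stop(); }
};

// Hypothetical driver: start, wait until some work was done, stop; returns the count.
long run_worker_once() {
    worker w;
    w.start();
    while (w.iterations() == 0) std::this_thread::yield();
    w.stop();
    return w.iterations();
}
```

This covers only the stop flag. If the main thread needs to read intermediate results while the worker keeps running, the mutex and condition variable approach shown above is still the safer choice.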