Why does std::counting_semaphore::acquire() suffer deadlock in this case? - c++

I am testing std::counting_semaphore on C++20 with Windows 10 and MinGW x64.
As I learned from https://en.cppreference.com/w/cpp/thread/counting_semaphore, std::counting_semaphore holds an atomic counter. release() increments the counter and acquire() decrements it. If the counter is 0, acquire() blocks the calling thread until the counter becomes positive.
I built the following simplified example to show my problem.
If each thread always calls release() before acquire(), the internal counter of std::counting_semaphore should never drop below its initial value v (each thread raises it by one and then lowers it by one), so this code should never block.
When I run this example code, it deadlocks very often, but sometimes it finishes correctly.
I tried printing messages with std::cout to understand the deadlock, but the deadlock disappeared when I used std::cout. Likewise, the deadlock disappeared when I used std::unique_lock.
The example is as follows:
#include <iostream>
#include <thread>
#include <atomic>
#include <vector>
#include <mutex>
#include <semaphore>
using namespace std::literals;
std::mutex mtx;
const int numOfThr {2};
const int numOfForLoop {1000};
const int max_smph {numOfThr* numOfForLoop *2};
std::counting_semaphore<max_smph> smph {numOfThr+1};
void thrf_TestSmph ( const int iThr )
{
for ( int i = 0; i < numOfForLoop; ++i )
{
// std::unique_lock ul(mtx);
//unique_lock can stop deadlock.
smph.release(); //smph counter ++
smph.acquire(); //smph counter --
// if ( i % 1000 == 1 ) std::cout << iThr << " : " << i << "\n";
//print out message can stop deadlock.
}
}
int main()
{
std::cout << "Start testing semaphore ..." << "\n\n";
std::vector<std::thread> thrf_TestSmphVec ( numOfThr );
for ( int iThr = 0; iThr < numOfThr; ++iThr )
{
thrf_TestSmphVec[iThr] = std::thread ( thrf_TestSmph, iThr );
}
for ( auto& thr : thrf_TestSmphVec )
{
if ( thr.joinable() )
thr.join();
}
std::cout << "Test is done." << "\n";
return 0;
}

Update: Found this bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104928
This is not really an answer.
I can reproduce the infinite blocking on my M1 MacBook Air when the program is compiled with gcc or clang against libstdc++. Printing messages didn't prevent the blocking. When it is compiled with clang and libc++, the program finishes normally.
I noticed this piece of code and comment in my included header include/c++/11/bits/semaphore_base.h of libstdc++:
_GLIBCXX_ALWAYS_INLINE void
_M_release(ptrdiff_t __update) noexcept
{
if (0 < __atomic_impl::fetch_add(&_M_counter, __update, memory_order_release))
return;
if (__update > 1)
__atomic_notify_address_bare(&_M_counter, true);
else
__atomic_notify_address_bare(&_M_counter, true);
// FIXME - Figure out why this does not wake a waiting thread
// __atomic_notify_address_bare(&_M_counter, false);
}
Then I changed the first return to __atomic_notify_address_bare(&_M_counter, true);, and the problem seems to disappear.
That comment was introduced in this commit:
_GLIBCXX_ALWAYS_INLINE void
_M_release(ptrdiff_t __update) noexcept
{
if (0 < __atomic_impl::fetch_add(&_M_counter, __update, memory_order_release))
return;
if (__update > 1)
__atomic_notify_address_bare(&_M_counter, true);
else
- __atomic_notify_address_bare(&_M_counter, false);
+ __atomic_notify_address_bare(&_M_counter, true);
+ // FIXME - Figure out why this does not wake a waiting thread
+ // __atomic_notify_address_bare(&_M_counter, false);
It seems that the developers are aware of the problem, but their short-term fix did not resolve it.

After a lot of experimentation with std::counting_semaphore::acquire(), I noticed that it blocks when two threads call std::counting_semaphore::acquire() within a very short time interval. This seems to freeze the internal counter of the std::counting_semaphore, so std::counting_semaphore::release() cannot increase it correctly. In this situation, the next std::counting_semaphore::acquire() blocks, because the internal counter is frozen. This happens in many of my heavily threaded experiments with std::counting_semaphore::acquire() on my system. The example code in my question is the most simplified one that reproduces the problem.
I guess it is some kind of collision issue on my system. Based on this assumption, I tried to use back-off to bypass the problem.
I use while(!std::counting_semaphore::try_acquire_for(1ns)){} in place of std::counting_semaphore::acquire(), because std::counting_semaphore::try_acquire_for() returns false when it cannot decrease the internal counter.
It works well so far, even when I increase const int numOfThr {2} to {100'000}.
The example code is as follows:
#include <iostream>
#include <thread>
#include <atomic>
#include <vector>
#include <mutex>
#include <semaphore>
using namespace std::literals;
std::mutex mtx;
const int numOfThr {2};
const int numOfForLoop {1000};
const int max_smph {numOfThr* numOfForLoop * 2};
std::counting_semaphore<max_smph> smph {numOfThr + 1};
void thrf_TestSmph ( const int iThr )
{
for ( int i = 0; i < numOfForLoop; ++i )
{
smph.release(); //smph counter ++
while ( !smph.try_acquire_for ( 1ns ) ) {} //smph counter --
//don't use smph.acquire() directly, it easily makes blocking.
}
}
int main()
{
std::cout << "Start testing semaphore ..." << "\n\n";
std::vector<std::thread> thrf_TestSmphVec ( numOfThr );
for ( int iThr = 0; iThr < numOfThr; ++iThr )
{
thrf_TestSmphVec[iThr] = std::thread ( thrf_TestSmph, iThr );
}
for ( auto& thr : thrf_TestSmphVec )
{
if ( thr.joinable() )
thr.join();
}
std::cout << "Test is done." << "\n";
return 0;
}
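As an alternative to spinning on try_acquire_for(), the same test can be run against a semaphore built only from std::mutex and std::condition_variable, which avoids the libstdc++ notify path discussed above. This is a minimal sketch, not a drop-in replacement for std::counting_semaphore: the names CvSemaphore and run_semaphore_test are made up for illustration, and it will be slower than a lock-free implementation.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for std::counting_semaphore (illustration only).
class CvSemaphore {
public:
    explicit CvSemaphore(std::ptrdiff_t initial) : count_(initial) {
        assert(initial >= 0);
    }

    // Increase the counter and wake waiters so they can recheck it.
    void release(std::ptrdiff_t update = 1) {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            count_ += update;
        }
        cv_.notify_all();
    }

    // Block until the counter is positive, then decrease it by one.
    void acquire() {
        std::unique_lock<std::mutex> lock(mtx_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    std::ptrdiff_t value() {
        std::lock_guard<std::mutex> lock(mtx_);
        return count_;
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    std::ptrdiff_t count_;
};

// Runs the release()/acquire() loop from the question on `threads` threads
// and returns the final counter value.
std::ptrdiff_t run_semaphore_test(int threads, int iterations) {
    CvSemaphore sem(threads + 1);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&sem, iterations] {
            for (int i = 0; i < iterations; ++i) {
                sem.release();  // counter ++
                sem.acquire();  // counter --
            }
        });
    }
    for (auto& th : pool) th.join();
    return sem.value();
}
```

Because every release() is paired with one acquire(), the counter should end at its initial value, numOfThr + 1 in the question's terms.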

Related

How to use `std::async` to call a function in a mutex protected loop?

vector<int> vecCustomers;
// populate vecCustomers
void funA()
{
std::lock_guard<std::mutex> guard( _mutex ); // need lock here
for(int i=0; i<vecCustomers.size(); ++i)
{
funB( vecCustomers[i] ); // can I run this asynchronously
}
}
void funB(int i)
{
// do something here
}
Question> funA accesses critical resources and it uses the lock to protect the resources. funB doesn't use any critical resources and it doesn't need mutex. Is there a way that I can make use of std::async so that I can call funB and immediately return to prepare the calling next funB inside the loop? Also, before the return of the function, all tasks of funB must finish.
Thank you
== Update ==
I wrote the following code based on the suggestion. Now the new issue is: why are all threads blocked by the first thread?
The output is always as follows:
From[0]:H0 << why this thread blocks all others?
From[1]:H1
From[2]:H2
From[3]:H3
From[4]:H4
#include <vector>
#include <future>
#include <mutex>
#include <string>
#include <iostream>
#include <chrono>
using namespace std;
struct ClassA
{
ClassA()
{
vecStr.push_back( "H0" );
vecStr.push_back( "H1" );
vecStr.push_back( "H2" );
vecStr.push_back( "H3" );
vecStr.push_back( "H4" );
}
void start()
{
for ( int i = 0; i < 5; ++i )
{
std::unique_lock<std::mutex> guard( _mutex );
std::string strCopy = vecStr[i];
guard.unlock();
std::async( std::launch::async, &ClassA::PrintString, this, i, strCopy );
//PrintString( i, vecStr[i] );
guard.lock();
}
}
void PrintString( int i, const string& str) const
{
if ( i == 0 )
std::this_thread::sleep_for( std::chrono::seconds( 10 ) );
cout << "From[" << i << "]:" << str << endl;
}
mutex _mutex;
vector<string> vecStr;
};
int main()
{
ClassA ca;
ca.start();
return 0;
}
===Update 2===
#include <vector>
#include <future>
#include <mutex>
#include <string>
#include <iostream>
#include <chrono>
using namespace std;
struct ClassA
{
ClassA()
{
vecStr.push_back( "H0" );
vecStr.push_back( "H1" );
vecStr.push_back( "H2" );
vecStr.push_back( "H3" );
vecStr.push_back( "H4" );
}
void start()
{
std::vector<std::future<void>> result;
for ( int i = 0; i < 5; ++i )
{
std::unique_lock<std::mutex> guard( _mutex );
std::string strCopy = vecStr[i];
guard.unlock();
result.push_back( std::async( std::launch::async, &ClassA::PrintString, this, i, strCopy ) );
//PrintString( i, vecStr[i] );
}
for(auto &e : result)
{
e.get();
}
}
void PrintString( int i, const string& str) const
{
static std::mutex m;
std::unique_lock<std::mutex> _(m);
if ( i == 0 )
{
cout << "From[" << i << "]:" << str << " sleep for a while" << endl;
_.unlock();
std::this_thread::sleep_for( std::chrono::seconds( 10 ) );
}
else
cout << "From[" << i << "]:" << str << endl;
}
mutex _mutex;
vector<string> vecStr;
};
int main()
{
ClassA ca;
ca.start();
return 0;
}
The primary reason you see the calls executed in order is that you aren't taking advantage of parallelism in any way (wait, what? but...). Let me explain.
std::async doesn't just launch a task to be run asynchronously, it also returns a std::future which can be used to get the returned value (should the launched function return something). However, because you do not store the future, it is destroyed immediately after the task is launched. And unfortunately for you, in this case, the destructor blocks until the call is completed.
[std::future::~future()] may block if all of the following are true: the shared state was created by a call to std::async, the shared state is not yet ready, and this was the last reference to the shared state.
Many people have expressed frustration about this, but that is how the standard specifies it.
So what you'll have to do is store the std::futures (in a vector or something) until all are launched.
You can certainly call std::async while keeping a mutex lock. The downside is that calling std::async takes some time thereby increasing the time you keep that mutex locked, and hence decreasing parallelism.
It may be cheaper to make a copy of that vector while holding the lock. Then release the mutex and process the copy asynchronously.
Your optimization objective is to minimize the time the mutex is locked and probably time spent in the function.
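The copy-then-process idea can be sketched roughly as follows. This is only an illustration under assumptions: fun_b is a hypothetical stand-in for funB (here it just doubles its argument), and process_customers is a made-up name. The mutex is held only for the copy, the futures are kept in a vector so their destructors do not serialize the tasks, and get() is called on each before returning so all tasks have finished.

```cpp
#include <future>
#include <mutex>
#include <vector>

// Hypothetical stand-in for funB: independent work on one element.
int fun_b(int customer) { return customer * 2; }

std::vector<int> process_customers(const std::vector<int>& customers,
                                   std::mutex& m) {
    // Hold the lock only long enough to take a snapshot of the shared data.
    std::vector<int> snapshot;
    {
        std::lock_guard<std::mutex> guard(m);
        snapshot = customers;
    }

    // Launch one task per element and keep every future alive.
    std::vector<std::future<int>> pending;
    pending.reserve(snapshot.size());
    for (int c : snapshot)
        pending.push_back(std::async(std::launch::async, fun_b, c));

    // Wait for all tasks before returning.
    std::vector<int> results;
    results.reserve(pending.size());
    for (auto& f : pending)
        results.push_back(f.get());
    return results;
}
```

Launching one task per element is fine for a handful of items; for a large vector it would probably be better to split the snapshot into a few chunks and launch one task per chunk.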

Stop infinite looping thread from main

I am relatively new to threads, and I'm still learning best techniques and the C++11 thread library. Right now I'm in the middle of implementing a worker thread which infinitely loops, performing some work. Ideally, the main thread would want to stop the loop from time to time to sync with the information that the worker thread is producing, and then start it again. My idea initially was this:
// Code run by worker thread
void thread() {
while(run_) {
// Do lots of work
}
}
// Code run by main thread
void start() {
if ( run_ ) return;
run_ = true;
// Start thread
}
void stop() {
if ( !run_ ) return;
run_ = false;
// Join thread
}
// Somewhere else
volatile bool run_ = false;
I was not completely sure about this so I started researching, and I discovered that volatile is actually not required for synchronization and is in fact generally harmful. Also, I discovered this answer, which describes a process nearly identical to the one I thought about. In the answer's comments however, this solution is described as broken, as volatile does not guarantee that different processor cores readily (if ever) communicate changes on the volatile values.
My question is this then: Should I use an atomic flag, or something else entirely? What exactly is the property that is lacking in volatile and that is then provided by whatever construct is needed to solve my problem effectively?
Have you looked at mutexes? They are made to lock threads out of shared data to avoid conflicts. Is that what you're looking for?
I think you want to use barrier synchronization using std::mutex?
Also take a look at boost thread, for a relatively high level threading library
Take a look at this code sample from the link:
#include <iostream>
#include <map>
#include <string>
#include <chrono>
#include <thread>
#include <mutex>
std::map<std::string, std::string> g_pages;
std::mutex g_pages_mutex;
void save_page(const std::string &url)
{
// simulate a long page fetch
std::this_thread::sleep_for(std::chrono::seconds(2));
std::string result = "fake content";
g_pages_mutex.lock();
g_pages[url] = result;
g_pages_mutex.unlock();
}
int main()
{
std::thread t1(save_page, "http://foo");
std::thread t2(save_page, "http://bar");
t1.join();
t2.join();
g_pages_mutex.lock(); // not necessary as the threads are joined, but good style
for (const auto &pair : g_pages) {
std::cout << pair.first << " => " << pair.second << '\n';
}
g_pages_mutex.unlock();
}
I would suggest to use std::mutex and std::condition_variable to solve the problem. Here's an example how it can work with C++11:
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>
using namespace std;
int main()
{
mutex m;
condition_variable cv;
// Tells, if the worker should stop its work
bool done = false;
// Zero means, it can be filled by the worker thread.
// Non-zero means, it can be consumed by the main thread.
int result = 0;
// run worker thread
auto t = thread{ [&]{
auto bound = 1000;
for (;;) // ever
{
auto sum = 0;
for ( auto i = 0; i != bound; ++i )
sum += i;
++bound;
auto lock = unique_lock<mutex>( m );
// wait until we can safely write the result
cv.wait( lock, [&]{ return result == 0; });
// write the result
result = sum;
// wake up the consuming thread
cv.notify_one();
// exit the loop, if flag is set. This must be
// done with mutex protection. Hence this is not
// in the for-condition expression.
if ( done )
break;
}
} };
// the main threads loop
for ( auto i = 0; i != 20; ++i )
{
auto r = 0;
{
// lock the mutex
auto lock = unique_lock<mutex>( m );
// wait until we can safely read the result
cv.wait( lock, [&]{ return result != 0; } );
// read the result
r = result;
// set result to zero so the worker can
// continue to produce new results.
result = 0;
// wake up the producer
cv.notify_one();
// the lock is released here (the end of the scope)
}
// do time consuming io at the side.
cout << r << endl;
}
// tell the worker to stop
{
auto lock = unique_lock<mutex>( m );
result = 0;
done = true;
// again the lock is released here
}
// wait for the worker to finish.
t.join();
cout << "Finished." << endl;
}
You could do the same with std::atomics by essentially implementing spin locks. Spin locks can be slower than mutexes. So I repeat the advice on the boost website:
Do not use spinlocks unless you are certain that you understand the consequences.
I believe that mutexes and condition variables are the way to go in your case.
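For the narrower question of just starting and stopping the loop, a std::atomic<bool> flag is enough. What volatile lacks is a guarantee that concurrent reads and writes are free of data races and ordered between threads; std::atomic provides both. The sketch below is one possible shape, assuming start() and stop() are only ever called from a single controlling thread; the Worker class and its iteration counter are made up for illustration and stand in for the real work.

```cpp
#include <atomic>
#include <thread>

// Hypothetical worker whose loop is controlled by an atomic flag.
class Worker {
public:
    ~Worker() { stop(); }

    void start() {
        if (run_.exchange(true)) return;  // already running
        thread_ = std::thread([this] {
            while (run_.load(std::memory_order_acquire)) {
                // Stand-in for "do lots of work".
                iterations_.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }

    void stop() {
        if (!run_.exchange(false)) return;  // not running
        thread_.join();                     // loop has exited after this
    }

    bool running() const { return run_.load(); }
    long iterations() const { return iterations_.load(); }

private:
    std::atomic<bool> run_{false};
    std::atomic<long> iterations_{0};
    std::thread thread_;
};
```

After stop() returns, the worker thread has been joined, so the main thread can read whatever the worker produced without further locking. Exchanging data while the worker keeps running still needs a mutex and condition variable as shown above.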

Why is this piece of C++ code not synchronized

I am learning to write multithreaded applications. So far I run into trouble any time I want my threads to access even the simplest shared resources, despite using a mutex.
For example, consider this code:
using namespace std;
mutex mu;
std::vector<string> ob;
void addSomeAValues(){
mu.lock();
for(int a=0; a<10; a++){
ob.push_back("A" + std::to_string(a));
usleep(300);
}
mu.unlock();
}
void addSomeBValues(){
mu.lock();
for(int b=0; b<10; b++){
ob.push_back("B" + std::to_string(b));
usleep(300);
}
mu.unlock();
}
int main() {
std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
thread t0(addSomeAValues);
thread t1(addSomeBValues);
std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
t0.join();
t1.join();
//Display the results
cout << "Code Run Complete; results: \n";
for(auto k : ob){
cout << k <<endl;
}
//Code running complete, report the time it took
typedef std::chrono::duration<int,std::milli> millisecs_t;
millisecs_t duration(std::chrono::duration_cast<millisecs_t>(end-start));
std::cout << duration.count() << " milliseconds.\n";
return 0;
}
When I run the program, it behaves unpredictably. Sometimes the values A0-9 and B0-9 are printed to the console without a problem, sometimes there is a segmentation fault with a crash report, and sometimes only A0-3 and B0-5 are printed.
If I am missing a core synchronization issue, please help.
Edit: after a lot of useful feedback I changed the code to
#include <iostream>
#include <string>
#include <vector>
#include <mutex>
#include <unistd.h>
#include <thread>
#include <chrono>
using namespace std;
mutex mu;
std::vector<string> ob;
void addSomeAValues(){
for(int a=0; a<10; a++){
mu.lock();
ob.push_back("A" + std::to_string(a));
mu.unlock();
usleep(300);
}
}
void addSomeBValues(){
for(int b=0; b<10; b++){
mu.lock();
ob.push_back("B" + std::to_string(b));
mu.unlock();
usleep(300);
}
}
int main() {
std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now() ;
thread t0(addSomeAValues);
thread t1(addSomeBValues);
std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now() ;
t0.join();
t1.join();
//Display the results
cout << "Code Run Complete; results: \n";
for(auto k : ob){
cout << k <<endl;
}
//Code running complete, report the time it took
typedef std::chrono::duration<int,std::milli> millisecs_t ;
millisecs_t duration( std::chrono::duration_cast<millisecs_t>(end-start) ) ;
std::cout << duration.count() << " milliseconds.\n" ;
return 0;
}
however I get the following output sometimes:
*** Error in `/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment':
double free or corruption (fasttop): 0x00007f19fc000920 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x80a46)[0x7f1a0687da46]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402dd4]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402930]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402a8d]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402637
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x402278]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x4019cf]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x4041e3]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x404133]
/home/soliduscode/eclipse_workspace/CppExperiment/Debug/CppExperiment[0x404088]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb29f0)[0x7f1a06e8d9f0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7f8e)[0x7f1a060c6f8e]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f1a068f6e1d]
Update & Solution
With the problem I was experiencing (namely, unpredictable execution of the program with intermittent corruption dumps), all was solved by including -lpthread as part of my Eclipse build (under project settings).
I am using C++11. It's odd, at least to me, that the program would compile without issuing a complaint that I have not yet linked against pthread.
So to anyone using C++11, std::thread, and Linux: make sure you link against pthread, otherwise your program's runtime behavior will be VERY unpredictable and buggy.
If you're going to use threads, I'd advise doing the job at least a little differently.
Right now, one thread gets the mutex, does all it's going to do (including sleeping for 3000 microseconds), then quits. Then the other thread does essentially the same thing. This being the case, threads have accomplished essentially nothing positive and a fair amount of negative (synchronization code and such).
Your current code is also unsafe with respect to exceptions -- if an exception were to be thrown inside one of your thread functions, the mutex wouldn't be unlocked, even though that thread could no longer execute.
Finally, right now, you're exposing a mutex, and leaving it to all code that accesses the associated resource to use the mutex correctly. I'd prefer to centralize the mutex locking so it's exception safe, and most of the code can ignore it completely.
// use std::lock_guard, if available.
class lock {
mutex &m;
public:
lock(mutex &m) : m(m) { m.lock(); }
~lock() { m.unlock(); }
};
class synched_vec {
mutex m;
std::vector<string> data;
public:
void push_back(std::string const &s) {
lock l(m);
data.push_back(s);
}
} ob;
void addSomeAValues(){
for(int a=0; a<10; a++){
ob.push_back("A" + std::to_string(a));
usleep(300);
}
}
This also means that if (for example) you decide to use a lock-free (or minimal locking) structure in the future, you should only have to modify the synched_vec, not all the rest of the code that uses it. Likewise, by keeping all the mutex handling in one place, it's much easier to get the code right, and if you do find a bug, much easier to ensure you've fixed it (rather than looking through all the client code).
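With C++11 available, the hand-written lock class can be replaced by std::lock_guard, as the comment in the code suggests. The following is a sketch of the same idea under that assumption; SynchedVec, size(), and fill_from_two_threads are illustrative names added here and are not part of the code above.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Vector wrapper that keeps all mutex handling in one place.
class SynchedVec {
public:
    void push_back(const std::string& s) {
        std::lock_guard<std::mutex> guard(m_);  // unlocked on scope exit,
        data_.push_back(s);                     // even if push_back throws
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(m_);
        return data_.size();
    }

private:
    mutable std::mutex m_;  // mutable so const members can lock it
    std::vector<std::string> data_;
};

// Two threads append ten values each; returns the final element count.
std::size_t fill_from_two_threads(SynchedVec& v) {
    auto add = [&v](const char* prefix) {
        for (int i = 0; i < 10; ++i)
            v.push_back(prefix + std::to_string(i));
    };
    std::thread t0(add, "A");
    std::thread t1(add, "B");
    t0.join();
    t1.join();
    return v.size();
}
```

The client code never touches the mutex, so the two threads can interleave freely while each individual push_back stays protected.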
The code in the question runs without any segmentation faults for me (after adding headers and replacing the sleep with one available on my system).
There are two problems with the code, though, that could cause unexpected results:
Each thread holds the mutex for its entire execution. This prevents the other thread from running, so the two threads do not run in parallel! In your case, you should lock only while you are accessing the vector.
Your end time point is taken right after creating the threads, not after they have finished executing. Both threads are done only once both have been joined.
Working, compilable code with headers, a chrono sleep, and the two errors fixed:
#include <chrono>
#include <mutex>
#include <string>
#include <vector>
#include <thread>
#include <iostream>
std::mutex mu;
std::vector<std::string> ob;
void addSomeAValues(){
for(int a=0; a<10; a++){
mu.lock();
ob.push_back("A" + std::to_string(a));
mu.unlock();
std::this_thread::sleep_for(std::chrono::milliseconds(300));
}
}
void addSomeBValues(){
for(int b=0; b<10; b++){
mu.lock();
ob.push_back("B" + std::to_string(b));
mu.unlock();
std::this_thread::sleep_for(std::chrono::milliseconds(300));
}
}
int main() {
std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
std::thread t0(addSomeAValues);
std::thread t1(addSomeBValues);
t0.join();
t1.join();
std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
//Display the results
std::cout << "Code Run Complete; results: \n";
for(auto k : ob){
std::cout << k << std::endl;
}
//Code running complete, report the time it took
typedef std::chrono::duration<int,std::milli> millisecs_t;
millisecs_t duration(std::chrono::duration_cast<millisecs_t>(end-start));
std::cout << duration.count() << " milliseconds.\n";
return 0;
}

Using a C++11 condition variable in VS2012

I can't get code working reliably in a simple VS2012 console application consisting of a producer and consumer that uses a C++11 condition variable. I am aiming at producing a small reliable program (to use as the basis for a more complex program) that uses the 3 argument wait_for method or perhaps the wait_until method from code I have gathered at these websites:
condition_variable: wait_for, wait_until
I'd like to use the 3 argument wait_for with a predicate like below except it will need to use a class member variable to be most useful to me later. I am receiving "Access violation writing location 0x__" or "An invalid parameter was passed to a service or function" as errors after only about a minute of running.
Would steady_clock and the 2 argument wait_until be sufficient to replace the 3 argument wait_for? I've also tried this without success.
Can someone show how to get the code below to run indefinitely with no bugs or weird behavior with either changes in wall-clock time from daylight savings time or Internet time synchronizations?
A link to reliable sample code could be just as helpful.
// ConditionVariable.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <condition_variable>
#include <mutex>
#include <thread>
#include <iostream>
#include <queue>
#include <chrono>
#include <atomic>
#define TEST1
std::atomic<int>
//int
qcount = 0; //= ATOMIC_VAR_INIT(0);
int _tmain(int argc, _TCHAR* argv[])
{
std::queue<int> produced_nums;
std::mutex m;
std::condition_variable cond_var;
bool notified = false;
unsigned int count = 0;
std::thread producer([&]() {
int i = 0;
while (1) {
std::this_thread::sleep_for(std::chrono::microseconds(1500));
std::unique_lock<std::mutex> lock(m);
produced_nums.push(i);
notified = true;
qcount = produced_nums.size();
cond_var.notify_one();
i++;
}
cond_var.notify_one();
});
std::thread consumer([&]() {
std::unique_lock<std::mutex> lock(m);
while (1) {
#ifdef TEST1
// Version 1
if (cond_var.wait_for(
lock,
std::chrono::microseconds(1000),
[&]()->bool { return qcount != 0; }))
{
if ((count++ % 1000) == 0)
std::cout << "consuming " << produced_nums.front () << '\n';
produced_nums.pop();
qcount = produced_nums.size();
notified = false;
}
#else
// Version 2
std::chrono::steady_clock::time_point timeout1 =
std::chrono::steady_clock::now() +
//std::chrono::system_clock::now() +
std::chrono::milliseconds(1);
while (qcount == 0)//(!notified)
{
if (cond_var.wait_until(lock, timeout1) == std::cv_status::timeout)
break;
}
if (qcount > 0)
{
if ((count++ % 1000) == 0)
std::cout << "consuming " << produced_nums.front() << '\n';
produced_nums.pop();
qcount = produced_nums.size();
notified = false;
}
#endif
}
});
while (1);
return 0;
}
Visual Studio Desktop Express had 1 important update which it installed and Windows Update has no other important updates. I'm using Windows 7 32-bit.
Sadly, this is actually a bug in VS2012's implementation of condition_variable, and the fix will not be patched in. You'll have to upgrade to VS2013 when it's released.
See:
http://connect.microsoft.com/VisualStudio/feedback/details/762560
First of all, while using condition_variables I personally prefer some wrapper classes like AutoResetEvent from C#:
struct AutoResetEvent
{
typedef std::unique_lock<std::mutex> Lock;
AutoResetEvent(bool state = false) :
state(state)
{ }
void Set()
{
auto lock = AcquireLock();
state = true;
variable.notify_one();
}
void Reset()
{
auto lock = AcquireLock();
state = false;
}
void Wait(Lock& lock)
{
variable.wait(lock, [this] () { return this->state; });
state = false;
}
void Wait()
{
auto lock = AcquireLock();
Wait(lock);
}
Lock AcquireLock()
{
return Lock(mutex);
}
private:
bool state;
std::condition_variable variable;
std::mutex mutex;
};
This may not be the same behavior as C# type or may not be as efficient as it should be but it gets things done for me.
Second, when I need to implement a producer/consumer idiom I try to use a concurrent queue implementation (e.g. the tbb queue) or write one myself. You should also consider doing things properly by using the Active Object pattern. But for a simple solution we can use this:
template<typename T>
struct ProductionQueue
{
ProductionQueue()
{ }
void Enqueue(const T& value)
{
{
auto lock = event.AcquireLock();
q.push(value);
}
event.Set();
}
std::size_t GetCount()
{
auto lock = event.AcquireLock();
return q.size();
}
T Dequeue()
{
auto lock = event.AcquireLock();
event.Wait(lock);
T value = q.front();
q.pop();
return value;
}
private:
AutoResetEvent event;
std::queue<T> q;
};
This class has some exception safety issues and misses const-ness on the methods but like I said, for a simple solution this should fit.
So as a result your modified code looks like this:
int main(int argc, char* argv[])
{
ProductionQueue<int> produced_nums;
unsigned int count = 0;
std::thread producer([&]() {
int i = 0;
while (1) {
std::this_thread::sleep_for(std::chrono::microseconds(1500));
produced_nums.Enqueue(i);
qcount = produced_nums.GetCount();
i++;
}
});
std::thread consumer([&]() {
while (1) {
int item = produced_nums.Dequeue();
{
if ((count++ % 1000) == 0)
std::cout << "consuming " << item << '\n';
qcount = produced_nums.GetCount();
}
}
});
producer.join();
consumer.join();
return 0;
}
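Since the question specifically asks for the three-argument wait_for with a predicate that reads a class member, here is a small sketch of that form. The C++ standard describes wait_for as measuring its timeout against std::chrono::steady_clock, which should make it insensitive to wall-clock adjustments; whether a given implementation honors that is a separate matter, and this sketch does not work around the VS2012 bug mentioned above. TimedQueue and try_pop_for are names invented for this example.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical queue whose consumer waits with a timeout.
template <typename T>
class TimedQueue {
public:
    void push(const T& value) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(value);
        }
        cv_.notify_one();
    }

    // Waits up to `timeout` for an item. Returns false if none arrived.
    template <typename Rep, typename Period>
    bool try_pop_for(T& out,
                     const std::chrono::duration<Rep, Period>& timeout) {
        std::unique_lock<std::mutex> lock(m_);
        // The predicate reads a member; it is always evaluated under the lock.
        if (!cv_.wait_for(lock, timeout, [this] { return !q_.empty(); }))
            return false;
        out = q_.front();
        q_.pop();
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<T> q_;
};
```

The queue itself is the only shared state, so no separate atomic counter or notified flag is needed; the predicate makes spurious wakeups harmless.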

how to use boost atomic to remove race condition?

I am trying to use boost::atomic to do multithreading synchronization on linux.
But, the result is not consistent.
Any help will be appreciated.
thanks
#include <boost/bind.hpp>
#include <boost/threadpool.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread.hpp>
#include <boost/atomic.hpp>
boost::atomic<int> g(0) ;
void f()
{
g.fetch_add(1, boost::memory_order_relaxed);
return ;
}
const int threadnum = 10;
int main()
{
boost::threadpool::fifo_pool tp(threadnum);
for (int i = 0 ; i < threadnum ; ++i)
tp.schedule(boost::bind(f));
tp.wait();
std::cout << g << std::endl ;
return 0 ;
}
I'm not familiar with the boost thread library specifically, or boost::threadpool, but it looks to me like the threads have not necessarily completed when you access the value of g, so you will get some value between zero and 10.
Here's your program, modified to use the standard library, with joins inserted so that the fetch adds happen before the output of g.
#include <atomic>
#include <iostream>
#include <thread>
#include <vector>
std::atomic<int> g(0);
void f() {
g.fetch_add(1, std::memory_order_relaxed);
}
int main() {
const int threadnum = 10;
std::vector<std::thread> v;
for (int i = 0 ; i < threadnum ; ++i)
v.push_back(std::thread(f));
for (auto &th : v)
th.join();
std::cout << g << '\n';
}
edit:
If your program still isn't consistent even with the added tp.wait() then that is puzzling. The adds should happen before the threads end, and I would think that the threads ending would synchronize with the tp.wait(), which happens before the read. So all the adds should happen before g is printed, even though you use memory_order_relaxed, so the printed value should be 10.
Here are some examples that might help:
http://www.chaoticmind.net/~hcb/projects/boost.atomic/doc/atomic/usage_examples.html
Basically, you're trying to "protect" a "critical region" with a "lock".
You can set or unset a semaphore.
Or you can "exchange" a boost "atomic" variable. For example (from the above link):
class spinlock {
private:
typedef enum {Locked, Unlocked} LockState;
boost::atomic<LockState> state_;
public:
spinlock() : state_(Unlocked) {}
void lock()
{
while (state_.exchange(Locked, boost::memory_order_acquire) == Locked) {
/* busy-wait */
}
}
void unlock()
{
state_.store(Unlocked, boost::memory_order_release);
}
};
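The same spinlock can be written with the standard library alone. The sketch below uses std::atomic_flag instead of boost::atomic and pairs it with std::lock_guard so the lock is always released; SpinLock and count_with_spinlock are illustrative names, and the yield inside the loop is an added courtesy to the scheduler, not part of the linked example.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Spinlock on std::atomic_flag; usable with std::lock_guard.
class SpinLock {
public:
    void lock() {
        // test_and_set returns the previous value: true means still locked.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }

    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Each thread increments a plain int under the spinlock.
int count_with_spinlock(int threads, int increments) {
    SpinLock spin;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < increments; ++i) {
                std::lock_guard<SpinLock> guard(spin);
                ++counter;
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

For a single counter like g in the question, fetch_add on an atomic is simpler and cheaper; a lock only becomes necessary when several variables have to change together.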