I have recently been testing std::condition_variable, and after my test it seems quite different from pthread_cond_t. I would like to know whether something in my test is wrong, or whether std::condition_variable really is quite different from pthread_cond_t.
The pthread_cond_t source is the following, compiled with gcc 4.4.6:
pthread_cond_t condA = PTHREAD_COND_INITIALIZER;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int ProcessRow = 0 ;
#define LOOPCNT 10
void *producer()
{
int idx ;
for(idx=0;idx<LOOPCNT;idx++)
{
//pthread_mutex_lock(&mutex);
__sync_add_and_fetch(&ProcessRow,1) ;
pthread_cond_signal(&condA);
printf("sending signal...(%d)\n",ProcessRow) ;
//pthread_mutex_unlock(&mutex);
}
printf("I am out ... \n") ;
}
void *consumer()
{
int icnt = 0 ;
while(1)
{
pthread_mutex_lock(&mutex);
while (ProcessRow <= 0)
pthread_cond_wait(&condA, &mutex);
pthread_mutex_unlock(&mutex); // I originally forgot this unlock, which made the test fail
__sync_sub_and_fetch(&ProcessRow,1) ;
++icnt ;
printf("receving=(%d)\n",ProcessRow) ;
usleep(10000) ;
}
printf("(%d)\n",ProcessRow) ;
}
The output :
sending signal...(1)
sending signal...(2)
sending signal...(3)
sending signal...(4)
sending signal...(5)
sending signal...(6)
sending signal...(7)
sending signal...(8)
sending signal...(9)
sending signal...(10)
I am out ...
receving=(9)
It looks like the consumer thread blocks in pthread_cond_wait, so "receving" is printed only once.
The following test is for std::condition_variable.
The following binsem.hpp comes from
https://gist.github.com/yohhoy/2156481
with a small modification, compiled with g++ 4.8.1:
class binsem {
public:
explicit binsem(int init_count = count_max)
: count_(init_count) {}
// P-operation / acquire
void wait()
{
std::unique_lock<std::mutex> lk(m_);
cv_.wait(lk, [this]{ return 0 < count_; });
--count_;
}
bool try_wait()
{
std::lock_guard<std::mutex> lk(m_);
if (0 < count_)
{
--count_;
return true;
} else
{
return false;
}
}
// V-operation / release
void signal()
{
std::lock_guard<std::mutex> lk(m_);
//if (count_ < count_max) // I commented this out
//{ // I commented this out
++count_;
cv_.notify_one();
//} // I commented this out
}
// Lockable requirements
void lock() { wait(); }
bool try_lock() { return try_wait(); }
void unlock() { signal(); }
private:
static const int count_max = 1;
int count_;
std::mutex m_;
std::condition_variable cv_;
};
And my source:
#define LOOPCNT 10
atomic<int> ProcessRow ;
void f4()
{
for(int i=0;i<LOOPCNT;i++)
{
sem2.unlock() ;
++ProcessRow ;
}
cout << "i am out" << endl ;
}
void f5()
{
int icnt = 0 ;
std::chrono::milliseconds sleepDuration(1000);
while(1)
{
sem2.lock() ;
++icnt ;
std::this_thread::sleep_for(sleepDuration);
cout << ProcessRow << "in f5 " << endl ;
--ProcessRow ;
if(icnt >= LOOPCNT)
break ;
}
printf("(%d)\n",icnt) ;
}
The output :
i am out
10in f5
9in f5
8in f5
7in f5
6in f5
5in f5
4in f5
3in f5
2in f5
1in f5
(10)
It looks like a signal only takes effect if a thread is already blocked in pthread_cond_wait; otherwise the signal is lost.
With std::condition_variable, it looks like wait() wakes up once for every notify_one() call: if you call notify_one() 10 seconds earlier and then call wait(), wait() still receives that notification, which is quite different from pthread_cond_t.
Am I missing something in this test? Or do std::condition_variable and pthread_cond_t really behave as the test shows?
Edit:
I think the following shows the test more simply. Sorry, I forgot the unlock, which is why the test failed; the two have the same behavior.
int main()
{
//pthread_mutex_lock(&mutex);
++ProcessRow ;
pthread_cond_signal(&condA);
//pthread_mutex_unlock(&mutex);
printf("sending signal...\n") ;
sleep(10) ;
pthread_mutex_lock(&mutex);
while (ProcessRow <= 0)
pthread_cond_wait(&condA, &mutex);
pthread_mutex_unlock(&mutex);
printf("wait pass through\n") ;
}
This prints:
sending signal...
wait pass through
And for std::condition_variable
int main()
{
sem2.unlock() ;
std::chrono::milliseconds sleepDuration(10000);
cout << "going sleep" << endl ;
std::this_thread::sleep_for(sleepDuration);
sem2.lock() ;
cout << "lock pass through " << endl ;
}
This prints:
going sleep
lock pass through
So it was my fault: I wrote the test wrong, which caused the deadlock. Thanks for all the great advice!
In your pthread code, you never unlock the mutex, so the consumer() function deadlocks on the second iteration. Also, the outer while loop should break out when some condition is satisfied. I suggest breaking out when icnt reaches LOOPCNT, which roughly matches how you break out of the loop in f5().
void *consumer(void *x)
{
int icnt = 0 ;
while(1)
{
pthread_mutex_lock(&mutex);
while (ProcessRow <= 0)
pthread_cond_wait(&condA, &mutex);
__sync_sub_and_fetch(&ProcessRow,1) ;
++icnt ;
printf("receving=(%d) icnt=(%d)\n",ProcessRow, icnt) ;
pthread_mutex_unlock(&mutex);
if (icnt == LOOPCNT) break;
usleep(10000) ;
}
printf("(%d)\n",ProcessRow) ;
return NULL ; // a function declared to return void* must return a value
}
It doesn't seem like your std::thread version of the code closely matches the pthread version at all, so I don't think you can compare their executions in this way. Instead of mimicking a semaphore, I think it is better to use std::condition_variable exactly the way you use the condition variable in the pthread version of the code. This way, you can really compare apples to apples.
std::condition_variable condA;
std::mutex mutex;
volatile int ProcessRow = 0 ;
#define LOOPCNT 10
void producer()
{
int idx ;
for(idx=0;idx<LOOPCNT;idx++)
{
std::unique_lock<std::mutex> lock(mutex);
__sync_add_and_fetch(&ProcessRow,1) ;
condA.notify_one();
printf("sending signal...(%d)\n",ProcessRow) ;
}
printf("I am out ... \n") ;
}
void consumer()
{
int icnt = 0 ;
while(icnt < LOOPCNT)
{
if(icnt > 0) usleep(10000);
std::unique_lock<std::mutex> lock(mutex);
while (ProcessRow <= 0)
condA.wait(lock);
__sync_sub_and_fetch(&ProcessRow,1) ;
++icnt ;
printf("receving=(%d) icnt=(%d)\n",ProcessRow, icnt) ;
}
printf("(%d)\n",ProcessRow) ;
}
Both pthread_cond_t and std::condition_variable work the same way. They are stateless and a signal can only get "lost" if no thread is blocked, in which case no signal is needed because there is no thread that needs one.
Related
I am implementing the producer-consumer problem. The usual implementation uses std::condition_variable together with std::mutex so that the producer and the consumer can notify each other to wake up. This is generally required to avoid thread starvation. But I wonder whether this issue really persists on multiprocessor systems.
I ask because I am implementing this with a lock-free ring buffer, and I don't want to use std::mutex and std::condition_variable on the producer and consumer sides, since calling enqueue() and dequeue() on this queue cannot cause a data race. Below is the code.
template<typename MessageType>
class MessageProcessor
{
public:
~MessageProcessor()
{
stop();
if (workerThread_.joinable())
workerThread_.join();
}
bool postMessage(MessageType const &msg)
{
return queue_.enqueue(msg);
}
void registerHandler(std::function<void(MessageType)> handler, int32_t coreId=-1, std::string_view const &name="")
{
std::call_once(init_, [&](){
handler_ = std::move(handler);
workerThread_ = std::thread{&MessageProcessor::process, this};
if (!setAffinity(coreId, workerThread_))
LOG("Msg Processing thread couldn't be pinned to core: " << coreId);
else
LOG("Msg Processing thread pinned to core: " << coreId);
if (! name.empty())
pthread_setname_np(workerThread_.native_handle(), name.data());
});
}
void stop()
{
stop_ = true;
}
private:
void process() //This is a consumer, runs in a separate thread
{
while(!stop_.load(std::memory_order_acquire))
{
MessageType msg;
if (! queue_.dequeue(msg))
continue;
try
{
handler_(msg);
}
catch(std::exception const &ex)
{
LOG("Error while processing data: " << msg << ", Exception: " << ex.what());
}
catch(...)
{
LOG("UNKNOWN Error while processing data: " << msg);
}
}
}
bool setAffinity(int32_t const coreId, std::thread &thread)
{
int cpuCoreCount = sysconf(_SC_NPROCESSORS_ONLN); // number of online cores; declared in <unistd.h>
if (coreId < 0 || coreId >= cpuCoreCount)
return false;
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
CPU_SET(coreId, &cpuset);
pthread_t currentThread = thread.native_handle();
return pthread_setaffinity_np(currentThread, sizeof(cpu_set_t), &cpuset) == 0;
}
std::thread workerThread_;
std::atomic<bool> stop_{false};
MPMC_Circular_Queue<MessageType, 1024> queue_;
std::function<void(MessageType)> handler_{};
std::once_flag init_;
};
int main()
{
pthread_setname_np(pthread_self(), "MAIN");
MessageProcessor<int> processor;
processor.registerHandler([](int i){
LOG("Received value: " << i);
}, 2, "PROCESSOR");
std::thread t1([&]() { //Producer thread1
for (int i = 1; i <= 100000; i += 2)
{
LOG("Submitting value: " << i);
processor.postMessage(i);
}
});
pthread_setname_np(t1.native_handle(), "ODD ");
std::thread t2([&]() { //Producer thread2
for (int i = 2; i <= 100000; i += 2)
{
LOG("Submitting value: " << i);
processor.postMessage(i);
}
});
pthread_setname_np(t2.native_handle(), "EVEN");
for (int i = 1; i <= 100000; ++i)
{
LOG("Running main thread: " << i);
}
t1.join();
t2.join();
return 0;
}
Can this code cause thread starvation on a modern multiprocessor system? MPMC_Circular_Queue is a lock-free bounded queue.
I need to synchronize several threads in my application properly. The threads are divided into group A, which may contain more than one thread, and thread B. Thread B is supposed to be the unlocker thread, and it should unlock only one thread from group A at a time. I tried to achieve a stable solution using pthread_mutex_t with code like this:
// thread group A
...
while(...)
{
pthread_mutex_lock(&lock) ;
// only one thread at the same time allowed from here
...
}
// thread B
while(...)
{
pthread_mutex_unlock(&lock) ;
...
}
...
int main()
{
...
pthread_mutex_init(&lock, NULL) ;
pthread_mutex_lock(&lock) ;
...
// start threads
...
}
This solution works but is unstable and sometimes deadlocks: if pthread_mutex_unlock(&lock) is called before pthread_mutex_lock(&lock), the mutex stays locked, because pthread_mutex_unlock(&lock) has no effect when it is called before pthread_mutex_lock(&lock).
I found one workaround, but it is a poor one because it burns extra CPU time needlessly. It looks like this:
bool lock_cond ;
// thread group A
...
while(...)
{
lock_cond = true ;
pthread_mutex_lock(&lock) ;
lock_cond = false ;
// only one thread at the same time allowed from here
...
}
// thread B
while(...)
{
while(!lock_cond)
;
pthread_mutex_unlock(&lock) ;
...
}
...
int main()
{
...
pthread_mutex_init(&lock, NULL) ;
pthread_mutex_lock(&lock) ;
...
// start threads
...
}
So my question is how to properly implement thread synchronization in such a scenario. Can I use pthread_mutex_t variables for that, or do I have to use a semaphore?
Please explain with code examples.
There are many kinds of synchronization patterns between different threads.
Your scenario seems to be a good fit for a binary semaphore rather than a mutex:
Thread B doesn't "lock and release" - it just signals threads in the A group that they may proceed with their work.
It's not clear that a thread in A, once done with its own work, allows other threads in A to start work.
C++20 adds std::binary_semaphore (in the <semaphore> header). With an older standard, you'll need to use a C++ library that implements semaphores (perhaps this one? I haven't tried it myself), or use POSIX semaphores in C-style code.
After studying and modifying code samples taken from
https://en.cppreference.com/w/cpp/thread/condition_variable
for my needs I created the following:
#include <iostream>
#include <string>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <unistd.h>
#include <random>
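#include <ctime>
#include <cstdlib>   // rand, srand
#include <atomic>
#include <pthread.h>
std::mutex m, m1;
std::condition_variable cv, cv1;
bool ready = false, ready2 = false;
bool processed = false;
pthread_mutex_t only_one ;
std::atomic<bool> done, done2 ; // written by main and read by other threads, so they must be atomic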
class Task
{
public:
void thread_groupA(std::string msg)
{
while(!done)
{
pthread_mutex_lock(&only_one) ;
{
std::lock_guard<std::mutex> lk(m1);
ready2 = true;
}
cv1.notify_one();
std::cout << msg << std::endl ;
std::cout << "before sleep 1 second" << std::endl ;
sleep(1); // sleep for demonstration that it really works
std::cout << "after sleep 1 second" << std::endl ;
std::cout << "before cv.wait()" << std::endl ;
std::unique_lock<std::mutex> lk(m);
cv.wait(lk, []{return ready;});
pthread_mutex_unlock(&only_one) ;
std::cout << "after cv.wait()" << std::endl ;
ready = false ;
processed = true;
lk.unlock();
cv.notify_one();
int val = rand() % 10000 ;
usleep(val) ; // server clients timing simulation
// different clients provide different data so clients timing isn't the same.
// fastest client's thread gets passed through 'pthread_mutex_lock(&only_one)'
}
}
} ;
void threadB()
{
int aa = 2, bb = 0 ;
while(!done2)
{
std::unique_lock<std::mutex> lk(m1);
cv1.wait(lk, []{return ready2;});
ready2 = false ;
if(done2)
break ;
if(bb % aa)
{
std::cout << "before sleep 5 seconds" << std::endl ;
sleep(5); // sleep for demonstration that it really works
std::cout << "after sleep 5 seconds" << std::endl ;
}
{
std::lock_guard<std::mutex> lk(m);
ready = true;
}
cv.notify_one();
{
std::unique_lock<std::mutex> lk(m);
cv.wait(lk, []{return processed;});
processed = false ;
}
++bb ;
}
}
int main()
{
pthread_mutex_init(&only_one, NULL) ;
done = false ;
done2 = false ;
srand(time(0)) ;
Task * taskPtr1 = new Task();
Task * taskPtr2 = new Task();
std::thread worker1(&Task::thread_groupA, taskPtr1, "thread 1");
std::thread worker2(&Task::thread_groupA, taskPtr2, "thread 2");
std::thread signal(threadB);
std::string s ;
do
{
getline(std::cin, s) ;
}
while(s.compare("stop") != 0) ;
done = true ;
worker1.join();
worker2.join();
done2 = true ;
{
std::lock_guard<std::mutex> lk(m1);
ready2 = true;
}
cv1.notify_one();
signal.join();
}
Based on this code I can now build the implementation for my app. I hope it will be stable.
As an educational exercise I'm implementing a thread pool using condition variables. A controller thread creates a pool of threads that wait on a signal (an atomic variable being set to a value above zero). When signaled the threads wake, perform their work, and when the last thread is done it signals the main thread to awaken. The controller thread blocks until the last thread is complete. The pool is then available for subsequent re-use.
Every now and then I was getting a timeout on the controller thread waiting for the worker to signal completion (likely because of a race condition when decrementing the active work counter), so in an attempt to solidify the pool I replaced the "wait(lck)" form of the condition variable's wait method with "wait(lck, predicate)". Since doing this, the behaviour of the thread pool is such that it seems to permit decrementing of the active work counter below 0 (which is the condition for reawakening the controller thread) - I have a race condition. I've read countless articles on atomic variables, synchronisation, memory ordering, spurious and lost wakeups on stackoverflow and various other sites, have incorporated what I've learnt to the best of my ability, and still cannot for the life of me work out why the way I've coded the predicated wait just does not work. The counter should only ever be as high as the number of threads in the pool (say, 8) and as low as zero. I've started losing faith in myself - it just shouldn't be this hard to do something fundamentally simple. There is clearly something else I need to learn here :)
Considering of course that there was a race condition I ensured that the two variables that drive the awakening and termination of the pool are both atomic, and that both are only ever changed while protected with a unique_lock. Specifically, I made sure that when a request to the pool was launched, the lock was acquired, the active thread counter was changed from 0 to 8, unlocked the mutex, and then "notified_all". The controller thread would only then be awakened with the active thread count at zero, once the last worker thread decremented it that far and "notified_one".
In the worker thread, the condition variable would wait and wake only when the active thread count is greater than zero, unlock the mutex, in parallel proceed to execute the work preassigned to the processor when the pool was created, re-acquire the mutex, and atomically decrement the active thread count. It would then, while still supposedly protected by the lock, test if it was the last thread still active, and if so, again unlock the mutex and "notify_one" to awaken the controller.
The problem is - the active thread counter repeatedly proceeds below zero after even only 1 or 2 iterations. If I test the active thread count at the start of a new workload, I could find the active thread count down around -6 - it is as if the pool was allowed to reawaken the controller thread before the work was completed.
Given that the thread counter and terminate flag are both atomic variables and are only ever modified while under the protection of the same mutex, I am using sequential memory ordering for all updates, I just cannot see how this is happening and I'm lost.
#include <stdafx.h>
#include <Windows.h>
#include <iostream>
#include <thread>
using std::thread;
#include <mutex>
using std::mutex;
using std::unique_lock;
#include <condition_variable>
using std::condition_variable;
#include <atomic>
using std::atomic;
#include <chrono>
#include <vector>
using std::vector;
class IWorkerThreadProcessor
{
public:
virtual void Process(int) = 0;
};
class MyProcessor : public IWorkerThreadProcessor
{
int index_ = 0;
public:
MyProcessor(int index)
{
index_ = index;
}
void Process(int threadindex)
{
for (int i = 0; i < 5000000; i++);
std::cout << '(' << index_ << ':' << threadindex << ") ";
}
};
#define MsgBox(x) do{ MessageBox(NULL, x, L"", MB_OK ); }while(false)
class ThreadPool
{
private:
atomic<unsigned int> invokations_ = 0;
//This goes negative when using the wait_for with predicate
atomic<int> threadsActive_ = 0;
atomic<bool> terminateFlag_ = false;
vector<std::thread> threads_;
atomic<unsigned int> poolSize_ = 0;
mutex mtxWorker_;
condition_variable cvSignalWork_;
condition_variable cvSignalComplete_;
public:
~ThreadPool()
{
TerminateThreads();
}
void Init(std::vector<IWorkerThreadProcessor*>& processors)
{
unique_lock<mutex> lck2(mtxWorker_);
threadsActive_ = 0;
terminateFlag_ = false;
poolSize_ = processors.size();
for (int i = 0; i < poolSize_; ++i)
threads_.push_back(thread(&ThreadPool::launchMethod, this, processors[i], i));
}
void ProcessWorkload(std::chrono::milliseconds timeout)
{
//Only used to see how many invocations I was getting through before experiencing the issue - sadly it's only one or two
invokations_++; // matches the member declared above as invokations_
try
{
unique_lock<mutex> lck(mtxWorker_);
//!!!!!! If I use the predicated wait this break will fire !!!!!!
if (threadsActive_.load() != 0)
__debugbreak();
threadsActive_.store(poolSize_);
lck.unlock();
cvSignalWork_.notify_all();
lck.lock();
if (!cvSignalComplete_.wait_for(
lck,
timeout,
[this] { return threadsActive_.load() == 0; })
)
{
//As you can tell this has taken me through a journey trying to characterise the issue...
if (threadsActive_ > 0)
MsgBox(L"Thread pool timed out with still active threads");
else if (threadsActive_ == 0)
MsgBox(L"Thread pool timed out with zero active threads");
else
MsgBox(L"Thread pool timed out with negative active threads");
}
}
catch (std::exception e)
{
__debugbreak();
}
}
void launchMethod(IWorkerThreadProcessor* processor, int threadIndex)
{
do
{
unique_lock<mutex> lck(mtxWorker_);
//!!!!!! If I use this predicated wait I see the failure !!!!!!
cvSignalWork_.wait(
lck,
[this] {
return
threadsActive_.load() > 0 ||
terminateFlag_.load();
});
//!!!!!!!! Does not cause the failure but obviously will not handle
//spurious wake-ups !!!!!!!!!!
//cvSignalWork_.wait(lck);
if (terminateFlag_.load())
return;
//Unlock to parallelise the work load
lck.unlock();
processor->Process(threadIndex);
//Re-lock to decrement the work count
lck.lock();
//This returns the value before the subtraction so theoretically if the previous value was 1 then we're the last thread going and we can now signal the controller thread to wake. This is the only place that the decrement happens so I don't know how it could possibly go negative
if (threadsActive_.fetch_sub(1, std::memory_order_seq_cst) == 1)
{
lck.unlock();
cvSignalComplete_.notify_one();
}
else
lck.unlock();
} while (true);
}
void TerminateThreads()
{
try
{
unique_lock<mutex> lck(mtxWorker_);
if (!terminateFlag_)
{
terminateFlag_ = true;
lck.unlock();
cvSignalWork_.notify_all();
for (int i = 0; i < threads_.size(); i++)
threads_[i].join();
}
}
catch (std::exception e)
{
__debugbreak();
}
}
};
int main()
{
std::vector<IWorkerThreadProcessor*> processors;
for (int i = 0; i < 8; i++)
processors.push_back(new MyProcessor(i));
std::cout << "Instantiating thread pool\n";
auto pool = new ThreadPool;
std::cout << "Initialising thread pool\n";
pool->Init(processors);
std::cout << "Thread pool initialised\n";
for (int i = 0; i < 200; i++)
{
std::cout << "Workload " << i << "\n";
pool->ProcessWorkload(std::chrono::milliseconds(500));
std::cout << "Workload " << i << " complete." << "\n";
}
for (auto a : processors)
delete a;
delete pool;
return 0;
}
class ThreadPool
{
private:
atomic<unsigned int> invokations_ = 0;
std::atomic<unsigned int> awakenings_ = 0;
std::atomic<unsigned int> startedWorkloads_ = 0;
std::atomic<unsigned int> completedWorkloads_ = 0;
atomic<bool> terminate_ = false;
atomic<bool> stillFiring_ = false;
vector<std::thread> threads_;
atomic<unsigned int> poolSize_ = 0;
mutex mtx_;
condition_variable cvSignalWork_;
condition_variable cvSignalComplete_;
public:
~ThreadPool()
{
TerminateThreads();
}
void Init(std::vector<IWorkerThreadProcessor*>& processors)
{
unique_lock<mutex> lck2(mtx_);
//threadsActive_ = 0;
terminate_ = false;
poolSize_ = processors.size();
for (int i = 0; i < poolSize_; ++i)
threads_.push_back(thread(&ThreadPool::launchMethod, this, processors[i], i));
awakenings_ = 0;
completedWorkloads_ = 0;
startedWorkloads_ = 0;
invokations_ = 0;
}
void ProcessWorkload(std::chrono::milliseconds timeout)
{
try
{
unique_lock<mutex> lck(mtx_);
invokations_++;
if (startedWorkloads_ != 0)
__debugbreak();
if (completedWorkloads_ != 0)
__debugbreak();
if (awakenings_ != 0)
__debugbreak();
if (stillFiring_)
__debugbreak();
stillFiring_ = true;
lck.unlock();
cvSignalWork_.notify_all();
lck.lock();
if (!cvSignalComplete_.wait_for(
lck,
timeout,
//[this] { return this->threadsActive_.load() == 0; })
[this] { return completedWorkloads_ == poolSize_ && !stillFiring_; })
)
{
if (completedWorkloads_ < poolSize_)
{
if (startedWorkloads_ < poolSize_)
MsgBox(L"Thread pool timed out with some threads unstarted");
else if (startedWorkloads_ == poolSize_)
MsgBox(L"Thread pool timed out with all threads started but not all completed");
}
else
__debugbreak();
}
if (completedWorkloads_ != poolSize_)
__debugbreak();
if (awakenings_ != poolSize_)
__debugbreak();
awakenings_ = 0;
completedWorkloads_ = 0;
startedWorkloads_ = 0;
}
catch (std::exception e)
{
__debugbreak();
}
}
void launchMethod(IWorkerThreadProcessor* processor, int threadIndex)
{
do
{
unique_lock<mutex> lck(mtx_);
cvSignalWork_.wait(
lck,
[this] {
return
(stillFiring_ && (startedWorkloads_ < poolSize_)) ||
terminate_;
});
awakenings_++;
if (startedWorkloads_ == 0 && terminate_)
return;
if (stillFiring_ && startedWorkloads_ < poolSize_) //guard against spurious wakeup
{
startedWorkloads_++;
if (startedWorkloads_ == poolSize_)
stillFiring_ = false;
lck.unlock();
processor->Process(threadIndex);
lck.lock();
completedWorkloads_++;
if (completedWorkloads_ == poolSize_)
{
lck.unlock();
cvSignalComplete_.notify_one();
}
else
lck.unlock();
}
else
lck.unlock();
} while (true);
}
void TerminateThreads()
{
try
{
unique_lock<mutex> lck(mtx_);
if (!terminate_) //Don't attempt to double-terminate
{
terminate_ = true;
lck.unlock();
cvSignalWork_.notify_all();
for (int i = 0; i < threads_.size(); i++)
threads_[i].join();
}
}
catch (std::exception e)
{
__debugbreak();
}
}
};
I'm not certain if the following helps solve the problem, but I think the error is as shown below:
This
if (!cvSignalComplete_.wait_for(
lck,
timeout,
[this] { return threadsActive_.load() == 0; })
)
should be replaced by
if (!cvSignalComplete_.wait_for(
lck,
timeout,
[&] { return threadsActive_.load() == 0; })
)
It looks like the lambda is not accessing the instantiated member of the class. Here is a reference to back my case: see the Lambda Capture section of this page.
Edit:
There is another place where you use wait with a lambda:
cvSignalWork_.wait(
lck,
[this] {
return
threadsActive_.load() > 0 ||
terminateFlag_.load();
});
Maybe modify all the lambdas and then see if it works?
The reason I'm looking at the lambda is because it seems like a case similar to a spurious wakeup. Hope it helps.
pthread_cond_wait wake many threads example
Code to wake up thread 1 & 3 on some broadcast from thread 0.
Setup: Win7 with mingw32, g++ 4.8.1 with mingw32-pthreads-w32
pthread condition variable
Solution:
http://pastebin.com/X8aQ5Fz8
#include <iostream>
#include <string>
#include <list>
#include <map>
#include <pthread.h>
#include <fstream>
#include <sstream> // for ostringstream
#define N_THREAD 7
using namespace std;
// Prototypes
int main();
int scheduler();
void *worker_thread(void *ptr);
string atomic_output(int my_int, int thread_id);
// Global variables
//pthread_t thread0, thread1, thread2, thread3, thread4, thread5, thread6, thread7;
pthread_t m_thread[N_THREAD];
int count = 1;
pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t condition_var = PTHREAD_COND_INITIALIZER;
// Main
int main() {
cout << "Launching main. \n";
//Start to monitor for exceptions
register_exception_handler();
//Start scheduler
scheduler();
return 0;
}
// Scheduler
int scheduler() {
// Starting scheduler log file
ofstream scheduler_log;
scheduler_log.open ("scheduler_log.txt");
//scheduler_log << "[Scheduler] Starting." << endl;
cout << "[Scheduler] Starting. \n";
// Scheduler::Main Section
int thread_id[N_THREAD];
for(int i=0;i<N_THREAD;i++) {
thread_id[i] = i;
pthread_create( &m_thread[i], NULL, worker_thread, (void *) &thread_id[i]);
}
for(int i=0;i<N_THREAD;i++)
pthread_join(m_thread[i], NULL);
cout << "[Scheduler] Ending. \n";
// Closing scheduler log file
scheduler_log.close();
return 0;
}
string atomic_output(int my_int, int thread_id) {
ostringstream stm;
stm << "Thread ";
stm << thread_id;
stm << ": ";
//count fn
stm << my_int;
stm << "\n";
//stm << "Finished. \n";
return stm.str();
}
void *worker_thread(void *ptr) {
string line;
//int boo = 0;
int thread_id = *(int *) ptr;
//if(thread_id == 0)
// pthread_mutex_lock( &count_mutex );
for(int i=0;i<10;i++) {
//boo++;
if (thread_id == 1) {
pthread_mutex_lock(&count_mutex);
while (count == 1) {
cout << "[Thread 1] Before pthread_cond_wait...\n";
pthread_cond_wait( &condition_var, &count_mutex );
cout << "[Thread 1] After pthread_cond_wait...\n";
}
pthread_mutex_unlock(&count_mutex);
}
if (thread_id == 3) {
pthread_mutex_lock(&count_mutex);
while (count == 1) {
cout << "[Thread 3] Before pthread_cond_wait...\n";
pthread_cond_wait( &condition_var, &count_mutex );
cout << "[Thread 3] After pthread_cond_wait...\n";
}
pthread_mutex_unlock(&count_mutex);
}
//count fn
line = atomic_output(i, *(int *)ptr);
cout << line;
if (i == 5) {
if(thread_id == 0) {
pthread_mutex_lock( &count_mutex );
count = 0;
pthread_mutex_unlock( &count_mutex );
pthread_cond_broadcast(&condition_var);
}
}
}
//line = atomic_output(0, *(int *)ptr);
//cout << line;
return NULL; // a void* thread function must return a value
}
(old) -= What I've tried =-
*Edit: an early version of the code had while(0) instead of while(predicate). I am keeping it here so the comments remain easy to follow.
Code 1: http://pastebin.com/rCbYjPKi
I tried while(0) pthread_cond_wait( &condition_var, &count_mutex ); together with pthread_cond_broadcast(&condition_var); and the thread does not respect the condition.
Proof that the condition is not respected: http://pastebin.com/GW1cg4fY
Thread 0: 0
Thread 0: 1
Thread 0: 2
Thread 0: 3
Thread 2: 0
Thread 6: 0
Thread 1: 0 <-- Here, Thread 1 is not supposed to tick before Thread 0 hits 5. Thread 0 is at 3.
Code 2: http://pastebin.com/g3E0Mw9W
I tried pthread_cond_wait( &condition_var, &count_mutex ); in threads 1 and 3, and the program does not return.
Either thread 1 or thread 3 waits forever, even with broadcast, which is supposed to wake up all waiting threads. Obviously something is not working: the code or the library?
More:
I've tried unlocking the mutex first and then broadcasting. I've tried broadcasting and then unlocking. Neither works.
I've tried using signal instead of broadcast, with the same problem.
References that I can't get to work (top Google search results)
http://www.yolinux.com/TUTORIALS/LinuxTutorialPosixThreads.html
http://docs.oracle.com/cd/E19455-01/806-5257/6je9h032r/index.html
http://www-01.ibm.com/support/knowledgecenter/ssw_i5_54/apis/users_76.htm
Code 3: http://pastebin.com/tKP7F8a8
Trying to use a predicate variable, count, to fix the race condition. Still a problem: it doesn't prevent thread 1 and thread 3 from running while thread 0 is between 0 and 5.
What would be the code to wake up threads 1 and 3 on some function call from thread 0?
if(thread_id == 0)
pthread_mutex_lock( &count_mutex );
for(int i=0;i<10;i++) {
//boo++;
if (thread_id == 1) {
while(0)
pthread_cond_wait( &condition_var, &count_mutex );
}
None of this makes any sense. The correct way to wait for a condition variable is:
pthread_mutex_lock(&mutex_associated_with_condition_variable);
while (!predicate)
pthread_cond_wait(&condition_variable, &mutex_associated_with_condition_variable);
Notice:
The mutex must be locked.
The predicate (thing you are waiting for) must be checked before waiting.
The wait must be in a loop.
Breaking any of these three rules will cause the kind of problems you are seeing. Your main problem is that you break the second rule, waiting even when the thing you want to wait for has already happened.
I am using boost::thread and have run into a problem.
The question is: is there any way to join a thread before the previous join has finished?
For example:
int id=1;
void temp()
{
int threadID = id++;
for(int i=0;i<3;i++)
{
cout<<threadID << " : "<<i<<endl;
boost::this_thread::sleep(boost::posix_time::millisec(100));
}
}
int main(void)
{
boost::thread thrd1(temp);
thrd1.join();
boost::thread thrd2(temp);
boost::thread thrd3(temp);
thrd2.join();
thrd3.join();
return 0;
}
In this simple example, the order of output may be:
1:0
1:1
1:2
2:0
3:0
3:1
2:1
2:2
3:2
As the example above shows, thrd2 and thrd3 start to run only after thrd1 finishes.
Is there any way to let thrd2 and thrd3 run before thrd1 finishes?
You can use Boost.Thread's condition variables to synchronize on a condition more complex than what join can provide. Here's an example based on yours:
#include <iostream>
#include <boost/thread.hpp>
#include <boost/thread/locks.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition_variable.hpp>
boost::mutex mutex;
boost::condition_variable cond;
// These three variables protected by mutex
bool finishedFlag = false;
int finishedID = 0;
int finishedCount = 0;
int id=1;
void temp()
{
int threadID = id++;
for(int i=0;i<3;i++)
{
std::cout << threadID << " : " << i << std::endl;
boost::this_thread::sleep(boost::posix_time::millisec(100));
}
{
boost::lock_guard<boost::mutex> lock(mutex);
finishedFlag = true;
finishedID = threadID;
++finishedCount;
}
cond.notify_one();
}
int main(void)
{
boost::thread thrd1(temp);
boost::this_thread::sleep(boost::posix_time::millisec(300));
boost::thread thrd2(temp);
boost::thread thrd3(temp);
boost::unique_lock<boost::mutex> lock(mutex);
while (finishedCount < 3)
{
while (finishedFlag != true)
{
// mutex is released while we wait for cond to be signalled.
cond.wait(lock);
// mutex is reacquired as soon as we finish waiting.
}
finishedFlag = false;
if (finishedID == 1)
{
// Do something special about thrd1 finishing
std::cout << "thrd1 finished" << std::endl;
}
};
// All 3 threads finished at this point.
return 0;
}
The join function means "stop this thread until that thread finishes." It's a simple tool for a simple purpose: ensuring that, past this point in the code, thread X is finished.
What you want to do isn't a join operation at all. What you want is some kind of synchronization primitive to communicate and synchronize behavior between threads. Boost.Thread has a number of alternatives for synchronization, from conditions to mutexes.