producer consumer condition variable - c++

I read data from a file and process it in a separate thread. I am trying to parallelize the data reading and the processing on two threads, using condition variables with infinite loops.
However, I end up with deadlocks.
char totBuf[300000];
unsigned long totLen = 0;
unsigned long lenProc = 0;
//shared state referenced below
mutex m;
condition_variable cv;
bool ready = false;
ifstream inFile;
void procData()
{
while(lenProc < totLen)
{
//process data from totBuf
//increment lenProc;
}
ready = false;
if(lenProc >= totLen && totLen > 100000)
{
cv.notify_one();
unique_lock<mutex> lk(m);
cv.wait(lk, []{return totLen>0 && lenProc<totLen;});
}
}
void readData()
{
//declared so that we notify procData only once
bool firstNot = true;
while(true)
{
//read data into
//file.read(len);
//file.read(oBf, len);
memcpy(&totBuf[totLen], oBf, len);
//increment totLen
if(totLen > 10000)
cv.notify_one();
if(totLen > 100000)
{
cv.notify_one();
unique_lock<mutex> lk(m);
cv.wait(lk, []{return !ready;});
totLen = 0;
firstNot = true;
lenProc = 0;
}
}
}
int main(int argc, char* argv[])
{
inFile.open(argv[1], ios::in|ios::binary);
thread prod(readData);
thread cons(procData);
prod.join();
ready = true;
cout << "prod joined\n";
cv.notify_all();
cons.join();
cout << "cons joined\n";
inFile.close();
return(0);
}
Some explanation in case it looks weird: although totBuf is declared with a size of 300k, I reset totLen to 0 once it passes 100k, because I read data in chunks from the file and the next chunk could be large. When the size reaches 100k I reset totLen so that new data is written to the beginning of totBuf again.
I notify the consumer as soon as the size reaches 10k so that as much of the processing as possible happens concurrently.
This may well be a bad design, and I am willing to redesign from scratch. What I ultimately want is a fully lock-free implementation, but I am new to threads, so this is a stop-gap / the best I could do right now.
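For reference, a minimal sketch (not the code from the post) of the same hand-off written as a standard condition-variable-guarded queue; the chunk type and the names producerPush, producerFinish and consumer are illustrative assumptions:

#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>

std::mutex m;
std::condition_variable cv;
std::queue<std::string> chunks;   // each element is one chunk read from the file
bool done = false;                // producer sets this when the file is exhausted

void producerPush(std::string chunk)
{
    {
        std::lock_guard<std::mutex> lk(m);
        chunks.push(std::move(chunk));
    }
    cv.notify_one();              // waiter re-checks the predicate under the lock
}

void producerFinish()
{
    {
        std::lock_guard<std::mutex> lk(m);
        done = true;
    }
    cv.notify_one();
}

void consumer()
{
    while (true)
    {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, []{ return !chunks.empty() || done; });
        if (chunks.empty() && done)
            break;                // nothing left and nothing more coming
        std::string chunk = std::move(chunks.front());
        chunks.pop();
        lk.unlock();              // process outside the lock
        // process(chunk);
    }
}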

How to properly wait for condition variable in C++?

I'm trying to create an asynchronous I/O file reader in C++ under Linux. The example I have uses two buffers. The first read blocks. Then, each time around the main loop, I launch the I/O asynchronously and call process(), which runs the simulated processing of the current block. When processing is done, we wait on the condition variable. The idea is that the asynchronous completion handler should notify the condition variable.
Unfortunately, the notify can happen before the wait, and it seems this is not how the condition variable's wait() function works. How should I rewrite the code so that the loop waits until the asynchronous I/O has completed?
#include <aio.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>
#include <condition_variable>
#include <cstring>
#include <iostream>
#include <thread>
using namespace std;
using namespace std::chrono_literals;
constexpr uint32_t blockSize = 512;
mutex readMutex;
condition_variable cv;
int fh;
int bytesRead;
void process(char* buf, uint32_t bytesRead) {
cout << "processing..." << endl;
usleep(100000);
}
void aio_completion_handler(sigval_t sigval) {
struct aiocb* req = (struct aiocb*)sigval.sival_ptr;
// check whether asynch operation is complete
if (aio_error(req) == 0) {
int ret = aio_return(req);
bytesRead = req->aio_nbytes;
cout << "ret == " << ret << endl;
cout << (char*)req->aio_buf << endl;
}
{
unique_lock<mutex> readLock(readMutex);
cv.notify_one();
}
}
void thready() {
char* buf1 = new char[blockSize];
char* buf2 = new char[blockSize];
aiocb cb;
char* processbuf = buf1;
char* readbuf = buf2;
fh = open("smallfile.dat", O_RDONLY);
if (fh < 0) {
throw std::runtime_error("cannot open file!");
}
memset(&cb, 0, sizeof(aiocb));
cb.aio_fildes = fh;
cb.aio_nbytes = blockSize;
cb.aio_offset = 0;
// Fill in callback information
/*
Using SIGEV_THREAD to request a thread callback function as a notification
method
*/
cb.aio_sigevent.sigev_notify_attributes = nullptr;
cb.aio_sigevent.sigev_notify = SIGEV_THREAD;
cb.aio_sigevent.sigev_notify_function = aio_completion_handler;
/*
The context to be transmitted is loaded into the handler (in this case, a
reference to the aiocb request itself). In this handler, we simply refer to
the arrived sigval pointer and use the AIO function to verify that the request
has been completed.
*/
cb.aio_sigevent.sigev_value.sival_ptr = &cb;
int currentBytesRead = read(fh, buf1, blockSize); // read the 1st block
while (true) {
cb.aio_buf = readbuf;
aio_read(&cb); // each next block is read asynchronously
process(processbuf, currentBytesRead); // process while waiting
{
unique_lock<mutex> readLock(readMutex);
cv.wait(readLock);
}
currentBytesRead = bytesRead; // make local copy of global modified by the asynch code
if (currentBytesRead < blockSize) {
break; // last time, get out
}
cout << "back from wait" << endl;
swap(processbuf, readbuf); // switch to other buffer for next time
currentBytesRead = bytesRead; // create local copy
}
delete[] buf1;
delete[] buf2;
}
int main() {
try {
thready();
} catch (std::exception& e) {
cerr << e.what() << '\n';
}
return 0;
}
A condition variable should generally be used for
- waiting until it is possible that the predicate (for example a shared variable) has changed, and
- notifying waiting threads that the predicate may have changed, so that waiting threads check the predicate again.
However, you seem to be attempting to use the state of the condition variable itself as the predicate. This is not how condition variables are supposed to be used and may lead to race conditions such as those described in your question. Another reason to always check the predicate is that spurious wakeups are possible with condition variables.
In your case, it would probably be appropriate to create a shared variable
bool operation_completed = false;
and use that variable as the predicate for the condition variable. Access to that variable should always be controlled by the mutex.
You can then change the lines
{
unique_lock<mutex> readLock(readMutex);
cv.notify_one();
}
to
{
unique_lock<mutex> readLock(readMutex);
operation_completed = true;
cv.notify_one();
}
and change the lines
{
unique_lock<mutex> readLock(readMutex);
cv.wait(readLock);
}
to:
{
unique_lock<mutex> readLock(readMutex);
while ( !operation_completed )
cv.wait(readLock);
}
Instead of
while ( !operation_completed )
cv.wait(readLock);
you can also write
cv.wait( readLock, []{ return operation_completed; } );
which is equivalent. See the documentation of std::condition_variable::wait for further information.
Of course, operation_completed should also be set back to false when appropriate, while the mutex is locked.
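For illustration only (this reset does not appear in the answer's snippets above), the flag could be cleared at the top of the read loop, before the next asynchronous read is issued, while holding the same readMutex:

while (true) {
    {
        std::unique_lock<std::mutex> readLock(readMutex);
        operation_completed = false;   // clear before launching the next aio_read
    }
    cb.aio_buf = readbuf;
    aio_read(&cb);                     // completion handler will set the flag back to true
    // ... process(), then wait on cv with the operation_completed predicate ...
}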

C++ Mutex and Condition Variable Unlocking/Synchronisation

I want to have several threads all waiting on a condition variable (CV), and when the main thread updates a variable they all execute. However, I need the main thread to wait until all of them have completed before moving on. The other threads don't end; they simply go back around and wait again, so I can't use thread.join(), for example.
I've got the first half working: I can trigger the threads, but the main thread just hangs and doesn't continue. Below is my current code:
#include <iostream> // std::cout
#include <thread> // std::thread
#include <mutex> // std::mutex, std::unique_lock
#include <condition_variable> // std::condition_variable
#include <Windows.h>
#define N 3
std::mutex mtx;
std::condition_variable cv;
bool ready = false;
bool finished[N];
void print_id(int id) {
while (1) {
std::unique_lock<std::mutex> lck(mtx); //Try and Lock the Mutex
while (finished[id]) cv.wait(lck); //Wait until finished is false
// ...
std::cout << "thread " << id << '\n';
finished[id] = true; //Set finished to be true. When true, program should continue
}
}
int main()
{
std::thread threads[N];
// spawn N threads:
for (int i = 0; i < N; ++i) {
threads[i] = std::thread(print_id, i); //Create n threads
finished[i] = true; //Set default finished to be true
}
std::cout << "N threads ready to race...\n";
for (int i = 0; i < 5; i++) {
std::unique_lock<std::mutex> lck(mtx); //Lock mutex
for (int i = 0; i < N; i++) {
finished[i] = false; //Set finished to false, this will break the CV in each thread
}
cv.notify_all(); //Notify all threads
cv.wait(lck, [] {return finished[0] == true; }); //Wait until all threads have finished (but not ended)
std::cout << "finished, Sleeping for 2s\n";
Sleep(2000);
}
return 0;
}
Thank you.
Edit: I am aware that I am currently only checking the status of finished[0] and not each one. This is done just for simplicity at the moment; eventually it would need to be all of them. I will write a function to manage this later.
You have cv.wait(lck, [] {return finished[0] == true; }); in the main thread, but it is never notified.
You need to notify it, and you had better use another condition_variable for it, not the same one used for the worker-thread notification.
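For example, a minimal sketch of that approach; the second condition variable cv_done and the helper all_finished() are names introduced here for illustration, not part of the original code:

#include <condition_variable>
#include <mutex>

constexpr int N = 3;
std::mutex mtx;
std::condition_variable cv;       // wakes the worker threads
std::condition_variable cv_done;  // wakes the main thread when a worker finishes
bool finished[N] = { true, true, true };  // start "done" so workers wait for work

bool all_finished() {             // helper: true once every worker has finished
    for (int i = 0; i < N; ++i)
        if (!finished[i]) return false;
    return true;
}

void print_id(int id) {
    while (true) {
        std::unique_lock<std::mutex> lck(mtx);
        cv.wait(lck, [id] { return !finished[id]; }); // wait until main hands out work
        // ... do the work ...
        finished[id] = true;
        cv_done.notify_one();                         // tell main this worker is done
    }
}

// In main, for each round:
//   std::unique_lock<std::mutex> lck(mtx);
//   for (int i = 0; i < N; ++i) finished[i] = false;
//   cv.notify_all();
//   cv_done.wait(lck, [] { return all_finished(); });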

Filling and saving shared buffer between threads

I'm working with an API that retrieves I/Q data. Calling the function bbGetIQ(m_handle, &pkt); fills a buffer. This runs in a thread that loops while the user hasn't input "stop". pkt is a structure, and the buffer used is pkt.iqData = &m_buffer[0];, where m_buffer is a vector of float. The size of the vector is 5000, and on each loop iteration the buffer is filled with 5000 values.
I want to save the data from the buffer into a file. I was doing this right after the call to bbGetIQ, but writing that way is time-consuming: data wasn't retrieved fast enough, so the API dropped data in order to keep filling its own buffer.
Here's what my code looked like:
void Acquisition::recordIQ(){
int cpt = 0;
ofstream myfile;
while(1){
while (keep_running)
{
cpt++;
if(cpt < 2)
myfile.open ("/media/ssd/IQ_Data.txt");
bbGetIQ(m_handle, &pkt); //Retrieve I/Q data
//Writing content of buffer into the file.
for(int i=0; i<m_buffer.size(); i++)
myfile << m_buffer[i] << endl;
}
cpt = 0;
myfile.close();
}
}
Then I tried to only write into the file when we leave the loop:
void Acquisition::recordIQ(){
int cpt = 0;
ofstream myfile;
int next=0;
vector<float> data;
while(1){
while ( keep_running)
{
if(keep_running == false){
myfile.open ("/media/ssd/IQ_Data.txt");
for(int i=0; i<data.size(); i++)
myfile << data[i] << endl;
myfile.close();
break;
}
cpt++;
data.resize(next + m_buffer.size());
bbGetIQ(m_handle, &pkt); //retrieve data
std::copy(m_buffer.begin(), m_buffer.end(), data.begin() + next); //copy content of the buffer into final vector
next += m_buffer.size(); //next index
}
cpt = 0;
}
}
I no longer get data loss from the API, but the issue is that I'm limited by the size of the data vector. For example, I can't let it retrieve data all night.
My idea is to use two threads. One will retrieve data and the other will write the data into a file. The two threads will share a circular buffer, where the first thread fills the buffer and the second thread reads it and writes the content to a file. As it is a shared buffer, I guess I should use mutexes.
I'm new to multi-threading and mutexes, so would this be a good idea? I don't really know where to start, or how the consumer thread can read the buffer while the producer fills it. Will locking the buffer while reading cause the API to drop data (because the producer won't be able to write into the circular buffer)?
EDIT: Since I want my record thread to run in the background so I can do other things while it's recording, I detached it; the user can start a recording by setting keep_running to true.
thread t1(&Acquisition::recordIQ, &acq);
t1.detach();
You need to use something like this (https://en.cppreference.com/w/cpp/thread/condition_variable):
globals:
std::mutex m;
std::condition_variable cv;
std::vector<std::vector<float>> datas;
bool keep_running = true, start_running = false;
writing thread:
void writing_thread()
{
myfile.open ("/media/ssd/IQ_Data.txt");
while(1) {
// Wait until the sending thread provides data, or until we are asked to terminate
std::unique_lock<std::mutex> lk(m);
cv.wait(lk, []{return !keep_running || !datas.empty();});
if (!keep_running && datas.empty()) break; // drain any remaining data before exiting
auto d = std::move(datas);
lk.unlock();
for(auto &entry : d) {
for(auto &e : entry)
myfile << e << endl;
}
}
}
sending thread:
void sending_thread() {
while(1) {
{
std::unique_lock<std::mutex> lk(m);
cv.wait(lk, []{return !keep_running || start_running;});
if (!keep_running) break;
}
bbGetIQ(m_handle, &pkt); //retrieve data
std::vector<float> d = m_buffer;
{
std::lock_guard<std::mutex> lk(m);
if (!keep_running) break;
datas.push_back(std::move(d));
}
cv.notify_one();
}
}
void start() {
{
std::unique_lock<std::mutex> lk(m);
start_running = true;
}
cv.notify_all();
}
void stop() {
{
std::unique_lock<std::mutex> lk(m);
start_running = false;
}
cv.notify_all();
}
void terminate() {
{
std::unique_lock<std::mutex> lk(m);
keep_running = false;
}
cv.notify_all();
thread1.join();
thread2.join();
}
In short:
The sending thread receives data from wherever it comes from, locks the mutex m, and moves the data into the datas storage. Then it uses the condition variable cv to notify waiting threads that there is something to do. The writing thread waits for the condition variable to be signalled, locks the mutex m, moves the data from the global datas variable into a local one, then releases the mutex and proceeds to write the just-received data to the file. The key is to keep the mutex locked for the least time possible.
EDIT:
To terminate the whole thing, set keep_running to false, then call cv.notify_all(), then join the threads involved. The order is important. You need to join the threads because the writing thread might still be in the process of writing data.
EDIT2:
Added a delayed start. Now create two threads, one running sending_thread and the other running writing_thread. Call start() to enable processing and stop() to pause it.
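For completeness, a rough usage sketch under the assumptions above (thread1 and thread2 are taken to be the global std::thread handles that terminate() joins):

#include <thread>

std::thread thread1, thread2;              // the handles that terminate() joins

int main() {
    thread1 = std::thread(sending_thread); // producer: pulls I/Q data from the API
    thread2 = std::thread(writing_thread); // consumer: writes batches to the file

    start();      // allow the producer to run
    // ... record for as long as needed ...
    stop();       // pause acquisition; both threads keep waiting

    terminate();  // keep_running = false, notify_all, join both threads
    return 0;
}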

How to properly synchronize threads in pthreads?

I am implementing a producer-consumer problem using pthreads and semaphores. I have 1 producer and 2 consumers. My producer reads characters one by one from a file and enqueues them into a circular queue. I want the consumers to read from the queue and store the characters into separate arrays. I want the reading to work in such a way that the first consumer reads 2 characters and the second consumer reads every 3rd character. I am trying to do this using pthread_cond_wait(), but it is not working out. This is my code:
#include<iostream>
#include<pthread.h>
#include<fstream>
#include<unistd.h>
#include<semaphore.h>
#include<queue>
#include "circular_queue"
// define queue size
#define QUEUE_SIZE 5
// declare and initialize semaphore and read/write counter
static sem_t mutex,queueEmptyMutex;
//static int counter = 0;
// Queue for saving characters
static Queue charQueue(QUEUE_SIZE);
//static std::queue<char> charQueue;
// indicator for end of file
static bool endOfFile = false;
// save arrays
static char consumerArray1[100];
static char consumerArray2[100];
static pthread_cond_t cond;
static pthread_mutex_t cond_mutex;
static bool thirdCharToRead = false;
void *Producer(void *ptr)
{
int i=0;
std::ifstream input("string.txt");
char temp;
while(input>>temp)
{
std::cout<<"reached here a"<<std::endl;
sem_wait(&mutex);
std::cout<<"reached here b"<<std::endl;
if(!charQueue.full())
{
charQueue.enQueue(temp);
}
sem_post(&queueEmptyMutex);
sem_post(&mutex);
i++;
sleep(4);
}
endOfFile = true;
sem_post(&queueEmptyMutex);
pthread_exit(NULL);
}
void *Consumer1(void *ptr)
{
int i = 0;
sem_wait(&queueEmptyMutex);
bool loopCond = endOfFile;
while(!loopCond)
{
std::cout<<"consumer 1 loop"<<std::endl;
if(endOfFile)
{
loopCond = charQueue.empty();
std::cout<<loopCond<<std::endl;
sem_post(&queueEmptyMutex);
}
sem_wait(&queueEmptyMutex);
sem_wait(&mutex);
if(!charQueue.empty())
{
consumerArray1[i] = charQueue.deQueue();
i++;
if(i%2==0)
{
pthread_mutex_lock(&cond_mutex);
std::cout<<"Signal cond. i = "<<i<<std::endl;
thirdCharToRead = true;
pthread_mutex_unlock(&cond_mutex);
pthread_cond_signal(&cond);
}
}
if(charQueue.empty()&&endOfFile)
{
sem_post(&mutex);
sem_post(&queueEmptyMutex);
break;
}
sem_post(&mutex);
sleep(2);
std::cout<<"consumer 1 loop end"<<std::endl;
}
consumerArray1[i] = '\0';
pthread_exit(NULL);
}
void *Consumer2(void *ptr)
{
int i = 0;
sem_wait(&queueEmptyMutex);
bool loopCond = endOfFile;
while(!loopCond)
{
std::cout<<"consumer 2 loop"<<std::endl;
if(endOfFile)
{
loopCond = charQueue.empty();
std::cout<<loopCond<<std::endl;
sem_post(&queueEmptyMutex);
}
sem_wait(&queueEmptyMutex);
sem_wait(&mutex);
if(!charQueue.empty())
{
pthread_mutex_lock(&cond_mutex);
while(!thirdCharToRead)
{
std::cout<<"Waiting for condition"<<std::endl;
pthread_cond_wait(&cond,&cond_mutex);
}
std::cout<<"Wait over"<<std::endl;
thirdCharToRead = false;
pthread_mutex_unlock(&cond_mutex);
consumerArray2[i] = charQueue.deQueue();
i++;
}
if(charQueue.empty()&& endOfFile)
{
sem_post(&mutex);
sem_post(&queueEmptyMutex);
break;
}
sem_post(&mutex);
std::cout<<"consumer 2 loop end"<<std::endl;
sleep(2);
}
consumerArray2[i] = '\0';
pthread_exit(NULL);
}
int main()
{
pthread_t thread[3];
sem_init(&mutex,0,1);
sem_init(&queueEmptyMutex,0,1);
pthread_mutex_init(&cond_mutex,NULL);
pthread_cond_init(&cond,NULL);
pthread_create(&thread[0],NULL,Producer,NULL);
int rc = pthread_create(&thread[1],NULL,Consumer1,NULL);
if(rc)
{
std::cout<<"Thread not created"<<std::endl;
}
pthread_create(&thread[2],NULL,Consumer2,NULL);
pthread_join(thread[0],NULL);pthread_join(thread[1],NULL);pthread_join(thread[2],NULL);
std::cout<<"First array: "<<consumerArray1<<std::endl;
std::cout<<"Second array: "<<consumerArray2<<std::endl;
sem_destroy(&mutex);
sem_destroy(&queueEmptyMutex);
pthread_exit(NULL);
}
The problem I am having is that after one read, consumer 2 goes into an infinite loop in while(!thirdCharToRead). Is there a better way to implement this?
Okay, let's start with this code:
std::cout<<"Wait over"<<std::endl;
pthread_mutex_unlock(&cond_mutex);
thirdCharToRead = false;
This code says that cond_mutex does not protect thirdCharToRead from concurrent access. Why? Because it modifies thirdCharToRead without holding that mutex.
Now look at this code:
pthread_mutex_lock(&cond_mutex);
while(!thirdCharToRead)
{
std::cout<<"Waiting for condition"<<std::endl;
pthread_cond_wait(&cond,&cond_mutex);
}
Now, the while loop checks thirdCharToRead, so we must hold whatever lock protects thirdCharToRead from concurrent access when we test it. But the while loop would loop forever if that lock stayed held for the whole loop, since no other thread could ever change thirdCharToRead. Thus, this code only makes sense if somewhere in the loop we release the lock that protects thirdCharToRead, and the only lock we release in the loop is cond_mutex, in the call to pthread_cond_wait.
So this code only makes sense if cond_mutex protects thirdCharToRead.
Houston, we have a problem. One chunk of code says cond_mutex does not protect thirdCharToRead and one chunk of code says cond_mutex does protect thirdCharToRead.
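A minimal sketch of a consistent version, assuming cond_mutex is the lock that protects thirdCharToRead: every read and write of the flag then happens while that mutex is held, and the signal can be issued after unlocking.

// Consumer2: wait for, then consume, the flag while holding cond_mutex
pthread_mutex_lock(&cond_mutex);
while (!thirdCharToRead)
    pthread_cond_wait(&cond, &cond_mutex); // atomically releases and re-acquires cond_mutex
thirdCharToRead = false;                   // still under cond_mutex, so the write is protected
pthread_mutex_unlock(&cond_mutex);

// Consumer1: set the flag under the same mutex, then signal
pthread_mutex_lock(&cond_mutex);
thirdCharToRead = true;
pthread_mutex_unlock(&cond_mutex);
pthread_cond_signal(&cond);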

Condition variable's "wait" function causing unexpected behaviour when predicate is provided

As an educational exercise I'm implementing a thread pool using condition variables. A controller thread creates a pool of threads that wait on a signal (an atomic variable being set to a value above zero). When signaled the threads wake, perform their work, and when the last thread is done it signals the main thread to awaken. The controller thread blocks until the last thread is complete. The pool is then available for subsequent re-use.
Every now and then I was getting a timeout on the controller thread waiting for the worker to signal completion (likely because of a race condition when decrementing the active work counter), so in an attempt to solidify the pool I replaced the "wait(lck)" form of the condition variable's wait method with "wait(lck, predicate)". Since doing this, the behaviour of the thread pool is such that it seems to permit decrementing of the active work counter below 0 (which is the condition for reawakening the controller thread) - I have a race condition. I've read countless articles on atomic variables, synchronisation, memory ordering, spurious and lost wakeups on stackoverflow and various other sites, have incorporated what I've learnt to the best of my ability, and still cannot for the life of me work out why the way I've coded the predicated wait just does not work. The counter should only ever be as high as the number of threads in the pool (say, 8) and as low as zero. I've started losing faith in myself - it just shouldn't be this hard to do something fundamentally simple. There is clearly something else I need to learn here :)
Since there was evidently a race condition, I ensured that the two variables that drive the awakening and termination of the pool are both atomic, and that both are only ever changed while protected by a unique_lock. Specifically, I made sure that when a request to the pool was launched, the lock was acquired, the active thread counter was changed from 0 to 8, the mutex was unlocked, and then notify_all was called. The controller thread would then only be awakened with the active thread count at zero, once the last worker thread had decremented it that far and called notify_one.
In the worker thread, the condition variable would wait and wake only when the active thread count is greater than zero, unlock the mutex, proceed in parallel to execute the work preassigned to the processor when the pool was created, re-acquire the mutex, and atomically decrement the active thread count. It would then, while still supposedly protected by the lock, test whether it was the last thread still active and, if so, unlock the mutex again and call notify_one to awaken the controller.
The problem is that the active thread counter repeatedly goes below zero after only 1 or 2 iterations. If I test the active thread count at the start of a new workload, I can find it down around -6; it is as if the pool was allowed to reawaken the controller thread before the work was completed.
Given that the thread counter and the terminate flag are both atomic variables, are only ever modified under the protection of the same mutex, and all updates use sequentially consistent memory ordering, I just cannot see how this is happening, and I'm lost.
#include <stdafx.h>
#include <Windows.h>
#include <iostream>
#include <thread>
using std::thread;
#include <mutex>
using std::mutex;
using std::unique_lock;
#include <condition_variable>
using std::condition_variable;
#include <atomic>
using std::atomic;
#include <chrono>
#include <vector>
using std::vector;
class IWorkerThreadProcessor
{
public:
virtual void Process(int) = 0;
};
class MyProcessor : public IWorkerThreadProcessor
{
int index_ = 0;
public:
MyProcessor(int index)
{
index_ = index;
}
void Process(int threadindex)
{
for (int i = 0; i < 5000000; i++);
std::cout << '(' << index_ << ':' << threadindex << ") ";
}
};
#define MsgBox(x) do{ MessageBox(NULL, x, L"", MB_OK ); }while(false)
class ThreadPool
{
private:
atomic<unsigned int> invokations_ = 0;
//This goes negative when using the wait_for with predicate
atomic<int> threadsActive_ = 0;
atomic<bool> terminateFlag_ = false;
vector<std::thread> threads_;
atomic<unsigned int> poolSize_ = 0;
mutex mtxWorker_;
condition_variable cvSignalWork_;
condition_variable cvSignalComplete_;
public:
~ThreadPool()
{
TerminateThreads();
}
void Init(std::vector<IWorkerThreadProcessor*>& processors)
{
unique_lock<mutex> lck2(mtxWorker_);
threadsActive_ = 0;
terminateFlag_ = false;
poolSize_ = processors.size();
for (int i = 0; i < poolSize_; ++i)
threads_.push_back(thread(&ThreadPool::launchMethod, this, processors[i], i));
}
void ProcessWorkload(std::chrono::milliseconds timeout)
{
//Only used to see how many invocations I was getting through before experiencing the issue - sadly it's only one or two
invokations_++;
try
{
unique_lock<mutex> lck(mtxWorker_);
//!!!!!! If I use the predicated wait this break will fire !!!!!!
if (threadsActive_.load() != 0)
__debugbreak();
threadsActive_.store(poolSize_);
lck.unlock();
cvSignalWork_.notify_all();
lck.lock();
if (!cvSignalComplete_.wait_for(
lck,
timeout,
[this] { return threadsActive_.load() == 0; })
)
{
//As you can tell this has taken me through a journey trying to characterise the issue...
if (threadsActive_ > 0)
MsgBox(L"Thread pool timed out with still active threads");
else if (threadsActive_ == 0)
MsgBox(L"Thread pool timed out with zero active threads");
else
MsgBox(L"Thread pool timed out with negative active threads");
}
}
catch (std::exception e)
{
__debugbreak();
}
}
void launchMethod(IWorkerThreadProcessor* processor, int threadIndex)
{
do
{
unique_lock<mutex> lck(mtxWorker_);
//!!!!!! If I use this predicated wait I see the failure !!!!!!
cvSignalWork_.wait(
lck,
[this] {
return
threadsActive_.load() > 0 ||
terminateFlag_.load();
});
//!!!!!!!! Does not cause the failure but obviously will not handle
//spurious wake-ups !!!!!!!!!!
//cvSignalWork_.wait(lck);
if (terminateFlag_.load())
return;
//Unlock to parallelise the work load
lck.unlock();
processor->Process(threadIndex);
//Re-lock to decrement the work count
lck.lock();
//This returns the value before the subtraction so theoretically if the previous value was 1 then we're the last thread going and we can now signal the controller thread to wake. This is the only place that the decrement happens so I don't know how it could possibly go negative
if (threadsActive_.fetch_sub(1, std::memory_order_seq_cst) == 1)
{
lck.unlock();
cvSignalComplete_.notify_one();
}
else
lck.unlock();
} while (true);
}
void TerminateThreads()
{
try
{
unique_lock<mutex> lck(mtxWorker_);
if (!terminateFlag_)
{
terminateFlag_ = true;
lck.unlock();
cvSignalWork_.notify_all();
for (int i = 0; i < threads_.size(); i++)
threads_[i].join();
}
}
catch (std::exception e)
{
__debugbreak();
}
}
};
int main()
{
std::vector<IWorkerThreadProcessor*> processors;
for (int i = 0; i < 8; i++)
processors.push_back(new MyProcessor(i));
std::cout << "Instantiating thread pool\n";
auto pool = new ThreadPool;
std::cout << "Initialisting thread pool\n";
pool->Init(processors);
std::cout << "Thread pool initialised\n";
for (int i = 0; i < 200; i++)
{
std::cout << "Workload " << i << "\n";
pool->ProcessWorkload(std::chrono::milliseconds(500));
std::cout << "Workload " << i << " complete." << "\n";
}
for (auto a : processors)
delete a;
delete pool;
return 0;
}
class ThreadPool
{
private:
atomic<unsigned int> invokations_ = 0;
std::atomic<unsigned int> awakenings_ = 0;
std::atomic<unsigned int> startedWorkloads_ = 0;
std::atomic<unsigned int> completedWorkloads_ = 0;
atomic<bool> terminate_ = false;
atomic<bool> stillFiring_ = false;
vector<std::thread> threads_;
atomic<unsigned int> poolSize_ = 0;
mutex mtx_;
condition_variable cvSignalWork_;
condition_variable cvSignalComplete_;
public:
~ThreadPool()
{
TerminateThreads();
}
void Init(std::vector<IWorkerThreadProcessor*>& processors)
{
unique_lock<mutex> lck2(mtx_);
//threadsActive_ = 0;
terminate_ = false;
poolSize_ = processors.size();
for (int i = 0; i < poolSize_; ++i)
threads_.push_back(thread(&ThreadPool::launchMethod, this, processors[i], i));
awakenings_ = 0;
completedWorkloads_ = 0;
startedWorkloads_ = 0;
invokations_ = 0;
}
void ProcessWorkload(std::chrono::milliseconds timeout)
{
try
{
unique_lock<mutex> lck(mtx_);
invokations_++;
if (startedWorkloads_ != 0)
__debugbreak();
if (completedWorkloads_ != 0)
__debugbreak();
if (awakenings_ != 0)
__debugbreak();
if (stillFiring_)
__debugbreak();
stillFiring_ = true;
lck.unlock();
cvSignalWork_.notify_all();
lck.lock();
if (!cvSignalComplete_.wait_for(
lck,
timeout,
//[this] { return this->threadsActive_.load() == 0; })
[this] { return completedWorkloads_ == poolSize_ && !stillFiring_; })
)
{
if (completedWorkloads_ < poolSize_)
{
if (startedWorkloads_ < poolSize_)
MsgBox(L"Thread pool timed out with some threads unstarted");
else if (startedWorkloads_ == poolSize_)
MsgBox(L"Thread pool timed out with all threads started but not all completed");
}
else
__debugbreak();
}
if (completedWorkloads_ != poolSize_)
__debugbreak();
if (awakenings_ != poolSize_)
__debugbreak();
awakenings_ = 0;
completedWorkloads_ = 0;
startedWorkloads_ = 0;
}
catch (std::exception e)
{
__debugbreak();
}
}
void launchMethod(IWorkerThreadProcessor* processor, int threadIndex)
{
do
{
unique_lock<mutex> lck(mtx_);
cvSignalWork_.wait(
lck,
[this] {
return
(stillFiring_ && (startedWorkloads_ < poolSize_)) ||
terminate_;
});
awakenings_++;
if (startedWorkloads_ == 0 && terminate_)
return;
if (stillFiring_ && startedWorkloads_ < poolSize_) //guard against spurious wakeup
{
startedWorkloads_++;
if (startedWorkloads_ == poolSize_)
stillFiring_ = false;
lck.unlock();
processor->Process(threadIndex);
lck.lock();
completedWorkloads_++;
if (completedWorkloads_ == poolSize_)
{
lck.unlock();
cvSignalComplete_.notify_one();
}
else
lck.unlock();
}
else
lck.unlock();
} while (true);
}
void TerminateThreads()
{
try
{
unique_lock<mutex> lck(mtx_);
if (!terminate_) //Don't attempt to double-terminate
{
terminate_ = true;
lck.unlock();
cvSignalWork_.notify_all();
for (int i = 0; i < threads_.size(); i++)
threads_[i].join();
}
}
catch (std::exception e)
{
__debugbreak();
}
}
};
I'm not certain if the following helps solve the problem, but I think the error is as shown below:
This
if (!cvSignalComplete_.wait_for(
lck,
timeout,
[this] { return threadsActive_.load() == 0; })
)
should be replaced by
if (!cvSignalComplete_.wait_for(
lck,
timeout,
[&] { return threadsActive_.load() == 0; })
)
Looks like the lambda is not accessing the instantiated member of the class. Here is some reference to back up my case: look at the Lambda Capture section of this page.
Edit:
Another place where you are using wait with a lambda:
cvSignalWork_.wait(
lck,
[this] {
return
threadsActive_.load() > 0 ||
terminateFlag_.load();
});
Maybe modify all the lambdas and then see if it works?
The reason I'm looking at the lambdas is that this seems like a case similar to a spurious wakeup. Hope it helps.