Deadlock in producer/consumer queue - C++

I am trying to solve an interesting problem, described below:
Write a console application which has N producers (N=1…10), M
consumers (M=1…10) and one data queue. Each producer and consumer is a
separate thread and all threads are working concurrently. Producer
thread sleeps 0…100 milliseconds randomly then it wakes up and
generates a random number between 1 and 100 and then puts this number
to data queue. Consumer thread sleeps 0…100 milliseconds randomly and
then wakes up and takes the number from the queue and saves it to the
output ‘data.txt’ file. All numbers are appended in the file and all
they are comma delimited (for example 4,67,99,23,…). When producer
thread puts the next number to data queue it checks the size of data
queue, and if it is >=100 the producer thread is blocked until the
number of elements gets <= 80. When consumer thread wants to take the
next number from data queue and no elements in it, consumer thread is
blocked until new element is added to data queue by a producer.
When we start application we need to insert the N (number of
producers) and the M (number of consumers) after which program starts
all threads. It should print current number of elements of data queue
in each second. When we stop program it should interrupt all producers
and wait for all consumers to save all queued data then program exits.
To solve this, among other things, I wrote the following thread-safe queue:
template <class T>
class global::safe_queue
{
private:
sync* m_sync;
size_t m_lcorner;
size_t m_rcorner;
std::queue<T> m_data;
public:
safe_queue(sync* snc, size_t lcorn, size_t rcorn) :
m_sync(snc),
m_lcorner(lcorn),
m_rcorner(rcorn) {}
~safe_queue()
{
m_sync->lock();
while (m_data.size()){
m_data.pop();
}
m_sync->unlock();
}
size_t size() const
{
m_sync->lock();
size_t sz = m_data.size();
m_sync->unlock();
return sz;
}
size_t front() const
{
m_sync->lock();
T item = m_data.front();
m_sync->unlock();
return item;
}
void push(T item)
{
m_sync->lock();
while(m_data.size() >= m_rcorner) {
m_sync->unlock();
usleep(5);
m_sync->lock();
// conditional wait
//m_sync->wait();
}
m_data.push(item);
if(m_data.size() == 1) {
m_sync->unlock();
// conditional signal
//m_sync->signal();
}
m_sync->unlock();
}
T pop()
{
m_sync->lock();
while(m_data.size() == 0) {
m_sync->unlock();
usleep(5);
m_sync->lock();
// conditional wait
//m_sync->wait();
}
T item = m_data.front();
if(m_data.size() <= m_lcorner) {
m_sync->unlock();
// conditional signal
// m_sync->signal();
}
m_data.pop();
m_sync->unlock();
return item;
}
};
But the result is a deadlock. What is wrong?

Related

What's the good way to pass data to a thread in c++?

I'm learning multithreaded programming in C++. I need to continuously read words from the keyboard and pass each one to a data thread for processing. I used a global variable word[] to pass the data: word[0] != 0 means there is new input from the keyboard, and the data thread sets word[0] back to 0 once it has read the data. It works! But I'm not sure whether it is safe, or whether there are better ways to do this. Here is my code:
#include <iostream>
#include <thread>
#include <cstdio>
#include <cstring>
using namespace std;
static const int buff_len = 32;
static char* word = new char[buff_len];
static void data_thread () { // thread to handle data
while (1)
{
if (word[0]) { // have a new word
char* w = new char[buff_len];
strcpy(w, word);
cout << "Data processed!\n";
word[0] = 0; // Inform the producer that we consumed the word
}
}
};
static void read_keyboard () {
char * linebuf = new char[buff_len];
thread * worker = new thread( data_thread );
while (1) //enter "end" to terminate the loop
{
if (!std::fgets( linebuf, buff_len, stdin)) // EOF?
return;
linebuf[strcspn(linebuf, "\n")] = '\0'; //remove new line '\n' from the string
word = linebuf; // Pass the word to the worker thread
while (word[0]); // Wait for the worker thread to consume it
}
worker->join(); // Wait for the worker to terminate
}
int main ()
{
read_keyboard();
return 0;
}
The problem with this kind of multithreaded implementation is busy waiting: both the input reader and the data consumer spin while they wait, wasting CPU cycles. To avoid this, use semaphores.
Semaphore s_full(0);
Semaphore s_empty(1);
void data_processor ()
{
while (true) {
// Wait for data availability.
s_full.wait();
// Data is available to you, consume it.
process_data();
// Unblock the data producer.
s_empty.signal();
}
}
void input_reader()
{
while (true) {
// Wait for empty buffer.
s_empty.wait();
// Read data.
read_input_data();
// Unblock the data consumer.
s_full.signal();
}
}
In addition, this solution works only for a single data consumer thread. For multiple consumer threads you'll need a thread-safe buffer queue and a proper implementation of the producer-consumer problem.
See below blog links for additional information to solve this problem:
Thread safe buffer queue:
https://codeistry.wordpress.com/2018/03/08/buffer-queue-handling-in-multithreaded-environment/
Producer - consumer problem:
https://codeistry.wordpress.com/2018/03/09/unordered-producer-consumer/
There are a few problems with your approach:
This method is not scalable. What if you have more than 1 processing thread?
You would need a mutex to synchronise read-write access to the memory that word points to. At the scale of this example that is not a big deal, but in a "serious" application you might not have the luxury of waiting until the data thread finishes processing. In that case you might be tempted to remove the while(word[0]) loop, but that is unsafe.
You fire off a "daemon" thread (not exactly, but close enough) to handle your computations. Most of the time that thread is waiting for your input and cannot proceed without it. This is inefficient, and modern C++ gives you a way around it with std::async, without handling raw threads explicitly.
#include <future>
#include <string>
#include <iostream>
static std::string worker(const std::string &input)
{
// assume this is a lengthy operation
return input.substr(1);
}
int main()
{
while (true)
{
std::string input;
std::getline (std::cin, input);
if (input.empty())
break;
std::future<std::string> fut= std::async(std::launch::async, &worker, input);
// Other tasks
// size_t n_stars = count_number_of_stars();
//
std::string result = fut.get(); // wait for the task to complete
std::cout << "Output : " << result << '\n';
}
return 0;
}
In my opinion, something like this is the better approach. std::async launches a thread (if the std::launch::async policy is specified) and returns a future you can wait on. The computation continues in the background while you do other work in the main thread. When you need the result of the computation, call get() on the future (by the way, the future can be void too).
Also there are a lot of C-isms in your C++ code. Unless there is a reason to do so, why would you not use std::string?
In modern C++ multithreading, you should use a condition_variable, a mutex, and a queue to handle this. The mutex prevents concurrent access to the queue, and the condition variable lets the reader thread sleep until the writer has written something. The following is an example:
#include <atomic>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
static void data_thread (std::queue<char> & dataToProcess, std::mutex & mut, std::condition_variable & cv, std::atomic<bool>& finished) { // thread to handle data
std::string readData;
while (true)
{
{
std::unique_lock<std::mutex> lock{mut};
cv.wait(lock, [&] { return !dataToProcess.empty() || finished; });
if (dataToProcess.empty())
break; // finished was set and the queue is fully drained
readData += dataToProcess.front();
dataToProcess.pop();
}
std::cout << "\nData processed\n";
}
std::cout << readData;
}
static void read_keyboard () {
std::queue<char> data;
std::condition_variable cv;
std::mutex mut;
std::atomic<bool> finished = false;
std::thread worker = std::thread( data_thread, std::ref(data), std::ref(mut), std::ref(cv), std::ref(finished) );
char temp;
while (true) // loop until EOF
{
if (!std::cin.get(temp)) // EOF?
{
std::cin.clear();
{
std::lock_guard<std::mutex> lock {mut}; // set the flag under the mutex so the worker cannot miss the wakeup
finished = true;
}
cv.notify_all();
break;
}
{
std::lock_guard lock {mut};
data.push(temp);
}
cv.notify_all();
}
worker.join(); // Wait for the worker to terminate
}
int main ()
{
read_keyboard();
return 0;
}
What you are looking for is a message queue. This needs a mutex and a condition variable.
Here is one on github (not mine but it popped up when I searched) https://github.com/khuttun/PolyM
and another
https://www.justsoftwaresolutions.co.uk/threading/implementing-a-thread-safe-queue-using-condition-variables.html
I will get told off for posting links, but I am not going to type the entire code here and github's not going anywhere soon

Using a single Condition Variable to pause multiple threads

I have a program that starts N threads (async/future). I want the main thread to set up some data, then all threads should run while the main thread waits for all of them to finish, and then this needs to loop.
What I have at the moment is something like this:
int main()
{
//Start N new threads (std::future/std::async)
while(condition)
{
//Set Up Data Here
//Send Data to threads
{
std::lock_guard<std::mutex> lock(mrun);
bRun = true;
}
run.notify_all();
//Wait for threads
{
std::unique_lock<std::mutex> lock(mrun);
run.wait(lock, [] {return bDone; });
}
//Reset bools
bRun = false;
bDone = false;
}
//Get results from futures once complete
}
int thread()
{
while(otherCondition)
{
std::unique_lock<std::mutex> lock(mrun);
run.wait(lock, [] {return bRun; });
bDone = true;
//Do thread stuff here
lock.unlock();
run.notify_all();
}
}
But I can't see any signs of either the main or the other threads waiting for each other! Any idea what I am doing wrong or how I can do this?
There are a couple of problems. First, you're setting bDone as soon as the first worker wakes up. Thus the main thread wakes immediately and begins readying the next data set. You want to have the main thread wait until all workers have finished processing their data. Second, when a worker finishes processing, it loops around and immediately checks bRun. But it can't tell if bRun == true means that the next data set is ready or if the last data set is ready. You want to wait for the next data set.
Something like this should work:
std::mutex mrun;
std::condition_variable dataReady;
std::condition_variable workComplete;
int nCurrentIteration = 0;
int nWorkerCount = 0;
int main()
{
//Start N new threads (std::future/std::async)
while(condition)
{
//Set Up Data Here
//Send Data to threads
{
std::lock_guard<std::mutex> lock(mrun);
nWorkerCount = N;
++nCurrentIteration;
}
dataReady.notify_all();
//Wait for threads
{
std::unique_lock<std::mutex> lock(mrun);
workComplete.wait(lock, [] { return nWorkerCount == 0; });
}
}
//Get results from futures once complete
}
int thread()
{
int nNextIteration = 1;
while(otherCondition)
{
std::unique_lock<std::mutex> lock(mrun);
dataReady.wait(lock, [&nNextIteration] { return nCurrentIteration==nNextIteration; });
lock.unlock();
++nNextIteration;
//Do thread stuff here
lock.lock();
if (--nWorkerCount == 0)
{
lock.unlock();
workComplete.notify_one();
}
}
}
Be aware that this solution isn't quite complete. If a worker encounters an exception, then the main thread will hang (because the dead worker will never reduce nWorkerCount). You'll likely need a strategy to deal with that scenario.
Incidentally, this pattern is called a barrier.

How to create an efficient multi-threaded task scheduler in C++?

I'd like to create a very efficient task scheduler system in C++.
The basic idea is this:
class Task {
public:
virtual void run() = 0;
};
class Scheduler {
public:
void add(Task &task, double delayToRun);
};
Behind Scheduler there should be a fixed-size thread pool that runs the tasks (I don't want to create a thread for each task). delayToRun means that the task doesn't get executed immediately, but delayToRun seconds later (measured from the point at which it was added to the Scheduler).
(delayToRun means an "at-least" value, of course. If the system is loaded, or if we ask the impossible from the Scheduler, it won't be able to handle our request. But it should do the best it can)
And here's my problem: how can I implement the delayToRun functionality efficiently? I'm trying to solve this with mutexes and condition variables.
I see two ways:
With manager thread
Scheduler contains two queues: allTasksQueue and tasksReadyToRunQueue. A task gets added to allTasksQueue in Scheduler::add. There is a manager thread, which waits just long enough for the soonest task to become due and then moves it from allTasksQueue to tasksReadyToRunQueue. Worker threads wait for a task to become available in tasksReadyToRunQueue.
If Scheduler::add adds a task at the front of allTasksQueue (a task whose delayToRun places it before the current soonest-to-run task), then the manager thread needs to be woken up so it can update its wait time.
This method can be considered inefficient because it needs two queues and two condition-variable signals to make a task run (one for moving it from allTasksQueue to tasksReadyToRunQueue, and one for signalling a worker thread to actually run the task).
Without manager thread
There is one queue in the scheduler. A task gets added to this queue in Scheduler::add. A worker thread checks the queue. If it is empty, it waits without a time constraint. If it is not empty, it waits until the soonest task is due.
If there is only one condition variable that all the worker threads wait on, this method can be considered inefficient: if a task is added near the front of the queue (with N worker threads, "front" means a task index < N), then all the worker threads need to be woken up to update the time they are waiting for.
If there is a separate condition variable for each thread, then we can control which thread to wake up, so in this case we don't need to wake up all threads (we only need to wake up the thread which has the largest waiting time, so we need to manage this value). I'm currently thinking about implementing this, but working out the exact details is complex. Are there any recommendations, thoughts, or documents on this method?
Is there any better solution for this problem? I'm trying to use standard C++ features, but I'm willing to use platform dependent (my main platform is linux) tools too (like pthreads), or even linux specific tools (like futexes), if they provide a better solution.
You can avoid both having a separate "manager" thread, and having to wake up a large number of tasks when the next-to-run task changes, by using a design where a single pool thread waits for the "next to run" task (if there is one) on one condition variable, and the remaining pool threads wait indefinitely on a second condition variable.
The pool threads would execute pseudocode along these lines:
pthread_mutex_lock(&queue_lock);
while (running)
{
if (head task is ready to run)
{
dequeue head task;
if (task_thread == 1)
pthread_cond_signal(&task_cv);
else
pthread_cond_signal(&queue_cv);
pthread_mutex_unlock(&queue_lock);
run dequeued task;
pthread_mutex_lock(&queue_lock);
}
else if (!queue_empty && task_thread == 0)
{
task_thread = 1;
pthread_cond_timedwait(&task_cv, &queue_lock, time head task is ready to run);
task_thread = 0;
}
else
{
pthread_cond_wait(&queue_cv, &queue_lock);
}
}
pthread_mutex_unlock(&queue_lock);
If you change the next task to run, then you execute:
if (task_thread == 1)
pthread_cond_signal(&task_cv);
else
pthread_cond_signal(&queue_cv);
with the queue_lock held.
Under this scheme, all wakeups are directed at only a single thread, there's only one priority queue of tasks, and there's no manager thread required.
Your specification is a bit too strong:
delayToRun means that the task doesn't get executed immediately, but delayToRun seconds later
You forgot to add "at least":
The task doesn't get executed now, but at least delayToRun seconds later
The point is that if ten thousand tasks are all scheduled with a delayToRun of 0.1, they cannot all run at the same time in practice.
With that correction, you just maintain a queue (or agenda) of (scheduled-start-time, closure to run) pairs, keep that queue sorted, and start N (some fixed number) threads that atomically pop the first element of the agenda and run it.
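A minimal sketch of such an agenda, assuming a std::priority_queue ordered by start time, a single condition variable, and std::function closures instead of the Task interface. The names DelayScheduler, Entry and agenda_ are illustrative only, and the destructor's policy of running everything still queued before joining is just one possible choice.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical DelayScheduler: one min-heap ordered by start time, N workers.
class DelayScheduler {
    using Clock = std::chrono::steady_clock;
    struct Entry {
        Clock::time_point when;
        std::function<void()> fn;
        bool operator>(const Entry& other) const { return when > other.when; }
    };

public:
    explicit DelayScheduler(unsigned workers) {
        for (unsigned i = 0; i < workers; ++i)
            pool_.emplace_back([this] { work(); });
    }

    // Runs everything still queued (waiting for tasks not yet due), then joins.
    ~DelayScheduler() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : pool_) t.join();
    }

    void add(std::function<void()> fn, std::chrono::milliseconds delay) {
        {
            std::lock_guard<std::mutex> lock(m_);
            agenda_.push(Entry{Clock::now() + delay, std::move(fn)});
        }
        cv_.notify_one();  // one worker re-evaluates the head of the agenda
    }

private:
    void work() {
        std::unique_lock<std::mutex> lock(m_);
        for (;;) {
            if (agenda_.empty()) {
                if (stopping_) return;
                cv_.wait(lock);
            } else if (agenda_.top().when > Clock::now()) {
                // Sleep until the head is due, or until add()/shutdown wakes us.
                auto when = agenda_.top().when;
                cv_.wait_until(lock, when);
            } else {
                auto fn = agenda_.top().fn;
                agenda_.pop();
                lock.unlock();   // never run user code while holding the lock
                fn();
                lock.lock();
            }
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> agenda_;
    bool stopping_ = false;
    std::vector<std::thread> pool_;
};
```

This sketch keeps the thundering-herd behaviour the question worries about: every idle worker sleeps until the same head deadline. It trades that cost for simplicity, and the delay remains an "at least" value.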
then all the worker threads need to be woken up to update the time which they are waiting for.
No, some worker threads would be woken up.
Read about condition variables and broadcast.
You might also use POSIX timers, see timer_create(2), or the Linux-specific timer file descriptors, see timerfd_create(2).
You would probably want to avoid running blocking system calls in your threads, and have some central thread managing them using some event loop (see poll(2)...); otherwise, if you have a hundred tasks running sleep(100) and one task scheduled to run in half a second, it won't run before a hundred seconds have passed.
You may want to read about continuation-passing style programming (it -CPS- is highly relevant). Read the paper about Continuation Passing C by Juliusz Chroboczek.
Look also into Qt threads.
You could also consider coding in Go (with its Goroutines).
This is a sample implementation for the interface you provided that comes closest to your 'With manager thread' description.
It uses a single thread (timer_thread) to manage a queue (allTasksQueue) that is sorted based on the actual time when a task must be started (std::chrono::time_point).
The 'queue' is a std::priority_queue (which keeps its time_point key elements sorted).
timer_thread is normally suspended until the next task is started or when a new task is added.
When a task is about to be run, it is placed in tasksReadyToRunQueue; one of the worker threads is signaled, wakes up, removes it from the queue and starts processing the task.
Note that the thread pool has a compile-time upper limit for the number of threads (40). If you are scheduling more tasks than can be dispatched to workers, new tasks will wait until threads are available again.
You said this approach is not efficient, but overall, it seems reasonably efficient to me. It's all event driven and you are not wasting CPU cycles by unnecessary spinning.
Of course, it's just an example, optimizations are possible (note: std::multimap has been replaced with std::priority_queue).
The implementation is C++11 compliant
#include <iostream>
#include <chrono>
#include <queue>
#include <unistd.h>
#include <vector>
#include <thread>
#include <condition_variable>
#include <mutex>
#include <memory>
class Task {
public:
virtual void run() = 0;
virtual ~Task() { }
};
class Scheduler {
public:
Scheduler();
~Scheduler();
void add(Task &task, double delayToRun);
private:
using timepoint = std::chrono::time_point<std::chrono::steady_clock>;
struct key {
timepoint tp;
Task *taskp;
};
struct TScomp {
bool operator()(const key &a, const key &b) const
{
return a.tp > b.tp;
}
};
const int ThreadPoolSize = 40;
std::vector<std::thread> ThreadPool;
std::vector<Task *> tasksReadyToRunQueue;
std::priority_queue<key, std::vector<key>, TScomp> allTasksQueue;
std::thread TimerThr;
std::mutex TimerMtx, WorkerMtx;
std::condition_variable TimerCV, WorkerCV;
bool WorkerIsRunning = true;
bool TimerIsRunning = true;
void worker_thread();
void timer_thread();
};
Scheduler::Scheduler()
{
for (int i = 0; i <ThreadPoolSize; ++i)
ThreadPool.push_back(std::thread(&Scheduler::worker_thread, this));
TimerThr = std::thread(&Scheduler::timer_thread, this);
}
Scheduler::~Scheduler()
{
{
std::lock_guard<std::mutex> lck{TimerMtx};
TimerIsRunning = false;
TimerCV.notify_one();
}
TimerThr.join();
{
std::lock_guard<std::mutex> lck{WorkerMtx};
WorkerIsRunning = false;
WorkerCV.notify_all();
}
for (auto &t : ThreadPool)
t.join();
}
void Scheduler::add(Task &task, double delayToRun)
{
auto now = std::chrono::steady_clock::now();
long delay_ms = delayToRun * 1000;
std::chrono::milliseconds duration (delay_ms);
timepoint tp = now + duration;
if (now >= tp)
{
/*
* This is a short-cut
* When time is due, the task is directly dispatched to the workers
*/
std::lock_guard<std::mutex> lck{WorkerMtx};
tasksReadyToRunQueue.push_back(&task);
WorkerCV.notify_one();
} else
{
std::lock_guard<std::mutex> lck{TimerMtx};
allTasksQueue.push({tp, &task});
TimerCV.notify_one();
}
}
void Scheduler::worker_thread()
{
for (;;)
{
std::unique_lock<std::mutex> lck{WorkerMtx};
WorkerCV.wait(lck, [this] { return tasksReadyToRunQueue.size() != 0 ||
!WorkerIsRunning; } );
if (!WorkerIsRunning)
break;
Task *p = tasksReadyToRunQueue.back();
tasksReadyToRunQueue.pop_back();
lck.unlock();
p->run();
delete p; // delete Task
}
}
void Scheduler::timer_thread()
{
for (;;)
{
std::unique_lock<std::mutex> lck{TimerMtx};
if (!TimerIsRunning)
break;
auto duration = std::chrono::nanoseconds(1000000000);
if (allTasksQueue.size() != 0)
{
auto now = std::chrono::steady_clock::now();
auto head = allTasksQueue.top();
Task *p = head.taskp;
duration = head.tp - now;
if (now >= head.tp)
{
/*
* A Task is due, pass to worker threads
*/
std::unique_lock<std::mutex> ulck{WorkerMtx};
tasksReadyToRunQueue.push_back(p);
WorkerCV.notify_one();
ulck.unlock();
allTasksQueue.pop();
}
}
TimerCV.wait_for(lck, duration);
}
}
/*
* End sample implementation
*/
class DemoTask : public Task {
int n;
public:
DemoTask(int n=0) : n{n} { }
void run() override
{
std::cout << "Start task " << n << std::endl;
std::this_thread::sleep_for(std::chrono::seconds(2));
std::cout << " Stop task " << n << std::endl;
}
};
int main()
{
Scheduler sched;
Task *t0 = new DemoTask{0};
Task *t1 = new DemoTask{1};
Task *t2 = new DemoTask{2};
Task *t3 = new DemoTask{3};
Task *t4 = new DemoTask{4};
Task *t5 = new DemoTask{5};
sched.add(*t0, 7.313);
sched.add(*t1, 2.213);
sched.add(*t2, 0.713);
sched.add(*t3, 1.243);
sched.add(*t4, 0.913);
sched.add(*t5, 3.313);
std::this_thread::sleep_for(std::chrono::seconds(10));
}
This means that you want to run all tasks one after another in some order.
You can keep the tasks in a stack (or even a linked list) sorted by delay. When a new task arrives, insert it at the position determined by its delay time (compute that position and insert the new task efficiently).
Then run the tasks starting from the head of the stack (or list).
Core code for C++11:
#include <thread>
#include <queue>
#include <chrono>
#include <mutex>
#include <atomic>
#include <cstdio>
#include <type_traits>
using namespace std::chrono;
using namespace std;
class Task {
public:
virtual void run() = 0;
};
template<typename T, typename = enable_if<std::is_base_of<Task, T>::value>>
class SchedulerItem {
public:
T task;
time_point<steady_clock> startTime;
int delay;
SchedulerItem(T t, time_point<steady_clock> s, int d) : task(t), startTime(s), delay(d){}
};
template<typename T, typename = enable_if<std::is_base_of<Task, T>::value>>
class Scheduler {
public:
queue<SchedulerItem<T>> pool;
mutex mtx;
atomic<bool> running;
Scheduler() : running(false){}
void add(T task, double delayMsToRun) {
lock_guard<mutex> lock(mtx);
pool.push(SchedulerItem<T>(task, steady_clock::now(), delayMsToRun));
if (running == false) runNext();
}
void runNext(void) {
running = true;
auto th = [this]() {
mtx.lock();
auto item = pool.front();
pool.pop();
mtx.unlock();
// startTime is a steady_clock time_point, so measure against steady_clock as well
auto remaining = (item.startTime + milliseconds(item.delay)) - steady_clock::now();
if(remaining.count() > 0) this_thread::sleep_for(remaining);
item.task.run();
mtx.lock(); // check the queue and clear the flag under the same lock add() holds
bool more = !pool.empty();
if(!more) running = false;
mtx.unlock();
if(more) runNext();
};
thread t(th);
t.detach();
}
};
Test code:
class MyTask : Task {
public:
virtual void run() override {
printf("mytask \n");
};
};
int main()
{
Scheduler<MyTask> s;
s.add(MyTask(), 0);
s.add(MyTask(), 2000);
s.add(MyTask(), 2500);
s.add(MyTask(), 6000);
std::this_thread::sleep_for(std::chrono::seconds(10));
}

Correct way to wait a condition variable that is notified by several threads

I'm trying to do this with the C++11 concurrency support.
I have a sort of thread pool of worker threads that all do the same thing, where a master thread has an array of condition variables (one for each thread, they need to 'start' synchronized, ie not run ahead one cycle of their loop).
for (auto &worker_cond : cond_arr) {
worker_cond.notify_one();
}
Then this thread has to wait for a notification from each thread in the pool before restarting its cycle. What's the correct way of doing this? Should I have a single condition variable and wait on some integer that each non-master thread increments? Something like this (still in the master thread):
unique_lock<std::mutex> lock(workers_mtx);
workers_finished.wait(lock, [&workers] { return workers = cond_arr.size(); });
I see two options here:
Option 1: join()
Basically instead of using a condition variable to start the calculations in your threads, you spawn a new thread for every iteration and use join() to wait for it to be finished. Then you spawn new threads for the next iteration and so on.
Option 2: locks
You don't want the main-thread to notify as long as one of the threads is still working. So each thread gets its own lock, which it locks before doing the calculations and unlocks afterwards. Your main-thread locks all of them before calling the notify() and unlocks them afterwards.
I see nothing fundamentally wrong with your solution.
Guard workers with workers_mtx and done.
We could abstract this with a counting semaphore.
struct counting_semaphore {
std::unique_ptr<std::mutex> m=std::make_unique<std::mutex>();
std::ptrdiff_t count = 0;
std::unique_ptr<std::condition_variable> cv=std::make_unique<std::condition_variable>();
counting_semaphore( std::ptrdiff_t c=0 ):count(c) {}
counting_semaphore(counting_semaphore&&)=default;
void take(std::size_t n = 1) {
std::unique_lock<std::mutex> lock(*m);
cv->wait(lock, [&]{ if (count-std::ptrdiff_t(n) < 0) return false; count-=n; return true; } );
}
void give(std::size_t n = 1) {
{
std::unique_lock<std::mutex> lock(*m);
count += n;
if (count <= 0) return;
}
cv->notify_all();
}
};
take takes count away, and blocks if there is not enough.
give adds to count, and notifies if there is a positive amount.
Now the worker threads ferry tokens between two semaphores.
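std::vector< counting_semaphore > m_worker_start = std::vector< counting_semaphore >(count); // one semaphore per worker; {count} would select the initializer_list constructor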
counting_semaphore m_worker_done{0}; // not count, zero
std::atomic<bool> m_shutdown = false;
// master controller:
for (each step) {
for (auto&& starts:m_worker_start)
starts.give();
m_worker_done.take(count);
}
// master shutdown:
m_shutdown = true;
// wake up forever:
for (auto&& starts:m_worker_start)
starts.give(std::size_t(-1)/2);
// worker thread:
while (true) {
master->m_worker_start[my_id].take();
if (master->m_shutdown) return;
// do work
master->m_worker_done.give();
}
or somesuch.

Thread synchronisation with SDL thread library

I am attempting to write a thread safe task queue for multithreading in C++, using SDL2's threading library.
The thread function which runs on all threads is as follows:
int threadFunc(void * pData)
{
ThreadData* data = (ThreadData*)pData;
SDLTaskManager* pool = data->pool;
Task* task = nullptr;
while (true)
{
SDL_LockMutex(pool->mLock);
while (!pool->mRunning && pool->mCurrentTasks.empty())
{
//mutex is unlocked, then locked again when signal received
SDL_CondWait(pool->mConditionFlag, pool->mLock);
if (pool->mShuttingDown)
return 0;
}
//mutex is locked at this stage so no other threads can alter contents of deque
//code inside if block should not be executed if deque is empty
if (!pool->mCurrentTasks.empty())
{
/*out of range error here*/
task = pool->mCurrentTasks.front();
pool->mCurrentTasks.pop_front();
}
if (task != nullptr)
{
pool->notifyThreadWorking(true);
data->taskCount++;
}
else
{
pool->stop();
SDL_UnlockMutex(pool->mLock);
continue;
}
SDL_UnlockMutex(pool->mLock);
task->execute();
SDL_LockMutex(pool->mLock);
pool->notifyThreadWorking(false);
pool->mCompleteTasks.push_back(task);
SDL_UnlockMutex(pool->mLock);
task = nullptr;
}
return 0;
}
As you can see, according to the comments in the code, an out of range error occurs inside an if block, where the deque is empty. However, there is a check there to make sure that the code is only executed if the deque is not empty. The mutex is locked by SDL_CondWait so no other thread should be able to make changes to the deque, until that mutex is unlocked again.
The producer code is as follows:
SDL_LockMutex(pool->mLock);
for (int i = 0; i < numTasks; i++)
{
pool->mCurrentTasks.push_back(new Task());
}
pool->mRunning = true;
SDL_CondBroadcast(pool->mConditionFlag);
SDL_UnlockMutex(pool->mLock);
The fact that the code inside the if block is being executed shows that at the time if (!pool->mCurrentTasks.empty()) is evaluated, the deque has member data, but not by the time it reaches task = pool->mCurrentTasks.front(); By my understanding of mutexes, this shouldn't be possible. How can this be?