Synchronizing worker threads - c++

I have a scenario for which I am trying to come up with the best synchronization approach. We assume that std::thread in C++11 is present, so no need to worry about differences between various threading libraries etc.
The scenario is this. Thread a, the main thread, wants to hand out tasks to a bunch of worker threads. Then, after giving out its final instruction for the time being, it needs to wait for all the threads to complete their work. We don't want to join them, just wait for them to finish their given task. Then thread a has to analyze the collected data from all threads, and then send out commands to the workers to begin the procedure again.
In short, these are the steps.
1. Thread a sends command x to all worker threads.
2. Thread a waits until all the workers have finished.
3. Thread a does processing.
4. Go back to 1.
What would you suggest that I use? Simple mutexes? Condition variables? A combination of the two? Any tips on how to structure the synchronization to be as efficient as possible would be appreciated.

You have n worker threads and one main thread a, which delegates tasks to workers and must wait for them to complete these tasks before assigning them a new batch of tasks.
The basic technique is to use a barrier (like boost::barrier) to synchronize the end of the workers' tasks with thread a.
The barrier is initialized to n+1. Main thread a waits on the barrier, and each worker thread does the same at the end of its task. When the last thread calls wait on the barrier, all the threads are woken up, and the main thread can continue its work. You may want to add a second barrier to block the worker threads until a new task is assigned to them.
The body of worker thread may look like the following pseudocode:
while (running) {
    startbarrier.wait(); // wait for main thread to signal start
    do_work();
    endbarrier.wait();   // signal end of work
}
The same thing can also be implemented with semaphores. Both semaphore and barrier can be implemented with a mutex and a condition variable.
See this SO question for more details.
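As a rough illustration of the two-barrier scheme above, here is a minimal sketch of a reusable barrier built from a mutex and a condition variable, using only the C++11 standard library. The names Barrier and run_rounds are made up for this example, and the "work" is a placeholder that bumps a per-thread slot. If C++20 is available, std::barrier from <barrier> could be used instead of hand-rolling one.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier: wait() blocks until `count` threads have called it.
class Barrier {
public:
    explicit Barrier(std::size_t count)
        : threshold_(count), waiting_(0), generation_(0) {
        assert(count > 0);
    }

    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        const std::size_t my_generation = generation_;
        if (++waiting_ == threshold_) {
            // Last arrival: reset for the next round and release everyone.
            waiting_ = 0;
            ++generation_;
            cv_.notify_all();
        } else {
            // The generation check guards against spurious wakeups.
            cv_.wait(lock, [&] { return generation_ != my_generation; });
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    const std::size_t threshold_;
    std::size_t waiting_;
    std::size_t generation_;
};

// Hypothetical driver: `workers` threads each add 1 to their own slot per round;
// the main thread sums all slots after every round and returns the grand total.
long run_rounds(std::size_t workers, int rounds) {
    Barrier start(workers + 1), end(workers + 1);  // n workers + main thread
    std::vector<long> slots(workers, 0);
    bool running = true;  // written by main only while the workers sit at `start`
    long total = 0;

    std::vector<std::thread> pool;
    for (std::size_t i = 0; i < workers; ++i) {
        pool.emplace_back([&, i] {
            for (;;) {
                start.wait();        // wait for main thread to signal start
                if (!running) break;
                slots[i] += 1;       // stand-in for do_work()
                end.wait();          // signal end of work
            }
        });
    }

    for (int r = 0; r < rounds; ++r) {
        start.wait();                      // send the "go" command
        end.wait();                        // wait until all workers have finished
        for (long v : slots) total += v;   // analyze the collected data
    }

    running = false;
    start.wait();                          // release the workers so they can exit
    for (auto& t : pool) t.join();
    return total;
}
```

Calling run_rounds(4, 3) runs three rounds on four workers. The plain bool is safe here only because every read and write of it is separated by a barrier wait, which goes through the mutex.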

Related

How does a main program wait for several threads at the same time via wait rather than join?

I have a main program that needs to launch a number of threads that immediately wait on a condition_variable. The main program then enters a loop. Each iteration of the loop by the main program does a notify_all to awaken all the threads. The main program then waits for all the threads to complete their processing before beginning the next iteration of the main loop.
I know how to do this for one thread. I think I know how to do this with multiple threads, but I have a question with how the main program should go about waiting on all the threads. The main program cannot do it with join because the threads are going to wait for the next iteration of the main loop rather than terminate. So the main program has to do it with wait.
Each thread will of course notify the main program that it is complete for this iteration immediately before it goes back into wait. I know how to do notify and wait back and forth between the main program and one thread, and I know generally how to use condition_variables. I know how to avoid lost wakeups and spurious wakeup events, etc. But I'm a little fuzzy on how the main program is supposed to wait to be notified by all the threads before proceeding with the next iteration of the loop. Conceptually, the main program needs to wait on multiple condition_variables, one for each thread. But that's not how condition_variables work. You can only wait on one condition_variable at a time.
I can think of two ways to handle the main program waiting on all the threads. Either the main program waits on a single condition variable that all the threads notify when they are done, or there are multiple condition variables, one per thread, and the main program waits on them one at a time. I'm sure I know how to make the latter case work. The question is whether it's possible to make the former case work. If it could work, I would picture a different Boolean variable that would be set by each thread when it finished an iteration. The main program's wait would have a predicate that tested each Boolean variable in turn. The wait processing would loop until it had found all the Boolean variables turned on. So that's the question. Can I get by with one condition_variable that each thread notifies? Or do I need a separate condition_variable for each thread to notify?
It doesn't matter to my question, but during iteration N of the loop, the work being done by each thread is completely independent of the work being done at that time by the other threads. But during iteration N+1 of the loop, the work being done by each thread depends on the work that was done by all the threads during iteration N. That's why the main program has to wait on all the threads to complete an iteration before it can notify any of the threads to begin the next iteration. Were it not for that requirement, I wouldn't even need condition_variables. The threads could run asynchronously. The main program could just launch them all and wait for them all to complete via a join.
You can use a counter, which needs to be protected by a mutex. Set it to 0 before you start, and have each thread increment it when it is done. That way you need only a single object to check against, which can easily be done with a condition variable.
if (DoneCount == ThreadCount)
// All threads done.
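A minimal sketch of that counter idea, showing a single iteration only. The function name wait_for_all and the placeholder work (squaring the thread index) are invented for the example. In the looping setup from the question, the main program would reset the counter to 0 under the mutex before each notify_all.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: starts thread_count workers and waits on ONE condition
// variable until every worker has reported completion.
int wait_for_all(int thread_count) {
    std::mutex m;
    std::condition_variable cv;
    int done_count = 0;                          // protected by m
    std::vector<int> results(thread_count, 0);   // one slot per worker

    std::vector<std::thread> workers;
    for (int i = 0; i < thread_count; ++i) {
        workers.emplace_back([&, i] {
            results[i] = i * i;                  // stand-in for the real work
            {
                std::lock_guard<std::mutex> lock(m);
                ++done_count;                    // report completion
            }
            cv.notify_one();                     // only the main thread waits on cv
        });
    }

    {
        std::unique_lock<std::mutex> lock(m);
        // The predicate handles spurious wakeups and notifications that arrive early.
        cv.wait(lock, [&] { return done_count == thread_count; });
    }

    // All threads done: their results are visible here.
    int sum = 0;
    for (int r : results) sum += r;

    // Joining here is only cleanup for the sketch; the wait above did the synchronization.
    for (auto& t : workers) t.join();
    return sum;
}
```

Because the predicate is re-checked on every wakeup, it does not matter in which order the workers finish or how many notifications arrive before the main thread starts waiting.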
Can I get by with one condition_variable that each thread notifies? Or do I need a separate condition_variable for each thread to notify?
Either method works without any issues. Waiting for all of some set of things to happen can be accomplished simply by waiting for each of the things to happen successively in any order. You can use a shared condition variable or a different condition variable for each event.

how to implement a set of persistent coordinated worker threads

I want to reuse a set of worker threads. Each worker thread performs independent work but they must start and stop processing as a coordinated team. I need an efficient means for each worker thread to block until the main thread tells them all to go, and an efficient means for the main thread to block until all worker threads are finished.
Each chunk of work will only require some tens of microseconds so the usual approach of creating a set of threads then joining them all involves far too much overhead.
The pseudocode is like the following:
main thread:
    create N threads
    forever
        prepare new independent work for each thread
        tell all N threads to run their part
        wait for all N threads to complete their work
        use results
typical worker thread:
    forever
        wait to run
        do my work
        indicate to main my work is complete
My question is how best to perform this signaling and synchronization. I am not asking about how to divide up the work or move work to or from each thread; suffice it to say the threads do not interact.
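One possible way to turn the pseudocode above into C++11, offered as a sketch rather than a tuned solution: a round counter tells the workers when to go, and a pending counter tells the main thread when everyone is done. WorkerTeam, run_round, and run_team are hypothetical names. For chunks of only tens of microseconds, the wakeup latency of blocking on a condition variable may be a noticeable fraction of the work, so it would be worth measuring before settling on this.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical team of persistent workers that start and finish each round together.
class WorkerTeam {
public:
    WorkerTeam(std::size_t n, std::function<void(std::size_t)> work)
        : work_(std::move(work)) {
        for (std::size_t i = 0; i < n; ++i)
            threads_.emplace_back([this, i] { loop(i); });
    }

    ~WorkerTeam() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stop_ = true;
        }
        go_cv_.notify_all();
        for (auto& t : threads_) t.join();
    }

    // "tell all N threads to run their part" + "wait for all N threads to complete"
    void run_round() {
        std::unique_lock<std::mutex> lock(m_);
        pending_ = threads_.size();
        ++round_;                       // a new round number is the "go" signal
        go_cv_.notify_all();
        done_cv_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void loop(std::size_t index) {
        std::size_t seen = 0;           // last round this worker has executed
        for (;;) {
            {
                std::unique_lock<std::mutex> lock(m_);
                go_cv_.wait(lock, [&] { return stop_ || round_ != seen; });  // wait to run
                if (stop_) return;
                seen = round_;
            }
            work_(index);               // do my work, outside the lock
            std::lock_guard<std::mutex> lock(m_);
            if (--pending_ == 0) done_cv_.notify_one();  // last one tells main
        }
    }

    std::mutex m_;
    std::condition_variable go_cv_, done_cv_;
    std::function<void(std::size_t)> work_;
    std::size_t round_ = 0;
    std::size_t pending_ = 0;
    bool stop_ = false;
    std::vector<std::thread> threads_;  // declared last so the state above exists first
};

// Hypothetical usage: worker i adds i + 1 to its slot each round; main sums the slots.
long run_team(std::size_t n, int rounds) {
    std::vector<long> slots(n, 0);
    long total = 0;
    WorkerTeam team(n, [&](std::size_t i) { slots[i] += static_cast<long>(i) + 1; });
    for (int r = 0; r < rounds; ++r) {
        team.run_round();
        for (long v : slots) total += v;   // use results
    }
    return total;
}
```

The threads are created once and reused for every round; only the destructor joins them.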

Continue executing another thread

I'm currently playing with WinAPI and I have proceeded to threads. My program has n threads which work with each other. There is one main thread, which writes some data to specific memory location and waits until thread working with specific memory location processes the data. Then whole procedure repeats again, without terminating any thread.
My problem is that busy waiting for second thread wastes too much time.
Is there any way to suspend the current thread (to leave time for other threads) or to solve this problem differently?
Please help.
I'm guessing that you're currently polling / busy waiting in your main thread, constantly checking the state of some completion flag the worker thread will set. As you note, this isn't desirable as you use some proportion of cpu bandwidth just waiting for the worker to complete. In some cases, this will reduce the amount of time your worker is scheduled for, delaying its completion.
Rather than doing this, you can use a synchronisation object such as an Event or a Semaphore to have your main thread sleep until the worker signals its completion.
You can use synchronization objects such as mutexes, semaphores, and events for synchronization, and the WaitForSingleObject/WaitForMultipleObjects APIs to wait on them.

boost thread pool

I need a threadpool for my application, and I'd like to rely on standard (C++11 or boost) stuff as much as possible. I realize there is an unofficial(!) boost thread pool class, which basically solves what I need, however I'd rather avoid it because it is not in the boost library itself -- why is it still not in the core library after so many years?
In some posts on this page and elsewhere, people suggested using boost::asio to achieve a threadpool like behavior. At first sight, that looked like what I wanted to do, however I found out that all implementations I have seen have no means to join on the currently active tasks, which makes it useless for my application. To perform a join, they send stop signal to all the threads and subsequently join them. However, that completely nullifies the advantage of threadpools in my use case, because that makes new tasks require the creation of a new thread.
What I want to do is:
ThreadPool pool(4);
for (...)
{
    for (int i=0;i<something;i++)
        pool.pushTask(...);
    pool.join();
    // do something with the results
}
Can anyone suggest a solution (except for using the existing unofficial thread pool on sourceforge)? Is there anything in C++11 or core boost that can help me here?
At first sight, that looked like what I wanted to do, however I found out that all implementations I have seen have no means to join on the currently active tasks, which makes it useless for my application. To perform a join, they send stop signal to all the threads and subsequently join them. However, that completely nullifies the advantage of threadpools in my use case, because that makes new tasks require the creation of a new thread.
I think you might have misunderstood the asio example:
IIRC (and it's been a while) each thread running in the thread pool has called io_service::run, which means that effectively each thread has an event loop and a scheduler. To get asio to complete tasks, you post them to the io_service using the io_service::post method, and asio's scheduling mechanism takes care of the rest. As long as you don't call io_service::stop, the thread pool will continue running with as many threads as you started (assuming that each thread has work to do or has been assigned an io_service::work object).
So you don't need to create new threads for new tasks, that would go against the concept of a threadpool.
Have each task class derive from a Task that has an 'OnCompletion(task)' method/event. The threadpool threads can then call that after calling the main run() method of the task.
Waiting for a single task to complete is then easy. The OnCompletion() can perform whatever is required to signal the originating thread, signaling a condvar, queueing the task to a producer-consumer queue, calling SendMessage/PostMessage API's, Invoke/BeginInvoke, whatever.
If an originating thread needs to wait for several tasks to all complete, you could extend the above and issue a single 'Wait task' to the pool. The wait task has its own OnCompletion to communicate the completion of other tasks and has a thread-safe 'task counter' (atomic ops or lock), set to the number of 'main' tasks to be issued. The wait task is issued to the pool first, and the thread that runs it waits on a private 'allDone' condvar in the wait task. The 'main' tasks are then issued to the pool with their OnCompletion set to call a method of the wait task that decrements the task counter towards zero. When the task counter reaches zero, the thread that achieves this signals the allDone condvar. The wait task OnCompletion then runs and so signals the completion of all the main tasks.
Such a mechanism does not require the continual create/terminate/join/delete of threadpool threads, places no restriction on how the originating task needs to be signaled, and lets you issue as many such task-groups as you wish. You should note, however, that each wait task blocks one threadpool thread, so make sure you create a few extra threads in the pool (not usually any problem).
This seems like a job for boost::futures. The example in the docs seems to demonstrate exactly what you're looking to do.
Joining a thread means waiting until it terminates, and once it has terminated you must create a new thread to run a new task. So in your case you should instead wait on a condition (for example a boost::condition_variable) that indicates the end of the tasks. With this technique it is easy to implement using boost::asio and boost::condition_variable: each thread calls boost::asio::io_service::run, tasks are scheduled and executed on different threads, and at the end each task signals a boost::condition_variable or even decrements a std::atomic to indicate the end of the job. That's really easy, isn't it?
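To make the "count tasks down and signal a condition variable" idea concrete, here is a minimal sketch that uses only the C++11 standard library instead of boost::asio. The ThreadPool class below is hypothetical; it borrows the pushTask and join names from the question, and join() here waits for outstanding tasks without stopping any thread. It assumes tasks do not throw.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical pool whose join() waits for tasks, not for threads.
class ThreadPool {
public:
    explicit ThreadPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            threads_.emplace_back([this] { worker(); });
    }

    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stop_ = true;
        }
        task_cv_.notify_all();
        for (auto& t : threads_) t.join();   // the only real thread join
    }

    void pushTask(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            tasks_.push(std::move(task));
            ++unfinished_;                   // counts queued and running tasks
        }
        task_cv_.notify_one();
    }

    // Blocks until every task pushed so far has finished; threads stay alive.
    void join() {
        std::unique_lock<std::mutex> lock(m_);
        idle_cv_.wait(lock, [this] { return unfinished_ == 0; });
    }

private:
    void worker() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                task_cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stop requested and queue drained
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();                          // run outside the lock
            std::lock_guard<std::mutex> lock(m_);
            if (--unfinished_ == 0) idle_cv_.notify_all();
        }
    }

    std::mutex m_;
    std::condition_variable task_cv_, idle_cv_;
    std::queue<std::function<void()>> tasks_;
    std::size_t unfinished_ = 0;
    bool stop_ = false;
    std::vector<std::thread> threads_;       // declared last so the state above exists first
};

// Hypothetical usage mirroring the loop in the question.
int sum_in_batches(int batches, int tasks_per_batch) {
    ThreadPool pool(4);
    std::atomic<int> sum{0};
    int total = 0;
    for (int b = 0; b < batches; ++b) {
        for (int i = 0; i < tasks_per_batch; ++i)
            pool.pushTask([&sum, i] { sum += i; });
        pool.join();
        total = sum.load();                  // do something with the results
    }
    return total;
}
```

The same four threads serve every batch, so no thread is created or destroyed between calls to join().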

sleeping a thread in the middle of execution

What happens when a thread is put to sleep by another thread, possibly by the main thread, in the middle of its execution?
Assume I have a function Producer. What if the Consumer sleep()s the Producer in the middle of the production of one unit?
Suppose the unit is half produced, and then the Producer is put to sleep(). The integrity of the system may be at risk.
The thread that sleep is invoked on is put in the idle queue by the thread scheduler and is context switched out of the CPU it is running on, so other threads can take its place.
All context (registers, stack pointer, base pointer, etc) are saved on the thread stack, so when it's run next time, it can continue from where it left off.
The OS is constantly doing context switches between threads in order to make your system seem like it's doing multiple things. The OS thread scheduler algorithm takes care of that.
Thread scheduling and threading is a big subject, if you want to really understand it, I suggest you start reading up on it. :)
EDIT: Using sleep for thread synchronization purposes is not advised; you should use proper synchronization mechanisms to tell the thread to wait for other threads, etc.
There is no problem associated with this, unless some state is mutated while the thread sleeps, so it wakes up with a different set of values than before going to sleep.
Threads are switched in and out of execution by the CPU all the time, but that does not affect the overall outcome of their execution, assuming no data races or other bugs are present.
It would be unadvisable for one thread to forcibly and synchronously interfere with the execution of another thread. One thread could send an asynchronous message to another requesting that it reschedule itself in some way, but that would be handled by the other thread when it was in a suitable state to do so.
Assuming they communicate using channels that are thread-safe, nothing bad should happen, as the sleeping thread will wake up eventually and grab data from its task queue, or see that some semaphore has been set and read the produced data.
If the threads communicate using nonvolatile variables or direct function calls that change state, that's when Bad Things occur.
I don't know of a way for a thread to forcibly cause another thread to sleep. If two threads are accessing a shared resource (like an input/output queue, which seems likely for your Producer/Consumer example), then both threads may contend for the same lock. The losing thread must wait for the other thread to release the lock if the contention is not the "trylock" variety. The thread that waits is placed into a waiting queue associated with the lock, and is removed from the scheduler's run queue. When the winning thread releases the lock, the code checks the queue to see if there are threads still waiting to acquire it. If there are, one is chosen as the winner, is given the lock, and is placed in the scheduler's run queue.
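To illustrate the point about thread-safe channels, here is a small sketch of a producer/consumer pair in C++11. Channel and produce_and_consume are hypothetical names. The producer builds each unit completely in a local variable and only then hands it over under the lock, so it can sleep or be descheduled at any point without the consumer ever seeing a half-produced unit.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical thread-safe channel: a queue guarded by a mutex and a condition variable.
template <typename T>
class Channel {
public:
    void send(T value) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(value));
        }
        cv_.notify_one();
    }

    // Blocks (the caller is taken off the run queue) until a unit is available.
    T receive() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        T value = std::move(q_.front());
        q_.pop();
        return value;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<T> q_;
};

// Hypothetical demo: the producer makes `units` items (10, 20, ...); the consumer sums them.
int produce_and_consume(int units) {
    Channel<int> channel;

    std::thread producer([&] {
        for (int i = 1; i <= units; ++i) {
            int unit = i * 10;   // "half produced" state lives only in this local
            // Sleeping mid-production is harmless: nothing has been shared yet.
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
            channel.send(unit);  // publish the finished unit
        }
    });

    int sum = 0;
    for (int i = 0; i < units; ++i)
        sum += channel.receive();

    producer.join();
    return sum;
}
```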