Platform-independent inter-thread communication in C++

I have a process that receives multiple jobs. For each job it picks a thread from a thread pool and assigns the job to it; this thread may in turn spawn another set of threads from its own thread pool. When a STOP request for a job reaches the main process, it should be forwarded to the thread handling that job, and all the threads associated with that job should clean up and exit. My question is how to notify the worker threads about "STOP".
A global variable could be used, with the worker threads polling it frequently, but a worker may be inside any of a large number of functions, so although adding checks everywhere could work, it would be messy.
Is there a cleaner approach, such as some kind of messaging layer? By the way, the code is C++.

The Boost.Thread library is a wrapper around pthreads that's also portable to Windows. The boost::thread class has an interrupt() method that'll interrupt the thread at the next interruption point.
Boost.Thread also has a thread_group class which provides a collection of related threads. thread_group also has an interrupt() method that invokes interrupt() on each thread in the thread group.
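For comparison, here is a minimal sketch of the same idea using only the standard library instead of Boost: one stop flag per job, shared by every thread that works on that job and checked between units of work (the check plays the role of an interruption point). The names JobStop, worker and run_job_and_stop are made up for illustration and are not part of Boost or of the question's code.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical per-job stop flag, shared by all threads of one job.
struct JobStop {
    std::atomic<bool> requested{false};
    void request_stop() { requested.store(true); }
    bool stop_requested() const { return requested.load(); }
};

// Hypothetical worker body: performs small units of work and checks the
// flag between them, then cleans up and exits once a stop is requested.
void worker(std::shared_ptr<JobStop> stop, std::atomic<int>& exited) {
    while (!stop->stop_requested()) {
        // One unit of work (simulated with a short sleep).
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    // Job-specific cleanup would go here.
    exited.fetch_add(1);
}

// Starts n threads for one job, sends the STOP request, waits for all of
// them, and returns how many threads exited cleanly.
int run_job_and_stop(int n) {
    assert(n > 0);
    auto stop = std::make_shared<JobStop>();
    std::atomic<int> exited{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i)
        threads.emplace_back(worker, stop, std::ref(exited));
    stop->request_stop();  // the STOP request for this job
    for (auto& t : threads) t.join();
    return exited.load();
}
```

Calling run_job_and_stop(4) starts four threads and returns once all four have noticed the flag. A thread that spawns its own sub-threads would simply pass the same shared_ptr on to them.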

Related

Behavior of boost::asio::io_service thread pool during uneven load

I am having a hard time finding out exactly how a thread pool built with boost::asio::io_service behaves.
The documentation says:
Multiple threads may call the run() function to set up a pool of
threads from which the io_service may execute handlers. All threads
that are waiting in the pool are equivalent and the io_service may
choose any one of them to invoke a handler.
I would imagine that when threads executing run() take a handler, they execute it and then go back to waiting for the next handler. When executing a handler, a thread is not considered waiting, and hence no new handlers to execute are assigned to it. Is that correct? Or does io_service assign work to threads without considering whether they are busy or not?
I am asking because in one project that we use (OSRM), which uses a boost::asio::io_service based thread pool to handle incoming HTTP requests, I noticed that long-running requests sometimes block other, fast requests, even though more threads and cores are available.
When executing a handler, a thread is not considered waiting, and hence no new handlers to execute are assigned to it. Is that correct?
Yes. It's a pull model queue.
A notable "apparent" exception is when strands are used: handlers wrapped in a strand are synchronized with other handlers running on that same strand.

Boost::Asio - wake up a thread when there are handlers to run

The common way to process Asio handlers is to have a thread (or several threads) either polling io_service (i.e. calling io_service::poll()) regularly to run the handlers or using io_service::run(), which blocks the thread until there's work to do, in which case the thread will run the required handlers and either return or go to sleep again.
However, I want to make a system where a thread is not only responsible for running Asio handlers, but also needs to sync up with another thread using a condition variable. Basically, I want the thread to do all of these:
Wake up when there are Asio handlers that need to be processed (i.e. if I call io_service::poll(), one or more handlers will be processed).
Wake up when there is non-Asio work to be done, indicated by my condition variable.
Sleep otherwise.
In other words, I need a way for Asio to signal me that there are handlers ready to execute, without having to busy-wait or continuously poll. Ideally, Asio will somehow signal a thread when work is available, and that thread will in turn wake up my main worker thread, which will process Asio handlers. That worker thread will also be occasionally woken up by yet another thread, and will process other, non-Asio related work.
Is this even feasible, or should I reconsider how I am designing my system?

What's the purpose of SignalObjectAndWait, given that there are SetEvent and WaitForSingleObject?

I just realized there is SignalObjectAndWait API function for Windows platform. But there is already SetEvent and WaitForSingleObject. You can use them together to achieve the same goal as SignalObjectAndWait.
Based on the MSDN, SignalObjectAndWait is more efficient than separate calls to SetEvent and WaitForSingleObject. It also states:
A thread can use the SignalObjectAndWait function to ensure that a worker thread is in a wait state before signaling an object.
I don't fully understand this sentence, but it seems that efficiency is not the only reason why we need SignalObjectAndWait. Can anybody provide a scenario where SetEvent + WaitForSingleObject fails to provide the functionality that SignalObjectAndWait offers?
My understanding is that this single function is more efficient in that it avoids the following scenario.
The SignalObjectAndWait function provides a more efficient way to signal one object and then wait on another compared to separate function calls such as SetEvent followed by WaitForSingleObject.
When you call SetEvent and another [esp. higher priority] thread is waiting on this event, the thread scheduler may take control away from the signaling thread. When that thread gets control back, the only thing it does is make the following WaitForSingleObject call, so a context switch is wasted on such a tiny thing.
Using SignalObjectAndWait you hint the kernel by saying "hey, I will be waiting for another event anyway, so if it makes any difference for you don't excessively bounce with context switches back and forth".
The purpose, as MSDN explains, is to ensure that the thread is in a wait state BEFORE the event is signalled. Calling WaitForSingleObject puts the thread in a wait state, but you can't call it BEFORE calling SetEvent, since SetEvent would then run only AFTER the wait has finished, which is pointless if nothing else is calling SetEvent.
As you know, Microsoft gives the following example of why we might ever need SignalObjectAndWait when we already have separate SetEvent and WaitForSingleObject (quoting the Microsoft example):
A thread can use the SignalObjectAndWait function to ensure that a worker thread is in a wait state before signaling an object. For example, a thread and a worker thread may use handles to event objects to synchronize their work. The thread executes code such as the following:
    dwRet = WaitForSingleObject(hEventWorkerDone, INFINITE);
    if( WAIT_OBJECT_0 == dwRet)
        SetEvent(hEventMoreWorkToDo);
The worker thread executes code such as the following:
    dwRet = SignalObjectAndWait(hEventWorkerDone,
                                hEventMoreWorkToDo,
                                INFINITE,
                                FALSE);
This algorithm flow is flawed and should never be used. We do not need such a perplexing mechanism where the threads notify each other until we end up in a “Race condition”. In this example, Microsoft itself creates the race condition. The worker thread should just wait for an event and take tasks from a list, while the thread that generates tasks should just add tasks to this list and signal the event. So we need just one event, not two as in the Microsoft example above. The list has to be protected by a critical section. The thread that generates tasks should not wait for the worker thread to complete the tasks. If some tasks require somebody to be notified on their completion, the tasks should send the notifications themselves. In other words, it is the task that notifies the thread on completion; the thread does not specifically wait for the jobs thread until it finishes processing all the tasks.
Such a flawed design, as in the Microsoft example, creates the need for monsters like the atomic SignalObjectAndWait and the atomic PulseEvent, functions that ultimately lead to doom.
Here is an algorithm for achieving the goal set in your question. The goal is achieved with just plain and simple events and the simple functions SetEvent and WaitForSingleObject; no other functions are needed.
Create one common auto-reset event for all job threads to signal that there is a task (tasks) available; and also create per-thread auto-reset events, one event for each job thread.
Multiple job threads, once they have finished running all the jobs, wait for this common auto-reset “task available” event using WaitForMultipleObjects. Each thread waits on two events: the common event and its own per-thread event.
The scheduler thread puts new (pending) jobs to the list.
The jobs list access has to be protected by EnterCriticalSection/LeaveCriticalSection, so no one ever accesses this list the other way.
Each of the job threads, after completing one job, before starting to wait for the auto-reset “task available” event and its own event, checks the pending jobs list. If the list is not empty, get one job from the list (remove it from the list) and execute it.
There has to be another list protected by a critical section: the list of waiting job threads.
Before each job thread starts waiting, i.e. before it calls WaitForMultipleObjects, it adds itself to the “waiting” list. On exit from the wait, it removes itself from this waiting list.
When the scheduler thread puts new (pending) jobs on the jobs list, it first enters the critical section of the jobs list and then that of the threads list, so it holds two critical sections at the same time. The job threads, however, must never hold both critical sections at the same time.
If there is just one job pending, the scheduler sets the common auto-reset event to the signaled state (call SetEvent) -- it doesn’t matter which of the sleeping job threads will pick up the job.
If there are two or more jobs pending, the scheduler does not signal the common event but counts how many threads are waiting. If at least as many threads are waiting as there are jobs, it signals the own event of as many threads as there are jobs and leaves the remaining threads sleeping.
If there are more jobs than waiting threads, signal the own event for each of the waiting thread.
After the scheduler thread has signaled all the events, it leaves the critical sections - first of the thread list, and then of the jobs list.
After the scheduler thread has signaled all the events needed for the particular case, it goes to sleep itself, i.e. calls WaitForSingleObject with its own sleep event (that is also an auto-reset event that should be signaled whenever a new job appears).
Since the job threads will not start to sleep until the whole jobs list is depleted, the scheduler thread is not needed in the meantime. It will only be needed later, when a new job appears, not when a job is finished by a job thread.
Important: this scheme is based purely on auto-reset events. You won’t ever need to call ResetEvent. All the functions that are needed are: SetEvent and WaitForMultipleObjects (or WaitForSingleObject). No atomic event operation is needed.
Please note: when I wrote that a thread sleeps, it doesn't call "Sleep" API call - it will never be needed, it just is in the "wait" state as a result of calling WaitForMultipleObjects (or WaitForSingleObject).
As you know, auto-reset events and the SetEvent and WaitForMultipleObjects functions are very reliable. They have existed since NT 3.1. You can always design your program logic to rely solely on these simple functions, so you never need complex and unreliable functions that presume atomic operations, like PulseEvent or SignalObjectAndWait. By the way, SignalObjectAndWait only appeared in Windows NT 4.0, while SetEvent and WaitForMultipleObjects have existed since the initial version of Win32, NT 3.1.
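To make the building blocks of this scheme concrete without tying the example to Windows, here is a rough portable sketch in standard C++. It is not the Win32 API: AutoResetEvent, JobList and drain are hypothetical names, a std::mutex stands in for the critical section, and a mutex plus condition variable imitate the auto-reset behaviour (one successful wait consumes the signal). It covers only the single "task available" event and the "drain the list before waiting" rule, not the per-thread events or the waiting-threads list.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>

// Portable stand-in for an auto-reset event (hypothetical class).
class AutoResetEvent {
public:
    // Comparable to SetEvent: mark the event signaled and wake one waiter.
    void set() {
        {
            std::lock_guard<std::mutex> lock(m_);
            signaled_ = true;
        }
        cv_.notify_one();
    }

    // Comparable to WaitForSingleObject with a timeout. Returns true if the
    // event was signaled, and resets it automatically before returning.
    bool wait_for(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(m_);
        if (!cv_.wait_for(lock, timeout, [this] { return signaled_; }))
            return false;   // timed out, event still not signaled
        signaled_ = false;  // auto-reset
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Pending-jobs list guarded by a mutex (the "critical section" in the text).
class JobList {
public:
    void push(int job) {
        std::lock_guard<std::mutex> lock(m_);
        jobs_.push_back(job);
    }

    // Removes one job if there is one; returns false when the list is empty.
    bool try_pop(int& job) {
        std::lock_guard<std::mutex> lock(m_);
        if (jobs_.empty()) return false;
        job = jobs_.front();
        jobs_.pop_front();
        return true;
    }

private:
    std::mutex m_;
    std::deque<int> jobs_;
};

// What a job thread does before it goes back to waiting: take jobs until
// the list is empty. Returns the number of jobs it took.
int drain(JobList& jobs) {
    int taken = 0;
    int job = 0;
    while (jobs.try_pop(job)) ++taken;  // a real worker would execute the job
    return taken;
}
```

A job thread would loop over wait_for (or an untimed wait) followed by drain, while the scheduler calls push and then set. Because the worker drains the whole list after each wake-up, a single set() is enough even when several jobs were pushed.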

boost thread pool

I need a threadpool for my application, and I'd like to rely on standard (C++11 or boost) stuff as much as possible. I realize there is an unofficial(!) boost thread pool class, which basically solves what I need, however I'd rather avoid it because it is not in the boost library itself -- why is it still not in the core library after so many years?
In some posts on this page and elsewhere, people suggested using boost::asio to achieve a threadpool like behavior. At first sight, that looked like what I wanted to do, however I found out that all implementations I have seen have no means to join on the currently active tasks, which makes it useless for my application. To perform a join, they send stop signal to all the threads and subsequently join them. However, that completely nullifies the advantage of threadpools in my use case, because that makes new tasks require the creation of a new thread.
What I want to do is:
    ThreadPool pool(4);
    for (...)
    {
        for (int i=0;i<something;i++)
            pool.pushTask(...);
        pool.join();
        // do something with the results
    }
Can anyone suggest a solution (except for using the existing unofficial thread pool on sourceforge)? Is there anything in C++11 or core boost that can help me here?
At first sight, that looked like what I wanted to do, however I found out that all implementations I have seen have no means to join on the currently active tasks, which makes it useless for my application. To perform a join, they send stop signal to all the threads and subsequently join them. However, that completely nullifies the advantage of threadpools in my use case, because that makes new tasks require the creation of a new thread.
I think you might have misunderstood the asio example:
IIRC (and it's been a while), each thread running in the thread pool has called io_service::run, which means that effectively each thread has an event loop and a scheduler. To get asio to complete tasks, you post them to the io_service using the io_service::post method, and asio's scheduling mechanism takes care of the rest. As long as you don't call io_service::stop, the thread pool will continue running with as many threads as you started (assuming that each thread has work to do or has been assigned an io_service::work object).
So you don't need to create new threads for new tasks, that would go against the concept of a threadpool.
Have each task class derive from a Task that has an 'OnCompletion(task)' method/event. The threadpool threads can then call that after calling the main run() method of the task.
Waiting for a single task to complete is then easy. The OnCompletion() can perform whatever is required to signal the originating thread, signaling a condvar, queueing the task to a producer-consumer queue, calling SendMessage/PostMessage API's, Invoke/BeginInvoke, whatever.
If an originating thread needs to wait for several tasks to all complete, you could extend the above and issue a single 'Wait task' to the pool. The wait task has its own OnCompletion to communicate the completion of other tasks and has a thread-safe 'task counter' (atomic ops or lock), set to the number of 'main' tasks to be issued. The wait task is issued to the pool first and the thread that runs it waits on a private 'allDone' condvar in the wait task. The 'main' tasks are then issued to the pool with their OnCompletion set to call a method of the wait task that decrements the task counter towards zero. When the task counter reaches zero, the thread that achieves this signals the allDone condvar. The wait task OnCompletion then runs and so signals the completion of all the main tasks.
Such a mechanism does not require the continual create/terminate/join/delete of threadpool threads, places no restriction on how the originating task needs to be signaled, and lets you issue as many such task-groups as you wish. You should note, however, that each wait task blocks one threadpool thread, so make sure you create a few extra threads in the pool (not usually a problem).
This seems like a job for boost::futures. The example in the docs seems to demonstrate exactly what you're looking to do.
Joining a thread means waiting until it stops, and once it has stopped you must create a new thread to run a new task. So in your case you should instead wait on a condition (for example a boost::condition_variable) that indicates the end of the tasks. With this technique it is very easy to implement using boost::asio and boost::condition_variable: each thread calls boost::asio::io_service::run, the tasks are scheduled and executed on different threads, and at the end each task signals a boost::condition_variable or even decrements a std::atomic to indicate the end of the job. That's really easy, isn't it?
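As an illustration of the counter-plus-condition-variable idea, here is a small sketch that uses only the C++11 standard library rather than boost::asio. The class is a hypothetical ThreadPool with the pushTask()/join() interface from the question; join() waits until the count of unfinished tasks drops to zero and leaves the worker threads running, so the pool can be reused in the next loop iteration. Treat it as one possible shape, not a tuned implementation.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class ThreadPool {
public:
    explicit ThreadPool(std::size_t n) {
        assert(n > 0);
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Only the destructor stops and joins the actual threads.
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stopping_ = true;
        }
        work_cv_.notify_all();
        for (auto& t : workers_) t.join();
    }

    void pushTask(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            tasks_.push(std::move(task));
            ++pending_;  // counts queued and currently running tasks
        }
        work_cv_.notify_one();
    }

    // Blocks until every task pushed so far has finished.
    // The worker threads stay alive and can take new tasks afterwards.
    void join() {
        std::unique_lock<std::mutex> lock(m_);
        done_cv_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                work_cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
            {
                std::lock_guard<std::mutex> lock(m_);
                if (--pending_ == 0) done_cv_.notify_all();
            }
        }
    }

    std::mutex m_;
    std::condition_variable work_cv_;  // signals "a task is available"
    std::condition_variable done_cv_;  // signals "all tasks finished"
    std::queue<std::function<void()>> tasks_;
    std::size_t pending_ = 0;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

With this sketch the loop from the question works as written: push a batch with pool.pushTask(...), call pool.join(), use the results, and push the next batch on the same threads. Exceptions thrown by a task are not handled here and would terminate the program.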

What is the best way to create suspendable/resumable threads

I am doing some network programming for a microprocessor which sends low buffer notifications and I have a thread that writes a set amount of information. When it is done it needs to enter a suspended state and wait for the low buffer notification to resume.
Is it better to use the Windows thread pool API, or to use threads that are created with CreateThread()?
When your thread needs to wait, it should begin waiting on an event. This suspends the thread automatically.
Windows provides the WaitForMultipleObjects and WaitForSingleObject functions for this. Linux uses condition variables or semaphores.
The best way to create a suspendable thread is:
std::thread thread(function, arguments);
When you want to suspend the execution of that thread at a later stage you can use the wait() member of std::condition_variable or std::condition_variable_any.
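A minimal sketch of that pattern, assuming the "low buffer" notification arrives as a call to a notifyLowBuffer() method (a name invented here, like the SuspendableWriter class itself): the thread writes one chunk per notification and otherwise sleeps inside std::condition_variable::wait. The actual device write is left as a placeholder comment.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical writer thread that is suspended between notifications.
class SuspendableWriter {
public:
    SuspendableWriter() : thread_([this] { run(); }) {}

    ~SuspendableWriter() { stop(); }

    // Called when the device reports a low buffer: resumes the thread
    // for one more chunk.
    void notifyLowBuffer() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++requests_;
        }
        cv_.notify_one();
    }

    // Lets the thread finish pending requests, joins it, and returns the
    // number of chunks written. Safe to call more than once.
    int stop() {
        {
            std::lock_guard<std::mutex> lock(m_);
            quit_ = true;
        }
        cv_.notify_one();
        if (thread_.joinable()) thread_.join();
        return chunks_;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(m_);
        for (;;) {
            // Suspended here until there is a request or a quit signal.
            cv_.wait(lock, [this] { return quit_ || requests_ > 0; });
            if (requests_ == 0) return;  // quit requested, nothing pending
            --requests_;
            lock.unlock();
            // Write one chunk to the device here (placeholder).
            lock.lock();
            ++chunks_;
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    int requests_ = 0;
    int chunks_ = 0;
    bool quit_ = false;
    std::thread thread_;  // declared last so the other members exist first
};
```

The predicate passed to wait() guards against spurious wake-ups and against notifications that arrive while the thread is busy writing, since those are counted in requests_ rather than lost.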
It is better to use single threads created with CreateThread. ThreadPool threads are meant to do simple tasks and then return to the pool, they are not meant for long running tasks, waits or I/O operations. This is because they are limited in number and once you have one running and waiting, you cannot use it somewhere else.
Furthermore, ThreadPool threads are managed by the system and are not meant to be identifiable from the outside. You're better off using classic Threads.