How to make tasks in boost::asio thread pool Cancelable/Interruptable? - c++

I have a thread pool made with boost::asio::io_service from recipe like this. I wonder how to interrupt posted into it tasks (not killing threads), so that they would be replaced with next tasks in pool?

Callbacks in boost::asio in general should be reasonably fast. They should just do some work, schedule another one and finish. This will be the points, where task looses CPU and other task will be taken into execution.
There are boost::thread::interrupt() and boost::thread_group::interrupt_all(), but they could stop execution thread only at interruption points. The interruption will be seen as exception boost::thread_interrupted. This means, that you have to handle exception somehow and in your case - release current task. It is much more complicated, than just doing one step of processing and scheduling handler for the rest.
Moreover, you could play with interrupt() and interrupt_all() in threads which do execute your own routine, while this is not the case for threads running io_service::run(). One can imagine, that boost::thread_interrupted is being thrown inside run() method, instead of async handler, which could end with unexpected behavior.

Related

Behavior of boost::asio::io_service thread pool during uneven load

I have hard time finding out how exactly does thread pool built with boost::asio::io_service behave.
The documentation says:
Multiple threads may call the run() function to set up a pool of
threads from which the io_service may execute handlers. All threads
that are waiting in the pool are equivalent and the io_service may
choose any one of them to invoke a handler.
I would imagine, that when threads executing run() are taking a handler to execute, they execute it, and then come back to wait for next handlers to execute. When executing a handler, a thread is not considered waiting, and hence no new handlers to execute are assigned to it. Is that correct? Or does io_service assign work to threads, without considering whether these are busy or not?
I am asking, because in one project that we are using (OSRM), that uses boost::asio::io_service based thread pool to handle incoming HTTP requests, I noticed that long running request, sometimes block other, fast requests, even though more threads and cores are available.
When executing a handler, a thread is not considered waiting, and hence no new handlers to execute are assigned to it. Is that correct?
Yes. It's a pull model queue.
A notable "apparent" exception is when strands are used: handlers wrapped on a on a strand do synchronize with other handlers running on that same strand.

What's the purpose of SignalObjectAndWait regards there is SetEvent and WaitForSingleObject?

I just realized there is SignalObjectAndWait API function for Windows platform. But there is already SetEvent and WaitForSingleObject. You can use them together to achieve the same goal as SignalObjectAndWait.
Based on the MSDN, SignalObjectAndWait is more efficient than separate calls to SetEvent and WaitForSingleObject. It also states:
A thread can use the SignalObjectAndWait function to ensure that a worker thread is in a wait state before signaling an object.
I don't fully understand this sentence, but it seems that efficiency is not the only reason why we need SignalObjectAndWait. Can anybody provide a scenario where SetEvent + WaitForSingleObject fails to provide the functionality that SignalObjectAndWait offers?
My understanding is that this single function is more efficient in the way that it avoid the following scenario.
The SignalObjectAndWait function provides a more efficient way to signal one object and then wait on another compared to separate function calls such as SetEvent followed by WaitForSingleObject.
When you you SetEvent and another [esp. higher priority] thread is waiting on this event, it might so happen that thread scheduler takes control away from the signaling thread. When the thread receives control back, the only thing that it does is the following WaitForSingleObject call, thus wasting context switch for such a tiny thing.
Using SignalObjectAndWait you hint the kernel by saying "hey, I will be waiting for another event anyway, so if it makes any difference for you don't excessively bounce with context switches back and forth".
The purpose, as MSDN explains is to ensure that the thread is in a Wait State BEFORE the event is signalled. If you call WaitForSingleObject, the thread is in a waitstate, but you can't call that BEFORE calling SetEvent, since that will cause SetEvent to happen only AFTER the wait has finished - which is pointless if nothing else is calling SetEvent.
As you know, Microsoft gives the following example of why we may ever need SignalObjectAndWait if we already need separate SetEvent and WaitForSingleObject (quote the Microsoft example):
A thread can use the SignalObjectAndWait function to ensure that a worker thread is in a wait state before signaling an object. For example, a thread and a worker thread may use handles to event objects to synchronize their work. The thread executes code such as the following:
dwRet = WaitForSingleObject(hEventWorkerDone, INFINITE);
if( WAIT_OBJECT_0 == dwRet)
SetEvent(hEventMoreWorkToDo);
The worker thread executes code such as the following:
dwRet = SignalObjectAndWait(hEventWorkerDone,
hEventMoreWorkToDo,
INFINITE,
FALSE);
This algorithm flow is flawed and should never be used. We do not need such a perplexing mechanism where the threads notify each other until we are in a “Race condition”. Microsoft itself in this example creates the Race Condition. The worker thread should just wait for an event and take tasks from a list, while the thread that generates tasks should just add tasks to this list and signal the event. So, we just need one event, not two as in the above Microsoft example. The list has to be protected by a critical section. The thread that generates tasks should not wait for the worker thread to complete the tasks. If there are tasks that require to notify somebody on their completion, the tasks should send the notifications by themselves. In other words, it is the task who will notify the thread on completion -- it is not the thread who will specifically wait for the jobs thread until it finishes processing all the tasks.
Such a flawed design, as in the Microsoft Example, creates imperative for such monsters like atomic SignalObjectAndWait and atomic PulseEvent -- function that ultimately lead to doom.
Here is an algorithm how can you achieve you goal set in your question. The goal is achieved with just plain and simple events, and simple function SetEvent and WaitForSingleObject - no other functions needed.
Create one common auto-reset event for all job threads to signal that there is a task (tasks) available; and also create per-thread auto-reset events, one event for each job thread.
Multiple job treads, once finished running all the jobs, all wait for this common auto-reset “task available” event using WaitForMultipleObjects - it waits two event - the common event and the own thread event.
The scheduler thread puts new (pending) jobs to the list.
The jobs list access has to be protected by EnterCriticalSection/LeaveCriticalSection, so no one ever accesses this list the other way.
Each of the job threads, after completing one job, before starting to wait for the auto-reset “task available” event and its own event, checks the pending jobs list. If the list is not empty, get one job from the list (remove it from the list) and execute it.
There have to be another list protected by critical section – waiting jobs thread list.
Before each jobs tread starts waiting, i.e. before it calls WaitForMultipleObjects, it adds itself to the “waiting” list. On exit from wait, it removes itself from this waiting list.
When the scheduler thread puts new (pending) jobs to the jobs list, it first enters critical section of the jobs list and then of the treads list - so two critical sections are entered simultaneously. The jobs threads, however, may never enter both critical sections simultaneously.
If there is just one job pending, the scheduler sets the common auto-reset event to the signaled state (call SetEvent) -- it doesn’t matter which of the sleeping job threads will pick up the job.
If there are two or more jobs pending, it would not signal the common event, but will count how many threads are waiting. If there are at least as many threads waiting as there are the jobs, signal the own event of that number of threads as there are events, and leave the remaining thread to continue their sleeping.
If there are more jobs than waiting threads, signal the own event for each of the waiting thread.
After the scheduler thread has signaled all the events, it leaves the critical sections - first of the thread list, and then of the jobs list.
After the scheduler thread has signaled all the events needed for the particular case, it goes to sleep itself, i.e. calls WaitForSingleObject with its own sleep event (that is also an auto-reset event that should be signaled whenever a new job appears).
Since the jobs threads will not start to sleep until the whole jobs list is depleted, you will no longer need the scheduler thread again. The scheduler thread will only be needed later, when a new jobs appears, not when a job is finished by the jobs thread.
Important: this scheme is based purely on auto-reset events. You won’t ever need to call ResetEvent. All the functions that are needed are: SetEvent and WaitForMultipleObjects (or WaitForSingleObject). No atomic event operation is needed.
Please note: when I wrote that a thread sleeps, it doesn't call "Sleep" API call - it will never be needed, it just is in the "wait" state as a result of calling WaitForMultipleObjects (or WaitForSingleObject).
As you know, auto-reset event, and the SetEvent and WaitForMultipleObjects function are very reliable. They exist since NT 3.1. You may always architect such a program logic that will solely rely on these simple functions -- so you would not ever need complex and unreliable functions that presume atomic operations, like PulseEvent or SignalObjectAndWait. By the way, SignalObjectAndWait did only appear in Windows NT 4.0, while SetEvent and WaitForMultipleObjects did exist from the initial version of Win32 – NT 3.1.

Continue executing another thread

I'm currently playing with WinAPI and I have proceeded to threads. My program has n threads which work with each other. There is one main thread, which writes some data to specific memory location and waits until thread working with specific memory location processes the data. Then whole procedure repeats again, without terminating any thread.
My problem is that busy waiting for second thread wastes too much time.
Is there any way to suspend current thread (to leave time for enother threads) or to solve this problem differently?
Please help.
I'm guessing that you're currently polling / busy waiting in your main thread, constantly checking the state of some completion flag the worker thread will set. As you note, this isn't desirable as you use some proportion of cpu bandwidth just waiting for the worker to complete. In some cases, this will reduce the amount of time your worker is scheduled for, delaying its completion.
Rather that doing this, you can use a synchronisation object such as Event or Semaphore to have your main thread sleep until the worker signals its completion.
You can use synchronization objects like mutex, semaaphores events etc for synchronization and WaitForSingleObject/WaitForMultipleObject API for thread waiting.

platform independent inter thread communication

I have a process which receives multiple jobs and picks a thread from thread pool and assigns a job to it, this thread in turn may spawn another set of threads from its own thread pool. Now when a STOP request for a job comes to the main process, it should be forwarded to corresponding thread for that request and all the threads associated with that job should clean themselves up and exit, My question is how to notify the worker threads about "STOP".
A global variable can be used and worker threads can poll it frequently but there are lot of functions that a worker can be doing, and adding checks everywhere could work.
Is there a clean approach? some kind of messaging layer. btw the code is C++
The Boost.Thread library is a wrapper around pthreads that's also portable to Windows. The boost::thread class has an interrupt() method that'll interrupt the thread at the next interruption point.
Boost.Thread also has a thread_group class which provides a collection of related threads. thread_group also has an interrupt() method that invokes interrupt() on each thread in the thread group.

boost thread pool

I need a threadpool for my application, and I'd like to rely on standard (C++11 or boost) stuff as much as possible. I realize there is an unofficial(!) boost thread pool class, which basically solves what I need, however I'd rather avoid it because it is not in the boost library itself -- why is it still not in the core library after so many years?
In some posts on this page and elsewhere, people suggested using boost::asio to achieve a threadpool like behavior. At first sight, that looked like what I wanted to do, however I found out that all implementations I have seen have no means to join on the currently active tasks, which makes it useless for my application. To perform a join, they send stop signal to all the threads and subsequently join them. However, that completely nullifies the advantage of threadpools in my use case, because that makes new tasks require the creation of a new thread.
What I want to do is:
ThreadPool pool(4);
for (...)
{
for (int i=0;i<something;i++)
pool.pushTask(...);
pool.join();
// do something with the results
}
Can anyone suggest a solution (except for using the existing unofficial thread pool on sourceforge)? Is there anything in C++11 or core boost that can help me here?
At first sight, that looked like what I wanted to do, however I found out that all implementations I have seen have no means to join on the currently active tasks, which makes it useless for my application. To perform a join, they send stop signal to all the threads and subsequently join them. However, that completely nullifies the advantage of threadpools in my use case, because that makes new tasks require the creation of a new thread.
I think you might have misunderstood the asio example:
IIRC (and it's been a while) each thread running in the thread pool has called io_service::run which means that effectively each thread has an event loop and a scheduler. To then get asio to complete tasks you post tasks to the io_service using the io_service::post method and asio's scheduling mechanism takes care of the rest. As long as you don't call io_service::stop, the thread pool will continue running using as many threads as you started running (assuming that each thread has work to do or has been assigned a io_service::work object).
So you don't need to create new threads for new tasks, that would go against the concept of a threadpool.
Have each task class derive from a Task that has an 'OnCompletion(task)' method/event. The threadpool threads can then call that after calling the main run() method of the task.
Waiting for a single task to complete is then easy. The OnCompletion() can perform whatever is required to signal the originating thread, signaling a condvar, queueing the task to a producer-consumer queue, calling SendMessage/PostMessage API's, Invoke/BeginInvoke, whatever.
If an oringinating thread needs to wait for several tasks to all complete, you could extend the above and issue a single 'Wait task' to the pool. The wait task has its own OnCompletion to communicate the completion of other tasks and has a thread-safe 'task counter', (atomic ops or lock), set to the number of 'main' tasks to be issued. The wait task is issued to the pool first and the thread that runs it waits on a private 'allDone' condvar in the wait task. The 'main' tasks are then issued to the pool with their OnCompletion set to call a method of the wait task that decrements the task counter towards zero. When the task counter reaches zero, the thread that achieves this signals the allDone condvar. The wait task OnCompletion then runs and so signals the completion of all the main tasks.
Such a mechansism does not require the continual create/terminate/join/delete of threadpool threads, places no restriction on how the originating task needs to be signaled and you can issue as many such task-groups as you wish. You should note, however, that each wait task blocks one threadpool thread, so make sure you create a few extra threads in the pool, (not usually any problem).
This seems like a job for boost::futures. The example in the docs seems to demonstrate exactly what you're looking to do.
Joining a thread mean stop for it until it stop, and if it stop and you want to assign a new task to it, you must create a new thread. So in your case you should wait for a condition (for example boost::condition_variable) to indicate end of tasks. So using this technique it is very easy to implement it using boost::asio and boost::condition_variable. Each thread call boost::asio::io_service::run and tasks will be scheduled and executed on different threads and at the end, each task will set a boost::condition_variable or event decrement a std::atomic to indicate end of the job! that's really easy, isn't it?