I recently discovered ATL's CThreadPool class and was very happy with the find. It's a neat little class that takes care of the synchronization semantics of having multiple worker threads process a queue of tasks. The tasks are fed to the CThreadPool object by some external process.
While it is very neat and clean, there doesn't seem to be a way to find out whether all tasks have actually been completed. What's the best way to do that?
For example, say I need to do 10 heavy computational tasks and then move on to do something else. I can't move on until I know the tasks have completed.
So I create a CThreadPool with 10 threads, put the tasks on the queue and off they go to the threads. How would I know when the tasks are completed?
Boaz
I think that you'll need to have your worker object's Execute() method use some form of IPC that the 'main' thread can monitor. Something like have the Execute() method call ReleaseSemaphore() on a semaphore that the main thread can wait on and count how many times the wait completes. Or the thread pool workers can post a message indicating they are done with a work item to a message queue the main thread gets messages from.
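A minimal sketch of the counting idea, written with portable C++11 primitives rather than the Win32 calls named above: a mutex and condition variable stand in for the semaphore, and plain std::thread workers stand in for CThreadPool. The names CompletionCounter and run_batch are hypothetical and exist only for illustration.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: counts "task done" notifications, much like a
// semaphore that each worker releases once per finished task.
class CompletionCounter {
public:
    // Called by a worker at the end of each task (the ReleaseSemaphore analogue).
    void notify_done() {
        std::lock_guard<std::mutex> lock(m_);
        ++done_;
        cv_.notify_all();
    }
    // Called by the main thread: blocks until `expected` tasks have reported in.
    void wait_for(int expected) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [&] { return done_ >= expected; });
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int done_ = 0;
};

// Runs n_tasks dummy tasks, waits for all of them, and returns the summed result.
int run_batch(int n_tasks) {
    CompletionCounter counter;
    std::atomic<int> sum{0};
    std::vector<std::thread> workers;
    for (int i = 1; i <= n_tasks; ++i) {
        workers.emplace_back([&counter, &sum, i] {
            sum += i;                 // stand-in for the heavy computation
            counter.notify_done();    // tell the main thread this task is finished
        });
    }
    counter.wait_for(n_tasks);        // main thread moves on only when all are done
    int result = sum.load();
    for (auto& t : workers) t.join(); // keep `counter` alive until workers exit
    return result;
}
```

With a real CThreadPool the notify_done() call would go at the end of the worker's Execute() method, and the main thread would call wait_for() with the number of tasks it queued.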
The cleanest way of waiting for a completed task is by creating an event, then waiting for the worker thread to finish by calling WaitForSingleObject.
I just realized there is SignalObjectAndWait API function for Windows platform. But there is already SetEvent and WaitForSingleObject. You can use them together to achieve the same goal as SignalObjectAndWait.
Based on the MSDN, SignalObjectAndWait is more efficient than separate calls to SetEvent and WaitForSingleObject. It also states:
A thread can use the SignalObjectAndWait function to ensure that a worker thread is in a wait state before signaling an object.
I don't fully understand this sentence, but it seems that efficiency is not the only reason why we need SignalObjectAndWait. Can anybody provide a scenario where SetEvent + WaitForSingleObject fails to provide the functionality that SignalObjectAndWait offers?
My understanding is that this single function is more efficient in that it avoids the following scenario.
The SignalObjectAndWait function provides a more efficient way to signal one object and then wait on another compared to separate function calls such as SetEvent followed by WaitForSingleObject.
When you call SetEvent and another (especially a higher-priority) thread is waiting on this event, the thread scheduler may take control away from the signaling thread. When that thread gets control back, the only thing it does is make the following WaitForSingleObject call, so a context switch is wasted on such a tiny thing.
Using SignalObjectAndWait you hint the kernel by saying "hey, I will be waiting for another event anyway, so if it makes any difference for you don't excessively bounce with context switches back and forth".
The purpose, as MSDN explains is to ensure that the thread is in a Wait State BEFORE the event is signalled. If you call WaitForSingleObject, the thread is in a waitstate, but you can't call that BEFORE calling SetEvent, since that will cause SetEvent to happen only AFTER the wait has finished - which is pointless if nothing else is calling SetEvent.
As you know, Microsoft gives the following example of why we might need SignalObjectAndWait when we already have separate SetEvent and WaitForSingleObject calls (quoting the Microsoft example):
A thread can use the SignalObjectAndWait function to ensure that a worker thread is in a wait state before signaling an object. For example, a thread and a worker thread may use handles to event objects to synchronize their work. The thread executes code such as the following:
dwRet = WaitForSingleObject(hEventWorkerDone, INFINITE);
if( WAIT_OBJECT_0 == dwRet)
    SetEvent(hEventMoreWorkToDo);
The worker thread executes code such as the following:
dwRet = SignalObjectAndWait(hEventWorkerDone,
                            hEventMoreWorkToDo,
                            INFINITE,
                            FALSE);
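For readers who want to trace the two-event handshake in the quoted example, here is a portable sketch of it. It is an assumption-laden analogue, not Win32 code: the hypothetical AutoResetEvent class emulates an auto-reset event with a mutex and condition variable, and the worker performs "signal done" and "wait for more work" as two separate steps. Because the emulated event latches its state, no wake-up is lost here; the sketch does not reproduce the atomicity or the scheduling hint of SignalObjectAndWait.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical portable stand-in for a Win32 auto-reset event.
class AutoResetEvent {
public:
    void set() {                        // SetEvent analogue
        std::lock_guard<std::mutex> lock(m_);
        signaled_ = true;
        cv_.notify_one();
    }
    void wait() {                       // WaitForSingleObject analogue
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [&] { return signaled_; });
        signaled_ = false;              // auto-reset: the waiter consumes the signal
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Runs `rounds` handshakes and returns how many work items the worker handled.
int run_handshake(int rounds) {
    AutoResetEvent workerDone, moreWorkToDo;
    int processed = 0;                  // written only by the worker thread

    std::thread worker([&] {
        for (int i = 0; i < rounds; ++i) {
            workerDone.set();           // "I am idle"
            moreWorkToDo.wait();        // wait until the master hands over work
            ++processed;                // stand-in for doing the work
        }
        workerDone.set();               // report completion of the last item
    });

    for (int i = 0; i < rounds; ++i) {
        workerDone.wait();              // wait until the worker is idle
        moreWorkToDo.set();             // hand over one work item
    }
    workerDone.wait();                  // wait for the final completion report
    worker.join();
    return processed;
}
```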
This algorithm flow is flawed and should never be used. We do not need such a perplexing mechanism where the threads notify each other until we are in a “Race condition”. Microsoft itself in this example creates the Race Condition. The worker thread should just wait for an event and take tasks from a list, while the thread that generates tasks should just add tasks to this list and signal the event. So, we just need one event, not two as in the above Microsoft example. The list has to be protected by a critical section. The thread that generates tasks should not wait for the worker thread to complete the tasks. If there are tasks that require to notify somebody on their completion, the tasks should send the notifications by themselves. In other words, it is the task who will notify the thread on completion -- it is not the thread who will specifically wait for the jobs thread until it finishes processing all the tasks.
Such a flawed design, as in the Microsoft example, creates the need for monsters such as atomic SignalObjectAndWait and atomic PulseEvent, functions that ultimately lead to doom.
Here is an algorithm that achieves the goal set in your question. It uses just plain and simple events and the simple functions SetEvent and WaitForSingleObject; no other functions are needed.
Create one common auto-reset event for all job threads to signal that there is a task (tasks) available; and also create per-thread auto-reset events, one event for each job thread.
Multiple job threads, once they have finished running all the jobs, wait for this common auto-reset “task available” event using WaitForMultipleObjects. Each thread waits on two events: the common event and its own thread event.
The scheduler thread puts new (pending) jobs to the list.
The jobs list access has to be protected by EnterCriticalSection/LeaveCriticalSection, so no one ever accesses this list the other way.
Each of the job threads, after completing one job, before starting to wait for the auto-reset “task available” event and its own event, checks the pending jobs list. If the list is not empty, get one job from the list (remove it from the list) and execute it.
There has to be another list protected by a critical section: the list of waiting job threads.
Before each job thread starts waiting, i.e. before it calls WaitForMultipleObjects, it adds itself to the “waiting” list. On exit from the wait, it removes itself from this waiting list.
When the scheduler thread puts new (pending) jobs on the jobs list, it first enters the critical section of the jobs list and then that of the threads list, so two critical sections are held simultaneously. The job threads, however, may never hold both critical sections simultaneously.
If there is just one job pending, the scheduler sets the common auto-reset event to the signaled state (call SetEvent) -- it doesn’t matter which of the sleeping job threads will pick up the job.
If there are two or more jobs pending, the scheduler does not signal the common event, but counts how many threads are waiting. If there are at least as many threads waiting as there are jobs, it signals the own events of as many threads as there are jobs, and leaves the remaining threads sleeping.
If there are more jobs than waiting threads, signal the own event for each of the waiting thread.
After the scheduler thread has signaled all the events, it leaves the critical sections - first of the thread list, and then of the jobs list.
After the scheduler thread has signaled all the events needed for the particular case, it goes to sleep itself, i.e. calls WaitForSingleObject with its own sleep event (that is also an auto-reset event that should be signaled whenever a new job appears).
Since the job threads will not go to sleep until the whole jobs list is depleted, you will not need the scheduler thread again in the meantime. The scheduler thread is only needed later, when a new job appears, not when a job is finished by a job thread.
Important: this scheme is based purely on auto-reset events. You won’t ever need to call ResetEvent. All the functions that are needed are: SetEvent and WaitForMultipleObjects (or WaitForSingleObject). No atomic event operation is needed.
Please note: when I wrote that a thread sleeps, it doesn't call "Sleep" API call - it will never be needed, it just is in the "wait" state as a result of calling WaitForMultipleObjects (or WaitForSingleObject).
As you know, auto-reset events and the SetEvent and WaitForMultipleObjects functions are very reliable. They have existed since NT 3.1. You can always architect program logic that relies solely on these simple functions, so you never need complex and unreliable functions that presume atomic operations, like PulseEvent or SignalObjectAndWait. By the way, SignalObjectAndWait only appeared in Windows NT 4.0, while SetEvent and WaitForMultipleObjects have existed since the initial version of Win32, NT 3.1.
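The core of the scheme above (a lock-protected jobs list, workers that drain the list before they wait, and a common auto-reset "task available" event) can be sketched in portable C++11. This is a deliberately simplified illustration under stated assumptions: the per-thread events and the waiting-threads list are left out, the AutoResetEvent class is a hypothetical emulation built on a mutex and condition variable, and the quit flag is added only so that the example terminates.

```cpp
#include <atomic>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical portable stand-in for a Win32 auto-reset event.
class AutoResetEvent {
public:
    void set() {
        std::lock_guard<std::mutex> lock(m_);
        signaled_ = true;
        cv_.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [&] { return signaled_; });
        signaled_ = false;                     // auto-reset after one waiter wakes
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Runs n_jobs trivial jobs on n_threads workers; returns how many jobs completed.
int run_jobs(int n_jobs, int n_threads) {
    std::mutex listLock;                       // plays the role of the critical section
    std::deque<std::function<void()>> jobs;    // pending jobs list
    AutoResetEvent taskAvailable;              // common "task available" event
    std::atomic<bool> quit{false};             // assumption: added so the demo can end
    std::atomic<int> completed{0};

    auto worker = [&] {
        for (;;) {
            bool quitting = quit.load();       // read before draining, so no job is missed
            for (;;) {                         // drain the list before waiting
                std::function<void()> job;
                {
                    std::lock_guard<std::mutex> lock(listLock);
                    if (jobs.empty()) break;
                    job = std::move(jobs.front());
                    jobs.pop_front();
                }
                job();                         // run outside the lock
            }
            if (quitting) {
                taskAvailable.set();           // pass the wake-up on to the next worker
                return;
            }
            taskAvailable.wait();
        }
    };

    std::vector<std::thread> threads;
    for (int i = 0; i < n_threads; ++i) threads.emplace_back(worker);

    for (int i = 0; i < n_jobs; ++i) {         // the "scheduler" side
        {
            std::lock_guard<std::mutex> lock(listLock);
            jobs.push_back([&completed] { ++completed; });
        }
        taskAvailable.set();
    }
    quit = true;
    taskAvailable.set();
    for (auto& t : threads) t.join();
    return completed.load();
}
```

Because a single auto-reset event wakes only one waiter per signal, this reduced version can leave some workers asleep while one worker drains the list. That loss of parallelism is what the per-thread events in the full scheme are meant to address.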
I need a threadpool for my application, and I'd like to rely on standard (C++11 or boost) stuff as much as possible. I realize there is an unofficial(!) boost thread pool class, which basically solves what I need, however I'd rather avoid it because it is not in the boost library itself -- why is it still not in the core library after so many years?
In some posts on this page and elsewhere, people suggested using boost::asio to achieve a threadpool like behavior. At first sight, that looked like what I wanted to do, however I found out that all implementations I have seen have no means to join on the currently active tasks, which makes it useless for my application. To perform a join, they send stop signal to all the threads and subsequently join them. However, that completely nullifies the advantage of threadpools in my use case, because that makes new tasks require the creation of a new thread.
What I want to do is:
ThreadPool pool(4);
for (...)
{
    for (int i=0;i<something;i++)
        pool.pushTask(...);
    pool.join();
    // do something with the results
}
Can anyone suggest a solution (except for using the existing unofficial thread pool on sourceforge)? Is there anything in C++11 or core boost that can help me here?
At first sight, that looked like what I wanted to do, however I found out that all implementations I have seen have no means to join on the currently active tasks, which makes it useless for my application. To perform a join, they send stop signal to all the threads and subsequently join them. However, that completely nullifies the advantage of threadpools in my use case, because that makes new tasks require the creation of a new thread.
I think you might have misunderstood the asio example:
IIRC (and it's been a while) each thread running in the thread pool has called io_service::run which means that effectively each thread has an event loop and a scheduler. To then get asio to complete tasks you post tasks to the io_service using the io_service::post method and asio's scheduling mechanism takes care of the rest. As long as you don't call io_service::stop, the thread pool will continue running using as many threads as you started running (assuming that each thread has work to do or has been assigned a io_service::work object).
So you don't need to create new threads for new tasks, that would go against the concept of a threadpool.
Have each task class derive from a Task that has an 'OnCompletion(task)' method/event. The threadpool threads can then call that after calling the main run() method of the task.
Waiting for a single task to complete is then easy. The OnCompletion() can perform whatever is required to signal the originating thread, signaling a condvar, queueing the task to a producer-consumer queue, calling SendMessage/PostMessage API's, Invoke/BeginInvoke, whatever.
If an originating thread needs to wait for several tasks to all complete, you could extend the above and issue a single 'Wait task' to the pool. The wait task has its own OnCompletion to communicate the completion of other tasks and has a thread-safe 'task counter' (atomic ops or lock), set to the number of 'main' tasks to be issued. The wait task is issued to the pool first and the thread that runs it waits on a private 'allDone' condvar in the wait task. The 'main' tasks are then issued to the pool with their OnCompletion set to call a method of the wait task that decrements the task counter towards zero. When the task counter reaches zero, the thread that achieves this signals the allDone condvar. The wait task OnCompletion then runs and so signals the completion of all the main tasks.
Such a mechanism does not require the continual create/terminate/join/delete of threadpool threads, places no restriction on how the originating task needs to be signaled, and lets you issue as many such task-groups as you wish. You should note, however, that each wait task blocks one threadpool thread, so make sure you create a few extra threads in the pool (not usually a problem).
This seems like a job for boost::futures. The example in the docs seems to demonstrate exactly what you're looking to do.
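As a rough illustration of the futures approach, here is a sketch that uses C++11 std::future and std::async in place of boost futures (a substitution made here for portability, not what the answer links to). The function names square and sum_of_squares are hypothetical. Note that std::async is not a thread pool: the implementation decides how threads are created, so this shows only the "wait for all results" part.

```cpp
#include <future>
#include <vector>

// Hypothetical stand-in for one heavy task.
int square(int x) { return x * x; }

// Launches one asynchronous task per value 1..n and waits for all of them.
int sum_of_squares(int n) {
    std::vector<std::future<int>> results;
    for (int i = 1; i <= n; ++i)
        results.push_back(std::async(std::launch::async, square, i));

    int sum = 0;
    for (auto& f : results)
        sum += f.get();   // get() blocks until that particular task has finished
    return sum;           // reached only after every task is complete
}
```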
Joining a thread means waiting until it stops, and once it has stopped you must create a new thread to assign it a new task. So in your case you should wait on a condition (for example boost::condition_variable) that indicates the end of the tasks. With this technique it is very easy to implement using boost::asio and boost::condition_variable: each thread calls boost::asio::io_service::run, tasks are scheduled and executed on different threads, and at the end each task sets a boost::condition_variable or even decrements a std::atomic to indicate the end of the job. That's really easy, isn't it?
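A minimal sketch of that idea, assuming standard C++11 threads and condition variables in place of boost::asio and boost::condition_variable. The class and method names ThreadPool, pushTask and join follow the question; everything else (member names, run_rounds) is hypothetical. Here join() waits for the outstanding tasks only and leaves the worker threads running, so the pool can be reused for the next round.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

class ThreadPool {
public:
    explicit ThreadPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { workerLoop(); });
    }
    ~ThreadPool() {
        { std::lock_guard<std::mutex> lock(m_); stopping_ = true; }
        workAvailable_.notify_all();
        for (auto& t : workers_) t.join();   // threads end only when the pool dies
    }
    void pushTask(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            tasks_.push_back(std::move(task));
            ++unfinished_;
        }
        workAvailable_.notify_one();
    }
    // Blocks until every pushed task has finished; the threads stay alive.
    void join() {
        std::unique_lock<std::mutex> lock(m_);
        allDone_.wait(lock, [this] { return unfinished_ == 0; });
    }
private:
    void workerLoop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                workAvailable_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop_front();
            }
            task();                          // run outside the lock
            std::lock_guard<std::mutex> lock(m_);
            if (--unfinished_ == 0) allDone_.notify_all();
        }
    }
    std::mutex m_;
    std::condition_variable workAvailable_, allDone_;
    std::deque<std::function<void()>> tasks_;
    std::size_t unfinished_ = 0;             // queued plus currently running tasks
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Mirrors the loop from the question: several rounds of push-then-join.
int run_rounds(int rounds, int tasksPerRound) {
    std::atomic<int> total{0};
    ThreadPool pool(4);
    for (int r = 0; r < rounds; ++r) {
        for (int i = 0; i < tasksPerRound; ++i)
            pool.pushTask([&total] { ++total; });
        pool.join();                         // results of this round are now complete
    }
    return total.load();
}
```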
Is there any possible way to achieve this?
For instance, I have an I/O completion port that 10 worker threads are pulling tasks out of. Each task is associated with an object. Some objects cannot be worked on concurrently, so if one thread is working with one of these objects and a second thread pulls out a task that requires this object, the second thread has to wait for the first to complete.
As a workaround, objects could have an event that gets signaled upon release. If a thread is 'stuck' because the task it received requires a locked object, it could wait either for the locked object to be released or for a new task to be queued. If it picks up a new task, it pushes the task it couldn't work on back into the queue.
I am aware of alternative approaches, but this seems like functionality that should exist. Can this be achieved with Windows API?
Change your design.
Add an internal task queue to the object. Then when a task is posted to the IOCP have the IOCP thread place the task in the object's task queue and, if no other thread is "processing" tasks for this object have this IOCP thread mark the object as being processed and begin processing the task; (lock per object queue, add task, check if we should be the processing thread, unlock the queue) and either process the task in the object or return to the IOCP.
When another thread has a task for the same object it also goes through the same process. Note that the thread processing the object DOES NOT hold a lock on the object's task queue so the new IOCP thread can add the task to the object's queue and then see that a thread is already processing and simply return to the IOCP.
Once the thread has finished the current task it checks the object's task queue again and either continues processing the next task, or, if the queue is empty, marks the object as not processing and returns to the IOCP.
This prevents you blocking IOCP threads on tasks which can't yet run and maintains locality of data to the thread that happens to be processing at the time.
The one potential issue is that some always-busy objects can starve others, but you can avoid this by counting how many tasks you have processed and, if the count exceeds a tunable maximum, pushing the next task back into the IOCP so that other objects have a chance.
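The per-object queue described in this answer can be sketched in portable C++11 as follows. This is an illustration under assumptions, not IOCP code: plain threads stand in for the IOCP worker threads, the names SerializedObject, submit and run_serialized are hypothetical, and the fairness limit from the last paragraph is omitted for brevity.

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

class SerializedObject {
public:
    // Called by any pool thread. It never blocks waiting for the object:
    // the caller either becomes the processing thread or returns at once.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            queue_.push_back(std::move(task));
            if (processing_) return;     // another thread is already processing
            processing_ = true;          // this thread becomes the processor
        }
        for (;;) {
            std::function<void()> next;
            {
                std::lock_guard<std::mutex> lock(m_);
                if (queue_.empty()) {    // nothing left: release the object
                    processing_ = false;
                    return;
                }
                next = std::move(queue_.front());
                queue_.pop_front();
            }
            next();                      // run without holding the queue lock
        }
    }
private:
    std::mutex m_;
    std::deque<std::function<void()>> queue_;
    bool processing_ = false;
};

// Several threads post tasks for one object; returns the final counter value.
int run_serialized(int n_threads, int tasks_per_thread) {
    SerializedObject object;
    int counter = 0;                     // not atomic on purpose: tasks never overlap
    std::vector<std::thread> pool;
    for (int t = 0; t < n_threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < tasks_per_thread; ++i)
                object.submit([&counter] { ++counter; });
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

The empty-queue check and the clearing of the processing flag happen under the same lock as the push, so a task added by another thread is always seen either by the current processor or by the thread that added it.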
The ideal solution is to have a thread wait for the event and post to the completion port when the event occurs. Alternatively, have a thread wait for the event and just handle it. If you have two fundamentally different things you need to do, use two threads to do them.
Is there any way to re-initialize a thread without killing it? I want to reuse the existing threads, but have them start again from the beginning.
Create a class that manages a thread.
In the run method of this class have it wait until some work is assigned to the class in the form of a function pointer or some other class that implements a "work" interface.
Once work is assigned, the thread can stop waiting and execute the work.
Once the work is complete the thread sits and waits until more work is assigned to it.
This allows you to keep the thread running and waiting for work, without having to recreate it when new work comes along.
What you are asking for can only be achieved by the logic of your thread function. The thread library/operating system does not know about your logic and cannot possibly know where you want it to go on reinitialization.
Also note that while you can achieve something similar by canceling and starting the thread, thread cancellation is quite often dangerous (you might leak resources) if even possible (thread must hit a cancellation point) and should be avoided in most cases. So you are back at square one: implement logic in the function to detect the event and restart with whatever definition of start you want to use.
You could have two events: restart and stop. Your thread function would wait in a loop for any of them. If it detects restart, it would perform the task and go back waiting for events. If it detects stop, it would simply return.
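A minimal sketch of such a loop in portable C++11, assuming a condition variable in place of the two event objects. The names RestartableWorker and run_restarts are hypothetical. One deliberate difference from real events: restart requests are counted rather than collapsed into a single flag, so every request issued before stop() is honored.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

class RestartableWorker {
public:
    explicit RestartableWorker(std::function<void()> work) : work_(std::move(work)) {
        thread_ = std::thread([this] { run(); });   // started once, reused afterwards
    }
    ~RestartableWorker() { stop(); }

    void restart() {                                // the "restart" event
        { std::lock_guard<std::mutex> lock(m_); ++restarts_; }
        cv_.notify_one();
    }
    void stop() {                                   // the "stop" event
        { std::lock_guard<std::mutex> lock(m_); stop_ = true; }
        cv_.notify_one();
        if (thread_.joinable()) thread_.join();
    }
private:
    void run() {
        for (;;) {
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return stop_ || restarts_ > 0; });
                if (restarts_ == 0) return;         // only "stop" is pending
                --restarts_;
            }
            work_();                                // start again from the beginning
        }
    }
    std::function<void()> work_;
    std::mutex m_;
    std::condition_variable cv_;
    int restarts_ = 0;
    bool stop_ = false;
    std::thread thread_;
};

// Requests n restarts of a trivial task, then stops; returns the number of runs.
int run_restarts(int n) {
    int runs = 0;                 // touched only by the worker until stop() joins it
    RestartableWorker worker([&runs] { ++runs; });
    for (int i = 0; i < n; ++i) worker.restart();
    worker.stop();
    return runs;
}
```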
I have a system where my singleton class spawns a thread to do a calculation. If the user requests another calculation while one is still running, I want it to tear down the existing thread and start a new one. But it should wait for the first thread to exit completely before proceeding. I have all the teardown working, but I seem to have an issue with making sure that only one thread runs. My approach is for the StartCalculation function to call mutex->Lock(), and for the thread to release the lock in its destructor. It's not working. Am I right in assuming that if Lock() can't get the lock, it spins and keeps trying to reacquire the lock? Can this Lock() be called from my main application thread? Any ideas would be helpful. Maybe wxMutex locks are the right mechanism for this.
To wait for a thread you need to create it joinable and simply use wxThread::Wait(). However I agree with the remark above: this is not something you'd normally do at all and definitely not from the main GUI thread as you should never block in it because this freezes the UI.
Consider using a message queue to simply tell the existing thread about the new task it needs to perform instead.
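To make the message-queue suggestion concrete, here is a sketch with a hand-rolled queue in portable C++11 rather than any wxWidgets class. The names MessageQueue, post, close, receive and run_calculations are hypothetical. The posting side never blocks, so it would be safe to call from a GUI thread; only the calculation thread waits.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

template <typename T>
class MessageQueue {
public:
    void post(T msg) {                       // called by the main thread; never blocks
        { std::lock_guard<std::mutex> lock(m_); q_.push_back(std::move(msg)); }
        cv_.notify_one();
    }
    void close() {                           // no more messages will be posted
        { std::lock_guard<std::mutex> lock(m_); closed_ = true; }
        cv_.notify_all();
    }
    // Blocks until a message arrives; returns false once closed and drained.
    bool receive(T& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop_front();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<T> q_;
    bool closed_ = false;
};

// One long-lived calculation thread handles every request in order.
std::vector<int> run_calculations(const std::vector<int>& requests) {
    MessageQueue<int> queue;
    std::vector<int> results;                // written only by the calculation thread
    std::thread calc([&] {
        int request;
        while (queue.receive(request))
            results.push_back(request * request);   // stand-in for the calculation
    });
    for (int r : requests) queue.post(r);    // tell the existing thread about new work
    queue.close();
    calc.join();                             // done here only so the demo can return
    return results;
}
```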