I am trying to use the multithreading features in the C++11 standard library and have the following situation envisioned.
I have a parent class which maintains a queue of threads. So something like:
std::queue<MyMTObject *> _my_threads;
The class MyMTObject contains the std::thread object.
The queue has a fixed size of 5 and the class initially starts with the queue being full.
As I have jobs to process I launch threads and I remove them from the queue. What I would like is to get a notification when the job is finished along with the pointer to the MyMTObject, so that I can reinsert them into the queue and make them available again.
I have basically 2 questions:
1: Is this a sound idea? I know I have not specified specifics but broadly speaking. I will, of course, control all access to the queue with a mutex.
2: Is there a way to implement this notification mechanism without using external libraries like Qt or boost?
For duplicates, I did look on the site but could not find anything that was suitable to manage a collection of threads.
I'm not sure if I need to mention this, but std::thread objects can't be re-used. Generally, the only reason you keep a std::thread reference is to std::thread::join the thread. If you don't plan to join the thread later (e.g. dispatch to threads and wait for completion), it's generally advised to std::thread::detach it.
If you're trying to keep threads for a thread pool, it's probably easier to have each thread block on the std::queue and pull objects from the queue to work on. This is relatively easy to implement using a std::mutex and a std::condition_variable. It generally gives good throughput, but to get finer control over scheduling you can do things like keep a separate std::queue for each thread.
Detaching the threads and creating a work queue also has the added benefit that it avoids redundantly requesting the operating system create new threads which adds overhead and increases overall resource usage.
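To make the work-queue idea concrete, here is a minimal sketch of workers blocking on a shared queue guarded by a std::mutex and a std::condition_variable. The names WorkQueuePool, push and sum_with_pool are made up for illustration, and this variant joins its workers in the destructor rather than detaching them, which is an assumption on my part and not something the answer prescribes.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool: workers block on one shared queue of jobs.
class WorkQueuePool {
public:
    explicit WorkQueuePool(std::size_t n) {
        assert(n > 0);
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Lets the workers drain the remaining jobs, then joins them.
    ~WorkQueuePool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            done_ = true;
        }
        cv_.notify_all();
        for (auto& t : workers_) t.join();
    }

    void push(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(m_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(m_);
                // Sleep until there is a job or the pool is shutting down.
                cv_.wait(lock, [this] { return done_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // shutting down and nothing left
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run the job without holding the lock
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool done_ = false;
    std::vector<std::thread> workers_;
};

// Usage: add the numbers 1..n on three worker threads.
int sum_with_pool(int n) {
    std::atomic<int> sum{0};
    {
        WorkQueuePool pool(3);
        for (int i = 1; i <= n; ++i)
            pool.push([&sum, i] { sum += i; });
    }  // destructor drains the queue and joins the workers
    return sum;
}
```

The threads are created once and reused for every job, so nothing has to be put back into a queue of thread objects when a job finishes.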
I think you could use some version of the Reactor pattern. You start one additional control thread that cleans up after the workers, and you create a ThreadSafeQueue that is used to communicate events from the worker threads to the control thread. This queue should be implemented in such a way that you can select on it and wait for any activity on the other end (for example, some thread terminates and calls queue.push).
All in all I think it's quite an elegant solution. It does add the overhead of an additional thread, but this thread will be mostly sleeping, waking up only once in a while to clean up after a worker.
There is no elegant way to do this in POSIX, and the C++ threading model is almost a thin wrapper over POSIX.
You can join a specific thread (one at a time), or you can wait on futures - again, one future at a time.
The best you can do to avoid looping is to employ a condition variable and make all threads signal on it (as well as indicating which one just exited by setting some sort of per-thread flag) just before they are about to exit. The 'reaper' would notice the signal and check the flags.
The issue is that this solution requires thread cooperation, but I don't know of anything better.
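The reaper idea could look roughly like the sketch below. ExitBoard, worker and reap_all are hypothetical names, and the per-thread flag is simply one entry in a vector guarded by the same mutex as the condition variable.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Shared state: one "finished" flag per worker, all guarded by one mutex.
struct ExitBoard {
    std::mutex m;
    std::condition_variable cv;
    std::vector<bool> finished;
    explicit ExitBoard(std::size_t n) : finished(n, false) {}
};

// Each worker does its job, then raises its own flag and signals the reaper.
void worker(ExitBoard& board, std::size_t id) {
    // ... the real job would run here ...
    {
        std::lock_guard<std::mutex> lock(board.m);
        board.finished[id] = true;
    }
    board.cv.notify_one();
}

// The reaper sleeps until some flag is set, then joins exactly those threads.
// Returns the worker ids in the order they were collected.
std::vector<std::size_t> reap_all(std::size_t n) {
    assert(n > 0);
    ExitBoard board(n);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < n; ++i)
        threads.emplace_back(worker, std::ref(board), i);

    std::vector<std::size_t> order;
    std::unique_lock<std::mutex> lock(board.m);
    while (order.size() < n) {
        board.cv.wait(lock, [&] {
            for (bool f : board.finished)
                if (f) return true;
            return false;
        });
        for (std::size_t i = 0; i < n; ++i) {
            if (!board.finished[i]) continue;
            board.finished[i] = false;  // each worker sets its flag only once
            lock.unlock();
            threads[i].join();          // returns quickly: the worker is exiting
            lock.lock();
            order.push_back(i);         // here the object could be made available again
        }
    }
    return order;
}
```

The reaper learns which worker finished from the flag index, which is the piece of information the original question wanted delivered along with the notification.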
I'm trying to create a task manager that accepts tasks and runs each task as a new thread, using C++ and (currently) std::thread in a Linux environment.
The task manager accepts normal tasks and priority tasks.
When a priority task arrives, all normal tasks need to be halted until the priority task is done.
I'm keeping all normal task threads in a std::vector, but I couldn't find a proper function to halt those threads.
Is there a way, preferably not using locks, to implement the wanted behavior?
Maybe with <pthread.h> or boost threads?
There is no direct way to interrupt a thread from the outside.
Boost interruption points are handy for stopping things once and for all, but that's not equivalent to a pause.
I would suggest implementing your own "interruption" class with a condition variable (and yes, a mutex) to check and wait efficiently anywhere inside your tasks. But it is up to you to call these interruption points explicitly.
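A minimal sketch of such an "interruption" class follows. PauseGate, checkpoint and run_normal_task_behind_priority are names I made up; the normal task has to call checkpoint() itself at points where it is safe to stop, exactly as described above.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical cooperative pause point shared by all normal tasks.
class PauseGate {
public:
    void pause() {
        std::lock_guard<std::mutex> lock(m_);
        paused_ = true;
    }
    void resume() {
        {
            std::lock_guard<std::mutex> lock(m_);
            paused_ = false;
        }
        cv_.notify_all();
    }
    // Called by tasks; returns immediately unless the gate is paused.
    void checkpoint() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !paused_; });
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool paused_ = false;
};

// Usage: a normal task makes no progress until the priority work is done.
// Returns how many steps ran after the priority work had finished.
int run_normal_task_behind_priority(int steps) {
    assert(steps >= 0);
    PauseGate gate;
    std::atomic<bool> priority_done{false};
    std::atomic<int> steps_after_priority{0};

    gate.pause();  // a priority task has arrived
    std::thread normal([&] {
        for (int i = 0; i < steps; ++i) {
            gate.checkpoint();  // blocks here while the gate is paused
            if (priority_done) ++steps_after_priority;
        }
    });
    priority_done = true;  // stands in for the actual priority work
    gate.resume();
    normal.join();
    return steps_after_priority;
}
```

The cost of an unpaused checkpoint is one uncontended mutex lock, so it can be called fairly often inside a task loop.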
Maybe another way would be to make your priority tasks multithreadable so that you can allocate more threads to them; the scheduler is then more likely to complete them first, but that's not guaranteed, so forget what I said.
Sorry, I don't know of anything better than this.
When implementing a service, etc, there may be a need for fire-and-forget functionality where one creates a thread and leaves it to its own devices. However, one needs to keep the std::thread object around somewhere to prevent it going out of scope, but when the thread completes there is no neat delete this support, and even if there was, that's going to be a problem for non-pointer allocations. Similarly, higher-level libraries may have Timer objects, where a one-shot timer may be fired off but needs to be cleaned up when done.
One can perhaps keep a collection of std::thread and Timer objects, and every so often go through the list and delete the finished objects, but that seems bothersome. Is there some useful idiom for managing these kinds of temporaries?
My immediate solution has been to use a combination of std::mutex and std::atomic to get my service to return BUSY so I only ever have one thread around, but that feels like a code smell.
This is what std::thread::detach() is for: if you want the thread to be "fire and forget", call detach on it after creating it. This causes the std::thread object to no longer refer to the actual thread of execution, so you can destroy the std::thread and it has no effect on the execution.
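A short sketch of detach in practice. fire_and_forget is a hypothetical helper, and the std::promise is only there so that the caller can observe the result; a truly fire-and-forget thread would not need it.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Starts a detached thread. The temporary std::thread object is destroyed
// at the end of the statement while the thread itself keeps running.
std::future<int> fire_and_forget(int input) {
    std::promise<int> result;
    std::future<int> fut = result.get_future();
    std::thread(
        [](std::promise<int> p, int x) { p.set_value(x * 2); },
        std::move(result), input)
        .detach();
    return fut;
}

// Usage: the caller keeps only the future, never a std::thread.
void example() {
    std::future<int> answer = fire_and_forget(21);
    assert(answer.get() == 42);
}
```

Note that a detached thread must not touch objects that may already be destroyed when it runs, which is why the promise is moved into the thread instead of being captured by reference.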
If you have a lot of fire-and-forget functionality, consider using a thread pool. Creating and destroying threads has overhead, and a lot of them may eat your RAM at run time. A thread pool starts X threads ahead of time and runs every "fire-and-forget" task on them for you.
A simple thread pool solved the problem for me.
std::mutex isn't for managing threads' resources; it's for synchronizing threads.
I'm working on a multi-thread scheduling assignment, which involves adding threads to a variety of queues and selecting the appropriate one to execute.
The pthread_cond_signal(&condition) command is completely asynchronous from what I can tell; it's simply thrown into memory and the first thread to find it with the appropriate pthread_cond_wait() will consume it.
However, say I have a vector of thread ids that have been pushed as the thread is created, ie:
threadIDVector1[0] = 3061099328
threadIDVector1[1] = 3077884736
...
threadIDVector2[0] = 3294747394
threadIDVector2[1] = 3384567393
...
etc.
And I wanted to send a signal specifically to the thread with an id that matches the appropriate element of a vector. I.e. the algorithm would be:
While (at least one threadVector is non-empty):
    Look at the first element in each vector
    Select the appropriate one to signal by some criteria
    Send a signal to ONLY that thread
    Complete the thread and remove from threadIDVectorX
Is there some way to execute the above, or some accepted standard for achieving the same result?
With a single shared condition variable, there is no way to "send" a signal to a specific thread, nor to know which of the waiting threads will be woken by the OS. It is entirely non-deterministic.
You could use the "multiple condition variable" solution as proposed in the comments. But my preferred solution to something like this is a pipe or socket pair. Have the thread doing the waking write something (like a single byte) to the pipe for the corresponding thread to signal it.
This has a lot of benefits in my book. First, it allows bidirectional communication. Your pseudocode loop at the end of your question seems to also want to remove a finished thread from the list, so you need to know when that thread is done. You could have another CV, or you could have the completing thread write a single byte back to the manager object before exiting. Much easier, I feel.
It also allows you to choose between blocking or nonblocking I/O, or to use synchronous multiplexing with select(2) or epoll(2). If you were not exiting from the worker threads, but instead wanted to reuse them, the notifying thread would need to know when they're ready for more work. Again, a CV would be fine here, but the file-descriptor approach allows the notifier to wait for all of the worker threads in a single select(2) call.
The last thing is that I find files simpler. pthreads are pretty complicated, and multithreading is already hard enough to get right. I find that files are easier to manage and reason about in a multithreaded context, making it easier to avoid locking or crashes.
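For comparison, the "multiple condition variable" alternative mentioned above can be sketched in standard C++ as follows. Slot and wake_in_order are hypothetical names, and the sketch assumes that order is a permutation of 0..n-1 so that every thread is signalled and joined exactly once.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// One mutex/condition-variable pair per worker, so a signal reaches
// only the intended thread.
struct Slot {
    std::mutex m;
    std::condition_variable cv;
    bool go = false;
};

// Wakes the workers one at a time in the given order and returns the
// order in which they actually ran.
std::vector<int> wake_in_order(const std::vector<int>& order) {
    const std::size_t n = order.size();
    std::vector<Slot> slots(n);
    std::vector<int> log;  // written by at most one woken worker at a time
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < n; ++i) {
        threads.emplace_back([&slots, &log, i] {
            Slot& s = slots[i];
            std::unique_lock<std::mutex> lock(s.m);
            s.cv.wait(lock, [&s] { return s.go; });
            log.push_back(static_cast<int>(i));
        });
    }
    for (int id : order) {
        assert(id >= 0 && static_cast<std::size_t>(id) < n);
        Slot& s = slots[id];
        {
            std::lock_guard<std::mutex> lock(s.m);
            s.go = true;
        }
        s.cv.notify_one();   // only thread `id` waits on this variable
        threads[id].join();  // "complete the thread" before picking the next
    }
    return log;
}
```

Because the manager joins each thread before signalling the next one, the workers run strictly in the chosen order. With pipes, the same loop would write a byte to the chosen worker's descriptor instead of calling notify_one.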
I need a threadpool for my application, and I'd like to rely on standard (C++11 or boost) stuff as much as possible. I realize there is an unofficial(!) boost thread pool class, which basically solves what I need, however I'd rather avoid it because it is not in the boost library itself -- why is it still not in the core library after so many years?
In some posts on this page and elsewhere, people suggested using boost::asio to achieve a threadpool like behavior. At first sight, that looked like what I wanted to do, however I found out that all implementations I have seen have no means to join on the currently active tasks, which makes it useless for my application. To perform a join, they send stop signal to all the threads and subsequently join them. However, that completely nullifies the advantage of threadpools in my use case, because that makes new tasks require the creation of a new thread.
What I want to do is:
ThreadPool pool(4);
for (...)
{
    for (int i=0;i<something;i++)
        pool.pushTask(...);
    pool.join();
    // do something with the results
}
Can anyone suggest a solution (except for using the existing unofficial thread pool on sourceforge)? Is there anything in C++11 or core boost that can help me here?
At first sight, that looked like what I wanted to do, however I found out that all implementations I have seen have no means to join on the currently active tasks, which makes it useless for my application. To perform a join, they send stop signal to all the threads and subsequently join them. However, that completely nullifies the advantage of threadpools in my use case, because that makes new tasks require the creation of a new thread.
I think you might have misunderstood the asio example:
IIRC (and it's been a while) each thread running in the thread pool has called io_service::run which means that effectively each thread has an event loop and a scheduler. To then get asio to complete tasks you post tasks to the io_service using the io_service::post method and asio's scheduling mechanism takes care of the rest. As long as you don't call io_service::stop, the thread pool will continue running using as many threads as you started running (assuming that each thread has work to do or has been assigned a io_service::work object).
So you don't need to create new threads for new tasks, that would go against the concept of a threadpool.
Have each task class derive from a Task that has an 'OnCompletion(task)' method/event. The threadpool threads can then call that after calling the main run() method of the task.
Waiting for a single task to complete is then easy. The OnCompletion() can perform whatever is required to signal the originating thread: signaling a condvar, queueing the task to a producer-consumer queue, calling SendMessage/PostMessage APIs, Invoke/BeginInvoke, whatever.
If an originating thread needs to wait for several tasks to all complete, you could extend the above and issue a single 'Wait task' to the pool. The wait task has its own OnCompletion to communicate the completion of other tasks and has a thread-safe 'task counter' (atomic ops or lock), set to the number of 'main' tasks to be issued. The wait task is issued to the pool first and the thread that runs it waits on a private 'allDone' condvar in the wait task. The 'main' tasks are then issued to the pool with their OnCompletion set to call a method of the wait task that decrements the task counter towards zero. When the task counter reaches zero, the thread that achieves this signals the allDone condvar. The wait task OnCompletion then runs and so signals the completion of all the main tasks.
Such a mechanism does not require the continual create/terminate/join/delete of threadpool threads, places no restriction on how the originating task needs to be signaled, and lets you issue as many such task-groups as you wish. You should note, however, that each wait task blocks one threadpool thread, so make sure you create a few extra threads in the pool (not usually any problem).
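The counter-plus-condvar core of the wait task can be sketched as below. TaskGroup, task_done and run_group are hypothetical names. As a simplification, the originating thread waits on the group directly instead of issuing a separate wait task, and plain std::thread objects stand in for pool threads.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical task group: a counter that reaches zero when all tasks
// have reported completion, plus an "all done" condition variable.
class TaskGroup {
public:
    explicit TaskGroup(int count) : remaining_(count) { assert(count >= 0); }

    // Called from each task's completion hook (the OnCompletion above).
    void task_done() {
        std::lock_guard<std::mutex> lock(m_);
        if (--remaining_ == 0) all_done_.notify_all();
    }

    // Blocks until every task in the group has called task_done().
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        all_done_.wait(lock, [this] { return remaining_ == 0; });
    }

private:
    std::mutex m_;
    std::condition_variable all_done_;
    int remaining_;
};

// Usage: run n trivial tasks and wait for the whole group.
// Returns how many tasks had completed when wait() returned.
int run_group(int n) {
    TaskGroup group(n);
    std::atomic<int> completed{0};
    std::vector<std::thread> tasks;
    for (int i = 0; i < n; ++i)
        tasks.emplace_back([&] {
            ++completed;        // the task body
            group.task_done();  // the completion hook
        });
    group.wait();
    const int seen = completed;
    for (auto& t : tasks) t.join();  // only tidies up the stand-in threads
    return seen;
}
```

In a real pool the join at the end disappears, since the pool threads stay alive and simply pick up the next task.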
This seems like a job for boost::futures. The example in the docs seems to demonstrate exactly what you're looking to do.
Joining a thread means waiting until it stops, and once it has stopped you must create a new thread to assign it a new task. So in your case you should instead wait on a condition (for example a boost::condition_variable) that indicates the end of the tasks. With this technique it is very easy to implement using boost::asio and boost::condition_variable. Each thread calls boost::asio::io_service::run, tasks are scheduled and executed on different threads, and at the end each task notifies a boost::condition_variable or even decrements a std::atomic to indicate the end of the job. That's really easy, isn't it?
I am developing a C++ application that needs to process a large amount of data. I am not in a position to partition the data so that multiple processes can handle each partition independently. I am hoping to get ideas on frameworks/libraries that can manage threads and work allocation among worker threads.
Thread management should include at least the following functionality.
1. Decide how many worker threads are required. We may need to provide a user-defined function to calculate the number of threads.
2. Create required number of threads.
3. Kill/stop unnecessary threads to reduce resource wastage.
4. Monitor the healthiness of each worker thread.
Work allocation should include the following functionality.
1. Using callback functionality, the library should get a piece of work.
2. Allocate the work to an available worker thread.
3. A master/slave configuration or a pipeline of worker threads should be possible.
Many thanks in advance.
Your question essentially boils down to "how do I implement a thread pool?"
Writing a good thread pool is tricky. I recommend hunting for a library that already does what you want rather than trying to implement it yourself. Boost has a thread-pool library in the review queue, and both Microsoft's concurrency runtime and Intel's Threading Building Blocks contain thread pools.
With regard to your specific questions, most platforms provide a function to obtain the number of processors. In C++0x this is std::thread::hardware_concurrency(). You can then use this in combination with information about the work to be done to pick a number of worker threads.
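As a small illustration of combining the processor count with information about the work, here is a hedged sketch. pick_worker_count and its fallback parameter are hypothetical; the only library call is std::thread::hardware_concurrency(), which is allowed to return 0 when the value cannot be determined.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>

// Picks a worker count: at most one thread per hardware thread, and never
// more threads than there are tasks. `fallback` is used when the runtime
// cannot report the hardware concurrency.
unsigned pick_worker_count(std::size_t task_count, unsigned fallback = 2) {
    assert(fallback > 0);
    unsigned hw = std::thread::hardware_concurrency();  // may be 0 if unknown
    if (hw == 0) hw = fallback;
    if (task_count < hw) hw = static_cast<unsigned>(task_count);
    return hw == 0 ? 1 : hw;  // keep at least one worker, even with no tasks
}
```

A user-defined policy, as asked for in the question, would replace the body of this function, for example by using more threads than cores when the tasks mostly wait on I/O.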
Since creating threads is actually quite time consuming on many platforms, and blocked threads do not consume significant resources beyond their stack space and thread info block, I would recommend that you just block worker threads with no work to do on a condition variable or similar synchronization primitive rather than killing them in the first instance. However, if you end up with a large number of idle threads, it may be a signal that your pool has too many threads, and you could reduce the number of waiting threads.
Monitoring the "healthiness" of each thread is tricky, and typically platform dependent. The simplest way is just to check that (a) the thread is still running, and hasn't unexpectedly died, and (b) the thread is processing tasks at an acceptable rate.
The simplest means of allocating work to threads is just to use a single shared job queue: all tasks are added to the queue, and each thread takes a task when it has completed the previous task. A more complex alternative is to have a queue per thread, with a work-stealing scheme that allows a thread to take work from others if it has run out of tasks.
If your threads can submit tasks to the work queue and wait for the results then you need to have a scheme for ensuring that your worker threads do not all get stalled waiting for tasks that have not yet been scheduled. One option is to spawn a new thread when a task gets blocked, and another is to run the not-yet-scheduled task that is blocking a given thread on that thread directly in a recursive manner. There are advantages and disadvantages with both these schemes, and with other alternatives.