Too many Futures - concurrency

I have a function that takes 2 functions, one watches for a certain event and another one which actually does the work, each runs in a future, as soon as the event future returns failure caller thread signals worker future to stop. if the worker thread finishes before the event is received caller thread then signals watcher to stop and then caller returns.
This works fine but the thing is the worker function that does the actual work may/do needs to check for other events down the line, each time I need to watch for a certain event I fire 2 extra futures. The question is, is there a limit on the number of max futures running? I may end up with 60 futures at times? Will the thread pool grow as needed? and since they run on a thread pool I am assuming they are not too costly create?

It all depends on the Executor. Suppose you created a Executor with Executors.newFixedThreadPool(n) and submitted n * 100 tasks to it, there will be no more than n threads running.
If you are using Clojure's future function it will submit your tasks to clojure.lang.Agent.soloExecutor. The soloExecutor is created with Executors.newCachedThreadPool(threadFactory), so it will re-use threads and run as many as possible.

Related

How do I manage thread shutdown in a multithreaded crawler?

Let's say I'm writing a mulithreaded web crawler. Threads get a job (for example, in a form of URL) from a queue, do some work, and then might add some new jobs to the queue. Sounds simple enough, but I'm not sure how to handle the situation where all the jobs are done. Let's say there are currently 0 jobs in the queue, and some thread is trying to get a new job. At this point two situations are possible:
Some other threads are working and might actually produce new jobs for this thread to get. In this case, it is probably possible to just wait for a new task (with a blocking .pop(), if the queue supports it, or just by sleeping and waking up time to time to check if a job is available)
All other threads are also waiting for a job. In this case, no new jobs can be produced, so threads must be terminated.
One solution I can think of is having an integer (behind a mutex), which should serve as a number of "busy" threads - it will be increased when thread gets a job, and decreased once it is finished processing it. This way, if there is 0 jobs and 0 threads working, a thread can safely be terminated. However, I'm not sure it is the best solution possible. Are there any other options to handle such a situation?

boost thread pool

I need a threadpool for my application, and I'd like to rely on standard (C++11 or boost) stuff as much as possible. I realize there is an unofficial(!) boost thread pool class, which basically solves what I need, however I'd rather avoid it because it is not in the boost library itself -- why is it still not in the core library after so many years?
In some posts on this page and elsewhere, people suggested using boost::asio to achieve a threadpool like behavior. At first sight, that looked like what I wanted to do, however I found out that all implementations I have seen have no means to join on the currently active tasks, which makes it useless for my application. To perform a join, they send stop signal to all the threads and subsequently join them. However, that completely nullifies the advantage of threadpools in my use case, because that makes new tasks require the creation of a new thread.
What I want to do is:
ThreadPool pool(4);
for (...)
{
for (int i=0;i<something;i++)
pool.pushTask(...);
pool.join();
// do something with the results
}
Can anyone suggest a solution (except for using the existing unofficial thread pool on sourceforge)? Is there anything in C++11 or core boost that can help me here?
At first sight, that looked like what I wanted to do, however I found out that all implementations I have seen have no means to join on the currently active tasks, which makes it useless for my application. To perform a join, they send stop signal to all the threads and subsequently join them. However, that completely nullifies the advantage of threadpools in my use case, because that makes new tasks require the creation of a new thread.
I think you might have misunderstood the asio example:
IIRC (and it's been a while) each thread running in the thread pool has called io_service::run which means that effectively each thread has an event loop and a scheduler. To then get asio to complete tasks you post tasks to the io_service using the io_service::post method and asio's scheduling mechanism takes care of the rest. As long as you don't call io_service::stop, the thread pool will continue running using as many threads as you started running (assuming that each thread has work to do or has been assigned a io_service::work object).
So you don't need to create new threads for new tasks, that would go against the concept of a threadpool.
Have each task class derive from a Task that has an 'OnCompletion(task)' method/event. The threadpool threads can then call that after calling the main run() method of the task.
Waiting for a single task to complete is then easy. The OnCompletion() can perform whatever is required to signal the originating thread, signaling a condvar, queueing the task to a producer-consumer queue, calling SendMessage/PostMessage API's, Invoke/BeginInvoke, whatever.
If an oringinating thread needs to wait for several tasks to all complete, you could extend the above and issue a single 'Wait task' to the pool. The wait task has its own OnCompletion to communicate the completion of other tasks and has a thread-safe 'task counter', (atomic ops or lock), set to the number of 'main' tasks to be issued. The wait task is issued to the pool first and the thread that runs it waits on a private 'allDone' condvar in the wait task. The 'main' tasks are then issued to the pool with their OnCompletion set to call a method of the wait task that decrements the task counter towards zero. When the task counter reaches zero, the thread that achieves this signals the allDone condvar. The wait task OnCompletion then runs and so signals the completion of all the main tasks.
Such a mechansism does not require the continual create/terminate/join/delete of threadpool threads, places no restriction on how the originating task needs to be signaled and you can issue as many such task-groups as you wish. You should note, however, that each wait task blocks one threadpool thread, so make sure you create a few extra threads in the pool, (not usually any problem).
This seems like a job for boost::futures. The example in the docs seems to demonstrate exactly what you're looking to do.
Joining a thread mean stop for it until it stop, and if it stop and you want to assign a new task to it, you must create a new thread. So in your case you should wait for a condition (for example boost::condition_variable) to indicate end of tasks. So using this technique it is very easy to implement it using boost::asio and boost::condition_variable. Each thread call boost::asio::io_service::run and tasks will be scheduled and executed on different threads and at the end, each task will set a boost::condition_variable or event decrement a std::atomic to indicate end of the job! that's really easy, isn't it?

Threading in an endless C++ program

I have a web interface where the user submits some data and it gets written to a database. In the background there is a C++ program which periodically checks the database for new entries. It then takes these entries, processes them and writes their result to a directory. It then proceeds to sleep and keep checking for new entries to process.
My question is in regards to adding multithreading to the C++ program. I have read that it's generally a bad idea just to create a new thread every time you need a another job done, but rather add the jobs to a queue and disperse them out to a fixed number of threads that have already been created (say, 5 or so). Is this the proper design route to take for my situation? Also, if I understand pthread_join correctly, I don't actually need to call it because I don't want to wait for all of the jobs to finish before continuing to check for new updates to the database.
I just wanted to make sure I'm headed in the right direction, any affirmations/criticisms/resources?
You should first decide whether you even need more than one thread - it sounds like checking the database and writing files at some given interval can be accomplished using only one thread. Multiple threads would become useful when you start having to write different data to multiple files simultaneously at non-regular intervals. You are correct that using a queue of sorts would be the best way to distribute these 'jobs' to your threads, and that using a thread pool will give you a little more control over how many 'jobs' you want running simultaneously at any given time. The pthread_join method is used when you want to make sure one thread doesn't exit before another - I've used this mostly to make sure that the program's initial thread doesn't exit after creating the thread pool, as when the parent thread exits the program's execution stops. Some psuedo code based on my comments below.
main thread:
spawn child threads
while(some exit condition){
check database for new jobs
if(new jobs){
acquire job queue mutex //mutexes ensures only one thread accesses shared
add job to queue //data at a time
signal on shared condition variable
release job queue mutex
}
sleep(some regular duration)
}
child thread:
while(some exit condition){
acquire job queue mutex
if(job queue's size == 0){
wait on the shared condition variable
}
grab job from queue
release job queue mutex
handle job
}
See here for pthread/mutex/CV usage notes.
In my experience creating a thread will most likely take tens of milliseconds. For your days computers this is not a big deal. Nothing bad will happen if it will be created/destroyed often. Looking for simple and flawless app level design might be more important.
As a possible variant, I would recommend considering a pool of threads, one thread per available CPU core. These threads should simply sleep at the end of the loop and regularly check if there is something to do or not.
This simplistic design will add minimal overhead and allow using all available CPU power at the same time.
My 2 cents.

How can I wait on an I/O completion port and an event at the same time?

Is there any possible way to achieve this?
For instance, I have an I/O completion port that 10 worker threads are pulling tasks out of. Each task is associated with an object. Some objects cannot be worked on concurrently, so if one thread is working with one of these objects and a second thread pulls out a task that requires this object, the second thread has to wait for the first to complete.
As a work around, objects could have an event that gets signaled upon release. If a thread is 'stuck' because the task is received requires a locked object, it could wait on either the locked object to be released, or for a new task to be queued. If it picks up a new task, it will push the task it couldn't work on back into the queue.
I am aware of alternative approaches, but this seems like functionality that should exist. Can this be achieved with Windows API?
Change your design.
Add an internal task queue to the object. Then when a task is posted to the IOCP have the IOCP thread place the task in the object's task queue and, if no other thread is "processing" tasks for this object have this IOCP thread mark the object as being processed and begin processing the task; (lock per object queue, add task, check if we should be the processing thread, unlock the queue) and either process the task in the object or return to the IOCP.
When another thread has a task for the same object it also goes through the same process. Note that the thread processing the object DOES NOT hold a lock on the object's task queue so the new IOCP thread can add the task to the object's queue and then see that a thread is already processing and simply return to the IOCP.
Once the thread has finished the current task it checks the object's task queue again and either continues processing the next task, or, if the queue is empty, marks the object as not processing and returns to the IOCP.
This prevents you blocking IOCP threads on tasks which can't yet run and maintains locality of data to the thread that happens to be processing at the time.
The one potential issue is that you can have some always busy objects starving others but you can avoid this by simply checking how many tasks you have processed and it if exceeds a tunable max then pushing the next task to process back into the IOCP so that other objects have a chance.
The idea solution is to have a thread wait for the event and post to the completion port when the event occurs. Alternatively, have a thread wait for the event and just handle it. If you have two fundamentally different things you need to do, use two threads to do them.

Scheduling huge number of threads so only 4 are executed in parallel

As already stated in the title I have a large number of threads (probably much higher than 100) that are rather saving a program state than running. I want only few of them (enough to use all physical processors) to really run concurrent and the rest should wait until one of the running is blocked. When this happens a new one should be running.
Is it possible to achieve this with pthreads for example with the pthread scheduling functions? How would you do this?
Regards,
Nobody
EDIT
More Information:
Each thread fetches a job from the taskpool on its own and goes on to a certain point.
I need 100 threads to gather at that certain point of program execution that cannot be calculated in parallel. When the calculation is done the threads should be awakened and go on. To make this efficient I have to avoid the scheduler from wasting time on switching between 100 threads instead of 4.
Just use a semaphore with initial count of 4?
http://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_init.html
You could always launch 4 at a time, assigning them to a thread group, then waiting with a join all on the thread group. But I think more information is needed to develop a really useful answer.
Initialize a global variable to the number of threads to run concurrently.
When a thread wants to do work it obtains a slot. Using a mutex and condition variable, it waits until slots_available > 1. It then decrements slots_available releases the mutex and proceeds with its work.
When a thread has completed its work, it releases the slot by locking the mutex and incrementing slots_available. It signals all threads waiting on the condition variable so they can wake and see if slots_available > 1.
See https://computing.llnl.gov/tutorials/pthreads/#Mutexes for specific pthread library calls to use for the above.
I don't know how to do this with pthread functions, but I do have an idea:
I would implement this by adding some intelligence to the threadpool/taskpool to count the number of active threads and only make 4 - number of active threads available at any one time. This could be done by having an idle queue, a ready queue, and an active queue (or just active count). Tasks would grab from the ready queue, and the threadpool would only migrate tasks from the idle queue to the ready queue conditionally.