c++ multithreaded task queue for scheduled tasks - c++

I need to develop a module which will execute scheduled tasks.
Each task is scheduled to be executed within X milliseconds.
The module takes as a parameter an amount of worker threads to execute the tasks.
The tasks are piled up in a queue which will probably be a priority queue, so a thread checks for the next-in-queue task (the one with the lowest "redemption" time), thus there's no need to iterate through all tasks each time.
Is there any public library that does that or shall I roll my own?
Note: I'm using VC2008 on Windows.

If you don't mind a Boost dependency, threadpool might fit your needs.

Take a look at TBB - Intel Threading Building Blocks.

Just to add a little information to your question, what you're asking for is a real-time scheduler that uses the Earliest Deadline First algorithm. Also note that without OS support, you can't guarantee that your program will work in that X millisecond deadline you assign it. The OS could always decide to swap your task off its CPU in the middle of the job, making it take an unpredictably-long time to complete.
If your application critically depends on the task being done within the X milliseconds you set for it (or something blows up), you'll need to be running a real-time operating system, not regular Windows.
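The design described in the question (a priority queue ordered by due time, drained by a fixed number of worker threads) can be sketched roughly as follows. This is only a minimal illustration, not a library recommendation: the class name TimedScheduler and its schedule() method are made up for this example, and it assumes a C++11 standard library, which VC2008 does not ship (Boost.Thread offers close equivalents there). As noted above, it gives no real-time guarantee; a task runs no earlier than its due time, and possibly later.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch: runs each task no earlier than its due time.
class TimedScheduler {
public:
    using Clock = std::chrono::steady_clock;

    explicit TimedScheduler(std::size_t workers) {
        assert(workers > 0);
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] { worker_loop(); });
    }

    // Stops the workers; tasks that are not yet due are dropped.
    ~TimedScheduler() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : threads_) t.join();
    }

    void schedule(std::chrono::milliseconds delay, std::function<void()> fn) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(Item{Clock::now() + delay, std::move(fn)});
        }
        cv_.notify_one();  // a sleeping worker re-checks the earliest due time
    }

private:
    struct Item {
        Clock::time_point due;
        std::function<void()> fn;
        // std::priority_queue is a max-heap, so invert the comparison
        // to keep the earliest due time on top.
        bool operator<(const Item& other) const { return due > other.due; }
    };

    void worker_loop() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            if (stopping_) return;
            if (queue_.empty()) { cv_.wait(lock); continue; }
            const Clock::time_point due = queue_.top().due;
            if (Clock::now() < due) { cv_.wait_until(lock, due); continue; }
            std::function<void()> fn = queue_.top().fn;  // top() is const, so copy
            queue_.pop();
            lock.unlock();
            fn();  // run the task without holding the lock
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::priority_queue<Item> queue_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;  // declared last: workers use the members above
};
```

A caller would construct it with the desired worker count, for example TimedScheduler s(4), and then call s.schedule(std::chrono::milliseconds(250), task). Only the top of the heap is ever inspected, so there is no need to iterate through all tasks.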

Related

Ensure that each thread gets a chance to execute in a given time period using C++11 threads

Suppose I have a multi-threaded program in C++11, in which each thread controls the behavior of something displayed to the user.
I want to ensure that for every time period T during which one of the threads of the given program have run, each thread gets a chance to execute for at least time t, so that the display looks as if all threads are executing simultaneously. The idea is to have a mechanism for round robin scheduling with time sharing based on some information stored in the thread, forcing a thread to wait after its time slice is over, instead of relying on the operating system scheduler.
Preferably, I would also like to ensure that each thread is scheduled in real time.
In case there is no way other than relying on the operating system, is there any solution for Linux?
Is it possible to do this? How?
No, that is not possible in a cross-platform way with C++11 threads. How often and for how long a thread runs isn't up to the application. It's up to the operating system you're using.
However, there are OS functions with which you can flag a particular thread or process as especially important, so you can influence the timing to some degree for your purposes.
You can acquire the platform dependent thread handle to use OS functions.
native_handle_type std::thread::native_handle //(since C++11)
Returns the implementation defined underlying thread handle.
To stress this again: this requires an implementation that is different for each platform!
Microsoft Windows
According to the Microsoft documentation:
SetThreadPriority function
Sets the priority value for the specified thread. This value, together
with the priority class of the thread's process determines the
thread's base priority level.
Linux/Unix
On Linux things are more difficult because there are several different policies under which threads can be scheduled. Microsoft Windows uses a priority system, but on Linux this doesn't seem to be the default scheduling policy.
For more information, please take a look at this Stack Overflow question (it should apply to std::thread as well, because of this).
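As a rough Linux-only sketch of the native_handle approach: the helper below asks the kernel to schedule a thread under the real-time SCHED_FIFO policy. The function name request_fifo_scheduling is made up for this example, and the code assumes that std::thread::native_handle() returns a pthread_t, which holds for the common Linux standard libraries but is implementation defined. Without sufficient privileges the call is expected to fail with EPERM.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <pthread.h>
#include <sched.h>
#include <thread>

// Hypothetical helper: request SCHED_FIFO at its lowest priority for a
// running thread. Returns 0 on success, or an error number such as EPERM
// when the process is not allowed to use real-time policies.
int request_fifo_scheduling(std::thread& t) {
    assert(t.joinable());  // the handle is only valid for a live thread
    sched_param param{};
    param.sched_priority = sched_get_priority_min(SCHED_FIFO);
    // Assumption: native_handle() is a pthread_t on this platform.
    return pthread_setschedparam(t.native_handle(), SCHED_FIFO, &param);
}
```

The return value should always be checked. A thread that spins under a real-time policy can starve normally scheduled threads, so the thread function should block or sleep regularly.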
I want to ensure that for every time period T during which one of the threads of the given program have run, each thread gets a chance to execute for at least time t, so that the display looks as if all threads are executing simultaneously.
You are using threads to make it seem as though different tasks are executing simultaneously. That is not recommended for the reasons stated in Arthur's answer, to which I really can't add anything.
If, instead of long-lived threads that each do their own task, you can have a single queue of tasks that can be executed without mutual exclusion, you can use a thread pool that dequeues and executes those tasks.
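A minimal sketch of that arrangement, one shared queue and a fixed set of workers, might look like this. The class name SimplePool and its submit() method are invented for the example; this is not the API of any particular library.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch: one FIFO queue shared by all worker threads.
class SimplePool {
public:
    explicit SimplePool(std::size_t workers) {
        assert(workers > 0);
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] { worker_loop(); });
    }

    // Lets the workers drain the remaining tasks, then joins them.
    ~SimplePool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : threads_) t.join();
    }

    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void worker_loop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can dequeue
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;  // declared last: workers use the members above
};
```

Each submitted task runs on whichever worker becomes free first, so no thread owns a particular task and none needs to be time-sliced by hand.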
If you cannot, you might want to look into wait-free data structures and algorithms. In a wait-free algorithm/data structure, every thread is guaranteed to complete its work in a finite (and even specified) number of steps. I can recommend the book The Art of Multiprocessor Programming, where this topic is discussed at length. The gist of it is: every lock-free algorithm/data structure can be modified to be wait-free by adding communication between threads, through which a thread that's about to do work makes sure that no other thread is starved/stalled. Basically, this prefers fairness over the total throughput of all threads. In my experience this is usually not a good compromise.

task delegation scheduler

I implemented a task delegation scheduler instead of a task stealing scheduler. The basic idea of this method is that each thread has its own private local queue. Whenever a task is produced, before it gets enqueued, the queues are searched and the one with the minimum size is found by comparing the sizes of all the queues. The task is always enqueued to this minimum-size queue. This is a way of diverting the pressure of the work away from a busy thread's queue and delegating the jobs to the least busy thread's queue.
The problem with this scheduling technique is that we don't know how much time each task takes to complete, i.e. a queue may have the minimal count while its current task is still running for a long time, whereas a queue with a higher count may contain tasks that will all complete very soon. Any ideas to solve this problem?
I am working on Linux, in C++, in our own multithreading library, which implements a multi-rate synchronous data flow paradigm.
It seems that your scheduling policy doesn't fit the job at hand. Usually this type of naive-scheduling which ignores task completion times is only relevant when tasks are relatively equal in execution time.
I'd recommend doing some research. A good place to start would be Wikipedia's Scheduling article but that is of course just the tip of the iceberg.
I'd also give a second (and third) thought to the task-delegation requirement since timeslicing task operations allows you to fine grain queue management by considering the task's "history". However, if clients are designed so that each client consistently sends the same "type" of task, then you can achieve similar results with this knowledge.
As far as I remember from my Queueing Theory class the fairest (of them all;) system is the one which has a single queue and multiple servers. Using such system ensures the lowest expected average execution time for all tasks and the largest utilization factor (% of time it works, I'm not sure the term is correct).
In other words, unless you have some priority tasks, please reconsider your task delegation scheduler implementation.
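If the delegation scheme is kept anyway, one possible reading of the "history" suggestion above is to compare queues by estimated pending work rather than by task count. The sketch below is only an illustration under that assumption: the names QueueLoad, RuntimeEstimate and pick_least_loaded are hypothetical, and the exponential moving average is just one simple way to estimate run times per task type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bookkeeping kept for each worker's local queue.
struct QueueLoad {
    std::size_t pending_tasks;   // what the original scheme compares
    double estimated_seconds;    // sum of estimated run times of pending tasks
};

// Exponential moving average of observed run times for one task type.
class RuntimeEstimate {
public:
    explicit RuntimeEstimate(double initial_seconds, double alpha = 0.25)
        : avg_(initial_seconds), alpha_(alpha) {
        assert(alpha > 0.0 && alpha <= 1.0);
    }
    // Call after a task of this type has finished.
    void record(double observed_seconds) {
        avg_ = alpha_ * observed_seconds + (1.0 - alpha_) * avg_;
    }
    double seconds() const { return avg_; }

private:
    double avg_;
    double alpha_;
};

// Picks the queue with the least estimated work instead of the fewest tasks.
std::size_t pick_least_loaded(const std::vector<QueueLoad>& queues) {
    assert(!queues.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < queues.size(); ++i) {
        if (queues[i].estimated_seconds < queues[best].estimated_seconds)
            best = i;
    }
    return best;
}
```

With this, a queue holding five short tasks can be preferred over a queue holding one long task. The estimate is only as good as the assumption that tasks of the same type take similar time.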

Scheduling tasks in multithreads

I am trying to schedule tasks in multithreaded systems. My idea is to have a local queue per thread, so that each thread fetches jobs from its own local queue. But when a thread reaches some threshold, it should not fetch the job; rather, it should transfer the job to a thread which is below the threshold level.
My question is how to set the threshold for the threads.
An alternative arrangement is to give threads that have finished their own queue the ability to take work from the queues of others. This is better known as "Work Stealing" and is a well-known scheduling algorithm, e.g.
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.8905
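To make the idea concrete, here is a deliberately simple, mutex-based sketch of work stealing. It is not the lock-free deque used by production runtimes, and the names WorkQueue and next_task are invented for this example. The owner takes the newest task from the back of its own queue, while an idle thread steals the oldest task from the front of another queue.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

using Task = std::function<void()>;

// Hypothetical per-thread queue, protected by a plain mutex for clarity.
class WorkQueue {
public:
    void push(Task t) {
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push_back(std::move(t));
    }
    // Owner side: take the most recently pushed task.
    bool pop_local(Task& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (tasks_.empty()) return false;
        out = std::move(tasks_.back());
        tasks_.pop_back();
        return true;
    }
    // Thief side: take the oldest task from the opposite end.
    bool steal(Task& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (tasks_.empty()) return false;
        out = std::move(tasks_.front());
        tasks_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::deque<Task> tasks_;
};

// Thread `self` tries its own queue first, then every other queue in turn.
bool next_task(std::vector<WorkQueue>& queues, std::size_t self, Task& out) {
    assert(self < queues.size());
    if (queues[self].pop_local(out)) return true;
    for (std::size_t i = 1; i < queues.size(); ++i) {
        const std::size_t victim = (self + i) % queues.size();
        if (queues[victim].steal(out)) return true;
    }
    return false;  // nothing to do anywhere
}
```

No threshold has to be tuned with this arrangement: a thread only looks at other queues once its own queue is empty.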
What threading library are you using?
I use two OSS libraries in all of my threading projects: TBB and Cilk Plus. One feature that these higher-level runtimes provide is that they automatically schedule tasks onto threads in a way that makes efficient use of processor resources. The runtimes are also very effective at load balancing the many tasks.
www.threadingbuildblocks.org
www.cilkplus.org

How does a scheduler end a running process?

I just realized that after learning a lot about various scheduling algorithms, how a context switch is done, etc. one thing still isn't clear to me.
Take a uniprocessor system:
If process A is running and its time slot should end in 5 seconds, how does the scheduler or the operating system know how to end it after 5 seconds? No part of the operating system can run while A is running. The scheduler is supposed to be monitoring it, but how can it if it cannot run? Does the operating system's scheduler write an ISR and have an interrupt generate every 5 seconds? Is this possible? Even if it is, it doesn't seem a good way to implement it.
How exactly does a scheduler do this?
Does the operating system's scheduler write an ISR and have an interrupt generate every 5 seconds? Is this possible? Even if it is, it doesn't seem a good way to implement it.
Yes, this is exactly how it works on a preemptive multitasking system (although on desktop systems the interval is usually more like 10 milliseconds).
Yes, there are other schemes, such as cooperative multitasking, where each process decides for itself when to yield.
Yes, normally there is some kind of timer interrupt that fires. The kernel can then run for a bit and switch process context if it needs to - normally that interrupt would fire an awful lot more often than just once every 5 seconds though. Why doesn't it seem like a good way to implement it?
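The effect of such a timer interrupt can be illustrated with a small user-space simulation. This is only a model: a real kernel is entered by a hardware interrupt, not by a loop, and the function name simulate_round_robin and its parameters are made up for this sketch. Each process runs until it finishes or until the simulated timer fires after `quantum` ticks, at which point the scheduler moves it to the back of the ready queue.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// remaining[i] is the work left for process i, in timer ticks.
// Returns which process occupied the (single) CPU on each tick.
std::vector<std::size_t> simulate_round_robin(std::vector<int> remaining, int quantum) {
    assert(quantum > 0);
    std::deque<std::size_t> ready;
    for (std::size_t i = 0; i < remaining.size(); ++i)
        if (remaining[i] > 0) ready.push_back(i);

    std::vector<std::size_t> trace;
    while (!ready.empty()) {
        const std::size_t current = ready.front();
        ready.pop_front();
        // The process runs until it finishes or the timer "fires".
        for (int tick = 0; tick < quantum && remaining[current] > 0; ++tick) {
            --remaining[current];
            trace.push_back(current);
        }
        // Timer interrupt: the scheduler regains control and, if the process
        // is not done, preempts it and queues it again.
        if (remaining[current] > 0) ready.push_back(current);
    }
    return trace;
}
```

For example, with work amounts {3, 1, 2} and a quantum of 2 ticks, the CPU is occupied by processes 0, 0, 1, 2, 2, 0 in that order: process 0 is preempted after two ticks and resumes only after the others have had their turn.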

Thread pool for executing arbitrary tasks with different priorities

I'm trying to come up with a design for a thread pool with a lot of design requirements for my job. This is a real problem for working software, and it's a difficult task. I have a working implementation but I'd like to throw this out to SO and see what interesting ideas people can come up with, so that I can compare to my implementation and see how it stacks up. I've tried to be as specific to the requirements as I can.
The thread pool needs to execute a series of tasks. The tasks can be short running (<1sec) or long running (hours or days). Each task has an associated priority (from 1 = very low to 5 = very high). Tasks can arrive at any time while the other tasks are running, so as they arrive the thread pool needs to pick these up and schedule them as threads become available.
The task priority is completely independent of the task length. In fact it is impossible to tell how long a task could take to run without just running it.
Some tasks are CPU bound while some are greatly IO bound. It is impossible to tell beforehand what a given task would be (although I guess it might be possible to detect while the tasks are running).
The primary goal of the thread pool is to maximise throughput. The thread pool should effectively use the resources of the computer. Ideally, for CPU bound tasks, the number of active threads would be equal to the number of CPUs. For IO bound tasks, more threads should be allocated than there are CPUs so that blocking does not overly affect throughput. Minimising the use of locks and using thread safe/fast containers is important.
In general, you should run higher priority tasks with a higher CPU priority (ref: SetThreadPriority). Lower priority tasks should not "block" higher priority tasks from running, so if a higher priority task comes along while all low priority tasks are running, the higher priority task will get to run.
The tasks have a "max running tasks" parameter associated with them. Each type of task is only allowed to run at most this many concurrent instances of the task at a time. For example, we might have the following tasks in the queue:
A - 1000 instances - low priority - max tasks 1
B - 1000 instances - low priority - max tasks 1
C - 1000 instances - low priority - max tasks 1
A working implementation could only run (at most) 1 A, 1 B and 1 C at the same time.
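The selection rule implied by the priority and "max running tasks" requirements can be sketched as a pure function, independent of any threading details. Everything here is hypothetical and for illustration only: the names PendingTask and pick_runnable, the use of strings as task types, and the choice to treat a type without a configured cap as unlimited.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of a queued task.
struct PendingTask {
    std::string type;  // e.g. "A", "B", "C"
    int priority;      // 1 = very low ... 5 = very high
};

// Returns the index of the highest-priority pending task whose type is still
// below its concurrency cap, or -1 if nothing may run right now.
// Ties go to the task that was queued first.
int pick_runnable(const std::vector<PendingTask>& pending,
                  const std::map<std::string, int>& running,
                  const std::map<std::string, int>& max_running) {
    int best = -1;
    for (std::size_t i = 0; i < pending.size(); ++i) {
        const PendingTask& task = pending[i];
        assert(task.priority >= 1 && task.priority <= 5);

        const auto cap = max_running.find(task.type);
        const auto now = running.find(task.type);
        const int active = (now == running.end()) ? 0 : now->second;
        // Assumption: a type with no configured cap is unlimited.
        if (cap != max_running.end() && active >= cap->second) continue;

        if (best < 0 || task.priority > pending[static_cast<std::size_t>(best)].priority)
            best = static_cast<int>(i);
    }
    return best;
}
```

A pool would call something like this under its queue lock whenever a worker becomes free or a task arrives, then increment the running count for the chosen type and decrement it when the task completes.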
It needs to run on Windows XP, Server 2003, Vista and Server 2008 (latest service packs).
For reference, we might use the following interface:
namespace ThreadPool
{
    class Task
    {
    public:
        Task();
        void run();
    };

    class ThreadPool
    {
    public:
        ThreadPool();
        ~ThreadPool();
        void run(Task *inst);
        void stop();
    };
}
So what are we going to pick as the basic building block for this? Windows has two building blocks that look promising: I/O Completion Ports (IOCPs) and Asynchronous Procedure Calls (APCs). Both of these give us FIFO queuing without having to perform explicit locking, and with a certain amount of built-in OS support in places like the scheduler (for example, IOCPs can avoid some context switches).
APCs are perhaps a slightly better fit, but we will have to be slightly careful with them, because they are not quite "transparent". If the work item performs an alertable wait (::SleepEx, ::WaitForXxxObjectEx, etc.) and we accidentally dispatch an APC to the thread then the newly dispatched APC will take over the thread, suspending the previously executing APC until the new APC is finished. This is bad for our concurrency requirements and can make stack overflows more likely.
It needs to run on Windows XP, Server 2003, Vista and Server 2008 (latest service packs).
What feature of the system's built-in thread pools makes them unsuitable for your task? If you want to target XP and 2003 you can't use the new shiny Vista/2008 pools, but you can still use QueueUserWorkItem and friends.
@DrPizza - this is a very good question, and one that strikes right to the heart of the problem. There are a few reasons why QueueUserWorkItem and the Windows NT thread pool were ruled out (although the Vista one does look interesting, maybe in a few years).
Firstly, we wanted to have greater control over when it starts up and stops threads. We have heard that the NT thread pool is reluctant to start up a new thread if it thinks that the tasks are short running. We could use the WT_EXECUTELONGFUNCTION flag, but we really have no idea if the task is long or short.
Secondly, if the thread pool was already filled up with long running, low priority tasks, there would be no chance of a high priority task getting to run in a timely manner. The NT thread pool has no real concept of task priorities, so we can't do a QueueUserWorkItem and say "oh by the way, run this one right away".
Thirdly, (according to MSDN) the NT thread pool is not compatible with the STA apartment model. I'm not sure quite what this would mean, but all of our worker threads run in an STA.
@DrPizza - this is a very good question, and one that strikes right to the heart of the problem. There are a few reasons why QueueUserWorkItem and the Windows NT thread pool were ruled out (although the Vista one does look interesting, maybe in a few years).
Yeah, it looks like it got quite beefed up in Vista, quite versatile now.
OK, I'm still a bit unclear about how you wish the priorities to work. If the pool is currently running a task of type A with maximal concurrency of 1 and low priority, and it gets given a new task also of type A (and maximal concurrency 1), but this time with a high priority, what should it do?
Suspending the currently executing A is hairy (it could hold a lock that the new task needs to take, deadlocking the system). It can't spawn a second thread and just let it run alongside (the permitted concurrency is only 1). But it can't wait until the low priority task is completed, because the runtime is unbounded and doing so would allow a low priority task to block a high priority task.
My presumption is that it is the latter behaviour that you are after?
@DrPizza:
OK, I'm still a bit unclear about how
you wish the priorities to work. If
the pool is currently running a task
of type A with maximal concurrency of
1 and low priority, and it gets given
a new task also of type A (and maximal
concurrency 1), but this time with a
high priority, what should it do?
This one is a bit of a tricky one, although in this case I think I would be happy with simply allowing the low-priority task to run to completion. Usually, we wouldn't see a lot of the same types of tasks with different thread priorities. In our model it is actually possible to safely halt and later restart tasks at certain well defined points (for different reasons than this) although the complications this would introduce probably aren't worth the risk.
Normally, only different types of tasks would have different priorities. For example:
A task - 1000 instances - low priority
B task - 1000 instances - high priority
Assuming the A tasks had come along and were running, then the B tasks had arrived, we would want the B tasks to be able to run more or less straight away.