Setting thread priorities from the running process - C++

I've just come across the Get/SetThreadPriority methods and they got me wondering - can a thread priority meaningfully be set higher than the owning process priority (which I don't believe can be changed programmatically in the same way)?
Are there any pitfalls to using these APIs?

Yes, you can set the thread priority to any level, including one higher than the level implied by the current process's priority class. In fact, the two values are combined to determine the base priority of the thread. You can read about it in the Remarks section of the link you posted.
You can set the process priority using SetPriorityClass.
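For reference, here is a minimal sketch of the two Win32 calls used together; the particular priority constants chosen are just examples:

```cpp
#include <windows.h>
#include <iostream>

int main()
{
    // Raise the priority class of the whole process.
    if (!SetPriorityClass(GetCurrentProcess(), ABOVE_NORMAL_PRIORITY_CLASS))
        std::cerr << "SetPriorityClass failed: " << GetLastError() << '\n';

    // Raise the priority of the calling thread relative to that class.
    if (!SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST))
        std::cerr << "SetThreadPriority failed: " << GetLastError() << '\n';

    // ... latency-sensitive work goes here ...
}
```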
Now that we got the technicalities out of the way, I find little use for manipulating the priority of a thread directly. The OS scheduler is sophisticated enough to boost the priority of threads blocked in I/O over threads doing CPU computations (to the point that an I/O thread will preempt a CPU thread when the I/O interrupt arrives). In fact, even I/O threads are differentiated, with keyboard I/O threads getting a priority boost over file I/O threads for example.

On Windows, the thread and process priorities are combined using an algorithm that decides the overall scheduling priority:
Windows priorities
Pitfalls? Well:
Raising the priority of a thread is likely to give the greatest overall gain if the thread is usually blocked on I/O but must run ASAP after being signaled by its driver, e.g. video I/O that must process buffers quickly.
Raising the priority of threads is likely to have the greatest overall negative impact if they are CPU-bound and raised to a high priority, thus preventing normal-priority threads from running. If taken to extremes, OS threads and utilities like Task Manager will not run.

Related

A synchronization primitive with increased owner thread priority

I have a program where bursts sometimes happen, such that the threads would load the CPU above 100% if that were possible, but in reality they fight for the CPU. It is critical that a thread obtaining ownership of a synchronization primitive gets a higher priority than the other threads of the application, so as to prevent the case where a thread obtains ownership and then gets paused by the scheduler. Is there a suitable synchronization primitive in C++ (up to the latest draft) or WinAPI, or do I have to wrap the mutex-locking code in SetThreadPriority() calls?
This isn't actually a problem. If a thread that owns a synchronization primitive gets paused by the scheduler, it would only be because there were enough ready-to-run threads to keep all the cores busy. In that case, there's no particular reason to care which thread runs.
Threads that are waiting for the synchronization primitive aren't ready to run. So if you have four cores and the thread that holds the synchronization primitive isn't running, it can only be because there are four threads, all ready to run, that can make forward progress without holding the synchronization primitive. In that case, running those four threads is just as good as running the thread that holds the synchronization primitive.
I strongly urge you not to mess with thread priorities unless you really have no choice. Once you start messing with thread priorities, the argument above can stop holding because you can get issues like priority inversion. But if you don't mess with thread priorities, then you can't run into those kinds of issues and the scheduler will be smart enough to do the right thing 99% of the time. And trying to mess with priorities to get it to do the right thing that last 1% of the time will likely backfire.
The mechanism you are looking for is called a priority inheritance protocol. Pthreads offers support for this sort of configuration, and the idea is that if a high priority task is waiting for a resource held by a low priority task, the low priority task is boosted to that high priority until it relinquishes the resource.
Search for Liu and Layland, they wrote most of this up in the early 70s. As for C++, I am afraid it is a few versions away from 1973's state of the art.
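For what it's worth, on POSIX systems the priority-inheritance protocol is selected per mutex through a mutex attribute (std::mutex does not expose this setting). A minimal sketch, assuming the platform supports _POSIX_THREAD_PRIO_INHERIT:

```cpp
#include <pthread.h>
#include <cstdio>

// A mutex configured with the priority-inheritance protocol: a low-priority
// thread holding it is temporarily boosted to the priority of the highest
// priority thread blocked on it.
static pthread_mutex_t g_mutex;

static void init_pi_mutex()
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    if (pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT) != 0)
        std::fprintf(stderr, "PTHREAD_PRIO_INHERIT not supported\n");
    pthread_mutex_init(&g_mutex, &attr);
    pthread_mutexattr_destroy(&attr);
}
```

The mutex is then locked and unlocked with pthread_mutex_lock/pthread_mutex_unlock as usual; the boosting happens inside the implementation.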

Ensure that each thread gets a chance to execute in a given time period using C++11 threads

Suppose I have a multi-threaded program in C++11, in which each thread controls the behavior of something displayed to the user.
I want to ensure that for every time period T during which one of the threads of the given program has run, each thread gets a chance to execute for at least time t, so that the display looks as if all threads are executing simultaneously. The idea is to have a mechanism for round-robin scheduling with time sharing, based on some information stored in the thread, forcing a thread to wait after its time slice is over, instead of relying on the operating system scheduler.
Preferably, I would also like to ensure that each thread is scheduled in real time.
In case there is no way other than relying on the operating system, is there any solution for Linux?
Is it possible to do this? How?
No, that's not possible in a cross-platform way with C++11 threads. How often and for how long a thread runs isn't up to the application. It's up to the operating system you're using.
However, there are still platform-specific functions with which you can signal to the OS that a particular thread/process is especially important, and so influence the scheduling for your purposes.
You can acquire the platform-dependent thread handle in order to use OS functions:
native_handle_type std::thread::native_handle(); // (since C++11)
Returns the implementation-defined underlying thread handle.
I just want to stress again: this requires an implementation that is different for each platform!
Microsoft Windows
According to the Microsoft documentation:
SetThreadPriority function
Sets the priority value for the specified thread. This value, together with the priority class of the thread's process, determines the thread's base priority level.
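A minimal sketch of combining the two, assuming Microsoft's standard library where native_handle() yields the thread's Win32 HANDLE:

```cpp
#include <thread>
#include <windows.h>

void worker() { /* time-critical work */ }

int main()
{
    std::thread t(worker);

    // On MSVC std::thread::native_handle() yields the Win32 HANDLE of the
    // thread, so it can be passed straight to SetThreadPriority.
    SetThreadPriority(t.native_handle(), THREAD_PRIORITY_ABOVE_NORMAL);

    t.join();
}
```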
Linux/Unix
For Linux, things are more difficult because there are different policies under which threads can be scheduled. Microsoft Windows uses a priority-based system, but on Linux priority-based scheduling is not the default.
For more information, please take a look at this Stack Overflow question (it should be the same for std::thread because of this).
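A minimal sketch for Linux, assuming libstdc++/glibc where native_handle() yields a pthread_t; note that SCHED_FIFO normally requires root or the CAP_SYS_NICE capability:

```cpp
#include <thread>
#include <pthread.h>
#include <sched.h>
#include <cstdio>

void worker() { /* time-critical work */ }

int main()
{
    std::thread t(worker);

    // Switch the worker to the real-time FIFO policy via the POSIX API.
    sched_param sp{};
    sp.sched_priority = 10;  // valid range: sched_get_priority_min/max(SCHED_FIFO)
    int err = pthread_setschedparam(t.native_handle(), SCHED_FIFO, &sp);
    if (err != 0)
        std::fprintf(stderr, "pthread_setschedparam failed: %d\n", err);

    t.join();
}
```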
I want to ensure that for every time period T during which one of the threads of the given program have run, each thread gets a chance to execute for at least time t, so that the display looks as if all threads are executing simultaneously.
You are using threads to make it seem as though different tasks are executing simultaneously. That is not recommended for the reasons stated in Arthur's answer, to which I really can't add anything.
If, instead of having long-lived threads each doing its own task, the work can be expressed as tasks that execute without mutual exclusion, you can have a single queue of tasks and a thread pool dequeuing and executing them.
If you cannot, you might want to look into wait-free data structures and algorithms. In a wait-free algorithm/data structure, every thread is guaranteed to complete its work in a finite (and even specified) number of steps. I can recommend the book The Art of Multiprocessor Programming, where this topic is discussed at length. The gist of it is: every lock-free algorithm/data structure can be modified to be wait-free by adding communication between threads, over which a thread that's about to do work makes sure that no other thread is starved/stalled. Basically, you prefer fairness over the total throughput of all threads. In my experience this is usually not a good compromise.
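Going back to the queue-plus-thread-pool suggestion, here is a minimal sketch; the class and member names are illustrative only:

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Minimal task pool: worker threads dequeue and run tasks. If each task is
// kept short, no single job can monopolize a worker for long.
class TaskPool {
public:
    explicit TaskPool(unsigned n = std::thread::hardware_concurrency()) {
        if (n == 0) n = 2;  // hardware_concurrency() may return 0
        for (unsigned i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }

    ~TaskPool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            done_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }

    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
                if (done_ && tasks_.empty()) return;
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // execute outside the lock
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    std::vector<std::thread> workers_;
    bool done_ = false;
};
```

Each long-running job is then posted as a series of short tasks, which gives every job a chance to make progress without touching thread priorities.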

Latency priority changes being applied to a thread

I would like to write a program, where several worker threads should process different tasks with different priorities. Large tasks would be processed with low priority and small tasks with a very high priority.
In a perfect world I would simply set a different priority for each kind of task, but since there are more task types than priority levels available on Windows, I think I have to set the thread priorities dynamically.
I think there should be a main thread with the highest priority, working as a kind of scheduler that sets the priorities of the worker threads dynamically. But I wonder what actually happens on Windows when I call SetThreadPriority(), and especially how quickly the priority change is taken into account by the OS.
Ideally I need to boost the priority of a 'small task thread' within < 1 ms. Is this possible? And is there any way to change the latency of the OS (if there is any) reacting on the priority change?
The Windows dispatcher (scheduler) is not a single process/thread; it is spread across the kernel. The dispatcher is generally triggered by the following events:
1. A thread becomes ready for execution
2. A thread leaves the running state (e.g. quantum expires, enters a wait state, or terminates)
3. A thread's priority changes (e.g. SetThreadPriority)
4. A thread's processor affinity changes
I need to boost the priority of a 'small task thread' within < 1 ms. Is this possible?
According to 3: Yes, the dispatcher will reschedule immediately.
Ref.: Windows Internals Tour: Windows Processes, Threads and Memory, Microsoft Academic Club 2011
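As an illustration of the boost-on-demand pattern the question describes (the handle, priorities and synchronization are placeholders, not a definitive design):

```cpp
#include <windows.h>

// Hypothetical sketch: the scheduling thread boosts a worker when a small,
// latency-sensitive task arrives and restores the priority afterwards.
// The dispatcher acts on the change at the next scheduling decision.
void run_small_task_boosted(HANDLE worker)
{
    SetThreadPriority(worker, THREAD_PRIORITY_HIGHEST);       // boost
    // ... hand the small task to the worker and wait until it reports done ...
    SetThreadPriority(worker, THREAD_PRIORITY_BELOW_NORMAL);  // restore
}
```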

Scheduling of Process(s) waiting for Semaphore

It is always said that when the count of a semaphore is 0, processes requesting the semaphore are blocked and added to a wait queue.
When some process releases the semaphore and the count increases from 0 to 1, a blocked process is activated. This can be any process, randomly picked from the blocked processes.
Now my question is:
If they are added to a queue, why is the activation of blocked processes NOT in FIFO order? I think it would be easy to pick the next process from the queue rather than picking a process at random and granting it the semaphore. If there is some idea behind this random logic, please explain. Also, how does the kernel select a process at random from the queue? Picking a random element from a queue is awkward as far as the queue data structure is concerned.
Tags: various OSes, since each has a kernel (usually written in C/C++) and mutexes share a similar concept.
A FIFO is the simplest data structure for the waiting list in a system that doesn't support priorities, but it's not the absolute answer otherwise. Depending on the scheduling algorithm chosen, different threads might have different absolute priorities, or some sort of decaying priority might be in effect, in which case the OS might choose the thread which has had the least CPU time in some preceding interval. Since such strategies are widely used (particularly the latter), the usual rule is to consider that you don't know (although with absolute priorities, it will be one of the threads with the highest priority).
When a process is scheduled "at random", it's not that a process is randomly chosen; it's that the selection process is not predictable.
The algorithm used by Windows kernels is that there is a queue of threads (Windows schedules "threads", not "processes") waiting on a semaphore. When the semaphore is released, the kernel schedules the next thread waiting in the queue. However, scheduling the thread does not immediately make that thread start executing; it merely makes the thread able to execute by putting it in the queue of threads waiting to run. The thread will not actually run until a CPU has no threads of higher priority to execute.
While the thread is waiting in the scheduling queue, another thread that is actually executing may wait on the same semaphore. In a traditional queue system, that new thread would have to stop executing and go to the end of the queue waiting in line for that semaphore.
In recent Windows kernels, however, the new thread does not have to stop and wait for that semaphore. If the thread that was assigned the semaphore is still sitting in the run queue, the semaphore may be reassigned to the new thread, causing the old thread to go back to waiting on the semaphore again.
The advantage of this is that the thread that was about to have to wait in the queue for the semaphore and then wait in the queue to run will not have to wait at all. The disadvantage is that you cannot predict which thread will actually get the semaphore next, and it's not fair, so the thread that was waiting on the semaphore could potentially starve.
It is not that it CAN'T be FIFO; in fact, I'd bet many implementations ARE, for just the reasons that you state. The spec isn't that the process is chosen at random; it is that it isn't specified, so your program shouldn't rely on it being chosen in any particular way. (It COULD be chosen at random; just because it isn't the fastest approach doesn't mean it can't be done.)
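As an aside, if a program really needs FIFO wake-up order, it can be enforced in user code instead of being relied on from the kernel. A sketch of a ticket-ordered counting semaphore built only from standard C++ primitives (names are illustrative):

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Counting semaphore that grants permits strictly in the order acquire()
// was called, independent of how the OS wakes blocked threads.
class FifoSemaphore {
public:
    explicit FifoSemaphore(unsigned initial) : count_(initial) {}

    void acquire() {
        std::unique_lock<std::mutex> lock(m_);
        const std::uint64_t my_ticket = next_ticket_++;
        // Proceed only when it is this ticket's turn *and* a permit exists.
        cv_.wait(lock, [&] { return my_ticket == serving_ && count_ > 0; });
        --count_;
        ++serving_;
        cv_.notify_all();  // let the next ticket holder re-check its condition
    }

    void release() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++count_;
        }
        cv_.notify_all();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    unsigned count_;
    std::uint64_t next_ticket_ = 0;
    std::uint64_t serving_ = 0;
};
```

The cost of the strict ordering is extra wake-ups and the loss of the lock-stealing optimization described above, which is exactly the throughput/fairness trade-off the kernel makes in the other direction.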
All of the other answers here are great descriptions of the basic problem - especially around thread priorities and ready queues. Another thing to consider however is IO. I'm only talking about Windows here, since it is the only platform I know with any authority, but other kernels are likely to have similar issues.
On Windows, when an IO completes, something called a kernel-mode APC (Asynchronous Procedure Call) is queued against the thread which initiated the IO in order to complete it. If the thread happens to be waiting on a scheduler object (such as the semaphore in your example) then the thread is removed from the wait queue for that object which causes the (internal kernel mode) wait to complete with (something like) STATUS_ALERTED. Now, since these kernel-mode APCs are an implementation detail, and you can't see them from user mode, the kernel implementation of WaitForMultipleObjects restarts the wait at that point which causes your thread to get pushed to the back of the queue. From a kernel mode perspective, the queue is still in FIFO order, since the first caller of the underlying wait API is still at the head of the queue, however from your point of view, way up in user mode, you just got pushed to the back of the queue due to something you didn't see and quite possibly had no control over. This makes the queue order appear random from user mode. The implementation is still a simple FIFO, but because of IO it doesn't look like one from a higher level of abstraction.
I'm guessing a bit more here, but I would have thought that unix-like OSes have similar constraints around signal delivery and places where the kernel needs to hijack a process to run in its context.
Now this doesn't always happen, but the documentation has to be conservative, and unless the order is explicitly guaranteed to be FIFO (which, as described above, it can't be - for Windows at least), the ordering is described in the documentation as being "random" or "undocumented" or something similar, because in effect a random process controls it. It also gives the OS vendors latitude to change the ordering at some later time.
Process scheduling algorithms are very specific to system functionality and operating system design. It will be hard to give a good answer to this question. If I am on a general PC, I want something with good throughput and average wait/response time. If I am on a system where I know the priority of all my jobs and know I absolutely want all my high priority jobs to run first (and don't care about preemption/starvation), then I want a Priority algorithm.
As far as a random selection goes, the motivation could be for various reasons, one being an attempt at good throughput, etc., as mentioned above. However, it would be non-deterministic (hypothetically) and impossible to prove. This property could be an exploitation of probability (random samples, etc.), but, again, any proof could only be based on empirical data on whether this really works.

Pattern for realizing low priority background threads?

I have a (soft) realtime system which queries some sensor data, does some processing and then waits for the next set of sensor data. The sensor data are read in a receiver thread and put into a queue, so the main thread is "sleeping" (by means of a mutex) until the new data has arrived.
There are other tasks like logging or some long-term calculations in the background to do. These are implemented to run in other threads.
However, it is important that while the main thread processes the sensor data, it should have the highest priority, which means that the other threads should not consume any CPU resources at all if possible (currently the background threads cause the main thread to slow down in an unacceptable way).
According to Setting thread priority in Linux with Boost there is doubt that setting thread priorities will do the job. I am wondering how I can measure which effect setting thread priorities really has? (Platform: Angstrom Linux, ARM PC)
Is there a way to "pause" and "continue" threads completely?
Is there a pattern in C++ to maybe realize the pause/continue on my own? (I might be able to split the background work into small chunks and I could check after every chunk of work if I am allowed to continue, but the question is how big these chunks should be etc.)
Thanks for your thoughts!
Your problem is with the OS scheduler, not with C++. You need a real real-time scheduler that will block lower-priority threads while the higher-priority thread is running.
Most "standard" PC schedulers are not real-time. There's an RT scheduler for Linux - use it. Start by reading about SCHED_RR and SCHED_FIFO, and the nice command.
On many systems you'll have to spawn a task (using fork) to ensure the nice levels and the RT scheduler are actually effective; you'll have to read through the manuals of your system and figure out which scheduling modules you have and how they are implemented.
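As a sketch of what that can look like on Linux (SCHED_IDLE is Linux-specific, SCHED_FIFO needs root or the CAP_SYS_NICE capability, and with pid 0 sched_setscheduler affects the calling thread):

```cpp
#include <sched.h>
#include <cstdio>

// Called from inside the time-critical sensor thread: switch it to the
// real-time FIFO policy so SCHED_OTHER/SCHED_IDLE threads cannot preempt it.
void make_me_realtime(int prio = 50)
{
    sched_param sp{};
    sp.sched_priority = prio;                 // 1..99 for SCHED_FIFO
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        std::perror("sched_setscheduler(SCHED_FIFO)");
}

// Called from inside a background thread: SCHED_IDLE runs only when nothing
// else wants the CPU, which comes close to the "should not consume any CPU
// resources" requirement for the logging/long-term-calculation threads.
void make_me_background()
{
    sched_param sp{};                         // priority must be 0
    if (sched_setscheduler(0, SCHED_IDLE, &sp) != 0)
        std::perror("sched_setscheduler(SCHED_IDLE)");
}
```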
There is no portable way to set the priority with Boost::Thread. The reason is that different OSs have different APIs for setting the priority (e.g. Windows and Linux).
The best way to set the priority in a portable way is to write a wrapper around boost::thread with a uniform API that internally gets the thread's native_handle and then uses the OS-specific API (for example, on Linux you can use sched_setscheduler()).
You can see an example here:
https://sourceforge.net/projects/threadutility/
(code made by a student of mine, look at the svn repository)
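For illustration, such a wrapper might look like the sketch below (shown with std::thread, but boost::thread exposes native_handle() in the same way); the function name and the chosen priority values are arbitrary examples:

```cpp
#include <thread>

#ifdef _WIN32
  #include <windows.h>
#else
  #include <pthread.h>
  #include <sched.h>
#endif

// Uniform "raise this thread's priority" call built on native_handle().
inline bool raise_priority(std::thread& t)
{
#ifdef _WIN32
    return SetThreadPriority(t.native_handle(),
                             THREAD_PRIORITY_ABOVE_NORMAL) != 0;
#else
    sched_param sp{};
    sp.sched_priority = 10;                   // SCHED_FIFO range: 1..99
    return pthread_setschedparam(t.native_handle(), SCHED_FIFO, &sp) == 0;
#endif
}
```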