If a critical section lock is currently owned by a thread and other threads try to acquire the same lock, all of those other threads enter a wait queue until the lock is released.
When the owning thread releases the critical section lock, one of the threads in the wait queue is selected, given the lock, and allowed to run.
How is the next thread selected, given that the thread that arrived first is not guaranteed to become the next owner of the lock?
If threads are not served in FIFO order, how is the next owner chosen from the wait queue?
The next thread to get the critical section is chosen non-deterministically. The only thing that you should be concerned about is whether the critical section is implemented fairly, i.e., that no thread waits infinitely long to get its turn. If you need to run threads in specific order, you have to implement this yourself.
The next thread is chosen in quasi-FIFO order. However, many system-level factors can make the choice appear non-deterministic:
From Concurrent Programming On Windows by Joe Duffy: (Chapter 5)
... When a fixed number of threads needs to be awakened, the OS uses a semi-fair algorithm to choose between them: as threads wait they are placed in a FIFO queue that the awakening logic consults when determining which thread to wake up. Threads that have been waiting for the longest time are thus preferred over threads that have been waiting less time. Although the OS does use a strict FIFO data structure to manage wait lists; ... this ordering is regularly perturbed by other system code and is not reliable.
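POSIX threads leave the choice to the scheduling policy: when a mutex is unlocked, the policy determines which waiting thread acquires it, so FIFO order is not guaranteed in general.
What about the thread scheduling algorithm? Don't the waiting threads get priority according to the thread scheduling algorithm?
Please correct me if I am wrong.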
Application design
I have a C++ application that has a producer thread, multiple queues (created at run time) and a consumer thread.
The producer thread receives data via TCP/IP and puts it into the respective queue (e.g., data of type A goes into queue A).
The consumer thread currently loops over the queues from 1 to n to process the data from each queue.
As per the requirements, there is no need to track which queue was updated most or least recently. As long as any of the queues is updated, the consumer should process queues 1 to n.
If the size of any queue exceeds the defined limit, the producer thread pops the first item before it inserts the new item (to manage the queue size).
Resource synchronization and signaling between threads:
In this implementation, the consumer thread should sleep as long as no queue has data from the listener. It should wake up only when the producer puts data into any one of the queues.
The queues are synchronized between the 2 threads using a mutex.
Event signaling is implemented between the threads to wake up the consumer thread whenever the producer puts data into any of the queues.
However, with this way of signaling, it is possible for the consumer to sleep although there is data in one of the queues.
Issue:
Consider this scenario: the consumer is processing the n-th queue's data; at the same time the producer may put data into queues n-1 and n-2, and the signal has no effect because the consumer is awake and processing the n-th queue. Once the consumer finishes processing the n-th queue, it goes to sleep, and the data in queues n-1 and n-2 will not be processed until the listener gives another signal.
How can we address this scenario?
People also advise using a semaphore. Is a semaphore relevant to this scenario?
Thanks in advance.
This is the classic use case for a C++11 std::condition_variable.
The condition in this case is the availability of consumable resources. If a consumer thread runs out of work, it waits on the condition variable, which effectively puts it to sleep. The producer notifies after each insert into a queue. Care must be taken to arrange locking so that contention on the queues is kept minimal, while still avoiding the scenario in which a consumer misses a notification and goes to sleep although work is available.
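A minimal sketch of that idea, assuming a single mutex guarding all queues (simple, but not the lowest-contention arrangement). MultiQueue, push, close, wait_and_drain and run_demo are hypothetical names, not part of the original application. Because the predicate is checked while holding the mutex, a notification sent while the consumer is busy cannot be lost: the consumer sees the queued data the next time it evaluates the predicate.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical container: n queues guarded by one mutex and one condition variable.
struct MultiQueue {
    std::mutex m;
    std::condition_variable cv;
    std::vector<std::deque<int>> queues;
    bool done = false;

    explicit MultiQueue(std::size_t n) : queues(n) {}

    // Call only while holding m.
    bool any_data() const {
        for (const auto& q : queues)
            if (!q.empty()) return true;
        return false;
    }

    void push(std::size_t i, int value) {
        {
            std::lock_guard<std::mutex> lock(m);
            queues[i].push_back(value);
        }
        cv.notify_one();
    }

    void close() {
        {
            std::lock_guard<std::mutex> lock(m);
            done = true;
        }
        cv.notify_one();
    }

    // Sleeps until some queue has data or the producer has closed.
    // Appends everything queued to `out`; returns false once closed and drained.
    bool wait_and_drain(std::vector<int>& out) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return done || any_data(); });
        bool got = false;
        for (auto& q : queues) {
            while (!q.empty()) {
                out.push_back(q.front());
                q.pop_front();
                got = true;
            }
        }
        return got;
    }
};

// Pushes 300 items round-robin into 3 queues and counts what the consumer sees.
int run_demo() {
    MultiQueue mq(3);
    int processed = 0;
    std::thread consumer([&] {
        std::vector<int> batch;
        while (mq.wait_and_drain(batch)) {
            processed += static_cast<int>(batch.size());
            batch.clear();
        }
    });
    for (int i = 0; i < 300; ++i) mq.push(i % 3, i);
    mq.close();
    consumer.join();
    return processed;
}
```

Draining under the lock keeps the sketch short; a real consumer would probably swap the queues out and process the items after releasing the mutex.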
A semaphore would work, yes.
But I'm not entirely certain if it's even necessary. It sounds like your problem is caused purely because the consumer thread fails to loop back after processing queue N. It should go to sleep only after it has seen N empty queues in succession, while holding a mutex to ensure that no entries were added in the mean time.
Of course, holding that mutex all the time is overkill. Instead, you should just keep looping, emptying queues one by one and counting how many empty queues you've seen. Once you've seen N empty queues in a row, take the mutex so you know no new entries can be added, and now recheck.
It does depend on your signalling mechanism. Robust signalling mechanisms allow you to signal a thread before it enters the check for that signal. This is necessary because you otherwise have a race condition.
You can use select to wait on a file descriptor created from a signal (see man signalfd), so the wait can time out (select supports timeouts) and also wake up when the signal is received; the signal must be blocked in the signal mask. When the signalfd descriptor is readable, read a struct signalfd_siginfo from it and check ssi_signo for the signal number (to see whether it is the one you are using for communication).
Is there no concept of queue in Windows critical sections?
I have the following render loop in a dedicated thread:
while (!viewer->finish)
{
    EnterCriticalSection(&viewer->lock);
    viewer->renderer->begin();
    viewer->root->render(viewer->renderer);
    viewer->renderer->end();
    LeaveCriticalSection(&viewer->lock);
}
The main thread does message processing, and when I handle mouse events, I try to enter the same critical section, but for some reason, it runs the rendering thread for a thousand more iterations (around 10 seconds), before the main thread finally enters the critical section. What's causing this - even if there is no 'queue' to enter the section, shouldn't it be more like 50/50, instead of 99.9/0.1 like in my case? Both threads have 0 priority.
And what is a good way to add such a queue? Would a simple flag like bDoNotRenderAnything suffice?
Edit: the solution in my case was simply to add an event object (a boolean variable would probably work too) that is set every time the message handler needs access to the critical section, and reset after using it. The renderer does not enter the section if the variable/event is set. This way the message handler won't have to wait for more than one rendering iteration.
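A portable sketch of that flag approach, using std::mutex and std::atomic<bool> in place of the Win32 critical section and event object. The Viewer struct here is a hypothetical stand-in for the real viewer, and the sketch assumes a single message-handling thread (with several, the flag would need to become a counter).

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the viewer in the question.
struct Viewer {
    std::mutex lock;
    std::atomic<bool> finish{false};
    std::atomic<bool> ui_wants_lock{false};
    long frames = 0;     // guarded by lock
    int ui_events = 0;   // guarded by lock
};

void render_loop(Viewer& v) {
    while (!v.finish) {
        if (v.ui_wants_lock) {
            // Back off instead of re-locking immediately.
            std::this_thread::yield();
            continue;
        }
        std::lock_guard<std::mutex> guard(v.lock);
        ++v.frames;  // stands in for begin() / render() / end()
    }
}

void handle_ui_event(Viewer& v) {
    v.ui_wants_lock = true;  // announce intent before blocking on the lock
    {
        std::lock_guard<std::mutex> guard(v.lock);
        ++v.ui_events;
    }
    v.ui_wants_lock = false;
}

// Runs the renderer while the calling thread handles 100 simulated events.
int run_demo() {
    Viewer v;
    std::thread renderer(render_loop, std::ref(v));
    for (int i = 0; i < 100; ++i) handle_ui_event(v);
    v.finish = true;
    renderer.join();
    return v.ui_events;
}
```

The renderer may already hold the lock when the flag is set, so the handler can still wait for the frame in progress, but not for the ones after it.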
In older versions of Windows, critical sections were guaranteed to be acquired on a first-come first-served basis. This is no longer the case starting from Windows Server 2003 SP1.
From the MSDN:
Starting with Windows Server 2003 with Service Pack 1 (SP1), threads waiting on a critical section do not acquire the critical section on a first-come, first-serve basis. This change increases performance significantly for most code. However, some applications depend on first-in, first-out (FIFO) ordering and may perform poorly or not at all on current versions of Windows (for example, applications that have been using critical sections as a rate-limiter). To ensure that your code continues to work correctly, you may need to add an additional level of synchronization. For example, suppose you have a producer thread and a consumer thread that are using a critical section object to synchronize their work. Create two event objects, one for each thread to use to signal that it is ready for the other thread to proceed. The consumer thread will wait for the producer to signal its event before entering the critical section, and the producer thread will wait for the consumer thread to signal its event before entering the critical section. After each thread leaves the critical section, it signals its event to release the other thread.
Windows Server 2003 and Windows XP: Threads that are waiting on a critical section are added to a wait queue; they are woken and generally acquire the critical section in the order in which they were added to the queue. However, if threads are added to this queue at a fast enough rate, performance can be degraded because of the time it takes to awaken each waiting thread.
Threads waiting on a critical section do not acquire the critical section on a first-come, first-serve basis (MSDN)
Most of the time your worker thread owns the lock, because it re-locks immediately after it releases the lock. So there's not much time for the other thread to wake up and catch the lock when it's free.
According to MSDN
There is no guarantee about the order in which waiting threads
will acquire ownership of the critical section.
so the order in which the threads run is not guaranteed. If your rather short
    viewer->renderer->begin();
    viewer->root->render(viewer->renderer);
    viewer->renderer->end();
sequence manages to reacquire the critical section again and again, this can happen.
You can try a quick fix by calling SwitchToThread in your rendering loop (after a certain number of iterations), although I doubt it will be a good enough solution.
I have 6 threads running in my application continuously. The scenario is:
One thread continuously receives messages and inserts them into a message queue. Four other threads act as workers that continuously fetch messages from the queue and process them. The final thread populates the analytics information.
Problem:
The sleep duration is 100 ms for the message-fetching thread and 200 ms for the worker threads. When I run this application, the message-fetching thread takes control and keeps inserting into the queue, growing the heap. The worker threads do not get a chance to process the messages and deallocate them. Eventually this results in an out-of-memory condition.
How can I manage this scenario so that the message-fetching thread and the worker threads get an equal opportunity to run?
Thanks in advance :)
You need to add back-pressure to your producer thread. Usually this is done with a blocking producer-consumer queue. The producer adds items to the queue; the consumers dequeue items and process them. If the queue is empty, the consumers block until the producer adds something. If the queue is full, the producer blocks until the consumers fetch items from the queue.
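A minimal sketch of such a blocking queue, assuming C++11 and using two condition variables. BoundedQueue and run_demo are hypothetical names, and the demo uses a negative value as a made-up shutdown marker for the workers.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    // Blocks while the queue is full: this is the back-pressure on the producer.
    void push(T item) {
        std::unique_lock<std::mutex> lock(m_);
        not_full_.wait(lock, [this] { return q_.size() < capacity_; });
        q_.push(std::move(item));
        lock.unlock();
        not_empty_.notify_one();
    }

    // Blocks while the queue is empty, so consumers need no Sleep() calls.
    T pop() {
        std::unique_lock<std::mutex> lock(m_);
        not_empty_.wait(lock, [this] { return !q_.empty(); });
        T item = std::move(q_.front());
        q_.pop();
        lock.unlock();
        not_full_.notify_one();
        return item;
    }

private:
    std::mutex m_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    std::queue<T> q_;
    const std::size_t capacity_;
};

// One producer, four workers; returns the sum of all processed values.
long run_demo() {
    BoundedQueue<int> q(8);
    std::mutex sum_m;
    long sum = 0;
    std::vector<std::thread> workers;
    for (int w = 0; w < 4; ++w) {
        workers.emplace_back([&] {
            for (;;) {
                int v = q.pop();
                if (v < 0) break;  // shutdown marker (an assumption of this demo)
                std::lock_guard<std::mutex> g(sum_m);
                sum += v;
            }
        });
    }
    for (int i = 1; i <= 1000; ++i) q.push(i);
    for (int w = 0; w < 4; ++w) q.push(-1);
    for (auto& t : workers) t.join();
    return sum;
}
```

With a capacity of 8, the producer can never be more than 8 messages ahead of the workers, which is what caps memory use.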
One system of flow-control that I use often is to create a large pool of message objects at startup and never create any more. The *objects are stored on a thread-safe, blocking 'pool queue' and circulated around, popped from the pool by producer/s, queued to consumer/s on other blocking queues and then pushed back onto the pool queue when 'consumed'.
This caps memory use, provides flow-control, (if the pool empties, the producer/s block on it until messages are returned from consumers), and eliminates continual new/delete/malloc/free. The more complex and slower bounded queues are not necessary and all queues need only to be large enough to hold the, (known), maximum number of messages.
Using 'classic' blocking queues does not require any Sleep() calls.
Your question is a little vague, so I can give you these guidelines instead of code:
Protect shared data with a mutex. In a multi-threaded producer-consumer problem there is usually a race condition on the shared data (the message in your program). One thread attempts to write to the shared memory location while another is trying to read from it. The message read by the reader might be corrupted because the writer has written over it in the middle of the read. You can guard the shared memory location with a mutex. Each thread must acquire this lock in order to read or modify the shared data. This way the consumer can be sure that the data has not been modified while it reads. However, note that holding this lock may hold back the producer thread, so you should release the lock as soon as possible.
Use condition variables to notify consumer threads. If you do not use a notification mechanism, all consumer threads have to actively check for new data, which uses up system resources. With a condition variable, the consumer threads can go to sleep knowing that the producer thread will notify them whenever a message is ready.
The threading library in C++11 has everything you need to implement a producer-consumer application. However, if you are not able to upgrade your compiler, you could use the Boost threading library as well.
You want to use a bounded queue which when full will block threads trying to enqueue until more space is available.
You can use concurrent_bounded_queue from tbb, or simply use a semaphore initialized to the maximum queue size, decrementing it on enqueue and incrementing it on dequeue. boost::thread doesn't provide semaphores natively, but you can implement one using locks and condition variables.
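A sketch of the semaphore route, assuming the semaphore is built from a std::mutex and a std::condition_variable (the same construction works with the Boost equivalents). Semaphore and run_demo are hypothetical names. The demo adds a second semaphore counting filled slots, an addition not mentioned above, so that the consumer also blocks when the queue is empty.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>

// Counting semaphore built from a mutex and a condition variable.
class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) {}

    void acquire() {  // decrement, blocking while the count is zero
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    void release() {  // increment and wake one waiter
        {
            std::lock_guard<std::mutex> lock(m_);
            ++count_;
        }
        cv_.notify_one();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
};

// Returns the largest queue size observed while passing 200 items through.
std::size_t run_demo() {
    const int kMaxQueueSize = 4;
    Semaphore free_slots(kMaxQueueSize);  // decremented on enqueue
    Semaphore filled_slots(0);            // incremented on enqueue
    std::mutex qm;
    std::deque<int> q;
    std::size_t max_seen = 0;
    long sum = 0;

    std::thread consumer([&] {
        for (int i = 0; i < 200; ++i) {
            filled_slots.acquire();
            int v;
            {
                std::lock_guard<std::mutex> g(qm);
                v = q.front();
                q.pop_front();
            }
            sum += v;
            free_slots.release();  // dequeue frees a slot for the producer
        }
    });

    for (int i = 0; i < 200; ++i) {
        free_slots.acquire();      // blocks while the queue is full
        {
            std::lock_guard<std::mutex> g(qm);
            q.push_back(i);
            max_seen = std::max(max_seen, q.size());
        }
        filled_slots.release();
    }
    consumer.join();
    assert(sum == 19900);  // 0 + 1 + ... + 199
    return max_seen;
}
```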
What happens when a thread is put to sleep by another thread, possibly the main thread, in the middle of its execution?
Assume I have a function Producer. What if Consumer sleep()s the Producer in the middle of producing one unit?
Suppose the unit is half produced and the thread is then put to sleep(). The integrity of the system may be at risk.
The thread on which sleep is invoked is put in the idle queue by the thread scheduler and is context-switched off the CPU it is running on, so other threads can take its place.
All context (registers, stack pointer, base pointer, etc.) is saved on the thread stack, so when it runs next time, it can continue from where it left off.
The OS is constantly doing context switches between threads in order to make your system seem like it's doing multiple things. The OS thread scheduler algorithm takes care of that.
Thread scheduling and threading is a big subject; if you want to really understand it, I suggest you start reading up on it. :)
EDIT: Using sleep for thread synchronization is not advised; you should use proper synchronization mechanisms to tell the thread to wait for other threads, etc.
There is no problem associated with this, unless some state is mutated while the thread sleeps, so it wakes up with a different set of values than before going to sleep.
Threads are switched in and out of execution by the CPU all the time, but that does not affect the overall outcome of their execution, assuming no data races or other bugs are present.
It would be inadvisable for one thread to forcibly and synchronously interfere with the execution of another thread. One thread could send an asynchronous message to another requesting that it reschedule itself in some way, but that would be handled by the other thread when it was in a suitable state to do so.
Assuming they communicate using channels that are thread-safe, nothing bad should happen, as the sleeping thread will wake up eventually and grab data from its task queue, or see that some semaphore has been set and read the produced data.
If the threads communicate using nonvolatile variables or direct function calls that change state, that's when Bad Things occur.
I don't know of a way for a thread to forcibly cause another thread to sleep. If two threads are accessing a shared resource (like an input/output queue, which seems likely for your Producer/Consumer example), then both threads may contend for the same lock. The losing thread must wait for the other thread to release the lock, unless the attempt is of the "trylock" variety. The thread that waits is placed into a wait queue associated with the lock and is removed from the scheduler's run queue. When the winning thread releases the lock, the code checks the queue to see if there are threads still waiting to acquire it. If there are, one is chosen as the winner, given the lock, and placed in the scheduler's run queue.
As already stated in the title, I have a large number of threads (probably many more than 100) that mostly hold program state rather than run. I want only a few of them (enough to use all physical processors) to really run concurrently, and the rest should wait until one of the running threads blocks. When this happens, a new one should start running.
Is it possible to achieve this with pthreads, for example with the pthread scheduling functions? How would you do this?
Regards,
Nobody
EDIT
More Information:
Each thread fetches a job from the taskpool on its own and goes on to a certain point.
I need 100 threads to gather at a certain point of program execution that cannot be calculated in parallel. When the calculation is done, the threads should be awakened and continue. To make this efficient, I have to keep the scheduler from wasting time switching between 100 threads instead of 4.
Just use a semaphore with initial count of 4?
http://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_init.html
You could always launch 4 at a time, assigning them to a thread group, then waiting with a join all on the thread group. But I think more information is needed to develop a really useful answer.
Initialize a global variable to the number of threads to run concurrently.
When a thread wants to do work, it obtains a slot. Using a mutex and condition variable, it waits until slots_available > 0. It then decrements slots_available, releases the mutex and proceeds with its work.
When a thread has completed its work, it releases the slot by locking the mutex and incrementing slots_available. It signals all threads waiting on the condition variable so they can wake and see if slots_available > 0.
See https://computing.llnl.gov/tutorials/pthreads/#Mutexes for specific pthread library calls to use for the above.
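The slot scheme above can be sketched as follows. The sketch uses std::mutex and std::condition_variable, which correspond to the pthread mutex and condition variable calls; SlotLimiter and run_demo are hypothetical names, and the yield call merely stands in for real work.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

class SlotLimiter {
public:
    explicit SlotLimiter(int slots) : slots_available_(slots) {}

    // Wait until a slot is free, then take it.
    void obtain() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return slots_available_ > 0; });
        --slots_available_;
    }

    // Give the slot back and wake the waiters so they can recheck.
    void release() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++slots_available_;
        }
        cv_.notify_all();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int slots_available_;
};

// Starts 100 threads with 4 slots; returns the peak number working at once.
int run_demo() {
    SlotLimiter limiter(4);
    std::atomic<int> active{0};
    std::atomic<int> peak{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < 100; ++t) {
        threads.emplace_back([&] {
            limiter.obtain();
            int now = ++active;
            int prev = peak.load();
            // Record a new maximum if this thread observed one.
            while (prev < now && !peak.compare_exchange_weak(prev, now)) {
            }
            std::this_thread::yield();  // stands in for real work
            --active;
            limiter.release();
        });
    }
    for (auto& th : threads) th.join();
    return peak.load();
}
```

Threads without a slot sleep on the condition variable, so the scheduler only has the slot holders to switch between.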
I don't know how to do this with pthread functions, but I do have an idea:
I would implement this by adding some intelligence to the threadpool/taskpool to count the number of active threads and only make 4 - number of active threads available at any one time. This could be done by having an idle queue, a ready queue, and an active queue (or just active count). Tasks would grab from the ready queue, and the threadpool would only migrate tasks from the idle queue to the ready queue conditionally.