I used to see Sleep(0) in some part of my code where some infinite/long while loops are available. I was informed that it would make the time-slice available for other waiting processes. Is this true? Is there any significance for Sleep(0)?
According to MSDN's documentation for Sleep:
A value of zero causes the thread to
relinquish the remainder of its time
slice to any other thread that is
ready to run. If there are no other
threads ready to run, the function
returns immediately, and the thread
continues execution.
The important thing to realize is that yes, this gives other threads a chance to run, but if there are none ready to run, then your thread continues -- leaving the CPU usage at 100% since something will always be running. If your while loop is just spinning while waiting for some condition, you might want to consider using a synchronization primitive like an event to sleep until the condition is satisfied or sleep for a small amount of time to prevent maxing out the CPU.
Yes, it gives other threads the chance to run.
A value of zero causes the thread to
relinquish the remainder of its time
slice to any other thread that is
ready to run. If there are no other
threads ready to run, the function
returns immediately, and the thread
continues execution.
Source
I'm afraid I can't improve on the MSDN docs here
A value of zero causes the thread to
relinquish the remainder of its time
slice to any other thread that is
ready to run. If there are no other
threads ready to run, the function
returns immediately, and the thread
continues execution.
Windows XP/2000: A value of zero
causes the thread to relinquish the
remainder of its time slice to any
other thread of equal priority that is
ready to run. If there are no other
threads of equal priority ready to
run, the function returns immediately,
and the thread continues execution.
This behavior changed starting with
Windows Server 2003.
Please also note (via upvote) the two useful answers regarding efficiency problems here.
Be careful with Sleep(0), if one loop iteration execution time is short, this can slow down such loop significantly. If this is important to use it, you can call Sleep(0), for example, once per 100 iterations.
Sleep(0); At that instruction, the system scheduler will check for any other runnable threads and possibly give them a chance to use the system resources depending on thread priorities.
On Linux there's a specific command for this: sched_yield()
as from the man pages:
sched_yield() causes the calling thread to relinquish the CPU. The
thread is moved to the end of the queue for its static priority and a
new thread gets to run.
If the calling thread is the only thread in the highest priority list
at that time, it will continue to run after a call to sched_yield().
with also
Strategic calls to sched_yield() can improve performance by giving
other threads or processes a chance to run when (heavily) contended
resources (e.g., mutexes) have been released by the caller. Avoid
calling sched_yield() unnecessarily or inappropriately (e.g., when
resources needed by other schedulable threads are still held by the
caller), since doing so will result in unnecessary context switches,
which will degrade system performance.
In one app....the main thread looked for things to do, then launched the "work" via a new thread. In this case, you should call sched_yield() (or sleep(0)) in the main thread, so, that you do not make the "looking" for work, more important then the "work". I prefer sleep(0), but sometimes this is excessive (because you are sleeping a fraction of a second).
Sleep(0) is a powerful tool and it can improve the performance in certain cases. Using it in a fast loop might be considered in special cases. When a set of threads shall be utmost responsive, they shall all use Sleep(0) frequently. But it is crutial to find a ruler for what responsive means in the context of the code.
I've given some details in https://stackoverflow.com/a/11456112/1504523
I am using using pthreads and for some reason on my mac the compiler is not finding pthread_yield() to be declared. But it seems that sleep(0) is the same thing.
Related
I've created a multi-threaded application using C++ and POSIX threads. In which I should now block a thread (main thread) until a boolean flag is set (becomes true).
I've found two ways to get this done.
Spinning through a loop without sleep.
while(!flag);
Spinning through a loop with sleep.
while(!flag){
sleep(some_int);
}
If I should follow the first way, why do some people write codes following the second way? If the second way should be used, why should we make current thread to sleep? And what are disadvantages of this way?
The first option (a "busy wait") wastes an entire core for the duration of the wait, preventing other useful work being done and/or wasting energy.
The second option is less wasteful - your waiting thread uses very little CPU and allows other threads to run. But it is still wasteful to keep switching back to the thread to check the flag.
Far better than either would be to use a condition variable, which allows the waiting thread to block without consuming any resources until it is able to proceed.
while(flag); will cause your thread to use all of its allocated time checking the condition. This wastes a lot of CPU cycles checking something which has likely not changed.
Sleeping for a bit causes the thread to pause and give up the CPU to programs that actually need it.
You shouldn't do either though; you should use a threading library to create a flag object and call its wait function, so that the kernel will pause the thread until the flag is set.
The first way (just the plain while) is wasting resources, specifically the processor time of your process.
When a thread is put into sleep, OS may decide that the processor will be used for different tasks when talking about systems with preemptive multitasking. In theory, if you had as many processors / cores as threads, there would not have to be any difference.
If a solution is good or not depends on the operating system used, and sometimes architecture the program is running on. You should consult your syscall reference to find out more about this.
Using VC++ 13 under windows, the on-line help states that using Sleep(0) relinquishes the remainder of the current threads time slice to any other thread of equal priority. Is this also the case for other values? e.g. if I use Sleep(1000) are 1000ms of CPU time for the core on which the current thread is running likely to be usable by another thread? I imagine this is hardware and implementation specific, so to narrow it assume Intel I5 or better, Windows 7 or 8.
The reason for asking is I have a thread pool class, and I'm using an additional monitor thread to report progress, allow the user to abort long processes, etc...
Yes, zero has the special meaning only in the regard to signal there is no minimal time to wait. Normally it could be interpreted like "I want to sleep for no-time" which doesn't make much sense. It means "I want to give chance to other thread to run."
If it's non-zero, thread is guaranteed not to be returned to for the amount of time specified, of course within the clock resolution. When thread gets suspended it gets a suspended status in the system and is not considered during scheduling. With 0 it doesn't change it's status, so it remains ready to run, and the function might return immediately.
Also, I don't think it is hardware related, this is purely system level thing.
MSDN: Sleep function
A value of zero causes the thread to relinquish the remainder of its time slice to any other thread that is ready to run. If there are no other threads ready to run, the function returns immediately, and the thread continues execution.
The special XP case is described as follows:
Windows XP: A value of zero causes the thread to relinquish the remainder of its time slice to any other thread of equal priority that is ready to run. If there are no other threads of equal priority ready to run, the function returns immediately, and the thread continues execution. This behavior changed starting with Windows Server 2003.
MSDN states that the reminder of the threads time slice is relinquished to any other thread of equal priority. This is somewhat meaningless because a thread of higher priority would have been scheduled prior to the thread calling Sleep(0) and a thread with lower priority would cause the Sleep(0) to return immediately without giving anything away. Therefore Sleep(0) only has impact to threads of equal priority by default.
Purpose of Sleep(0): It triggers the scheduler to re-schedule while putting the calling thread at the end of the queue. If the queue does not have any other processes of the same priority, the call will return immediately. If there are other threads, the delay is undetermined. Note: The Windows scheduler is not a single thread, it is spread all over the OS (Details: How does a scheduler regain control when wanted?).
The detailed behavior depends on the systems timer resolution setting (How to get the current Windows system-wide timer resolution). This setting also influences the threads time slice, it varies with the system timer resolution.
The system timer resolution defines the heartbeat of the system. This causes thread quanta to have specific values. The timer resolution granularity also determines the resolution of Sleep(period). Consequently, the accuracy of sleep periods is determined by the systems heartbeat. However, a high resolution timer setting increases the power consumption.
A Sleep(period) with period > 0 triggers the scheduler and prohibits scheduling of the calling thread for at least the requested period.
Consequently the calling threads time slice is interrupted. It ends immediately.
Yes, Sleep(period) with period > 0 relinquishes CPU time to other threads (if any applicable).
(Further reading: Few words about timer resolution, How to get an accurate 1ms Timer Tick under WinXP, and Limits of Windows Queue Timers).
In a multi threaded app, is
while (result->Status == Result::InProgress) Sleep(50);
//process results
better than
while (result->Status == Result::InProgress);
//process results
?
By that, I'm asking will the first method be polite to other threads while waiting for results rather than spinning constantly? The operation I'm waiting for usually takes about 1-2 seconds and is on a different thread.
I would suggest using semaphores for such case instead of polling. If you prefer active waiting, the sleep is much better solution than evaluating the loop condition constantly.
It's better, but not by much.
As long as result->Status is not volatile, the compiler is allowed to reduce
while(result->Status == Result::InProgress);
to
if(result->Status == Result::InProgress) for(;;) ;
as the condition does not change inside the loop.
Calling the external (and hence implicitly volatile) function Sleep changes this, because this may modify the result structure, unless the compiler is aware that Sleep never modifies data. Thus, depending on the compiler, the second implementation is a lot less likely to go into an endless loop.
There is also no guarantee that accesses to result->Status will be atomic. For specific memory layouts and processor architectures, reading and writing this variable may consist of multiple steps, which means that the scheduler may decide to step in in the middle.
As all you are communicating at this point is a simple yes/no, and the receiving thread should also wait on a negative reply, the best way is to use the appropriate thread synchronisation primitive provided by your OS that achieves this effect. This has the advantage that your thread is woken up immediately when the condition changes, and that it uses no CPU in the meantime as the OS is aware what your thread is waiting for.
On Windows, use CreateEvent and co. to communicate using an event object; on Unix, use a pthread_cond_t object.
Yes, sleep and variants give up the processor. Other threads can take over. But there are better ways to wait on other threads.
Don't use the empty loop.
That depends on your OS scheduling policy too.For example Linux has CFS schedular by default and with that it will fairly distribute the processor to all the tasks. But if you make this thread as real time thread with FIFO policy then code without sleep will never relenquish the processor untill and unless a higher priority thread comes, same priority or lower will never get scheduled untill you break from the loop. if you apply SCHED_RR then processes of same priority and higher will get scheduled but not lower.
I'm using pthread on Linux. I have a circular buffer to pass data from one thread to another. Maybe the circular buffer is not the best structure to use here, but changing that would not make my problem go away, so we'll just refer it as a queue.
Whenever my queue is either full or empty, pop/push operations return NULL. This is problematic since my threads fire periodically. Waiting for another thread loop would take too long.
I've tried using semaphores (sem_post, sem_wait) but unlocking under contention takes up to 25 ms, which is about the speed of my loop. I've tried waiting with pthread_cond_t, but the unlocking takes up to between 10 and 15 ms.
Is there a faster mechanism I could use to wait for data?
EDIT*
Ok I used condition variables. I'm on an embedded device so adding "more cores or cpu power" is not an option. This made me realise I had all sorts of thread priorities set all over the place so I'll sort this out before going further
You should use condition variables. The only faster ways are platform-specific, and they're only negligibly faster.
You're seeing what you think is poor performance simply because your threads are being de-scheduled. You're seeing long "delays" when your thread is near the end of its timeslice and the scheduler allows the unblocked thread to pre-empt the running thread. If you have more cores than threads or set your thread to a higher priority, you won't see these delays.
But these delays are actually a good thing, and you shouldn't be concerned about them. Other threads just get a chance to run too.
I have a program with a main thread and a diagnostics thread. The main thread is basically a while(1) loop that performs various tasks. One of these tasks is to provide a diagnostics engine with information about the system and then check back later (i.e. in the next loop) to see if there are any problems that should be dealt with. An iteration of the main loop should take no longer than 0.1 seconds. If all is well, then the diagnostic engine takes almost no time to come back with an answer. However, if there is a problem, the diagnostic engine can take seconds to isolate the problem. For this reason each time the diagnostic engine receives new information it spins up a new diagnostics thread.
The problem we're having is that the diagnostics thread is stealing time away from the main thread. Effectively, even though we have two threads, the main thread is not able to run as often as I would like because the diagnostic thread is still spinning.
Using Boost threads, is it possible to limit the amount of time that a thread can run before moving on to another thread? Also of importance here is that the diagnostic algorithm we are using is blackbox, so we can't put any threading code inside of it. Thanks!
If you run multiple threads they will indeed consume CPU time. If you only have a single processor, and one thread is doing processor intensive work then that thread will slow down the work done on other threads. If you use OS-specific facilities to change the thread priority then you can make the diagnostic thread have a lower priority than the main thread. Also, you mention that the diagnostic thread is "spinning". Do you mean it literally has the equivalent of a spin-wait like this:
while(!check_done()) ; // loop until done
If so, I would strongly suggest that you try and avoid such a busy-wait, as it will consume CPU time without achieving anything.
However, though multiple threads can cause each other to slow-down, if you are seeing an actual delay of several seconds this would suggest there is another problem, and that the main thread is actually waiting for the diagnostic thread to complete. Check that the call to join() for the diagnostic thread is outside the main loop.
Another possibility is that the diagnostic thread is locking a mutex needed by the main thread loop. Check which mutexes are locked and where.
To really help, I'd need to see some code.
looks like your threads are interlocked, so your main thread waits until background thread finished its work. check any multithreading sychronization that can cause this.
to check that it's nothing related to OS scheduling run you program on double-core system, so both threads can be executed really in parallel
From the way you've worded your question, it appears that you're not quite sure how threads work. I assume by "the amount of time that a thread can run before moving on to another thread" you mean the number of cpu cycles spent per thread. This happens hundreds of thousands of times per second.
Boost.Thread does not have support for thread priorities, although your OS-specific thread API will. However, your problem seems to indicate the necessity for a fundamental redesign -- or at least heavy profiling to find bottlenecks.
You can't do this generally at the OS level, so I doubt boost has anything specific for limiting execution time. You can kinda fake it with small-block operations and waits, but it's not clean.
I would suggest looking into processor affinity, either at a thread or process level (this will be OS-specific). If you can isolate your diagnostic processing to a limited subset of [logical] processors on a multi-core machine, it will give you a very course mechanism to control maximum execution amount relative to the main process. That's the best solution I have found when trying to do a similar type of thing.
Hope that helps.