Can Windows totally stop a thread if it's sleeping too often? - c++

I have a rare heisenbug in a multi-threaded application where the main thread, and only this thread, just does nothing. Because it's a heisenbug, it's really hard to understand why this is happening.
The main thread is basically just looping. In the loop, it checks several concurrent priority queues which contain tasks ordered by the time at which they should execute. It pops a task and checks whether it's time to execute it. If it is, it schedules the task into TBB's task scheduler (using a root task which is the parent of all other tasks). If it's not time yet, the task is pushed back into the priority queue.
That's one cycle. At the end of the cycle, the main thread is put to sleep for a very short time. I expect the actual sleep to be longer than requested in practice, but that's not really a problem; I just don't want the thread to use too many resources when it doesn't need to.
Literally:
static const auto TIME_SCHEDULED_TASKS_SPAWN_FREQUENCY = microseconds(250);
while( !m_task_scheduler.is_exiting() ) // check if the application should exit
{
    m_clock_scheduler.spawn_realtime_tasks(); // here we spawn tasks if it's time
    this_thread::sleep_for( TIME_SCHEDULED_TASKS_SPAWN_FREQUENCY );
}
m_clock_scheduler.clear_tasks();
m_root_task.wait_for_all();
I have a special task that just logs a "TICK" message every second. It automatically reschedules itself until the end of the program. However, when the heisenbug appears, the "TICK" messages stop and the application does nothing except the work running in non-main threads. So it appears that only the main thread is affected.
The problem could come from several places, perhaps the scheduling logic, but the main thread is also the only thread that makes a sleep call. That sleep is a boost::this_thread::sleep_for().
My question is: Is it possible that Windows (7 64bit) considers the main thread to be sleeping too often and decides that it should sleep for longer than asked, or be stopped for good?
I expect that it is not possible, but I would like to be sure. I haven't found anything specific about this in the online documentation so far.
Update:
I have a friend who can reproduce the bug systematically (on Windows Vista, Core 2 Duo). I sent him a version without the sleep and another with the main loop reimplemented using condition_variable, so that each time a task is pushed into the queue the condition_variable wakes the main thread (while still enforcing a minimum time between spawns).
The version without sleep works (but is slower), so the problem seems to be related to the sleep, even if I don't know the real source.
The version using condition_variable works, which would indicate that it's the sleep call that doesn't work correctly?
So, apparently I fixed the bug, but I still don't know why that specific sleep call can sometimes block.
UPDATE:
It was actually a bug triggered by Boost code. I hunted the bug down and reported it, and it has been fixed. I didn't check earlier versions, but it is fixed in Boost 1.55.

Is it possible that Windows (7 64bit) considers the main thread to be sleeping too often and decides that it should sleep for longer than asked, or be stopped for good?
NO. This does not happen. MSDN does not indicate that this could happen. Empirically, I have many Windows apps with periodic intervals ranging from ms to hours. The effect you suggest does not happen - it would be disastrous for my apps.
Apart from the well-known granularity issues with Sleep() calls for very short intervals, a sleeping thread will become ready upon the expiry of the interval. If there is a CPU core available (i.e. the cores are not all in use running higher-priority threads), the newly-ready thread will become running.
The OS will not extend the interval of Sleep() because of any historical/statistical data associated with the thread states - I don't think it keeps any such data.
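If you want to see the granularity effect on your own machine, a small sketch like the following measures how long a short sleep really takes. The helper name measured_sleep is made up for illustration, and it uses std::this_thread::sleep_for rather than the Boost call from the question.

```cpp
#include <chrono>
#include <thread>

// Returns how long a sleep_for(requested) call actually took.
std::chrono::microseconds measured_sleep(std::chrono::microseconds requested) {
    const auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(requested);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
}
```

Calling measured_sleep(std::chrono::microseconds(250)) a few times and printing the results should show an overshoot that depends on the platform's timer resolution, but one that stays bounded rather than growing over time.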

Related

Accurate Sleep with cancellation

I need to implement a delay or sleep function that is accurate and consistent, and must be able to be cancelled.
Here's my code:
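bool cancel_flag(false);
void My_Sleep(unsigned int duration)
{
    static const size_t SLEEP_INTERVAL = 10U; // 10 milliseconds
    // Sleep in short slices so the cancel flag is checked every interval.
    while ((!cancel_flag) && (duration > SLEEP_INTERVAL))
    {
        Sleep(SLEEP_INTERVAL); // one slice, not the whole remaining duration
        duration -= SLEEP_INTERVAL;
    }
    // Sleep for whatever remains (at most one interval).
    if ((!cancel_flag) && (duration > 0U))
    {
        Sleep(duration);
    }
}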
The above function is run in a worker thread. The main thread is able to change the value of the "cancel_flag" in order to abort (cancel) the sleeping.
At my shop, we have different results when the duration is 10 seconds (10000 ms). Some PCs are showing a sleep duration of 10 seconds, other PCs are showing 16 seconds.
Articles about the Sleep() function say that it is bound to the windows interrupt and when the duration elapses, the thread is rescheduled (may not be run immediately). The function above may be encountering a propagation of time error due to rescheduling and interrupt latency.
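One way to reason about this error propagation is a back-of-the-envelope model. It is only a sketch: it assumes the loop sleeps one 10 ms slice per iteration, and that every sleep is rounded up to a whole number of timer ticks. The function name and the tick values (15.625 ms as the usual default Windows timer period, 1 ms when some application has raised the timer resolution) are assumptions made for illustration, not measurements from these PCs.

```cpp
#include <cmath>

// Models a loop of `iterations` sleeps of `request_ms` each, assuming every
// sleep is rounded up to a whole number of timer ticks of length `tick_ms`.
double modelled_total_ms(int iterations, double request_ms, double tick_ms) {
    const double ticks_per_sleep = std::ceil(request_ms / tick_ms);
    return iterations * ticks_per_sleep * tick_ms;
}
```

Under this model, modelled_total_ms(1000, 10.0, 15.625) gives 15625 ms, while modelled_total_ms(1000, 10.0, 1.0) gives 10000 ms. That would match the 16 second and 10 second observations if the machines differ in their effective timer resolution.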
The Windows Timestamp Project describes another technique of waiting on a timer object. My understanding is that this technique doesn't provide a means of cancellation (by another, main, thread).
Question:
1. How can I improve my implementation for a thread delay or sleep, that can be cancelled by another task, and is more consistent?
2. Can a sleeping AFX thread be terminated? (What happens when a main thread terminates a sleeping AFX thread?)
3. What happens when a main thread terminates a thread that has called WaitForSingleObject?
Accuracy should be around 10ms, as in 10 seconds + 10ms.
Results should be consistent across various PCs (all running Windows 7 or 10).
Background
PCs that have correct timings are running Windows 7, at 2.9 GHz.
The PCs that have incorrect timings are running Windows 7, at 3.1 GHz and have fewer simultaneous tasks and applications running.
Application is developed using Visual Studio 2017, and MFC framework (using AFX for thread creation).
You shouldn't be implementing this at all.
Since C++11, all the basic utilities needed for multithreading are part of the standard library.
If you are not using C++11, switch to C++11 or higher. In the unfortunate case that you cannot, use Boost, which has the same features.
Basically, what you want to do is covered by std::condition_variable. You can put a thread into a waiting state with wait (which accepts a predicate that must hold before the wait ends), or with wait_for and wait_until (the same as wait, but with a limit on the total waiting time). The notify_one and notify_all methods wake the waiting threads (one or all of them) so that they check the wake-up condition and proceed with their tasks.
Check out std::condition_variable in the reference. Just google it and you'll find enough information about it, with examples.
In case you do not trust the std::condition_variable implementation for some reason, you can still use it for short waits and wake-ups.
How can I improve my implementation for a thread delay or sleep, that
can be cancelled by another task, and is more consistent?
High precision sleep has been discussed here before. I have used a waitable timer approach similar to what's described here.
Can a sleeping AFX thread be terminated? (What happens when a main
thread terminates a sleeping AFX thread?)
I assume you mean terminate with TerminateThread? AFX threads are simply wrappers around standard Windows threads. There is nothing special or magical about them that would differentiate their behavior. So what would happen is a bunch of bad stuff:
If the target thread owns a critical section, the critical section will not be released.
If the target thread is allocating memory from the heap, the heap lock will not be released.
If the target thread is executing certain kernel32 calls when it is terminated, the kernel32 state for the thread's process could be inconsistent.
If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL.
You should never have to do this if you have access to all the source code of the application.
What happens when a main thread terminates a thread that has called
WaitForSingleObject?
See previous bullet. CreateEvent and WaitForSingleObject are actually a recommended way of cleanly terminating threads.

Event respond faster than semaphore?

In a project I ran into a case like this (on Windows 7):
When several threads are busy (all my CPU cores are busy working), there is a delay before a thread receives a semaphore (which is increased from 0 to 1). It may be as long as 1.5ms.
I solved this by caching a few things and increasing the semaphore value earlier.
So to me, it seems that signaling a semaphore is slow: it's not immediately received by threads (especially when the CPUs are busy), but if you signal it before a thread begins to wait on it, there'll be no delay.
I once thought an event is just a semaphore with a maximum value of 1. Now, having met this case, I'm beginning to wonder whether an event is faster than a semaphore at notifying threads to 'wake up'.
Sorry, I tried but couldn't come up with a demo; I'm not very good at threading yet.
EDIT:
Is it true that Event is faster than Semaphore on Windows?
1.5 milliseconds is not explained by just the overhead between different multithreading primitives.
To simplify, Threads have three states
blocked
runnable
running
If a thread is waiting on a semaphore or an event, then it's blocked. When the event is signalled, it becomes runnable.
So the real question is, "When does a runnable thread actually run?" This varies according to scheduler algorithms, etc, but obviously it needs to run on a core, and that means nothing else can be "running" on that core at the same time. The scheduler will normally 'remove' the current running thread from a core when one of the following happens
it waits on a semaphore/event, and so becomes 'blocked'
It's been running continually for a certain time (time-based, or round-robin scheduling)
A higher priority thread becomes runnable.
The 1.5 milliseconds is probably round-robin, or time-based scheduling. Your thread is runnable but just hasn't started yet. If the thread must start, and should boot out the current thread, then you can try to increase its priority via SetThreadPriority
http://msdn.microsoft.com/en-us/library/windows/desktop/ms686277(v=vs.85).aspx
If a thread is waiting on a semaphore and it gets signaled, the thread will, in my limited testing, become running in ~10us on a box that is not overloaded.
Signaling, and subsequent dispatching onto a core, will take longer if:
The signaled thread is in a different process from the thread it preempts.
Running the signaled thread requires a thread running on another core to be preempted.
The box is already overloaded with higher-priority threads.
1.5ms must represent an extreme case where your box is very busy.
In such a case, replacing the semaphore with an event is unlikely to result in any significant improvement to overall signaling latency, because the bulk of the work/delay required by the inter-thread signaling is tied up in the scheduling/dispatching, which is required in either case.
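To get a feel for the signal-to-wake latency on a given machine, a measurement along these lines can help. It is a sketch only: it uses a std::condition_variable as a portable stand-in for the Windows semaphore or event, and the function name measure_wake_latency is made up.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Measures the time between one thread signalling and another thread
// observing the signal.
std::chrono::nanoseconds measure_wake_latency() {
    using Clock = std::chrono::steady_clock;
    std::mutex m;
    std::condition_variable cv;
    bool signalled = false;
    Clock::time_point signalled_at;
    Clock::time_point woke_at;

    std::thread waiter([&] {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return signalled; });
        woke_at = Clock::now();
    });

    // Best effort: give the waiter time to block before signalling.
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    {
        std::lock_guard<std::mutex> lock(m);
        signalled_at = Clock::now();
        signalled = true;
    }
    cv.notify_one();
    waiter.join();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(woke_at - signalled_at);
}
```

Running it on an idle box and again while every core is busy should show the difference coming from scheduling rather than from the primitive itself.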

sleep boost::thread indefinitely and wake from another thread?

Is it possible to have a boost::thread sleep indefinitely after its work is completed and then wake it from another boost::thread?
Using while(1) loops is perfect for a dedicated server where I want the threads to run all cores at 100%, but I'm writing a websocket++ server to be run on a desktop, so I only want the boost::threads to run when they actually have work to do, and I can do other work on my desktop without performance suffering.
I've seen other examples where boost::threads are set to sleep() for a constant amount of time, but I'd rather not spend the time trying to find that optimal constant; besides, I need the websocket++ server to respond as quickly as possible when it receives data to process.
If this is possible, how can it be done with multiple threads trying to wake?
This mechanism is implemented by what is called a condition variable; see boost::condition_variable. Essentially, the waiting thread releases a mutex and sleeps until another thread signals the condition, at which point it wakes up and reacquires the mutex.
Watch out for spurious wake-ups. Sometimes the waiting thread will wake up without being signaled. This means that you should still use a while loop that checks a predicate (or condition) to distinguish real wake-ups from spurious ones.
Yes, pthread_mutex_t + pthread_cond_t is the right thing to use; you can find the corresponding thing in Boost.

Boost threads: is it possible to limit the run time of a thread before moving to another thread

I have a program with a main thread and a diagnostics thread. The main thread is basically a while(1) loop that performs various tasks. One of these tasks is to provide a diagnostics engine with information about the system and then check back later (i.e. in the next loop) to see if there are any problems that should be dealt with. An iteration of the main loop should take no longer than 0.1 seconds. If all is well, then the diagnostic engine takes almost no time to come back with an answer. However, if there is a problem, the diagnostic engine can take seconds to isolate the problem. For this reason each time the diagnostic engine receives new information it spins up a new diagnostics thread.
The problem we're having is that the diagnostics thread is stealing time away from the main thread. Effectively, even though we have two threads, the main thread is not able to run as often as I would like because the diagnostic thread is still spinning.
Using Boost threads, is it possible to limit the amount of time that a thread can run before moving on to another thread? Also of importance here is that the diagnostic algorithm we are using is blackbox, so we can't put any threading code inside of it. Thanks!
If you run multiple threads they will indeed consume CPU time. If you only have a single processor, and one thread is doing processor intensive work then that thread will slow down the work done on other threads. If you use OS-specific facilities to change the thread priority then you can make the diagnostic thread have a lower priority than the main thread. Also, you mention that the diagnostic thread is "spinning". Do you mean it literally has the equivalent of a spin-wait like this:
while(!check_done()) ; // loop until done
If so, I would strongly suggest that you try and avoid such a busy-wait, as it will consume CPU time without achieving anything.
However, though multiple threads can cause each other to slow-down, if you are seeing an actual delay of several seconds this would suggest there is another problem, and that the main thread is actually waiting for the diagnostic thread to complete. Check that the call to join() for the diagnostic thread is outside the main loop.
Another possibility is that the diagnostic thread is locking a mutex needed by the main thread loop. Check which mutexes are locked and where.
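To illustrate the point about not blocking inside the loop, here is a sketch in which the main loop only polls for the diagnostic result and keeps iterating in the meantime. It uses std::async and std::future instead of Boost threads; run_diagnostics is a made-up stand-in for the black-box engine, and the other names are hypothetical too.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for the black-box diagnostic call.
int run_diagnostics(int input) {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return input * 2;
}

// Returns true if `f` already holds a result, without blocking.
template <typename T>
bool is_ready(const std::future<T>& f) {
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}

// Keeps iterating until the diagnostic result arrives and returns how many
// iterations the main loop completed while the diagnostic thread was busy.
int main_loop_iterations_while_waiting() {
    std::future<int> pending = std::async(std::launch::async, run_diagnostics, 21);
    int iterations = 0;
    while (!is_ready(pending)) {
        ++iterations;  // the main loop's other tasks would go here
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    pending.get();  // does not block: the result is already available
    return iterations;
}
```

The blocking call (get() here, join() with raw threads) only happens once the result is known to be ready, so a slow diagnostic run does not stall the loop.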
To really help, I'd need to see some code.
It looks like your threads are interlocked, so your main thread waits until the background thread has finished its work. Check any multithreading synchronization that could cause this.
To check that it's not related to OS scheduling, run your program on a dual-core system, so that both threads can really execute in parallel.
From the way you've worded your question, it appears that you're not quite sure how threads work. I assume by "the amount of time that a thread can run before moving on to another thread" you mean the number of cpu cycles spent per thread. This happens hundreds of thousands of times per second.
Boost.Thread does not have support for thread priorities, although your OS-specific thread API will. However, your problem seems to indicate the necessity for a fundamental redesign -- or at least heavy profiling to find bottlenecks.
You can't do this generally at the OS level, so I doubt boost has anything specific for limiting execution time. You can kinda fake it with small-block operations and waits, but it's not clean.
I would suggest looking into processor affinity, either at a thread or process level (this will be OS-specific). If you can isolate your diagnostic processing to a limited subset of [logical] processors on a multi-core machine, it will give you a very coarse mechanism to control maximum execution amount relative to the main process. That's the best solution I have found when trying to do a similar type of thing.
Hope that helps.

Significance of Sleep(0)

I have seen Sleep(0) in parts of my code that contain infinite or long-running while loops. I was told that it makes the time slice available to other waiting processes. Is this true? Is there any significance to Sleep(0)?
According to MSDN's documentation for Sleep:
A value of zero causes the thread to
relinquish the remainder of its time
slice to any other thread that is
ready to run. If there are no other
threads ready to run, the function
returns immediately, and the thread
continues execution.
The important thing to realize is that yes, this gives other threads a chance to run, but if there are none ready to run, then your thread continues -- leaving the CPU usage at 100% since something will always be running. If your while loop is just spinning while waiting for some condition, you might want to consider using a synchronization primitive like an event to sleep until the condition is satisfied or sleep for a small amount of time to prevent maxing out the CPU.
Yes, it gives other threads the chance to run.
A value of zero causes the thread to
relinquish the remainder of its time
slice to any other thread that is
ready to run. If there are no other
threads ready to run, the function
returns immediately, and the thread
continues execution.
Source
I'm afraid I can't improve on the MSDN docs here
A value of zero causes the thread to
relinquish the remainder of its time
slice to any other thread that is
ready to run. If there are no other
threads ready to run, the function
returns immediately, and the thread
continues execution.
Windows XP/2000: A value of zero
causes the thread to relinquish the
remainder of its time slice to any
other thread of equal priority that is
ready to run. If there are no other
threads of equal priority ready to
run, the function returns immediately,
and the thread continues execution.
This behavior changed starting with
Windows Server 2003.
Please also note (via upvote) the two useful answers regarding efficiency problems here.
Be careful with Sleep(0): if one loop iteration is short, calling it on every iteration can slow the loop down significantly. If you do need it, you can call Sleep(0) only once per 100 iterations, for example.
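A sketch of that "once every N iterations" idea, using std::this_thread::yield() as a portable stand-in for Sleep(0) (the two are similar in intent but not guaranteed to behave identically). The function name spin_until and its parameters are made up for the example.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Spins until `done` becomes true, yielding once every `yield_every`
// iterations (which must be non-zero). Returns the number of iterations spun.
std::uint64_t spin_until(const std::atomic<bool>& done, std::uint64_t yield_every) {
    std::uint64_t iterations = 0;
    while (!done.load(std::memory_order_acquire)) {
        if (++iterations % yield_every == 0) {
            std::this_thread::yield();  // portable stand-in for Sleep(0)
        }
    }
    return iterations;
}
```

With yield_every set to 100, the loop gives up its time slice on only one iteration in a hundred, which keeps the per-iteration overhead low while still letting other ready threads run.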
Sleep(0); At that instruction, the system scheduler will check for any other runnable threads and possibly give them a chance to use the system resources depending on thread priorities.
On Linux there's a specific function for this: sched_yield().
From the man pages:
sched_yield() causes the calling thread to relinquish the CPU. The
thread is moved to the end of the queue for its static priority and a
new thread gets to run.
If the calling thread is the only thread in the highest priority list
at that time, it will continue to run after a call to sched_yield().
and also:
Strategic calls to sched_yield() can improve performance by giving
other threads or processes a chance to run when (heavily) contended
resources (e.g., mutexes) have been released by the caller. Avoid
calling sched_yield() unnecessarily or inappropriately (e.g., when
resources needed by other schedulable threads are still held by the
caller), since doing so will result in unnecessary context switches,
which will degrade system performance.
In one app, the main thread looked for things to do, then launched the "work" via a new thread. In this case, you should call sched_yield() (or sleep(0)) in the main thread, so that you do not make the "looking" for work more important than the "work". I prefer sleep(0), but sometimes this is excessive (because you are sleeping a fraction of a second).
Sleep(0) is a powerful tool and it can improve performance in certain cases. Using it in a fast loop might be considered in special cases. When a set of threads needs to be as responsive as possible, they should all call Sleep(0) frequently. But it is crucial to find a measure for what responsive means in the context of the code.
I've given some details in https://stackoverflow.com/a/11456112/1504523
I am using pthreads, and for some reason on my Mac the compiler does not find a declaration for pthread_yield(). But it seems that sleep(0) is the same thing.