Performance of Windows timer functions

I am trying to call a watchdog function every 500 ms using timeSetEvent.
Normally the watchdog is called without any problems. However, when a DVD is inserted into a drive, I don't get a callback for up to 8 seconds because the system is busy reading the disk. Using a Windows timer-queue timer, things aren't quite so bad and I get a 4 second delay.
My last attempt was to set the thread priority to time critical and then sleep for 500 ms, but again the callback didn't occur for 8 seconds.
I am still running XP, but I'm not aware of changes made in later operating systems which would make this better.
I am using the DVD insertion as an example. I'd like the code to be robust to other conditions if possible.
Any suggestions greatly appreciated.

The best way we've found of getting decent timing performance from Windows is to avoid the timer functions altogether and roll your own; just sit in a tight loop, eating CPU, until the time for the next timer event comes up. QueryPerformanceCounter seems to be the best way of getting fast, high-resolution time for this purpose. Do whatever you can to give the thread that is running this loop its own CPU core that it will stick to and won't have anything else taking CPU time from.
But be warned:
Timing is still far from perfect. This method leaves the vast bulk of your cycles with Fairly Reasonable Timing (TM), but there will be occasional glitches, mostly in the hundreds of milliseconds but very occasionally in the range of a few seconds.
This, of course, sets something of a lower bound on your CPU utilisation.
I still don't know how it'll perform with your DVD driver.
In general, if you need accurate timing, Windows is not the place to do it.
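The tight-loop approach described in this answer can be sketched roughly as follows. This is a minimal, portable illustration rather than the answerer's actual code: it swaps in std::chrono::steady_clock for QueryPerformanceCounter, and the names spin_until and run_periodic_spin are hypothetical. Pinning the thread to its own core is not shown.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;  // stand-in for QueryPerformanceCounter

// Busy-wait (never sleeps) until `deadline`; returns the time seen on exit.
inline Clock::time_point spin_until(Clock::time_point deadline) {
    Clock::time_point now = Clock::now();
    while (now < deadline) {
        now = Clock::now();  // burns CPU on purpose
    }
    return now;
}

// Calls task(i) `count` times, once per `period`. Deadlines are derived from
// one initial clock reading, so a late step does not shift the later ones.
// Returns how many steps were late by a whole period or more.
template <typename Task>
int run_periodic_spin(Clock::duration period, int count, Task task) {
    assert(period > Clock::duration::zero() && count >= 0);
    int late = 0;
    Clock::time_point deadline = Clock::now();
    for (int i = 0; i < count; ++i) {
        deadline += period;
        Clock::time_point woke = spin_until(deadline);
        if (woke - deadline >= period) {
            ++late;
        }
        task(i);
    }
    return late;
}
```

For the 500 ms watchdog in the question, a call such as run_periodic_spin(std::chrono::milliseconds(500), n, watchdog) would occupy one core completely, which is the CPU cost the answer warns about.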

Have you looked into the RegisterWaitForSingleObject function (http://msdn.microsoft.com/en-us/library/windows/desktop/ms685061%28v=vs.85%29.aspx)?

You could try this example code, meant to be used in a separate thread to run at a fixed frequency, up to 400 Hz on Windows XP (where Sleep(1) can take up to about 2 ms). There seems to be a system-wide delay when a new volume becomes ready, in this case a DVD, but a similar delay can happen if an external USB hard drive is turned on. It seems Windows XP starts scanning the new volume, locking up quite a bit of the system, and even this code example may be affected, since it uses Sleep(1) to reduce CPU overhead. You could remove the Sleep(1) call, which would keep one of the CPU cores running at 100%, but it's still possible that inserting a DVD would disrupt even that.
/* code for a thread to run at fixed frequency */
#include <windows.h> /* timeBeginPeriod/timeEndPeriod need winmm.lib */
typedef unsigned long long UI64; /* unsigned 64 bit int */
#define FREQ 2 /* frequency (500 ms) */
DWORD dwLateStep = 0; /* late step count */
LARGE_INTEGER liPerfFreq; /* 64 bit frequency */
LARGE_INTEGER liPerfTemp; /* used for query */
UI64 uFreq = FREQ; /* process frequency */
UI64 uOrig; /* original tick */
UI64 uWait; /* tick rate / freq */
UI64 uRem = 0; /* tick rate % freq */
UI64 uPrev; /* previous tick based on original tick */
UI64 uDelta; /* current tick - previous */
UI64 u2ms; /* 2ms of ticks */
UI64 i;
/* ... */ /* wait for some event to start thread */
QueryPerformanceFrequency(&liPerfFreq); /* counter ticks per second */
u2ms = ((UI64)(liPerfFreq.QuadPart)+499) / ((UI64)500);
timeBeginPeriod(1); /* set period to 1ms */
Sleep(128); /* wait for it to stabilize */
QueryPerformanceCounter((PLARGE_INTEGER)&liPerfTemp);
uOrig = uPrev = liPerfTemp.QuadPart;
for(i = 0; i < (uFreq*30); i++){
    /* update uWait and uRem based on uRem */
    uWait = ((UI64)(liPerfFreq.QuadPart) + uRem) / uFreq;
    uRem = ((UI64)(liPerfFreq.QuadPart) + uRem) % uFreq;
    /* wait for uWait ticks */
    while(1){
        QueryPerformanceCounter((PLARGE_INTEGER)&liPerfTemp);
        uDelta = (UI64)(liPerfTemp.QuadPart - uPrev);
        if(uDelta >= uWait)
            break;
        if((uWait - uDelta) > u2ms)
            Sleep(1);
    }
    if(uDelta >= (uWait*2))
        dwLateStep += 1;
    uPrev += uWait;
    /* fixed frequency code goes here */
    /* along with some type of break when done */
}
timeEndPeriod(1); /* restore period */
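The uWait/uRem bookkeeping in the loop above is what keeps the schedule from drifting when the counter frequency is not an exact multiple of the step frequency: the division remainder is carried into the next step instead of being discarded. The following sketch isolates just that arithmetic so it can be checked without any Windows calls. StepDivider and ticks_after_steps are hypothetical names, and the counter frequencies in the usage note are only example values.

```cpp
#include <cassert>
#include <cstdint>

// Splits one second of counter ticks into `steps_per_second` waits,
// carrying the division remainder forward (mirrors uWait and uRem).
struct StepDivider {
    std::uint64_t ticks_per_second;
    std::uint64_t steps_per_second;
    std::uint64_t remainder;

    StepDivider(std::uint64_t tps, std::uint64_t sps)
        : ticks_per_second(tps), steps_per_second(sps), remainder(0) {
        assert(sps > 0);
    }

    // Number of ticks to wait for the next step.
    std::uint64_t next_wait() {
        std::uint64_t total = ticks_per_second + remainder;
        remainder = total % steps_per_second;
        return total / steps_per_second;
    }
};

// Total ticks scheduled after `steps` consecutive steps.
inline std::uint64_t ticks_after_steps(std::uint64_t tps, std::uint64_t sps,
                                       std::uint64_t steps) {
    StepDivider divider(tps, sps);
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < steps; ++i) {
        sum += divider.next_wait();
    }
    return sum;
}
```

With a 10,000,000 ticks/s counter and 3 steps per second, the waits come out as 3333333, 3333333 and 3333334 ticks, which sum to exactly one second of ticks; plain truncating division would lose one tick every second.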

Related

How to measure elapse time in Cortex-M4

I am using a Cortex-M4 on an SoC and I want to measure the time a certain function takes.
Googling it, I found two methods:
Method 1 - using DWT_CYCCNT
REGISTER(DEMCR_ADDR) |= 1 << 24 ; //TRCENA_OFFSET
REGISTER(DWT_CTRL) |= 1; //on
startTime = REGISTER(DWT_CYCCNT);
//doing work
elapsedTime = REGISTER(DWT_CYCCNT) - startTime;
REGISTER(DWT_CTRL) &= ~1; //off
Method 2: - using SysTick
//init
SysTick->LOAD = SysTick_LOAD_RELOAD_Msk; /* set reload register = MAX COUNT*/
SysTick->VAL = 0UL; /* Load the SysTick Counter Value */
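SysTick->CTRL = SysTick_CTRL_CLKSOURCE_Msk |
                SysTick_CTRL_ENABLE_Msk; /* Enable SysTick Timer on the core clock (TICKINT not set, so no IRQ) */
startTime = SysTick->VAL;
//do some work
elapsedTime = (startTime - SysTick->VAL) & SysTick_VAL_CURRENT_Msk; /* SysTick counts down and is 24 bits wide */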
SysTick->LOAD = SysTick_LOAD_RELOAD_Msk; /* set reload register = MAX COUNT*/
SysTick->VAL = 0UL; /* Load the SysTick Counter Value */
SysTick->CTRL = 0UL;
I wonder what the advantages and disadvantages of these two methods are.

I have used both these methods in different projects.
In either case, you might use one of these because the other was already used for something else. If your RTOS wants the systick, use the debug counter. If your debugger wants the debug counter, use the systick.
The main disadvantage of the systick is that it only has 24 bits, whereas the debug counter has 32.
The main disadvantage of the debug counter is that it is not available on every part (the systick is optional too, but hardly any silicon vendors take it out).
Enabling the whole debug block just for a counter also wastes a little bit of power, which you might care about if you are running from batteries.
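The 24-bit versus 32-bit difference also affects how the elapsed count is computed. As a sketch that runs on a host PC rather than on the target, the two helpers below take raw counter readings as plain integers; dwt_elapsed and systick_elapsed are hypothetical names, and the SysTick helper assumes the reload value is the maximum 0xFFFFFF, as in the question.

```cpp
#include <cassert>
#include <cstdint>

// DWT_CYCCNT is a 32-bit up-counter: unsigned subtraction handles one wrap.
inline std::uint32_t dwt_elapsed(std::uint32_t start, std::uint32_t end) {
    return end - start;
}

// SysTick is a 24-bit down-counter. With LOAD = 0xFFFFFF its period is
// 2^24 counts, so the difference is taken as start - end, modulo 2^24.
inline std::uint32_t systick_elapsed(std::uint32_t start, std::uint32_t end) {
    assert(start <= 0x00FFFFFFu && end <= 0x00FFFFFFu);
    return (start - end) & 0x00FFFFFFu;
}
```

Either helper is only correct if the counter wrapped at most once between the two readings. At a hypothetical 100 MHz core clock that means roughly 0.17 s for SysTick (2^24 cycles) and roughly 43 s for the debug counter (2^32 cycles).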

timeGetTime() start variable is bigger than end variable

I am using timeGetTime() to limit the framerate to 60 frames per second. The way I intend to do that is to measure the time it takes to render 60 frames and then use Sleep to wait for the remainder of the second. But for some reason timeGetTime() returns a much bigger number the first time I call it than when I call it after the 60 frames are rendered.
Here is the code:
Header
#ifndef __TesteMapa_h_
#define __TesteMapa_h_
#include "BaseApplication.h"
#include "Mundo.h"
class TesteMapa : public BaseApplication{
public:
    TesteMapa();
    virtual ~TesteMapa();
protected:
    virtual void createScene();
    virtual bool frameRenderingQueued(const Ogre::FrameEvent& evt);
    virtual bool frameEnded(const Ogre::FrameEvent& evt);
    virtual bool keyPressed(const OIS::KeyEvent &evt);
    virtual bool keyReleased(const OIS::KeyEvent &evt);
private:
    Mundo mundo = Mundo(3,3,3);
    short altura, largura, passos, balanca, framesNoSegundo=0;
    Ogre::SceneNode *noSol, *noSolFilho, *noCamera;
    DWORD inicioSegundo = 0, finala; //inicioSegundo is the start variable and finala the ending variable
};
#endif
CPP relevant function
bool TesteMapa::frameEnded(const Ogre::FrameEvent& evt){
    framesNoSegundo++;
    if (inicioSegundo == 0)
        inicioSegundo = timeGetTime();
    else{
        if (framesNoSegundo == 60){
            finala = timeGetTime(); //getting this just to see the value being returned
            Sleep(1000UL - (timeGetTime() - inicioSegundo));
            inicioSegundo = 0;
            framesNoSegundo = 0;
        }
    }
    return true;
}
I am using timeBeginPeriod(1) and timeEndPeriod(1) in the main function.

Without even reading the complete question, the following:
using timeGetTime()
to limit the framerate to 60 frames per second
...
Sleep for the remainder of the second
can be answered with a firm "You are doing it wrong". In other words, stop here, and take a different approach.
Neither does timeGetTime have the necessary precision (not even if you use timeBeginPeriod(1)), nor does Sleep have the required precision, nor does Sleep give any guarantees about the maximum duration, nor are the semantics of Sleep even remotely close to what you expect, nor is sleeping to limit the frame rate a correct approach.
Also, calculating the remainder of the second will inevitably introduce a systematic error that will accumulate over time.
The one and only correct approach to limit frame rate is to use vertical sync.
If you need to otherwise limit a simulation to a particular rate, using a waitable timer is the correct approach. That will still be subject to the scheduler's precision, but it will avoid accumulating systematic errors, and priority boost will at least give a de-facto soft realtime guarantee.
In order to understand why what you are doing is (aside from precision and accumulating errors) conceptually wrong to begin with, consider two things:
Different timers, even if they run at apparently the same frequency, will diverge (thus, using any timer other than the vsync interrupt is wrong to limit frame rate). Watch cars at a red traffic light for a real-life analogy. Their blinkers will always be out of sync.
Sleep makes the current thread "not ready" to run, and eventually, some time after the specified time has passed, makes the thread "ready" again. That doesn't mean that the thread will run at that time again. Indeed, it doesn't necessarily mean that the thread will run at all in any finite amount of time.
Resolution is commonly around 16ms (1ms if you adjust the scheduler's granularity, which is an antipattern -- some recent architectures support 0.5ms by using the undocumented Nt API), which is way too coarse for something on the 1/60 second scale.
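The point about accumulating systematic error can be illustrated with a small deterministic model. This is only a sketch under stated assumptions: time is simulated in integer microseconds, every sleep is assumed to overshoot by the same fixed amount, and both function names are hypothetical. It compares sleeping for a fixed duration after each frame with sleeping until deadlines computed from one original start time, which is what a periodic waitable timer effectively provides.

```cpp
#include <cassert>
#include <cstdint>

// Relative pacing: each frame sleeps `period_us` from whenever it woke up,
// so every overshoot is added to the running total.
inline std::int64_t wake_time_relative(std::int64_t period_us,
                                       std::int64_t overshoot_us, int frames) {
    assert(period_us > 0 && overshoot_us >= 0 && frames >= 0);
    std::int64_t now = 0;
    for (int i = 0; i < frames; ++i) {
        now += period_us + overshoot_us;
    }
    return now;
}

// Absolute pacing: deadlines are multiples of the period from time 0,
// so an overshoot delays one wake-up but never moves later deadlines.
inline std::int64_t wake_time_absolute(std::int64_t period_us,
                                       std::int64_t overshoot_us, int frames) {
    assert(period_us > 0 && overshoot_us >= 0 && frames >= 0);
    std::int64_t now = 0;
    std::int64_t deadline = 0;
    for (int i = 0; i < frames; ++i) {
        deadline += period_us;
        if (now < deadline) {
            now = deadline + overshoot_us;  // sleep until the deadline, wake late
        }
    }
    return now;
}
```

With a 16667 µs period and an assumed 500 µs overshoot, 60 relative sleeps end 30 ms late, while the absolute schedule is never more than the single 500 µs overshoot behind.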

If you're using Visual Studio 2013 or older, std::chrono uses the 64hz ticker (15.625 ms per tick), which is slow. VS 2015 is supposed to fix this. You can use QueryPerformanceCounter instead. Here is example code that runs at a fixed frequency with no drift, since delays are based off an original reading of the counter. dwLateStep is a debugging aid that gets incremented if one or more steps took too long. The code is Windows XP compatible, where Sleep(1) can take up to 2 ms, which is why the code only does a sleep if there is 2 ms or more of time to delay.
typedef unsigned long long UI64; /* unsigned 64 bit int */
#define FREQ 60 /* frequency */
DWORD dwLateStep; /* late step count */
LARGE_INTEGER liPerfFreq; /* 64 bit frequency */
LARGE_INTEGER liPerfTemp; /* used for query */
UI64 uFreq = FREQ; /* thread frequency */
UI64 uOrig; /* original tick */
UI64 uWait; /* tick rate / freq */
UI64 uRem = 0; /* tick rate % freq */
UI64 uPrev; /* previous tick based on original tick */
UI64 uDelta; /* current tick - previous */
UI64 u2ms; /* 2ms of ticks */
UI64 i;
/* ... */ /* wait for some event to start thread */
QueryPerformanceFrequency(&liPerfFreq);
u2ms = ((UI64)(liPerfFreq.QuadPart)+499) / ((UI64)500);
timeBeginPeriod(1); /* set period to 1ms */
Sleep(128); /* wait for it to stabilize */
QueryPerformanceCounter(&liPerfTemp);
uOrig = uPrev = liPerfTemp.QuadPart;
for(i = 0; i < (uFreq*30); i++){
    /* update uWait and uRem based on uRem */
    uWait = ((UI64)(liPerfFreq.QuadPart) + uRem) / uFreq;
    uRem = ((UI64)(liPerfFreq.QuadPart) + uRem) % uFreq;
    /* wait for uWait ticks */
    while(1){
        QueryPerformanceCounter((PLARGE_INTEGER)&liPerfTemp);
        uDelta = (UI64)(liPerfTemp.QuadPart - uPrev);
        if(uDelta >= uWait)
            break;
        if((uWait - uDelta) > u2ms)
            Sleep(1);
    }
    if(uDelta >= (uWait*2))
        dwLateStep += 1;
    uPrev += uWait;
    /* fixed frequency code goes here */
    /* along with some type of break when done */
}
timeEndPeriod(1); /* restore period */

How to create a Timer in C of fixed Duration [closed]

How do I create a 200µs timer in C/C++?
The timer should perform some task within that time, then reset, and keep performing the same task every 200µs.

C++ has std::chrono::high_resolution_clock, which may have nanosecond precision:
"...represents the clock with the smallest tick period provided by the implementation."
Standard time functions in C aren't very precise; you can expect an error of about 15 ms.
For waiting: Unix implementations provide usleep and nanosleep. On Windows you can use CreateWaitableTimer.
For the current time: Unix provides clock_gettime, and Windows provides QueryPerformanceCounter.
Implementing a timer class which uses these time functions shouldn't be much work but if you use a good framework probably a high resolution timer is already available.

Example windows code for a thread. Note that cpu usage will be 100% for the core running this thread.
/* code for a thread to run at fixed frequency */
typedef unsigned long long UI64; /* unsigned 64 bit int */
#define FREQ 5000 /* frequency */
LARGE_INTEGER liPerfFreq; /* 64 bit frequency */
LARGE_INTEGER liPerfTemp; /* used for query */
UI64 uFreq = FREQ; /* process frequency */
UI64 uOrig; /* original tick */
UI64 uWait; /* tick rate / freq */
UI64 uRem = 0; /* tick rate % freq */
UI64 uPrev; /* previous tick based on original tick */
UI64 uDelta; /* current tick - previous */
/* ... */ /* wait for some event to start thread */
QueryPerformanceFrequency(&liPerfFreq);
QueryPerformanceCounter((PLARGE_INTEGER)&liPerfTemp);
uOrig = uPrev = liPerfTemp.QuadPart;
while(1){
    /* update uWait and uRem based on uRem */
    uWait = ((UI64)(liPerfFreq.QuadPart) + uRem) / uFreq;
    uRem = ((UI64)(liPerfFreq.QuadPart) + uRem) % uFreq;
    /* wait for uWait ticks */
    while(1){
        QueryPerformanceCounter((PLARGE_INTEGER)&liPerfTemp);
        uDelta = (UI64)(liPerfTemp.QuadPart - uPrev);
        if(uDelta >= uWait)
            break;
    }
    uPrev += uWait;
    /* fixed frequency code goes here */
    /* along with some type of break when done */
}
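The FREQ value of 5000 in the code above corresponds to the 200µs period asked about (1,000,000 µs / 200 µs). As a small sketch of that conversion, and of how many performance-counter ticks such a period spans, the helpers below use hypothetical names; the counter frequencies in the usage note are example values only, since the real value comes from QueryPerformanceFrequency at run time.

```cpp
#include <cassert>
#include <cstdint>

// Steps per second for a period given in microseconds.
// Only exact when the period divides one second evenly.
inline std::uint64_t period_us_to_frequency(std::uint64_t period_us) {
    assert(period_us > 0 && 1000000u % period_us == 0);
    return 1000000u / period_us;
}

// Counter ticks covering `interval_us`, rounded to the nearest tick.
inline std::uint64_t microseconds_to_ticks(std::uint64_t interval_us,
                                           std::uint64_t ticks_per_second) {
    assert(ticks_per_second > 0);
    return (interval_us * ticks_per_second + 500000u) / 1000000u;
}
```

For example, 200 µs is 2000 ticks on a 10,000,000 ticks/s counter but only about 716 ticks on a 3,579,545 ticks/s counter, so the achievable timing granularity depends on the counter the machine exposes.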

How to coordinate threads properly based on a fixed cycle frequency?

I want to create a gateway to pass messages between a CAN bus and LCM in both directions. If a message in LCM is sent at a specific frequency, its copy on the CAN bus should be sent at exactly the same frequency.
How would you solve this?
My idea was to use two threads, one for each direction, each converting messages from one system to the other in a loop. The two threads are coordinated by a timer that is set to a frequency much lower than the maximum possible message-passing frequency. The timer sends a signal to the threads after each cycle. The threads wait for that event at the end of each loop, i.e. they sleep and free resources until the event occurs.
I implemented this idea already but the resulting frequencies are not constant. Am I doing something conceptually wrong?
A solution should run on windows, thus utilize either native windows api or boost threads for example. The gateway should be real-time capable.

On Windows, timeSetEvent can be used to set an event at a regular interval, although MSDN lists it as an obsolete function. Its replacement, CreateTimerQueueTimer, uses a callback function, so you would have to set the event from inside the callback.
You can raise the tick rate from its default of 64 Hz (15.625 ms per tick) to 1000 Hz (1 ms per tick) using timeBeginPeriod(1).
Some games have threads that run at fixed frequencies; they poll a high-frequency counter and call Sleep when there is enough delay time remaining in the current cycle. To prevent drift, the delay is based on an original reading of the high-frequency counter. The example code below is Windows XP compatible, where a Sleep(1) can take up to 2 milliseconds. dwLateStep is a diagnostic aid that is incremented if the code exceeds a cycle period.
/* code for a thread to run at fixed frequency */
typedef unsigned long long UI64; /* unsigned 64 bit int */
#define FREQ 400 /* frequency */
DWORD dwLateStep; /* late step count */
LARGE_INTEGER liPerfFreq; /* 64 bit frequency */
LARGE_INTEGER liPerfTemp; /* used for query */
UI64 uFreq = FREQ; /* process frequency */
UI64 uOrig; /* original tick */
UI64 uWait; /* tick rate / freq */
UI64 uRem = 0; /* tick rate % freq */
UI64 uPrev; /* previous tick based on original tick */
UI64 uDelta; /* current tick - previous */
UI64 u2ms; /* 2ms of ticks */
UI64 i;
/* ... */ /* wait for some event to start thread */
QueryPerformanceFrequency(&liPerfFreq);
u2ms = ((UI64)(liPerfFreq.QuadPart)+499) / ((UI64)500);
timeBeginPeriod(1); /* set period to 1ms */
Sleep(128); /* wait for it to stabilize */
QueryPerformanceCounter((PLARGE_INTEGER)&liPerfTemp);
uOrig = uPrev = liPerfTemp.QuadPart;
for(i = 0; i < (uFreq*30); i++){
    /* update uWait and uRem based on uRem */
    uWait = ((UI64)(liPerfFreq.QuadPart) + uRem) / uFreq;
    uRem = ((UI64)(liPerfFreq.QuadPart) + uRem) % uFreq;
    /* wait for uWait ticks */
    while(1){
        QueryPerformanceCounter((PLARGE_INTEGER)&liPerfTemp);
        uDelta = (UI64)(liPerfTemp.QuadPart - uPrev);
        if(uDelta >= uWait)
            break;
        if((uWait - uDelta) > u2ms)
            Sleep(1);
    }
    if(uDelta >= (uWait*2))
        dwLateStep += 1;
    uPrev += uWait;
    /* fixed frequency code goes here */
    /* along with some type of break when done */
}
timeEndPeriod(1); /* restore period */

Sleep(1) and SDL_Delay(1) takes 15 ms

I am writing a C++/SDL/OpenGL application, and I have had the most peculiar bug. The game seemed to be working fine with a simple variable timestep. But then the FPS started behaving strangely. I figured out that both Sleep(1) and SDL_Delay(1) take 15 ms to complete.
Any input to those functions between 0 and 15 takes 15 ms to complete, locking the FPS at about 64. If I set it to 16, it takes 30 ms.
My loop looks like this:
while (1){
    GLuint t = SDL_GetTicks();
    Sleep(1); //or SDL_Delay(1)
    cout << SDL_GetTicks() - t << endl; //outputs 15
}
It will very rarely take 1ms as it is supposed to, but the majority of the time it takes 15ms.
My OS is windows 8.1. CPU is an intel i7. I am using SDL2.

The ticker defaults to 64 hz, or 15.625 ms / tick. You need to change this to 1000hz == 1ms with timeBeginPeriod(1). MSDN article:
http://msdn.microsoft.com/en-us/library/windows/desktop/dd757624(v=vs.85).aspx
If the goal here is to get a fixed-frequency sequence, you should use a higher-resolution timer, but unfortunately these can only be polled, so a combination of polling and sleeping is needed to reduce CPU overhead. The example code below assumes that a Sleep(1) could take up to almost 2 ms (which does happen with Windows XP, but not with later versions of Windows).
/* code for a thread to run at fixed frequency */
#define FREQ 400 /* frequency */
typedef unsigned long long UI64; /* unsigned 64 bit int */
LARGE_INTEGER liPerfFreq; /* used for frequency */
LARGE_INTEGER liPerfTemp; /* used for query */
UI64 uFreq = FREQ; /* process frequency */
UI64 uOrig; /* original tick */
UI64 uWait; /* tick rate / freq */
UI64 uRem = 0; /* tick rate % freq */
UI64 uPrev; /* previous tick based on original tick */
UI64 uDelta; /* current tick - previous */
UI64 u2ms; /* 2ms of ticks */
#if 0 /* for optional error check */
static DWORD dwLateStep = 0;
#endif
/* get frequency */
QueryPerformanceFrequency(&liPerfFreq);
u2ms = ((UI64)(liPerfFreq.QuadPart)+499) / ((UI64)500);
/* wait for some event to start this thread code */
timeBeginPeriod(1); /* set period to 1ms */
Sleep(128); /* wait for it to stabilize */
QueryPerformanceCounter((PLARGE_INTEGER)&liPerfTemp);
uOrig = uPrev = liPerfTemp.QuadPart;
while(1){
    /* update uWait and uRem based on uRem */
    uWait = ((UI64)(liPerfFreq.QuadPart) + uRem) / uFreq;
    uRem = ((UI64)(liPerfFreq.QuadPart) + uRem) % uFreq;
    /* wait for uWait ticks */
    while(1){
        QueryPerformanceCounter((PLARGE_INTEGER)&liPerfTemp);
        uDelta = (UI64)(liPerfTemp.QuadPart - uPrev);
        if(uDelta >= uWait)
            break;
        if((uWait - uDelta) > u2ms)
            Sleep(1);
    }
#if 0 /* optional error check */
    if(uDelta >= (uWait*2))
        dwLateStep += 1;
#endif
    uPrev += uWait;
    /* fixed frequency code goes here */
    /* along with some type of break when done */
}
timeEndPeriod(1); /* restore period */

It looks like 15 ms is the smallest slice the OS will deliver to you. I'm not sure about your specific framework, but sleep functions usually guarantee only a minimum sleep time (i.e., Sleep(1) will sleep for at least 1 ms).
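The 15 ms and 30 ms figures reported in the question are consistent with a simple model in which a sleeping thread is only woken on a scheduler tick. The sketch below is that model and nothing more: it assumes the call is made right after a tick and that wake-ups happen exactly on ticks, which real systems only approximate. modeled_sleep_us is a hypothetical name, and times are in microseconds so that the 15.625 ms default tick stays an integer.

```cpp
#include <cassert>
#include <cstdint>

// Modeled duration of a sleep request when the thread can only be woken on
// scheduler ticks: the request is rounded up to a whole number of ticks.
inline std::int64_t modeled_sleep_us(std::int64_t requested_us,
                                     std::int64_t tick_us) {
    assert(requested_us >= 0 && tick_us > 0);
    std::int64_t ticks = (requested_us + tick_us - 1) / tick_us;  // ceiling
    return ticks * tick_us;
}
```

Under this model, with the default 15625 µs tick, requests of 1 ms and 15 ms both last one tick (about 15.6 ms) and a 16 ms request lasts two ticks (about 31 ms); with a 1000 µs tick after timeBeginPeriod(1), a 1 ms request lasts one 1 ms tick.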

SDL_Delay()/Sleep() cannot be used reliably with times below 10-15 milliseconds. CPU ticks don't register fast enough to detect a 1 ms difference.
See the SDL docs.
