Running code every x seconds, no matter how long execution within loop takes - c++

I'm trying to make an LED blink to the beat of a certain song. The song has exactly 125 bpm.
The code that I wrote seems to work at first, but the longer it runs, the bigger the gap becomes between the LED flash and the beat it should land on. The LED seems to blink a tiny bit too slowly.
I think that happens because lastBlink depends on the blink that happened right before it to stay in sync, instead of being anchored to one fixed initial value to sync to...
unsigned int bpm = 125;
int flashDuration = 10;
unsigned int lastBlink = 0;
for(;;) {
    if (getTickCount() >= lastBlink+1000/(bpm/60)) {
        lastBlink = getTickCount();
        printf("Blink!\r\n");
        RS232_SendByte(cport_nr, 4); //LED ON
        delay(flashDuration);
        RS232_SendByte(cport_nr, 0); //LED OFF
    }
}

Add the interval to lastBlink instead of re-reading getTickCount(), because the tick count may already have moved past the exact beat you wanted to wait for:
lastBlink += 1000/(bpm/60);
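For context, a sketch of how the question's loop might look with that change applied (getTickCount(), delay(), RS232_SendByte() and cport_nr are the helpers from the question). The interval here is computed as 60000/bpm, which also avoids the integer-division truncation another answer below points out:
unsigned int bpm = 125;
int flashDuration = 10;
unsigned int interval = 60000 / bpm;      // 480 ms per beat
unsigned int lastBlink = getTickCount();  // seed once from the clock
for(;;) {
    if (getTickCount() >= lastBlink + interval) {
        lastBlink += interval;            // advance by the interval; don't re-read the clock
        printf("Blink!\r\n");
        RS232_SendByte(cport_nr, 4);      // LED ON
        delay(flashDuration);
        RS232_SendByte(cport_nr, 0);      // LED OFF
    }
}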

Busy-waiting is bad, it spins the CPU for no good reason, and under most OS's it will lead to your process being punished -- the OS will notice that it is using up lots of CPU time and dynamically lower its priority so that other, less-greedy programs get first dibs on CPU time. It's much better to sleep until the appointed time(s) instead.
The trick is to dynamically calculate the amount of time to sleep until the next time to blink, based on the current system-clock time. (Simply delaying by a fixed amount of time means you will inevitably drift, since each iteration of your loop takes a non-zero and somewhat indeterminate time to execute).
Example code (tested under MacOS/X, probably also compiles under Linux, but can be adapted for just about any OS with some changes) follows:
#include <stdio.h>
#include <unistd.h>
#include <time.h>       // clock_gettime() / struct timespec
#include <sys/times.h>

// unit conversion code, just to make the conversion more obvious and self-documenting
static unsigned long long SecondsToMillis(unsigned long secs) {return secs*1000;}
static unsigned long long MillisToMicros(unsigned long ms) {return ms*1000;}
static unsigned long long NanosToMillis(unsigned long nanos) {return nanos/1000000;}

// Returns the current absolute time, in milliseconds, based on the appropriate high-resolution clock
static unsigned long long getCurrentTimeMillis()
{
#if defined(USE_POSIX_MONOTONIC_CLOCK)
    // Nicer new-style version using clock_gettime() and the monotonic clock
    struct timespec ts;
    return (clock_gettime(CLOCK_MONOTONIC, &ts) == 0) ? (SecondsToMillis(ts.tv_sec)+NanosToMillis(ts.tv_nsec)) : 0;
#else
    // old-school POSIX version using times()
    static clock_t _ticksPerSecond = 0;
    if (_ticksPerSecond <= 0) _ticksPerSecond = sysconf(_SC_CLK_TCK);

    struct tms junk; clock_t newTicks = (clock_t) times(&junk);
    return (_ticksPerSecond > 0) ? (SecondsToMillis((unsigned long long)newTicks)/_ticksPerSecond) : 0;
#endif
}

int main(int, char **)
{
    const unsigned int bpm = 125;
    const unsigned int flashDurationMillis = 10;
    const unsigned int millisBetweenBlinks = SecondsToMillis(60)/bpm;
    printf("Milliseconds between blinks: %u\n", millisBetweenBlinks);

    unsigned long long nextBlinkTimeMillis = getCurrentTimeMillis();
    for(;;) {
        // signed difference: negative means we are already late and should not sleep
        long long millisToSleepFor = (long long)nextBlinkTimeMillis - (long long)getCurrentTimeMillis();
        if (millisToSleepFor > 0) usleep(MillisToMicros(millisToSleepFor));

        printf("Blink!\r\n");
        //RS232_SendByte(cport_nr, 4); //LED ON
        usleep(MillisToMicros(flashDurationMillis));
        //RS232_SendByte(cport_nr, 0); //LED OFF
        nextBlinkTimeMillis += millisBetweenBlinks;
    }
}

I think the drift problem may be rooted in your using relative time delays by sleeping for a fixed duration rather than sleeping until an absolute point in time. The problem is threads don't always wake up precisely on time due to scheduling issues.
Something like this solution may work for you:
// for readability
using clock = std::chrono::steady_clock;
unsigned int bpm = 125;
int flashDuration = 10;
// time for entire cycle
clock::duration total_wait = std::chrono::milliseconds(1000 * 60 / bpm);
// time for LED off part of cycle
clock::duration off_wait = std::chrono::milliseconds(1000 - flashDuration);
// time for LED on part of cycle
clock::duration on_wait = total_wait - off_wait;
// when is next change ready?
clock::time_point ready = clock::now();
for(;;)
{
    // wait for time to turn light on
    std::this_thread::sleep_until(ready);
    RS232_SendByte(cport_nr, 4); // LED ON

    // reset timer for off
    ready += on_wait;

    // wait for time to turn light off
    std::this_thread::sleep_until(ready);
    RS232_SendByte(cport_nr, 0); // LED OFF

    // reset timer for on
    ready += off_wait;
}

If your problem is drifting out of sync rather than latency I would suggest measuring time from a given start instead of from the last blink.
start = now()
blinks = 0
period = 60 / bpm
while true
    if 0 < ((now() - start) - blinks * period)
        ledon()
        sleep(blinklength)
        ledoff()
        blinks++
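A rough C++ <chrono> translation of that pseudocode (a sketch only; like the pseudocode it busy-waits, and the RS232 calls are the question's helpers, left commented out here):
#include <chrono>
#include <thread>

void blink_from_fixed_start()
{
    using namespace std::chrono;
    const unsigned bpm = 125;
    const auto period = duration<double>(60.0 / bpm);  // 0.48 s per beat
    const auto flash  = milliseconds(10);

    const auto start = steady_clock::now();
    unsigned long long blinks = 0;
    for (;;)
    {
        // compare elapsed time against where blink number `blinks` should fall
        if (steady_clock::now() - start > blinks * period)
        {
            //RS232_SendByte(cport_nr, 4); // LED ON
            std::this_thread::sleep_for(flash);
            //RS232_SendByte(cport_nr, 0); // LED OFF
            ++blinks;
        }
    }
}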

Since you didn't specify C++98/03, I'm assuming at least C++11, and thus <chrono> is available. This so far is consistent with Galik's answer. However I would set it up so as to use <chrono>'s conversion abilities more precisely, and without having to manually enter conversion factors, except to describe "beats / minute", or actually in this answer, the inverse: "minutes / beat".
using namespace std;
using namespace std::chrono;
using mpb = duration<int, ratio_divide<minutes::period, ratio<125>>>;
constexpr auto flashDuration = 10ms;
auto beginBlink = steady_clock::now() + mpb{0};
while (true)
{
    RS232_SendByte(cport_nr, 4); //LED ON
    this_thread::sleep_until(beginBlink + flashDuration);
    RS232_SendByte(cport_nr, 0); //LED OFF
    beginBlink += mpb{1};
    this_thread::sleep_until(beginBlink);
}
The first thing to do is specify the duration of a beat, which is "minutes/125". This is what mpb does. I've used minutes::period as a stand in for 60, just in an attempt to improve readability and reduce the number of magic numbers.
Assuming C++14, I can give flashDuration real units (milliseconds). In C++11 this would need to be spelled with this more verbose syntax:
constexpr auto flashDuration = milliseconds{10};
And then the loop: This is very similar in design to Galik's answer, but here I only increment the time to start the blink once per iteration, and each time, by precisely 60/125 seconds.
By delaying until a specified time_point, as opposed to a specific duration, one ensures that there is no round off accumulation as time progresses. And by working in units which exactly describe your required duration interval, there is also no round off error in terms of computing the start time of the next interval.
No need to traffic in milliseconds. And no need to compute how long one needs to delay. Only the need to symbolically compute the start time of each iteration.
Um...
Sorry to pick on Galik's answer, which I believe is the second best answer next to mine, but it exhibits a bug which my answer not only doesn't have, but is designed to prevent. I didn't notice it until I dug into it with a calculator, and it is subtle enough that testing might miss it.
In Galik's answer:
total_wait = 480ms; // this is exactly correct
off_wait = 990ms; // likely a design flaw
on_wait = -510ms; // certainly a mistake
And the total time that an iteration takes is on_wait + off_wait, which is exactly 480ms. So the overall blink period still matches total_wait and a drift test will pass, but the LED-on part of the cycle collapses to essentially nothing (sleep_until with a time_point already in the past returns immediately), which makes debugging very challenging.
In contrast my answer increments ready (beginBlink) only once, and by exactly 480ms.
My answer is more likely to be right for the simple reason that it delegates more of its computation to the <chrono> library. And in this particular case, that probability paid off.
Avoid manual conversions. Instead let the <chrono> library do them for you. Manual conversions introduce the possibility for error.

You should measure the time the rest of the loop takes and subtract it from your delay (the flashDuration / wait value), so the total cycle length stays constant.
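One way that suggestion might look (a sketch, reusing the question's getTickCount(), delay() and RS232 helpers): time the work done in each cycle and shorten the fixed delay by that amount, so every cycle has the same total length. Note this only compensates for per-cycle overhead; it does not correct error that has already accumulated, which the absolute-time answers above avoid entirely.
const unsigned int periodMs = 60000u / bpm;            // 480 ms per beat at 125 bpm
for (;;) {
    unsigned int cycleStart = getTickCount();
    RS232_SendByte(cport_nr, 4);                       // LED ON
    delay(flashDuration);
    RS232_SendByte(cport_nr, 0);                       // LED OFF
    unsigned int spent = getTickCount() - cycleStart;  // time used so far this cycle
    if (spent < periodMs)
        delay(periodMs - spent);                       // sleep only the remainder
}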

The most obvious issue is that you're losing precision when you compute bpm/60 in integer arithmetic. It always yields 2 instead of 2.0833..., so 1000/(bpm/60) comes out as 500 ms rather than the 480 ms per beat you actually want.
Calling getTickCount() twice could also lead to some drift.
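For illustration, a couple of ways to avoid that truncation (a sketch):
unsigned int intervalMs      = 60000u / bpm;   // 60000/125 = 480 ms, exact here
double       intervalMsExact = 60000.0 / bpm;  // 480.0 ms, exact for any bpm
// compare: 1000/(bpm/60) = 1000/2 = 500 ms, which is why the LED falls behind the beat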

Related

Best way to implement a high resolution timer

What is the best way in C++11 to implement a high-resolution timer that continuously checks for time in a loop, and executes some code after it passes a certain point in time? e.g. check what time it is in a loop from 9am onwards and execute some code exactly at 11am. I require the timing to be precise (i.e. no more than 1 microsecond after 9am).
I will be implementing this program on Linux CentOS 7.3, and have no issues with dedicating CPU resources to execute this task.
Instead of implementing this manually, you could use e.g. a systemd.timer. Make sure to specify the desired accuracy which can apparently be as precise as 1us.
a high-resolution timer that continuously checks for time in a loop,
First of all, you do not want to continuously check the time in a loop; that's extremely inefficient and simply unnecessary.
...executes some code after it passes a certain point in time?
Ok so you want to run some code at a given time in the future, as accurately as possible.
The simplest way is to simply start a background thread, compute how long until the target time (in the desired resolution) and then put the thread to sleep for that time period. When your thread wakes up, it executes the actual task. This should be accurate enough for the vast majority of needs.
The std::chrono library provides calls which make this easy:
System clock in std::chrono
High resolution clock in std::chrono
Here's a snippet of code which does what you want using the system clock (which makes it easier to set a wall clock time):
// c++ --std=c++11 ans.cpp -o ans
#include <thread>
#include <chrono>
#include <time.h>
#include <iostream>
#include <iomanip>

// do some busy work
int work(int count)
{
    int sum = 0;
    for (int i = 0; i < count; i++)
    {
        sum += i;
    }
    return sum;
}

std::chrono::system_clock::time_point make_scheduled_time (int yyyy, int mm, int dd, int HH, int MM, int SS)
{
    tm datetime = tm{};
    datetime.tm_year = yyyy - 1900; // Year since 1900
    datetime.tm_mon  = mm - 1;      // Month since January
    datetime.tm_mday = dd;          // Day of the month [1-31]
    datetime.tm_hour = HH;          // Hour of the day [00-23]
    datetime.tm_min  = MM;
    datetime.tm_sec  = SS;
    time_t ttime_t = mktime(&datetime);
    std::chrono::system_clock::time_point scheduled = std::chrono::system_clock::from_time_t(ttime_t);
    return scheduled;
}

void do_work_at_scheduled_time()
{
    using period = std::chrono::system_clock::period;

    auto sched_start = make_scheduled_time(2019, 9, 17, // date
                                           00, 14, 00); // time

    // Wait until the scheduled time to actually do the work
    std::this_thread::sleep_until(sched_start);

    // Figure out how close to the scheduled time we actually awoke
    auto actual_start = std::chrono::system_clock::now();
    auto start_delta = actual_start - sched_start;
    float delta_ms = float(start_delta.count())*period::num/period::den * 1e3f;
    std::cout << "worker: awoken within " << delta_ms << " ms" << std::endl;

    // Now do some actual work!
    int sum = work(12345);
}

int main()
{
    std::thread worker(do_work_at_scheduled_time);
    worker.join();
    return 0;
}
On my laptop, the typical latency is about 2-3ms. If you use the high_resolution_clock you should be able to get even better results.
There are other APIs you could use too, such as Boost, where you could use ASIO to implement a high-resolution timeout.
I require the timing to be precise (i.e. no more than 1 microsecond after 9am).
Do you really need it to be accurate to the microsecond? Consider that at this resolution, you will also need to take into account all sorts of other factors, including system load, latency, clock jitter, and so on. Your code can start to execute at close to that time, but that's only part of the problem.
My suggestion would be to use timer_create(). This allows you to get notified by a signal at a given time. You can then implement your action in the signal handler.
In any case you should be aware that the accuracy of course depends on the system clock accuracy.
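For illustration, a minimal sketch of that approach (assumptions: a POSIX system with timers available, linked with -lrt on older glibc; SIGRTMIN and the five-second offset are placeholders for your real signal and target time):
#include <signal.h>
#include <time.h>
#include <unistd.h>

static void on_timer(int /*signo*/)
{
    // Keep real handlers async-signal-safe; write() is, printf() is not.
    const char msg[] = "timer fired\n";
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
}

int main()
{
    struct sigaction sa = {};
    sa.sa_handler = on_timer;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGRTMIN, &sa, nullptr);

    struct sigevent sev = {};
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo  = SIGRTMIN;

    timer_t timerid;
    timer_create(CLOCK_REALTIME, &sev, &timerid);

    // Absolute expiry: 5 seconds from now (stand-in for "11am").
    struct itimerspec its = {};
    clock_gettime(CLOCK_REALTIME, &its.it_value);
    its.it_value.tv_sec += 5;
    timer_settime(timerid, TIMER_ABSTIME, &its, nullptr);

    pause(); // wait for the signal
    return 0;
}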

pthread_cond_timedwait timing out late when large load put on CPU

In writing unit tests for an object, I am noticing that pthread_cond_timedwait does not time out soon enough when large loads are put on the CPU. If these loads are not put on the CPU, everything works fine. When loads are put onto the system, however, I find that no matter what I set the timeout to, the true delay is off by about 50-100ms.
For example, here is a printout from a single interval of the program, where the last and current times are found using the function GetTimeInMs.
// Printout, values are in ms
Last: 89799240
Current: 89799440
Period Length: 200
Expected Period: 100
From all I have read this issue is usually caused by using relative times instead of absolute times, but as far as I can tell we are using absolute times correctly. If you wonderful people could help me figure out what is being done wrong here I would be very grateful.
The function utilizing timedwait is shown here. Note that based off of timing debugging I have done, I know the extra time generated is done via the timedwait call, so I have not included other code that would not be necessary.
bool func(unsigned long long int time = 100) // ms
{
    pthread_mutex_lock(&m_Mutex);
    if (0 == m_CurrentCount)
    {
        // Current time + delay in ns
        unsigned long long int absnanotime = (GetTimeInMs()+time)*1000000;
        struct timespec ts;
        ts.tv_nsec = absnanotime % 1000000000ULL;
        ts.tv_sec = absnanotime / 1000000000ULL;
        do
        {
            if (0 != pthread_cond_timedwait(&m_Condition, &m_Mutex, &ts))
            {
                // In the case I am testing, I hope to get here via timeout in 100 ms
                pthread_mutex_unlock(&m_Mutex);
                return false;
            }
        }
        while (!m_CurrentCount);
    }
    pthread_mutex_unlock(&m_Mutex);
    return true;
}
unsigned long long int GetTimeInMs()
{
unsigned long long int time;
struct timespec ts;
clock_gettime(CLOCK_MONOTONIC, &ts);
time = ts.tv_nsec + ts.tv_sec * 1000000000ULL;
time = time / 1000000ULL; // Converts to ms
return time;
}
The code used to initialize the class variables used in func.
void init()
{
pthread_mutex_init(&m_Mutex, NULL);
pthread_condattr_init(&m_Attr);
pthread_condattr_setclock(&m_Attr, CLOCK_MONOTONIC);
pthread_cond_init(&m_Condition, &m_Attr);
}
The CPU eater thread which simulates CPU load is running the following while loop.
void cpuEatingThread()
{
while (false == m_ShutdownRequested);
{
// m_UselessFoo is of type float*
m_UselessFoo = new float(1.23423525);
delete m_UselessFoo;
}
}
It's likely that, when the wait times out, the thread becomes ready without any priority boost or similar action. If the box is loaded up, the newly ready thread may not get to run immediately.
It's common to apply temporary priority boosts to threads that become ready on signals -- this tends to improve overall performance in the 'usual' case where the signal arrives before the timeout. The timeout is often more of an 'unusual' event, often signaling some sort of failure that will not be repeated, and so threads becoming ready on timeout can wait their turn :)
For timed waits in general, the requirement is that they wait at least as long as their argument. If you want precise times, this is not the right tool; you'll need something that guarantees particular times, and that's generally only available in a real-time operating system (RTOS).

Why is the microsecond timestamp repetitive when using (a private) gettimeofday(), i.e. epoch time

I am printing microseconds continuously using gettimeofday(). As shown in the program output below, the time is not updated at microsecond intervals; rather, it repeats for a number of samples and then increments, not by microseconds but by milliseconds.
while(1)
{
gettimeofday(&capture_time, NULL);
printf(".%ld\n", capture_time.tv_usec);
}
Program output:
.414719
.414719
.414719
.414719
.430344
.430344
.430344
.430344
etc.
I want the output to increment sequentially like,
.414719
.414720
.414721
.414722
.414723
or
.414723, .414723+x, .414723+2x, .414723 +3x + ...+ .414723+nx
It seems that the microseconds are not refreshed when I read them from capture_time.tv_usec.
=================================
//Full Program
#include <iostream>
#include <windows.h>
#include <conio.h>
#include <time.h>
#include <stdio.h>
#if defined(_MSC_VER) || defined(_MSC_EXTENSIONS)
#define DELTA_EPOCH_IN_MICROSECS 11644473600000000Ui64
#else
#define DELTA_EPOCH_IN_MICROSECS 11644473600000000ULL
#endif
struct timezone
{
int tz_minuteswest; /* minutes W of Greenwich */
int tz_dsttime; /* type of dst correction */
};
timeval capture_time; // structure
int gettimeofday(struct timeval *tv, struct timezone *tz)
{
FILETIME ft;
unsigned __int64 tmpres = 0;
static int tzflag;
if (NULL != tv)
{
GetSystemTimeAsFileTime(&ft);
tmpres |= ft.dwHighDateTime;
tmpres <<= 32;
tmpres |= ft.dwLowDateTime;
/*converting file time to unix epoch*/
tmpres -= DELTA_EPOCH_IN_MICROSECS;
tmpres /= 10; /*convert into microseconds*/
tv->tv_sec = (long)(tmpres / 1000000UL);
tv->tv_usec = (long)(tmpres % 1000000UL);
}
if (NULL != tz)
{
if (!tzflag)
{
_tzset();
tzflag++;
}
tz->tz_minuteswest = _timezone / 60;
tz->tz_dsttime = _daylight;
}
return 0;
}
int main()
{
while(1)
{
gettimeofday(&capture_time, NULL);
printf(".%ld\n", capture_time.tv_usec);// JUST PRINTING MICROSECONDS
}
}
The change in time you observe is 0.414719 s to 0.430344 s. The difference is 15.625 ms. The fact that the number is represented in microseconds does not mean that it is incremented by 1 microsecond. In fact, 15.625 ms is exactly what I would have expected: it is the system time increment on standard hardware. I've given a closer look here and here.
This is called the granularity of the system time.
Windows:
However, there is a way to improve this, a way to reduce the granularity: the Multimedia Timers. In particular, Obtaining and Setting Timer Resolution discloses a way to increase the system's interrupt frequency.
The code:
#define TARGET_PERIOD 1 // 1-millisecond target interrupt period
TIMECAPS tc;
UINT wTimerRes;
if (timeGetDevCaps(&tc, sizeof(TIMECAPS)) != TIMERR_NOERROR)
// this call queries the systems timer hardware capabilities
// it returns the wPeriodMin and wPeriodMax with the TIMECAPS structure
{
// Error; application can't continue.
}
// finding the minimum possible interrupt period:
wTimerRes = min(max(tc.wPeriodMin, TARGET_PERIOD ), tc.wPeriodMax);
// and setting the minimum period:
timeBeginPeriod(wTimerRes);
This will force the system to run at its maximum interrupt frequency. As a consequence, the system time is updated more often, and the granularity of the system time increment will be close to 1 millisecond on most systems.
When you need resolution/granularity beyond this, you'd have to look into QueryPerformanceCounter. But it is to be used with care over longer periods of time. The frequency of this counter can be obtained by a call to QueryPerformanceFrequency. The OS considers this frequency a constant and will give the same value at all times. However, some hardware produces this frequency, and the true frequency differs from the given value: it has an offset and it shows thermal drift. Thus the error should be assumed to be in the range of several to many microseconds per second. More details about this can be found in the second "here" link above.
Linux:
The situation looks somewhat different for Linux. See this to get an idea. Linux mixes information from the CMOS clock using the function getnstimeofday (for seconds since the epoch) and information from a high-frequency counter (for the microseconds) using the function timekeeping_get_ns. This is not trivial and is questionable in terms of accuracy since both sources are backed by different hardware. The two sources are not phase locked, thus it is possible to get more/less than one million microseconds per second.
The Windows system clock only ticks every few milliseconds -- in your case 64 times per second, so when it does tick it increases the system time by 15.625 ms.
The solution is to use a higher-resolution timer than the system time (QueryPerformanceCounter).
You still won't see .414723, .414723+x, .414723+2x, ..., .414723+nx, though, because your code will not run exactly once every x microseconds. It will run as fast as it can, but there's no particular reason that should always be a constant speed, or that if it is, it's an integer number of microseconds.
I recommend you to look at the C++11 <chrono> header.
high_resolution_clock (C++11) the clock with the shortest tick period available
The tick period referred to here is the interval at which the clock is updated. If we look in more detail:
template<
class Rep,
class Period = std::ratio<1>
> class duration;
Class template std::chrono::duration represents a time interval.
It consists of a count of ticks of type Rep and a tick period, where the tick period is a compile-time rational constant representing the number of seconds from one tick to the next.
Previously, functions like gettimeofday would give you a time expressed in microseconds, but they would utterly fail to tell you the interval at which that time was refreshed.
In the C++11 Standard, this information is now in the clear, to make it obvious that there is no relation between the unit in which the time is expressed and the tick period, and that, therefore, you definitely need to take both into account.
The tick period is extremely important when you want to measure durations that are close to it. If the duration you wish to measure is shorter than the tick period, then you will measure it "discretely", like you observed: 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, ... I advise caution at this point.
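As a small illustration (a sketch), you can ask a clock for its advertised tick period and compare it with the smallest step you actually observe between consecutive now() calls:
#include <chrono>
#include <iostream>

int main()
{
    using clk = std::chrono::high_resolution_clock;
    std::cout << "advertised tick period: "
              << clk::period::num << "/" << clk::period::den << " s\n";

    auto t1 = clk::now();
    auto t2 = clk::now();
    while (t2 == t1)            // spin until the clock actually moves
        t2 = clk::now();

    std::cout << "observed step: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(t2 - t1).count()
              << " ns\n";
}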
This is because the process running your code isn't always scheduled to execute.
Whilst it does, it will bang round the loop quickly, printing multiple values for each microsecond - which is a comparatively long period of time on modern CPUs.
There are then periods where it is not scheduled to execute by the system, and therefore cannot print values.
If what you want to do is execute every microsecond, this may be possible with some real-time operating systems running on high performance hardware.

What's the best replacement for timeGetTime to avoid wrap-around?

timeGetTime seems to be quite good to query for system time. However, its return value is 32-bit only, so it wraps around every 49 days approx.
It's not too hard to detect the rollover in calling code, but it adds some complexity, and (worse) requires keeping a state.
Is there some replacement for timeGetTime that would not have this wrap-around problem (probably by returning a 64-bit value), and have roughly the same precision and cost?
Unless you need to time an event that is over 49 days, you can SAFELY ignore the wrap-around. Just always subtract the previous timeGetTime() from the current timeGetTime() and you will always obtain a delta measured time that is accurate, even across wrap-around -- provided that you are timing events whose total duration is under 49 days. This all works due to how unsigned modular math works inside the computer.
// this code ALWAYS works, even with wrap-around!
DWORD dwStart = timeGetTime();
// provided the event timed here has a duration of less than 49 days
DWORD dwDuration = timeGetTime()-dwStart;
TIP: look into timeBeginPeriod(1) to increase the accuracy of timeGetTime().
BUT... if you want a 64-bit version of timeGetTime, here it is:
__int64 timeGetTime64() {
static __int64 time64=0;
// warning: if multiple threads call this function, protect with a critical section!
return (time64 += (timeGetTime()-(DWORD)time64));
}
Please note that if this function is not called at least once every 49 days, it will fail to properly detect a wrap-around.
What platform?
You could use GetTickCount64() if you're running on Vista or later, or synthesise your own GetTickCount64() from GetTickCount() and a timer...
I deal with the rollover issue in GetTickCount() and synthesising a GetTickCount64() on platforms that don't support it here on my blog about testing non-trivial code: http://www.lenholgate.com/blog/2008/04/practical-testing-17---a-whole-new-approach.html
Nope, tracking roll-over requires state. It can be as simple as just incrementing your own 64-bit counter on each callback.
It is pretty unusual to want to track time periods at a resolution as fine as 1 millisecond for up to 49 days. You'd have to worry whether the accuracy is still there after such a long period. The next step up is to use the clock: GetTickCount64() and GetSystemTimeAsFileTime() have a resolution of 15.625 milliseconds and are kept accurate with a time server.
Have a look at GetSystemTimeAsFileTime(). It fills a FILETIME struct that contains a "64-bit value representing the number of 100-nanosecond intervals since January 1, 1601 (UTC)"
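For illustration, a sketch of reading it into a single 64-bit value: GetSystemTimeAsFileTime() fills a FILETIME, which can be combined via ULARGE_INTEGER and scaled down from 100-nanosecond units. Keep in mind this is wall-clock time, so it can jump if the system clock is adjusted, unlike the tick counters.
#include <windows.h>

// hypothetical helper: current system time as 64-bit milliseconds (no 49-day wrap)
unsigned long long systemTimeMillis64()
{
    FILETIME ft;
    GetSystemTimeAsFileTime(&ft);
    ULARGE_INTEGER u;
    u.LowPart  = ft.dwLowDateTime;
    u.HighPart = ft.dwHighDateTime;
    return u.QuadPart / 10000ULL;   // 100 ns units -> milliseconds
}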
How are you trying to use it? I frequently use the Win32 equivalent when checking for durations that I know will be under 49 days. For example the following code will always work.
DWORD start = timeGetTime();
DoSomthingThatTakesLessThen49Days();
DWORD duration = timeGetTime() - start;
Even if timeGetTime rolled over while calling DoSomthingThatTakesLessThen49Days, duration will still be correct.
Note the following code could fail on rollover.
DWORD start = timeGetTime();
DoSomthingThatTakesLessThen49Days();
if (start + 5000 < timeGetTime())
{
}
but can easy be re-written to work as follows
DWORD start = timeGetTime();
DoSomthingThatTakesLessThen49Days();
if (timeGetTime() - start < 5000)
{
}
Assuming you can guarantee that this function will called at least once every 49 days, something like this will work:
// Returns current time in milliseconds
uint64_t timeGetTime64()
{
static uint32_t _prevVal = 0;
static uint64_t _wrapOffset = 0;
uint32_t newVal = (uint32_t) timeGetTime();
if (newVal < _prevVal) _wrapOffset += (((uint64_t)1)<<32);
_prevVal = newVal;
return _wrapOffset+newVal;
}
Note that due to the use of static variables, this function isn't multithread-safe, so if you plan on calling it from multiple threads you should serialize it via a critical section or mutex or similar.
I'm not sure if this fully meets your needs, but
std::chrono::system_clock
might be along the lines of what you're looking for.
http://en.cppreference.com/w/cpp/chrono/system_clock
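A sketch of that idea: the <chrono> clocks hand back a time point whose representation is typically a 64-bit count, so there is no 49-day wrap to track:
#include <chrono>
#include <cstdint>

// hypothetical helper: milliseconds since the clock's epoch, no wrap-around to handle
std::int64_t nowMillis()
{
    using namespace std::chrono;
    return duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count();
}
For measuring durations, steady_clock may be the better pick since it never jumps backwards; system_clock matches the wall clock, which is what the linked page describes.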
You could use the RDTSC intrinsic. To get the time in milliseconds you can compute a conversion coefficient:
double get_rdtsc_coeff() {
static double coeff = 0.0;
if ( coeff < 1.0 ) { // count it only once
unsigned __int64 t00 = __rdtsc();
Sleep(1000);
unsigned __int64 t01 = __rdtsc();
coeff = (t01-t00)/1000.0;
}
return coeff; // transformation coefficient
}
Now you can get the number of milliseconds since the last reset:
__int64 get_ms_from_start() {
return static_cast<__int64>(__rdtsc()/get_rdtsc_coeff());
}
If your system uses SpeedStep or similar technologies, use the QueryPerformanceCounter/QueryPerformanceFrequency functions instead. Windows guarantees that this frequency cannot change while the system is running.

C++ windows time

I have a problem using time.
I want to get the current time in microseconds on Windows using C++, but I can't find a way to do it.
The "canonical" answer was given by unwind :
One popular way is using the QueryPerformanceCounter() call.
There are however few problems with this method:
it's intended for measurement of time intervals, not time. This means you have to write code to establish "epoch time" from which you will measure precise intervals. This is called calibration.
As you calibrate your clock, you also need to periodically adjust it so it's never too much out of sync (this is called drift) with your system clock.
QueryPerformanceCounter is not implemented in user space; this means context switch is needed to call kernel side of implementation, and that is relatively expensive (around 0.7 microsecond). This seems to be required to support legacy hardware.
Not all is lost, though. Points 1. and 2. are something you can do with a bit of coding, 3. can be replaced with direct call to RDTSC (available in newer versions of Visual C++ via __rdtsc() intrinsic), as long as you know accurate CPU clock frequency. Although, on older CPUs, such call would be susceptible to changes in cpu internal clock speed, in all newer Intel and AMD CPUs it is guaranteed to give fairly accurate results and won't be affected by changes in CPU clock (e.g. power saving features).
Lets get started with 1. Here is data structure to hold calibration data:
struct init
{
long long stamp; // last adjustment time
long long epoch; // last sync time as FILETIME
long long start; // counter ticks to match epoch
long long freq; // counter frequency (ticks per 10ms)
void sync(int sleep);
};
init data_[2] = {};
const init* volatile init_ = &data_[0];
Here is the code for the initial calibration; it has to be given time (in milliseconds) to wait for the clock to move; I've found that 500 milliseconds gives pretty good results (the shorter the time, the less accurate the calibration). For the purpose of calibration we are going to use QueryPerformanceCounter() etc. You only need to call it for data_[0], since data_[1] will be updated by the periodic clock adjustment (below).
void init::sync(int sleep)
{
LARGE_INTEGER t1, t2, p1, p2, r1, r2, f;
int cpu[4] = {};
// prepare for rdtsc calibration - affinity and priority
SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);
SetThreadAffinityMask(GetCurrentThread(), 2);
Sleep(10);
// frequency for time measurement during calibration
QueryPerformanceFrequency(&f);
// for explanation why RDTSC is safe on modern CPUs, look for "Constant TSC" and "Invariant TSC" in
// Intel(R) 64 and IA-32 Architectures Software Developer’s Manual (document 253668.pdf)
__cpuid(cpu, 0); // flush CPU pipeline
r1.QuadPart = __rdtsc();
__cpuid(cpu, 0);
QueryPerformanceCounter(&p1);
// sleep some time, doesn't matter it's not accurate.
Sleep(sleep);
// wait for the system clock to move, so we have exact epoch
GetSystemTimeAsFileTime((FILETIME*) (&t1.u));
do
{
Sleep(0);
GetSystemTimeAsFileTime((FILETIME*) (&t2.u));
__cpuid(cpu, 0); // flush CPU pipeline
r2.QuadPart = __rdtsc();
} while(t2.QuadPart == t1.QuadPart);
// measure how much time has passed exactly, using more expensive QPC
__cpuid(cpu, 0);
QueryPerformanceCounter(&p2);
stamp = t2.QuadPart;
epoch = t2.QuadPart;
start = r2.QuadPart;
// calculate counter ticks per 10ms
freq = f.QuadPart * (r2.QuadPart-r1.QuadPart) / 100 / (p2.QuadPart-p1.QuadPart);
SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_NORMAL);
SetThreadAffinityMask(GetCurrentThread(), 0xFF);
}
With good calibration data you can calculate the exact time from cheap RDTSC (I measured the call and calculation to be ~25 nanoseconds on my machine). There are three things to note:
1. the return type is binary compatible with the FILETIME structure and is precise to 100ns, unlike GetSystemTimeAsFileTime (which increments in 10-30ms intervals, or 1 millisecond at best).
2. in order to avoid expensive integer-to-double-to-integer conversions, the whole calculation is performed in 64-bit integers. Even though these can hold huge numbers, there is a real risk of integer overflow, and so start must be brought forward periodically to avoid it. This is done in the clock adjustment.
3. we are making a copy of the calibration data, because it might have been updated during our call by the clock adjustment in another thread.
Here is the code to read current time with high precision. Return value is binary compatible with FILETIME, i.e. number of 100-nanosecond intervals since Jan 1, 1601.
long long now()
{
// must make a copy
const init* it = init_;
// __cpuid(cpu, 0) - no need to flush CPU pipeline here
const long long p = __rdtsc();
// time passed from epoch in counter ticks
long long d = (p - it->start);
if (d > 0x80000000000ll)
{
// closing to integer overflow, must adjust now
adjust();
}
// convert 10ms to 100ns periods
d *= 100000ll;
d /= it->freq;
// and add to epoch, so we have proper FILETIME
d += it->epoch;
return d;
}
For clock adjustment, we need to capture the exact time (as provided by the system clock) and compare it against our clock; this gives us the drift value. Next we use a simple formula to calculate the "adjusted" CPU frequency, to make our clock meet the system clock at the time of the next adjustment. Thus it is important that adjustments are made at regular intervals; I've found that it works well when called at 15-minute intervals. I use CreateTimerQueueTimer, called once at program startup, to schedule the adjustment calls (not demonstrated here).
The slight problem with capturing an accurate system time (for the purpose of calculating drift) is that we need to wait for the system clock to move, and that can take up to 30 milliseconds or so (it's a long time). If the adjustment is not performed, we risk integer overflow inside the function now(), not to mention uncorrected drift from the system clock. There is built-in protection against overflow in now(), but we really don't want to trigger it synchronously in a thread which happened to call now() at the wrong moment.
Here is the code for periodic clock adjustment, clock drift is in r->epoch - r->stamp:
void adjust()
{
// must make a copy
const init* it = init_;
init* r = (init_ == &data_[0] ? &data_[1] : &data_[0]);
LARGE_INTEGER t1, t2;
// wait for the system clock to move, so we have exact time to compare against
GetSystemTimeAsFileTime((FILETIME*) (&t1.u));
long long p = 0;
int cpu[4] = {};
do
{
Sleep(0);
GetSystemTimeAsFileTime((FILETIME*) (&t2.u));
__cpuid(cpu, 0); // flush CPU pipeline
p = __rdtsc();
} while (t2.QuadPart == t1.QuadPart);
long long d = (p - it->start);
// convert 10ms to 100ns periods
d *= 100000ll;
d /= it->freq;
r->start = p;
r->epoch = d + it->epoch;
r->stamp = t2.QuadPart;
const long long dt1 = t2.QuadPart - it->epoch;
const long long dt2 = t2.QuadPart - it->stamp;
const double s1 = (double) d / dt1;
const double s2 = (double) d / dt2;
r->freq = (long long) (it->freq * (s1 + s2 - 1) + 0.5);
InterlockedExchangePointer((volatile PVOID*) &init_, r);
// if you have log output, here is good point to log calibration results
}
Lastly, two utility functions. One converts a FILETIME (including output from now()) to a SYSTEMTIME while preserving the microseconds in a separate int. The other returns the frequency, so your program can use __rdtsc() directly for accurate measurements of time intervals (with nanosecond precision).
void convert(SYSTEMTIME& s, int &us, long long f)
{
LARGE_INTEGER i;
i.QuadPart = f;
FileTimeToSystemTime((FILETIME*) (&i.u), &s);
s.wMilliseconds = 0;
LARGE_INTEGER t;
SystemTimeToFileTime(&s, (FILETIME*) (&t.u));
us = (int) (i.QuadPart - t.QuadPart)/10;
}
long long frequency()
{
// must make a copy
const init* it = init_;
return it->freq * 100;
}
Well, of course none of the above is more accurate than your system clock, which is unlikely to be more accurate than a few hundred milliseconds. The purpose of the precise clock (as opposed to an accurate one) implemented above is to provide a single measure which can be used for both:
cheap and very accurate measurement of time intervals (not wall time),
a much less accurate, but monotonic and consistent with the above, measure of wall time
I think it does that pretty well. Example uses are logs, where one can use timestamps not only to find the time of events, but also to reason about internal program timings, latency (in microseconds), etc.
I leave the plumbing (call to initial calibration, scheduling adjustment) as an exercise for gentle readers.
You can use the Boost date-time library.
You can use boost::posix_time::hours, boost::posix_time::minutes, boost::posix_time::seconds, boost::posix_time::millisec, boost::posix_time::nanosec.
http://www.boost.org/doc/libs/1_39_0/doc/html/date_time.html
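A minimal sketch, assuming Boost.Date_Time is installed, of getting a microsecond-resolution timestamp this way:
#include <boost/date_time/posix_time/posix_time.hpp>
#include <iostream>

int main()
{
    // wall-clock time with microsecond fields
    boost::posix_time::ptime t = boost::posix_time::microsec_clock::universal_time();
    std::cout << boost::posix_time::to_iso_extended_string(t) << std::endl;
}
Note that on Windows the underlying system clock still limits how often this value actually changes, as discussed elsewhere on this page.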
One popular way is using the QueryPerformanceCounter() call. This is useful if you need high-precision timing, such as for measuring durations that only take on the order of microseconds. I believe this is implemented using the RDTSC machine instruction.
There might be issues though, such as the counter frequency varying with power-saving, and synchronization between multiple cores. See the Wikipedia link above for details on these issues.
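For illustration, a sketch of microsecond timestamps from QueryPerformanceCounter, measured relative to the first call (QPC counts intervals; it is not a wall-clock time by itself):
#include <windows.h>

// hypothetical helper: microseconds elapsed since the first time it was called
long long microsSinceFirstCall()
{
    static LARGE_INTEGER freq  = [] { LARGE_INTEGER f; QueryPerformanceFrequency(&f); return f; }();
    static LARGE_INTEGER start = [] { LARGE_INTEGER c; QueryPerformanceCounter(&c);   return c; }();

    LARGE_INTEGER now;
    QueryPerformanceCounter(&now);
    // multiply before dividing to keep microsecond precision
    return (now.QuadPart - start.QuadPart) * 1000000LL / freq.QuadPart;
}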
Take a look at the Windows APIs GetSystemTime() / GetLocalTime() or GetSystemTimeAsFileTime().
GetSystemTimeAsFileTime() expresses time in 100-nanosecond intervals, that is, 1/10 of a microsecond. All of these functions provide the current time with millisecond accuracy.
EDIT:
Keep in mind that on most Windows systems the system time is only updated about every 1 millisecond. So even if you represent your time with microsecond precision, you still need a way to acquire the time with that precision.
Take a look at this: http://www.decompile.com/cpp/faq/windows_timer_api.htm
Maybe this can help:
NTSTATUS WINAPI NtQuerySystemTime(__out PLARGE_INTEGER SystemTime);
SystemTime [out] - a pointer to a LARGE_INTEGER structure that receives the system time. This is a 64-bit value representing the number of 100-nanosecond intervals since January 1, 1601 (UTC).