Performance measurement: time vs tick? - c++

What is the best way to check that real-time performance is achieved with a two-thread program running on 1 or 2 cores: boost::timer or RDTSC?
We started from this code:
boost::timer t;
p.f(frame);
max_time_per_frame = std::max(max_time_per_frame, t.elapsed());
... where p is an instance of Proc.
class Proc {
public:
Proc() : _frame_counter(0) {}
// this function must be called for each video frame and take less than 1/fps seconds
// 24 fps => 1/24 => < 0.04 seconds.
void f(unsigned char * const frame)
{
processFrame(frame); //that's the most important part
// this part runs every 240 frames and should not affect
// the processFrame flow!
if(_frame_counter % 240 == 0)
{
do_something_more();
}
_frame_counter++;
}
private:
unsigned int _frame_counter;
};
So it runs single-threaded on a single core, and we observed that max_time_per_frame is higher than the target time because of the do_something_more processing.
To remove those processing-time spikes, we moved every do_something_more call to a separate thread, as in the pseudo-code below.
class Proc {
public:
Proc() : _frame_counter(0) {
t = start_thread ( do_something_more_thread );
}
// this function must be called for each video frame and take less than 1/fps seconds
// 24 fps => 1/24 => < 0.04 seconds.
void f(unsigned char * const frame)
{
processFrame(frame); //that's the most important part
// this part runs every 240 frames and should not affect
// the processFrame flow!
if(_frame_counter % 240 == 0)
{
sem.up();
}
_frame_counter++;
}
void do_something_more_thread()
{
while(1)
{
sem.down();
do_something_more();
}
}
private:
unsigned int _frame_counter;
semaphore sem;
thread t;
};
I always start my program on 1 and on 2 cores, using start /AFFINITY 1 prog.exe or start /AFFINITY 3 prog.exe.
From the time point of view, everything is OK: max_time_per_frame stays below our target, close to the average of 0.02 second/frame.
But things look different if I dump the number of ticks spent in f, using RDTSC:
#include <intrin.h>
...
unsigned long long getTick()
{
return __rdtsc();
}
void f(unsigned char * const frame)
{
const unsigned long long s = getTick();
processFrame(frame); //that's the most important part
// this part runs every 240 frames and should not affect
// the processFrame flow!
if(_frame_counter % 240 == 0)
{
sem.up();
}
_frame_counter++;
const unsigned long long e = getTick();
dump(e - s);
}
With start /AFFINITY 3 prog.exe, the max_tick_per_frame was stable and, as expected, I saw one thread using 100% of one core while the second thread ran at a normal pace on the second core.
With start /AFFINITY 1 prog.exe, I saw only one core at 100% (as expected), but the do_something_more computation time does not seem to be spread over time through interleaved thread execution. Instead, at regular intervals, I saw a huge spike in the tick count.
So the question is: why? Is time the only interesting measure? Does the tick count make sense when running software on 1 core (frequency boost)?

Although you'll never get true real time performance out of windows, you can reduce the pitfalls of RDTSC by using the Windows API.
Here is a small code chunk that takes advantage of the API.
#include <Windows.h>
#include <stdio.h>
int
main(int argc, char* argv[])
{
double timeTaken;
LARGE_INTEGER frequency;
LARGE_INTEGER firstCount;
LARGE_INTEGER endCount;
/*-- give us the highest priority available --*/
SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);
/*-- get the frequency of the timer we are using --*/
QueryPerformanceFrequency(&frequency);
/*-- get the timers current tick --*/
QueryPerformanceCounter(&firstCount);
/*-- some pause --*/
Sleep(1);
/*-- get the timers current tick --*/
QueryPerformanceCounter(&endCount);
/*-- calculate time passed, in milliseconds --*/
timeTaken = (double)(endCount.QuadPart - firstCount.QuadPart) * 1000.0 / (double)frequency.QuadPart;
printf("Time: %f ms\n", timeTaken);
return 0;
}
You can also use:
#include <Mmsystem.h>
if(timeBeginPeriod(1) == TIMERR_NOCANDO) {
printf("TIMER could not be set to 1ms\n");
}
/*-- your code here --*/
timeEndPeriod(1);
But this changes the global Windows timer resolution to whatever interval you set (or at least attempts to), so I wouldn't recommend this approach unless you are 100% certain you are the only one who will use this program, as it may have unintended side effects on other programs.
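For code that also has to build outside Windows, the same kind of measurement can be sketched with std::chrono::steady_clock, which is monotonic. This is only an illustration: MaxFrameTimer and fake_frame are hypothetical names that do not appear in the question, and fake_frame merely stands in for p.f(frame).

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper (not from the original post): tracks the worst-case
// wall-clock duration of a repeatedly executed piece of work.
class MaxFrameTimer {
public:
    using clock = std::chrono::steady_clock;  // monotonic clock

    // Runs `work` once, measures it, and updates the running maximum.
    template <class Work>
    clock::duration measure(Work&& work) {
        const auto start = clock::now();
        work();
        const auto elapsed = clock::now() - start;
        assert(elapsed >= clock::duration::zero());  // a steady clock never goes backwards
        max_ = std::max(max_, elapsed);
        return elapsed;
    }

    // Largest duration seen so far, in seconds.
    double max_seconds() const {
        return std::chrono::duration<double>(max_).count();
    }

private:
    clock::duration max_{clock::duration::zero()};
};

// Stand-in for p.f(frame): pretends that a frame takes `ms` milliseconds.
inline void fake_frame(int ms) {
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
}
```

Calling timer.measure([] { fake_frame(5); }) once per frame and reading timer.max_seconds() at the end gives the same kind of figure as max_time_per_frame in the question. Like any wall-clock measurement, it includes the time during which the thread was preempted.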

Based on the comment about the REALTIME_PRIORITY_CLASS, I added the following line in a test program.
#define NOMINMAX
#include <windows.h>
....
SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS);
And now the tick count I got from RDTSC looks better: the huge spike I saw before on 1 frame is now spread over multiple frames.
As I wanted to keep my code portable and create some scheduling opportunities, I made the additional thread yield at some specific points using:
boost::this_thread::yield();
With that change, I get the scheduling and the RDTSC values I expected without having to configure the priority!
Thanks for all the help and advice.

Related

Inconsistent chrono::high_resolution_clock delay

I'm trying to implement a MIDI-like clocked sample player.
There is a timer which increments a pulse counter, and every 480 pulses is a quarter note, so the pulse period is 1041667 ns at 120 beats per minute.
The timer is not sleep-based and runs in a separate thread, but the delay time seems inconsistent: the period between samples played in a test file fluctuates by +- 20 ms (on some occasions the period is OK and steady; I can't work out what this effect depends on).
Audio backend influence is excluded: I've tried OpenAL as well as SDL_mixer.
void Timer_class::sleep_ns(uint64_t ns){
auto start = std::chrono::high_resolution_clock::now();
bool sleep = true;
while(sleep)
{
auto now = std::chrono::high_resolution_clock::now();
auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(now - start);
if (elapsed.count() >= ns) {
TestTime = elapsed.count();
sleep = false;
//break;
}
}
}
void Timer_class::Runner(void){
// this running as thread
while(1){
sleep_ns(BPMns);
if (Run) Transport.IncPlaybackMarker(); // marker increment
if (Transport.GetPlaybackMarker() == Transport.GetPlaybackEnd()){ // check if timer have reached end, which is 480 pulses
Transport.SetPlaybackMarker(Transport.GetPlaybackStart());
Player.PlayFile(1); // period of this event fluctuates severely
}
}
};
void Player_class::PlayFile(int FileNumber){
#ifdef AUDIO_SDL_MIXER
if(Mix_PlayChannel(-1, WaveData[FileNumber], 0)==-1) {
printf("Mix_PlayChannel: %s\n",Mix_GetError());
}
#endif // AUDIO_SDL_MIXER
}
Am I doing something wrong in terms of approach? Is there a better way to implement a timer of this kind?
A deviation higher than 4-5 ms is too much for audio.
I see a large error and a small error. The large error is that your code assumes that the main processing in Runner consistently takes zero time:
if (Run) Transport.IncPlaybackMarker(); // marker increment
if (Transport.GetPlaybackMarker() == Transport.GetPlaybackEnd()){ // check if timer have reached end, which is 480 pulses
Transport.SetPlaybackMarker(Transport.GetPlaybackStart());
Player.PlayFile(1); // period of this event fluctuates severely
}
That is, you're "sleeping" for the time you want your loop iteration to take, and then you're doing processing on top of that.
The small error is presuming that you can represent your ideal loop iteration time with an integral number of nanoseconds. This error is so small that it doesn't really matter. However I amuse myself by showing people how they can get rid of this error too. :-)
First let's correct the small error by exactly representing the idealized loop iteration time:
using quarterPeriod = std::ratio<1, 2>;
using iterationPeriod = std::ratio_divide<quarterPeriod, std::ratio<480>>;
using iteration_time = std::chrono::duration<std::int64_t, iterationPeriod>;
I know nothing of music, but I'm guessing the above code is right because if you convert iteration_time{1} to nanoseconds, you get approximately 1041667ns. iteration_time{1} is intended to be the precise amount of time you want each iteration of your loop in Timer_class::Runner to take.
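The claim about the conversion can be checked at compile time. The sketch below repeats the three aliases so that it stands alone; pulse_in_ns and rounding_drift are hypothetical helper names added here only to show how small the "small error" is.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <ratio>

using quarterPeriod   = std::ratio<1, 2>;                                   // 0.5 s per quarter at 120 BPM
using iterationPeriod = std::ratio_divide<quarterPeriod, std::ratio<480>>;  // 1/960 s per pulse
using iteration_time  = std::chrono::duration<std::int64_t, iterationPeriod>;

static_assert(iterationPeriod::num == 1 && iterationPeriod::den == 960,
              "one pulse is exactly 1/960 s");
static_assert(iteration_time{480} == std::chrono::milliseconds{500},
              "480 pulses are exactly one quarter");

// One pulse converted to whole nanoseconds (duration_cast truncates).
constexpr std::chrono::nanoseconds pulse_in_ns() {
    return std::chrono::duration_cast<std::chrono::nanoseconds>(iteration_time{1});
}

// Error accumulated after n pulses if each pulse is approximated by 1041667 ns.
constexpr std::chrono::nanoseconds rounding_drift(std::int64_t n) {
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::nanoseconds{1041667} * n - iteration_time{n});
}
```

pulse_in_ns() yields 1041666 ns because the cast truncates 1041666.67 ns, and rounding_drift(960) yields 320 ns, that is, the rounded constant gains 320 ns per second of playback, or about 1.15 ms per hour.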
To correct the large error, you need to sleep until a time_point, as opposed to sleeping for a duration. Here's a generic utility to help you do that:
template <class Clock, class Duration>
void
delay_until(std::chrono::time_point<Clock, Duration> tp)
{
while (Clock::now() < tp)
;
}
Now if you code Timer_class::Runner to use delay_until instead of sleep_ns, I think you'll get better results:
void
Timer_class::Runner()
{
auto next_start = std::chrono::steady_clock::now() + iteration_time{1};
while (true)
{
if (Run) Transport.IncPlaybackMarker(); // marker increment
if (Transport.GetPlaybackMarker() == Transport.GetPlaybackEnd()){ // check if timer have reached end, which is 480 pulses
Transport.SetPlaybackMarker(Transport.GetPlaybackStart());
Player.PlayFile(1);
}
delay_until(next_start);
next_start += iteration_time{1};
}
}
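A minimal way to see why advancing a deadline avoids drift is to time a whole run of iterations. The sketch below is an assumption-laden illustration, not the answerer's code: run_periodic and spin_until are hypothetical names, and spin_until is just another spelling of the busy-wait idea used by delay_until.

```cpp
#include <cassert>
#include <chrono>

// Busy-waits until the given time point (same idea as delay_until).
template <class Clock, class Duration>
void spin_until(std::chrono::time_point<Clock, Duration> tp) {
    while (Clock::now() < tp) {
    }
}

// Runs `work` `iterations` times, one call per `period`, and returns the total
// elapsed time. The deadline advances by a fixed step, so the time spent inside
// `work` does not add up from one iteration to the next.
template <class Work>
std::chrono::steady_clock::duration run_periodic(std::chrono::steady_clock::duration period,
                                                 int iterations, Work work) {
    using clock = std::chrono::steady_clock;
    assert(iterations >= 0);
    const auto start = clock::now();
    auto next = start + period;
    for (int i = 0; i < iterations; ++i) {
        work();
        spin_until(next);  // wait for the absolute deadline, not for a fixed duration
        next += period;
    }
    return clock::now() - start;
}
```

With a duration-based sleep, the total would be iterations * (period + time spent in work). Here it stays close to iterations * period as long as work fits inside one period.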
I ended up using @howard-hinnant's version of the delay and reducing the buffer size in openal-soft, which made a huge difference: fluctuations are now about +-5 ms for 1/16th notes at 120 BPM (125 ms period) and +-1 ms for quarter beats. It leaves a lot to be desired, but I guess it's okay.

Using std::chrono library to adjust the application fps but getting weird behavior

I wrote the code below using the std::chrono C++ library. What I am trying to do is
fix the application's FPS at 60, but I am getting 50 FPS. It is surely not a performance issue,
because I am computing nothing, so it must be invalid usage or a bug.
The TARGET_FPS macro is set to the target FPS that I want to get, and the console window
displays the actual FPS. The following lines show the values I set TARGET_FPS to, each with the resulting FPS.
TARGET_FPS---->FPS
60----->50
90----->50
100----->100
1000----->100
10000----->100
whatever ----->100
Even if I define TARGET_FPS as 1000000000 I get 100 FPS; with 458 or any other value above 100, I still get 100 FPS as output.
#include <chrono> /// to use std::chrono namespace
#include <iostream> /// for console output
#include <thread> /// for std::this_thread::sleep_for()
#define TARGET_FPS 60// our target FPS
using frame_len_type = std::chrono::duration<float,std::ratio<1,TARGET_FPS>>; /// this is the duration that defines the length of a frame
using fsecond = std::chrono::duration<float>; /// this duration represents one second and uses 'float' as its internal representation
const frame_len_type target_frame_len(1); /// defined once here to represent one frame duration (avoids construction inside the loop)
void app_logic(){ /** ... All application logic goes here ... **/}
int main() /// our main function !
{
using sys_clock = std::chrono::system_clock; /// simplify the type name to make the code readable
sys_clock::time_point frame_begin,frame_end; /// we will use these time points to point to frame begin and end
while (true)
{
frame_begin = sys_clock::now(); /// there we go !
app_logic(); /// lets be logical here :)
frame_end = sys_clock::now(); /// we are done so quick !
std::this_thread::sleep_for( target_frame_len- (frame_end.time_since_epoch()-frame_begin.time_since_epoch()) ); /// rest for whatever remains of the target frame length
std::cout<< fsecond(1) / ( sys_clock::now() - frame_begin) <<std::endl; /// this will show us the current FPS
}
return 0; /// return to OS
} /// end of code
The timing resolution of std::chrono is system dependent:
In this answer to another question, you'll find a code snippet to determine the approximate timing resolution of your platform.
On Windows 7, the default timer resolution is 15.6 ms.
In addition, the Windows API Sleep, on which the C++ standard library has to rely, does not guarantee that the thread will resume execution immediately after the waiting time:
After the sleep interval has passed, the thread is ready to run. If
you specify 0 milliseconds, the thread will relinquish the remainder
of its time slice but remain ready. Note that a ready thread is not
guaranteed to run immediately. Consequently, the thread may not run
until some time after the sleep interval elapses.
The C++ standard library doesn't give better guarantees for sleep_for, whatever OS you are using:
30.3.2/7: Effect: Blocks the calling thread for the relative timeout (...)
Consequence:
With FPS set to 60, there would be a frame every 16.6 ms. So assuming that your app_logic() is ultra fast, your thread will sleep at least 15.6 ms. If the logic takes 1 ms to execute, you'd be exactly at 60 FPS.
However, according to the API documentation, if [wait time] is greater than one tick but less than two, the wait can be anywhere between one and two ticks, so the sleep time will be between 15.6 and 31.2 ms, which means, inversely, that your FPS will be between roughly 60 and 32. This explains why you only achieve 50 FPS.
When you set FPS to 100, there should be a frame every 10ms. This is below the timer accuracy. There might be no sleep at all. If no other thread is ready to run, the function will return immediately, so that you will be at your maximum throughput. If you set a higher FPS, you'd be in exactly the same situation as the expected waiting time would always be below the timer accuracy. The result will therefore not improve.
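The arithmetic in this consequence can be reproduced with a small model. It is only a sketch under stated assumptions: fps_range and FpsRange are hypothetical names, app_logic() is assumed to take no time, and a requested wait of w ms is assumed to last between floor(w / tick) and floor(w / tick) + 1 timer ticks.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

struct FpsRange {
    double min_fps;  // every wait ends on the latest possible tick
    double max_fps;  // every wait ends on the earliest possible tick
};

// Simplified model (an assumption, not a documented guarantee): the wait for one
// frame lasts between floor(wait / tick) and floor(wait / tick) + 1 ticks.
inline FpsRange fps_range(double target_fps, double tick_ms) {
    assert(target_fps > 0.0 && tick_ms > 0.0);
    const double wait_ms = 1000.0 / target_fps;
    const double whole_ticks = std::floor(wait_ms / tick_ms);
    const double shortest_ms = whole_ticks * tick_ms;
    const double longest_ms = (whole_ticks + 1.0) * tick_ms;
    // A wait shorter than one tick may not sleep at all, so there is no upper bound.
    const double max_fps = shortest_ms > 0.0 ? 1000.0 / shortest_ms
                                             : std::numeric_limits<double>::infinity();
    return {1000.0 / longest_ms, max_fps};
}
```

With fps_range(60.0, 15.6) the model gives a range of roughly 32 to 64 FPS, which brackets the 50 FPS reported in the question. With fps_range(100.0, 15.6) the requested wait is shorter than one tick, so the model puts no upper bound on the frame rate.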
Problem solved :)
#include <chrono> /// to use std::chrono namespace
#include <iostream> /// for console output
#include <thread> /// for std::this_thread::sleep_for()
#include <windows.h>
#define TARGET_FPS 500 /// our target fps as a macro
const float target_fps = (float)TARGET_FPS; /// our target fps
float tmp_target_fps = target_fps; /// used to adjust the target fps depending on the actual real fps to reach the real target fps
using frame_len_type = std::chrono::duration<float,std::ratio<1,TARGET_FPS>>; /// this is the duration that defines the length of a frame
using fsecond = std::chrono::duration<float>; /// this duration represents one second and uses 'float' as its internal representation
fsecond target_frame_len(1.0f/tmp_target_fps); /// defined once here to represent one frame duration (avoids construction inside the loop); adjusted at run time
bool enable_fps_oscillation = true;
void app_logic()
{
/** ... All application logic goes here ... **/
}
class HeighResolutionClockKeeper
{
private :
bool using_higher_res_timer;
public :
HeighResolutionClockKeeper() : using_higher_res_timer(false) {}
void QueryHeighResolutionClock()
{
if (timeBeginPeriod(1) != TIMERR_NOCANDO)
{
using_higher_res_timer = true;
}
}
void FreeHeighResolutionClock()
{
if (using_higher_res_timer)
{
timeEndPeriod(1);
using_higher_res_timer = false; /// avoid a second timeEndPeriod() call from the destructor
}
}
~HeighResolutionClockKeeper()
{
FreeHeighResolutionClock(); /// also runs if an exception is thrown; thanks to the flag it is harmless when the timer was never set
}
};
int main() /// our main function !
{
HeighResolutionClockKeeper MyHeighResolutionClockKeeper;
MyHeighResolutionClockKeeper.QueryHeighResolutionClock();
using sys_clock = std::chrono::system_clock; /// simplify the type name to make the code readable
sys_clock::time_point frame_begin,frame_end; /// we will use these time points to point to frame begin and end
sys_clock::time_point start_point = sys_clock::now();
float accum_fps = 0.0f;
int frames_count = 0;
while (true)
{
frame_begin = sys_clock::now(); /// there we go !
app_logic(); /// lets be logical here :)
frame_end = sys_clock::now(); /// we are done so quick !
std::this_thread::sleep_for( target_frame_len- (frame_end.time_since_epoch()-frame_begin.time_since_epoch()) ); /// rest for whatever remains of the target frame length
float fps = fsecond(1) / ( sys_clock::now() - frame_begin) ; /// the current FPS
/// obviously we will not be able to hit the exact FPS we want, so we need to oscillate around it until we
/// get a very close average FPS over time.
if (fps < target_fps) /// our real fps is less than what we want
tmp_target_fps += 0.01; /// lets ask for more !
else if (fps > target_fps ) /// it is more than what we want
tmp_target_fps -=0.01; /// lets ask for less
if(enable_fps_oscillation == true)
{
/// now we will adjust our target frame length for match the new target FPS
target_frame_len = fsecond(1.0f/tmp_target_fps);
/// used to calculate average FPS
accum_fps+=fps;
frames_count++;
/// each 1 second
if( (sys_clock::now()-start_point)>fsecond(1.0f)) /// show average each 1 sec
{
start_point=sys_clock::now();
std::cout<<accum_fps/frames_count<<std::endl; /// it is getting more close each time to our target FPS
}
}
else
{
/// each frame
std::cout<<fps<<std::endl;
}
}
MyHeighResolutionClockKeeper.FreeHeighResolutionClock();
return 0; /// return to OS
} /// end of code
I had to add timeBeginPeriod() and timeEndPeriod() on the Windows platform, thanks to this awesome, lost-in-the-wind website http://www.geisswerks.com/ryan/FAQS/timing.html by Ryan Geiss.
Details:
We can't actually hit the exact FPS that we want (we land very slightly above or below it, although timeXPeriod(1) lets us go up to 1000 FPS and down to 1 FPS). Therefore I used an extra FPS variable to adjust the target FPS I am seeking, increasing and decreasing it, so that the actual application FPS hits our real target on average (you can enable and disable this using the 'enable_fps_oscillation' flag). This fixes an issue for FPS = 60, which we can't hit directly (+/-0.5); if we set FPS = 500 we do hit it and don't need to oscillate below and above it.

pthread_cond_timedwait timing out late when large load put on CPU

In writing unit tests for an object, I am noticing that a pthread_cond_timedwait does not timeout soon enough when large loads are put upon the CPU. If these loads are not put on the CPU, everything works fine. When loads are put on to the system, however, I find that no matter the amount of time I set the timeout to, the true delay is off by about 50-100ms.
For example, here is a printout from a single interval of the program, where the last and current times are found using the function GetTimeInMs.
// Printout, values are in ms
Last: 89799240
Current: 89799440
Period Length: 200
Expected Period: 100
From all I have read this issue is usually caused by using relative times instead of absolute times, but as far as I can tell we are using absolute times correctly. If you wonderful people could help me figure out what is being done wrong here I would be very grateful.
The function utilizing timedwait is shown here. Note that based off of timing debugging I have done, I know the extra time generated is done via the timedwait call, so I have not included other code that would not be necessary.
bool func(unsigned long long int time = 100) // ms
{
struct timespec ts;
pthread_mutex_lock(&m_Mutex);
if (0 == m_CurrentCount)
{
// Current time + delay in ns
unsigned long long int absnanotime = (GetTimeInMs()+time)*1000000;
struct timespec ts;
ts.tv_nsec = absnanotime % 1000000000ULL;
ts.tv_sec = absnanotime / 1000000000ULL;
do
{
if (0 != pthread_cond_timedwait(&m_Condition, &m_Mutex, &ts))
{
// In the case I am testing, I hope to get here via timeout in 100 ms
pthread_mutex_unlock(&m_Mutex);
return false;
}
}
while (!m_CurrentCount);
}
pthread_mutex_unlock(&m_Mutex);
return true;
}
unsigned long long int GetTimeInMs()
{
unsigned long long int time;
struct timespec ts;
clock_gettime(CLOCK_MONOTONIC, &ts);
time = ts.tv_nsec + ts.tv_sec * 1000000000ULL;
time = time / 1000000ULL; // Converts to ms
return time;
}
The code used to initialize the class variables used in func.
void init()
{
pthread_mutex_init(&m_Mutex, NULL);
pthread_condattr_init(&m_Attr);
pthread_condattr_setclock(&m_Attr, CLOCK_MONOTONIC);
pthread_cond_init(&m_Condition, &m_Attr);
}
The CPU eater thread which simulates CPU load is running the following while loop.
void cpuEatingThread()
{
while (false == m_ShutdownRequested)
{
// m_UselessFoo is of type float*
m_UselessFoo = new float(1.23423525);
delete m_UselessFoo;
}
}
It's likely that, when the wait times out, the thread becomes ready without any priority boost or similar action. If the box is heavily loaded, the ready thread may not start running immediately.
It's common to apply temporary priority boosts to threads that become ready on signals - this tends to improve overall performance in the 'usual' case where the signal arrives before the timeout. The timeout is often more of an 'unusual' event, often signaling some sort of failure that will not be repeated, and so threads becoming ready on timeout can wait their turn :)
For timed waits in general, the requirement is that they will wait at least as long as their argument. If you want precise times, this is not the right tool; you'll need something that guarantees particular times, and that's generally only available in a real-time operating system (RTOS).
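The "at least as long" behaviour can be observed directly by recording how late a timed wait resumes. The sketch below uses std::condition_variable rather than the pthread calls from the question, so it illustrates the same effect without being a fix for the posted code. wait_and_measure and TimedWaitResult are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

struct TimedWaitResult {
    bool signaled;                                 // false means the wait timed out
    std::chrono::steady_clock::duration lateness;  // time elapsed past the deadline on resume
};

// Waits on `cv` until `ready` becomes true or `timeout` expires, and reports how
// long after the absolute deadline the calling thread actually resumed.
inline TimedWaitResult wait_and_measure(std::condition_variable& cv, std::mutex& m,
                                        const bool& ready,
                                        std::chrono::steady_clock::duration timeout) {
    using clock = std::chrono::steady_clock;
    const auto deadline = clock::now() + timeout;  // absolute deadline on a monotonic clock
    std::unique_lock<std::mutex> lock(m);
    const bool signaled = cv.wait_until(lock, deadline, [&] { return ready; });
    const auto now = clock::now();
    const auto lateness = now > deadline ? now - deadline : clock::duration::zero();
    return {signaled, lateness};
}
```

On an idle machine the reported lateness is usually small. With a CPU-eating thread competing for the same cores it can grow, since the waiting thread only becomes ready at the deadline and still has to be scheduled.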

Precise way to reduce CPU usage in an infinite loop

This is my code using QueryPerformanceCounter as a timer.
//timer.h
class timer {
private:
...
public:
...
double get(); //returns elapsed time in seconds
void start();
};
//a.cpp
void loop() {
timer t;
double tick;
double diff = 0.0; //surplus seconds
t.start();
while( running ) {
tick = t.get();
if( tick >= 1.0 - diff ) {
t.start();
//things that should be run exactly every second
...
}
Sleep( 880 );
}
}
Without Sleep this loop would go on indefinitely calling t.get() every time which causes high CPU usage. For that reason, I make it sleep for about 880 milliseconds so that it wouldn't call t.get() while not necessary.
As I said above, I'm currently using Sleep to do the trick, but what I'm worried about is the accuracy of Sleep. I've read somewhere that the actual milliseconds the program pauses may vary - 20 to 50 ms - the reason I set the parameter to 880. I want to reduce the CPU usage as much as possible; I want to, if possible, pause more than 990 milliseconds EDIT: and yet less than 1000 milliseconds between every loop. What would be the best way to go?
I don't get why you are calling t.start() twice (it resets the clock?), but I would like to propose a kind of solution for the Sleep inaccuracy. Let's take a look at the content of while( running ) loop and follow the algorithm:
double future, remaining, sleep_precision = 0.05;
while (running) {
future = t.get() + 1.0;
things_that_should_be_run_exactly_every_second();
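// the loop in case of spurious wakeup
for (;;) {
remaining = future - t.get();
if (remaining < sleep_precision) break;
// Sleep takes milliseconds; stop sleep_precision seconds short of the deadline
Sleep(static_cast<DWORD>((remaining - sleep_precision) * 1000.0));
}
// next, busy-wait for at most sleep_precision
while (t.get() < future);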
}
The value of sleep_precision should be set empirically - OSes I know can't give you that.
Next, there are some alternatives of the sleeping mechanism that may better suit your needs - Is there an alternative for sleep() in C?
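The sleep-then-spin idea can also be written portably with std::chrono. This is a sketch rather than the answerer's code: precise_wait_until is a hypothetical name, and spin_margin plays the role of sleep_precision, so it still has to be tuned empirically.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Waits until `deadline`: sleeps while more than `spin_margin` remains, then
// busy-waits for the rest. The sleep is cheap but may overshoot; the spin is
// precise but burns CPU for roughly `spin_margin`.
inline void precise_wait_until(std::chrono::steady_clock::time_point deadline,
                               std::chrono::steady_clock::duration spin_margin) {
    using clock = std::chrono::steady_clock;
    assert(spin_margin >= clock::duration::zero());
    for (;;) {
        const auto remaining = deadline - clock::now();
        if (remaining <= spin_margin) break;
        std::this_thread::sleep_for(remaining - spin_margin);
    }
    while (clock::now() < deadline) {
        // busy-wait until the deadline is reached
    }
}
```

A larger spin_margin makes a late wake-up less likely at the cost of more CPU time. If the sleep overshoots by more than the margin, the function simply returns late.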
If you want to pause more than 990 milliseconds, write a sleep for 991 milliseconds. Your thread is guaranteed to be asleep for at least that long. It won't be less, but it could be multiples of 20-50ms more (depending on the resolution of your OS's time slicing, and on the cost of context switching).
However, this will not give you something running "exactly every second". There is just no way to achieve that on a time-shared operating system. You'll have to program closer to the metal, or rely on an interrupt from a PPS source and just pray your OS lets you run your entire loop iteration in one shot. Or, I suppose, write something to run in kernel mode…?

Uniformly Regulating Program Execution Rate [Windows C++]

First off, I found a lot of information on this topic, but no solutions that solved the issue unfortunately.
I'm simply trying to regulate my C++ program to run at 60 iterations per second. I've tried everything from GetClockTicks() to GetLocalTime() to help in the regulation but every single time I run the program on my Windows Server 2008 machine, it runs slower than on my local machine and I have no clue why!
I understand that "clock" based function calls return CPU time spent on the execution, so I switched to GetLocalTime, took the difference between the start time and the stop time, and then called Sleep((1000 / FPS) - millisecondExecutionTime).
My local machine is quite a bit faster than the server's CPU, so obviously the thought was that it was going off of CPU ticks, but that doesn't explain why GetLocalTime doesn't work. I've been basing this method on http://www.lazyfoo.net/SDL_tutorials/lesson14/index.php, replacing get_ticks() with all of the time-returning functions I could find on the web.
For example take this code:
#include <Windows.h>
#include <time.h>
#include <string>
#include <iostream>
using namespace std;
int main() {
int tFps = 60;
int counter = 0;
SYSTEMTIME gStart, gEnd, start_time, end_time;
GetLocalTime( &gStart );
bool done = false;
while(!done) {
GetLocalTime( &start_time );
Sleep(10);
counter++;
GetLocalTime( &end_time );
int startTimeMilli = (start_time.wSecond * 1000 + start_time.wMilliseconds);
int endTimeMilli = (end_time.wSecond * 1000 + end_time.wMilliseconds);
int time_to_sleep = (1000 / tFps) - (endTimeMilli - startTimeMilli);
if (counter > 240)
done = true;
if (time_to_sleep > 0)
Sleep(time_to_sleep);
}
GetLocalTime( &gEnd );
cout << "Total Time: " << (gEnd.wSecond*1000 + gEnd.wMilliseconds) - (gStart.wSecond*1000 + gStart.wMilliseconds) << endl;
cin.get();
}
For this code snippet, run on my computer (3.06 GHz) I get a total time (ms) of 3856 whereas on my server (2.53 GHz) I get 6256. So it potentially could be the speed of the processor though the ratio of 2.53/3.06 is only .826797386 versus 3856/6271 is .614893956.
I can't tell if the Sleep function is doing something drastically different than expected though I don't see why it would, or if it is my method for getting the time (even though it should be in world time (ms) not clock cycle time. Any help would be greatly appreciated, thanks.
For one thing, Sleep's default resolution is the computer's quantum length - usually either 10ms or 15ms, depending on the Windows edition. To get a resolution of, say, 1ms, you have to issue a timeBeginPeriod(1), which reprograms the timer hardware to fire (roughly) once every millisecond.
In your main loop you can do the following:
int main()
{
// Timers
LONGLONG curTime = 0;
LONGLONG nextTime = 0;
int loops = 0; // frame-skip counter; not really used in this example
Timers::SWGameClock::GetInstance()->GetTime(&nextTime);
while (true) {
Timers::SWGameClock::GetInstance()->GetTime(&curTime);
if ( curTime > nextTime && loops <= MAX_FRAMESKIP ) {
nextTime += Timers::SWGameClock::GetInstance()->timeCount;
// Business logic goes here and occurs based on the specified framerate
}
}
}
using this time library
#include "stdafx.h"
LONGLONG cacheTime;
Timers::SWGameClock* Timers::SWGameClock::pInstance = NULL;
Timers::SWGameClock* Timers::SWGameClock::GetInstance ( ) {
if (pInstance == NULL) {
pInstance = new SWGameClock();
}
return pInstance;
}
Timers::SWGameClock::SWGameClock(void) {
this->Initialize ( );
}
void Timers::SWGameClock::GetTime ( LONGLONG * t ) {
// Use timeGetTime() if queryperformancecounter is not supported
if (!QueryPerformanceCounter( (LARGE_INTEGER *) t)) {
*t = timeGetTime();
}
cacheTime = *t;
}
LONGLONG Timers::SWGameClock::GetTimeElapsed ( void ) {
LONGLONG t;
// Use timeGetTime() if queryperformancecounter is not supported
if (!QueryPerformanceCounter( (LARGE_INTEGER *) &t )) {
t = timeGetTime();
}
return (t - cacheTime);
}
void Timers::SWGameClock::Initialize ( void ) {
if ( !QueryPerformanceFrequency((LARGE_INTEGER *) &this->frequency) ) {
this->frequency = 1000; // 1000ms to one second
}
this->timeCount = DWORD(this->frequency / TICKS_PER_SECOND);
}
Timers::SWGameClock::~SWGameClock(void)
{
}
with a header file that contains the following:
// Required for rendering stuff on time
#pragma once
#define TICKS_PER_SECOND 60
#define MAX_FRAMESKIP 5
namespace Timers {
class SWGameClock
{
public:
static SWGameClock* GetInstance();
void Initialize ( void );
DWORD timeCount;
void GetTime ( LONGLONG* t );
LONGLONG GetTimeElapsed ( void );
LONGLONG frequency;
~SWGameClock(void);
protected:
SWGameClock(void);
private:
static SWGameClock* pInstance;
}; // SWGameClock
} // Timers
This will ensure that your code runs at 60FPS (or whatever you put in) though you can probably dump the MAX_FRAMESKIP as that's not truly implemented in this example!
You could try a WinMain function and use the SetTimer function and a regular message loop (you can also take advantage of the filter mechanism of GetMessage( ... ) ) in which you test for the WM_TIMER message with the requested time and when your counter reaches the limit do a PostQuitMessage(0) to terminate the message loop.
For a duty cycle that fast, you can use a high accuracy timer (like QueryPerformanceCounter) and a busy-wait loop.
If you had a much lower duty cycle, but still wanted precision, then you could Sleep for part of the time and then eat up the leftover time with a busy-wait loop.
Another option is to use something like DirectX to sync yourself to the VSync interrupt (which is almost always 60 Hz). This can make a lot of sense if you're coding a game or a/v presentation.
Windows is not a real-time OS, so there will never be a perfect way to do something like this, as there's no guarantee your thread will be scheduled to run exactly when you need it to.
Note that, according to the remarks for Sleep, the actual amount of time will be at least one "tick" and possibly one whole "tick" longer than the delay you requested before the thread is scheduled to run again (and then we have to assume the thread is scheduled). The "tick" can vary a lot depending on hardware and the version of Windows. It is commonly in the 10-15 ms range, and I've seen it as bad as 19 ms. For 60 Hz, you need 16.666 ms per iteration, so this is obviously not nearly precise enough to give you what you need.
What about rendering (iterating) based on the time elapsed between rendering of each frame? Consider creating a void render(double timePassed) function and render depending on the timePassed parameter instead of putting program to sleep.
Imagine, for example, you want to render a ball falling or bouncing. You would know its speed, acceleration and all the other physics that you need. Calculate the position of the ball based on timePassed and all other physics parameters (speed, acceleration, etc.).
Or if you prefer, you could just skip the render() function execution if the time passed is too small, instead of putting the program to sleep.
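As a rough sketch of the timePassed idea, the falling ball could be advanced like this. Ball, update and kGravity are hypothetical names, the ground and bouncing are left out, and a constant acceleration is assumed.

```cpp
#include <cassert>
#include <cmath>

struct Ball {
    double height;    // metres above the ground
    double velocity;  // metres per second, positive is up
};

constexpr double kGravity = -9.81;  // m/s^2, assumed constant

// Advances the ball by `timePassed` seconds using the closed-form equations for
// constant acceleration, so the result does not depend on how the elapsed time
// is sliced into frames.
inline void update(Ball& ball, double timePassed) {
    assert(timePassed >= 0.0);
    ball.height += ball.velocity * timePassed + 0.5 * kGravity * timePassed * timePassed;
    ball.velocity += kGravity * timePassed;
}
```

One call with timePassed = 1.0 and two calls with 0.25 and 0.75 leave the ball in the same state (up to floating-point rounding), which is what makes the motion independent of the frame rate. The main loop only has to measure the elapsed time between frames and pass it in.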