I am trying to call a function every 1 ms, and I would like to do this on Windows. So I tried the multimedia timer API.
Multimediatimer API
Source
idTimer = timeSetEvent(
1,
0,
TimerProc,
0,
TIME_PERIODIC|TIME_CALLBACK_FUNCTION );
Most of the time the period was the expected 1 ms, but sometimes I got double the period. See the small bump at around 1.95 ms.
multimediatimerHistogram http://www.freeimagehosting.net/uploads/8b78f2fa6d.png
My first thought was that maybe my method was running too long. But I measured this already and this was not the case.
Queued Timers API
My next attempt used the queued timers API:
hTimerQueue = CreateTimerQueue();
if(hTimerQueue == NULL)
{
printf("Error creating queue: 0x%x\n", GetLastError());
}
BOOL res = CreateTimerQueueTimer(
&hTimer,
hTimerQueue,
TimerProc,
NULL,
0,
1, // 1ms
WT_EXECUTEDEFAULT);
But this result was not as expected either. Now I get a cycle time of 2 ms most of the time.
queuedTimer http://www.freeimagehosting.net/uploads/2a46259a15.png
Measurement
For measuring the times I used the method QueryPerformanceCounter and QueryPerformanceFrequency.
Question
So now my question is if somebody encountered similar problems under windows and maybe even found a solution?
Thanks.
Without going to a real-time OS, you cannot expect to have your function called every 1 ms.
On Windows, which is NOT a real-time OS (Linux is similar), a program that repeatedly reads the current time with microsecond precision and stores the consecutive differences in a histogram will have a non-empty bin for >10 ms! This means that sometimes you will get 2 ms, but you can also get more between your calls.
You can try calling timeBeginPeriod(1) at program start and timeEndPeriod(1) before quitting. This will probably improve the timer precision.
A call to NtQueryTimerResolution() will return a value for ActualResolution. In your case the actual resolution is almost certainly 0.9765625 ms. This is exactly what you show in the first plot.
The second occurrence, at about 1.95 ms, is more precisely Sleep(1) = 1.9531 ms = 2 x 0.9765625 ms.
I guess the interrupt period runs at something close to 1 ms (0.9765625 ms).
And now the trouble begins: The timer signals when the desired delay expires.
Say the ActualResolution is set to 0.9765625 ms. The interrupt heartbeat of the system then runs at 0.9765625 ms periods, or 1024 Hz. Now a call to Sleep is made with a desired delay of 1 ms. Two scenarios have to be considered:
The call was made < 1ms (ΔT) ahead of the next interrupt. The next interrupt will not confirm that the desired period of time has expired. Only the following interrupt will cause the call to return. The resulting sleep delay will be ΔT + 0.9765625 ms.
The call was made >= 1ms (ΔT) ahead of the next interrupt. The next interrupt will force the call to return. The resulting sleep delay will be ΔT.
So the result depends a lot on when the call was made and therefore you may observe 0.98ms events as well as 1.95ms events.
Edit: Using CreateTimerQueueTimer will push the observed delay to 1.95 ms because the timer tick (interrupt period) is 0.9765625 ms. On the first occurrence of the interrupt, the requested duration of 1 ms has not quite expired, so the TimerProc is only triggered after the second interrupt (2 x 0.9765625 ms = 1.953125 ms > 1 ms). Consequently, the queued timer plot shows its peak at 1.953125 ms.
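The tick arithmetic above can be reproduced with a small model. This is only a sketch under the assumption that a wait can complete solely on a system tick; the function name observed_delay_ms and its parameters are made up for illustration and are not part of any Windows API.

```cpp
// Model of a tick-driven wait. All values are in milliseconds.
// delta_to_next_tick: time from the call until the next system tick.
// requested:          the delay that was asked for.
// tick_period:        the interrupt period (must be > 0).
// Returns the time of the first tick at which the requested delay has expired.
double observed_delay_ms(double delta_to_next_tick, double requested, double tick_period) {
    double t = delta_to_next_tick;  // time of the first tick after the call
    while (t < requested) {         // requested delay has not expired at this tick
        t += tick_period;           // so the wait continues until the following tick
    }
    return t;
}
```

With a tick period of 0.9765625 ms and a request of 1 ms, a call made right after a tick (delta 0.9765625) yields 1.953125 ms in this model, and a call made 0.5 ms before a tick yields 1.4765625 ms.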
Note: This behavior strongly depends on the underlying hardware.
More details can be found at the Windows Timestamp Project
Related
I'm trying to create a game using C++ and I want to limit the frame rate, but I always get more or fewer fps than I want. Games that have an fps limit always seem to hit a precise frame rate. I tried Sleep() and std::this_thread::sleep_for / sleep_until. For example, I used Sleep(0.01-deltaTime) to get 100 fps but ended up with about 90 fps.
How do these games handle fps so precisely when any sleeping isn't precise?
I know I can use an infinite loop that just checks whether the time has passed, but that uses the full power of the CPU. I want this limit to decrease CPU usage, without VSync.
Yes, sleep is usually inaccurate. That is why you sleep for less than the time remaining in the frame. For example, if 5 more milliseconds remain until the frame should end, sleep for 4 milliseconds. After the sleep, simply busy-wait (spin) for the rest of the frame. Something like
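float TimeRemaining = NextFrameTime - GetCurrentTime();
int SleepMs = ConvertToMilliseconds(TimeRemaining) - 1; // wake up about 1 ms early
if (SleepMs > 0) Sleep(SleepMs); // Sleep takes an unsigned DWORD, so never pass a negative value
while (GetCurrentTime() < NextFrameTime) {} // spin for the rest of the frame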
Edit: as stated in another answer, timeBeginPeriod() should be called to increase the accuracy of Sleep(). Also, from what I've read, Windows will automatically call timeEndPeriod() when your process exits if you don't before then.
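A portable version of the sleep-then-spin idea can be sketched with std::chrono. The name wait_until_hybrid and the default 2 ms spin margin are assumptions for illustration; the margin should be tuned to exceed the worst oversleep you observe on your platform.

```cpp
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Sleep for most of the wait, then spin for the last `spin_margin`.
// spin_margin is a tuning assumption, not a documented constant.
void wait_until_hybrid(Clock::time_point deadline,
                       Clock::duration spin_margin = std::chrono::milliseconds(2)) {
    const auto now = Clock::now();
    if (deadline - now > spin_margin) {
        // Coarse wait that is cheap on the CPU; may wake up late.
        std::this_thread::sleep_until(deadline - spin_margin);
    }
    while (Clock::now() < deadline) {
        // Busy-wait for the short remainder.
    }
}
```

A frame loop would advance a deadline by one frame period per iteration and call wait_until_hybrid(deadline) at the end of each frame. The spin only burns CPU for the margin, not for the whole frame.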
You could record the time point when you start, add a fixed duration to it and sleep until the calculated time point occurs at the end (or beginning) of every loop. Example:
#include <chrono>
#include <cstdint> // std::intmax_t
#include <iostream>
#include <ratio>
#include <thread>
template<std::intmax_t FPS>
class frame_rater {
public:
frame_rater() : // initialize the object keeping the pace
time_between_frames{1}, // std::ratio<1, FPS> seconds
tp{std::chrono::steady_clock::now()}
{}
void sleep() {
// add to time point
tp += time_between_frames;
// and sleep until that time point
std::this_thread::sleep_until(tp);
}
private:
// a duration with a length of 1/FPS seconds
std::chrono::duration<double, std::ratio<1, FPS>> time_between_frames;
// the time point we'll add to in every loop
std::chrono::time_point<std::chrono::steady_clock, decltype(time_between_frames)> tp;
};
// this should print ~10 times per second pretty accurately
int main() {
frame_rater<10> fr; // 10 FPS
while(true) {
std::cout << "Hello world\n";
fr.sleep(); // let it sleep any time remaining
}
}
The accepted answer sounds really bad. It would not be accurate and it would burn the CPU!
Thread.Sleep is not accurate unless you tell it to be: by default its granularity is about 15 ms, which means that if you tell it to sleep 1 ms it could sleep 15 ms.
You can do this with the Win32 API functions timeBeginPeriod and timeEndPeriod.
Check MSDN for more details -> https://learn.microsoft.com/en-us/windows/win32/api/timeapi/nf-timeapi-timebeginperiod
Be very careful when implementing any wait that is based on scheduler sleep.
Most OS schedulers have higher latency turn-around for a wait with no well-defined interval or signal to bring the thread back into the ready-to-run state.
Sleeping isn't inaccurate per se; you're just approaching the problem all wrong. If you have access to something like DXGI's Waitable Swapchain, you synchronize to the DWM's present queue and get really reliable low-latency timing.
You don't need to busy-wait to get accurate timing; a waitable timer will give you a sync object to reschedule your thread.
Whatever you do, do not use the currently accepted answer in production code. There's an edge case here you WANT TO AVOID, where Sleep (0) does not yield CPU time to higher priority threads. I've seen so many game devs try Sleep (0) and it's going to cause you major problems.
Use a timer.
Some OSes provide special functions. For example, on Windows you can use SetTimer and handle its WM_TIMER messages.
Then calculate the timer interval. 100 fps means that the timer must fire an event every 0.01 seconds.
In the event handler for this timer event you can do your rendering.
If the rendering is slower than the desired frequency, use a synchronization flag (such as an OpenGL sync object) and discard the timer event if the previous rendering is not complete.
You may set a const fps variable to your desired frame rate, then you can update your game if the elapsed time from last update is equal or more than 1 / desired_fps.
This will probably work.
Example:
const /*or constexpr*/ int fps{60};
// then at update loop.
while(running)
{
// update the game timer.
timer->update();
// check for any events.
if(timer->ElapsedTime() >= 1.0 / fps) // 1.0, not 1: the integer division 1 / fps would always be 0
{
// do your updates and THEN renderer.
}
}
I am new to Python and PsychoPy, but I have extensive experience in programming and in designing experiments (using Matlab and EPrime). I am running an RSVP (rapid serial visual presentation) experiment which displays a different visual stimulus every X ms (X is an experimental variable and can range from 100 ms to 1000 ms). As this is a physiological experiment, I need to send triggers over the parallel port exactly on stimulus onset. I test the sync between triggers and visual onset using an oscilloscope and a photosensor. However, whether I send my trigger before or after win.flip(), and even with the window's waitBlanking=False parameter, I still get a difference between the onset of the stimulus and the onset of the code.
Attached is my code:
im=[]
for pic in picnames:
    im.append(visual.ImageStim(myWin,image=pic,pos=[0,0],autoLog=True))
myWin.flip() # to get to the next vertical blank
while tm < and t < len(codes):
    im[tm].draw()
    parallel.setData(codes[t]) # before
    myWin.flip()
    #parallel.setData(codes[t]) # after
    ttime.append(myClock.getTime())
    core.wait(0.01)
    parallel.setData(0)
    dur=(myClock.getTime()-ttime[t])*1000
    while dur < stimDur-frameDurAvg+1:
        dur=(myClock.getTime()-ttime[t])*1000
    t=t+1
    tm=tm+1
myWin.flip()
How can I sync my stimulus onset to the trigger? I'm not sure if this is a graphics card issue (I'm using a LCD ACER screen with the onboard Intel graphics card). Many thanks,
Shani
win.flip() waits for next monitor update. This means that the next line after win.flip() is executed almost exactly when the monitor begins drawing the frame. That's where you want to send your trigger. The line just before win.flip() is potentially almost one frame earlier, e.g. 16.7 ms on a 60Hz monitor so your trigger would arrive too early.
There are two almost identical ways to do it. Let's start with the most explicit:
for i in range(10):
    win.flip()
    # On the first flip
    if i == 0:
        parallel.setData(255)
        core.wait(0.01)
        parallel.setData(0)
... so the signal is sent just after the image has been pushed to the monitor.
The slightly more timing-accurate way to do it will save you like 0.01 ms (plus minus an order of magnitude). Somewhere early in the script define
def sendTrigger(code):
    parallel.setData(code)
    core.wait(0.01)
    parallel.setData(0)
Then do
win.callOnFlip(sendTrigger, code=255)
for i in range(10):
    win.flip()
This will call the function just after the first flip, before psychopy does a bit of housecleaning. So the function could have been called win.callOnNextFlip since it's only executed on the first following flip.
Again, this difference in timing is so minuscule compared to other factors that this is not really a question of performance but rather of style preference.
There is a hidden timing variable that is usually ignored - the monitor input lag, and I think this is the reason for the delay. Put simply, the monitor needs some time to display the image even after getting the input from the graphics card. This delay has nothing to do with the refresh rate (how many times the screen switches buffer), or the response time of the monitor.
In my monitor, I find a delay of 23ms when I send a trigger with callOnFlip(). How I correct it is: floor(23/16.667) = 1, and 23%16.667 = 6.333. So I call the callOnFlip on the second frame, wait 6.3 ms and trigger the port. This works. I haven't tried with WaitBlanking=True, which waits for the blanking start from the graphics card, as that gives me some more time to prepare the next buffer already. However, I think that even with WaitBlanking=True the effect will be there. (More after testing!)
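The split of the measured lag into whole frames plus a remainder is plain arithmetic, sketched here in C++ (it carries over to Python unchanged). The names LagSplit and split_input_lag are hypothetical; the 23 ms and 16.667 ms values are the ones quoted above.

```cpp
#include <cmath>

struct LagSplit {
    int whole_frames;     // number of full refresh periods contained in the lag
    double remainder_ms;  // extra wait after those frames, in milliseconds
};

// Split a measured input lag into whole refresh periods plus a remainder.
LagSplit split_input_lag(double lag_ms, double frame_ms) {
    const int frames = static_cast<int>(std::floor(lag_ms / frame_ms));
    return {frames, std::fmod(lag_ms, frame_ms)};
}
```

For a 23 ms lag on a 60 Hz display this gives one whole frame and roughly 6.333 ms of additional wait before triggering the port.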
Best,
Suddha
There is at least one routine that you can use to normalize the trigger delay to your screen refresh rate. I just tested it with a photosensor cell and went from a mean delay of 13 milliseconds (sd = 3.5 ms) between the trigger and the stimulus display to a mean delay of 4.8 milliseconds (sd = 3.1 ms).
The procedure is the following :
Compute the mean duration between two displays. Say your screen has a refresh rate of 85.05 Hz (this is my case). This means that there is a mean duration of 1000/85.05 = 11.76 milliseconds between two refreshes.
Just after you call win.flip(), wait for this average delay before you send your trigger: core.wait(0.01176).
This will not ensure that all your delays now equal zero, since you cannot control the synchronization between the win.flip() command and the current state of your screen, but it will center the delay around zero. At least, it did for me.
So the code could be updated as follows:
refr_rate = 85.05
mean_delay_ms = (1000 / refr_rate)
mean_delay_sec = mean_delay_ms / 1000 # Psychopy needs timing values in seconds
def send_trigger(port, value):
    core.wait(mean_delay_sec)
    parallel.setData(value)
    core.wait(0.001)
    parallel.setData(0)
[...]
stimulus.draw()
win.flip()
send_trigger(port, value)
[...]
I have a program that runs every 5 minutes when the stock market is open, which it does by running once, then entering the following function, which returns once 5 minutes has passed if the stock market is open.
What I don't understand, is that after a period of time, usually about 18 or 19 hours, it crashes returning a sigsegv error. I have no idea why, as it isn't writing to any memory - although I don't know much about the systemtime type, so maybe that's it?
Anyway, any help you could give would be very much appreciated! Thanks in advance!!
void KillTimeUntilNextStockDataReleaseOnWeb()
{
SYSTEMTIME tLocalTimeNow;
cout<<"\n*****CHECKING IF RUN HAS JUST COMPLETED OR NOT*****\n";
GetLocalTime(&tLocalTimeNow);//CHECK IF A RUN HAS JUST COMPLETED. IF SO, AWAIT NEXT 5 MINUTE MARK
while((tLocalTimeNow.wMinute % 5)==0)
GetLocalTime(&tLocalTimeNow);
cout<<"\n*****AWAITING 5 MINUTE MARK TO UPDATE STOCK DATA*****\n";
GetLocalTime(&tLocalTimeNow);//LOOP THROUGH THIS SECTION, CHECKING CURRENT TIME, UNTIL 5 MINUTE UPDATE. THEN PROCEED
while((tLocalTimeNow.wMinute % 5)!=0)
GetLocalTime(&tLocalTimeNow);
cout<<"\n*****CHECKING IF MARKET IS OPEN*****\n";
//CHECK IF STOCK MARKET IS EVEN OPEN. IF NOT, REPEAT
GetLocalTime(&tLocalTimeNow);
while((tLocalTimeNow.wHour < 8)||(tLocalTimeNow.wHour) > 17)
GetLocalTime(&tLocalTimeNow);
cout<<"\n*****PROGRAM CONTINUING*****\n";
return;
}
If you want to "wait for X seconds", note that the Windows system call Sleep(x) sleeps for x milliseconds, so you need Sleep(1000 * X). Note, however, that if you sleep for, say, 300 s after some operation that took 3 seconds, you drift by 3 seconds every 5 minutes. That may not matter, but if it's critical that you keep the same timing all the time, you should figure out [based on time or some such function] how long it is to the next boundary, and then sleep for that amount [possibly run a bit short, then check again and sleep a little more if you woke up early]. If "every five minutes" is more of an approximate thing, then 300 s is fine.
There are other methods to wait for a given amount of time, but I suspect the above is sufficient.
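The "how long until the next boundary" calculation can be sketched as follows. The helper name ms_until_next_boundary is made up for illustration, and it assumes you have already converted the current wall-clock time to a non-negative millisecond count (for example milliseconds since local midnight, built from the SYSTEMTIME fields).

```cpp
#include <cstdint>

// Milliseconds from now_ms until the next multiple of period_ms.
// Assumes now_ms >= 0 and period_ms > 0.
// Returns period_ms when now_ms is exactly on a boundary, so a run that
// just finished on the mark waits for the following one.
std::int64_t ms_until_next_boundary(std::int64_t now_ms, std::int64_t period_ms) {
    return period_ms - (now_ms % period_ms);
}
```

At 10:02:30 with a 5 minute period (300000 ms) this yields 150000 ms, which could be passed to Sleep instead of polling GetLocalTime in a tight loop. Checking the time again after waking guards against an early or late wake-up.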
Instead of using a busy loop, or even Sleep() in a loop, I would suggest using a Waitable Timer instead. That way, the calling thread can sleep effectively while it is waiting, while still providing a mechanism to "wake up" early if needed.
I want to use select() to receive update from other server and also send out periodic messages. Consider the following set up:
while(1){
select(... timeout = 5 seconds);
// some other code}
If I receive update at t = 2 seconds, then select() will return and corresponding statement will be executed. When the next loop begins, timeout will be set to 5 seconds again. However, it should be 5 - 2 = 3 seconds. Is there a way to update the timer with the right time?
I thought about manually starting a timer right before select(); however, this timer might not be synchronous with the one used in select(), and it could cause other problems.
According to the select man page:
On Linux, select() modifies timeout to reflect the amount of time not slept; most other implementations do not do this. (POSIX.1-2001 permits either behaviour.)
So, you just simply reuse the timeout variable. You only reset its value when you really time-out.
As the warning suggests, relying on this behavior makes for a porting problem, so if you rely on this behavior, make sure you document it so that the right thing is done when porting the code.
Just remember time() in a variable before you call select(), get another time() when select() returns and... in the next while(1) iteration use not 5, but 5 - difference_between_times for timeout value.
Perhaps you'd want to use new_timeout = 5 - difference_between_times % 5, so that if your operation after select returns takes longer than 5 seconds... you still set the timeout in 5 sec interval.
You probably should use not seconds, but some more granular time unit. And think whether above is the behaviour you really want (with modulo). Maybe when difference_between_times > 5, you should wait just for 5 sec. Do as you wish, but you get the idea.
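The modulo variant described above can be sketched with a finer time unit. The function name remaining_timeout is hypothetical, and measuring elapsed with std::chrono::steady_clock instead of time() is a substitution made here, not something the answer prescribes.

```cpp
#include <chrono>

// Timeout for the next select() call, given how much of the current period
// has already elapsed. If more than one period has passed, the modulo keeps
// the wake-ups on the original period grid.
// Assumes period > 0 and elapsed >= 0.
std::chrono::milliseconds remaining_timeout(std::chrono::milliseconds period,
                                            std::chrono::milliseconds elapsed) {
    return period - (elapsed % period);
}
```

With a 5 second period, 2 seconds elapsed gives 3 seconds, and 7 seconds elapsed also gives 3 seconds. The result still has to be split into tv_sec and tv_usec before it is handed to select().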
When your app gets a little more complicated, you may have multiple timers with different timeout intervals. We do. Here is how we handle it.
Each timer has a timer object with a time_t of when the timer expires. We store all the timers in a heap data structure, so the soonest timer to expire is at the root of the heap. Before doing a select() we fetch the root of the heap, and subtract the current time from the timer's expiration time and use that delta as the timeout to the select() call.
Timer * t = heap->Root();
time_t now = time(0);
timeval tv;
tv.tv_sec = t->when - now;
tv.tv_usec = 0;
select( ... & tv );
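A minimal sketch of the same idea with the standard library, assuming std::priority_queue stands in for the heap and steady_clock time points stand in for time_t. TimerHeap and next_timeout are illustrative names, not the classes used in the snippet above. Unlike the snippet, it clamps the delta at zero so an already-expired timer does not produce a negative timeout.

```cpp
#include <chrono>
#include <functional>
#include <queue>
#include <vector>

using Clock = std::chrono::steady_clock;

// Min-heap of expiry times: top() is the timer that expires soonest.
using TimerHeap = std::priority_queue<Clock::time_point,
                                      std::vector<Clock::time_point>,
                                      std::greater<Clock::time_point>>;

// Timeout to hand to select(): time until the soonest timer, clamped at zero
// so an expired timer makes select() return immediately.
// Precondition: the heap is not empty.
std::chrono::microseconds next_timeout(const TimerHeap& timers, Clock::time_point now) {
    const auto delta = timers.top() - now;
    if (delta <= Clock::duration::zero()) {
        return std::chrono::microseconds(0);
    }
    return std::chrono::duration_cast<std::chrono::microseconds>(delta);
}
```

After select() returns, the caller would pop and run every timer whose expiry is not later than the current time, then compute the next timeout the same way.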
So I made a game loop that uses the SDL_Delay function to cap the frames per second. It looks like this:
//While the user hasn't quit
while( stateID != STATE_EXIT )
{
//Start the frame timer
fps.start();
//Do state event handling
currentState->handle_events();
//Do state logic
currentState->logic();
//Change state if needed
change_state();
//Do state rendering
currentState->render();
//Update the screen
if( SDL_Flip( screen ) == -1 )
{
return 1;
}
//Cap the frame rate
if( fps.get_ticks() < 1000 / FRAMES_PER_SECOND )
{
SDL_Delay( ( 1000 / FRAMES_PER_SECOND ) - fps.get_ticks() );
}
}
So when I run my game at 60 frames per second (which is the "eye cap", I assume) I can still see laggy motion, meaning I see the frames appearing individually, which makes the motion unsmooth.
This is apparently because the SDL_Delay function is not very accurate, causing a difference of +/- 15 milliseconds or so between frames, greater than whatever I want it to be.
(All of these are just my assumptions.)
So I am searching for a good and accurate timer that will help me with this problem.
Any suggestions?
I think there is a similar question in How to make thread sleep less than a millisecond on Windows
But as a game programmer myself, I don't rely on sleep functions to manage frame-rate (the parameter they take is just a minimum). I just draw stuff on screen as fast as I can. I have a bunch of function calls in my game loop, and then I keep track of how often I'm calling them. For instance, I check input quite often (1000x/second) to make the game more responsive, but I don't check the network inbox more than 100x/second.
For example:
#define NW_CHECK_INTERVAL 10
#define INPUT_CHECK_INTERVAL 1
uint32_t last_nw_check = 0, last_input_check = 0;
while (game_running) {
uint32_t now = SDL_GetTicks();
if (now - last_nw_check > NW_CHECK_INTERVAL) {
check_network();
last_nw_check = now;
}
if (now - last_input_check > INPUT_CHECK_INTERVAL) {
check_input();
last_input_check = now;
}
check_video();
// and so on...
}
Use the QueryPerformanceCounter / Frequency for that.
LARGE_INTEGER start, end, tps; //tps = ticks per second
QueryPerformanceFrequency( &tps );
QueryPerformanceCounter( &start );
QueryPerformanceCounter( &end );
int usPassed = (end.QuadPart - start.QuadPart) * 1000000 / tps.QuadPart;
Here's a small wait function I had created for timing midi sequences using QueryPerformanceCounter:
void wait(int waitTime) {
LARGE_INTEGER time1, time2, freq;
if(waitTime == 0)
return;
QueryPerformanceCounter(&time1);
QueryPerformanceFrequency(&freq);
do {
QueryPerformanceCounter(&time2);
} while((time2.QuadPart - time1.QuadPart) * 1000000ll / freq.QuadPart < waitTime);
}
To convert ticks to microseconds, calculate the difference in ticks, multiply by 1,000,000 (microseconds/second) and divide by the frequency of ticks per second.
Note that some things may throw this off, for instance the precision of the high-resolution counter is not likely to be down to a single microsecond. For example, if you want to wait 10 microseconds and the precision/frequency is one tick every 6 microseconds, your 10 microsecond wait will actually be no less than 12 microseconds. Again, this frequency is system dependent and will vary from system to system.
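The conversion and the granularity effect can be sketched as two small helpers. Both names are hypothetical. Splitting the conversion into whole seconds plus a remainder is an addition made here to avoid 64-bit overflow for very long intervals; it is not something the code above does.

```cpp
#include <cstdint>

// Convert a tick difference to microseconds.
// Handling whole seconds separately keeps ticks * 1000000 from overflowing
// when the tick difference is very large.
std::int64_t ticks_to_us(std::int64_t ticks, std::int64_t ticks_per_second) {
    const std::int64_t whole = ticks / ticks_per_second;
    const std::int64_t rest = ticks % ticks_per_second;
    return whole * 1000000 + rest * 1000000 / ticks_per_second;
}

// Shortest wait a tick-polling loop can deliver: the request rounded up to
// a whole number of ticks.
std::int64_t shortest_wait_us(std::int64_t requested_us, std::int64_t us_per_tick) {
    const std::int64_t ticks = (requested_us + us_per_tick - 1) / us_per_tick;  // ceiling division
    return ticks * us_per_tick;
}
```

With one tick every 6 microseconds, a requested wait of 10 microseconds rounds up to two ticks, which is the 12 microseconds mentioned above.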
Also, Windows is not a real-time operating system. A process may be preempted at any time and it is up to Windows to decide when the process is rescheduled. The application may be preempted in the middle of this function and not restarted again until long after the expected wait time has elapsed. There really isn't much you can do about it but you'll probably never notice it if it happens.
60 frames per second is just the frequency of mains power in the US (50 in Europe; Africa and Asia are somewhat mixed) and is the frequency of video refresh for reasons of hardware convenience (it can be an integer multiple on more sophisticated monitors). It was a mandatory constraint for CRT displays, and it is still a convenient reference for LCDs (that's how frequently the frame buffer is uploaded to the display).
The eye-cap is no more than 20-25 fps - not to be confused with retina persistency, that's about one-half - and that's why TV interlaces two fields on every refresh.
Independently of the timing accuracy, no hardware device can be updated during its buffer scan (otherwise the image changes while it is shown, resulting in half-drawn broken frames). Hence, if you go faster than one half of the device refresh rate, you are queued behind it and forced to wait for it.
60 fps in a game loop serves only to help CPU manufacturers to sell new faster CPUs. Slow down under 25 and everything will look more fluid.
SDL_Delay:
This function waits a specified number of milliseconds before returning. It waits at least the specified time, but possibly longer due to OS scheduling. The delay granularity is at least 10 ms. Some platforms have shorter clock ticks but this is the most common.
The actual delays observed with this function depend on OS settings. I'd suggest looking into the Multimedia Timer API, particularly the timeBeginPeriod function, to adapt the interrupt frequency to your requirements.
Obtaining and Setting Timer Resolution shows an example of how to change the interrupt period to about 1 ms. This way you don't have the 15 ms hiccup anymore. BTW: the eye-catch period is about 40 ms.
Obtaining fixed-period timing can also be addressed with Waitable Timer Objects. But the use of multimedia timers is mandatory to obtain decent resolution, no matter what.
Using other tools to improve the timing capabilities is discussed here.