Controlling FPS limit in OpenGL application - opengl

I am trying to find a solid method for setting exactly how many FPS I want my OpenGL application to render on screen. I can do it to some extent by sleeping for 1000/fps milliseconds, but that doesn't take into account the time needed to render.
What is the most consistent way to limit FPS to the desired amount?

You can sync to vblank by using wglSwapIntervalEXT in OpenGL. It's not nice code, but it does work.
http://www.gamedev.net/topic/360862-wglswapintervalext/#entry3371062
#include <windows.h>      // wglGetProcAddress
#include <GL/wglext.h>    // PFNWGL... function pointer typedefs
#include <string.h>       // strstr

bool WGLExtensionSupported(const char *extension_name)
{
    // Look the extension name up in the space-separated WGL extension string.
    PFNWGLGETEXTENSIONSSTRINGEXTPROC _wglGetExtensionsStringEXT =
        (PFNWGLGETEXTENSIONSSTRINGEXTPROC)wglGetProcAddress("wglGetExtensionsStringEXT");

    if (strstr(_wglGetExtensionsStringEXT(), extension_name) == NULL)
    {
        return false;
    }
    return true;
}
and
PFNWGLSWAPINTERVALEXTPROC    wglSwapIntervalEXT    = NULL;
PFNWGLGETSWAPINTERVALEXTPROC wglGetSwapIntervalEXT = NULL;

if (WGLExtensionSupported("WGL_EXT_swap_control"))
{
    // Extension is supported, init pointers.
    wglSwapIntervalEXT = (PFNWGLSWAPINTERVALEXTPROC)wglGetProcAddress("wglSwapIntervalEXT");

    // this is another function from WGL_EXT_swap_control extension
    wglGetSwapIntervalEXT = (PFNWGLGETSWAPINTERVALEXTPROC)wglGetProcAddress("wglGetSwapIntervalEXT");
}
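The snippet above only loads the function pointers. A minimal, hedged example of actually using them to enable V-Sync (assuming the extension check succeeded) might look like this:
if (wglSwapIntervalEXT != NULL)
{
    // 0 disables V-Sync, 1 syncs every vertical refresh, 2 every second refresh, etc.
    wglSwapIntervalEXT(1);

    // Optionally read the interval back to confirm the driver accepted it.
    int interval = wglGetSwapIntervalEXT();
}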

Since OpenGL is just a low-level graphics API, you won't find anything like this built into OpenGL directly.
However, I think your logic is a bit flawed. Rather than the following:
Draw frame
Wait 1000/fps milliseconds
Repeat
You should do this:
Start timer
Draw frame
Stop timer
Wait (1000/fps - (stop - start)) milliseconds
Repeat
This way you only wait exactly as long as you should, and you end up very close to 60 (or whatever you're aiming for) frames per second.
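As a rough illustration only (assuming C++11 <chrono> and <thread>; drawFrame() and running() are hypothetical placeholders for your own rendering call and loop condition):
#include <chrono>
#include <thread>

void renderLoop(double fps)
{
    using clock = std::chrono::steady_clock;
    const std::chrono::duration<double> frameTime(1.0 / fps);

    while (running())                  // hypothetical loop condition
    {
        auto start = clock::now();
        drawFrame();                   // hypothetical: issue draw calls + swap buffers
        auto elapsed = clock::now() - start;

        if (elapsed < frameTime)       // only sleep for whatever time the frame left over
            std::this_thread::sleep_for(frameTime - elapsed);
    }
}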

Don't use sleeps. If you do, then the rest of your application must wait for them to finish.
Instead, keep track of how much time has passed and render only when 1000/fps has been met. If the timer hasn't been met, skip it and do other things.
In a single-threaded environment it will be difficult to make sure you draw at exactly 1000/fps unless that is absolutely the only thing you're doing. A more general and robust way would be to have all your rendering done in a separate thread and launch/run that thread on a timer. This is a much more complex problem, but will get you the closest to what you're asking for.
Also, keeping track of how long it takes to issue the rendering would help in adjusting on the fly when to render things.
static unsigned int render_time = 0;
static unsigned int last_render_time = 0;

unsigned int now = timeGetTime();
unsigned int elapsed_time = now - last_render_time;

// Start rendering early enough to compensate for how long rendering itself takes.
if (elapsed_time + render_time > 1000 / fps)
{
    unsigned int start_render = timeGetTime();
    // issue rendering commands...
    unsigned int end_render = timeGetTime();

    render_time = end_render - start_render;
    last_render_time = now;
}

OpenGL itself doesn't have any functionality that allows limiting framerate. Period.
However, on modern GPUs there's a lot of functionality covering framerate, frame prediction, etc. John Carmack pushed to get some of that functionality exposed to applications, and there's NVIDIA's adaptive sync.
What does all that mean for you? Leave it up to the GPU. Assume that drawing is totally unpredictable (as you should when sticking to OpenGL only), time the events yourself, and keep the logic updates (such as physics) separate from drawing. That way users will be able to benefit from all those advanced technologies and you won't have to worry about it anymore.
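For the "keep logic separate from drawing" part, a common pattern is a fixed-timestep accumulator loop. A sketch only, not tied to any particular API (update(), render() and swapBuffers() are placeholders):
#include <chrono>

void mainLoop()
{
    using clock = std::chrono::steady_clock;
    const std::chrono::duration<double> dt(1.0 / 120.0);   // fixed step for physics/logic
    std::chrono::duration<double> accumulator(0.0);
    auto previous = clock::now();
    bool running = true;

    while (running)
    {
        auto now = clock::now();
        accumulator += now - previous;   // real time that passed since the last iteration
        previous = now;

        while (accumulator >= dt)        // catch up with as many fixed logic steps as needed
        {
            update(dt.count());          // placeholder: advance physics/game state by dt seconds
            accumulator -= dt;
        }

        render();                        // placeholder: draw the current state
        swapBuffers();                   // placeholder: let the driver/V-Sync pace presentation
    }
}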

An easy way is to use GLUT. This code may do the job, roughly.
static int redisplay_interval;

void timer(int)
{
    glutPostRedisplay();
    glutTimerFunc(redisplay_interval, timer, 0);
}

void setFPS(int fps)
{
    redisplay_interval = 1000 / fps;
    glutTimerFunc(redisplay_interval, timer, 0);
}

Put this after drawing and the call to swap buffers:
//calculate time taken to render last frame (and assume the next will be similar)
thisTime = getElapsedTimeOfChoice(); //the higher resolution this is the better
deltaTime = thisTime - lastTime;
lastTime = thisTime;

//limit framerate by sleeping. a sleep call is never really that accurate
if (minFrameTime > 0)
{
    sleepTime += minFrameTime - deltaTime; //add difference to desired deltaTime
    sleepTime = max(sleepTime, 0); //negative sleeping won't make it go faster :(
    sleepFunctionOfChoice(sleepTime);
}
If you want 60fps, minFrameTime = 1.0/60.0 (assuming time is in seconds).
This won't give you vsync, but will mean that your app won't be running out of control, which can affect physics calculations (if they're not fixed-step), animation etc. Just remember to process input after the sleep! I've experimented with trying to average frame times but this has worked best so far.
For getElapsedTimeOfChoice(), I use what's mentioned here, which is
LINUX: clock_gettime(CLOCK_MONOTONIC, &ts);
WINDOWS: QueryPerformanceCounter
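If C++11 is available, std::chrono::steady_clock can stand in for both of those; a hedged sketch of what getElapsedTimeOfChoice() could look like under that assumption:
#include <chrono>

// A possible getElapsedTimeOfChoice(): seconds since the first call, from a monotonic clock.
// On common implementations std::chrono::steady_clock is built on exactly those primitives
// (CLOCK_MONOTONIC on Linux, QueryPerformanceCounter on Windows).
double getElapsedTimeOfChoice()
{
    using clock = std::chrono::steady_clock;
    static const clock::time_point start = clock::now();
    return std::chrono::duration<double>(clock::now() - start).count();
}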

Another idea is to use waitable timers (when possible, for instance on Windows).
The basic idea:
while (true)
{
    SetWaitableTimer(myTimer, desired_frame_duration, ...);

    PeekMsg(...)
    if (quit....) break;
    if (msg)
        handle message;
    else
    {
        Render();
        SwapBuffers();
    }

    WaitForSingleObject(myTimer);
}
More info: How to limit fps information

Related

Calculate FPS with sfml

This question is NOT related to the main loop in sfml. How do I calculate the actual framerate (for example, I have vsync enabled, but the main loop still runs at a high speed), so that it actually displays how fast the screen is updated (not the speed of the main loop). Measuring main loop speed is not important for me, but the actual window update speed is.
SFML does not provide a way to retrieve your current framerate, and neither do backends like OpenGL. Therefore, the only way is to monitor the main loop speed, as you suggested.
Also, window.setFramerateLimit(60), window.setVerticalSyncEnabled(true), or an internal loop sleep on a 60Hz monitor all have the same effect in my SFML application, with the difference that V-Sync is more CPU and GPU expensive (due to the way it synchronizes).
Therefore you can rely on calculating FPS with chrono, for example, in your main loop.
Make sure to wrap your draw calls between start and end time_points and calculate the elapsed time with std::chrono::duration.
Example:
std::chrono::high_resolution_clock::time_point start;
std::chrono::high_resolution_clock::time_point end;
float fps;

while (window.isOpen())
{
    // Perform some non-rendering logic here...
    // Performed. Now perform GPU stuff...
    start = std::chrono::high_resolution_clock::now();
    // window.draw, etc.
    end = std::chrono::high_resolution_clock::now();
    fps = 1e9f / (float)std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
}

GLX animation slower than expected

I have an application using XCB and openGL. At the beginning, I choose a framebuffer configuration with the following attributes:
const int attributes[] = {GLX_BUFFER_SIZE, 32, GLX_DEPTH_SIZE, 24, GLX_DOUBLEBUFFER, True, GLX_RENDER_TYPE, GLX_RGBA_BIT, None};
fb_configs = glXChooseFBConfig(display, screen_index, attributes, &fb_configs_count);
I run a simple animation which is supposed to last a fixed duration (1s), but showing it on the screen takes much longer (about 5s). After adding logs to show the value of progress, I found out the actual loop only lasts 1s.
struct timeval start; // start time of the animation
gettimeofday(&start, 0);

while (1)
{
    double progress = timer_progress(&start);
    if (progress > 1.0)
        break; // end the animation

    draw(progress);
    glXSwapBuffers(display, drawable);

    xcb_generic_event_t *event = xcb_poll_for_event(connection);
    if (!event)
    {
        usleep(1000);
        continue;
    }

    switch (event->response_type & ~0x80)
    {
    case XCB_EXPOSE:
    default:
        free(event);
        continue;
    }
}
I am not sure what is really going on. I suppose on each iteration glXSwapBuffers() enqueues the OpenGL commands for drawing, and most of them are yet to be executed when the loop is over.
Tweaking the parameter of usleep() has no effect other than to make the animation less smooth or to make the animation much slower. The problem disappears when I switch to single buffering (but I get the problems associated with single buffering).
It seems I'm not doing something right, but I have no idea what.
The exact timing behaviour of glXSwapBuffers is left open to each implementation. NVidia and fglrx opt to block glXSwapBuffers until V-Sync (if V-Sync is enabled), Mesa and Intel opt to return immediately and block at the next call that would no longer fit into the command queue, where calls that would modify the back buffer before V-Sync are held up.
However, if your desire is an exact length for your animation, then a loop with a fixed number of frames and fixed delays will not work. Instead you should redraw as fast as possible (and use delays only to limit your drawing rate). The animation should be advanced by the actual time that elapsed between draw iterations rather than by a fixed timestep (this is in contrast to game logic loops, which should in fact use a fixed time step, albeit at a much faster rate than drawing).
Last but not least, you must not use gettimeofday for controlling animations. gettimeofday reports wall-clock time, which may jump, slow down, speed up, or even run backwards. Use a high-precision monotonic timer instead (clock_gettime(CLOCK_MONOTONIC, …)), as in the sketch below.
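A hedged sketch of that advice (draw(), display and drawable come from the question; monotonic_seconds() is a helper introduced here just for illustration):
#include <time.h>

static double monotonic_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);           /* monotonic: immune to wall-clock jumps */
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* ... inside the animation ... */
double start = monotonic_seconds();
while (1)
{
    double progress = monotonic_seconds() - start; /* progress derived from real elapsed time */
    if (progress > 1.0)
        break;
    draw(progress);
    glXSwapBuffers(display, drawable);
    /* event handling / small sleep as in the original loop */
}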

Why is my rendering thread taking up 100% cpu?

So right now in my OpenGL game engine, when my rendering thread has literally nothing to do, it's taking up the maximum of what my CPU can give it. Windows Task Manager shows my application taking up 25% processing (I have 4 hardware threads, so 25% is the maximum one thread can take). When I don't start the rendering thread at all I get 0-2% (which is worrying on its own, since all it's doing is running an SDL input loop).
So, what exactly is my rendering thread doing? Here's some code:
Timer timer;

while (gVar.running)
{
    timer.frequencyCap(60.0);

    beginFrame();
    drawFrame();
    endFrame();
}
Let's go through each of those. Timer is a custom timer class I made using SDL_GetPerformanceCounter. timer.frequencyCap(60.0); is meant to ensure that the loop doesn't run more than 60 times per second. Here's the code for Timer::frequencyCap():
double Timer::frequencyCap(double maxFrequency)
{
    double duration;

    update();
    duration = _deltaTime;

    if (duration < (1.0 / maxFrequency))
    {
        double dur = ((1.0 / maxFrequency) - duration) * 1000000.0;
        this_thread::sleep_for(chrono::microseconds((int64)dur));
        update();
    }

    return duration;
}

void Timer::update(void)
{
    if (_freq == 0)
        return;

    _prevTicks = _currentTicks;
    _currentTicks = SDL_GetPerformanceCounter();

    // Some sanity checking here. //
    // The only way _currentTicks can be less than _prevTicks is if we've wrapped around to 0. //
    // So, we need some other way of calculating the difference.
    if (_currentTicks < _prevTicks)
    {
        // If we take the difference between UINT64_MAX and _prevTicks, then add that to _currentTicks, we get the proper difference between _currentTicks and _prevTicks. //
        uint64 dif = UINT64_MAX - _prevTicks;

        // The +1 here prevents an off-by-1 error. In truth, the error would be pretty much indistinguishable, but we might as well be correct. //
        _deltaTime = (double)(_currentTicks + dif + 1) / (double)_freq;
    }
    else
        _deltaTime = (double)(_currentTicks - _prevTicks) / (double)_freq;
}
The next 3 functions are considerably simpler (at this stage):
void Renderer::beginFrame()
{
    // Perform a resize if we need to. //
    if (_needResize)
    {
        gWindow.getDrawableSize(&_width, &_height);
        glViewport(0, 0, _width, _height);
        _needResize = false;
    }

    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT);
}

void Renderer::endFrame()
{
    gWindow.swapBuffers();
}

void Renderer::drawFrame()
{
}
The rendering thread was created using std::thread. The only explanation I can think of is that timer.frequencyCap somehow isn't working, except I use that exact same function in my main thread and I idle at 0-2%.
What am I doing wrong here?
If V-Sync is enabled and your program honors the swap interval, then seeing your program take up 100% is actually an artifact of how Windows measures CPU time. It's been a long-known issue: any time your program blocks in a driver context (which is what happens when OpenGL blocks on a V-Sync), Windows accounts this as the program actually consuming CPU time, while it's actually just idling.
If you add a Sleep(1) right after swap buffers it will trick Windows into a more sane accounting; on some systems even a Sleep(0) does the trick.
Anyway, the 100% is just a cosmetic problem, most of the time.
In the past weeks I've done some exhaustive research on low latency rendering (i.e. minimizing the time between user input and corresponding photons coming out of the display), since I'm getting a VR headset soon. And here's what I found out regarding timing SwapBuffers: The sane solution to the problem is actually to time the frame rendering times and add an artificial sleep before SwapBuffers so that you wake up only a few ms before the V-Sync. However this is easier said than done because OpenGL is highly asynchronous and explicitly adding syncs will slow down your throughput.
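For what it's worth, the Sleep(1) workaround mentioned above would slot into the questioner's own endFrame() roughly like this (Windows-only sketch, not a general recommendation):
void Renderer::endFrame()
{
    gWindow.swapBuffers();
    Sleep(1);   // yields the thread so the V-Sync block is accounted as idle time
                // (Sleep(0) is enough on some systems); requires <windows.h>
}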
If you have a complex scene or non-optimized rendering, hit a bottleneck somewhere, or have an error in your GL code, then the framerate usually drops to around 20 fps (at least on NVIDIA) no matter the complexity of the scene, and for very complex scenes even below that.
Try this:
Measure the time it takes to process
beginFrame();
drawFrame();
endFrame();
There you will see your fps limit. Compare it to the scene complexity/HW capability and decide whether it is a bug or just too complex a scene.
Try turning off some GL state. For example, last week I discovered that turning CULL_FACE off actually speeds up one of my non-optimized renderers by about 10-100 times, which I still don't understand (old-style GL code).
Check for GL errors.
I do not see any glFlush()/glFinish() in your code; try measuring with glFinish() (a sketch follows below).
If you can't sort this out, you can still use a dirty trick like adding Sleep(1); to your code. It will force your thread to sleep, so it will never use 100% power; the time it sleeps is 1 ms + scheduler granularity, so it also limits the target fps.
You use this_thread::sleep_for(chrono::microseconds((int64)dur)); I do not know that function; are you really sure it does what you think?
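A rough sketch of the glFinish() timing mentioned above, using the SDL performance counter the question already relies on (glFinish() forces the GPU to complete queued work before the timer stops, at the cost of a pipeline stall):
Uint64 t0 = SDL_GetPerformanceCounter();

beginFrame();
drawFrame();
endFrame();
glFinish();   // wait for the GPU to actually complete the frame before timing it

Uint64 t1 = SDL_GetPerformanceCounter();
double frameMs = 1000.0 * (double)(t1 - t0) / (double)SDL_GetPerformanceFrequency();
// 1000.0 / frameMs is roughly the maximum FPS this scene/HW combination can reach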

Only Execute on Certain Frames in SFML

I am still fairly new to SFML, and right now I am trying to make a basic scrolling plane game. I've got most of the basic stuff down, but I'm beginning to notice that the screen looks laggy. I'm using the window.setFrameRateLimit function so that the result of movement is the same on all computers, but it causes random lag spikes while the program is executing. I think this is because of the frame rate limit and how it is enforced. Is there a better way to have the program only execute at certain times? Ideally, it would also move at the same speed on slower computers.
It seems the setFrameRateLimit lag spikes were fixed in SFML 2.0. If you cannot upgrade to SFML 2.0, you can add your own frame rate limiter. This involves adding a sleep() to your game loop. For example:
while (App.IsOpened())
{
    float time = Clock.GetElapsedTime();

    // update game
    // draw game

    float timeToWait = (1.0 / FRAMES_PER_SECOND) - (Clock.GetElapsedTime() - time);
    if (timeToWait > 0)
    {
        sleep(timeToWait * 1000);
    }
}

How can I implement an accurate (but variable) FPS limit/cap in my OpenGL application?

I am currently working on an OpenGL application to display a few 3D spheres to the user, which they can rotate, move around, etc. That being said, there's not much in the way of complexity here, so the application runs at quite a high framerate (~500 FPS).
Obviously, this is overkill - even 120 would be more than enough, but my issue is that running the application at full speed eats away at my CPU, causing excess heat, power consumption, etc. What I want to do is let the user set an FPS cap so that the CPU isn't being overly used when it doesn't need to be.
I'm working with freeglut and C++, and have already set up the animations/event handling to use timers (using glutTimerFunc). glutTimerFunc, however, only allows an integer number of milliseconds to be set - so if I want 120 FPS, the closest I can get is (int)1000/120 = 8 ms resolution, which equates to 125 FPS (I know it's a negligible amount, but I still just want to put in an FPS limit and get exactly that FPS if I know the system can render faster).
Furthermore, using glutTimerFunc to limit the FPS never works consistently. Let's say I cap my application to 100 FPS; it usually never goes higher than 90-95 FPS. Again, I've tried to work out the time difference between rendering/calculations, but then it always overshoots the limit by 5-10 FPS (timer resolution, possibly).
I suppose the best comparison here would be a game (e.g. Half Life 2) - you set your FPS cap, and it always hits that exact amount. I know I could measure the time deltas before and after I render each frame and then loop until I need to draw the next one, but this doesn't solve my 100% CPU usage issue, nor does it solve the timing resolution issue.
Is there any way I can implement an effective, cross-platform, variable frame rate limiter/cap in my application? Or, in another way, is there any cross-platform (and open source) library that implements high resolution timers and sleep functions?
Edit: I would prefer to find a solution that doesn't rely on the end user enabling VSync, as I am going to let them specify the FPS cap.
Edit #2: To all who recommend SDL (which I did end up porting my application to), is there any difference between using the glutTimerFunc function to trigger a draw and using SDL_Delay to wait between draws? The documentation for each mentions the same caveats, but I wasn't sure if one was more or less efficient than the other.
Edit #3: Basically, I'm trying to figure out if there is a (simple) way to implement an accurate FPS limiter in my application (again, like Half Life 2). If this is not possible, I will most likely switch to SDL (it makes more sense to me to use a delay function rather than use glutTimerFunc to call back the rendering function every x milliseconds).
I'd advise you to use SDL. I personally use it to manage my timers. Moreover, it can limit your fps to your screen refresh rate (V-Sync) with SDL 1.3. That enables you to limit CPU usage while having the best screen performance (even if you had more frames, they wouldn't be displayed, since your screen doesn't refresh fast enough).
The function is
SDL_GL_SetSwapInterval(1);
If you want some code for timers using SDL, you can see that here :
my timer class
Good luck :)
I think a good way to achieve this, no matter what graphics library you use, is to have a single clock measurement in the gameloop to take every single tick (ms) into account. That way the average fps will be exactly the limit just like in Half-Life 2. Hopefully the following code snippet will explain what I am talking about:
//FPS limit
unsigned int FPS = 120;

//double holding clocktime on last measurement
double clock = 0;

while (cont) {
    //double holding difference between clocktimes
    double deltaticks;
    //double holding the clocktime in this new frame
    double newclock;

    //do stuff, update stuff, render stuff...

    //measure clocktime of this frame
    //this function can be replaced by any function returning the time in ms
    //for example clock() from <time.h>
    newclock = SDL_GetTicks();

    //calculate clockticks missing until the next loop should be
    //done to achieve an avg framerate of FPS
    // 1000 / 120 makes 8.333... ticks per frame
    deltaticks = 1000.0 / FPS - (newclock - clock);

    /* if there is an integral number of ticks missing then wait the
       remaining time
       SDL_Delay takes an integer of ms to delay the program like most delay
       functions do and can be replaced by any delay function */
    if (floor(deltaticks) > 0)
        SDL_Delay(deltaticks);

    //the clock measurement is now shifted forward in time by the amount
    //SDL_Delay waited and the fractional part that was not considered yet
    //aka deltaticks; the fractional part is considered in the next frame
    if (deltaticks < -30) {
        /* don't try to compensate more than 30ms (a few frames) behind the
           framerate
           when the limit is higher than the possible avg fps, deltaticks
           would keep sinking without this 30ms limitation
           this ensures the fps even if the real possible fps is
           macroscopically inconsistent. */
        clock = newclock - 30;
    } else {
        clock = newclock + deltaticks;
    }

    /* deltaticks can be negative when a frame took longer than it should
       have or the measured time the frame took was zero
       the next frame then won't be delayed by so long to compensate for the
       previous frame taking longer. */

    //do some more stuff, swap buffers for example:
    SDL_RenderPresent(renderer); //this is SDL's swap buffers function
}
I hope this example with SDL helps. It is important to measure the time only once per frame so every frame is taken into account.
I recommend modularizing this timing into a function, which also makes your code clearer. This code snippet has no comments, in case they just annoyed you in the last one:
unsigned int FPS = 120;

void renderPresent(SDL_Renderer * renderer)
{
    static double clock = 0;
    double deltaticks;
    double newclock = SDL_GetTicks();

    deltaticks = 1000.0 / FPS - (newclock - clock);

    if (floor(deltaticks) > 0)
        SDL_Delay(deltaticks);

    if (deltaticks < -30) {
        clock = newclock - 30;
    } else {
        clock = newclock + deltaticks;
    }

    SDL_RenderPresent(renderer);
}
Now you can call this function in your mainloop instead of your swapBuffer function (SDL_RenderPresent(renderer) in SDL). In SDL you'd have to make sure the SDL_RENDERER_PRESENTVSYNC flag is turned off. This function relies on the global variable FPS but you can think of other ways of storing it. I just put the whole thing in my library's namespace.
This method of capping the framerate delivers exactly the desired average framerate if there are no large differences in the looptime over multiple frames, thanks to the 30ms limit on deltaticks. The deltaticks limit is required: when the FPS limit is higher than the actual framerate, deltaticks will drop indefinitely, and when the framerate then rises above the FPS limit again, the code would try to compensate for the lost time by rendering every frame immediately, resulting in a huge framerate until deltaticks rises back to zero. You can modify the 30ms to fit your needs; it is just an estimate of mine. I did a couple of benchmarks with Fraps. It works with every imaginable framerate and delivers beautiful results from what I have tested.
I must admit I coded this just yesterday, so it is not unlikely to have some kind of bug. I know this question was asked 5 years ago, but the given answers did not satisfy me. Also feel free to edit this post, as it is my very first one and probably flawed.
EDIT:
It has been brought to my attention that SDL_Delay is very, very inaccurate on some systems. I heard of a case where it delayed by far too much on Android. This means my code might not be portable to all your desired systems.
The easiest way to solve it is to enable Vsync. That's what I do in most games to prevent my laptop from getting too hot.
As long as you make sure the speed of your rendering path is not connected to the other logic, this should be fine.
There is a function glutGet(GLUT_ELAPSED_TIME) which returns the time since the program started in milliseconds, but that's likely still not fast enough.
A simple way is to make your own timer method, which uses the high-performance counter (QueryPerformanceCounter) on Windows and gettimeofday on POSIX systems.
Or you can always use timer functions from SDL or SFML, which do basically the same as above.
You should not try to limit the rendering rate manually; instead, synchronize with the display's vertical refresh. This is done by enabling V-Sync in the graphics driver settings. Apart from preventing (your) programs from rendering at too high a rate, it also increases picture quality by avoiding tearing.
The swap interval extensions allow your application to fine tune the V sync behaviour. But in most cases just enabling V sync in the driver and letting the buffer swap block until sync suffices.
I would suggest using sub-ms precision system timers (QueryPerformanceCounter, gettimeofday) to get timing data. These can help you profile performance in optimized release builds also.
Some background information:
SDL_Delay is pretty much the same as Sleep/sleep/usleep/nanosleep but it is limited to milliseconds as parameter
Sleeping works by relying on the systems thread scheduler to continue your code.
Depending on your OS and hardware the scheduler may have a lower tick frequency than 1000hz, which results in longer timespans than you have specified when calling sleep, so you have no guarantee to get the desired sleep time.
You can try to change the scheduler's frequency. On Windows you can do it by calling timeBeginPeriod() (see the sketch after this list). For Linux systems, check out this answer.
Even if your OS supports a scheduler frequency of 1000hz your hardware may not, but most modern hardware does.
Even if your scheduler's frequency is at 1000hz sleep may take longer if the system is busy with higher priority processes, but this should not happen if your system isn't under super high load.
To sum up, you may sleep for microseconds on some tickless linux kernels, but if you are interested in a cross platform solution you should try to get the scheduler frequency up to 1000hz to ensure the sleeps are accurate in most of the cases.
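On Windows, raising the scheduler resolution for the lifetime of your program looks roughly like this (a sketch; runGame() is a placeholder for your main loop, and the calls require linking against winmm):
#include <windows.h>
#include <mmsystem.h>   // timeBeginPeriod / timeEndPeriod, link with winmm.lib

int main()
{
    timeBeginPeriod(1);   // request 1 ms scheduler granularity for more accurate sleeps
    runGame();            // placeholder for your main loop
    timeEndPeriod(1);     // always pair with timeEndPeriod using the same value
    return 0;
}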
To solve the rounding issue for 120 FPS:
1000/120 = 8.333 ms
(int)1000/120 = 8 ms
Either you do a busy wait for 333 microseconds and sleep for 8 ms afterwards, which costs some CPU time but is super accurate (see the sketch below).
Or you follow Neop's approach by sleeping sometimes 8 ms and sometimes 9 ms to average out at 8.333 ms, which is way more efficient but less accurate.
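A sketch of the first option under C++11 (<chrono>/<thread>): sleep away everything but the last millisecond, then busy-wait the remaining fraction. waitUntil() is a helper introduced here just for illustration.
#include <chrono>
#include <thread>

// Wait until 'frameEnd': sleep for all but the last millisecond, then spin for the rest.
void waitUntil(std::chrono::steady_clock::time_point frameEnd)
{
    using namespace std::chrono;

    auto sleepEnd = frameEnd - milliseconds(1);     // leave ~1 ms for the accurate busy-wait
    if (steady_clock::now() < sleepEnd)
        std::this_thread::sleep_until(sleepEnd);    // cheap but only scheduler-accurate

    while (steady_clock::now() < frameEnd)
        ;                                           // accurate, burns CPU for under 1 ms
}
For the 120 FPS case, frameEnd would be the previous frame's deadline plus 8333 microseconds.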