How is this FPS calculated? - opengl

In my OpenGL book, it says this:
"What often happens on such a system is that the frame is too complicated
to draw in 1/60 second, so each frame is displayed more than once. If, for
example, it takes 1/45 second to draw a frame, you get 30 fps, and the
graphics are idle for 1/30 1/45 = 1/90 second per frame, or one-third of
the time."
In the sentence that says "it takes 1/45 second to draw a frame, you get 30 fps", why do I get only 30 fps? Wouldn't 45 fps be more correct?

The graphics card will normally only buffer one frame ahead.
If it takes 1/45 of a second to draw a frame, then at the 1/60 of a second mark, the previous frame will be redisplayed. At the 1/45 mark the next frame is done, but the card doesn't have a free buffer to start rendering the one after it, so it has to sit idle until 1/30, when it can send out that frame and start working on the next one.
This is with VSync enabled - if you disable it, instead of getting the 30FPS framerate and an idle card 1/3rd of the time, the card will start redrawing immediately, and you'll get screen tearing instead.
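For reference, toggling this is usually a single call that sets the swap interval; here's a minimal sketch assuming GLFW (the question doesn't say which toolkit is in use):

glfwMakeContextCurrent(window);  // the swap interval applies to the current context
glfwSwapInterval(1);             // V-Sync on: swaps wait for the retrace (60/30/20... fps caps)
glfwSwapInterval(0);             // V-Sync off: swaps happen immediately, tearing becomes possible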

It's correct. You'd get 45 fps, but the system slows it down to 30 fps to achieve a smooth framerate on 60 Hz (60 redraws per second) monitors.
Because you need to put something on screen every 1/60 of a second on a 60 Hz monitor, and can't draw "half a frame", you must redisplay the previous frame. So out of the 60 refreshes per second, every other one shows a newly rendered frame and the one in between repeats the previous frame, which gives you 30 fps even though the card could manage 45 fps.
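To make the arithmetic concrete, here is a small sketch of that calculation in C++ (the function name is just for illustration):

#include <cmath>
#include <cstdio>

// With V-Sync, a frame can only be shown on a refresh boundary, so the displayed rate
// is the refresh rate divided by the number of refresh intervals one frame needs.
double displayedFps(double renderSeconds, double refreshHz) {
    return refreshHz / std::ceil(renderSeconds * refreshHz);
}

int main() {
    std::printf("%.0f fps\n", displayedFps(1.0 / 45.0, 60.0)); // 1/45 s per frame -> 30 fps
    std::printf("%.0f fps\n", displayedFps(1.0 / 70.0, 60.0)); // 1/70 s per frame -> 60 fps
}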

So yes, as others have said, this is due to your graphics waiting for v-sync prior to starting to generate the next frame.
That said...
Beware, not all monitors refresh at 60Hz. 60fps vs 30fps becomes 70fps vs 35fps on a 70Hz display.
If you don't want your card to wait for the v-sync before starting the next frame, but still want to avoid tearing, use triple buffering. The GPU then ping-pongs rendering between 2 buffers while the 3rd is displayed. The v-sync event is what triggers the swap to the "currently finished" back buffer. This is still not really great, because you end up with some frames staying on the screen longer than others: with your 1/45 rendering, one frame will stay for 1/30 s and the next for only 1/60 s, giving some jerkiness.
Last, with the advent of offscreen rendering (rendering to non-displayed buffers), it's in theory possible for a driver to not wait for the v-sync before starting on the next frame, if the early work of that next frame happens to not touch the display surface. I don't think I've ever seen a driver be that smart though.

Related

Understanding buffer swapping in more detail

This is more a theoretical question. This is what I understand regarding buffer swapping and vsync:
I - When vsync is off, whenever the developer swaps the front/back buffers, the buffer that the GPU reads from and sends to the monitor is changed to the new one immediately, regardless of whether the old buffer was still being read (i.e. no vblank is needed).
II - When vsync is on, the buffers are not swapped immediately; they are only changed once the old buffer has been completely read (i.e. a vblank is needed).
III - Turning vsync off can boost the frame rate beyond the monitor refresh rate, but screen tearing can appear when the buffers are swapped while they are being read.
IV - Turning vsync on prevents tearing, but the monitor refresh rate limits the FPS.
Based on this I tried the following experiment: I disabled vsync and every frame I filled all pixels with a solid color using glClearColor + glClear, choosing a new random color per frame. I got ~2400 FPS on a 60 Hz monitor. Since I swapped the buffers every frame, and since the monitor takes 1/60 of a second for each full screen refresh, I was expecting the buffers to be swapped roughly 40 times per refresh (there are around 40 buffer-swap calls in 1/60 s). Since the clear color is different every time the buffers are swapped, I was expecting to see a really messy image with lots of different colors because of the tearing. Instead, taking some screenshots, I didn't see any tearing at all; every pixel had the same solid color.
Could someone point out which of my assumptions were wrong and why I see this behavior?
Thanks in advance!
The problem was related to the window manager. I could see the expected behavior when I ran in full screen.
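For reference, the experiment boils down to something like this sketch (assuming GLFW for the window and context; run it fullscreen to bypass the compositor):

#include <GLFW/glfw3.h>
#include <cstdlib>

int main() {
    if (!glfwInit()) return 1;
    // Fullscreen is what made the tearing visible: a window manager recomposites windowed output.
    GLFWmonitor* monitor = glfwGetPrimaryMonitor();
    const GLFWvidmode* mode = glfwGetVideoMode(monitor);
    GLFWwindow* window = glfwCreateWindow(mode->width, mode->height, "tearing test", monitor, NULL);
    if (!window) { glfwTerminate(); return 1; }
    glfwMakeContextCurrent(window);
    glfwSwapInterval(0);                        // vsync off
    while (!glfwWindowShouldClose(window)) {
        glClearColor((float)rand() / RAND_MAX,  // a new random solid colour each frame
                     (float)rand() / RAND_MAX,
                     (float)rand() / RAND_MAX, 1.0f);
        glClear(GL_COLOR_BUFFER_BIT);
        glfwSwapBuffers(window);
        glfwPollEvents();
    }
    glfwTerminate();
}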

optimization streaming vbo opengl

I'm rendering a top-down, tile-based world using OpenGL 3.3 and fully streamed VBOs.
After encountering some lag I did some benchmarking and what I found was horrid!
Let me explain the picture. The first marked square is me running my game using the simplest of shaders. There is no lighting, no nothing! I'm simply uploading 5000 vertices and drawing them. My memory load is about 20-30%, CPU load 30-40%.
The second is with lighting. Every light is uploaded as an array to the fragment shader and every fragment processes all the lights. Load is about 40-50%, reaching 100% with 60 lights.
The third is with deferred shading. First I draw normals and diffuse to an FBO, then I render each light to the default framebuffer while reading from these. Load is about 80%, basically unaffected by the number of lights.
These are the scenes I render:
As you can see, there's nothing fancy. It's retro style. My plan has been to add tons of complexity and still run smoothly on low-end computers. Mine is an i7 with an NVIDIA 660M, so it shouldn't have a problem.
For comparison I ran Warcraft 3 and it took about 50-60% load, 20% memory.
One strange thing I've noticed is that if I disable V-sync and don't call glFinish before swapping buffers, the load goes down significantly. However, the clock speed goes up and heat is produced (53 °C).
Now, first I'm wondering whether you think this is normal. If not, what could be my bottleneck? Could it be my streaming VBO? I've tried double buffering and orphaning, but nothing helped. Doubling the number of sprites basically increases the memory load by 5-10%; the GPU load remains basically the same.
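(For reference, the orphaning tried here typically looks something like this sketch; the buffer and vertex names are illustrative:)

// Re-specify the buffer's storage each frame so the driver need not stall on in-flight draws.
glBindBuffer(GL_ARRAY_BUFFER, streamVbo);
glBufferData(GL_ARRAY_BUFFER, bufferSize, NULL, GL_STREAM_DRAW);              // orphan the old storage
glBufferSubData(GL_ARRAY_BUFFER, 0, vertexCount * sizeof(Vertex), vertices);  // upload this frame's data
glDrawArrays(GL_TRIANGLES, 0, vertexCount);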
I'm aware this question can't be easily answered, but I'll provide more details as you require them. I don't want to post my 20,000 lines of code here.
Oh, and one more thing... it fluctuates. The draw calls are identical, but the load can go from 2% to 100%, whenever it feels like it.
UPDATE:
my main loop looks like this:
swapbuffers
renderAndDoGlCalls
updateGameAndPoll
sleep if there's any time left (1/60th second)
repeat.
Without v-sync, glFlush or glFinish, this results in the following percentages of frame time used:
swap: 0.16934400677376027
ren: 0.9929640397185616
upp:0.007698000307920012
poll:0.0615780024631201
sleep: 100.39487801579511
With glFinish prior to swapbuffers:
swap: 26.609977064399082 (this usually goes up to 80%)
ren: 1.231584049263362
upp:0.010266000410640016
poll:0.07697400307896013
sleep: 74.01582296063292
With V-sync it starts out well, usually the same as with glFinish, then bam!:
swap: 197.84934791397393
ren: 1.221324048852962
upp:0.007698000307920012
poll:0.05644800225792009
sleep: 0.002562000102480004
And it stays that way.
Let me clarify... If I call swapbuffers right after all the OpenGL calls, my CPU stalls for 70% of the update time, leaving me unable to do anything. By deferring it, I give the GPU the longest possible time to finish the back buffer before I call the swap again.
You are actually inadvertently causing the opposite scenario.
The only time SwapBuffers causes the calling thread to stall is when the pre-rendered frame queue is full and it has to wait for VSYNC to flush a finished frame. The CPU could easily be a good 2-3 frames ahead of the GPU at any given moment, and it is not the current frame finishing that causes waiting (there's already a finished frame that needs to be swapped in this scenario).
Waiting happens because the driver cannot swap the backbuffer from back to front until the VBLANK signal rolls around (which only occurs once every 16.667ms). The driver will actually continue to accept commands while it is waiting for a swap up until it hits a certain limit (pre-rendered frames on NVIDIA hardware / flip queue size on AMD) worth of queued swaps. Once that limit is hit, GL commands will cause blocking until the back buffer(s) is/are swapped.
You are sleeping at the end of your frames, so no appreciable CPU/GPU parallelism ever develops; in fact you are more likely to skip a frame this way.
That is what you are seeing here. The absolute worst-case scenario is when your sleep makes you miss the VBLANK deadline by 1 ms. Your time between two frames then becomes 16.66667 + 15.66667 = 32.33334 ms. This causes a stutter that would not have happened if you did not add your own wait time. The driver could have easily copied the backbuffer from back to front and continued accepting commands in that 1 extra ms you added, but instead it blocks for an additional 15 ms at the beginning of the next frame.
To avoid this, you want to swap buffers as soon as possible after all commands for a frame have been issued. You have the best likelihood of meeting the VBLANK deadline this way. Reported CPU usage may go up since less time is spent sleeping, but performance should be measured using frame time rather than scheduled CPU time.
VSYNC and the pre-rendered frame limit discussed will keep your CPU and GPU from running out of control and generating huge amounts of heat as mentioned in the question.
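Concretely, the loop from the question could be reordered like this (a sketch using the question's own function names; the GLFW swap call and the running flag are assumptions):

while (running) {
    updateGameAndPoll();       // input and game state first
    renderAndDoGlCalls();      // issue all GL commands for this frame
    glfwSwapBuffers(window);   // swap immediately; blocks only when the pre-rendered frame queue is full
    // no manual sleep: with V-Sync enabled the swap itself paces the loop at the refresh rate
}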

OpenGL framerate: connection with the size of the window

I was in the process of tracking down and eliminating those parts of my C++/OpenGL/GLUT code that were inefficient and slow, and in doing so, I watched my frames per second counter to know if I was actually making progress. I noticed that my frame rate dropped from about 120 to 60 if I maximized the window.
Further experimentation revealed that this was a linear thing, I could change the frame rate by changing the size of the window.
Does this mean that my bottleneck is in the GPU rendering? Surely GPUs these days are more than powerful enough not to notice the difference between 300x300 and 1920x1080? Or am I asking too much from my graphics card?
The alternative is that there is some bug in my code that is causing the system to slow down on larger renders.
What I am asking is this: is it reasonable to expect a halving of framerate when changing the window size, or is there something very wrong?
Further experimentation revealed that this was a linear thing, I could change the frame rate by changing the size of the window.
Congratulations: You discovered fill rate
Does this mean that my bottleneck is in the GPU rendering?
Yes, pretty much. To be specific the bottleneck is either the bandwidth from/to the graphics memory, or the complexity of the fragment shader, or a combination of both.
Surely GPUs these days are more than powerful enough not to notice the difference between a 300x300 and 1920x1080?
300×300 = 90000
1920×1080 = 2073600
Or in other words: you're asking the GPU to fill about 23 times as many pixels (2073600 / 90000 ≈ 23), which means about 23 times as much data must be moved around and processed.
That drop from 120 Hz to 60 Hz comes from V-Sync. If you disabled V-Sync you'd find that your program would probably reach rates well above 60 Hz at 1920×1080, though at 300×300 it will still be something below 180 Hz.
The reason for that is simple: when synced to the display's vertical retrace, your GPU can "put out" the next frame only at the moment the display v-syncs. If your display can do 120 Hz (as yours evidently can) and your rendering takes less than 1/120 s to complete, it makes the deadline and your framerate synchronizes to the display. If, however, drawing a frame takes more than 1/120 s, it will sync with only every 2nd frame displayed; if rendering takes more than 1/60 s, every 3rd; more than 1/30 s, every 4th; and so on.

what if the update method scheduled by scheduleUpdate runs too long in cocos2d-iphone?

After scheduleUpdate, update:(ccTime)dt will be called 60 times per second. What if, at some point, the update method's running time exceeds 1/60 of a second? Will the next call be cancelled?
The framerate drops. Nothing will be cancelled.
At 60 fps there's exactly 1/60th of a second for cocos2d and your code to process everything that's needed to render a frame, including all OpenGL drawing operations. That's 0.016666666 seconds to do it all.
If one update cycle takes longer than that, the next frame will be rendered after 0.03333333 seconds instead, dropping the framerate to 30 fps if multiple frames continuously take longer to process. That is, provided everything is done within that time; otherwise the next frame update is deferred to 0.05 seconds or even 0.06666666 seconds.
You can only get 60, 30, 20 or 15 fps framerate with cocos2d since it uses CADisplayLink which synchronizes updates with the screen refresh rate. The framerate counter in cocos2d may show 40 fps or something because it averages over multiple frames.

Perfect V-sync implementation for a lightweight OpenGL game: need one tidbit of information

In the game our Internet-assembled team is programming, we're assuming everybody in our audience will be able to run the game WAY over full speed.
So, to save video RAM and hopefully give the graphics card a little more idle time, using V-sync without double buffering would be our best option. So, in OpenGL, we need to know how to do that.
From my understanding, V-sync is when the graphics card is paused, once it's done rendering a single frame, until that frame has finished being sent to the display device. Double buffering doesn't pause render operations (or maybe it does, or maybe it's implementation-specific; not sure), because it instead draws to a second buffer before copying to the framebuffer, so that the monitor either gets the full frame or no new frame at all (specifically, the last stored image in the framebuffer). Well, we don't need that feature, as long as the graphics card writes to the framebuffer ONLY when it damn well needs to.
This is a pretty slow online game (But it's VERY creative ^_^). There's very little realtime action. Therefore, extremely precise user input is not a necessity; it can be captured from the OS as a single unit any time before rendering a frame.
So, in order to do EXACTLY this, I need to be able to get a "Frame has finished sending to monitor" message from OpenGL. Is it possible? If not, what is the best alternative?
The game is being programmed for Windows only at the moment but should have work done for Linux in a few months.
You suffer from a misconception about what V-Sync does. There's a part of video RAM that's continuously sent to the display device at a constant rate, the frame refresh rate. So immediately after a full frame has been sent, the next frame gets sent, after a very short blank time. But the time between sending frames is far shorter than the time it takes to send a full frame.
What happens without V-Sync is, that operations on the contents of the framebuffer get visible, for example if the frame is filled alternating with red and green and there's no V-Sync you'll see red and green bands on the monitor. To avoid this, V-Sync swaps the pointer the display driver uses to access the framebuffer just after a full frame has been sent.
Which brings us to what double buffering does. Without double buffering there's little use for V-Sync. The action triggered by V-Sync must happen very, very fast, so it boils down to swapping a pointer or a very fast blit operation (potentially by simply setting copy-on-write attributes in the GPU's MMU).
Without double buffering and without V-Sync, the effect is that you can watch the picture being rendered piece by piece into the framebuffer. Of course, if rendering happens faster than a frame period, the effect is that, going top-down, you'll see an only sparsely populated image with more and more content visible toward the bottom; somewhere in between it will hit the lower screen edge and wrap around to the top, and that intersection line will keep moving.
TL;DR: just use double buffering and enable V-Sync for the buffer swap. Don't be afraid of memory consumption. All GPUs in circulation today have more than enough RAM to easily provide the storage for double-buffered colour planes. Just do the math: 1920×1200 × RGB ≈ 6.6 MiB, and even the smallest GPUs in PCs today come with at least 128 MiB of RAM. For a mobile device, say an iPad: 1024×768 × RGB ≈ 2.3 MiB vs. 32 MiB for graphics. The UI of the iPad is double buffered anyway.
You can use wglGetProcAddress to get the address of wglSwapIntervalEXT, and then call wglSwapIntervalEXT(1); to synchronize updates with the vertical sync. When you do this, you don't get a message at the vertical sync; instead, glFlush simply doesn't return until a vertical retrace has happened and the screen has been updated. So you have a WM_PAINT handler that looks something like this:
case WM_PAINT: {
    PAINTSTRUCT ps;
    HDC hdc = BeginPaint(hwnd, &ps);
    wglMakeCurrent(hdc, hglrc);   // hglrc: the OpenGL rendering context created earlier
    // ... do drawing ...
    glFlush();                    // with swap interval 1, returns after the retrace
    EndPaint(hwnd, &ps);
    return 0;
}
The glFlush is needed in any case, to ensure the drawing you've done gets sent to the screen.
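Retrieving that entry point might look like this (a sketch; it assumes WGL_EXT_swap_control is available and an OpenGL context is already current):

typedef BOOL (WINAPI *PFNWGLSWAPINTERVALEXTPROC)(int interval);  // as declared in wglext.h
PFNWGLSWAPINTERVALEXTPROC wglSwapIntervalEXT =
    (PFNWGLSWAPINTERVALEXTPROC)wglGetProcAddress("wglSwapIntervalEXT");
if (wglSwapIntervalEXT)
    wglSwapIntervalEXT(1);   // wait for one vertical retrace per buffer swap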