I have this in my app delegate: [director setAnimationInterval:1.0/60];
Despite this, my app runs at around 30 fps. What is weird is that it is currently doing absolutely nothing: the init for my only layer does nothing more than add 6 sprites based on images, and no actions are running; the sprites are merely displayed on the screen. The total size of these sprites is around 500 KB. Both in the simulator and on the device, the FPS display shows around 30.
What could cause such a low frame rate when nothing else is going on in the app at all? There are no scheduled updates and nothing running at all; just displayed sprites.
If your sprites are large, and possibly rotated or scaled or with opacity < 255, and you're running this on an older device (1st or 2nd generation), then you may have simply run into the performance limitation of these devices.
You may be able to improve performance, in particular if you use large sprites or sprites that are rotated and scaled, by using CCSpriteBatchNode and a texture atlas to which you add each of the sprite's images. You can also reduce the color bit depth of the textures from 32-bit to 16-bit, or even use PVR compression.
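To get a feel for how much the bit depth matters, here is a rough sketch of the arithmetic in C++. The function name is hypothetical and the 1024x1024 atlas size used below is only an assumption for illustration; the 4 bits per pixel figure corresponds to PVRTC's 4bpp mode.

```cpp
#include <cassert>
#include <cstddef>

// Bytes of memory used by a texture stored at a fixed number of bits per pixel.
// 32 = RGBA8888, 16 = RGBA4444 or RGB565, 4 = PVRTC 4bpp.
std::size_t textureBytes(std::size_t width, std::size_t height,
                         std::size_t bitsPerPixel) {
    assert(bitsPerPixel > 0);
    return width * height * bitsPerPixel / 8;
}
```

For an assumed 1024x1024 atlas this gives 4 MiB at 32-bit, 2 MiB at 16-bit and 512 KiB with PVRTC 4bpp. Note that the size of the image files on disk says little about this, since PNG or JPEG data is decompressed when it is loaded into a texture.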
Changing any of the default startup settings, for example changing the frame buffer from 16 bit to 32 bit or enabling depth buffering, can also decrease performance.
Since you only have 6 sprites, wouldn't it be interesting to test what happens if you run your sample app with 5, 4, 3, 2, 1 and no sprites?
Related
I am trying to revive an old game using cocos2dx.
What I have done is read the legacy binary files and extract the bitmap files, which contain 68k of bitmap files in total.
So far I read the file, decompress the bytes, convert each bitmap from RGB8 to RGBA8888, then generate a texture from the bitmap and create a sprite.
Since it is an isometric game, there is a map that consists of many tiles. Drawing the map with different textures (each bitmap as an individual texture) costs a lot of GL calls. What I have done is try to reuse the textures and group them by local z-order to make use of auto batching.
For the animation of a character, I have created 127 individual bitmap textures and create a sprite frame on each of them one by one.
After all of this work the GL draw calls dropped from 800 to 50. Unluckily, the FPS is still too low (it drops to 10-20 and it should be 60).
The tests were run on the iPhone simulator. Although it does not have any GPU, is this still a normal FPS (with almost 13k GL verts)?
Is the FPS affected by the number of textures in my character animation?
Should I try to pack the textures at runtime? E.g. combine the textures into a bigger texture in memory at runtime and load them by offsets.
Don't even look at performance on the simulator. It's completely irrelevant and non-representative.
All current iOS devices will cope with 50 draw calls and 13k verts just fine. Unless you have some other bottleneck (which you'll only find out by running on device), you'll be running at 60fps for sure.
I am developing an image viewer where graphics are rendered in antialiased mode. Images can first be edited using AutoCAD, which generates DXF files.
The application is written by using Visual C++ and Direct2D.
Although I am able to load the image quite quickly, zoom and especially pan remain a problem for me when compared with the performance of AutoCAD for the same image (same number of shapes).
Following is the piece of code aimed to render graphics:
auto shapes = quadTree.get_visible_shapes();
shapes.sort_by_Zorder();
for each shape in shapes:
    shape.draw();
After profiling I can say that more than 90% of the computational time is spent in the loop that draws the shapes.
Drawing only the visible shapes, thanks to the implementation of the Quadtree, has been a huge performance improvement; I also render the graphics in aliased mode while panning, but there is still a big difference with AutoCAD.
I am wondering if AutoCAD draws a bitmap representation of the image. I haven't tried this approach yet, so I cannot tell whether it would give an effective improvement in speed.
Given these hypotheses, are there any ways to improve the performance of pan and zoom?
In AutoCAD, there is a mechanism called Adaptive degradation, which aborts rendering when the FPS falls below a predefined value.
There is also a lot of other optimization. You cannot compete with a big program like this.
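The general idea of giving up on a frame once it takes too long can be sketched as a draw loop with a time budget. This is only an illustration of the technique, not AutoCAD's actual implementation; `Shape`, `drawWithBudget` and the injected `now` clock are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical shape: 'draw' renders it to the current target.
struct Shape {
    std::function<void()> draw;
};

// Draws shapes in order until the frame budget is used up.
// 'now' returns the current time in milliseconds; it is injected so that the
// loop can be driven by a real clock in an app or by a fake one in a test.
// Returns how many shapes were drawn.
std::size_t drawWithBudget(const std::vector<Shape>& shapes,
                           double budgetMs,
                           const std::function<double()>& now) {
    assert(budgetMs > 0.0);
    const double start = now();
    std::size_t drawn = 0;
    for (const Shape& s : shapes) {
        if (now() - start >= budgetMs) break;  // over budget: skip the rest
        s.draw();
        ++drawn;
    }
    return drawn;
}
```

Because the loop skips whatever comes last, a viewer using this during pan would probably want to order shapes by visual importance for degraded frames and go back to full z-order rendering once the view is at rest.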
There are a few considerations when panning a 2D/3D scene, especially when redrawing the world is expensive.
Off-screen canvas
Render your screen onto an off-screen bitmap with a slightly larger canvas (e.g. (w+2N) x (h+2N), a margin of N pixels on each side). Upon PAN you instantly put up the screen from that bitmap and update the off-screen one in the background. There are also many ways to optimize further in this direction.
EDIT: More details:
For example, say the screen is 640x480, the scene itself is 1000x1000, and you want to show the region (301, 301) ~ (940, 780). You would instead create an off-screen buffer of, say, 740x580 (i.e. N=50) covering (251, 251) ~ (990, 830). If the PAN operation then moves no more than 50 pixels (e.g. PAN left 5 pixels), you already have the content needed to render the screen instantly.
Also, after a PAN you may want to prepare the new off-screen buffer in the background (or when idle) so that subsequent PANs can be performed instantly.
If the PAN goes too far, you still have to wait for the redraw, or you can reduce the rendering quality for intermediate screens and render full detail only when the PAN has stopped. The user won't notice details while moving anyway.
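The bookkeeping for the example above can be sketched as follows. The type and function names are hypothetical, coordinates are inclusive as in the example, and clamping to the scene bounds is left out for brevity.

```cpp
#include <cassert>

// Axis-aligned rectangle with inclusive corner coordinates.
struct Rect {
    int left, top, right, bottom;
    int width() const { return right - left + 1; }
    int height() const { return bottom - top + 1; }
};

// Grows the visible region by a margin of n pixels on every side.
Rect offscreenRegion(const Rect& visible, int n) {
    assert(n >= 0);
    return Rect{visible.left - n, visible.top - n,
                visible.right + n, visible.bottom + n};
}

// True if the visible region, moved by (dx, dy), still lies inside the buffer,
// so the pan can be served by a plain blit without redrawing the scene.
bool panIsCovered(const Rect& visible, const Rect& buffer, int dx, int dy) {
    return visible.left + dx >= buffer.left &&
           visible.right + dx <= buffer.right &&
           visible.top + dy >= buffer.top &&
           visible.bottom + dy <= buffer.bottom;
}
```

With the visible region (301, 301) ~ (940, 780) and N=50, `offscreenRegion` yields the 740x580 buffer (251, 251) ~ (990, 830). A pan of 5 pixels to the left is covered, while a pan of 51 pixels is not and needs a redraw.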
Limit update frequency
A PAN operation is usually triggered by the mouse (or a touch gesture), which may produce a high volume of events. Instead of queueing all 20 mouse-move events that arrive within one second and spending 3 seconds redrawing the world 20 times, you should limit the update frequency.
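One simple way to do this is a throttle that allows at most one redraw per time interval and remembers that something is still pending. A minimal sketch, assuming the caller supplies event timestamps in milliseconds; `RedrawThrottle` is a hypothetical helper, not part of any framework.

```cpp
#include <cassert>

// Allows at most one redraw per minIntervalMs.
class RedrawThrottle {
public:
    explicit RedrawThrottle(double minIntervalMs)
        : minIntervalMs_(minIntervalMs) {
        assert(minIntervalMs > 0.0);
    }

    // Call for every mouse-move event. Returns true if the scene should be
    // redrawn now; otherwise the event is only remembered as pending.
    bool onMove(double timestampMs) {
        if (!hasDrawn_ || timestampMs - lastDrawMs_ >= minIntervalMs_) {
            lastDrawMs_ = timestampMs;
            hasDrawn_ = true;
            pending_ = false;
            return true;
        }
        pending_ = true;
        return false;
    }

    // True if moves arrived since the last redraw. Draw once more when the
    // input stops so that the final position is shown.
    bool hasPending() const { return pending_; }

private:
    double minIntervalMs_;
    double lastDrawMs_ = 0.0;
    bool hasDrawn_ = false;
    bool pending_ = false;
};
```

With a 200 ms interval, 20 move events spaced 50 ms apart trigger 5 redraws instead of 20, and `hasPending()` reports that one final redraw is still owed after the last event.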
I'm writing a 2D platformer game using SDL with C++. However, I have encountered a huge issue involving scaling to resolution. I want the game to look nice in full HD, so all the images for the game have been created so that the natural resolution of the game is 1920x1080. However, I want the game to scale down to the correct resolution if someone is using a smaller resolution, or to scale up if someone is using a larger resolution.
The problem is I haven't been able to find an efficient way to do this. I started by using the SDL_gfx library to pre-scale all images, but this doesn't work as it creates a lot of off-by-one errors, where one pixel is lost. And since my animations are contained in one image, the animation would slightly move up or down each frame when played.
Then, after some looking around, I tried using OpenGL to handle the scaling. Currently my program draws all the images to an SDL_Surface that is 1920x1080. It then converts this surface to an OpenGL texture, scales this texture to the screen resolution, then draws the texture. This works fine visually, but the problem is that it's not efficient at all. Currently I am getting a max fps of 18 :(
So my question is does anyone know of an efficient way to scale the SDL display to the screen resolution?
It's inefficient because OpenGL was not designed to work that way. Main performance problems with current design:
First problem: You're software rasterizing with SDL. Sorry, but no matter what you do with this configuration, that will be a bottleneck. At a resolution of 1920x1080, you have 2,073,600 pixels to color. Assuming it takes you 10 clock cycles to shade each 4-channel pixel, on a 2GHz processor you're running a maximum of 96.4 fps. That doesn't sound bad, except you probably can't shade pixels that fast, and you still haven't done AI, user input, game mechanics, sound, physics, and everything else, and you're probably drawing over some pixels at least once anyway. SDL_gfx may be quick, but for large resolutions, the CPU is just fundamentally overtasked.
Second problem: Each frame, you're copying data across the graphics bus to the GPU. This is the slowest thing you can possibly do graphics-wise. Image data is probably the worst of that, because there's typically so much of it. Basically, each frame you're telling the GPU to copy two million some pixels from RAM to VRAM. According to Wikipedia, you can expect, for 2,073,600 pixels at 4 bytes each, no more than 258.9 fps, which again doesn't sound bad until you remember everything else you need to do.
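Both estimates are simple divisions and can be reproduced with a short sketch. The function names are hypothetical. The answer does not state the bus bandwidth it used; 2 GiB/s is an assumption here, chosen because it reproduces the quoted 258.9 fps.

```cpp
#include <cassert>

// Upper bound on frames per second if every pixel costs a fixed number of
// CPU cycles and nothing else runs.
double cpuBoundFps(double clockHz, double cyclesPerPixel, double pixels) {
    assert(cyclesPerPixel > 0.0 && pixels > 0.0);
    return clockHz / (cyclesPerPixel * pixels);
}

// Upper bound on frames per second if the whole frame is uploaded over a bus
// with the given bandwidth every frame.
double busBoundFps(double bytesPerSecond, double bytesPerPixel, double pixels) {
    assert(bytesPerPixel > 0.0 && pixels > 0.0);
    return bytesPerSecond / (bytesPerPixel * pixels);
}
```

For 1920 x 1080 = 2,073,600 pixels, `cpuBoundFps(2e9, 10, 2073600)` gives roughly 96.4, and `busBoundFps(2147483648.0, 4, 2073600)` gives roughly 258.9. Both are best-case ceilings; the real numbers will be lower.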
My recommendation: switch your application completely to OpenGL. This removes the need to render to a texture and copy to the screen--just render directly to the screen! Also, scaling is handled automatically by your view matrix (glOrtho/gluOrtho2D for 2D), so you don't have to care about the scaling issue at all--your viewport will just show everything at the same scale. This is the ideal solution to your problem.
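To illustrate why the scaling comes for free, here is a sketch of the math that an orthographic projection plus the viewport transform perform on x and y. It is plain arithmetic rather than OpenGL calls, `logicalToWindow` is a hypothetical helper, and it assumes a projection set up as glOrtho(0, logicalW, logicalH, 0, -1, 1) with the viewport covering the whole window.

```cpp
#include <cassert>

struct Point { double x, y; };

// Maps a point given in the game's logical coordinates (origin top-left,
// size logicalW x logicalH) to window pixels measured from the top-left.
Point logicalToWindow(Point p, double logicalW, double logicalH,
                      double windowW, double windowH) {
    assert(logicalW > 0.0 && logicalH > 0.0);
    // Projection: logical coordinates -> normalized device coordinates [-1, 1].
    const double ndcX = 2.0 * p.x / logicalW - 1.0;
    const double ndcY = 1.0 - 2.0 * p.y / logicalH;  // top edge maps to +1
    // Viewport transform: NDC -> window pixels.
    return Point{(ndcX + 1.0) * 0.5 * windowW, (1.0 - ndcY) * 0.5 * windowH};
}
```

A sprite placed at (960, 540) in the 1920x1080 design space lands at (640, 360) in a 1280x720 window and at (1280, 720) in a 2560x1440 one, without the game code ever dealing with the actual resolution.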
Now, it comes with the one major drawback that you have to recode everything with OpenGL draw commands (which is work, but not too hard, especially in the long run). Short of that, you can try the following ideas to improve speed:
PBOs. Pixel buffer objects can be used to address problem two by making texture loading/copying asynchronous.
Multithread your rendering. Most CPUs have at least two cores and on newer chips two register states can be saved for a single core (Hyperthreading). You're essentially duplicating how the GPU solves the rendering problem (have a lot of threads going). I'm not sure how thread safe SDL_gfx is, but I bet that something could be worked out, especially if you're only working on different parts of the image at the same time.
Make sure you pay attention to what place your draw surface is in SDL. It should probably be SDL_SWSURFACE (because you're drawing on the CPU).
Remove VSync. This can improve performance, even if you're not running at 60Hz.
Make sure you're drawing your original texture--DO NOT scale it up or down to a new one. Draw it at a different size, and let the rasterizer do the work!
Sporadically update: Only update half the image at a time. This will probably come close to doubling your "framerate", and it's (usually) not noticeable.
Similarly, only update the changing parts of the image.
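The last idea is commonly implemented with a "dirty rectangle": track the bounding box of everything that changed during a frame and redraw and upload only that region. A minimal sketch with hypothetical names, using a single bounding box rather than a list of regions:

```cpp
#include <algorithm>
#include <cassert>

// Half-open pixel rectangle [x0, x1) x [y0, y1).
struct DirtyRect {
    int x0, y0, x1, y1;
    bool empty() const { return x1 <= x0 || y1 <= y0; }
    long area() const { return empty() ? 0 : long(x1 - x0) * long(y1 - y0); }
};

// Accumulates the bounding box of everything that changed this frame.
class DirtyTracker {
public:
    void mark(const DirtyRect& r) {
        assert(!r.empty());
        if (dirty_.empty()) { dirty_ = r; return; }
        dirty_.x0 = std::min(dirty_.x0, r.x0);
        dirty_.y0 = std::min(dirty_.y0, r.y0);
        dirty_.x1 = std::max(dirty_.x1, r.x1);
        dirty_.y1 = std::max(dirty_.y1, r.y1);
    }

    // Returns the region to redraw and upload, then resets for the next frame.
    DirtyRect take() {
        DirtyRect r = dirty_;
        dirty_ = DirtyRect{0, 0, 0, 0};
        return r;
    }

private:
    DirtyRect dirty_{0, 0, 0, 0};
};
```

If two sprites at [100, 164) x [100, 164) and [150, 200) x [120, 180) move in a frame, only the 100x80 bounding box needs work: 8,000 pixels instead of the full 2,073,600. When changes are far apart a single box becomes wasteful, and a list of rectangles is the usual next step.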
Hope this helps.
I'm building a cocos2d game where I use two background sprites, actually one is a sprite, the other one is a CCMask that is used to make holes into the other background, but the performance problem is the same even when using 2 regular background sprites on top of each other.
When I use one background sprite, my FPS is around 60 all the time; when I use two background sprites, the FPS drops to 30 every time. I've googled around and tried different solutions, including reading sprites from a sprite frame cache instead of from a file; unfortunately the result is the same.
I just can't figure out why this is happening. Does anyone here have any idea why this is happening and how to get around it?
On older devices (1st & 2nd generation, ie iPhone 3G) this can easily happen since they have terrible fillrates.
If possible try to SpriteBatch the two background images. You need to add both to a texture atlas, for example with TexturePacker. Sprite batching is particularly effective if the sprites are large.
Also, just in case: don't test performance in the Simulator. Simulator performance has no relation to actual device performance whatsoever.
I am working on an OpenGL-based 2D CAD software which requires heavy use of a hardware OpenGL accelerator (pushing 250 million vertices per second at times). Here is my problem: whenever the viewport is stagnant for more than 10 seconds, the OpenGL accelerator (GeForce 9800 GT in this case) goes into an inactive mode. When the viewport is rendered again after the inactive period, I get 1/4th of the normal framerate, and this lasts for 3-4 seconds before the 3D accelerator wakes up and kicks into full speed.
Question :
How do I prevent this from happening ?
Is there an OpenGL way to prevent GPUs from going into inactive mode?
Thank you for your replies.
Gary
There are several ways you can keep a GPU busy, but the most surefire way to guarantee it is doing something and not just deferring your commands is to actually draw something. glClear() and every glDraw* command constitute actual drawing commands. Throw in a glFinish() at the end of the draw to guarantee execution of the GL command stream.
Presumably you don't want to see this drawing so create a new framebuffer object, create a small RGBA texture (say 256 on a side), then attach the texture to color attachment point 0.
When you want to keep the GPU busy draw to this offscreen buffer.
This is all with the assumption that you can't, for instance, just change your boot-args or control panel settings to modulate power management behavior on the card. Every OS has different semantics here.