I am developing an image viewer that renders graphics in antialiased mode. The images are first edited in AutoCAD, which generates DXF files.
The application is written in Visual C++ with Direct2D.
Although I can load an image quite quickly, zoom and especially pan remain a problem for me compared with AutoCAD's performance on the same image (same number of shapes).
The rendering code looks roughly like this:
auto shapes = quadTree.get_visible_shapes();  // cull to the current viewport
shapes.sort_by_Zorder();                      // draw back to front
for (auto& shape : shapes)
    shape.draw();
After profiling I can say that more than 90% of the computational time is spent in the loop that draws the shapes.
Drawing only the visible shapes, thanks to the quadtree, was a huge performance improvement; I also render the graphics in aliased mode while panning, but there is still a big gap compared with AutoCAD.
I wonder whether AutoCAD draws a bitmap representation of the image; I haven't tried this approach yet, so I cannot tell whether it would give an effective improvement in speed.
Given these hypotheses, are there any ways to improve pan and zoom?
In AutoCAD there is a mechanism called adaptive degradation, which aborts rendering when the FPS falls below a predefined value.
And there is also a lot of other optimization; you cannot compete with a big program like this on every front.
There are a few considerations when panning a 2D/3D scene, especially when redrawing the world is expensive.
Off-screen canvas
Render your scene onto an off-screen bitmap with a slightly larger canvas (e.g. (w+N) x (h+N)). On PAN you instantly blit from that bitmap to the screen, and update the off-screen one in the background. There are also many ways to optimize further in this direction.
EDIT: More details:
For example, say your screen is 640x480, the scene itself is 1000x1000, and you want to show the region (301, 301) ~ (940, 780). You would instead create an off-screen buffer of, say, 740x580 (i.e. N = 50) covering (251, 251) ~ (990, 830). Then, if a PAN operation moves less than 50 pixels (e.g. PAN left 5 pixels), you already have the content ready to render to the screen instantly.
Also, after a PAN you may want to prepare the new off-screen buffer in the background (or when idle), so that subsequent PANs can also be performed instantly.
If the PAN goes too far, you still have to wait for the re-render, or you can reduce the rendering quality for intermediate frames and render full details only when the PAN stops; the user won't notice details while moving anyway.
Limit update frequency
A PAN operation is usually triggered by the mouse (or a touch gesture), which can produce a high volume of events. Instead of queuing all 20 mouse-move events from one second and spending 3 seconds redrawing the world 20 times, limit the update frequency.
Related
I'm writing a little 2D game to learn and to improve my programming skills. I'm programming the game in C++/C with OpenGL 3.0 and GLUT.
I'm confused about some important concepts regarding animation and scene refresh:
1. Is it good practice to load all the textures only when the level begins?
2. I chose a frame rate of 40 fps; should I redraw the whole scene and all the agents every frame, or only the changes?
3. In an agent animation, should I redraw the entire agent or only the parts that have changed since the last frame?
4. If some part of the scene changes (a wall or something similar is destroyed), do I need to redraw the entire scene or only the part that changed?
Right now my "game" runs at 40 fps, but it has a flickering effect that looks really weird.
Yes, creating and deleting textures/buffers every frame is a huge waste.
It's almost always cheaper to just redraw the entire scene. GPUs are built to do this, it's very fast.
Reading the framebuffer from VRAM back to regular RAM and calculating the difference is going to be much slower, especially since OpenGL doesn't keep track of your "objects", it just takes a triangle at a time, rasterizes it, then forgets about it.
Depends on how you define the animation. If you're talking about sprite-like animation, where each frame is a separate image, then it's cheapest to just refer to the new texture and redraw.
If you've got a texture atlas, update the texture coordinates and redraw; and since you're using shaders (you pretty much have to be for OpenGL 3.0), you might be able to get away with a uniform that offsets the texture coordinates.
Yeah, as I said before, the hardware is built to clear the screen and redraw everything.
As for the frame rate, you should use the monitor's refresh rate to avoid vertical tearing. Pretty much all monitors now are 60 Hz, so 60 fps is the most common "target" framerate.
Choose either 30 or 60 fps, as most modern monitors refresh at a 60 Hz rate. Then you have either 2 or 1 rendered frames per "monitor frame". This should reduce flickering effects. (I'm not 100% sure this is what you mean by the "flickering effect".)
Regarding all other questions (which sound pretty much the same): In OpenGL rendering, redrawing everything is pretty common, as in most games almost the entire screen changes in every frame, for example if you're moving around. You could do a partial screen update, but it's very uncommon and more expensive on the CPU side, as you have to compute which parts to draw instead of just "draw everything".
1. Yes.
2-4. Yes. Hopefully this helps you understand why you must always redraw everything:
Imagine you have two pieces of paper. On the first paper you draw a stick man standing still, and show it to somebody.
While they are looking at that first paper, you draw the same thing on the second paper, but this time you move the arm a little bit.
Now you show them the second paper; while they look at it, you erase the first paper and draw the man with his arm moved a little bit more.
This is pretty much how it works, and it is the reason you must always render the whole image, even when nothing has changed.
I'm writing a 2D platformer game using SDL with C++. However, I have encountered a huge issue involving scaling to resolution. I want the game to look nice in full HD, so all the images for the game have been created such that the natural resolution of the game is 1920x1080. However, I want the game to scale down to the correct resolution if someone is using a smaller resolution, or to scale up if someone is using a larger one.
The problem is I haven't been able to find an efficient way to do this. I started by using the SDL_gfx library to pre-scale all images, but this doesn't work, as it creates a lot of off-by-one errors where a pixel was being lost. And since my animations are contained in one image, the animation would shift slightly up or down each frame as it played.
Then, after some looking around, I tried using OpenGL to handle the scaling. Currently my program draws all the images to an SDL_Surface that is 1920x1080. It then converts this surface to an OpenGL texture, scales the texture to the screen resolution, and draws the texture. This works fine visually, but the problem is that it's not efficient at all. Currently I am getting a max fps of 18 :(
So my question is does anyone know of an efficient way to scale the SDL display to the screen resolution?
It's inefficient because OpenGL was not designed to work that way. The main performance problems with the current design:
First problem: You're software rasterizing with SDL. Sorry, but no matter what you do with this configuration, that will be a bottleneck. At a resolution of 1920x1080, you have 2,073,600 pixels to color. Assuming it takes you 10 clock cycles to shade each 4-channel pixel, on a 2 GHz processor you're capped at a maximum of 96.4 fps. That doesn't sound bad, except you probably can't shade pixels that fast, and you still haven't done AI, user input, game mechanics, sound, physics, and everything else, and you're probably drawing over some pixels at least once anyway. SDL_gfx may be quick, but for large resolutions, the CPU is just fundamentally overtasked.
Second problem: Each frame, you're copying data across the graphics bus to the GPU. This is the slowest thing you can possibly do graphics-wise. Image data is probably the worst of that, because there's typically so much of it. Basically, each frame you're telling the GPU to copy two million some pixels from RAM to VRAM. According to Wikipedia, you can expect, for 2,073,600 pixels at 4 bytes each, no more than 258.9 fps, which again doesn't sound bad until you remember everything else you need to do.
My recommendation: switch your application completely to OpenGL. This removes the need to render to a texture and copy to the screen--just render directly to the screen! Also, scaling is handled automatically by your view matrix (glOrtho/gluOrtho2D for 2D), so you don't have to care about the scaling issue at all--your viewport will just show everything at the same scale. This is the ideal solution to your problem.
Now, it comes with the one major drawback that you have to recode everything with OpenGL draw commands (which is work, but not too hard, especially in the long run). Short of that, you can try the following ideas to improve speed:
PBOs. Pixel buffer objects can be used to address problem two by making texture loading/copying asynchronous.
Multithread your rendering. Most CPUs have at least two cores and on newer chips two register states can be saved for a single core (Hyperthreading). You're essentially duplicating how the GPU solves the rendering problem (have a lot of threads going). I'm not sure how thread safe SDL_gfx is, but I bet that something could be worked out, especially if you're only working on different parts of the image at the same time.
Make sure you pay attention to what kind of draw surface you request from SDL. It should probably be SDL_SWSURFACE (because you're drawing on the CPU).
Remove VSync. This can improve performance, even if you're not running at 60 Hz.
Make sure you're drawing your original texture--DO NOT scale it up or down to a new one. Draw it at a different size, and let the rasterizer do the work!
Update sporadically: only update half the image at a time. This will probably come close to doubling your "framerate", and it's (usually) not noticeable.
Similarly, only update the changing parts of the image.
Hope this helps.
I'm building a cocos2d game where I use two background sprites. Actually, one is a sprite and the other is a CCMask used to make holes in the other background, but the performance problem is the same even when using two regular background sprites on top of each other.
When I use one background sprite, my FPS is around 60 all the time; when I use two background sprites, the FPS drops to 30 every time. I've googled around and tried different solutions, including reading sprites from a sprite frame cache instead of from a file, but unfortunately the result is the same.
I just can't figure out why this is happening. Does anyone here have any idea why, and how to get around it?
On older devices (1st and 2nd generation, i.e. iPhone 3G) this can easily happen, since they have terrible fill rates.
If possible, try to sprite-batch the two background images. You need to add both to a texture atlas, for example with TexturePacker. Sprite batching is particularly effective when the sprites are large.
Also, just in case: don't test performance in the Simulator. Simulator performance has no relation to actual device performance whatsoever.
Can anyone explain precisely how a hardware cursor works? How does it relate to the graphics I'm drawing on the screen? I'm drawing with OpenGL; how does the hardware cursor relate to OpenGL graphics?
EDIT: For those who may be interested in the future: I just implemented what is needed to show the cursor in hardware. The implementation was in the kernel, and simple ioctls were sufficient to use it. Works perfectly.
A hardware cursor means that the GPU can draw a (small) overlay picture over the screen framebuffer, whose position can be changed via a pair of registers (or so) on the GPU. So moving the pointer around doesn't require redrawing the portions of the framebuffer that were previously obscured.
Relation to OpenGL: None!
The hardware cursor is not rendered or supported by OpenGL. Some small piece of hardware overlays it on whatever image is going out the display connector - it's inserted directly into the bitstream at scan-out of each frame. Because of that, it can be moved around by changing a pair of hardware registers containing its coordinates. In the old days, these were called sprites and various numbers of them were supported on different systems.
Hardware cursors have less latency, and thus provide a better experience, because they are not tied to your game or engine frame rate but to the screen refresh rate.
Software cursors, rendered by you as a screen-space sprite during your render loop, however, must run at the rate of your game engine. Thus, if your game experiences lag or otherwise drops below target fps, the cursor latency will get worse. A minor drop in game fps is usually acceptable, but a minor drop in cursor latency is very noticeable as a "sluggish cursor".
You can test this easily by rendering a software cursor while leaving the hardware cursor on. (FYI, in Windows API the hw cursor function is ShowCursor). You'll find that the software cursor trails behind the hardware cursor.
In order to do object picking in OpenGL, do I really have to render the scene twice?
I realize rendering the scene is supposed to be cheap, running at 30 fps.
But if every selection requires an additional call to RenderScene(),
and I click 30 times a second, doesn't the GPU have to render twice as many frames?
One common trick is to have two separate functions to render your scene. When you're in picking mode, you can render a simplified version of the world, without the things you don't want to pick. So terrain, inert objects, etc, don't need to be rendered at all.
The time to render a stripped-down scene should be much less than the time to render a full scene. Even if you click 30 times a second (!), your frame rate should not be impacted much.
First of all, the only way you're going to get 30 mouse clicks per second is if you have some other code simulating mouse clicks. For a person, 10 clicks a second would be pretty fast -- and at that, they wouldn't have any chance to look at what they'd selected -- that's just clicking the button as fast as possible.
Second, when you're using GL_SELECT, you normally want to use gluPickMatrix to give it a small area to render, typically a (say) 10x10 pixel square, centered on the click point. At least in a typical case, the vast majority of objects will fall entirely outside that area, and be culled immediately (won't be rendered at all). This speeds up that rendering pass tremendously in most cases.
There have been some good suggestions already about how to optimize picking in GL. That will probably work for you.
But if you need more performance than you can squeeze out of gl-picking, then you may want to consider doing what most game-engines do. Since most engines already have some form of a 3D collision detection system, it can be much faster to use that. Unproject the screen coordinates of the click and run a ray-vs-world collision test to see what was clicked on. You don't get to leverage the GPU, but the volume of work is much smaller. Even smaller than the CPU-side setup work that gl-picking requires.
Select based on simpler collision hulls, or even just bounding boxes. Performance then scales with the number of objects/hulls in the scene, rather than with the amount of geometry plus the number of objects.