Costs/Benefits of various DirectX drawing methods? - c++

I'm working on a 2D game using C++ and DirectX9 and I've got a decent amount of it working. As of now I have it using sprite.draw for everything: the player, the backgrounds (tiled with for loops), the walls, the HUD, etc. Then I started questioning if how I was drawing the game was the best way to go. Are there major differences between using sprites and using textured primitives? Is there a way to just set each pixel individually from my own functions, and would that be practical? It'd be nice if I could later add lighting and alpha blending, and I'd be up for coding that myself if it doesn't slow the program down too much. I just want to get things straight right away and make sure there's nothing I'm missing.

Sprite drawing is fine, the advantage of it is that it handles all the texture coordinates for you and probably is also hardware accelerated.
Is there a way to just set each pixel individually from my own functions, and would that be practical
It is possible, but not practical. The cpu cannot process so much pixels at each frame, and neither its buses are wide enough to send all of them every frame. Thats what the graphic card is for, it is much faster with pixel processing and has much wider buses to the display.
It'd be nice if I could later add lighting and alpha blending, and I'd be up for coding that myself if it doesn't slow the program down too much
It is possible, with built in functions for lights and alpha blending, and you can even code it yourself (its called a shader).

Related

Techniques for drawing tiles with OpenGL

I've been using XNA for essentialy all of my programming so far and would like to move on to OpenGL (along with SFML for IO, creating the window etc.) with C++ . For starters I'd like to create a tile-based game and I've mostly looked at LazyFoo's tutorials.
I just have a two questions:
How should I draw the tiles? Should I use immediate drawing, arrays, VBOs or what? VBOs feel like overkill for this but I'm not sure. It's very tempting to use immediate drawing but apparently it's deprecated. Maybe it's fine for this purpose since it's 2D and only for a bunch of quads.
I'd like a lot of different tiles and thus all of my tiles will not fit into a single texture without making it massive. I've read that using bindTexture isn't very cheap and thus I should avoid as many calls as I can. I thought that maybe I can create a manager for my textures and stitch them all together into one big texture and bind that but then the dimensions of that is an issue.
Don't use immediate mode! It's cumbersome to work with and has been removed from recent OpenGL versions. Use Vertex Arrays, ideally through VBOs. In the end they're much easier to use, believe me.
Regarding that switching of textures. We're talking about optimizing the texture switch patterns in very complex scenes. In your case it will hardly matter at all.
Update
Right now you worry abount things without having even used them. That's worse than premature optimization. I suggest you first get a good grip on OpenGL, then start worrying about state switch management.
With regards to the texture atlas; this is usually done by stitching textures into groups of power-of-two sized textures. For example in a tile-based game you might have a particular tile set (say, tiles for an ice world) grouped together on 2 or 3 textures. When you want to render them you would determine what tiles are visible, then you bind each texture once and render the tiles from that texture for any tiles that are visible on screen.
This requires quite a lot of set-up time to get right; you need keep information on each sub-texture of the atlas so you can find the right texture and render the appropriate region of that texture whenever a tile is referenced. You also need a good way of grouping rendering operations so that they occur when the appropriate texture is bound.
Like datenwolf said, I wouldn't focus too much on complicated texture systems early on; eager binding of textures will be plenty fast enough until you get further down the road.

Why does stereo 3D rendering require software written especially for it?

Given a naive take on 3D graphics rendering it seems that stereo 3D rendering should be essentially transparent to the developer and be entirely a feature of the graphics hardware and drivers. Wherever an OpenGL window is displaying a scene, it takes the geometry, lighting, camera and texture etc. information to render a 2D image of the scene.
Adding stereo 3D to the scene seems to essentially imply using two laterally offset cameras where there was originally one, and all other scene variables stay the same. The only additional information then would be how far apart to make the cameras and how far out to to make their central rays converge. Given this it would seem trivial to take a GL command sequence and interleave the appropriate commands at driver level to drive a 3D rendering.
It seems though applications need to be specially written to make use of special 3D hardware architectures making it cumbersome and prohibitive to implement. Would we expect this to be the future of stereo 3D implementations or am I glossing over too many important details?
In my specific case we are using a .net OpenGL viewport control. I originally hoped that simply having stereo enabled hardware and drivers would be enough to enable stereo 3D.
Your assumptions are wrong. OpenGL does not "take geometry, lighting camera and texture information to render a 2D image". OpenGL takes commands to manipulate its state machine and commands to execute draw calls.
As Nobody mentions in his comment, the core profile does not even care about transformations at all. The only thing it really provides you with now is ways to provide arbitrary data to a vertex shader, and an arbitrary 3D cube to do rendering to. Wether that corresponds or not to the actual view, GL does not care, nor should it.
Mind you, some people have noticed that a driver can try to guess what's the view and what's not, and this is what the nvidia driver tries to do when doing automatic stereo rendering. This requires some specific guess-work, which amounts to actual analysis of game rendering to tweak the algorithms so that the driver guesses right. So it's typically a per-title, in-driver change. And some developers have noticed that the driver can guess wrong, and when that happens, it starts to get confusing. See some first-hand account of those questions.
I really recommend you read that presentation, because it makes some further points as to where the camera should be pointing towards (should the 2 view directions be parallel and such).
Also, It turns out that is essentially costs twice as much rendering for everything that is view dependent. Some developers (including, for example, the Crytek guys, see Part 2), figured out that to a great extent, you can do a single render, and fudge the picture with additional data to generate the left and right eye pictures.
The amount of saved work here is worth a lot by itself, for the developer to do this themselves.
Stereo 3D rendering is unfortunately more complex than just adding a lateral camera offset.
You can create stereo 3D from an original 'mono' rendered frame and the depth buffer. Given the range of (real world) depths in the scene, the depth buffer for each value tells you how far away the corresponding pixel would be. Given a desired eye separation value, you can slide each pixel left or right depending on distance. But...
Do you want parallel axis stereo (offset asymmetrical frustums) or 'toe in' stereo where the two cameras eventually converge? If the latter, you will want to tweak the camera angles scene by scene to avoid 'reversing' bits of geometry beyond the convergence point.
For objects very close to the viewer, the left and right eyes see quite different images of the same object, even down to the left eye seeing one side of the object and the right eye the other side - but the mono view will have averaged these out to just the front. If you want an accurate stereo 3D image, it really does have to be rendered from different eye viewpoints. Does this matter? FPS shooter game, probably not. Human surgery training simulator, you bet it does.
Similar problem if the viewer tilts their head to one side, so one eye is higher than the other. Again, probably not important for a game, really important for the surgeon.
Oh, and do you have anti-aliasing or transparency in the scene? Now you've got a pixel which really represents two pixel values at different depths. Move an anti-aliased pixel sideways and it probably looks worse because the 'underneath' color has changed. Move a mostly-transparent pixel sideways and the rear pixel will be moving too far.
And what do you do with gunsight crosses and similar HUD elements? If they were drawn with depth buffer disabled, the depth buffer values might make them several hundred metres away.
Given all these potential problems, OpenGL sensibly does not try to say how stereo 3D rendering should be done. In my experience modifying an OpenGL program to render in stereo is much less effort than writing it in the first place.
Shameless self promotion: this might help
http://cs.anu.edu.au/~Hugh.Fisher/3dteach/stereo3d-devel/index.html

Opengl 2D performance tips

I'm currently developing a Touhou-esque bullet hell shooter game. The screen will be absolutely filled with bullets (so instancing is what I want here), but I want this to work on older hardware, so I'm doing something along the lines of this at the moment, there are not colors, textures, etc. yet until I figure this out.
glVertexPointer(3, GL_FLOAT, 0, SQUARE_VERTICES);
for (int i = 0; i < info.centers.size(); i += 3) {
glPushMatrix();
glTranslatef(info.centers.get(i), info.centers.get(i + 1), info.centers.get(i + 2));
glScalef(info.sizes.get(i), info.sizes.get(i + 1), info.sizes.get(i + 2));
glDrawElements(GL_QUADS, 4, GL_UNSIGNED_SHORT, SQUARE_INDICES);
glPopMatrix();
}
Because I want this to work on old hardware I'm trying to avoid shaders and whatnot. The setup up there fails me on about 80 polygons. I'm looking to get at least a few hundred out of this. info is a struct which has all the goodies for rendering, nothing much to it besides a few vectors.
I'm pretty new to OpenGL, but I at least heard and tried out everything that can be done, not saying I'm good with it at all though. This game is a 2D game, I switched from SDL to Opengl because it would make for some fancier effects easier. Obviously SDL works differently, but I never had this problem using it.
It boils down to this, I'm clearly doing something wrong here, so how can I implement instancing for old hardware (OpenGL 1.x) correctly? Also, give me any tips for increasing performance.
Also, give me any tips for increasing performance.
If you're going to use sprites....
Load all sprites into single huge texture. If they don't fit, use several textures, but keep number of textures low - to avoid texture switching.
Switch textures and change OpenGL state as infrequently as possible. Ideally, you should set texture once, and draw everything you can with it.
Use texture fonts for text. FTGL font might look nice, but it can hit performance very hard with complex fonts.
Avoid alpha-blending when possible and use alpha-testing.
When alpha-blending, always use alpha-testing to reduce number of pixels you draw. When your texture has many pixels with alpha==0, cut them out with alpha-test.
Reduce number of very big sprites. Huge screen-aligned/pixel-aligne sprite (1024*1024) will drop FPS even on very good hardware.
Don't use non-power-of-2 sized textures. They (used to) produce huge performance drop on certain ATI cards.
glTranslatef
For 2D sprite-based(that's important) game you could avoid matrices completely (with exception of camera/projection matrices, perhaps). I don't think that matrices will benefit you very much with 2D game.
With 2d game your main bottleneck will be GPU memory transfer speed - transferring data from texture to screen. So "use as little draw calls" and "put everything in VA" won't help you - you can kill performance with one sprite.
However, if you're going to use vector graphics (see area2048(youtube) or rez) that does not use textures, then most of the advice above will not apply, and such game won't be very different from 3d game. In this case it'll be reasonable to use vertex arrays, vertex buffer objects or display lists (depends on what is available) and utilize matrix function - because your bottleneck will be vertex processing. You'll still have to minimize number of state switches.

2D engine with OpenGL: Use Z buffer or own implementation for sprite sorting?

If I was making a 3D engine, the answer to this question would be clear: I'd go for using the depth buffer instead of thinking of sorting all my polygons on my own.
However, this is a different situation with 2D, because here layers can be implemented easily without the help of OpenGL - and you then could even sort and move sprites within layers. (Which isn't possible in OpenGL afaik)
(Why) should I use the OpenGL depth buffer instead of a C++ layer system running on the CPU?
How much slower would the depth buffer version be?
It is clear to me that making a layer system in C++ would impose as good as no performance impact at all, as I have to iterate over the sprites for rendering in any case.
I would suggest you to do it in software since you probably want to use transparency on your sprites and that implies you render them from back to front. Also sorting a couple of sprites shouldn't be that CPU demanding.
Use both, if you can.
Depth information is nice for post-processing and stuff like 3D-glasses, so you shouldn't throw it away. These kinds of effects can be very nice for 2D games.
Also, if you draw your (opaque) layers front to back, you can save fill-rate because the Z-Buffer can do the clipping for you (Depth tests are faster than actual drawing).
Depth testing is usually almost free, especially when you got hierarchical Z info. Because of this and the fill-rate savings, using depth testing will probably be even faster.
On the other hand, the software sorting is nice so you can actually do front to back rendering for opaque sprites and it's mandatory to do alpha-blending right (of course, you draw these sprites back to front).
Direct answers:
allowing the GPU to use the depth buffer would allow you to dynamically adjust the draw order of things without any on-CPU shuffling and would free you from having to assign things to different layers in situations where doing so is a bit of a fiction — for example, you could have effects like projectiles that come from the background towards and then in front of the player, without having to figure out which layer to assign them to all the time
on the GPU, the use of a depth would have no measurable effect, even if you're on an embedded chip, a plug-in card from more than a decade ago or an integrated part; they're so fundamental to modern GPUs that they've been optimised down to costing nothing in practical terms
However, I'd imagine you actually want to do it on the CPU for the simple reason of treating transparency correctly. A depth buffer stores one depth per pixel, so if you draw a near transparent object then attempt to draw something behind it, the thing behind won't be drawn even though it should be visible. In a 2d game it's likely that anti-aliasing will give your sprites partially transparent edges; if you submit drawing to the GPU in draw order then your partial transparencies will always be composited correctly. If you leave the z-buffer to do it then you risk weird looking fringing.

How to implement independent rendering layers in Direct3D9?

I'm working on a windowed Direct3D data plotting application that needs to display multiple overlays on top of the data (similar to HUDs in games). Since there could be a large amount of data that needs plotting, and not all overlays will be changed every time, I figured it wouldn't be a good idea to replot verticies when only one overlay in the display changes.
This led me to the idea of rendering the textures and verticies of the overlays to multiple textures with transparent backgrounds that could be overlaid in the render loop and updated independently (similar to layers in Photoshop).
Before I embark on changing a large portion of this program to render to textures as opposed to surfaces, I was just wondering if using textures is the best approach.
RTT works well, I used it in a game I did recently. Each scene (scene refers to layer, "HUD" was a scene, "Main" was the main scene etc...) was rendered onto a texture, then each texture was rendering onto a quad, sorted back to front (for alpha blending). I chose this over just rendering the scenes directly onto the back buffer because it allowed me to do post-processing.
For your caching purposes this seems to be the best way to go, but just be aware that the textures can eat memory quickly, and sometimes its just better to render everything again, making sure you sort back to front.
Render to texture will certainly work and could be a good route but it is probably overkill. Modern 3D hardware is very fast and I'd suggest you verify whether performance is really an issue re-rendering when you need an update before investing significant time making major changes to your program.
If performance is an issue your time might be better spent optimizing the code that renders your plot since that will benefit updates that involve changes to the data as well as those that just change an overlay. I'm a graphics programmer for games and generally with realtime 3D you want to focus your optimization efforts on your worst case (you have to redraw everything) rather than your best (only one overlay needs an update).
Rendering to texture render target surfaces is a very good idea, and can be used for a lot of things e.g. optimization/caching, but beware of the blend operation with regular alpha (a*c1 + (1-a)*c2); if # is ARGB blend, then l1#l2#l3 != l3#l1#l2; i.e. it's not commutative, but by using pre-multiplied alpha in all textures/layers the blend operation can be made commutative.
The ultimate reference is the Porter/Duff article "Compositing Digital Images" from 1984.