I'm working on a windowed Direct3D data plotting application that needs to display multiple overlays on top of the data (similar to HUDs in games). Since there could be a large amount of data to plot, and not every overlay changes on every update, I figured it wouldn't be a good idea to replot all the vertices when only one overlay in the display changes.
This led me to the idea of rendering the textures and vertices of the overlays to multiple textures with transparent backgrounds that could be overlaid in the render loop and updated independently (similar to layers in Photoshop).
Before I embark on changing a large portion of this program to render to textures as opposed to surfaces, I was just wondering if using textures is the best approach.
RTT works well; I used it in a game I did recently. Each scene (scene refers to layer: "HUD" was a scene, "Main" was the main scene, etc.) was rendered onto a texture, then each texture was rendered onto a quad, sorted back to front (for alpha blending). I chose this over just rendering the scenes directly onto the back buffer because it allowed me to do post-processing.
For your caching purposes this seems to be the best way to go, but just be aware that the textures can eat memory quickly, and sometimes it's just better to render everything again, making sure you sort back to front.
Render to texture will certainly work and could be a good route, but it is probably overkill. Modern 3D hardware is very fast, and I'd suggest you verify whether re-rendering everything on each update is really a performance problem before investing significant time in major changes to your program.
If performance is an issue your time might be better spent optimizing the code that renders your plot since that will benefit updates that involve changes to the data as well as those that just change an overlay. I'm a graphics programmer for games and generally with realtime 3D you want to focus your optimization efforts on your worst case (you have to redraw everything) rather than your best (only one overlay needs an update).
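Rendering to texture render-target surfaces is a very good idea and can be used for a lot of things, e.g. optimization/caching, but beware of the blend operation with regular alpha (a*c1 + (1-a)*c2). If # is the ARGB blend, then l1#l2#l3 != l3#l1#l2, i.e. it is not commutative, so the order of the layers always matters. With regular alpha it is not associative either, which means that pre-compositing some layers into an intermediate texture gives a different result from blending them one at a time; by using pre-multiplied alpha in all textures/layers the blend operation can be made associative.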
The ultimate reference is the Porter/Duff article "Compositing Digital Images" from 1984.
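To make the point about pre-multiplied alpha concrete, here is a minimal CPU-side sketch of the "over" operator. The names Premul, premultiply, over and nearlyEqual are made up for illustration and are not part of Direct3D; on the GPU the same thing is usually expressed through blend states, assuming source factor ONE and destination factor INVSRCALPHA.

```cpp
#include <cassert>
#include <cmath>

// Pre-multiplied RGBA: r, g and b are already multiplied by a. All values in [0, 1].
struct Premul {
    float r, g, b, a;
};

// Converts a straight-alpha color to pre-multiplied form.
inline Premul premultiply(float r, float g, float b, float a) {
    assert(a >= 0.0f && a <= 1.0f);
    return {r * a, g * a, b * a, a};
}

// Porter-Duff "over": src drawn on top of dst, both pre-multiplied.
inline Premul over(const Premul& src, const Premul& dst) {
    const float k = 1.0f - src.a;
    return {src.r + k * dst.r, src.g + k * dst.g, src.b + k * dst.b, src.a + k * dst.a};
}

// Component-wise comparison with a small tolerance for float rounding.
inline bool nearlyEqual(const Premul& x, const Premul& y, float eps = 1e-5f) {
    return std::fabs(x.r - y.r) < eps && std::fabs(x.g - y.g) < eps &&
           std::fabs(x.b - y.b) < eps && std::fabs(x.a - y.a) < eps;
}
```

With three layers top, mid and bottom, over(over(top, mid), bottom) matches over(top, over(mid, bottom)), so a cached group of overlays can be composited later as a single layer. Swapping two layers, as in over(mid, top), still changes the result.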
Related
I've recently started learning DirectX programming in C++, I have some experience of graphical programming in other languages however I am new to the DirectX scene.
Anyway, I wanted to ask a question about transparent textures. So far I've always used alpha testing as that has met my needs, however I've recently begun to wonder how "proper" game engines manage to render such good-looking semi-transparent textures for things like plants and trees which have smooth transparency.
Every time I've used alpha testing, the textures have ended up looking blocky and just plain bad. I'd love to be able to have smooth, semi-transparent textures which draw as I would expect.
My guess as to how this works would be to execute render calls in order, starting with things that are far away from the camera and moving closer. However, I can't really see how this works for pre-made models. For example, if you had a tree model where the leaves and trunk shared a model, how would you guarantee that the back leaves would draw, that the trunk would draw correctly over those leaves, and that the front leaves would look correct over the trunk?
I had tried that method above and had also disabled z buffering for the transparent objects such as smoke particles, and it sort of worked, but looked messy and the effect appeared different depending on the viewing angle. So that didn't seem ideal.
So, in short, what methods do "proper" games use to correctly draw smooth alpha textures (which have a range of alpha values) into a 3D scene for things like foliage?
Thanks,
Michael.
Ordered transparency is accomplished most basically using the painter's algorithm, that is, drawing objects from the farthest to the nearest.
The painter's algorithm falls apart where an object needs to be drawn both in front of and behind another object, or where a single object has multiple sub components that are transparent. We can't easily sort sub-components of a mesh relative to each other.
While it doesn't solve these problems, the z-buffer allows us to optimize rendering. Most games use this slightly more complex algorithm as the basis of their rendering.
Render all opaque objects sorted by material state or front to back to avoid overdraw.
Render all transparent objects sorted back to front.
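The two passes can be sketched as a sort over a draw list. This is only an illustration: DrawItem and sortForRendering are hypothetical names, depth is assumed to be distance from the camera, and transparent items are ordered far to near because that is what alpha blending needs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct DrawItem {
    int   id;           // hypothetical handle to a mesh or sprite
    float viewDepth;    // distance from the camera, larger = farther
    bool  transparent;  // true if the item needs alpha blending
};

// Reorders items in place: opaque items first (near to far, to limit overdraw),
// then transparent items (far to near, for correct blending).
// Returns the index of the first transparent item.
inline std::size_t sortForRendering(std::vector<DrawItem>& items) {
    const auto isOpaque = [](const DrawItem& d) { return !d.transparent; };
    const auto firstTransparent =
        std::stable_partition(items.begin(), items.end(), isOpaque);

    std::sort(items.begin(), firstTransparent,
              [](const DrawItem& a, const DrawItem& b) { return a.viewDepth < b.viewDepth; });
    std::sort(firstTransparent, items.end(),
              [](const DrawItem& a, const DrawItem& b) { return a.viewDepth > b.viewDepth; });

    assert(std::is_partitioned(items.begin(), items.end(), isOpaque));
    return static_cast<std::size_t>(firstTransparent - items.begin());
}
```

A renderer would then draw the first part with depth writes on, and the second part with depth testing on but depth writes off. Sorting the opaque part by material state instead is the other option mentioned above.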
Games use a variety of techniques in combination to avoid this problem.
Split models into non-overlapping transparent sections. Often this is done implicitly, because a game's transparent objects will often use different materials than the rest of the model. You can also split models with multiple layers of transparency in such a way that each new model's layers do not overlap. For example, you could split a pine tree model radially into 5 sections.
This was more common in fixed function pipelines. Modern games simply try to avoid the problem.
Avoid semi-transparent parts in models. Use transparency only for anti-aliasing edges and where the transparent object can split the world cleanly into two separate groups of objects (windows or water planes, for example). Splitting the world like this and rendering those chunks back to front allows our anti-aliased edges to be drawn without causing obvious cut-outs on other transparent objects. The edges themselves tend to look good even if they overlap, as long as your alpha-test is set higher than ~30%.
Semi-transparent objects are often rendered as particle effects. Grass and smoke are the most common examples. The point list for the effect or group of grass objects is sorted each frame. This is a much simpler problem than sorting arbitrary sub-meshes. Many outdoor games have complex grass and foliage instancing systems. These allow them to render individual leaves and blades properly sorted while avoiding most of the rendering overhead of doing it in this fashion, but they strictly limit the types of objects.
Many effects can be done in an order-independent way using additive and subtractive blending rather than alpha blending.
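The reason additive blending is order independent is just that addition commutes, whereas an alpha blend weights the destination by what is already there. A tiny sketch, with Rgb, addBlend and alphaBlend as made-up names and clamping left out for brevity:

```cpp
#include <cassert>

struct Rgb {
    float r, g, b;
};

// Additive blend: the source is simply added to whatever is in the frame buffer.
inline Rgb addBlend(const Rgb& dst, const Rgb& src) {
    return {dst.r + src.r, dst.g + src.g, dst.b + src.b};
}

// Regular alpha blend: src * alpha + dst * (1 - alpha).
inline Rgb alphaBlend(const Rgb& dst, const Rgb& src, float alpha) {
    assert(alpha >= 0.0f && alpha <= 1.0f);
    const float k = 1.0f - alpha;
    return {src.r * alpha + dst.r * k, src.g * alpha + dst.g * k, src.b * alpha + dst.b * k};
}
```

Adding a red glow and then a green glow to a pixel gives the same color as adding them the other way round, while two alpha blends at 50% give different colors depending on which goes last.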
There are a couple of easy options if your smooth edges are still unacceptable. You can dither any parts of the model below 75% transparency. Or you can have the hardware do it for you without visible artifacts by using alpha-to-coverage. This causes the multisampling hardware to dither the edges in the overdrawn samples. It won't give you a smooth gradient, but the 4-16 levels of alpha are perfectly acceptable for anti-aliasing edges, and free if you already intend to use MSAA.
There are a lot of caveats and special cases. If you have water you will probably need to render any semi-transparent objects that intersect the water twice using a stencil or depth test.
Moving the camera in and out of transparent objects is always problematic.
It is nearly impossible to render a complex semi-transparent object, like an x-ray view of a building or a ghost. Many games simply render this type of object as additive. But with modern hardware a variety of more complex schemes are possible.
More complex schemes
Depth Peeling is a method of rendering where you render multiple passes with different Z-clipping planes to composite the scene from back to front regardless of order or what object contains the alpha. It is less expensive than you would expect because many objects render to only one or two slices. But it is not perfect and many game developers find it too costly.
There are many other varieties of Order Independent Transparency. With a modern GPU and compute we can render in a single pass to a buffer where each pixel is a stack of possible slices. We can then sort the stack and blend these slices in a post process, and only incur the performance penalty when there are layers of transparency on a pixel.
OIT is still mostly used only in special cases like 2.5D games (such as LittleBigPlanet). But I believe that it may eventually become a core tool in game programming.
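As a rough illustration of the per-pixel stack idea, here is what the resolve step for a single pixel might look like on the CPU. A real implementation would run this in a shader over GPU buffers; Fragment, Color and resolvePixel are hypothetical names, and colors are assumed to use straight alpha.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Fragment {
    float depth;       // distance from the camera, larger = farther
    float r, g, b, a;  // straight (non-pre-multiplied) color and opacity
};

struct Color {
    float r, g, b;
};

// Resolves one pixel: sorts the captured fragments far to near,
// then alpha-blends them over the opaque background in that order.
inline Color resolvePixel(std::vector<Fragment> fragments, Color background) {
    std::sort(fragments.begin(), fragments.end(),
              [](const Fragment& x, const Fragment& y) { return x.depth > y.depth; });

    Color out = background;
    for (const Fragment& f : fragments) {
        assert(f.a >= 0.0f && f.a <= 1.0f);
        const float k = 1.0f - f.a;
        out.r = f.r * f.a + out.r * k;
        out.g = f.g * f.a + out.g * k;
        out.b = f.b * f.a + out.b * k;
    }
    return out;
}
```

Because the sort happens per pixel after everything has been submitted, the order in which objects were drawn no longer matters, and a pixel with no transparent fragments just returns the background untouched.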
I've been using XNA for essentially all of my programming so far and would like to move on to OpenGL (along with SFML for IO, creating the window, etc.) with C++. For starters I'd like to create a tile-based game, and I've mostly looked at LazyFoo's tutorials.
I just have two questions:
How should I draw the tiles? Should I use immediate drawing, arrays, VBOs or what? VBOs feel like overkill for this but I'm not sure. It's very tempting to use immediate drawing but apparently it's deprecated. Maybe it's fine for this purpose since it's 2D and only for a bunch of quads.
I'd like a lot of different tiles, and thus all of my tiles will not fit into a single texture without making it massive. I've read that calling glBindTexture isn't very cheap and thus I should avoid as many calls as I can. I thought that maybe I could create a manager for my textures, stitch them all together into one big texture and bind that, but then the dimensions of that texture are an issue.
Don't use immediate mode! It's cumbersome to work with and has been removed from the core profile of recent OpenGL versions. Use vertex arrays, ideally through VBOs. In the end they're much easier to use, believe me.
Regarding that switching of textures. We're talking about optimizing the texture switch patterns in very complex scenes. In your case it will hardly matter at all.
Update
Right now you worry about things without having even used them. That's worse than premature optimization. I suggest you first get a good grip on OpenGL, then start worrying about state switch management.
With regards to the texture atlas; this is usually done by stitching textures into groups of power-of-two sized textures. For example in a tile-based game you might have a particular tile set (say, tiles for an ice world) grouped together on 2 or 3 textures. When you want to render them you would determine what tiles are visible, then you bind each texture once and render the tiles from that texture for any tiles that are visible on screen.
This requires quite a lot of set-up time to get right; you need to keep information on each sub-texture of the atlas so you can find the right texture and render the appropriate region of that texture whenever a tile is referenced. You also need a good way of grouping rendering operations so that they occur when the appropriate texture is bound.
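To give an idea of the bookkeeping involved, here is a minimal sketch that assumes the simplest possible layout: every atlas texture ("page") is a square grid of equally sized square tiles, and tiles are numbered row by row across pages. AtlasRegion, lookupTile and groupByPage are hypothetical names, not part of OpenGL or SFML.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct AtlasRegion {
    int   page;            // which atlas texture the tile lives on
    float u0, v0, u1, v1;  // normalized texture coordinates of the tile
};

// Maps a global tile index to its atlas page and texture coordinates.
inline AtlasRegion lookupTile(int tileIndex, int tileSize, int pageSize) {
    assert(tileIndex >= 0);
    assert(tileSize > 0 && pageSize >= tileSize && pageSize % tileSize == 0);
    const int tilesPerRow  = pageSize / tileSize;
    const int tilesPerPage = tilesPerRow * tilesPerRow;
    const int page  = tileIndex / tilesPerPage;
    const int local = tileIndex % tilesPerPage;
    const int col   = local % tilesPerRow;
    const int row   = local / tilesPerRow;
    const float step = static_cast<float>(tileSize) / static_cast<float>(pageSize);
    return {page, col * step, row * step, (col + 1) * step, (row + 1) * step};
}

// Groups visible tile indices by page so that each page only has to be bound once.
inline std::vector<std::vector<int>> groupByPage(const std::vector<int>& visibleTiles,
                                                 int tileSize, int pageSize, int pageCount) {
    assert(pageCount > 0);
    std::vector<std::vector<int>> batches(static_cast<std::size_t>(pageCount));
    for (int tile : visibleTiles) {
        const int page = lookupTile(tile, tileSize, pageSize).page;
        assert(page < pageCount);
        batches[static_cast<std::size_t>(page)].push_back(tile);
    }
    return batches;
}
```

The render loop would then walk the batches, bind the texture for each non-empty one, and draw all of its tiles before moving on. Real atlases with differently sized sub-textures need a stored rectangle per entry instead of the grid arithmetic.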
Like datenwolf said, I wouldn't focus too much on complicated texture systems early on; eager binding of textures will be plenty fast enough until you get further down the road.
I'm kind of stuck on the logic behind an SDL2 texture. To me, they are pointless since you cannot draw to them.
In my program, I have several surfaces (or what were surfaces before I switched to SDL2) that I just blitted together to form layers. Now, it seems, I have to create several renderers and textures to create the same effect since SDL_RenderCopy takes a texture pointer.
Not only that, but all renderers have to come from a window, which I understand, but still fouls me up a bit more.
This all seems extremely bulky and slow. Am I missing something? Is there a way to draw directly to a texture? What is the point of textures, and am I safe to have multiple (if not hundreds) of renderers in place of what were surfaces?
SDL_Texture objects are stored as close as possible to video card memory and can therefore easily be accelerated by your GPU. Resizing, alpha blending, anti-aliasing and almost any compute-heavy operation can benefit greatly from this performance boost. If your program needs to run per-pixel logic on your textures, you are encouraged to convert your textures into surfaces temporarily. Achieving a workaround with streaming textures is also possible.
Edit:
Since this answer receives quite a bit of attention, I'd like to elaborate on my suggestion.
If you prefer to use Texture -> Surface -> Texture workflow to apply your per-pixel operation, make sure you cache your final texture unless you need to recalculate it on every render cycle. Textures in this solution are created with SDL_TEXTUREACCESS_STATIC flag.
Streaming textures (creation flag is SDL_TEXTUREACCESS_STREAMING) are encouraged for use cases where the source of the pixel data is a network, a device, a frameserver or some other source that is beyond the SDL application's full reach, and when it is apparent that caching frames from the source is inefficient or would not work.
It is possible to render on top of textures if they are created with the SDL_TEXTUREACCESS_TARGET flag. This limits the source of the draw operation to other textures, although this might already be what you required in the first place. "Textures as render targets" is one of the newest and least widely supported features of SDL2.
Nerd info for curious readers:
Due to the nature of SDL implementation, the first two methods depend on application level read and copy operations, though they are optimized for suggested scenarios and fast enough for realtime applications.
Copying data from application level is almost always slow when compared to post-processing on the GPU. If your requirements are stricter than what SDL can provide and your logic does not depend on some outer pixel data source, it would be sensible to allocate raw OpenGL textures painted from your SDL surfaces and apply shaders (GPU logic) to them.
Shaders are written in GLSL, a language which compiles into GPU assembly. Hardware/GPU acceleration actually refers to code parallelized on GPU cores, and using shaders is the preferred way to achieve that for rendering purposes.
Attention! Using raw OpenGL textures and shaders in conjunction with SDL rendering functions and structures might cause some unexpected conflicts or loss of flexibility provided by the library.
TL;DR
It is faster to render and operate on textures than surfaces, although modifying them can sometimes be cumbersome.
By creating an SDL2 texture of the STREAMING type, one can lock and unlock the entire texture, or just an area of pixels, to perform direct pixel operations. One must first create an SDL2 surface and link it with lock-unlock as follows:
SDL_Surface *surface = SDL_CreateSurface(..);
// lock only rect; SDL hands back a pixel pointer and the row pitch for that area
SDL_LockTexture(texture, &rect, &surface->pixels, &surface->pitch);
// paint into surface pixels
SDL_UnlockTexture(texture);
The key is, if you draw to a texture of larger size and the drawing is incremental (e.g. a data graph in real time), be sure to lock and unlock only the actual area to update. Otherwise the operations will be slow, with heavy memory copying.
I have experienced reasonable performance and the usage model is not too difficult to understand.
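For anyone unsure what "only the actual area" means in terms of memory, here is a small sketch that runs without SDL. A plain buffer stands in for the texture, and lockRegion is a hypothetical helper that imitates what a lock hands back: a pointer to the first pixel of the rectangle plus the row pitch in bytes. It assumes 32 bits per pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rect {
    int x, y, w, h;
};

// Stand-in for the result of a lock: first pixel of the rect and row pitch in bytes.
struct LockedRegion {
    std::uint8_t* pixels;
    int           pitch;
};

// Imitates locking a sub-rectangle of a 32-bit-per-pixel buffer (not an SDL call).
inline LockedRegion lockRegion(std::vector<std::uint32_t>& texture, int textureWidth,
                               const Rect& r) {
    assert(textureWidth > 0 && r.x >= 0 && r.y >= 0 && r.w > 0 && r.h > 0);
    assert(r.x + r.w <= textureWidth);
    assert(static_cast<std::size_t>(r.y + r.h) * static_cast<std::size_t>(textureWidth)
           <= texture.size());
    const int bytesPerPixel = static_cast<int>(sizeof(std::uint32_t));
    const int pitch = textureWidth * bytesPerPixel;
    auto* base = reinterpret_cast<std::uint8_t*>(texture.data());
    return {base + r.y * pitch + r.x * bytesPerPixel, pitch};
}

// Writes only the rows and columns of the rect; the pitch, not the rect width,
// is the step from one row to the next.
inline void fillRegion(const LockedRegion& region, const Rect& r, std::uint32_t color) {
    for (int row = 0; row < r.h; ++row) {
        auto* line = reinterpret_cast<std::uint32_t*>(region.pixels + row * region.pitch);
        for (int col = 0; col < r.w; ++col) {
            line[col] = color;
        }
    }
}
```

For a real-time graph the rectangle would be the narrow strip where the newest samples go, so only that strip is touched on each update.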
In SDL2 it is possible to render off-screen / render directly to a texture. The function to use is:
int SDL_SetRenderTarget(SDL_Renderer *renderer, SDL_Texture *texture);
This only works if the renderer was created with the SDL_RENDERER_TARGETTEXTURE flag.
I'm working on a 2D game using C++ and DirectX9 and I've got a decent amount of it working. As of now I have it using sprite.draw for everything: the player, the backgrounds (tiled with for loops), the walls, the HUD, etc. Then I started questioning if how I was drawing the game was the best way to go. Are there major differences between using sprites and using textured primitives? Is there a way to just set each pixel individually from my own functions, and would that be practical? It'd be nice if I could later add lighting and alpha blending, and I'd be up for coding that myself if it doesn't slow the program down too much. I just want to get things straight right away and make sure there's nothing I'm missing.
Sprite drawing is fine. Its advantage is that it handles all the texture coordinates for you, and it is probably hardware accelerated as well.
"Is there a way to just set each pixel individually from my own functions, and would that be practical?"
It is possible, but not practical. The CPU cannot process that many pixels each frame, nor are its buses wide enough to send all of them every frame. That's what the graphics card is for: it is much faster at pixel processing and has much wider buses to the display.
"It'd be nice if I could later add lighting and alpha blending, and I'd be up for coding that myself if it doesn't slow the program down too much."
It is possible, with built-in functions for lights and alpha blending, and you can even code it yourself (it's called a shader).
If I was making a 3D engine, the answer to this question would be clear: I'd go for using the depth buffer instead of thinking of sorting all my polygons on my own.
However, this is a different situation with 2D, because here layers can be implemented easily without the help of OpenGL - and you then could even sort and move sprites within layers. (Which isn't possible in OpenGL afaik)
(Why) should I use the OpenGL depth buffer instead of a C++ layer system running on the CPU?
How much slower would the depth buffer version be?
It is clear to me that a layer system in C++ would have virtually no performance impact at all, as I have to iterate over the sprites for rendering in any case.
I would suggest doing it in software, since you probably want to use transparency on your sprites, and that implies you render them from back to front. Also, sorting a couple of sprites shouldn't be that CPU demanding.
Use both, if you can.
Depth information is nice for post-processing and stuff like 3D-glasses, so you shouldn't throw it away. These kinds of effects can be very nice for 2D games.
Also, if you draw your (opaque) layers front to back, you can save fill-rate because the Z-Buffer can do the clipping for you (Depth tests are faster than actual drawing).
Depth testing is usually almost free, especially when you have hierarchical Z info. Because of this and the fill-rate savings, using depth testing will probably be even faster.
On the other hand, the software sorting is nice so you can actually do front to back rendering for opaque sprites and it's mandatory to do alpha-blending right (of course, you draw these sprites back to front).
Direct answers:
Allowing the GPU to use the depth buffer would let you dynamically adjust the draw order of things without any on-CPU shuffling, and would free you from having to assign things to different layers in situations where doing so is a bit of a fiction. For example, you could have effects like projectiles that come from the background towards and then in front of the player, without having to figure out which layer to assign them to all the time.
On the GPU, the use of a depth buffer would have no measurable effect, even if you're on an embedded chip, a plug-in card from more than a decade ago or an integrated part; depth buffers are so fundamental to modern GPUs that they've been optimised down to costing nothing in practical terms.
However, I'd imagine you actually want to do it on the CPU for the simple reason of treating transparency correctly. A depth buffer stores one depth per pixel, so if you draw a near transparent object and then attempt to draw something behind it, the thing behind won't be drawn even though it should be visible. In a 2D game it's likely that anti-aliasing will give your sprites partially transparent edges; if you submit drawing to the GPU in back-to-front draw order then your partial transparencies will always be composited correctly. If you leave the z-buffer to do it then you risk weird-looking fringing.
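The one-depth-per-pixel problem is easy to reproduce with a single simulated pixel. This is only a sketch with made-up names (Pixel, Sprite, drawWithDepthTest); it assumes a "less than" depth test with depth writes enabled and straight alpha blending, where a smaller depth means nearer.

```cpp
#include <cassert>

struct Pixel {
    float r, g, b;
    float depth;  // depth currently stored for this pixel, smaller = nearer
};

struct Sprite {
    float r, g, b, a;  // straight-alpha color
    float depth;       // smaller = nearer to the viewer
};

// Draws one sprite fragment into one pixel: depth test, alpha blend, depth write.
inline void drawWithDepthTest(Pixel& px, const Sprite& s) {
    assert(s.a >= 0.0f && s.a <= 1.0f);
    if (s.depth >= px.depth) {
        return;  // rejected: something nearer has already been written here
    }
    const float k = 1.0f - s.a;
    px.r = s.r * s.a + px.r * k;
    px.g = s.g * s.a + px.g * k;
    px.b = s.b * s.a + px.b * k;
    px.depth = s.depth;
}
```

Drawing an opaque red sprite and then a half-transparent white one in front of it gives a pink pixel, as expected. Drawing them the other way round leaves the pixel grey: the white sprite writes its depth first, so the red sprite behind it is rejected and never shows through.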