Rendering translucent triangles in models animated by a vertex shader? - c++

Until now, I've been handling translucency by sorting my translucent triangles back to front. This works very well for quads, but I'd like to incorporate translucency in my models now.
I've thought of separating the translucent tris out and sorting them in the same way as my quads: sorting by their centroids, then streaming the results into an IBO just for them each frame. But given the number of triangles in a model, and the need to transform them on the CPU according to a table of bones, blend shapes, and some other things in my vertex shader... this doesn't seem like a good solution for performance or sanity.
My models are about 4K tris each, with maybe 20 in a scene at worst, and I'd really like to lean into a simple cute style that relies on translucency, which doesn't have to be physically accurate, or draw objects behind 4 or more layers of translucency.
What technique might work well for my situation, in 2021? I'm using OpenGL 3.3 but I'll use another version if new features exist for this.

Afaik, there's no easy solution to what you want to do. However, there are some things that you can do:
Don't sort triangles at all, and only sort individual draw calls back to front. This is the easiest solution and is used by most games that render translucent/transparent objects at all. It might not always look correct (though it works well with convex shapes), but it has very good performance; a minimal sketch of this approach follows after this list.
Order-independent transparency: There are some pixel-shader-based techniques for rendering transparent/translucent objects without having to sort anything at all. These techniques are usually approximations (there are also some exact algorithms) and tend to work best for things like smoke (where small errors aren't as noticeable), or when not many transparent/translucent triangles overlap each other anyway.
Compute Shaders: You can use a compute shader to transform your individual vertices according to your animation, and then use another compute shader to sort the triangles on the GPU. That is probably the most straight-forward improvement of what you'd otherwise do on the CPU, and there are many examples for sorting stuff on the GPU out there. But, if you've never worked with compute shaders before, it might be a bit hard to wrap your head around their strengths and limitations at first.
Ray Tracing: This would probably be the most complex way to solve that problem, since you'll need specific hardware, generate corresponding data structures and a few new shaders, and on top of that, even with modern hardware, ray tracing is quite costly (though probably still faster than sorting triangles on the CPU each frame). But it also doesn't need your triangles to be sorted and will actually work perfectly well even if different translucent objects are intersecting/interleaving each other.
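For the first option, here is a minimal sketch in C++ of sorting whole translucent draw calls back to front before blending them. The TranslucentDraw struct, drawModel() helper and GLM usage are illustrative assumptions, not part of any particular engine:

```cpp
#include <algorithm>
#include <vector>
#include <GL/glew.h>       // or whichever GL loader you use
#include <glm/glm.hpp>

// Hypothetical per-object record; substitute your own scene structures.
struct TranslucentDraw {
    glm::vec3 worldCenter;  // e.g. the model's bounding-sphere centre in world space
    unsigned  modelId;      // whatever your renderer needs to issue the draw
};

void drawModel(unsigned modelId);   // assumed to bind the VAO/uniforms and call glDrawElements

void drawTranslucent(std::vector<TranslucentDraw>& draws, const glm::vec3& cameraPos)
{
    // Farthest first: compare squared distances to the camera.
    std::sort(draws.begin(), draws.end(),
              [&](const TranslucentDraw& a, const TranslucentDraw& b) {
                  glm::vec3 da = a.worldCenter - cameraPos, db = b.worldCenter - cameraPos;
                  return glm::dot(da, da) > glm::dot(db, db);
              });

    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    glDepthMask(GL_FALSE);              // test against opaque depth, but don't write it
    for (const TranslucentDraw& d : draws)
        drawModel(d.modelId);
    glDepthMask(GL_TRUE);
    glDisable(GL_BLEND);
}
```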

Related

DirectX Transparent textures - Making plants look good

I've recently started learning DirectX programming in C++, I have some experience of graphical programming in other languages however I am new to the DirectX scene.
Anyway, I wanted to ask a question about transparent textures. So far I've always used alpha testing, as that has met my needs; however, I've recently begun to wonder how "proper" game engines manage to render such good-looking semi-transparent textures for things like plants and trees, which have smooth transparency.
Every time I've used alpha testing, the textures have ended up looking blocky and just plain bad. I'd love to be able to have smooth, semi-transparent textures which draw as I would expect.
My guess as to how this works would be to execute render calls in order, starting with things that are far away from the camera and moving closer. However, I can't really see how this works for pre-made models. For example, if you had a tree model where the leaves and trunk shared a model, how do you guarantee that the back leaves draw, that the trunk draws correctly over them, and that the front leaves look correct over the trunk?
I had tried that method above and had also disabled z buffering for the transparent objects such as smoke particles, and it sort of worked, but looked messy and the effect appeared different depending on the viewing angle. So that didn't seem ideal.
So, in short, what methods do "proper" games use to correctly draw smooth alpha textures (which have a range of alpha values) into a 3D scene for things like foliage?
Thanks,
Michael.
Ordered transparency is accomplished most basically using the painter's algorithm.
The painter's algorithm falls apart where an object needs to be drawn both in front of and behind another object, or where a single object has multiple sub components that are transparent. We can't easily sort sub-components of a mesh relative to each other.
While it doesn't solve those problems, the z-buffer allows us to optimize rendering. Most games use the following slightly more complex algorithm as the basis of their rendering:
Render all opaque objects, sorted by material state or front to back to avoid overdraw.
Render all transparent objects, sorted back to front.
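A rough sketch of that two-pass structure in C++. The Renderable type, distanceToCamera() and the state helpers are placeholders for whatever your engine and API (Direct3D or OpenGL) actually provide:

```cpp
#include <algorithm>
#include <vector>

// Placeholders assumed to exist elsewhere in your engine.
struct Renderable;
float distanceToCamera(const Renderable* r);
void  draw(const Renderable* r);
void  setAlphaBlending(bool enabled);   // e.g. a blend state in D3D, glEnable(GL_BLEND) in GL
void  setDepthWrites(bool enabled);     // e.g. depth-stencil state in D3D, glDepthMask in GL

void renderScene(std::vector<const Renderable*>& opaque,
                 std::vector<const Renderable*>& transparent)
{
    // Pass 1: opaque objects, roughly front to back, so the z-buffer rejects hidden pixels early.
    std::sort(opaque.begin(), opaque.end(),
              [](const Renderable* a, const Renderable* b) {
                  return distanceToCamera(a) < distanceToCamera(b);
              });
    setAlphaBlending(false);
    setDepthWrites(true);
    for (const Renderable* r : opaque) draw(r);

    // Pass 2: transparent objects, back to front, depth test still on but depth writes off.
    std::sort(transparent.begin(), transparent.end(),
              [](const Renderable* a, const Renderable* b) {
                  return distanceToCamera(a) > distanceToCamera(b);
              });
    setAlphaBlending(true);
    setDepthWrites(false);
    for (const Renderable* r : transparent) draw(r);
    setDepthWrites(true);
}
```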
Games use a variety of techniques in combination to avoid this problem.
Split models into non-overlapping transparent sections. Often this is done implicitly, because a game's transparent parts will often use different materials than the rest of the model. You can also split models with multiple layers of transparency in such a way that each new model's layers do not overlap. For example, you could split a pine tree model radially into 5 sections.
This was more common in fixed function pipelines. Modern games simply try to avoid the problem.
Avoid semi-transparent parts in models. Use transparency only for anti-aliasing edges and where the transparent object can split the world cleanly into two separate groups of objects. (Windows or water planes for example). Splitting the world like this and rendering those chunks front to back allows our anti-aliased edges to be drawn without causing obvious cut-outs on other transparent objects. The edges themselves tend to look good even if they overlap as long as your alpha-test is set higher than ~30%.
Semi-transparent objects are often rendered as particle effects. Grass and smoke are the most common examples. The point list for the effect or group of grass objects is sorted each frame, which is a much simpler problem than sorting arbitrary sub-meshes. Many outdoor games have complex grass and foliage instancing systems; these let them render individual leaves and blades properly sorted while avoiding most of the rendering overhead of doing it this way, but they strictly limit the types of objects.
Many effects can be done in an order-independent way using additive and subtractive blending rather than alpha blending.
There are a couple of easy options if your smooth edges are still unacceptable. You can dither any parts of the model below 75% transparency. Or you can have the hardware do it for you without visible artifacts by using alpha-to-coverage. This causes the multisampling hardware to dither the edges across the overdrawn samples. It won't give you a smooth gradient, but the 4-16 levels of alpha are perfectly acceptable for anti-aliasing edges and free if you already intend to use MSAA.
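A short sketch of both ideas in C++ with OpenGL calls (Direct3D exposes the same alpha-to-coverage feature through its blend state). The shader string and drawFoliage() helper are illustrative assumptions:

```cpp
#include <GL/glew.h>   // or your GL loader of choice

void drawFoliage();    // hypothetical: issues the foliage draw calls

// Alpha-tested "cutout" fragment shader: discard below the ~30% cutoff mentioned above.
const char* kCutoutFragSrc = R"(
    #version 330 core
    uniform sampler2D uAlbedo;
    in vec2 vUv;
    out vec4 fragColor;
    void main() {
        vec4 texel = texture(uAlbedo, vUv);
        if (texel.a < 0.3) discard;
        fragColor = texel;
    }
)";

// Alpha-to-coverage: let the MSAA hardware dither the edges, assuming the
// framebuffer was created with multisampling.
void drawFoliageWithAlphaToCoverage()
{
    glEnable(GL_MULTISAMPLE);
    glEnable(GL_SAMPLE_ALPHA_TO_COVERAGE);
    drawFoliage();
    glDisable(GL_SAMPLE_ALPHA_TO_COVERAGE);
}
```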
There are a lot of caveats and special cases. If you have water you will probably need to render any semi-transparent objects that intersect the water twice using a stencil or depth test.
Moving the camera in and out of transparent objects is always problematic.
It is nearly impossible to render a complex semi-transparent object correctly, like an x-ray view of a building or a ghost. Many games simply render this type of object additively, but with modern hardware a variety of more complex schemes are possible.
More complex schemes
Depth peeling is a method of rendering where you render multiple passes, each one peeling away the nearest remaining layer using the previous pass's depth, so the scene can be composited correctly regardless of draw order or which object contains the alpha. It is less expensive than you would expect, because many objects only contribute to one or two layers, but it is not perfect and many game developers find it too costly.
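A very rough C++ sketch of the peeling loop; the FBOs, depth textures, peel shader and composite step are assumed to be set up elsewhere, and all names here are placeholders:

```cpp
#include <GL/glew.h>

extern GLuint peelFbo[2];        // ping-pong FBOs with colour + depth attachments
extern GLuint peelColorTex[2];   // their colour textures
extern GLuint peelDepthTex[2];   // their depth textures (assume the "previous" one starts
                                 // cleared to the far plane so pass 0 peels the nearest layer)
extern GLuint peelProgram;       // fragment shader that discards fragments not strictly
                                 // behind the depth sampled from uPrevDepth
void drawTransparentGeometry();
void compositeLayerUnder(GLuint layerColorTex);   // blends a peeled layer under the result so far

void depthPeel(int numLayers)
{
    for (int layer = 0; layer < numLayers; ++layer) {
        int cur = layer % 2, prev = 1 - cur;

        glBindFramebuffer(GL_FRAMEBUFFER, peelFbo[cur]);
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

        glUseProgram(peelProgram);
        glActiveTexture(GL_TEXTURE0);
        glBindTexture(GL_TEXTURE_2D, peelDepthTex[prev]);   // depth of the layer peeled last pass
        glUniform1i(glGetUniformLocation(peelProgram, "uPrevDepth"), 0);

        drawTransparentGeometry();

        // Front-to-back compositing: each new layer goes "under" what has been accumulated.
        compositeLayerUnder(peelColorTex[cur]);
    }
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
}
```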
There are many other varieties of Order Independent Transparency. With a modern GPU and compute we can render in a single pass to a buffer where each pixel is a stack of possible slices. We can then sort the stack and blend these slices in a post process, and only incur the performance penalty when there are layers of transparency on a pixel.
OIT is still mostly only used in special cases like 2.5D games (such as LittleBigPlanet), but I believe it may eventually become a core tool in game programming.

OpenGL: Is it more efficient to use GL_QUADS or GL_TRIANGLES?

I know that OpenGL deprecated and got rid of GL_QUADS in the newer releases. I have heard this is due to the fact that modern GPUs only render with triangles so calling a quad would just make the GPU work harder to break it into two triangles (what I have heard anyway, I am not much of an expert on any of this topic).
I was wondering whether or not it is better (assuming the average person's CPU is faster, relatively, than their GPU) to just manually break the rendering of quads into two triangles yourself or to just let the GPU do it itself. Again, I have absolutely no real experience with OpenGL as I am just starting. I would rather know which is better for most machines these days so I could focus my attention on either rendering method*. Thanks.
*Yet I will probably utilize the 'triangle method' for the sake of it.
Even if you feed OpenGL quads, the triangularization is done by the driver on the CPU side before it even hits the GPU. Modern GPUs these days eat nothing except triangles. (Well, and lines and points.) So something will be triangulating, whether it's you or the driver -- it doesn't matter too much where it happens.
This would be less efficient if, say, you don't reuse your vertex buffers, and instead refill them anew every time with quads (in which case the driver will have to retriangulate every vertex buffer), instead of refilling them with pretriangulated triangles every time, but that's pretty contrived (and the problem you should be fixing in that case is just the fact you're refilling your vertex buffers).
I would say, if you have the choice, stick with triangles, since that's what most content pipelines put out anyways, and you're less likely to run into problems with non-planar quads and the like. If you get to choose what format your content comes in, then use triangles for sure, and the triangulation step gets skipped altogether.
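If you do end up with quad data, the pretriangulation step is tiny; a sketch, assuming convex, consistently wound quads and 32-bit indices:

```cpp
#include <cstdint>
#include <vector>

// Expand each quad (v0 v1 v2 v3) into the two triangles (v0 v1 v2) and (v0 v2 v3),
// which is roughly what the driver used to do for GL_QUADS.
std::vector<std::uint32_t> quadsToTriangles(const std::vector<std::uint32_t>& quadIndices)
{
    std::vector<std::uint32_t> tris;
    tris.reserve(quadIndices.size() / 4 * 6);
    for (std::size_t i = 0; i + 3 < quadIndices.size(); i += 4) {
        std::uint32_t v0 = quadIndices[i],     v1 = quadIndices[i + 1];
        std::uint32_t v2 = quadIndices[i + 2], v3 = quadIndices[i + 3];
        tris.insert(tris.end(), { v0, v1, v2, v0, v2, v3 });
    }
    return tris;
}
```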
Any geometry can be represented with triangles, which is why it was decided to use triangles instead of quads. Another reason is that the two triangles do not have to be co-planar with each other, which is not true of a quad.
Yes, you can choose to render quads, but the driver will convert each quad into two triangles.
Therefore, choosing to render a quad will not make the GPU work less, but it will make your CPU work more, because it has to do the conversion.

Techniques for drawing tiles with OpenGL

I've been using XNA for essentially all of my programming so far and would like to move on to OpenGL (along with SFML for IO, creating the window, etc.) with C++. For starters I'd like to create a tile-based game, and I've mostly looked at LazyFoo's tutorials.
I just have two questions:
How should I draw the tiles? Should I use immediate drawing, arrays, VBOs or what? VBOs feel like overkill for this but I'm not sure. It's very tempting to use immediate drawing but apparently it's deprecated. Maybe it's fine for this purpose since it's 2D and only for a bunch of quads.
I'd like a lot of different tiles, and thus all of my tiles will not fit into a single texture without making it massive. I've read that using bindTexture isn't very cheap and thus I should avoid as many calls as I can. I thought that maybe I could create a manager for my textures and stitch them all together into one big texture and bind that, but then the dimensions of that texture are an issue.
Don't use immediate mode! It's cumbersome to work with and has been removed from recent OpenGL versions. Use Vertex Arrays, ideally through VBOs. In the end they're much easier to use, believe me.
Regarding the switching of textures: that matters when optimizing texture-switch patterns in very complex scenes. In your case it will hardly matter at all.
Update
Right now you worry about things without having even used them. That's worse than premature optimization. I suggest you first get a good grip on OpenGL, then start worrying about state-switch management.
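As a concrete starting point for the VBO advice, here is a small sketch of packing tiles as two triangles each (position + UV) into one buffer; the vertex layout and helper names are just illustrative:

```cpp
#include <GL/glew.h>
#include <vector>

struct TileVertex { float x, y, u, v; };

// Append one tile as two counter-clockwise triangles.
void appendTile(std::vector<TileVertex>& verts, float x, float y, float size,
                float u0, float v0, float u1, float v1)
{
    verts.insert(verts.end(), {
        {x,        y,        u0, v0}, {x + size, y,        u1, v0}, {x + size, y + size, u1, v1},
        {x,        y,        u0, v0}, {x + size, y + size, u1, v1}, {x,        y + size, u0, v1},
    });
}

// Upload the visible tiles; GL_STREAM_DRAW because a scrolling map is refilled often.
void uploadTiles(GLuint vbo, const std::vector<TileVertex>& verts)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, verts.size() * sizeof(TileVertex),
                 verts.data(), GL_STREAM_DRAW);
    // Then draw everything with a single glDrawArrays(GL_TRIANGLES, 0, vertexCount) call.
}
```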
With regard to the texture atlas: this is usually done by stitching textures into groups of power-of-two sized textures. For example, in a tile-based game you might have a particular tile set (say, tiles for an ice world) grouped together on 2 or 3 textures. When you want to render them, you determine which tiles are visible, then bind each texture once and render the tiles from that texture for any tiles that are visible on screen.
This requires quite a lot of set-up time to get right; you need to keep information on each sub-texture of the atlas so you can find the right texture and render the appropriate region of it whenever a tile is referenced. You also need a good way of grouping rendering operations so that they occur when the appropriate texture is bound.
Like datenwolf said, I wouldn't focus too much on complicated texture systems early on; eager binding of textures will be plenty fast enough until you get further down the road.
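For the simplest case of a regular grid atlas, the sub-texture bookkeeping can be as small as this sketch (the grid layout is an assumption; real atlases often store a per-entry table instead):

```cpp
// UV rectangle of one cell in a regular grid atlas.
struct UvRect { float u0, v0, u1, v1; };

UvRect atlasUv(int tileIndex, int tilesPerRow, int tilesPerColumn)
{
    const float cellW = 1.0f / tilesPerRow;
    const float cellH = 1.0f / tilesPerColumn;
    const int col = tileIndex % tilesPerRow;
    const int row = tileIndex / tilesPerRow;
    return { col * cellW, row * cellH, (col + 1) * cellW, (row + 1) * cellH };
}
```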

Are triangles a gpu restriction or are there other rendering pathways?

To preface this question, I have a competent understanding of OpenGL and the maths behind it, and while I have never touched anything related to DirectX I imagine the concepts are similar.
There is plenty of information around about why triangles are used for 3D graphics (they are necessarily planar, are indivisible except into smaller triangles, etc). However, I would like to know if triangles are merely a convenient way of storing and manipulating 3D data (simpler maths regarding interpolation, etc), or if there is a hardware limitation in the graphics card that only realistically allows the rendering of triangles (e.g. instructions that can essentially ONLY be applied to triangles).
Following on from this, is there any way to achieve pixel-by-pixel control of graphics rendering (as outlined briefly by the answer to this question)? While I appreciate that direct control over individual pixels goes through a driver, is there any way I can get this kind of control over a rendering environment? Is there a way to 'avoid triangles' completely?
Yes and no. Kind of.
Current GPUs are designed to render triangles because triangles are nice to work with. And because current GPUs are designed to work with triangles, people use triangles and so GPUs only need to process triangles, and so they're designed to process only triangles.
As you say, triangles just have advantages that make them convenient to use. GPUs can be made (and have been made) to render other primitives natively, but it's just not really worth it. If you tell a modern GPU to render a quad, it splits it up into two triangles and renders those.
Not because there's a technical reason why a GPU can't render quads natively, but because it's not worth spending transistors on. It's much more useful to focus the GPU on doing triangles as fast as possible, and then just emulate other primitives if they're needed.
So yes, modern GPUs have hardware limitations so they don't work with quads, for example, but not because it's impossible to design a GPU which works with quads. It'd just be less efficient to do so. :)
As for "avoiding triangles", sure, that's basically what the fragment shader does: it fills in one single pixel. The GPU just runs it a few million times in parallel to fill in the entire screen. You could draw two big triangles, which form a quad filling the entire screen, and then just specify a fragment shader which fills that with the content you like.
If you want more control over the process, do it in software instead: paint one pixel at a time to a memory surface, and then load that as a texture on the GPU. But it's slow. :)
As far as I know, every modern GPU can render quads and some can even render N-gons, but comparing the render time of a quad to two triangles shows the triangle advantage.
This is mainly because GPUs have been optimized to render triangles, and the actual hardware has far more "stream processors" (for triangles) than other units such as texture units. Some other processor types on the GPU can render quads directly, but normally you would find a thousand stream processors for every few texture units.
Note that getting a texture unit to render a quad is EXTREMELY difficult. It is possible in theory, but no one has used the principle for a serious case.
Unless you are working very close to the hardware, the software will take care of the triangles (e.g., auto-converting them from quads).

2D engine with OpenGL: Use Z buffer or own implementation for sprite sorting?

If I was making a 3D engine, the answer to this question would be clear: I'd go for using the depth buffer instead of thinking of sorting all my polygons on my own.
However, this is a different situation with 2D, because here layers can be implemented easily without the help of OpenGL - and you then could even sort and move sprites within layers. (Which isn't possible in OpenGL afaik)
(Why) should I use the OpenGL depth buffer instead of a C++ layer system running on the CPU?
How much slower would the depth buffer version be?
It is clear to me that a layer system in C++ would impose virtually no performance impact, as I have to iterate over the sprites for rendering in any case.
I would suggest you do it in software, since you probably want to use transparency on your sprites, and that implies you render them from back to front. Also, sorting a couple of sprites shouldn't be that CPU-demanding.
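The whole software approach can be as small as this sketch: sort by layer, then by a per-sprite depth within the layer, and submit back to front (the Sprite fields here are placeholders for your own types):

```cpp
#include <algorithm>
#include <vector>

struct Sprite { int layer; float depth; /* texture, position, ... */ };

// Order sprites so that drawing them in sequence gives correct back-to-front blending.
void sortForDrawing(std::vector<Sprite>& sprites)
{
    std::sort(sprites.begin(), sprites.end(),
              [](const Sprite& a, const Sprite& b) {
                  if (a.layer != b.layer) return a.layer > b.layer;  // farther layers first
                  return a.depth > b.depth;                          // then farther sprites first
              });
}
```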
Use both, if you can.
Depth information is nice for post-processing and stuff like 3D-glasses, so you shouldn't throw it away. These kinds of effects can be very nice for 2D games.
Also, if you draw your (opaque) layers front to back, you can save fill-rate because the Z-Buffer can do the clipping for you (Depth tests are faster than actual drawing).
Depth testing is usually almost free, especially when the GPU has hierarchical-Z support. Because of this and the fill-rate savings, using depth testing will probably be even faster.
On the other hand, the software sorting is nice so you can actually do front to back rendering for opaque sprites and it's mandatory to do alpha-blending right (of course, you draw these sprites back to front).
Direct answers:
Allowing the GPU to use the depth buffer lets you dynamically adjust the draw order of things without any on-CPU shuffling, and it frees you from having to assign things to different layers in situations where doing so is a bit of a fiction. For example, you could have effects like projectiles that come from the background towards and then in front of the player, without having to figure out which layer to assign them to all the time.
On the GPU, the use of a depth buffer would have no measurable cost, even if you're on an embedded chip, a plug-in card from more than a decade ago, or an integrated part; depth buffers are so fundamental to modern GPUs that they've been optimised down to costing nothing in practical terms.
However, I'd imagine you actually want to do it on the CPU for the simple reason of treating transparency correctly. A depth buffer stores one depth per pixel, so if you draw a nearly transparent object and then attempt to draw something behind it, the thing behind won't be drawn even though it should be visible. In a 2D game it's likely that anti-aliasing will give your sprites partially transparent edges; if you submit drawing to the GPU in draw order, your partial transparencies will always be composited correctly. If you leave it to the z-buffer, you risk weird-looking fringing.