I am currently working on a WebGL project, although I imagine this question is generic across many graphics APIs.
Let me use the example of a simple cube to demonstrate what I am asking.
A cube has 6 faces with 4 vertices per face so in total we have 24 vertices that make up the cube. However, we could reduce the total number of vertices to only 8 if we share vertices between faces. As I have been reading this can save a lot of precious GPU memory especially when working with complex models and scenes.
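To make the comparison concrete, here is roughly what the shared layout looks like for a unit cube (a minimal sketch using WebGL-style typed arrays; positions only, and the names are made up for the example):

```typescript
// Shared-vertex layout: 8 unique corner positions, referenced by a 36-entry index buffer.
const cubePositions = new Float32Array([
  -1, -1, -1,   1, -1, -1,   1,  1, -1,  -1,  1, -1, // corners 0-3 (z = -1)
  -1, -1,  1,   1, -1,  1,   1,  1,  1,  -1,  1,  1, // corners 4-7 (z = +1)
]);

// 6 faces * 2 triangles * 3 indices = 36 indices, but still only 8 vertices
// (index/winding order is illustrative only).
const cubeIndices = new Uint16Array([
  0, 1, 2,  0, 2, 3,   4, 5, 6,  4, 6, 7,   0, 1, 5,  0, 5, 4,
  3, 2, 6,  3, 6, 7,   1, 2, 6,  1, 6, 5,   0, 3, 7,  0, 7, 4,
]);

// Without sharing, the same cube needs 6 faces * 4 corners = 24 vertices,
// i.e. 3x the position data, before normals and UVs are even counted.
```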
On the other hand though, I have experienced first-hand some of the drawbacks with sharing vertices between faces. These include:
Complex vertex normal calculations as we must find the 'average' normal for each vertex, taking into account the face normals of each face that said vertex is a part of.
Some vertices must be duplicated anyway to 'match up' with their corresponding UV coordinates.
As a vertex may be shared by many faces, we are not able to specify a different colour per face using per vertex colouring.
The book I have been reading really stresses the importance of vertex sharing to minimise memory usage, but it does not mention any of the downsides. So when I came across these disadvantages myself, I was unsure how viable/helpful vertex sharing really is, and I would like to get your opinions. Is the memory saving produced by vertex sharing really that important?
The disadvantages you named are indeed very real, especially for shapes with lots of sharp edges or different textures. A cube is the worst possible example for vertex sharing: each vertex has 3 different normals and possibly different texture coordinates, so it is essentially impossible to share the vertices.
However, think of some organic shape: a ball, the body of some animal, cars, trees, or even something as simple as a desert. These shapes need a high number of vertices to look like anything decent, but a lot of those vertices are shared between faces, and the shared vertices need the exact same normals, texture coordinates and other attributes in order to look smooth.
Furthermore, the first disadvantage is not really that important. Calculating the averaged vertex normals can be done as a preprocessing step, in most cases even by the modeller; it is basically never done in real time, you simply receive the mesh already in that format. If it does need to be done at runtime you can imagine it becoming an actual issue, and then you need to start thinking about the trade-offs and profile. But even then it can probably be dealt with using geometry shaders; if the visual fidelity is needed this can be a preferable solution.
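As a rough sketch of that preprocessing step (plain TypeScript, assuming flat position and index arrays; not tied to any particular engine):

```typescript
// Accumulate each triangle's face normal onto its three vertices, then normalize.
// The cross products are not normalized first, so larger faces weigh more.
function computeSmoothNormals(positions: Float32Array, indices: Uint32Array): Float32Array {
  const normals = new Float32Array(positions.length);
  for (let i = 0; i < indices.length; i += 3) {
    const [a, b, c] = [indices[i] * 3, indices[i + 1] * 3, indices[i + 2] * 3];
    const ab = [positions[b] - positions[a], positions[b + 1] - positions[a + 1], positions[b + 2] - positions[a + 2]];
    const ac = [positions[c] - positions[a], positions[c + 1] - positions[a + 1], positions[c + 2] - positions[a + 2]];
    // Face normal = ab x ac
    const n = [
      ab[1] * ac[2] - ab[2] * ac[1],
      ab[2] * ac[0] - ab[0] * ac[2],
      ab[0] * ac[1] - ab[1] * ac[0],
    ];
    for (const v of [a, b, c]) {
      normals[v] += n[0]; normals[v + 1] += n[1]; normals[v + 2] += n[2];
    }
  }
  // Normalize the accumulated sums to get the 'averaged' normal per vertex.
  for (let v = 0; v < normals.length; v += 3) {
    const len = Math.hypot(normals[v], normals[v + 1], normals[v + 2]) || 1;
    normals[v] /= len; normals[v + 1] /= len; normals[v + 2] /= len;
  }
  return normals;
}
```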
In conclusion it heavily depends on what you're doing. In some cases vertex sharing isn't really viable because of the reasons you mentioned. Regardless, in many many cases it can potentially save a lot of memory.
I have a large 2D triangle mesh (not a triangle strip) with about 2+ million polygons. Many of these polygons are redundant, which can be determined using one of the attribute variables in the vertex shader. But since discarding a vertex in the vertex shader is not possible, I am exploring the idea of discarding primitives in the geometry shader as an optimization. It is guaranteed that all three vertices of such a primitive will have the same attribute value.
I have a few doubts here
Will this really optimize rendering in terms of speed and GPU memory?
Is this even possible in a geometry shader, and are geometry shaders suitable for this?
But since discarding a vertex in the vertex shader is not possible.
Nonsense. Oh sure, the VS cannot "discard" a triangle, but that doesn't mean it has no power.
Triangles which appear wholly off-screen will generally get minimal processing done on them. So at a minimum, you can have the VS adjust the final gl_Position value to be off screen based on whether the attribute in question has the property you are looking for.
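For illustration, a minimal sketch of that trick, written here as an ES-style GLSL string in a TypeScript constant (the attribute name aVisibility, the uniform uMvp and the threshold are all assumptions for the example; the desktop GLSL version is analogous):

```typescript
// Hypothetical vertex shader: pushes redundant vertices outside the clip volume
// so the rasterizer never sees the triangle.
const cullingVertexShaderSource = `#version 300 es
in vec3 aPosition;
in float aVisibility;   // assumed flag; same value for all 3 vertices of a triangle
uniform mat4 uMvp;

void main() {
  if (aVisibility < 0.5) {
    // x = 2 with w = 1 lies outside -w <= x <= w, so this vertex is clipped away;
    // since all three vertices of the triangle take this branch, the whole
    // triangle ends up outside the clip volume.
    gl_Position = vec4(2.0, 2.0, 2.0, 1.0);
  } else {
    gl_Position = uMvp * vec4(aPosition, 1.0);
  }
}`;
```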
If you have access to OpenGL 4.6, you can get even fancier by employing cull planes, which allow the VS to cull triangles more directly.
Will this really optimize rendering in terms of speed and GPU memory?
Broadly speaking, you should never assume geometry shaders will increase the performance of any algorithm you apply them to. There may be cases where a GS could be used to optimize an algorithm, but absent actual performance data, you should start from the assumption that employing it will only make performance worse.
I am currently learning OpenGL for 3D rendering and I can't quite wrap my head around some things regarding shaders and VBOs. I get that all vertex attributes share one index and that you therefore need to duplicate some data, but when you create more VBOs there are nearly no faces whose vertices share the same position, normal and texture coordinates, so the indices are, at least from my point of view, pretty useless: the index buffer is basically just an array of consecutive numbers.
Is there an aspect of index buffers I don't see?
The utility of index buffers is, as with the utility of all vertex specification features, dependent on your mesh data.
Most of the meshes that get used in high-performance graphics, particularly those with significant polygon density, are smooth. Normals across such meshes are primarily smooth, since the modeller is usually approximating a primarily curved surface. Oh yes, there can be some sharp edges here and there, but for the most part, each position in such models has a single normal.
Texture coordinates usually vary smoothly across meshes too. There are certainly texture coordinate edges; well-optimized UV unwrapping often produces these kinds of things. But if you have a mesh of real size (10K+ vertices), most positions have a single texture coordinate. And tangents/bitangents are based on the changes in texture coordinates, so those will match the texture topology.
Are there meshes where the normal topology is highly disjoint with position topology? Yes. Cubes being the obvious example. But there are oftentimes needs for highly faceted geometry, either to achieve a specific look or for low-polygon uses. In these cases, normal indexed rendering may not be of benefit to you.
But that does not change the fact that these cases are the exception, generally speaking, rather than the rule. Even if your code always involves these cases, that simply isn't true for the majority of high-performance graphics applications out there.
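To make it concrete, this is roughly how an index buffer regains its utility for smooth data: identical (position, normal, uv) combinations get welded into a single vertex. A minimal sketch in plain TypeScript, with the map key simply built from the raw attribute values:

```typescript
// Weld duplicate vertices: any attribute tuple that was already seen is replaced
// by an index referencing its first occurrence.
function weldVertices(flatVertices: number[][]): { vertices: number[][]; indices: number[] } {
  const seen = new Map<string, number>();
  const vertices: number[][] = [];
  const indices: number[] = [];
  for (const v of flatVertices) {
    const key = v.join(',');
    let index = seen.get(key);
    if (index === undefined) {
      index = vertices.length;
      vertices.push(v);
      seen.set(key, index);
    }
    indices.push(index);
  }
  return { vertices, indices };
}

// On a smooth mesh most tuples repeat, so `vertices` shrinks dramatically;
// on a faceted cube-like mesh almost nothing repeats and `indices` stays "0, 1, 2, 3, ...".
```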
In a closed mesh, every vertex is shared by at least two faces. (The only time a vertex will be used fewer than three times is in a double-sided mesh, where two faces have the same vertices but opposite normals and winding order.) Not using indices, and just duplicating vertices, is not only inefficient, but, at minimum, doubles the amount of vertex data required.
There's also potential for cache thrashing that could otherwise be avoided, related pipeline stalls and other insanity.
Indices are your friend. Get to know them.
Update
Typically, normals, etc. are stored in a normal map, or interpolated between vertices.
If you just want a faceted or "flat shaded" render, use the cross product of dFdx() and dFdy() of the interpolated position (or, in HLSL, ddx() and ddy()) to generate the per-pixel normal in your fragment shader. Duplicating data is bad, and only necessary under very special and unusual circumstances. Nothing you've mentioned leads me to believe that this is necessary for your use case.
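For reference, a minimal sketch of that trick as a GLSL ES fragment shader string (the varying name vWorldPos and the light direction are assumptions for the example):

```typescript
// Hypothetical fragment shader: derives a per-pixel face normal from screen-space
// derivatives of the interpolated position, so no per-vertex normals are needed.
const flatShadedFragmentSource = `#version 300 es
precision highp float;
in vec3 vWorldPos;   // assumed: world-space position passed from the vertex shader
out vec4 fragColor;

void main() {
  // dFdx/dFdy give the change in position across one pixel; their cross product
  // is the (un-normalized) face normal of the triangle being rasterized.
  vec3 n = normalize(cross(dFdx(vWorldPos), dFdy(vWorldPos)));
  float light = max(dot(n, normalize(vec3(0.5, 1.0, 0.3))), 0.0);
  fragColor = vec4(vec3(light), 1.0);
}`;
```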
The problem that I have is that there are some pixels in my rendered scene that seem to be missing/invisible and therefore have the same color as my clear color. Interestingly, this only happens if MSAA is turned off.
My first thought was that it could have something to do with the fact that all the triangles are overlapping and somehow get distorted by the projection matrix, but these artifacts only seem to occur along lines rather than at the edges.
In another question I read about just applying a scale of 1.00001 to everything, but that seems like a cheap hack to me that could cause other problems. Although these artifacts seem to be reduced when using hardware multisampling, I want to know if there is any other way to solve this.
Edit:
A way to solve this was suggested by Nicol Bolas (his answer is quoted in full below): the rasterizer only guarantees gapless edges when the two adjacent triangles have identical vertices on their shared edge, so the longer triangles have to be split.
As stated, a possible solution is to split the big triangles into small ones to ensure that all overlapping vertices are identical (i.e. abolishing the greedy meshing). But in my case I want to keep the greedy meshing due to performance aspects.
OpenGL (and all other hardware rasterizers) only guarantees gapless rendering of the edge between two triangles if the edges exactly match. And that means you can't just have one triangle next to the edge of another. The two triangles must have identical vertices on the shared edge between them.
So if you have a long triangle next to a short triangle, what you have to do is split the long triangle into several triangles, so that the shared portion of the edge is properly shared between two triangles.
Since you seem to be building a world made of cubes, this should be fairly easy for you.
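A minimal sketch of that split in plain TypeScript (the names are made up; p is the neighbouring vertex, assumed to lie on the edge a-b of the long triangle):

```typescript
type Vec3 = [number, number, number];

// Split triangle (a, b, c) at point p, which lies on the edge a-b, so that the
// short neighbouring triangle and the long triangle now share identical edge vertices.
function splitTriangleAtEdgePoint(a: Vec3, b: Vec3, c: Vec3, p: Vec3): [Vec3, Vec3, Vec3][] {
  return [
    [a, p, c], // first half, keeps the original winding
    [p, b, c], // second half
  ];
}
```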
The other thing you have to do is make certain that the two shared vertices between the two triangles are binary identical. The gl_Position output from the vertex shader needs to be the exact same value. So if you're computing the position of the cube's vertices in the VS, you need to do that in a way that will guarantee binary identical results.
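One common way to get that guarantee in a cube world (sketched here on the CPU side in TypeScript; the function and parameter names are made up) is to derive every corner from integer grid coordinates through the exact same expression, so two triangles that share a corner feed bit-identical floats into the one shared MVP multiply:

```typescript
// Build cube-corner positions from global integer grid coordinates. Any two faces
// that reference the same corner run the identical arithmetic, so the resulting
// floats (and thus gl_Position) are binary identical.
function cornerPosition(ix: number, iy: number, iz: number, cellSize: number): [number, number, number] {
  return [ix * cellSize, iy * cellSize, iz * cellSize];
}

// By contrast, computing the same corner as chunkOrigin + localOffset in one place
// and as neighbourOrigin + otherOffset in another can round differently and reopen the gaps.
```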
But in my case I want to keep the greedy meshing due to performance aspects.
Then you need to decide which is more important: performance or correctness. Because there's no way to force the rasterizer to render such edges without gaps. It's a matter of floating-point precision and such, which will always be different on different hardware.
It's called "stitching". The rasterizer isn't matching the triangles exactly and some pixels are missing. Others have explained how to set up the triangles so OpenGL knows they share an edge. However, as your model-building logic gets more and more complicated, stitching sometimes becomes hard to avoid. A workaround is to clear to black or dark grey, so the stitches are then nearly invisible.
This is caused by two polygons that touch along an edge but do not match exactly, most likely due to round-off error in the floating-point units used to evaluate the edge. In theory there should also be places where the edges overlap, but that is far less noticeable. Setting a really small magnification factor, or just moving the polygons a fraction towards each other, should resolve it. The reason it only occurs when MSAA is turned off is that multisampling generally helps hide this type of issue: the coverage is effectively computed at a higher resolution and then downscaled, so although the issue still occurs, the missing pixels are blended into the surrounding pixels and become far less noticeable.
In this tutorial for OpenGL ES, techniques for optimizing models are explained and one of those is to use triangle strips to define your mesh, using "degenerate" triangles to end one strip and begin another without ending the primitive. http://www.learnopengles.com/tag/degenerate-triangles/
However, this guide is very specific to mobile platforms, and I wanted to know if this technique holds for modern desktop hardware. Specifically, would it hurt? Would it either cause graphical artifacts or degrade performance (as opposed to splitting the strips into separate primitives)?
If it causes no artifacts and performs at least as well, I aim to use it solely because it makes organizing vertices in a certain mesh I want to draw easier.
Degenerate triangles work pretty well on all platforms. I'm aware of an old fixed-function console that struggled with degenerate triangles, but anything vaguely modern will be fine. Reducing the number of draw calls is always good and I would certainly use degenerates rather than multiple calls to glDrawArrays.
However, an alternative that usually performs better is indexed draws of triangle lists. With a triangle list you have a lot of flexibility to reorder the triangles to take maximum advantage of the post-transform cache. The post-transform cache is a hardware cache of the last few vertices that went through the vertex shader; the GPU can spot that you've re-issued the same vertex and skip the entire vertex shader for that vertex.
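For reference, joining strips with degenerates is just a matter of repeating indices when building the index list; a minimal sketch in TypeScript (not tied to any particular API):

```typescript
// Join multiple triangle strips into one by repeating the last index of the
// previous strip and the first index of the next one. The repeated indices form
// zero-area ("degenerate") triangles that the GPU rasterizes as nothing.
// (Depending on strip lengths you may need one extra repeated index to keep the
// winding order consistent.)
function joinStripsWithDegenerates(strips: number[][]): number[] {
  const out: number[] = [];
  for (const strip of strips) {
    if (out.length > 0) {
      out.push(out[out.length - 1], strip[0]); // the degenerate "bridge"
    }
    out.push(...strip);
  }
  return out;
}
```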
In addition to the above answers (no, it shouldn't hurt at all unless you do something mad in terms of the ratio of real triangles to degenerates), also note that newer versions of the OpenGL and OpenGL ES APIs (3.x or higher) support a means of inserting breaks into index lists without needing an actual degenerate triangle, which is called primitive restart.
https://www.khronos.org/opengles/sdk/docs/man3/html/glEnable.xhtml
When enabled, you encode the maximum value for the index type (e.g. 0xFFFF for unsigned short indices), and when the GPU detects that value it restarts a new triangle strip from the next index value.
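In WebGL 2 / OpenGL ES 3.0, restart with the all-bits-set index is always active for indexed draws (desktop GL requires enabling GL_PRIMITIVE_RESTART or GL_PRIMITIVE_RESTART_FIXED_INDEX first), so the same strips can be concatenated like this (a minimal sketch):

```typescript
// Restart marker for Uint16 indices: all bits set.
const RESTART_U16 = 0xffff;

// Concatenate strips into one index list, separated by the restart index.
function joinStripsWithRestart(strips: number[][]): Uint16Array {
  const out: number[] = [];
  for (const strip of strips) {
    if (out.length > 0) out.push(RESTART_U16); // break the strip here
    out.push(...strip);
  }
  return new Uint16Array(out);
}

// The whole thing is then drawn with a single call, e.g.
// gl.drawElements(gl.TRIANGLE_STRIP, indexCount, gl.UNSIGNED_SHORT, 0);
```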
It will not cause artifacts. As to "degrading performance"... relative to what? Relative to a random assortment of triangles with no indexing? Yes, it will be faster than that.
But there are plenty of other things one can do. For example, primitive restarting, which removes the need for degenerate triangles. Then there's using ordered lists of triangles for improved cache coherency. Will triangle strips be faster than that?
It rather depends on what you're rendering, how expensive your vertex shaders are, and various other things.
But at the end of the day, if you care about maximum performance on particular platforms, then you should profile for each platform and pick the vertex data based on what platform you're running on. If performance is really that important to you, then you're going to have to put forth some effort.
I know in DirectX 11 you can use the awesome tessellation feature for LODs, but knowing that DirectX 9 doesn't have this feature, how would I go about creating LODs for the models in my 3D application/game to speed it up?
I heard back in the old days before DirectX 10 or 11 came out, people used to create many of the same type of models but just with different polycounts (i.e: one with very low polycount for far away objects and one with a high polycount for very near objects).
But doing this would mean doubling or even tripling the size of models in the game, right? Are there any other approaches to achieving LODs in DirectX 9? Or is this really the best solution when it comes to DirectX 9? Can someone at least point me in the right direction for this issue so I can go away and do more research about it?
Thanks
Generating multiple LOD meshes using mesh simplification algorithms (or by hand) might not be as bad as you think in terms of memory consumption. As with mipmaps, since your simplified meshes have far fewer vertices, they shouldn't triple the size of your in-game models. And you don't have to keep the high-resolution meshes in video memory if you're not going to be using them for a while.
An alternative to save memory is to simplify meshes by discarding vertices only. This way, you can use a single vertex buffer and have different index buffers for each LOD. You might get slightly lower quality LOD meshes, but the memory overhead of keeping them all in memory will be much smaller.
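A rough sketch of that layout, shown here in WebGL terms since the idea is API-agnostic (in D3D9 the analogous calls are SetStreamSource, SetIndices and DrawIndexedPrimitive; the names below are made up and gl is assumed to be an existing WebGL2RenderingContext):

```typescript
// One shared vertex buffer, one index buffer per LOD. Each LOD's indices reference
// only the subset of vertices it keeps, so the vertex data is never duplicated.
function createLodIndexBuffers(gl: WebGL2RenderingContext, lodIndices: Uint16Array[]): WebGLBuffer[] {
  return lodIndices.map((indices) => {
    const buffer = gl.createBuffer()!;
    gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, buffer);
    gl.bufferData(gl.ELEMENT_ARRAY_BUFFER, indices, gl.STATIC_DRAW);
    return buffer;
  });
}

// At draw time: bind the shared vertex buffer once, bind the index buffer for the
// LOD chosen by distance, and issue a drawElements call with that LOD's index count.
```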
If I'm not mistaken, tessellation is for subdivision, so it wouldn't help you anyway if you want a coarser mesh (though it can probably help interpolate between LODs).