The problem I have is that there are some pixels in my rendered scene
that seem to be missing/invisible and therefore have the same color as my clear
color. Interestingly, this only happens if MSAA is turned off.
My first thought was that it could have something to do with the fact that all the triangles are overlapping and somehow distorted by the projection matrix, but these artifacts only seem to occur along lines rather than edges.
In another question I read about just applying a scale of 1.00001 to everything, but that seems like a cheap hack to me that could cause other problems. Although these artifacts seem to be reduced when using hardware multisampling, I want to know if there is any other way to solve this.
Edit:
A way to solve this was given by Nicol Bolas in the answer below.
As stated there, a possible solution is to split the big triangles into smaller ones to ensure that all overlapping vertices are identical (i.e. abolishing the greedy meshing). But in my case I want to keep the greedy meshing due to performance aspects.
OpenGL (and all other hardware rasterizers) only guarantees gapless rendering of the edge between two triangles if the edges exactly match. And that means you can't just have one triangle next to the edge of another. The two triangles must have identical vertices on the shared edge between them.
So if you have a long triangle next to a short triangle, what you have to do is split the long triangle into several triangles, so that the shared portion of the edge is properly shared between two triangles.
Since you seem to be building a world made of cubes, this should be fairly easy for you.
The other thing you have to do is make certain that the two shared vertices between the two triangles are binary identical. The gl_Position output from the vertex shader needs to be the exact same value. So if you're computing the position of the cube's vertices in the VS, you need to do that in a way that will guarantee binary identical results.
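Since the scene here is a world of cubes on an integer grid, a minimal sketch of that splitting step might look like the following (the types and names are illustrative, not from the answer); keeping positions in integer grid units until a single shared transform in the vertex shader is one way to get the binary-identical results described above.

```cpp
// Hypothetical helper: split the +x edge of a greedy-meshed quad at the grid
// positions where neighbouring quads have corners, so every shared edge
// section gets identical (integer, hence binary-identical) vertices.
#include <cstdint>
#include <set>
#include <vector>

struct GridVertex { int32_t x, y, z; };   // integer grid coordinates

std::vector<GridVertex> splitEdgeAlongX(GridVertex start, int32_t length,
                                        const std::set<int32_t>& neighbourCuts)
{
    std::vector<GridVertex> verts;
    verts.push_back(start);
    for (int32_t cut : neighbourCuts)                 // cuts are x offsets from 'start'
        if (cut > 0 && cut < length)
            verts.push_back({start.x + cut, start.y, start.z});
    verts.push_back({start.x + length, start.y, start.z});
    return verts;   // triangulate the quad against these edge vertices
}
```

Because the inserted vertices use exactly the neighbouring quads' integer coordinates, both sides of each shared edge section feed identical values through the same vertex shader and produce the same gl_Position.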
"But in my case I want to keep the greedy meshing due to performance aspects."
Then you need to decide which is more important: performance, or correctness. Because there's no way to force the rasterizer to render such edges without gaps. It's a matter of floating-point precision and such, which will always be different on different hardware.
It's called "stitching". The rasterizer isn't matching the triangles exactly and some pixels are missing. Others have explained how to set up the triangles so OpenGL knows they share an edge. However as your model-building logic gets more and more complicated sometimes stitching is hard to avoid. The workaround is to clear to black or dark grey, and the stitches are then nearly invisible.
This is due to the two polygons touching along their edges but with some small error (most likely round-off error in the floating-point math used to evaluate the edges). In theory there should also be times when the edges overlap, but that is far less noticeable. Applying a really small magnification factor, or just moving the polygons a fraction towards each other, should resolve it. The reason it only occurs when MSAA is turned off is that multisampling generally helps hide this type of issue: the coverage calculations effectively run at a higher resolution and are then downscaled, so the missing pixels are blended into the surrounding pixels and disappear. The issue actually does still occur; it's just hidden by the downscaling.
Related
I am currently learning OpenGL for 3D rendering and I can't quite wrap my head around some things regarding shaders and VBOs. I get that all VBOs share one index and that you therefore need to duplicate some data,
but when you create more VBOs there are nearly no faces whose vertices share the same position, normal, and texture coordinates, so the indices are, at least from my point of view, pretty useless: the index buffer is basically just an array of consecutive numbers.
Is there an aspect of index buffers I don't see?
The utility of index buffers is, as with the utility of all vertex specification features, dependent on your mesh data.
Most of the meshes that get used in high-performance graphics, particularly those with significant polygon density, are smooth. Normals across such meshes are primarily smooth, since the modeller is usually approximating a primarily curved surface. Oh yes, there can be some sharp edges here and there, but for the most part, each position in such models has a single normal.
Texture coordinates usually vary smoothly across meshes too. There are certainly texture coordinate edges; well-optimized UV unwrapping often produces these kinds of things. But if you have a mesh of real size (10K+ vertices), most positions have a single texture coordinate. And tangents/bitangents are based on the changes in texture coordinates, so those will match the texture topology.
Are there meshes where the normal topology is highly disjoint with position topology? Yes. Cubes being the obvious example. But there are oftentimes needs for highly faceted geometry, either to achieve a specific look or for low-polygon uses. In these cases, normal indexed rendering may not be of benefit to you.
But that does not change the fact that these cases are the exception, generally speaking, rather than the rule. Even if your code always involves these cases, that simply isn't true for the majority of high-performance graphics applications out there.
In a closed mesh, every vertex is shared by at least three faces. (The only time a vertex will be used fewer than three times is in a double-sided mesh, where two faces have the same vertices but opposite normals and winding order.) Not using indices and just duplicating vertices is not only inefficient; at minimum, it doubles the amount of vertex data required.
There's also the potential for cache thrashing that could otherwise be avoided, along with the related pipeline stalls and other insanity.
Indices are your friend. Get to know them.
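To make that concrete, here is a minimal sketch (the buffer handles and setup are my own, not from the answer): two triangles forming a quad share an edge, and the index buffer references the two shared vertices instead of duplicating them, which is also what lets the post-transform cache reuse their vertex-shader results.

```cpp
// Illustrative indexed draw: 4 vertices, 6 indices, 2 triangles.
// Assumes a GL loader (e.g. glad) and a current context already exist.
#include <glad/glad.h>

static const float positions[] = {
    -1.f, -1.f, 0.f,   // 0
     1.f, -1.f, 0.f,   // 1
     1.f,  1.f, 0.f,   // 2
    -1.f,  1.f, 0.f,   // 3
};
static const unsigned short indices[] = { 0, 1, 2,   2, 3, 0 };  // 0 and 2 are shared

void drawIndexedQuad(GLuint vao, GLuint vbo, GLuint ibo)
{
    glBindVertexArray(vao);

    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, sizeof(positions), positions, GL_STATIC_DRAW);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, nullptr);
    glEnableVertexAttribArray(0);

    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(indices), indices, GL_STATIC_DRAW);

    // Six indices, but only four unique vertices run through the vertex shader.
    glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, nullptr);
}
```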
Update
Typically, normals, etc. are stored in a normal map, or interpolated between vertices.
If you just want a faceted or "flat shaded" render, use the cross product of dFdx() and dFdy() (or, in HLSL, ddx() and ddy()) to generate the per-pixel normal in your fragment shader. Duplicating data is bad, and only necessary under very special and unusual circumstances. Nothing you've mentioned leads me to believe that this is necessary for your use case.
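As a sketch of that trick in GLSL (held here in a C++ string literal; the varying name vWorldPos is my own assumption, and you may need to flip the sign depending on your winding/axis conventions):

```cpp
// GLSL fragment shader source, stored as a C++ raw string literal.
// Assumes the vertex shader passes the world-space position as vWorldPos.
const char* flatShadedFragmentSrc = R"GLSL(
#version 330 core
in vec3 vWorldPos;          // interpolated world-space position
out vec4 fragColor;

void main()
{
    // dFdx/dFdy give the change of vWorldPos across one pixel in x and y;
    // both derivative vectors lie in the triangle's plane, so their cross
    // product is the (unnormalised) face normal.
    vec3 n = normalize(cross(dFdx(vWorldPos), dFdy(vWorldPos)));

    // Simple visualisation: map the normal into [0,1] as a colour.
    fragColor = vec4(n * 0.5 + 0.5, 1.0);
}
)GLSL";
```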
In this tutorial for OpenGL ES, techniques for optimizing models are explained and one of those is to use triangle strips to define your mesh, using "degenerate" triangles to end one strip and begin another without ending the primitive. http://www.learnopengles.com/tag/degenerate-triangles/
However, this guide is very specific to mobile platforms, and I wanted to know if this technique holds for modern desktop hardware. Specifically, would it hurt? Would it either cause graphical artifacts or degrade performance (as opposed to splitting the strips into separate primitives)?
If it causes no artifacts and performs at least as well, I aim to use it solely because it makes organizing vertices in a certain mesh I want to draw easier.
Degenerate triangles work pretty well on all platforms. I'm aware of an old fixed-function console that struggled with degenerate triangles, but anything vaguely modern will be fine. Reducing the number of draw calls is always good and I would certainly use degenerates rather than multiple calls to glDrawArrays.
However, an alternative that usually performs better is indexed draws of triangle lists. With a triangle list you have a lot of flexibility to reorder the triangles to take maximum advantage of the post-transform cache. The post-transform cache is a hardware cache of the last few vertices that went through the vertex shader; the GPU can spot if you've re-issued the same vertex and skip the entire vertex shader for that vertex.
In addition to the above answers (no, it shouldn't hurt at all unless you do something mad in terms of the ratio of real triangles to degenerates), also note that newer versions of the OpenGL and OpenGL ES APIs (3.x or higher) support a means to insert breaks into index lists without needing an actual degenerate triangle; this is called primitive restart.
https://www.khronos.org/opengles/sdk/docs/man3/html/glEnable.xhtml
When enabled, you encode the maximum value for the index type (e.g. 0xFFFF for 16-bit indices) in the index list, and when the GPU detects it, it restarts a new triangle strip from the next index value.
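A hedged sketch of what that looks like in code (desktop GL 4.3 / ES 3.0 style; buffer, VAO and loader setup are assumed to exist already):

```cpp
// Two triangle strips submitted in one indexed draw call, separated by the
// fixed restart index instead of degenerate triangles.
#include <glad/glad.h>
#include <cstdint>

void drawTwoStrips(GLuint vao, GLuint indexBuffer)
{
    // With 16-bit indices the restart marker is the largest representable value.
    const uint16_t RESTART = 0xFFFF;
    const uint16_t indices[] = {
        0, 1, 2, 3,        // first strip
        RESTART,           // break: the next index starts a new strip
        4, 5, 6, 7,        // second strip
    };

    glBindVertexArray(vao);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexBuffer);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(indices), indices, GL_STATIC_DRAW);

    glEnable(GL_PRIMITIVE_RESTART_FIXED_INDEX);
    glDrawElements(GL_TRIANGLE_STRIP, 9, GL_UNSIGNED_SHORT, nullptr);
}
```

While the mode is enabled, the restart value is always treated as a break, so reserve it and never use it as a real vertex index.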
It will not cause artifacts. As to "degrading performance"... relative to what? Relative to a random assortment of triangles with no indexing? Yes, it will be faster than that.
But there are plenty of other things one can do. For example, primitive restarting, which removes the need for degenerate triangles. Then there's using ordered lists of triangles for improved cache coherency. Will triangle strips be faster than that?
It rather depends on what you're rendering, how expensive your vertex shaders are, and various other things.
But at the end of the day, if you care about maximum performance on particular platforms, then you should profile for each platform and pick the vertex data based on what platform you're running on. If performance is really that important to you, then you're going to have to put forth some effort.
Is there a "standard" method for 3d picking? What do most game companies do? (for accurate picking)
I thought the fastest way is to use the GPU and render every object with a "color index", and then use glReadPixels(), but then I heard that it's considered slow because of the glFlush()/glFinish() calls.
There's also this ray casting approach, which is nice but isn't accurate because of the spheres/AABBs approximations.
Any question about what is "standard" is probably going to invoke some opinionated responses, but I would suggest that the closest to "standard" here is raycasting.
Take your watertight ray/triangle intersection function and test a ray that is unprojected from your mouse cursor position against the triangles in your scene.
Normally this would be quite slow, since it requires a linear scan over all the triangles. So the next step is to accelerate it to something better, like logarithmic time. This is typically achieved with a data structure such as an octree, BVH, k-d tree, or BSP. Sometimes people skip this step and just try to make the ray/tri intersection really fast and really parallel, possibly even using GPGPU.
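For the unprojection step mentioned above, a minimal sketch assuming GLM (the function and parameter names are mine): unproject the cursor at the near and far planes and take the direction between the two points.

```cpp
// Build a world-space picking ray from a mouse position, assuming GLM and a
// viewport whose origin is at (0, 0).
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>   // glm::unProject

struct PickRay { glm::vec3 origin; glm::vec3 dir; };

PickRay mouseRay(float mouseX, float mouseY,
                 const glm::mat4& view, const glm::mat4& proj,
                 const glm::vec4& viewport)            // (x, y, width, height)
{
    // Window coordinates are usually top-down; unProject expects bottom-up.
    float winY = viewport.w - mouseY;

    glm::vec3 nearPt = glm::unProject(glm::vec3(mouseX, winY, 0.0f), view, proj, viewport);
    glm::vec3 farPt  = glm::unProject(glm::vec3(mouseX, winY, 1.0f), view, proj, viewport);

    return { nearPt, glm::normalize(farPt - nearPt) };
}
```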
It takes a lot more work upfront than framebuffer-based solutions, but complex applications tend to go this route probably because:
Portability: it's decoupled from the rendering engine. It doesn't have to be tied to OpenGL or DirectX, for example, and that improves portability.
Generality: typically the accelerator and associated queries are needed for other things. For example, an FPS game might have players and enemies constantly shooting at each other. Figuring out what projectiles hit what tends to require these kinds of intersection queries occurring constantly, and not just from a uniform viewing angle.
Simplicity: the developers can afford the extra work upfront to simplify things later on.
"There's also this ray casting approach, which is nice but isn't accurate because of the spheres/AABBs approximations."
There should be nothing inaccurate about using AABBs or bounding spheres for acceleration purposes. They are purely there to accelerate the tests: cheaper checks that eliminate large batches of triangles at once quickly reduce the number of the more costly ray/triangle intersections that need to occur. Normally they should be constructed to encompass the elements in the scene. If you do a ray/AABB intersection first, for example, and it hits, you then test the elements encompassed within the AABB. Any acceleration structure that doesn't give the same results as you would get without the accelerator is typically a buggy one.
For example, a very basic form of acceleration is just to put a bounding box around one mesh element in the scene, like a character; sometimes this basic form, without involving a full-blown accelerator, is useful for very dynamic elements (to avoid the cost of constantly updating the accelerator). If the ray intersects the character's bounding box, then check all the triangles making up the character. As long as you check the triangles within the AABB afterwards, it's acceleration rather than approximation. Of course, if you only checked the AABB and nothing else, it would be a crude approximation.
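To illustrate the "acceleration, not approximation" point, here is a sketch of the standard slab-method ray/AABB test (again assuming GLM; a hit only means "now test the triangles inside this box"):

```cpp
// Slab-method ray/AABB intersection. A positive result never changes the
// picking outcome by itself; it only tells you which triangles to test next.
#include <algorithm>
#include <limits>
#include <glm/glm.hpp>

bool rayHitsAABB(const glm::vec3& origin, const glm::vec3& dir,   // dir need not be normalised
                 const glm::vec3& boxMin, const glm::vec3& boxMax)
{
    float tMin = 0.0f;                                  // nearest allowed hit
    float tMax = std::numeric_limits<float>::max();     // farthest allowed hit

    for (int axis = 0; axis < 3; ++axis) {
        float invD = 1.0f / dir[axis];                  // IEEE infinities cover axis-parallel rays
        float t0 = (boxMin[axis] - origin[axis]) * invD;
        float t1 = (boxMax[axis] - origin[axis]) * invD;
        if (invD < 0.0f) std::swap(t0, t1);
        tMin = std::max(tMin, t0);
        tMax = std::min(tMax, t1);
        if (tMax < tMin) return false;                  // slab intervals don't overlap: miss
    }
    return true;                                        // overlap on all three axes: hit
}
```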
I am currently working on a WebGL project, although I imagine this question is generic across many graphics APIs.
Let me use the example of a simple cube to demonstrate what I am asking.
A cube has 6 faces with 4 vertices per face, so in total we have 24 vertices that make up the cube. However, we could reduce the total number of vertices to only 8 if we share vertices between faces. As I have been reading, this can save a lot of precious GPU memory, especially when working with complex models and scenes.
On the other hand though, I have experienced first-hand some of the drawbacks with sharing vertices between faces. These include:
Complex vertex normal calculations as we must find the 'average' normal for each vertex, taking into account the face normals of each face that said vertex is a part of.
Some vertices must be duplicated anyway to 'match up' with their corresponding UV coordinates.
As a vertex may be shared by many faces, we are not able to specify a different colour per face using per vertex colouring.
The book I have been reading really stresses the importance of vertex sharing to minimise memory usage, so when I came across some of the disadvantages of vertex sharing I was unsure how viable/helpful it really is, and as the author did not mention any of the downsides I would like to get your opinions. So is the memory saving produced by vertex sharing really that important?
The disadvantages you named are indeed very real, especially for shapes with lots of sharp edges or different textures. A cube is the worst possible example for vertex sharing: each vertex has 3 different normals and possibly different texture coordinates, so it is essentially impossible to share the vertices.
However, think of some organic shape: a ball, the body of some animal, cars, trees, or even something as simple as a desert. These shapes need a high number of vertices to look like anything decent, but a lot of those vertices are shared between faces; they need the exact same normals, texture coordinates and so on in order to look smooth.
Furthermore, the first disadvantage is not really that important. Calculating the vertex normals can be done in preprocessing, in most cases even by the modeller; it is basically never done in real time, since you simply already have the data in this format. If it does need to be done in real time, you can imagine it becoming an actual issue, and you need to start thinking about the trade-offs and profile. Even then it can probably be dealt with using geometry shaders; if the visual fidelity is needed, this can be a preferable solution.
In conclusion it heavily depends on what you're doing. In some cases vertex sharing isn't really viable because of the reasons you mentioned. Regardless, in many many cases it can potentially save a lot of memory.
I'm trying to implement effective fluid solver on the GPU using WebGL and GLSL shader programming.
I've found an interesting article:
http://http.developer.nvidia.com/GPUGems/gpugems_ch38.html
See: 38.3.2 Slab Operations
I'm wondering if this technique of enforcing boundary conditions is possible with ping-pong rendering?
If I render only lines, what about the interior of the texture?
I've always assumed that the whole input texture must be copied to the temporary texture (of course the boundary is updated during that process), since the two are swapped after that operation.
This is interesting especially considering that Example 38-5, The Boundary Condition Fragment Program (visualization: http://i.stack.imgur.com/M4Hih.jpg), shows a scheme that IMHO requires the ping-pong technique.
What do you think? Do I misunderstand something?
Generally I've found that texture writes are extremely costly, and that's why I would like to limit them somehow. Unfortunately, the ping-pong technique requires a lot of texture writes.
I've actually implemented the technique described in that chapter using FrameBuffer objects as the render to texture method (but in desktop OpenGL since WebGL didn't exist at the time), so it's definitely possible. Unfortunately I don't believe I have the code any more, but if you tag any future questions you have with [webgl] I'll see if I can provide some help.
You will need to ping-pong several times per frame (the article mentions five steps, but I seem to recall the exact number depends on the quality of the simulation you want and on your exact boundary conditions). Using FBOs is quite a bit more efficient than it was when this was written (the author mentions using a GeForce FX 5950, which was a while ago), so I wouldn't worry about the overhead he mentions in the article. As long as you aren't bringing data back to the CPU, you shouldn't find too high a cost for switching between texture and the framebuffer.
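A minimal sketch of the ping-pong arrangement in desktop OpenGL (the handles and names here are my own; texture/FBO creation, shaders, and vertex setup are assumed to exist elsewhere):

```cpp
// Two textures, each attached to its own FBO. Every slab operation samples
// the "read" texture, renders into the "write" texture, and then the roles
// are swapped for the next operation.
#include <glad/glad.h>
#include <utility>

struct PingPong {
    GLuint fbo[2];
    GLuint tex[2];
    int read = 0, write = 1;
};

// One draw into the current write target while sampling the current read texture.
void drawIntoWriteTarget(PingPong& pp, GLuint program, GLuint vao,
                         GLenum primitive, GLsizei vertexCount, int width, int height)
{
    glBindFramebuffer(GL_FRAMEBUFFER, pp.fbo[pp.write]);
    glViewport(0, 0, width, height);

    glUseProgram(program);
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, pp.tex[pp.read]);

    glBindVertexArray(vao);
    glDrawArrays(primitive, 0, vertexCount);
}

// One slab operation: the interior update (a quad) and the boundary update
// (line primitives) both write to the same target, then the textures swap.
void slabOperation(PingPong& pp, GLuint interiorProg, GLuint quadVao,
                   GLuint boundaryProg, GLuint lineVao, int w, int h)
{
    drawIntoWriteTarget(pp, interiorProg, quadVao, GL_TRIANGLES, 6, w, h);
    drawIntoWriteTarget(pp, boundaryProg, lineVao, GL_LINES, 8, w, h);  // 4 border lines
    std::swap(pp.read, pp.write);
}
```

This is only one way to arrange it; the key point is that each slab operation samples one texture and writes the other, and the boundary lines are just another draw into the current write target.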
You will have some leakage if your boundaries are only a pixel thick, but that may or may not be acceptable depending on how you render your results and the velocity of your fluid. Making the boundaries thicker may help, and there are papers that have been written since this one that explore different ways of confining the fluid within boundaries (I also recall a few on more efficient diffusion/pressure solvers that you might check out after you have this version working...you'll find some interesting follow ups if you search for papers that cite the GPU gems article on google scholar).
Addendum: I'm not sure I entirely understand your question about boundaries. The key is that you must run a shader at each pixel of what you want to be a boundary, but it doesn't really matter how that pixel gets there, whether it's drawn with lines, points, or triangles (as long as its inputs are correct).
In the very general case (which might not apply if you only have a limited number of boundary primitives), you will likely have to draw a framebuffer-covering quad, since the interactions with the velocity and pressure fields are more complicated (any surrounding pixel could be another boundary pixel, instead of having simply defined edges). See section 38.5.4 (Arbitrary Boundaries) for some explanation of how to do it. If something isn't a boundary, you won't touch the vector field, and if it is, instead of hardcoding which directions you want to look in to sum vector values, you'll probably end up testing the surrounding pixels and only summing the ones that aren't boundaries so that you can enforce the boundary conditions.