Hierarchical Z-Buffering for occlusion culling - opengl

I'm reading the Occlusion Culling section in Real-Time Rendering 3rd Edition and I couldn't understand how it works. Some questions:
How does having a "Z-pyramid" contribute? Why do we need multiple resolutions of the Z-buffer? In the book, it's displayed as follows (left side):
Is the Octree structure the same Octree that is used for general frustum culling and rendering? Or is it a specialized Octree made just for the occlusion culling technique?
A more general question: In a previous section (and also here), the Occlusion Query term is described as "rendering a simplified bounding-volume of an object and comparing its depth results to the Z-buffer, returning the amount of pixels that are visible." What functions in OpenGL are associated with this Occlusion Query concept?
Is this technique the standard for open-world games occlusion culling?

A hierarchical Z-buffer is useful in situations where large nearby objects are likely to occlude a lot of small, farther objects. Examples would be rendering the inside of a building or a mountainous landscape.
When the nearby object is rendered, it sets a texel on a lower-resolution level of the Z-buffer pyramid to a near depth value. When a smaller, farther object is about to be rendered, its bounding box can first be checked against that texel, and the object can be culled in its entirety.
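To make the pyramid idea concrete, here is a minimal CPU-side sketch in Python. It is not how any particular GPU or engine implements it; it assumes a square, power-of-two depth buffer where smaller values are nearer, and the names `build_z_pyramid` and `is_occluded` are made up for illustration. Each coarser texel keeps the farthest depth of the 2x2 block below it, so a box whose nearest depth lies behind that value is guaranteed to be hidden.

```python
def build_z_pyramid(depth):
    """Return [full-res, half-res, ..., 1x1] for a square power-of-two buffer.

    Each coarse texel stores the FARTHEST (largest) depth of its 2x2 block,
    which keeps the occlusion test conservative.
    """
    levels = [depth]
    while len(levels[-1]) > 1:
        fine = levels[-1]
        half = len(fine) // 2
        coarse = [[max(fine[2 * y][2 * x], fine[2 * y][2 * x + 1],
                       fine[2 * y + 1][2 * x], fine[2 * y + 1][2 * x + 1])
                   for x in range(half)]
                  for y in range(half)]
        levels.append(coarse)
    return levels


def is_occluded(levels, x0, y0, x1, y1, nearest_depth):
    """Test a screen rectangle [x0, x1) x [y0, y1), given in full-res pixels.

    Picks the level where the rectangle covers at most 2x2 texels, then
    compares the box's nearest depth with the farthest depth stored there.
    """
    size = max(x1 - x0, y1 - y0)
    level = min(len(levels) - 1, (size - 1).bit_length())  # ceil(log2(size))
    tx0, ty0 = x0 >> level, y0 >> level
    tx1, ty1 = (x1 - 1) >> level, (y1 - 1) >> level
    farthest = max(levels[level][ty][tx]
                   for ty in range(ty0, ty1 + 1)
                   for tx in range(tx0, tx1 + 1))
    return nearest_depth > farthest


# An 8x8 buffer cleared to the far plane (1.0) with a near wall (0.2) on the left half.
depth = [[0.2 if x < 4 else 1.0 for x in range(8)] for y in range(8)]
pyramid = build_z_pyramid(depth)
behind_wall = is_occluded(pyramid, 1, 1, 3, 3, nearest_depth=0.6)
beside_wall = is_occluded(pyramid, 5, 1, 7, 3, nearest_depth=0.6)
```

The point of the coarse levels is that the test for a bounding box touches only a handful of texels instead of every pixel the box covers.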
Yes, it's the same octree. But it doesn't have to be an octree: any hierarchical spatial indexing data structure would work for both hierarchical Z-buffering and frustum culling.
In order to benefit from the hierarchical Z-buffer we would need to render the scenery starting with the nearest objects. Techniques like octrees or BSPs can be used for that.
Additionally, having an octree at hand lets us cull entire tree branches based on the distance to their bbox rather than separate objects or triangles.
The part of OpenGL that's responsible for occlusion queries is: glBeginQuery, glEndQuery and glGetQueryObject. See Query Objects for details.
Hierarchical Z-buffers were implemented in hardware on some early Radeons. However, I haven't heard of the technique being used nowadays.
Occlusion queries, on the other hand, are normally used. They, in essence, give similar benefits.

Efficiently providing geometry for terrain physics

I have been researching different approaches to terrain systems in game engines for a bit now, trying to familiarize myself with the work. A number of the details seem straightforward, but I am getting hung up on a single detail.
For performance reasons many terrain solutions utilize shaders to generate parts or all of the geometry, such as vertex shaders to generate positions or tessellation shaders for LoD. At first I figured those approaches were exclusively for renders that weren't concerned about physics simulations.
The reason I say that is because as I understand shaders at the moment, the results of a shader computation generally are discarded at the end of the frame. So if you rely on shaders heavily then the geometry information will be gone before you could access it and send it off to another system (such as physics running on the CPU).
So, am I wrong about shaders? Can you store the results of them generating geometry to be accessed by other systems? Or am I forced to keep the terrain geometry on CPU and leave the shaders to the other details?
Shaders
You understand part of how shaders work correctly: after a frame, all that normally remains is the final composed image in the back buffer.
BUT: Using transform feedback it is possible to capture transformed geometry into a vertex buffer and reuse it. Transform Feedback happens AFTER the vertex/geometry/tessellation shader, so you could use the geometry shader to generate a terrain (or visible parts of it once), push it through transform-feedback and store it.
This way, you potentially could use CPU collision detection with your terrain! You can even combine this with tessellation.
You will love this: A Framework for Real-Time, Deformable Terrain.
For the LOD and tessellation: LOD is not a prerequisite for tessellation. You can use tessellation to allow some more sophisticated effects, such as adding detail by recursively subdividing rough geometry. Linking it with LOD is simply a very good optimization that avoids keeping LOD mesh levels in RAM, since you just have your "base mesh" and subdivide it (although this will be an unsatisfying optimization imho).
Now some deeper info on GPU and CPU exclusive terrain.
GPU Generated Terrain (Procedural)
As written in the NVidia article Generating Complex Procedural Terrains Using the GPU:
1.2 Marching Cubes and the Density Function
Conceptually, the terrain surface can be completely described by a single function, called the density function. For any point in 3D space (x, y, z), the function produces a single floating-point value. These values vary over space—sometimes positive, sometimes negative. If the value is positive, then that point in space is inside the solid terrain. If the value is negative, then that point is located in empty space (such as air or water). The boundary between positive and negative values—where the density value is zero—is the surface of the terrain. It is along this surface that we wish to construct a polygonal mesh.
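As a small illustration of the quoted idea, the Python sketch below uses a made-up density function (flat ground at y = 0 with a ripple; it is not the function from the NVIDIA article) and finds where the surface crosses a segment by linear interpolation between the two sampled densities, which is the step marching cubes performs along each cell edge. The function names are hypothetical.

```python
import math


def density(x, y, z):
    """Hypothetical density: positive inside the terrain, negative in empty space."""
    return -y + 0.5 * math.sin(x) * math.cos(z)


def surface_crossing(p0, p1):
    """Estimate where the density is zero on the segment p0 -> p1.

    Returns None when both end points lie on the same side of the surface.
    The estimate is a linear interpolation of the two end-point densities.
    """
    d0, d1 = density(*p0), density(*p1)
    if d0 * d1 > 0.0:
        return None          # same sign: no crossing on this edge
    if d0 == d1:
        return p0            # both exactly zero: the edge lies on the surface
    t = d0 / (d0 - d1)       # fraction of the way from p0 to p1
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))


below, above = (0.0, -1.0, 0.0), (0.0, 1.0, 0.0)
crossing = surface_crossing(below, above)
```

With this particular density the point `below` is inside the terrain, `above` is in the air, and the estimated crossing sits at the origin.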
Using Shaders
The density function used for generating the terrain must be available to the collision-detection shader, and you have to fill an output buffer containing the collision locations, if any.
CUDA
See: https://www.youtube.com/watch?v=kYzxf3ugcg0
Here someone used CUDA, based on the NVIDIA article; the same requirement applies:
To perform collision detection in CUDA, the density function must be shared with the collision code.
This will, however, make the transform feedback techniques a little harder to implement.
Both shaders and CUDA imply resampling/recalculating the density at one location at least, just for the collision detection of a single object.
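The "resample the density for each collision test" idea can be sketched on the CPU as follows. This is only an illustration under assumptions: the density function is invented (it must be the same one the renderer uses, whatever that is in your engine), the collider is a single point, and `push_out` assumes the point `max_lift` units above the collider is in empty space.

```python
import math


def density(x, y, z):
    """Hypothetical terrain density shared by rendering and collision code."""
    return -y + 0.5 * math.sin(x) * math.cos(z)


def collides(point):
    """A point collides when it lies inside the solid terrain (density > 0)."""
    return density(*point) > 0.0


def push_out(point, max_lift=10.0, iterations=32):
    """Lift a penetrating point straight up until it rests on the surface.

    Uses bisection on the height; assumes the point max_lift above is outside.
    """
    if not collides(point):
        return point
    x, y, z = point
    inside, outside = y, y + max_lift
    for _ in range(iterations):
        mid = 0.5 * (inside + outside)
        if collides((x, mid, z)):
            inside = mid
        else:
            outside = mid
    return (x, outside, z)


resolved = push_out((0.0, -0.5, 0.0))
```

Every call to `collides` is one more evaluation of the density function, which is the cost the answer refers to.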
CPU Terrain
Usually, this implies a RAM-memory stored set of geometry in the form of vertex/index-buffer pairs, which are regularly processed by the shader-pipeline. As you have the data available here, you will also most likely have a collision mesh, which is a simplified representation of your terrain, against which you perform collision.
Alternatively, you could give your terrain a set of colliders marking the allowed paths, which is, imho, what the early PS1 Final Fantasy games did (they don't really have terrain in the sense we understand it today).
This short answer is neither extensively deep nor complete. I just tried to give you some insight into some concepts used in dozens of solutions.
Some more reading: http://prideout.net/blog/?tag=opengl-transform-feedback.

Implementation of raymarching surfaces in GLSL

I've been reading a lot of articles on ray marching in GLSL shaders (such as this one: http://www.iquilezles.org/www/articles/rmshadows/rmshadows.htm), and they raised some questions that I wanted to ask.
In my application, I am rendering a scene with a couple of meshes and I wanted to experiment with shadows. While I seem to somewhat understand the concept of how raymarching works, I don't quite understand how to properly implement this in GLSL. I know how to compute the intersection of a ray and a plane but how would this be handled through GLSL shaders?
According to this thread here: (https://gamedev.stackexchange.com/questions/67719/how-do-raymarch-shaders-work) it mentions that you're measuring the distance between the start of the ray and the 'surface'. Is the surface he's referring to the mesh? Do I need to send an array of planes/points that makes up the mesh to the shader in order to compute the ray intersection test? Do I need to use the depth buffer to determine the distance of the surface?
It depends on what your shader does versus what your rendering engine does. In pure demo shaders like those on Shadertoy (see its shadow examples), the whole scene is encoded in the shader, so there is no problem shooting secondary rays or more (performance aside).
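For the "whole scene encoded in the shader" case, the marching loop looks roughly like the sketch below. It is written in Python rather than GLSL so it can be run anywhere, and the scene (a unit sphere resting on a ground plane), the function names, and the step limits are all assumptions chosen for illustration. The key point is that the "surface" is whatever a distance function describes, not a mesh passed to the shader.

```python
import math


def scene_sdf(p):
    """Hypothetical scene: signed distance to a unit sphere at (0, 1, 0) or the plane y = 0."""
    x, y, z = p
    sphere = math.sqrt(x * x + (y - 1.0) ** 2 + z * z) - 1.0
    ground = y
    return min(sphere, ground)


def march(origin, direction, t_min=0.01, t_max=50.0, max_steps=128, eps=1e-4):
    """Sphere tracing: advance by the distance to the nearest surface each step.

    Returns the hit distance along the ray, or None if nothing is hit.
    direction is assumed to be normalized.
    """
    t = t_min
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = scene_sdf(p)
        if dist < eps:
            return t
        t += dist
        if t > t_max:
            return None
    return None


def in_shadow(surface_point, light_dir):
    """Hard shadow: march a secondary ray from the surface towards the light."""
    return march(surface_point, light_dir) is not None


hit_distance = march((0.0, 1.0, -5.0), (0.0, 0.0, 1.0))     # primary ray towards the sphere
shadowed = in_shadow((0.5, 0.0, 0.0), (0.0, 1.0, 0.0))       # ground point under the sphere
lit = not in_shadow((5.0, 0.0, 0.0), (0.0, 1.0, 0.0))        # ground point far from it
```

In a fragment shader the same loop would run once per pixel for the primary ray and again for each shadow ray; the `t_min` offset keeps the shadow ray from immediately re-hitting the surface it starts on.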
If the scene is not managed by your shader, then you need a bit of cooperation from your engine: at the least, it must produce a shadow map in a first pass (many different algorithms exist).
Note that with an SVO (sparse voxel octree) representation, the scene is first converted into sparse voxels, which can then be marched by the shader for secondary rays. This could even be done for primary rays, but you can use the regular Z-buffer there and voxel cone tracing (for instance) for all kinds of secondary rays; see Interactive Indirect Illumination Using Voxel Cone Tracing here: http://gigavoxels.imag.fr/publications.html (OK, you might find it overkill for your simple application). For soft shadows and depth of field, see the seminal paper GigaVoxels: Ray-Guided Streaming for Efficient and Detailed Voxel Rendering. Note that the tree might even be a regular BSP of triangles instead of an octree of voxels, but then you lose many advantages of SVOs (performance, all the more so for soft shadows).

State of the art Culling and Batching techniques in rendering

I'm currently upgrading and restructuring an OpenGL render engine. The engine is used for visualising large scenes of architectural data (buildings with interiors), and the number of objects can become rather large. As is the case with any building, there are a lot of occluded objects within walls, and you naturally only see the objects that are in the same room as you, or the exterior if you are on the outside. This leaves a large number of objects that should be removed through occlusion culling and frustum culling.
At the same time, there is a lot of repetitive geometry that can be batched into render batches, and also a lot of objects that can be rendered with instanced rendering.
The way I see it, it can be difficult to combine render batching and culling in an optimal fashion. If you batch too many objects in the same VBO, it's difficult to cull the objects on the CPU in order to skip rendering that batch. At the same time, if you skip culling on the CPU, a lot of objects will be processed by the GPU while they are not visible. If you skip batching completely in order to cull more easily on the CPU, there will be an undesirably high number of render calls.
I have done some research into existing techniques and theories as to how these problems are solved in modern graphics, but I have not been able to find any concrete solution. An idea a colleague and I came up with was restricting batches to objects relatively close to each other, e.g. all chairs in a room or within a radius of n meters. This could be simplified and optimized through the use of octrees.
Does anyone have any pointers to techniques used for scene management, culling, batching, etc. in state-of-the-art modern graphics engines?
There's lots of information about frustum and occlusion culling on the internet.
Most of it comes from game developers.
Here's a list of some articles that will get you started:
http://de.slideshare.net/guerrillagames/practical-occlusion-culling-in-killzone-3
http://de.slideshare.net/TiagoAlexSousa/secrets-of-cryengine-3-graphics-technology
http://de.slideshare.net/Umbra3/siggraph-2011-occlusion-culling-in-alan-wake
http://de.slideshare.net/Umbra3/visibility-optimization-for-games
http://de.slideshare.net/Umbra3/chen-silvennoinen-tatarchuk-polygon-soup-worlds-siggraph-2011-advances-in-realtime-rendering-course
http://de.slideshare.net/DICEStudio/culling-the-battlefield-data-oriented-design-in-practice
http://www.cse.chalmers.se/~uffe/vfc.pdf
My (pretty fast) renderer works similarly to this:
Collection: Send all props, which you want to render, to the renderer.
Frustum culling: The renderer culls the invisible props from the list using multiple threads in parallel.
Occlusion culling: Now you could do occlusion culling on the CPU (I haven't implemented it yet, because I don't need it now). Detailed information on how to do it efficiently can be found in the Killzone and Crysis slides. One solution would be to read back the depth buffer of the previous frame from the GPU and then rasterize the bounding boxes of the objects on top of it to check if the object is visible.
Splitting: Since you now know which objects actually have to be rendered, because they are visible, you have to split them by mesh, because each mesh has a different material or texture (otherwise they would be combined into a single mesh).
Batching: Now you have a list of meshes to render. You can sort them:
by depth (this can be done on the prop level instead of the mesh level), to save fillrate (I don't recommend doing this if your fragment shaders are very simple).
by mesh (because there might be multiple instances of the same mesh and it would make it easy to add instancing).
by texture, because texture switches are very costly.
Rendering: Iterate through your partitioned meshes and render them.
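The collection, frustum culling, and batching steps above can be sketched as follows. This is a simplified Python illustration, not the answerer's renderer: props are plain dictionaries with a bounding sphere, the frustum is a hypothetical list of inward-facing planes, occlusion culling and depth sorting are left out, and all names are invented.

```python
from collections import defaultdict


def sphere_in_frustum(center, radius, planes):
    """planes: (nx, ny, nz, d) tuples with unit normals pointing into the frustum.

    The sphere is rejected only if it lies completely behind some plane.
    """
    return all(nx * center[0] + ny * center[1] + nz * center[2] + d >= -radius
               for nx, ny, nz, d in planes)


def build_batches(props, planes):
    """Cull invisible props, then group the rest by (texture, mesh).

    Sorting by texture first keeps texture switches to a minimum; every
    group holding several props is a candidate for one instanced draw.
    """
    visible = [p for p in props
               if sphere_in_frustum(p["center"], p["radius"], planes)]
    batches = defaultdict(list)
    for p in visible:
        batches[(p["texture"], p["mesh"])].append(p["name"])
    return dict(sorted(batches.items()))


# Hypothetical data: a slab-shaped "frustum" covering -10 <= x <= 10.
planes = [(1.0, 0.0, 0.0, 10.0), (-1.0, 0.0, 0.0, 10.0)]
props = [
    {"name": "chair_a", "mesh": "chair", "texture": "wood", "center": (0.0, 0.0, 5.0), "radius": 1.0},
    {"name": "lamp", "mesh": "lamp", "texture": "metal", "center": (0.0, 0.0, 5.0), "radius": 1.0},
    {"name": "chair_b", "mesh": "chair", "texture": "wood", "center": (3.0, 0.0, 5.0), "radius": 1.0},
    {"name": "chair_far", "mesh": "chair", "texture": "wood", "center": (50.0, 0.0, 5.0), "radius": 1.0},
]
batches = build_batches(props, planes)
```

Because batches are rebuilt from the visible set each frame, culling stays per object while draw calls stay per (texture, mesh) group, which is one way to reconcile the two concerns raised in the question.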
And as "Full Frontal Nudity" already said: There's no perfect solution.

Which geometrical calculations can be accelerated using OpenGL

I need to accelerate some programs that perform intensive calculations involving the surfaces formed by intersections between cubes, spheres, and similar shapes. Using CUDA, I need to specify all the formulae myself, of course, in order to analytically calculate information related to the intersections. But since I only need a good approximation of the resulting surface, and I read that OpenGL can calculate or estimate such surfaces, I wonder if you could give me your opinion or point me to relevant references.
If you just need to render those objects, you could use the stencil buffer to evaluate whatever boolean operations you need: http://www.opengl.org/resources/code/samples/advanced/advanced97/notes/node11.html
Any quantities that could be computed from either a perspective or orthographic projection of the intersection surface could be deduced from such a rendering together with its depth buffer. If you need to extract the whole intersection, you can try using depth peeling together with stencilled CSG to extract a layered representation of the complete intersection, though it can be very inaccurate on the parts of the surface which are parallel to the viewing direction and you will need to do some extra work to stitch the layers back together:
http://developer.download.nvidia.com/SDK/10/opengl/src/dual_depth_peeling/doc/DualDepthPeeling.pdf
EDIT: This will work for arbitrary, free form surfaces and is a fairly standard technique. But it does have its limitations, in that the accuracy you get will be fairly poor and you may have to project onto multiple views in order to get some adequate covering of your object. As an example, here is an application to collision detection: http://www.cs.ucl.ac.uk/staff/b.spanlang/ISBCICSOWH.pdf
OpenGL is of even less use here than CUDA or OpenCL, since it's primarily targeted at drawing tessellated triangle meshes. Of course, you can do sophisticated geometrical computations in the various shader stages of modern OpenGL. The problem is that the result of all those computations is a pixel-based picture. There is a feedback mechanism to retrieve the processed vertex data, but that only gives you a mesh.
Intersections of planar shapes and/or spheres are actually quite easy and can be computed analytically. The really hard part is intersecting freeform curved surfaces (Bézier or NURBS). Those usually don't have a closed-form solution, so what you need to do is numerically approximate a trim curve that best fits the intersection.
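As an example of the "easy, analytic" case, the sketch below handles two overlapping spheres. Given radii r1, r2 and centre distance d, the intersection circle lies in a plane at distance a = (d² + r1² − r2²) / (2d) from the first centre, and the surface bounding the overlap region is made of two spherical caps, each of area 2πrh. The function name and return layout are made up for illustration.

```python
import math


def sphere_sphere_intersection(r1, r2, d):
    """Return (a, circle_radius, lens_area) for two partially overlapping spheres.

    a             distance from the first centre to the plane of the intersection circle
    circle_radius radius of the intersection circle
    lens_area     area of the surface bounding the overlap (two spherical caps)
    Returns None when the spheres are disjoint, tangent, concentric or nested.
    """
    if d <= 0.0 or d >= r1 + r2 or d <= abs(r1 - r2):
        return None
    a = (d * d + r1 * r1 - r2 * r2) / (2.0 * d)
    circle_radius = math.sqrt(r1 * r1 - a * a)
    cap1 = 2.0 * math.pi * r1 * (r1 - a)          # part of sphere 1 inside sphere 2
    cap2 = 2.0 * math.pi * r2 * (r2 - (d - a))    # part of sphere 2 inside sphere 1
    return a, circle_radius, cap1 + cap2


result = sphere_sphere_intersection(1.0, 1.0, 1.0)
```

For two unit spheres one unit apart, the plane sits halfway between the centres and the bounding surface of the overlap has area 2π.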

OpenGL, applying texture from image to isosurface

I have a program in which I need to apply a 2-dimensional texture (simple image) to a surface generated using the marching-cubes algorithm. I have access to the geometry and can add texture coordinates with relative ease, but the best way to generate the coordinates is eluding me.
Each point in the volume represents a single unit of data, and each unit of data may have different properties. To simplify things, I'm looking at sorting them into "types" and assigning each type a texture (or portion of a single large texture atlas).
My problem is I have no idea how to generate the appropriate coordinates. I can store the location of the type's texture in the type class and use that, but then seams will be horribly stretched (if two neighboring points use different parts of the atlas). If possible, I'd like to blend the textures on seams, but I'm not sure the best manner to do that. Blending is optional, but I need to texture the vertices in some fashion. It's possible, but undesirable, to split the geometry into parts for each type, or to duplicate vertices for texturing purposes.
I'd like to avoid using shaders if possible, but if necessary I can use a vertex and/or fragment shader to do the texture blending. If I do use shaders, what would be the most efficient way of telling them which texture or portion to sample? It seems like passing the type through a parameter would be the simplest way, but possibly slow.
My volumes are relatively small, 8-16 points in each dimension (I'm keeping them smaller to speed up generation, but there are many on-screen at a given time). I briefly considered making the isosurface twice the resolution of the volume, so each point has more vertices (8, in theory), which may simplify texturing. It doesn't seem like that would make blending any easier, though.
To build the surfaces, I'm using the Visualization Library for OpenGL and its marching cubes and volume system. I have the geometry generated fine, just need to figure out how to texture it.
Is there a way to do this efficiently, and if so what? If not, does anyone have an idea of a better way to handle texturing a volume?
Edit: Just to note, the texture isn't simply a gradient of colors. It's actually a texture, usually with patterns. Hence the difficulty in mapping it, a gradient would've been trivial.
Edit 2: To help clarify the problem, I'm going to add some examples. They may just confuse things, so consider everything above definite fact and these just as help if they can.
My geometry is in cubes, always (loaded, generated and saved in cubes). If shape influences possible solutions, that's it.
I need to apply textures, consisting of patterns and/or colors (unique ones depending on the point's "type") to the geometry, in a technique similar to the splatting done for terrain (this isn't terrain, however, so I don't know if the same techniques could be used).
Shaders are a quick and easy solution, although I'd like to avoid them if possible, as I mentioned before. Something usable in a fixed-function pipeline is preferable, mostly for the minor increase in compatibility and development time. Since it's only a minor increase, I will go with shaders and multipass rendering if necessary.
Not sure if any other clarification is necessary, but I'll update the question as needed.
On the texture combination part of the question:
Have you looked into 3d textures? As we're talking marching cubes I should probably immediately say that I'm explicitly not talking about volumetric textures. Instead you stack all your 2d textures into a 3d texture. You then encode each texture coordinate to be the 2d position it would be and the texture it would reference as the third coordinate. It works best if your textures are generally of the type where, logically, to transition from one type of pattern to another you have to go through the intermediaries.
An obvious use example is texture mapping to a simple height map — you might have a snow texture on top, a rocky texture below that, a grassy texture below that and a water texture at the bottom. If a vertex that references the water is next to one that references the snow then it is acceptable for the geometry fill to transition through the rock and grass texture.
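The stacking trick comes down to how the third texture coordinate is chosen per vertex. The Python sketch below assumes normalized coordinates, where slice i of an N-slice 3D texture is centred at (i + 0.5) / N, and a hypothetical layer order of water, grass, rock, snow. It also shows which slices a linearly interpolated coordinate passes through between two vertices, which is why the layer order matters.

```python
def layer_to_w(layer, layer_count):
    """Third texture coordinate at the centre of slice `layer` (normalized to [0, 1])."""
    return (layer + 0.5) / layer_count


def layers_crossed(layer_a, layer_b, layer_count, samples=9):
    """Nearest slices visited while interpolating between two vertices' layers."""
    wa = layer_to_w(layer_a, layer_count)
    wb = layer_to_w(layer_b, layer_count)
    seen = []
    for i in range(samples):
        t = i / (samples - 1)
        w = wa + t * (wb - wa)
        layer = min(layer_count - 1, int(w * layer_count))
        if layer not in seen:
            seen.append(layer)
    return seen


LAYERS = ["water", "grass", "rock", "snow"]    # hypothetical stacking order
water_w = layer_to_w(0, len(LAYERS))
path = [LAYERS[i] for i in layers_crossed(0, 3, len(LAYERS))]
```

A triangle edge running from a water vertex to a snow vertex therefore fades through grass and rock on the way, exactly as described above; with linear filtering enabled on the third axis the hardware blends neighbouring slices in between.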
An alternative is to do it in multiple passes using additive blending. For each texture, draw every face that uses that texture and draw a fade to transparent extending across any faces that switch from one texture to another.
You'll probably want to prep the depth buffer with a complete draw (with the colour masks all set to reject changes to the colour buffer) then switch to a GL_EQUAL depth test and draw again with writing to the depth buffer disabled. Drawing exactly the same geometry through exactly the same transformation should produce exactly the same depth values irrespective of issues of accuracy and precision. Use glPolygonOffset if you have issues.
On the coordinates part:
Popular and easy mappings are cylindrical, box and spherical. Conceptualise that your shape is bounded by a cylinder, box or sphere with a well defined mapping from surface points to texture locations. Then for each vertex in your shape, start at it and follow the normal out until you strike the bounding geometry. Then grab the texture location that would be at that position on the bounding geometry.
I guess there's a potential problem that normals tend not to be brilliant after marching cubes, but I'll wager you know more about that problem than I do.
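A spherical version of the mapping described above might look like this in Python. It is a sketch under assumptions: the bounding sphere is centred at the origin, the vertex lies inside it, and the (u, v) convention (longitude from atan2, latitude from asin) is just one common choice, not the only one. The function name is made up.

```python
import math


def spherical_uv(vertex, normal, bound_radius):
    """Follow the normal from the vertex to a bounding sphere, then map the hit to (u, v)."""
    length = math.sqrt(sum(c * c for c in normal))
    n = tuple(c / length for c in normal)

    # Solve |vertex + t * n|^2 = R^2 for the positive root t.
    p_dot_n = sum(p * c for p, c in zip(vertex, n))
    p_dot_p = sum(p * p for p in vertex)
    t = -p_dot_n + math.sqrt(p_dot_n ** 2 - (p_dot_p - bound_radius ** 2))
    hx, hy, hz = (p + t * c for p, c in zip(vertex, n))

    u = 0.5 + math.atan2(hz, hx) / (2.0 * math.pi)
    sin_lat = max(-1.0, min(1.0, hy / bound_radius))   # guard against rounding error
    v = 0.5 + math.asin(sin_lat) / math.pi
    return u, v


side_uv = spherical_uv((0.5, 0.0, 0.0), (1.0, 0.0, 0.0), 2.0)
top_uv = spherical_uv((0.0, 0.5, 0.0), (0.0, 2.0, 0.0), 2.0)
```

Noisy normals, as mentioned above, translate directly into noisy texture coordinates, so smoothing the normals first (or projecting from the bounding sphere's centre instead) may give a steadier result.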
This is a hard and interesting problem.
The simplest way is to avoid the issue completely by using 3D texture maps, especially if you just want to add some random surface detail to your isosurface geometry. Perlin noise based procedural textures implemented in a shader work very well for this.
The difficult way is to look into various algorithms for conformal texture mapping (also known as conformal surface parametrization), which aim to produce a mapping between 2D texture space and the surface of the 3D geometry which is in some sense optimal (least distorting). This paper has some good pictures. Be aware that the topology of the geometry is very important; it's easy to generate a conformal mapping to map a texture onto a closed surface like a brain, considerably more complex for higher genus objects where it's necessary to introduce cuts/tears/joins.
You might want to try making a UV map of a mesh in a tool like Blender to see how they do it. If I understand your problem, you have a 3D field which defines a solid volume as well as a (continuous) color. You've created a mesh from the volume, and now you need to UV-map the mesh to a 2D texture with texels extracted from the continuous color space. In a tool you would define "seams" in the 3D mesh along which you could cut it apart so that the whole mesh could be laid flat to make a UV map. There may be aliasing in your texture at the seams, so when you render the mesh it will also be discontinuous at those seams (i.e. a triangle strip can't cross over the seam because it's a discontinuity in the texture).
I don't know any formal methods for flattening the mesh, but you could imagine cutting it along the seams and then treating the whole thing as a spring/constraint system that you drop onto a flat surface. I'm all about solving things the hard way. ;-)
Due to the issues with texturing and some of the constraints I have, I've chosen to write a different algorithm to build the geometry and handle texturing directly in that as it produces surfaces. It's somewhat less smooth than the marching cubes, but allows me to apply the texcoords in a way that works for my project (and is a bit faster).
For anyone interested in texturing marching cubes, or just blending textures, Tommy's answer is a very interesting technique and the links timday posted are excellent resources on flattening meshes for texturing. Thanks to both of them for their answers, hopefully they can be of use to others. :)