Bounding volume hierarchy traversal in GLSL - opengl

I am currently working on getting models to work efficiently in a ray tracer implemented as an OpenGL compute shader. I already figured out how to construct the BVH and have implemented this on the CPU side. The problem I now have is that I can't figure out how to handle the tree traversal on the GPU side.
This is a snippet of something I came up with:
struct BVHNode {
    vec3 AABBMin;
    vec3 AABBMax;
    BVHNode Children[2];
    int NumTriangles;
    Triangle Triangles[MAX_TRIANGLES]; // Wastes memory: non-leaf nodes have no triangles but still carry a fully allocated array
};
struct Mesh {
    BVHNode BVH;
    Material Material;
};
layout(std140, binding=2) buffer MeshSSBO {
    Mesh Meshes[];
} meshSSBO;
This was in my opinion the most straightforward way of doing it, just a normal tree that can be traversed easily with recursion or a stack. Sadly, GLSL supports neither of these so I am not sure how to traverse this tree.
So I have a couple questions:
How to traverse a full binary tree of unknown depth in GLSL?
The algorithm I need is very easily implemented with a stack or recursion, but I don't know how to do it without those. This is a snippet of C# that shows what I want to do.
Stack<BVHNode> stack = new Stack<BVHNode>();
stack.Push(root);
while (stack.Count > 0) {
    var currentNode = stack.Pop();
    if (rayIntersects(ray, currentNode)) {
        // Check triangles if leaf
        if (currentNode.NumTriangles > 0) {
            for (int i = 0; i < currentNode.NumTriangles; i++) {
                if (rayIntersects(ray, currentNode.Triangles[i])) {
                    // handle intersection
                }
            }
        }
        // Otherwise continue traversal
        else {
            stack.Push(currentNode.Children[0]);
            stack.Push(currentNode.Children[1]);
        }
    }
}
Is loading the meshes and BVHs through an SSBO an efficient way to get this data to the shader?
The vertices are loaded through a separate SSBO, the Triangle struct only contains indices to this vertex array.
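A note on the stack: GLSL has no recursion and no dynamic containers, but a fixed-size local array plus an integer counter can play the role of the stack, provided the tree is stored as a flat array whose nodes refer to their children by index instead of nesting them by value. The following is a minimal C++ sketch of that loop, written so that it maps almost line for line onto GLSL. `FlatNode`, `kMaxStack`, and the callbacks `hitsBox` / `hitsTriangle` are hypothetical names that do not appear in the code above, and the stack size of 32 is an assumed bound, not a measured one.

```cpp
#include <cassert>
#include <vector>

// Hypothetical flat node: children are indices into the node array, not nested structs.
struct FlatNode {
    int left;          // index of the left child (unused for leaves)
    int right;         // index of the right child (unused for leaves)
    int firstTriangle; // index of the first triangle of a leaf
    int numTriangles;  // 0 for inner nodes
};

constexpr int kMaxStack = 32; // assumed upper bound on the number of pending nodes

// Counts triangle hits. hitsBox(nodeIndex) and hitsTriangle(triangleIndex)
// stand in for the ray/AABB and ray/triangle tests.
template <class BoxTest, class TriangleTest>
int traverse(const std::vector<FlatNode>& nodes, BoxTest hitsBox, TriangleTest hitsTriangle) {
    int stack[kMaxStack]; // fixed-size array used as a stack
    int top = 0;
    stack[top++] = 0;     // push the root
    int hits = 0;
    while (top > 0) {
        const int index = stack[--top]; // pop
        if (!hitsBox(index)) continue;  // ray misses this box: skip the subtree
        const FlatNode& node = nodes[index];
        if (node.numTriangles > 0) {
            // Leaf: test its triangles
            for (int i = 0; i < node.numTriangles; ++i)
                if (hitsTriangle(node.firstTriangle + i)) ++hits;
        } else {
            // Inner node: continue with both children
            assert(top + 2 <= kMaxStack);
            stack[top++] = node.left;
            stack[top++] = node.right;
        }
    }
    return hits;
}
```

Because one node is popped and at most two are pushed per iteration, the number of pending entries grows by at most one per tree level, so the array size only has to cover the tree height plus one.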

Related

How to calculate the miss links in a BVH tree?

I am creating an OpenGL-based ray tracer for polygon models. To accelerate the application I am using BVH trees. Because there is no recursion in GLSL, I decided to find another way to traverse the bounding boxes, which are sent to the fragment shader as shader storage buffers.
I would like to implement the approach described in: Traversal of BVH tree in shaders
However, I don't really understand how to calculate the hit and miss links during the construction of the tree. Hit and miss links tell the program which node (bounding box) to visit next during traversal, depending on whether the current node is intersected or missed.
So far I have written the method that constructs the tree, and I can also put the tree into a simple array. I use a depth-first traversal to flatten the tree into the array.
Here are the depth-first tree-flattening methods:
FlatBvhNode nodeConverter2(BvhNode node, int& ind){
    FlatBvhNode result = FlatBvhNode(node.bBox.min, node.bBox.max, ind, node.isLeaf,
                                     node.indices);
    return result;
}
void flattenRecursion(const BvhNode &bvhNode, vector<FlatBvhNode>& nodes, int& ind) {
    ++ind;
    nodes.push_back(nodeConverter2(bvhNode, ind));
    if (!bvhNode.isLeaf) {
        flattenRecursion(*bvhNode.children.at(0), nodes, ind);
        flattenRecursion(*bvhNode.children.at(1), nodes, ind);
    }
}
vector<FlatBvhNode>* flatten(const BvhNode& root) {
    vector<FlatBvhNode>* nodesArray = new vector<FlatBvhNode>;
    nodesArray->reserve(root.countNodes());
    int ind = 0;
    flattenRecursion(root, *nodesArray, ind);
    return nodesArray;
}
I have to calculate the following "links":
The image is from: source. It shows the different links. For example, when the ray intersects a bounding box (hit link), we can move to the next node in the array. That part is fine because I have a depth-first layout. The problem arises when I have to move to the sibling, or even to the parent's sibling. How can I implement these links / offsets? I know I should store an index for each of them, but how do I compute it with a depth-first tree construction?
Any help is appreciated.
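One way the links could be computed for a depth-first (pre-order) layout, as a sketch under stated assumptions: in pre-order, a node's whole subtree occupies a contiguous run of the array directly after the node. The hit link of an inner node is therefore simply the next index, and the miss link is the index of the first node written after the subtree, which is known as soon as the recursion for that node returns. For a leaf there is nothing below it, so both links point to the same place. `TreeNode`, `LinkedNode`, `flattenWithLinks`, and `visitOrder` are hypothetical names; the sketch uses zero-based indices, assumes a full binary tree (two children or none), and uses `nodes.size()` as the "traversal finished" value.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for BvhNode: a full binary tree (two children or none).
struct TreeNode {
    std::unique_ptr<TreeNode> left, right;
    bool isLeaf() const { return !left; }
};

// Hypothetical flat node carrying only the links.
struct LinkedNode {
    int hitLink;  // next index when the box is hit
    int missLink; // next index when the box is missed
    bool isLeaf;
};

// Pre-order flattening. A node's miss link is the index of the first node
// written after its whole subtree; nodes.size() acts as the "stop" sentinel.
void flattenWithLinks(const TreeNode& node, std::vector<LinkedNode>& nodes) {
    const std::size_t self = nodes.size();
    nodes.push_back({0, 0, node.isLeaf()});
    if (!node.isLeaf()) {
        flattenWithLinks(*node.left, nodes);
        flattenWithLinks(*node.right, nodes);
    }
    const int afterSubtree = static_cast<int>(nodes.size());
    nodes[self].missLink = afterSubtree;
    // Inner node: the first child sits in the next slot. Leaf: nothing below, so skip ahead.
    nodes[self].hitLink = node.isLeaf() ? afterSubtree : static_cast<int>(self) + 1;
}

// Stackless walk: follow hitLink when the box test passes, missLink otherwise.
template <class BoxTest>
std::vector<int> visitOrder(const std::vector<LinkedNode>& nodes, BoxTest hitsBox) {
    std::vector<int> visited;
    const int end = static_cast<int>(nodes.size());
    int i = 0;
    while (i < end) {
        visited.push_back(i);
        i = hitsBox(i) ? nodes[i].hitLink : nodes[i].missLink;
    }
    return visited;
}
```

With this layout the shader-side loop needs neither a stack nor parent pointers: it keeps a single index and stops once that index reaches the node count.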
I do not have an answer about a depth-first tree, but I have figured out a way to do that if your tree is a heap. So here is some code in GLSL I used
int left(in int index) { // left child
    return 2 * index + 1;
}
int right(in int index) { // right child
    return 2 * index + 2;
}
int parent(in int index) {
    return (index - 1) / 2;
}
int right_sibling(in int index) { // a leaf hit or a miss link
    int result = index;
    while(result % 2 == 0 && result != 0) {
        result = parent(result);
    }
    return result + 1 * int(result != 0);
}
I am using this and it works with a pretty reasonable speed. The only problem I have is that loop, which slows the performance. I would really like to have a constant complexity expression in that function.
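One possible way to remove the loop, offered as a sketch and not as a measured optimisation: switch to the 1-based heap position `n = index + 1`. Right children then have an odd `n`, and stepping to the parent is `n >> 1`, so the loop amounts to stripping all trailing 1 bits from `n`. The stripped value, read back as a 0-based index, is already the result, including the "back at the root" case, where it becomes 0. The C++ below checks this against the loop version; `rightSiblingLoop` and `rightSiblingConstant` are hypothetical names, and the code assumes `index + 2` does not overflow 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Loop version from the answer above, kept as a reference.
inline std::uint32_t rightSiblingLoop(std::uint32_t index) {
    std::uint32_t result = index;
    while (result % 2 == 0 && result != 0) {
        result = (result - 1) / 2; // parent
    }
    return result + (result != 0 ? 1u : 0u);
}

// Loop-free variant: strip the trailing 1 bits of the 1-based position.
inline std::uint32_t rightSiblingConstant(std::uint32_t index) {
    const std::uint32_t n = index + 1;             // 1-based heap position
    assert(n != 0xFFFFFFFFu);                      // would make lowestZero 0
    const std::uint32_t lowestZero = ~n & (n + 1); // lowest 0 bit of n, as a power of two
    return n / lowestZero;                         // same as n >> (number of trailing 1 bits)
}
```

In GLSL 4.00 and later the division can be replaced by a shift using the built-in `findLSB`, along the lines of `n >> findLSB(~n)` with `n` declared as `uint`; whether that is actually faster than the loop on a given GPU would have to be measured.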

OpenGL 4.5 - Shader storage buffer objects layout

I'm trying my hand at shader storage buffer objects (aka Buffer Blocks) and there are a couple of things I don't fully grasp. What I'm trying to do is to store the (simplified) data of an indeterminate number of lights n in them, so my shader can iterate through them and perform calculations.
Let me start by saying that I get the correct results, and no errors from OpenGL. However, it bothers me not to know why it is working.
So, in my shader, I got the following:
struct PointLight {
    vec3 pos;
    float intensity;
};
layout (std430, binding = 0) buffer PointLights {
    PointLight pointLights[];
};
void main() {
    PointLight light;
    for (int i = 0; i < pointLights.length(); i++) {
        light = pointLights[i];
        // etc
    }
}
and in my application:
struct PointLightData {
    glm::vec3 pos;
    float intensity;
};
class PointLight {
    // ...
    PointLightData data;
    // ...
};
std::vector<PointLight*> pointLights;
glGenBuffers(1, &BBO);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, BBO);
glNamedBufferStorage(BBO, n * sizeof(PointLightData), NULL, GL_DYNAMIC_STORAGE_BIT);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, BBO);
...
for (unsigned int i = 0; i < pointLights.size(); i++) {
    glNamedBufferSubData(BBO, i * sizeof(PointLightData), sizeof(PointLightData), &(pointLights[i]->data));
}
In this last loop I'm storing a PointLightData struct with an offset equal to its size times the number of them I've already stored (so offset 0 for the first one).
So, as I said, everything seems correct. Binding points are correctly set to the zeroth, I have enough memory allocated for my objects, etc. The graphical results are OK.
Now to my questions. I am using std430 as the layout; in fact, if I change it to std140, as I originally did, it breaks. Why is that? My hypothesis is that the layout generated by std430 for the shader's PointLights buffer block happens to match the one generated by the compiler for my application's PointLightData struct (as you can see in that loop, I'm blindly storing one after the other). Do you think that's the case?
Now, assuming I'm correct in that assumption, the obvious solution would be to do the mapping for the sizes and offsets myself, querying opengl with glGetUniformIndices and glGetActiveUniformsiv (the latter called with GL_UNIFORM_SIZE and GL_UNIFORM_OFFSET), but I got the sneaking suspicion that these two guys only work with Uniform Blocks and not Buffer Blocks like I'm trying to do. At least, when I do the following OpenGL throws a tantrum, gives me back a 1281 error and returns a very weird number as the indices (something like 3432898282 or whatever):
const char * names[2] = {
    "pos", "intensity"
};
GLuint indices[2];
GLint size[2];
GLint offset[2];
glGetUniformIndices(shaderProgram->id, 2, names, indices);
glGetActiveUniformsiv(shaderProgram->id, 2, indices, GL_UNIFORM_SIZE, size);
glGetActiveUniformsiv(shaderProgram->id, 2, indices, GL_UNIFORM_OFFSET, offset);
Am I correct in saying that glGetUniformIndices and glGetActiveUniformsiv do not apply to buffer blocks?
If they do not, or if the fact that it's working is, as I imagine, just a coincidence, how could I do the mapping manually? I checked appendix H of the programming guide and the wording for arrays of structures is somewhat confusing. If I can't query OpenGL for sizes/offsets for what I'm trying to do, I guess I could compute them manually (cumbersome as it is), but I'd appreciate some help there, too.
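The application-side half of the layout hypothesis can at least be checked at compile time. The sketch below compares the C++ struct against the offsets I would expect for `struct PointLight { vec3 pos; float intensity; }` under std430: the vec3 is aligned to 16 bytes and occupies 12, and the float fits into the remaining 4, giving a 16-byte array stride. These numbers are my reading of the layout rules, not values queried from a driver. `Vec3` is a stand-in for `glm::vec3`, assumed to be three tightly packed floats, and the `kStd430...` constants are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for glm::vec3, assumed to be three tightly packed floats.
struct Vec3 { float x, y, z; };

struct PointLightData {
    Vec3 pos;        // bytes 0..11
    float intensity; // bytes 12..15
};

// Expected std430 placement of the shader-side struct (assumption, see text above).
constexpr std::size_t kStd430PosOffset = 0;
constexpr std::size_t kStd430IntensityOffset = 12;
constexpr std::size_t kStd430ArrayStride = 16;

static_assert(offsetof(PointLightData, pos) == kStd430PosOffset,
              "pos is not where the shader expects it");
static_assert(offsetof(PointLightData, intensity) == kStd430IntensityOffset,
              "intensity is not where the shader expects it");
static_assert(sizeof(PointLightData) == kStd430ArrayStride,
              "element size differs from the shader's array stride");
```

For values reported by the driver itself, members of buffer blocks are exposed through the program interface query API (`glGetProgramResourceIndex` and `glGetProgramResourceiv` with the `GL_BUFFER_VARIABLE` interface), not through the uniform queries.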

Can you modify a uniform from within the shader? If so, how?

So I wanted to store all my meshes in one large VBO. The problem is, how do you issue just one draw call but let every mesh have its own model-to-world matrix?
My idea was to submit an array of matrices to a uniform before drawing. In the VBO I would make the color of every first vertex of a mesh negative (so I'd be using the sign bit to check whether a vertex was the first of a mesh).
Okay, so I can detect when a new mesh has started and I have an array of matrices ready and probably a uniform called 'index'. But how do I increase this index by one every time I encounter a new mesh?
Can you modify a uniform from within the shader? If so, how?
Can you modify a uniform from within the shader?
If you could, it wouldn't be uniform anymore, would it?
Furthermore, what you're wanting to do cannot be done even with Image Load/Store or SSBOs, both of which allow shaders to write data. It won't work because vertex shader invocations are not required to be executed sequentially. Many happen at the same time, and there's no way for any shader invocation to know that it will happen "after" the "first vertex" in a mesh.
The simplest way to deal with this is the obvious solution. Render each mesh individually, but set the uniforms for each mesh before each draw call. Without changing buffers between draws, of course. Uniform changes, while not exactly cheap, aren't the most expensive state changes that exist.
There are more complicated drawing methods that could allow you more performance. But that form is adequate for most needs. You've already done the hard part: you removed the need for any state change (textures, buffers, vertex formats, etc) except uniform state.
There are two approaches to minimizing draw calls: instancing and batching. The first (instancing) allows you to draw multiple copies of the same mesh in one draw call, but it depends on the API (it is available from OpenGL 3.1). Batching is similar to instancing but allows you to draw different meshes. Both approaches have a restriction: the meshes must share the same materials and shaders.
If you want to draw different meshes from one VBO, then instancing is not an option. Batching requires keeping all meshes in one 'big' VBO with the world transform already applied. That is not a problem for static meshes, but it is inconvenient for animated ones. Here is some pseudocode for a batching implementation:
struct SGeometry
{
    uint64_t offsetVB;
    uint64_t offsetIB;
    uint64_t sizeVB;
    uint64_t sizeIB;
    glm::mat4 oldTransform;
    glm::mat4 transform;
};
std::vector<SGeometry> cachedGeometries;
...
void CommitInstances()
{
    uint64_t vertexOffset = 0;
    uint64_t indexOffset = 0;
    for (auto* instance : allInstances)
    {
        // Append this mesh's vertices to the shared VBO
        Copy(instance->Vertexes(), VBO);
        for (uint64_t i = 0; i < instance->Indices().size(); ++i)
        {
            // Indices are local to the mesh, so rebase them onto its first vertex in the shared VBO
            auto index = instance->Indices()[i];
            index += vertexOffset;
            IBO[indexOffset + i] = index;
        }
        // Assumes the vertices start out untransformed (identity)
        cachedGeometries.push_back({vertexOffset, indexOffset,
                                    instance->Vertexes().size(), instance->Indices().size(),
                                    glm::mat4(1.0f), glm::mat4(1.0f)});
        vertexOffset += instance->Vertexes().size();
        indexOffset += instance->Indices().size();
    }
    Commit(VBO);
    Commit(IBO);
}
void ApplyTransform(glm::mat4 modelMatrix, uint64_t instanceId)
{
    SGeometry& geom = cachedGeometries[instanceId];
    glm::mat4 inverseOldTransform = glm::inverse(geom.oldTransform);
    VertexStream& stream = VBO->GetStream(Position, geom.offsetVB);
    for (uint64_t i = 0; i < geom.sizeVB; ++i)
    {
        glm::vec3 pos = stream.Get(i);
        // We need to revert the old absolute transformation before applying the new one
        pos = glm::vec3(inverseOldTransform * glm::vec4(pos, 1.0f));
        pos = glm::vec3(modelMatrix * glm::vec4(pos, 1.0f));
        stream.Set(i, pos);
    }
    // Remember the transform so the next call can revert it
    geom.oldTransform = modelMatrix;
    geom.transform = modelMatrix;
    // .. Apply normal transformation
}
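The index rebasing step is the part of batching that is easiest to get wrong, so here is a small self-contained C++ sketch of just that step, working on plain CPU-side arrays instead of GPU buffers. `Vertex`, `MeshData`, `Batch`, and `mergeMeshes` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Vertex { float x, y, z; };

// Hypothetical CPU-side mesh with indices local to the mesh.
struct MeshData {
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;
};

// Contents destined for the shared VBO and IBO.
struct Batch {
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;
};

// Concatenates all meshes; every index is shifted by the number of
// vertices already in the batch so it still points at the same vertex.
Batch mergeMeshes(const std::vector<MeshData>& meshes) {
    Batch batch;
    for (const MeshData& mesh : meshes) {
        const auto baseVertex = static_cast<std::uint32_t>(batch.vertices.size());
        batch.vertices.insert(batch.vertices.end(), mesh.vertices.begin(), mesh.vertices.end());
        for (std::uint32_t index : mesh.indices) {
            batch.indices.push_back(baseVertex + index);
        }
    }
    return batch;
}
```

Merging two single-triangle meshes that each use indices 0, 1, 2 yields the index list 0, 1, 2, 3, 4, 5 over six vertices.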
GPU Gems 2 has a good article about geometry instancing http://www.amazon.com/GPU-Gems-Programming-High-Performance-General-Purpose/dp/0321335597

Removeable lightsources like Minecraft

I have succeeded in making light sources like the ones in Minecraft, with very good results. I used the cellular automata method to create the following light.
But say I have 2 or more light sources near each other and I want to remove one of them.
Can you recommend a way to recalculate only the affected tiles?
Here is an image showing one light source. http://i.stack.imgur.com/E0dqR.png
Below is my code for calculating a light source and all of its neighboring tiles.
void World::processNeighborLight(Tile *pCurrent, int pLightLevel, int *pIterationCount)
{
    *pIterationCount += 1; // Just to keep track of how many iterations were made.
    pCurrent->updateLight(pLightLevel);
    int newLight = pLightLevel - 1;
    if (newLight <= 0) return;
    Tile *N = pCurrent->getRelative(sf::Vector2i(0, -1));
    Tile *E = pCurrent->getRelative(sf::Vector2i(1, 0));
    Tile *S = pCurrent->getRelative(sf::Vector2i(0, 1));
    Tile *W = pCurrent->getRelative(sf::Vector2i(-1, 0));
    if (N->getLightLevel() < newLight)
    {
        N->updateLight(newLight);
        processNeighborLight(N, newLight, pIterationCount);
    }
    if (E->getLightLevel() < newLight)
    {
        E->updateLight(newLight);
        processNeighborLight(E, newLight, pIterationCount);
    }
    if (S->getLightLevel() < newLight)
    {
        S->updateLight(newLight);
        processNeighborLight(S, newLight, pIterationCount);
    }
    if (W->getLightLevel() < newLight)
    {
        W->updateLight(newLight);
        processNeighborLight(W, newLight, pIterationCount);
    }
}
You could, rather than having each cell store a light level, have it store instead a collection of (lightsource, lightlevel) pairs (expensive?), and similarly have each light source store a collection of (cell, lightlevel) pairs (cheap!).
void KillLight (LightSource & kill_me)
{
    // All we really do is iterate through each illuminated cell, and remove this lightsource from
    // their list of light sources
    for (auto i = kill_me.cells.begin(); i != kill_me.cells.end(); ++i)
    {
        // The cell contains some kind of collection that contains either a list of lightsources that hit it or <lightsource, illumination level>
        // pairs. All we need to do is remove this light from that collection and recalculate the cell's light level
        i->lights->erase (kill_me); // Note light sources must be comparable objects.
        i->RecalculateMaxIllumination(); // The cell needs to figure out which of its sources is brightest now.
    }
    // And then handle other lightsource removal cleanup actions. Probably just have this method be called by
    // ~LightSource()
}
If having each cell store a list of light sources hitting it is too expensive, the impact of having each light source remember which cells it illuminates is still cheap. I can think of alternate solutions, but they all involve some kind of mapping from a given light source to the set of all cells it illuminates.
This assumes, of course, that your light sources are relatively few in number compared to the number of cells, and that there are no extremely bright light sources that illuminate tens of thousands of cells.
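To make the bookkeeping concrete, here is a minimal runnable C++ sketch of the two-way mapping: each cell keeps the level contributed by every light that reaches it, and each light keeps the list of cells it reaches. `Cell`, `LightGrid`, and the integer light ids are hypothetical. The spread is deliberately simplified to one level lost per tile of Manhattan distance with no obstacles, standing in for the flood fill from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

// Each cell remembers how much light every source contributes to it.
struct Cell {
    std::map<int, int> lights; // light id -> level contributed by that light

    int level() const { // brightest contribution wins, 0 if unlit
        int best = 0;
        for (const auto& entry : lights) best = std::max(best, entry.second);
        return best;
    }
};

class LightGrid {
public:
    LightGrid(int width, int height)
        : width_(width), height_(height),
          cells_(static_cast<std::size_t>(width) * height) {}

    // Simplified spread: the level drops by 1 per tile of Manhattan distance.
    void addLight(int id, int x, int y, int strength) {
        assert(litCells_.count(id) == 0); // ids are assumed to be unique
        std::vector<int>& reached = litCells_[id];
        for (int cy = 0; cy < height_; ++cy) {
            for (int cx = 0; cx < width_; ++cx) {
                const int level = strength - (std::abs(cx - x) + std::abs(cy - y));
                if (level <= 0) continue;
                cells_[cy * width_ + cx].lights[id] = level;
                reached.push_back(cy * width_ + cx);
            }
        }
    }

    // Only the cells this light reached are touched.
    void removeLight(int id) {
        const auto it = litCells_.find(id);
        if (it == litCells_.end()) return;
        for (int cellIndex : it->second) cells_[cellIndex].lights.erase(id);
        litCells_.erase(it);
    }

    int levelAt(int x, int y) const { return cells_[y * width_ + x].level(); }

private:
    int width_, height_;
    std::vector<Cell> cells_;
    std::map<int, std::vector<int>> litCells_; // light id -> cells it reaches
};
```

Here the cell recomputes its maximum on every query; caching the maximum and refreshing it inside `removeLight`, as `RecalculateMaxIllumination` does above, trades a little memory for cheaper reads.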

convert a convex path to triangle list

What is the best way to convert a convex path (described as a set of points) to a list of triangles to be used for OpenGL rendering? Sample code or a demo would be ideal :) Thanks!
It sounds like you are looking for one of the many "convert a polygon to a series of triangles" solutions:
Maybe something in one of these will help:
Ear Clipping
List item
poly2tri (with source code)
If you are trying to understand the concepts, the first two are a good place to start.
If you need an implementation, start with the third.
Was this helpful?
If your polygon is really convex and not concave you can just draw it as a triangle fan. That is guaranteed to work.
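If a triangle list is needed instead of a fan primitive, the fan can be unrolled into indices: every triangle shares vertex 0 and uses two consecutive vertices of the outline. A minimal C++ sketch, with `fanToTriangleList` as a hypothetical helper name and the vertices assumed to be stored in order around the polygon:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns index triples (0, i, i + 1) for a convex polygon whose
// vertices are stored in order around its outline.
std::vector<std::uint32_t> fanToTriangleList(std::uint32_t vertexCount) {
    std::vector<std::uint32_t> indices;
    if (vertexCount < 3) return indices; // not a polygon
    indices.reserve(3 * (vertexCount - 2));
    for (std::uint32_t i = 1; i + 1 < vertexCount; ++i) {
        indices.push_back(0);
        indices.push_back(i);
        indices.push_back(i + 1);
    }
    return indices;
}
```

A polygon with n vertices produces n - 2 triangles, so a pentagon yields the nine indices 0, 1, 2, 0, 2, 3, 0, 3, 4.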
Here is an alternative recursive algorithm that I wrote a few years ago. It also triangulates a convex polygon and on average generates a much nicer triangulation (e.g. fewer sliver triangles):
void ConcaveTesselator (unsigned a_NumVertices)
{
    unsigned left[32]; // enough space for 2^32 recursions:
    unsigned right[32];
    unsigned stacktop = 0;
    // prepare stack:
    left[0] = 0;
    right[0] = a_NumVertices-1;
    stacktop = 1;
    while (stacktop)
    {
        unsigned l,r,m;
        // pop current interval from the stack and subdivide:
        stacktop--;
        l = left[stacktop];
        r = right[stacktop];
        m = (l+r)>>1;
        // replace this with your triangle drawing function
        // or store the indices l,m,r and draw the triangles
        // as a triangle list later:
        DrawTriangleWithIndices (l,m,r);
        // recursive subdivide:
        if (m-l > 1)
        {
            left[stacktop] = l;
            right[stacktop] = m;
            stacktop++;
        }
        if (r-m > 1)
        {
            left[stacktop] = m;
            right[stacktop] = r;
            stacktop++;
        }
    }
}