How to convert large arrays of quad primitives to triangle primitives? - opengl

I have an existing system, which provides 3D meshes. The provided data are an array of vertex coordinates with 3 components (x, y, z) and an index list.
The issue is that the index list is a consecutive array of quad primitives.
The system first has to be made runnable with a core profile OpenGL context, and later with OpenGL ES 3.x as well.
I know that all the quads have the same winding order (counter-clockwise), but I have no further information about them. I don't know anything about their relations or adjacencies.
Since I want to use a core profile context for rendering, I cannot use the GL_QUADS primitive type. I have to convert the quads to triangles.
Of course the array of quad indices can easily be converted to an array of triangle indices:
std::vector<unsigned int> triangles;
triangles.reserve( no_of_indices * 6 / 4 );
for ( int i = 0; i < no_of_indices; i += 4 )
{
    unsigned int tri[] = { quad[i], quad[i+1], quad[i+2], quad[i], quad[i+2], quad[i+3] };
    triangles.insert( triangles.end(), tri, tri+6 );
}
If that has to be done only once, then that would be the solution. But the mesh data are not static. The data can change dynamically.
The data do not change continuously and every time, but the data change unpredictably and randomly.
Another simple solution would be to create a vertex array object which directly refers to an element array buffer containing the quads, and to draw them in a loop with the GL_TRIANGLE_FAN primitive type:
for ( int i = 0; i < no_of_indices; i += 4 )
    glDrawElements( GL_TRIANGLE_FAN, 4, GL_UNSIGNED_INT, (void*)(sizeof(unsigned int) * i) );
But I hope there is a better solution. I'm searching for a possibility to draw the quads with one single draw call, or to transform the quads to triangles on the GPU.

If that has to be done only once, then that would be the solution. But the mesh data are not static.
The mesh data may be dynamic, but the topology of that list is the same. Every 4 vertices is a quad, so every 4 vertices represents the triangles (0, 1, 2) and (0, 2, 3).
So you can build an arbitrarily large static index buffer containing an ever-increasing series of these numbers (0, 1, 2, 0, 2, 3, 4, 5, 6, 4, 6, 7, etc.). You can even use baseVertex rendering to offset them, so the same index buffer can render different series of quads.
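Generating that static index buffer is a one-time CPU task. A minimal sketch (the function name is my own) could look like:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Build a static index buffer that renders consecutive quads as triangles.
// Quad n occupies vertices 4n..4n+3 and becomes the triangles (0, 1, 2)
// and (0, 2, 3) relative to its first vertex.
std::vector<uint16_t> buildQuadIndices(uint16_t quadCount)
{
    assert(quadCount <= 16384); // GLushort indices limit one batch to 16384 quads
    static const uint16_t pattern[6] = { 0, 1, 2, 0, 2, 3 };
    std::vector<uint16_t> indices;
    indices.reserve(quadCount * 6u);
    for (uint16_t q = 0; q < quadCount; ++q)
        for (uint16_t k = 0; k < 6; ++k)
            indices.push_back(static_cast<uint16_t>(q * 4u + pattern[k]));
    return indices;
}
```

Upload the result once with glBufferData(GL_ELEMENT_ARRAY_BUFFER, ...) and it never has to be touched again, no matter how the vertex data changes.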
My suggestion would be to make the index buffer use GLushort as the index type. This way, your index data only takes up 12 bytes per quad. Using shorts limits you to 16384 quads in a single drawing command, but you can reuse the same index buffer to draw multiple series of quads with baseVertex rendering:
constexpr GLushort batchSize = 16384;
constexpr unsigned int vertsPerQuad = 6;

void drawQuads(GLuint quadCount)
{
    // Assume VAO is set up.
    int baseVertex = 0;
    while (quadCount > batchSize)
    {
        glDrawElementsBaseVertex(GL_TRIANGLES, batchSize * vertsPerQuad, GL_UNSIGNED_SHORT, 0, baseVertex * 4);
        baseVertex += batchSize;
        quadCount -= batchSize;
    }
    glDrawElementsBaseVertex(GL_TRIANGLES, quadCount * vertsPerQuad, GL_UNSIGNED_SHORT, 0, baseVertex * 4);
}
If you want slightly less index data, you can use primitive restart indices. These let you designate one index value to mean "restart the primitive", so you can use a GL_TRIANGLE_STRIP primitive, break it up into separate quads, and still issue only a single draw call. Instead of 6 indices per quad you then need 5, with the 5th being the restart index (a strip visits the quad's corners in the order 0, 1, 3, 2). So now your GLushort indices only take up 10 bytes per quad. However, batchSize must now be 16383, since the index 0xFFFF is reserved for restarting, and vertsPerQuad must be 5.
Of course, baseVertex rendering works just fine with primitive restarting, so the above code works too.
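The restart-index buffer can be generated the same way; here is a sketch of the 5-indices-per-quad layout (the function name is my own, and the strip order 0, 1, 3, 2 is the standard way to cover a quad 0-1-2-3 with a two-triangle strip):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

const uint16_t kRestartIndex = 0xFFFF; // reserved value, so at most 16383 quads per batch

// Build an index buffer that draws each quad as a 4-vertex triangle strip,
// with quads separated by the primitive restart index:
// 5 indices (10 bytes with GLushort) per quad instead of 6.
std::vector<uint16_t> buildQuadStripIndices(uint16_t quadCount)
{
    assert(quadCount <= 16383);
    static const uint16_t pattern[4] = { 0, 1, 3, 2 }; // strip order for quad 0-1-2-3
    std::vector<uint16_t> indices;
    indices.reserve(quadCount * 5u);
    for (uint16_t q = 0; q < quadCount; ++q)
    {
        for (uint16_t k = 0; k < 4; ++k)
            indices.push_back(static_cast<uint16_t>(q * 4u + pattern[k]));
        indices.push_back(kRestartIndex);
    }
    return indices;
}
```

At draw time you would enable GL_PRIMITIVE_RESTART_FIXED_INDEX (or glPrimitiveRestartIndex(0xFFFF) on desktop GL) and issue one glDrawElements with GL_TRIANGLE_STRIP.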

First I want to mention that this is not meant as a definitive answer; I just want to share my current solution to this issue.
That means I'm still looking for "the" solution, the perfectly acceptable one.
In my solution, I decided to use Tessellation. I draw patches with a size of 4:
glPatchParameteri( GL_PATCH_VERTICES, 4 );
glDrawElements( GL_PATCHES, no_of_indices, GL_UNSIGNED_INT, 0 );
The Tessellation Control Shader can be omitted completely, since only its default behavior is needed: the patch data is passed straight from the vertex shader invocations to the tessellation primitive generator.
The Tessellation Evaluation Shader uses quadrilateral patches (layout quads) and creates 2 triangles per patch:
#version 450

layout(quads, ccw) in;

in TInOut
{
    vec3 pos;
} inData[];

out TInOut
{
    vec3 pos;
} outData;

uniform mat4 u_projectionMat44;

void main()
{
    const int inx_map[4] = int[4](0, 1, 3, 2);
    float i_quad = dot( vec2(1.0, 2.0), gl_TessCoord.xy );
    int inx = inx_map[int(round(i_quad))];
    outData.pos = inData[inx].pos;
    gl_Position = u_projectionMat44 * vec4( outData.pos, 1.0 );
}
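The index computation in that shader can be sanity-checked on the CPU: dot(vec2(1.0, 2.0), gl_TessCoord.xy) enumerates the four patch corners as 0..3, and the lookup table reorders them to match the quad's vertex order. A plain C++ mirror of the GLSL logic (my own helper, for verification only):

```cpp
#include <cassert>
#include <cmath>

// Mirror of the TES logic: map a corner tess coordinate (u, v) in {0,1}^2
// to the index of the patch vertex that belongs at that corner.
int cornerToPatchIndex(float u, float v)
{
    static const int inx_map[4] = { 0, 1, 3, 2 };
    float i_quad = 1.0f * u + 2.0f * v; // dot(vec2(1.0, 2.0), gl_TessCoord.xy)
    return inx_map[static_cast<int>(std::lround(i_quad))];
}
```

The four corners (0,0), (1,0), (0,1), (1,1) map to patch vertices 0, 1, 3, 2, so the generated triangles walk the quad in its original winding.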
An alternative solution would be to use a Geometry Shader. The input primitive type lines_adjacency provides 4 vertices, which can be mapped to 2 triangles (triangle_strip). Of course this is something of a hack, since a lines adjacency is something completely different from a quad, but it works anyway.
glDrawElements( GL_LINES_ADJACENCY, no_of_indices, GL_UNSIGNED_INT, 0 );
Geometry Shader:
#version 450

layout( lines_adjacency ) in;
layout( triangle_strip, max_vertices = 4 ) out;

in TInOut
{
    vec3 pos;
} inData[];

out TInOut
{
    vec3 pos;
} outData;

uniform mat4 u_projectionMat44;

void main()
{
    const int inx_map[4] = int[4](0, 1, 3, 2);
    for ( int i = 0; i < 4; ++i )
    {
        outData.pos = inData[inx_map[i]].pos;
        gl_Position = u_projectionMat44 * vec4( outData.pos, 1.0 );
        EmitVertex();
    }
    EndPrimitive();
}
An improvement would be to use Transform Feedback to capture new buffers, containing triangle primitives.

Related

How to draw only certain parts of an array filled with vertex data

CONTEXT:
I am trying to create an open-world, nature-like scene in OpenGL. An array terrainVertices of length 3*NUM_OF_VERTICES contains the vertex data for the terrain, which is a noise-generated heightmap. Each vertex has x, y, z coordinates, and based on the value of y a certain color is assigned to the vertex, producing a somewhat smooth transition between deep waters and mountain peaks.
PROBLEM:
The lakes form where a neighbourhood of vertices has y values such that y < 0. The colouring is performed as expected, but the result is not realistic; it looks like a blue pit.
The way I decided to tackle this issue is by creating a layer of vertices that appears over the lake, with y = 0 and a light blue color with a low alpha value, thus creating the illusion of a surface beneath which lies the actual lake.
The terrainVertices array is indexed by an element array elements[NUM_OF_ELEMENTS]. I iterate over the element array and try to find triplets of indices that correspond to vertices with y < 0. I gather every triplet matching this condition in a vector, then create a new vertex array object with the same vertex data as the terrain, but with the new element buffer object I just described. The vertices that are "underwater" have their y value replaced by 0, so as to stick to the surface of the lake.
Here's the code I used to accomplish this:
std::vector<GLuint> waterElementsV;
// iterating over every triangle
for (int i = 0; i < NUM_OF_ELEMENTS / 3; i++) {
    // accessing each vertex based on its index from the element array
    // and checking if its height is lower than zero;
    // elements[3*i+0, +1, +2] are the indices of one triangle,
    // terrainVertices[index + 1] is the height of a vertex
    if (terrainVertices[3 * (elements[3 * i]) + 1] < 0 &&
        terrainVertices[3 * (elements[3 * i + 1]) + 1] < 0 &&
        terrainVertices[3 * (elements[3 * i + 2]) + 1] < 0) {
        // since it's a valid triangle, add its indices to the elements array
        waterElementsV.push_back(elements[3 * i]);
        waterElementsV.push_back(elements[3 * i + 1]);
        waterElementsV.push_back(elements[3 * i + 2]);
    }
}
// iterating through the terrain vertices, and setting
// each appropriate vertex's y value to the water surface level y = 0
for (unsigned int i = 0; i < waterElementsV.size(); i++) {
    currentIndex = waterElementsV[i];
    terrainVertices[3 * currentIndex + 1] = 0;
}
//get the vector's underlying array
waterElements = waterElementsV.data();
glGenVertexArrays(1, &waterVAO);
glBindVertexArray(waterVAO);
glGenBuffers(1, &waterVerticesVBO);
glBindBuffer(GL_ARRAY_BUFFER, waterVerticesVBO);
glBufferData(GL_ARRAY_BUFFER, NUM_OF_VERTICES, terrainVertices, GL_STATIC_DRAW);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, NULL);
glEnableVertexAttribArray(0);
glGenBuffers(1, &waterEBO);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, waterEBO);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(waterElements), waterElements, GL_STATIC_DRAW);
The shaders are very simple, as I only need to assign a position and a colour. No model/view/projection matrices are used, because every object is drawn relative to the terrain anyway.
Vertex Shader:
#version 330 core

layout (location = 0) in vec3 pos;

void main()
{
    gl_Position = vec4(pos, 1.0);
}
Fragment Shader:
#version 330 core

out vec4 fragColor;

void main()
{
    fragColor = vec4(0, 0, 0, 1);
}
The result is exactly the same as in the picture above. I've been racking my brain about what is going on, but I cannot seem to figure it out. Note that the lakes are spread out at random places and are not necessarily adjacent to each other.
Any help or tips are greatly appreciated :)

OpenGL compute shader normal map generation poor performance

I have a height cube map and I want to generate a normal cube map texture from it. My height cube map is just a 2048x2048 image that I load at the beginning of the application for each face of the cube, and I can modify a "maximum height" value in real time, which is used as a multiplier when retrieving a pixel from the height map.
Initially I was calculating the normals in the vertex shader, but it gave me bad lighting results so I decided to move the calculations in the fragment shader.
As the height map does not change every frame (only when I modify the "maximum height" value), I want to generate a normal map texture from it using a compute shader, because I don't need any rasterization. But it gives me very poor performance:
with the fragment shader I ran at 200 FPS, but with the compute shader I run at 40 FPS.
Here is how I bind my images and start the compute work:
_computeShaderProgram.use();
glUniform1f(_computeShaderProgram.getUniformLocation("maxHeight"), maxHeight);
glBindImageTexture(
0,
static_cast<GLuint>(heightMap),
0,
GL_TRUE,
0,
GL_READ_ONLY,
GL_RGBA32F
);
glBindImageTexture(
1,
static_cast<GLuint>(normalMap),
0,
GL_TRUE,
0,
GL_WRITE_ONLY,
GL_RGBA32F
);
// Start compute work
// I only compute for one face of the cube map
glDispatchCompute(normalMap.getWidth() / 16, normalMap.getWidth() / 16, 1);
glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT);
And the compute shader:
#version 430 core
#extension GL_ARB_compute_shader : enable
layout(local_size_x = 16, local_size_y = 16, local_size_z = 1) in;
layout(rgba32f, binding = 0) readonly uniform imageCube heightMap;
layout(rgba32f, binding = 1) writeonly uniform imageCube normalMap;
uniform float maxHeight;
float getHeight(ivec3 heightMapCoord) {
    vec4 heightMapValue = imageLoad(heightMap, heightMapCoord);
    return heightMapValue.r * maxHeight;
}

void main() {
    ivec3 textCoord = ivec3(gl_GlobalInvocationID);

    // Calculate height of neighbors
    float leftCubePosHeight   = getHeight(textCoord + ivec3(-1,  0, 0));
    float rightCubePosHeight  = getHeight(textCoord + ivec3( 1,  0, 0));
    float topCubePosHeight    = getHeight(textCoord + ivec3( 0, -1, 0));
    float bottomCubePosHeight = getHeight(textCoord + ivec3( 0,  1, 0));

    // Calculate normal using the central differences method
    vec3 horizontal = vec3(2.0, rightCubePosHeight - leftCubePosHeight, 0.0);
    vec3 vertical   = vec3(0.0, bottomCubePosHeight - topCubePosHeight, 2.0);
    vec3 normal     = normalize(cross(vertical, horizontal));

    imageStore(normalMap, textCoord, vec4(normal, 1.0));
}
I tried different work group counts (width, width / 8, width / 16, width / 32) and local sizes (1, 8, 16, 32), but the performance is always poor: around 40 FPS, or 20 FPS for a work group spanning the full width.
I know I could use shared memory for threads in the same work group, to avoid fetching the same texture coordinate 4 times, but later the height map will be generated procedurally and will, I think, be larger than 2048x2048.
What is the difference between the fragment shader and the compute shader that makes it so slow? Am I doing something wrong?
Are there any other solutions to generate this normal map?
EDIT:
The FPS figures I gave above are not right, because I was only generating 1/16 of the normal map (when I had 40 FPS). I also used the central differences technique to calculate the normals, which is cheap but does not give good lighting results, so I switched to the Sobel technique, which is a little more expensive.
I made some tests to know which technique could give the best performance.
Each frame I generate the normal map (this will not be the case later, but it's just to test the performance). Here are my tests:
CPU side single thread: 1.5FPS
Compute shader with local sizes of 1 and one worker group for each image pixel: 4FPS
Compute shader with local sizes of 16 and one worker group for each 16x16 image pixels block: 11FPS
Fragment shader using framebuffer and MRT with 6 color attachments (one for each face of the normal map): 12.5FPS
This is a little laggy when I modify the max height (which regenerates the normal map), but I think it's okay, as I won't modify it a lot.
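For reference, the Sobel-based normal mentioned above can be sketched on the CPU side like this. This is my own formulation, not the poster's exact code; the height function, the strength factor (playing the role of maxHeight), and the axis conventions are assumptions:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Sobel-filtered normal from a height function h(x, y): weigh the 3x3
// neighborhood to get the two gradients, then build a tangent-space normal.
template <typename HeightFn>
Vec3 sobelNormal(HeightFn h, int x, int y, float strength)
{
    // Sobel gradients over the 3x3 neighborhood
    float gx = h(x - 1, y - 1) + 2.0f * h(x - 1, y) + h(x - 1, y + 1)
             - h(x + 1, y - 1) - 2.0f * h(x + 1, y) - h(x + 1, y + 1);
    float gy = h(x - 1, y - 1) + 2.0f * h(x, y - 1) + h(x + 1, y - 1)
             - h(x - 1, y + 1) - 2.0f * h(x, y + 1) - h(x + 1, y + 1);
    Vec3 n = { gx * strength, gy * strength, 1.0f };
    float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    return { n.x / len, n.y / len, n.z / len };
}
```

The same arithmetic translates directly into the compute shader, reading the neighborhood with imageLoad instead of calling a height function.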

OpenGL - Provide a set of values in a 1D texture

I want to provide a set of values in a 1D texture. Please consider the following simple example:
gl.glBindTexture(GL4.GL_TEXTURE_1D, myTextureHandle);
FloatBuffer values = Buffers.newDirectFloatBuffer(N);
for (int x = 0; x < N; ++x)
values.put(x);
values.rewind();
gl.glTexImage1D(GL4.GL_TEXTURE_1D, 0, GL4.GL_R32F, N, 0, GL4.GL_RED, GL4.GL_FLOAT, values);
Here, N is the number of values I want to store in the texture. However, calling textureSize(myTexture, 0) in my fragment shader yields 1, no matter what I set N to. So, what's going wrong here?
EDIT: The code above is executed at initialization. My rendering loop looks like
gl.glClear(GL4.GL_COLOR_BUFFER_BIT | GL4.GL_DEPTH_BUFFER_BIT);
gl.glUseProgram(myProgram);
gl.glActiveTexture(MY_TEXTURE_INDEX);
gl.glBindTexture(GL4.GL_TEXTURE_1D, myTextureHandle);
gl.glUniform1i(uMyTexture, MY_TEXTURE_INDEX);
gl.glDrawArrays(GL4.GL_POINTS, 0, 1);
My vertex shader consists of a main function which does nothing. I'm using the geometry shader to create a fullscreen quad. The fragment shader code looks like:
uniform sampler1D myTexture;
out vec4 color;

void main()
{
    if (textureSize(myTexture, 0) == 1)
    {
        color = vec4(1, 0, 0, 1);
        return;
    }
    color = vec4(1, 1, 0, 1);
}
The result is a red-colored window.
Make sure your texture is complete. Since GL_TEXTURE_MIN_FILTER defaults to GL_NEAREST_MIPMAP_LINEAR you'll have to supply a full set of mipmaps.
Or set GL_TEXTURE_MIN_FILTER to GL_NEAREST/GL_LINEAR.
You also need to pass GL_TEXTURE0 + MY_TEXTURE_INDEX (instead of only MY_TEXTURE_INDEX) to glActiveTexture():
gl.glActiveTexture( GL_TEXTURE0 + MY_TEXTURE_INDEX );
...
gl.glUniform1i( uMyTexture, MY_TEXTURE_INDEX );

openGL using glVertexAttribPointer

I created a quad using glBegin(GL_QUADS) and drawing vertices, and now I want to pass an array of texture coordinates into my shader so that I can apply a texture to the quad.
I'm having some trouble getting the syntax right.
First I made a 2D array of values
GLfloat coords[4][2];
coords[0][0] = 0;
coords[0][1] = 0;
coords[1][0] = 1;
coords[1][1] = 0;
coords[2][0] = 1;
coords[2][1] = 1;
coords[3][0] = 0;
coords[3][1] = 1;
and then I tried to put it into my shader where I have a attribute vec2 texcoordIn
GLint texcoord = glGetAttribLocation(shader->programID(), "texcoordIn");
glEnableVertexAttribArray(texcoord);
glVertexAttribPointer(texcoord, ???, GL_FLOAT, ???, ???, coords);
So I'm confused as to what I should put in for parameters to glVertexAttribPointer that I marked with '???' and I'm also wondering if I'm even allowed to represent the texture coordinates as a 2d array like I did in the first place.
The proper values would be
glVertexAttribPointer(
    texcoord,
    2,        /* two components per element */
    GL_FLOAT,
    GL_FALSE, /* don't normalize, has no effect for floats */
    0,        /* distance between elements in bytes, or 0 if tightly packed */
    coords);
and I'm also wondering if I'm even allowed to represent the texture coordinates as a 2d array like I did in the first place.
If you write it the very way you did above, i.e. using a statically allocated array, then yes, because the C standard guarantees that the elements are tightly packed in memory. However, if you use a dynamically allocated array of pointers to pointers, then no.
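That packing guarantee is easy to check: a static two-dimensional array occupies one contiguous block, so its total size is exactly rows × cols × sizeof(element) and the rows follow each other without gaps. A quick sketch (using plain float in place of GLfloat):

```cpp
#include <cassert>
#include <cstddef>

// A static 2D array is one contiguous block: 8 floats with no padding,
// so it can be handed to glVertexAttribPointer with stride 0 (tightly packed).
float coords[4][2] = {
    { 0.0f, 0.0f },
    { 1.0f, 0.0f },
    { 1.0f, 1.0f },
    { 0.0f, 1.0f },
};
```

A `float**` built from separate row allocations has no such guarantee, which is why it cannot be passed to glVertexAttribPointer directly.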

Use index as coordinate in OpenGL

I want to implement a timeseries viewer that allows a user to zoom and smoothly pan.
I've done some immediate mode opengl before, but that's now deprecated in favor of VBOs. All the examples of VBOs I can find store XYZ coordinates of each and every point.
I suspect that I need to keep all my data in VRAM in order to get a framerate during pan that can be called "smooth", but I have only Y data (the dependent variable). X is an independent variable which can be calculated from the index, and Z is constant. If I have to store X and Z then my memory requirements (both buffer size and CPU->GPU block transfer) are tripled. And I have tens of millions of data points through which the user can pan, so the memory usage will be non-trivial.
Is there some technique for either drawing a 1-D vertex array, where the index is used as the other coordinate, or storing a 1-D array (probably in a texture?) and using a shader program to generate the XYZ? I'm under the impression that I need a simple shader anyway under the new fixed-feature-less pipeline model to implement scaling and translation, so if I could combine the generation of X and Z coordinates and scaling/translation of Y that would be ideal.
Is this even possible? Do you know of any sample code that does this? Or can you at least give me some pseudocode saying what GL functions to call in what order?
Thanks!
EDIT: To make sure this is clear, here's the equivalent immediate-mode code, and vertex array code:
// immediate mode
glBegin(GL_LINE_STRIP);
for( int i = 0; i < N; ++i )
    glVertex2f(i, y[i]);
glEnd();

// vertex array
struct { float x, y; } v[N];
for( int i = 0; i < N; ++i ) {
    v[i].x = i;
    v[i].y = y[i];
}
glVertexPointer(2, GL_FLOAT, 0, v);
glDrawArrays(GL_LINE_STRIP, 0, N);
note that v[] is twice the size of y[].
That's perfectly fine for OpenGL.
A Vertex Buffer Object (VBO) can store any information you want, in any of the GL-supported formats. You can fill a VBO with just a single coordinate per vertex:
glGenBuffers( 1, &buf_id);
glBindBuffer( GL_ARRAY_BUFFER, buf_id );
glBufferData( GL_ARRAY_BUFFER, N*sizeof(float), data_ptr, GL_STATIC_DRAW );
And then bind the proper vertex attribute format for a draw:
glBindBuffer( GL_ARRAY_BUFFER, buf_id );
glEnableVertexAttribArray(0); // hard-coded here for the sake of example
glVertexAttribPointer(0, 1, GL_FLOAT, GL_FALSE, 0, NULL);
In order to use it you'll need a simple shader program. The vertex shader can look like:
#version 130

in float at_coord_Y;

void main() {
    float coord_X = float(gl_VertexID);
    gl_Position = vec4(coord_X, at_coord_Y, 0.0, 1.0);
}
Before linking the shader program, you should bind its at_coord_Y to the attribute index you'll use (= 0 in my code):
glBindAttribLocation(program_id,0,"at_coord_Y");
Alternatively, you can ask the program after linking for the index that was automatically assigned to this attribute, and then use it:
const int attrib_pos = glGetAttribLocation(program_id,"at_coord_Y");
Good luck!
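To also fold in the pan and zoom the question asks about, the vertex shader would just apply a scale and an offset to both the index-derived X and the stored Y; the arithmetic can be sketched and sanity-checked on the CPU side. The names and the exact mapping here are my own, not from the answer above:

```cpp
#include <cassert>

struct Point2 { float x, y; };

// Map a sample (index i, value y) into view coordinates, mirroring what a
// vertex shader would do with pan/zoom uniforms:
//   x = (i - panX) * zoomX,   y' = (y - panY) * zoomY
Point2 mapSample(int i, float y, float panX, float zoomX, float panY, float zoomY)
{
    return { (static_cast<float>(i) - panX) * zoomX,
             (y - panY) * zoomY };
}
```

In GLSL the same expression would use gl_VertexID in place of i, with panX/zoomX/panY/zoomY as uniforms, so panning only requires updating four floats per frame rather than re-uploading any vertex data.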
Would you really store tens of millions of XY coordinates in VRAM?
I would suggest computing those coordinates on the CPU and passing them to the shader pipeline as uniforms (since the coordinates are fixed with respect to the panned image).
Keep it simple.