GLSL: vertex shader and batching using a uniform buffer object (OpenGL)

I have the following vertex shader:
#version 450 core
...
layout (binding=2, std140) uniform MATRIX_BLOCK
{
    mat4 projection;
    mat4 view;
    mat4 model[128];
    mat4 mvp[128];
};
layout (location = 0) in vec3 aPos;
layout (location = 1) in vec2 aTexCoord;
layout (location = 3) in uint object_idx;
out vec2 TexCoord;
flat out uint instance_idx;
void main()
{
    gl_Position = mvp[object_idx] * vec4(aPos.x, aPos.y, aPos.z, 1.0);
    TexCoord = aTexCoord;
    instance_idx = object_idx;
}
I'm using a uniform buffer to pass in 128 model and model-view-projection matrices, indexed by an object id. The object id is passed to the shader using a vertex attribute object_idx; basically every vertex, besides having x,y,z coordinates and u,v texture coordinates, also has an object id associated with it. The idea would be to be able to store the data for multiple objects in the same buffers but still use specific transformation matrices for each individual object. This is my (possibly stupid) attempt to batch together multiple objects to draw them with a single draw call, without having to rebind anything, using glDrawElements to render triangles.
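For reference, the C++-side setup that fills this block looks roughly like this (a sketch with illustrative names, using glm; under std140 the mat4 members and arrays are tightly packed, so a plain struct matches the block layout):

struct MatrixBlock {
    glm::mat4 projection;
    glm::mat4 view;
    glm::mat4 model[128];
    glm::mat4 mvp[128];
};

GLuint ubo;
glGenBuffers(1, &ubo);
glBindBuffer(GL_UNIFORM_BUFFER, ubo);
glBufferData(GL_UNIFORM_BUFFER, sizeof(MatrixBlock), nullptr, GL_DYNAMIC_DRAW);
glBindBufferBase(GL_UNIFORM_BUFFER, 2, ubo); // binding = 2, as in the shader

// later, after computing the matrices:
glBufferSubData(GL_UNIFORM_BUFFER, 0, sizeof(MatrixBlock), &matrixBlockData);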
However, it doesn't work. When I do
gl_Position = mvp[object_idx] * vec4(aPos.x, aPos.y, aPos.z, 1.0);
then triangles with an object_idx of 0 get rendered just fine at the expected position, but triangles with vertices with object_idx's other than 0 don't appear anywhere. I thought I might have gotten the transformation matrices wrong, so for debugging, I reduced the possible objects to just 2 (0 and 1) and inverted the indexing using
gl_Position = mvp[1-object_idx] * vec4(aPos.x, aPos.y, aPos.z, 1.0);
This resulted in all the triangles with object_idx = 0 being rendered at the expected position for mvp[1], but again, no triangles with object_idx = 1 appearing anywhere. So at least I know that the transformation matrices are correct. I then tried
gl_Position = mvp[0] * vec4(aPos.x, aPos.y, aPos.z, 1.0);
and that renders all triangles (using object 0's transformation matrix) and
gl_Position = mvp[1] * vec4(aPos.x, aPos.y, aPos.z, 1.0);
renders all of them, using object 1's transformation matrix.
So, obviously I don't understand something really fundamental about how vertex shaders or glDrawElements do their work.
So, my question:
Why don't all my triangles get rendered when I do a "dynamic" lookup of the mvp transformation matrix using object_idx, when, as far as I can tell, all the data is passed into the vertex shader exactly as it's supposed to be?

I'm making an educated guess here:
layout (location = 3) in uint object_idx;
Using a uint attribute input requires setting up the attribute pointer with glVertexAttribIPointer() (note the extra I in there).
The standard glVertexAttribPointer function always sets up float attributes (and using type GL_INT there merely converts the integers to float). Technically, reading such a float attribute as uint in the shader is undefined, but it is quite likely that 0 stays 0, since the float and integer representations of zero are usually identical; that case works only by accident.
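A sketch of what the attribute setup should look like, assuming interleaved vertices (the Vertex struct and its layout are illustrative, not taken from the question):

struct Vertex {
    float  pos[3];    // location 0
    float  uv[2];     // location 1
    GLuint objectIdx; // location 3
};

// float attributes: the regular pointer function is correct
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex),
                      (void*)offsetof(Vertex, pos));
glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, sizeof(Vertex),
                      (void*)offsetof(Vertex, uv));

// integer attribute: must use the I variant, otherwise the uint gets
// converted to float and reading it back as uint is undefined
glVertexAttribIPointer(3, 1, GL_UNSIGNED_INT, sizeof(Vertex),
                       (void*)offsetof(Vertex, objectIdx));

glEnableVertexAttribArray(0);
glEnableVertexAttribArray(1);
glEnableVertexAttribArray(3);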
Apart from that issue, storing the object index per vertex is also quite inefficient. To batch your draw calls effectively, have a look at multi-draw calls and the gl_DrawID feature (originally from GL_ARB_shader_draw_parameters). You might also find the approaching-zero-driver-overhead (AZDO) techniques useful.
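For illustration, a multi-draw batch could look like this (a sketch; buildCommands() is a hypothetical helper that records where each object's indices live in the shared buffers):

// GL 4.3+ / ARB_multi_draw_indirect; this struct layout is fixed by the spec
struct DrawElementsIndirectCommand {
    GLuint count;         // number of indices for this object
    GLuint instanceCount; // 1 for plain batching
    GLuint firstIndex;    // offset into the shared index buffer
    GLuint baseVertex;    // offset into the shared vertex buffer
    GLuint baseInstance;  // unused here
};

std::vector<DrawElementsIndirectCommand> cmds = buildCommands(); // hypothetical

GLuint indirectBuf;
glGenBuffers(1, &indirectBuf);
glBindBuffer(GL_DRAW_INDIRECT_BUFFER, indirectBuf);
glBufferData(GL_DRAW_INDIRECT_BUFFER,
             cmds.size() * sizeof(DrawElementsIndirectCommand),
             cmds.data(), GL_STATIC_DRAW);

// one call draws every object; the vertex shader indexes mvp[gl_DrawID]
glMultiDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT,
                            nullptr, (GLsizei)cmds.size(), 0);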

Related

Rendering breaks when using an int in GLSL

I'm rendering light spheres for a deferred renderer and I'm in the process of switching to instancing for better performance. I have the following vertex shader:
in vec3 position;
uniform int test_index;
uniform mat4 projectionMatrix;
uniform mat4 viewMatrix;
uniform mat4 modelMatrix[256];
void main(void) {
    gl_Position = projectionMatrix * viewMatrix * modelMatrix[test_index] * vec4(position, 1.0);
}
I upload the matrices to the shader with
val location = glGetUniformLocation(programId, "modelMatrix[$i]")
glUniformMatrix4fv(location, false, buf)
When I use the int uniform as the index (hardcoded to 0 for debugging purposes), the sphere disappears, except when I clip into geometry (in which case it renders as a white circle). The same happens when I use gl_InstanceID as my index.
Weirdly, I noticed that the problem also occurs when I pass an int from the vertex to the fragment shader and use it there for something completely different, regardless of what I use as the index.
The problem disappears instantly and rendering is completely fine when I hardcode modelMatrix[0] in the shader instead of modelMatrix[test_index].
I've got a different shader (for skeletal animation) which uploads a mat4 uniform array the exact same way, also being indexed with an int, but I've got no such problems there...
I don't really know what to make of this, so any advice on how I can debug it is much appreciated. I'm using OpenGL 3.3 with Kotlin and LWJGL.
Edit: This probably has nothing to do with the uniform. The following also does not work:
int i = 0;
gl_Position = projectionMatrix * viewMatrix * modelMatrix[i] * vec4(position, 1.0f);
OpenGL has a limit on how many uniform components a single shader stage can use (the same applies to attributes, but that's not the problem here). An array of 256 mat4s alone needs 256 * 16 = 4096 components, which very likely exceeds the allowed amount; the GL 3.3 spec only guarantees 1024 for the vertex stage.
The reason the code only breaks when using the int uniform is that GLSL compilers do a lot of optimization under the hood, for example removing unused uniforms. So if you hardcode the array index in the shader, the compiler will notice that only a single matrix is ever used and may remove all the others.
When you need more uniforms than what OpenGL allows for, you have to use a uniform buffer object (UBO) or a shader storage buffer object (SSBO).
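If you want to verify this on your machine, you can query the limit (a quick sketch; assumes a current GL context):

GLint maxComponents = 0;
glGetIntegerv(GL_MAX_VERTEX_UNIFORM_COMPONENTS, &maxComponents);

// modelMatrix[256] alone needs 256 * 16 = 4096 float components;
// the GL 3.3 spec only guarantees 1024 for the vertex stage
printf("vertex uniform components available: %d, needed: %d\n",
       maxComponents, 256 * 16);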

How to instance draw with different transformations for multiple objects

I'm having a little problem with glDrawArraysInstanced().
Right now I'm trying to draw a chess board with pieces.
I have all the models loaded in properly.
I've tried drawing only the pawns with instanced drawing and it worked: I would send an array of transformation vec3s to the shader through a uniform and walk through the array with gl_InstanceID.
That would be done with this for loop (individual draw call for each model):
for (auto& i : this->models) {
    i->draw(this->shaders[0], count);
}
which eventually leads to:
glDrawArraysInstanced(GL_TRIANGLES, 0, vertices.size(), count);
where the vertex shader is:
#version 460
layout(location = 0) in vec3 vertex_pos;
layout(location = 1) in vec2 vertex_texcoord;
layout(location = 2) in vec3 vertex_normal;
out vec3 vs_pos;
out vec2 vs_texcoord;
out vec3 vs_normal;
flat out int InstanceID;
uniform mat4 modelMatrix;
uniform mat4 viewMatrix;
uniform mat4 projectionMatrix;
uniform vec3 offsets[16];
void main(void){
    vec3 offset = offsets[gl_InstanceID]; // saving the transformation in the offset
    InstanceID = gl_InstanceID; // unimportant
    vs_pos = vec4(modelMatrix * vec4(vertex_pos + offset, 1.f)).xyz; // using the offset
    vs_texcoord = vec2(vertex_texcoord.x, 1.f - vertex_texcoord.y);
    vs_normal = mat3(transpose(inverse(modelMatrix))) * vertex_normal;
    gl_Position = projectionMatrix * viewMatrix * modelMatrix * vec4(vertex_pos + offset, 1.f); // using the offset
}
Now my problem is that I don't know how to draw multiple objects this way with their own transformations, since gl_InstanceID starts from 0 on each draw call, so my transformation array would be read from the beginning every time (which would just draw the next pieces on the pawns' positions).
Any help will be appreciated.
You've got two problems. Or rather, you have one problem, but the natural solution will create a second problem for you.
The natural solution to your problem is to use one of the base-instance rendering functions, like glDrawElementsInstancedBaseInstance. These allow you to specify a starting instance for your instanced rendering calls.
This will precipitate a second problem: gl_InstanceID does not respect the base instance. It will always be on the range [0, instancecount). Only instance arrays respect the base instance. So instead of using a uniform to provide your per-instance data, you must use instance array rendering. This means storing the per-instance data in a buffer object (which you should have done anyway) and accessing it via a VS input whose VAO specifies that the particular attribute is instanced.
This also has the advantage of not restricting your instance count to uniform limitations.
OpenGL 4.6/ARB_shader_draw_parameters allows access to the gl_BaseInstance vertex shader input, which provides the baseinstance value specified by the draw command. So if you don't want to/can't use instanced arrays (for example, the amount of per-instance data is too big for the attribute limitations), you will have to rely on that extension/4.6 functionality. Recent desktop GL drivers offer this functionality, so if your hardware is decently new, you should be able to use it.
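As a sketch, the instance-array setup could look like this (names are illustrative; offsets is the per-instance data you currently upload as a uniform):

GLuint offsetBuf;
glGenBuffers(1, &offsetBuf);
glBindBuffer(GL_ARRAY_BUFFER, offsetBuf);
glBufferData(GL_ARRAY_BUFFER, offsets.size() * sizeof(glm::vec3),
             offsets.data(), GL_STATIC_DRAW);

// attribute 3 would be "layout(location = 3) in vec3 offset;" in the shader
glEnableVertexAttribArray(3);
glVertexAttribPointer(3, 3, GL_FLOAT, GL_FALSE, sizeof(glm::vec3), (void*)0);
glVertexAttribDivisor(3, 1); // advance once per instance, not per vertex

// each model reads its own slice of the offset buffer via the base instance
// (GL 4.2+ / ARB_base_instance)
glDrawArraysInstancedBaseInstance(GL_TRIANGLES, 0, vertexCount,
                                  instanceCount, baseInstance);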

DirectX11 / OpenGL only renders half of the texture

This is how it should look. It uses the same vertices/UV coordinates that are used for DX11 and OpenGL. This scene was rendered in DirectX 10.
This is how it looks in DirectX 11 and OpenGL.
I don't know how this can happen. I am using the same code on top for both DX10 and DX11, and they both handle things very similarly. Do you have an idea what the problem may be and how to fix it?
I can send code if needed.
I also tried using another texture, and I changed the transparent part of the texture to red.
Fragment Shader GLSL
#version 330 core

in vec2 UV;
in vec3 Color;

uniform sampler2D Diffuse;

out vec4 FragColor; // gl_FragColor and texture2D are not available in 330 core

void main()
{
    //FragColor = vec4(texture(Diffuse, UV).rgb, 1.0);
    FragColor = texture(Diffuse, UV);
    //FragColor = vec4(Color, 1);
}
Vertex Shader GLSL
#version 330 core
layout(location = 0) in vec3 vertexPosition;
layout(location = 1) in vec2 vertexUV;
layout(location = 2) in vec3 vertexColor;
layout(location = 3) in vec3 vertexNormal;
uniform mat4 Projection;
uniform mat4 View;
uniform mat4 World;
out vec2 UV;
out vec3 Color;
void main()
{
    mat4 MVP = Projection * View * World;
    gl_Position = MVP * vec4(vertexPosition, 1);
    UV = vertexUV;
    Color = vertexColor;
}
Quickly said, it looks like you are using back-face culling (which is good) and the faces on the other side of your model are wound the wrong way. You can check that this is the problem by turning back-face culling off (OpenGL: glDisable(GL_CULL_FACE)).
The real correction (if this was the problem) is to get the winding of the faces right; usually it is counter-clockwise. Where to fix it depends on where the model comes from: if you generate it on your own, correct the winding in your model-generation routine. Model files created by 3D modeling software usually have correct face winding.
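For completeness, the GL state involved (a minimal sketch):

// quick test: if the missing half reappears with culling off,
// the winding of those faces is reversed
glDisable(GL_CULL_FACE);

// the usual defaults, for reference:
glEnable(GL_CULL_FACE);
glCullFace(GL_BACK);
glFrontFace(GL_CCW); // counter-clockwise faces are front-facing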
This is just a guess, but are you telling the system the correct number of polygons to draw? Calls like glBufferData() take the size in bytes of the data, not the number of vertices or polygons. (Maybe they should have named the parameter numBytes instead of size?) Also, the size has to cover all the data: if you have colors, normals, texture coordinates and vertices interleaved, it needs to include the size of all of that.
This is made more confusing by the fact that glDrawElements() and other calls take the number of vertices as their size argument. The argument is named count, but it's not obvious that it's the vertex count, not the polygon count.
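To make the distinction concrete (a sketch; the Vertex layout and loadVertices() are hypothetical):

struct Vertex { float pos[3], normal[3], uv[2], color[3]; };
std::vector<Vertex> vertices = loadVertices(); // hypothetical helper

// size is in bytes and must cover every interleaved attribute
glBufferData(GL_ARRAY_BUFFER,
             vertices.size() * sizeof(Vertex),
             vertices.data(), GL_STATIC_DRAW);

// count here is the number of indices (vertices), not polygons
glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT, nullptr);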
I found the error.
The reason is that I forgot to set the texture sampler state to wrap/repeat.
It was set to clamp, so the UV coordinates were clamped to 1.
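On the OpenGL side, this corresponds to (a sketch, assuming the texture is bound):

glBindTexture(GL_TEXTURE_2D, textureId);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);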
A few things that you could try:
Is depth testing enabled? It seems that the inner faces of the polygons from the 'other' side are being rendered over the polygons that are closer to the viewpoint, which can happen if depth testing is disabled. Enable it just in case.
Is lighting enabled? If so, turn it off. Some flashes of white seem to appear in the rotating image; those could be caused by incorrect normals.
HTH

OpenGL Missing Triangles Diffuse Shader

I'm using C++, and when I implemented a diffuse shader it caused every other triangle to disappear.
I can post my render code if need be, but I believe the issue is with my normal matrix (which I wrote to be the transpose of the inverse of the model-view matrix). Here is the shader code, which is similar to one from the Lighthouse tutorials.
VERTEX SHADER
#version 330
layout(location=0) in vec3 position;
layout(location=1) in vec3 normal;
uniform mat4 transform_matrix;
uniform mat4 view_model_matrix;
uniform mat4 normal_matrix;
uniform vec3 light_pos;
out vec3 light_intensity;
void main()
{
    vec3 tnorm = normalize(normal_matrix * vec4(normal, 1.0)).xyz;
    vec4 eye_coords = transform_matrix * vec4(position, 1.0);
    vec3 s = normalize(vec3(light_pos - eye_coords.xyz)).xyz;
    vec3 light_set_intensity = vec3(1.0, 1.0, 1.0);
    vec3 diffuse_color = vec3(0.5, 0.5, 0.5);
    light_intensity = light_set_intensity * diffuse_color * max(dot(s, tnorm), 0.0);
    gl_Position = transform_matrix * vec4(position, 1.0);
}
My fragment shader just outputs the "light_intensity" in the form of a color. My model is straight from Blender and I have tried different exporting options like keeping vertex order, but nothing has worked.
This is not related to your shader.
It appears to be depth-test related: the depth order of the triangles relative to your viewport is messed up, because you do not make sure that only the pixel nearest to the camera gets drawn.
Enable depth testing and make sure you have a z-buffer bound to your render target.
Read more about this here: http://www.opengl.org/wiki/Depth_Test
Only the triangles highlighted in red should be visible to the viewer. Due to the lack of a valid depth test, there is no way to guarantee which triangle is painted topmost, so blue triangles of the faces that should not be visible cover parts of previously drawn red triangles.
The depth test prevents this by comparing the depth in the z-buffer with the depth of the pixel about to be drawn. Only the color of a pixel that is closer to the viewer, i.e. has a smaller z value than the value in the buffer, is written to the framebuffer, giving a correct result.
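In code, the fix boils down to (a minimal sketch, assuming the context/window was created with a depth buffer):

glEnable(GL_DEPTH_TEST);
glDepthFunc(GL_LESS); // keep the fragment closest to the camera

// the depth buffer must be cleared along with the color buffer each frame
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);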
(Back-face culling would be nice too and, if the model is exported correctly, would also make it display correctly. But it would only hide the main problem, not solve it.)

Is it faster to use texelFetch when rendering fonts?

I am writing some font drawing shaders in OpenGL 3.3. I will render my font into a texture atlas and then generate some display lists for some text I want to draw. I would like the rendering of text to consume the least amount of resources (CPU, GPU memory, GPU time). How can I accomplish this?
Looking at Freetype-gl, I noticed that the author generates 6 indices and 4 vertices per character.
Since I am using OpenGL 3.3, I have some additional freedom. My plan was to generate 1 vertex per character plus one integer "code" per character. The character code can be used in texelFetch operations to retrieve texture coördinates and character size information. A geometry shader turns the size information and vertex into a triangle strip.
Is texelFetch going to be slower than sending more vertices/texture coördinates? Is this worth doing, or is there a reason why it's not done in the font libraries I looked at?
Final code:
Vertex shader:
#version 330
uniform sampler2D font_atlas;
uniform sampler1D code_to_texture;
uniform mat4 projection;
uniform vec2 vertex_offset; // in view space.
uniform vec4 color;
uniform float gamma;
in vec2 vertex; // vertex in view space of each character adjusted for kerning, etc.
in int code;
out vec4 v_uv;
void main()
{
    v_uv = texelFetch(code_to_texture, code, 0);
    gl_Position = projection * vec4(vertex_offset + vertex, 0.0, 1.0);
}
Geometry shader:
#version 330
layout (points) in;
layout (triangle_strip, max_vertices = 4) out;
uniform sampler2D font_atlas;
uniform mat4 projection;
in vec4 v_uv[];
out vec2 g_uv;
void main()
{
    vec4 pos = gl_in[0].gl_Position;
    vec4 uv = v_uv[0];
    vec2 size = vec2(textureSize(font_atlas, 0)) * (uv.zw - uv.xy);
    vec2 pos_opposite = pos.xy + (mat2(projection) * size);

    gl_Position = vec4(pos.xy, 0, 1);
    g_uv = uv.xy;
    EmitVertex();

    gl_Position = vec4(pos.x, pos_opposite.y, 0, 1);
    g_uv = uv.xw;
    EmitVertex();

    gl_Position = vec4(pos_opposite.x, pos.y, 0, 1);
    g_uv = uv.zy;
    EmitVertex();

    gl_Position = vec4(pos_opposite.xy, 0, 1);
    g_uv = uv.zw;
    EmitVertex();

    EndPrimitive();
}
Fragment shader:
#version 330
uniform sampler2D font_atlas;
uniform vec4 color;
uniform float gamma;
in vec2 g_uv;
layout (location = 0) out vec4 fragment_color;
void main()
{
    float a = texture(font_atlas, g_uv).r;
    fragment_color.rgb = color.rgb;
    fragment_color.a = color.a * pow(a, 1.0 / gamma);
}
I wouldn't expect a significant performance difference between your proposed method and storing the quad vertex positions and texture coordinates in a vertex buffer. On the one hand, your method needs a smaller vertex buffer and less CPU work. On the other hand, the texelFetch calls will hit more-or-less random locations and won't make the best use of the cache; this last point may not matter much, as I guess that texture won't be very large. Also, the execution model of geometry shaders means they can quickly become the bottleneck of the pipeline.
To answer "is this worth doing?": I suspect not, for performance reasons. Unfortunately you can't tell until you implement it and measure the performance. I think it's quite a cool idea though, so I don't think you'd be wasting your time trying it out.
Maybe you can use an atomic counter to handle the current position in the text.
Here is an interesting paper on memory bandwidth.
GPU perf...
You can cache the result in an FBO.
For really fast rendering, as you said, you may build a geometry shader that takes points as input, outputs quads, and samples a texture to get additional per-glyph info.
This appears to be effectively the best solution...