OpenGL: Performing TexCoord calculations within shader, bad practice? - c++

I’m currently working on a 2D graphics engine for a game. My shader takes in 2 UV offset floats and calculates the TexCoord by applying the offset.
Here is a sample of my vertex shader:
#shader vertex
#version 330 core
layout(location = 0) in vec4 position;
layout(location = 1) in vec2 texCoord;
out vec2 v_TexCoord;
uniform float u_Offset;
uniform float v_Offset;
void main()
{
gl_Position = position;
v_TexCoord = vec2(texCoord.x + u_Offset, texCoord.y + v_Offset);
};
Should I worry about this causing performance issues in the long run? How big of a difference would it make if I were to perform the calculations CPU side before passing the final UV in, and is it worth optimizing?

Modifying the attributes data of a mesh is the main purpose of programmable pipeline, moving it to the CPU will be translated into a peformance downgrade. Also, as pointed out by #BDL , you will need to re-send the data to the GPU, which is the worst part of the whole process.
A different case is when you are performing a calculation which is the same for all the shader instances, which will be more appropiate to perform such operation on the CPU and upload it as an uniform.

Related

Rendering breaks when using an int in GLSL

I'm rendering light spheres for a deferred renderer and I'm in the process of switching to instancing for better performance. I have the following vertex shader:
in vec3 position;
uniform int test_index;
uniform mat4 projectionMatrix;
uniform mat4 viewMatrix;
uniform mat4 modelMatrix[256];
void main(void) {
gl_Position = projectionMatrix * viewMatrix * modelMatrix[test_index] * vec4(position, 1.0);
}
I upload the matrices to the shader with
val location = glGetUniformLocation(programId, "modelMatrix[$i]")
glUniformMatrix4fv(location, false, buf)
When I use the int uniform in the index (Hardcoded to a 0 for debugging purposes), the sphere disappears, except when I clip into geometry (in which case it renders as a white circle). The same happens when I use gl_InstanceID as my index.
Weirdly I noticed that the problem also occurs when I pass an int from vertex to fragment shader and use it there for something completely different, regardless of what I use as the index.
The problem disappears instantly and rendering is completely fine when I hardcode modelMatrix[0] in the shader instead of modelMatrix[test_index].
I've got a different shader (for skeletal animation) which uploads a mat4 uniform array the exact same way, also being indexed with an int, but I've got no such problems there...
I don't really know what to make of this, so any advice on how I can debug this is much appreciated. I'm using OpenGL 3.3 on Kotlin+LWJGL
Edit: This probably has nothing to do with the uniform. The following also does not work:
int i = 0;
gl_Position = projectionMatrix * viewMatrix * modelMatrix[i] * vec4(position, 1.0f);
OpenGL has a limit on how many uniforms one can use. The same applies to attributes too (but that's not the problem here). An array of 256 matrices is very likely to exceed the allowed amount.
The reason why the code only breaks when using the int uniform is that glsl compilers do a lot of optimization under the hood, for example, removing unused uniforms. So if you hardcode the array location in the shader, the compiler will notice that only a single matrix is ever used and might remove all the others.
When you need more uniforms than what OpenGL allows for, you have to use a uniform buffer object (UBO) or a shader storage buffer (SSBO).

flat shading in webGL

I'm trying to implement flat-shading in webgl,
I knew that varying keyword in vertex shader will interpolation that value and pass it to fragment shader.
I'm trying to disable interpolation, and I found that flat keyword can do this, but it seems cannot use in webgl?
flat varying vec4 fragColor;
always getting error: Illegal use of reserved word 'flat'
Check out webGL 2. Flat shading is supported.
For vertex shadder:
#version 300 es
in vec4 vPos; //vertex position from application
flat out vec4 vClr;//color sent to fragment shader
void main(){
gl_Position = vPos;
vClr = gl_Position;//for now just using the position as color
}//end main
For fragment shader
#version 300 es
precision mediump float;
flat in vec4 vClr;
out vec4 fragColor;
void main(){
fragColor = vClr;
}//end main
I think 'flat' is not supported by the version of GLSL used in WebGL. If you want flat shading, there are several options:
1) replicate the polygon's normal in each vertex. It is the simplest solution, but I find it a bit unsatisfactory to duplicate data.
2) in the vertex shader, transform the vertex in view coordinates, and in the fragment shader, compute the normal using the dFdx() and dFdy() functions that compute derivatives. These functions are supported by the extension GL_OES_standard_derivatives (you need to check whether it is supported by the GPU before using it), most GPUs, including the ones in smartphones, support the extension.
My vertex shader is as follows:
struct VSUniformState {
mat4 modelviewprojection_matrix;
mat4 modelview_matrix;
};
uniform VSUniformState GLUP_VS;
attribute vec4 vertex_in;
varying vec3 vertex_view_space;
void main() {
vertex_view_space = (GLUP_VS.modelview_matrix * vertex_in).xyz;
gl_Position = GLUP_VS.modelviewprojection_matrix * vertex_in;
}
and in the associated fragment shader:
#extension GL_OES_standard_derivatives : enable
varying vec3 vertex_view_space;
...
vec3 U = dFdx(vertex_view_space);
vec3 V = dFdy(vertex_view_space);
N = normalize(cross(U,V));
... do the lighting with N
I like this solution because it makes the setup code simpler. A drawback may be that it gives more work to the fragment shader (but with today's GPUs it should not be a problem). If performance is an issue, it may be a good idea to measure it.
3) another possibility is to have a geometry shader (if supported) that computes the normals. In general it is slower (but again, it may be a good idea to measure performance, it may depend on the specific GPU).
See also answers to this question:
How to get flat normals on a cube
My implementation is available here:
http://alice.loria.fr/software/geogram/doc/html/index.html
Some online web-GL demos are here (converted from C++ to JavaScript using emscripten):
http://homepages.loria.fr/BLevy/GEOGRAM/

DirectX11 / OpenGL only renders half of the texture

This is how it should look like. It uses the same vertices/uv coordinates which are used for DX11 and OpenGL. This scene was rendered in DirectX10.
This is how it looks like in DirectX11 and OpenGL.
I don't know how this can happen. I am using for both DX10 and DX11 the same code on top and also they both handle things really similiar. Do you have an Idea what the problem may be and how to fix it?
I can send code if needed.
also using another texture.
changed the transparent part of the texture to red.
Fragment Shader GLSL
#version 330 core
in vec2 UV;
in vec3 Color;
uniform sampler2D Diffuse;
void main()
{
//color = texture2D( Diffuse, UV ).rgb;
gl_FragColor = texture2D( Diffuse, UV );
//gl_FragColor = vec4(Color,1);
}
Vertex Shader GLSL
#version 330 core
layout(location = 0) in vec3 vertexPosition;
layout(location = 1) in vec2 vertexUV;
layout(location = 2) in vec3 vertexColor;
layout(location = 3) in vec3 vertexNormal;
uniform mat4 Projection;
uniform mat4 View;
uniform mat4 World;
out vec2 UV;
out vec3 Color;
void main()
{
mat4 MVP = Projection * View * World;
gl_Position = MVP * vec4(vertexPosition,1);
UV = vertexUV;
Color = vertexColor;
}
Quickly said, it looks like you are using back face culling (which is good), and the other side of your model is wrongly winded. You can ensure that this is the problem by turning back face culling off (OpenGL: glDisable(GL_CULL_FACE​)).
The real correction is (if this was the problem) to have correct winding of faces, usually it is counter-clockwise. This depends where you get this model. If you generate it on your own, correct winding in your model generation routine. Usually, model files created by 3D modeling software have correct face winding.
This is just a guess, but are you telling the system the correct number of polygons to draw? Calls like glBufferData() take the size in bytes of the data, not the number of vertices or polygons. (Maybe they should have named the parameter numBytes instead of size?) Also, the size has to contain the size of all the data. If you have color, normals, texture coordinates and vertices all interleaved, it needs to include the size of all of that.
This is made more confusing by the fact that glDrawElements() and other stuff takes the number of vertices as their size argument. The argument is named count, but it's not obvious that it's vertex count, not polygon count.
I found the error.
The reason is that I forgot to set the Texture SamplerState to Wrap/Repeat.
It was set to clamp so the uv coordinates were maxed to 1.
A few things that you could try :
Is depth test enabled ? It seems that your inner faces of the polygons from the 'other' side are being rendered over the polygons that are closer to the view point. This could happen if depth test is disabled. Enable it just in case.
Is lighting enabled ? If so turn it off. Some flashes of white seem to be coming in the rotating image. Could be because of incorrect normals ...
HTH

How does OpenGL interpolate varying variables on the fragment shader even if they are only set three times on the vertex shader?

Since vertex shader is run once per vertex (that mean in triangle 3 times), how does the varying variable gets computed for every fragment, if it's assigned (as in the example) only three times?
Fragment shader:
precision mediump float;
varying vec4 v_Color;
void main() {
gl_FragColor = v_Color;
}
Vertex shader:
attribute vec4 a_Position;
attribute vec4 a_Color;
varying vec4 v_Color;
void main() {
v_Color = a_Color;
gl_Position = a_Position;
}
So, the question is, how does the system behind this know, how to compute the variable v_Color at every fragment, since this shader assigns v_Color only 3 times (in a triangle).
All outputs of the vertex shader are per vertex. When you set v_Color in the vertex shader, it sets it on the current vertex. When the fragment shader runs, it reads the v_Color value for each vertex in the primitive and interpolates between them based on the fragment's location.
First of all, it is a mistake to assume that the vertex shader is run once per-vertex. Using indexed rendering, primitive assembly can usually access the post T&L cache (result of previous vertex shader invocations) based on vertex index to eliminate evaluating a vertex more than once. However, new things such as geometry shaders can easily cause this to break down.
As for how the fragment shader gets its value, that is generally done during rasterization. Those per-vertex attributes are interpolated along the surface of the primitive (triangle in this case) based on the fragment's distance relative to the vertices that were used to build the primitive. In DX11 interpolation can be deferred until the fragment shader itself runs (called "pull-model" interpolation), but traditionally this is something that happens during rasterization.

Is it faster to use texelFetch when rendering fonts?

I am writing some font drawing shaders in OpenGL 3.3. I will render my font into a texture atlas and then generate some display lists for some text I want to draw. I would like the rendering of text to consume the least amount of resources (CPU, GPU memory, GPU time). How can I accomplish this?
Looking at Freetype-gl, I noticed that the author generates 6 indices and 4 vertices per character.
Since I am using OpenGL 3.3, I have some additional freedom. My plan was to generate 1 vertex per character plus one integer "code" per character. The character code can be used in texelFetch operations to retrieve texture coördinates and character size information. A geometry shader turns the size information and vertex into a triangle strip.
Is texelFetch going to be slower than sending more vertices/texture coördinates? Is this worth doing?, or is there are reason why it's not done in the font libraries I looked at?
Final code:
Vertex shader:
#version 330
uniform sampler2D font_atlas;
uniform sampler1D code_to_texture;
uniform mat4 projection;
uniform vec2 vertex_offset; // in view space.
uniform vec4 color;
uniform float gamma;
in vec2 vertex; // vertex in view space of each character adjusted for kerning, etc.
in int code;
out vec4 v_uv;
void main()
{
v_uv = texelFetch(
code_to_texture,
code,
0);
gl_Position = projection * vec4(vertex_offset + vertex, 0.0, 1.0);
}
Geometry shader:
#version 330
layout (points) in;
layout (triangle_strip, max_vertices = 4) out;
uniform sampler2D font_atlas;
uniform mat4 projection;
in vec4 v_uv[];
out vec2 g_uv;
void main()
{
vec4 pos = gl_in[0].gl_Position;
vec4 uv = v_uv[0];
vec2 size = vec2(textureSize(font_atlas, 0)) * (uv.zw - uv.xy);
vec2 pos_opposite = pos.xy + (mat2(projection) * size);
gl_Position = vec4(pos.xy, 0, 1);
g_uv = uv.xy;
EmitVertex();
gl_Position = vec4(pos.x, pos_opposite.y, 0, 1);
g_uv = uv.xw;
EmitVertex();
gl_Position = vec4(pos_opposite.x, pos.y, 0, 1);
g_uv = uv.zy;
EmitVertex();
gl_Position = vec4(pos_opposite.xy, 0, 1);
g_uv = uv.zw;
EmitVertex();
EndPrimitive();
}
Fragment shader:
#version 330
uniform sampler2D font_atlas;
uniform vec4 color;
uniform float gamma;
in vec2 g_uv;
layout (location = 0) out vec4 fragment_color;
void main()
{
float a = texture(font_atlas, g_uv).r;
fragment_color.rgb = color.rgb;
fragment_color.a = color.a * pow(a, 1.0 / gamma);
}
I wouldn't expect there to be a significant performance difference between your proposed method vs storing the quad vertex positions and texture coordinates in a vertex buffer. On the one hand your method requires a smaller vertex buffer and less work for the CPU. On the other hand the texelFetch calls will be more-or-less at random locations, and not make the best use of the cache. This last point may not be very significant as I guess that texture wont be very large. Also, the execution model of geometry shaders mean they can quickly become the bottleneck of the pipeline.
To answer "is this worth doing?" - I suspect not for performance reasons. Unfortunately you can't tell until you implement it and measure the performance. I think it's quite a cool idea though, so I don't think you'd be wasting your time trying it out.
Maybe you can use Atomic Counter to handle current position in text.
Here is an interresting paper on memory bandwidth
GPU perf...
You can cache the result in a fbo.
For realy fast rendering as you said, you may build a geom shader taking points as input and outputing quads and sample a texture to get additional on glyph info.
This appear effectively the best solution...