GLSL/OpenGL shader tessellation flickering and failure - c++

I just started with OpenGL tessellation and have run into a bit of trouble. I am tessellating a series of patches formed by one vertex each. These vertices/patches are laid out in a grid to later form a terrain generated from Perlin noise.
The problem I have run into is that the second patch, and every 5th patch after that, sometimes gets heavily tessellated (not the way I configured it), but most of the time is not tessellated at all.
Like so:
The two white circles mark the over-tessellated patches. Also note the pattern of untessellated patches.
The strange thing is that it works on my Surface Pro 2 (Intel HD4400 graphics) but misbehaves on my main desktop computer (AMD HD6950 graphics). Is it possible the hardware is bad?
The patches are generated with the code:
vec4* patches = new vec4[m_patchesWidth * m_patchesDepth];
int c = 0;
for (unsigned int z = 0; z < m_patchesDepth; ++z) {
    for (unsigned int x = 0; x < m_patchesWidth; ++x) {
        patches[c] = vec4(x * 1.5f, 0, z * 1.5f, 1.0f);
        c++;
    }
}
m_fxTerrain->Apply();
glGenBuffers(1, &m_planePatches);
glBindBuffer(GL_ARRAY_BUFFER, m_planePatches);
glBufferData(GL_ARRAY_BUFFER, m_patchesWidth * m_patchesDepth * sizeof(vec4), patches, GL_STATIC_DRAW);
GLuint loc = m_fxTerrain->GetAttrib("posIn");
glEnableVertexAttribArray(loc);
glVertexAttribPointer(loc, 4, GL_FLOAT, GL_FALSE, sizeof(vec4), nullptr); // pass the queried location rather than a hardcoded 0
delete[] patches; // new[] pairs with delete[], not delete
And drawn with:
glPatchParameteri(GL_PATCH_VERTICES, 1);
glBindVertexArray(patches);
glPolygonMode(GL_FRONT_AND_BACK, GL_LINE);
glDrawArrays(GL_PATCHES, 0, nrOfPatches);
Vertex Shader:
#version 430 core
in vec4 posIn;
out gl_PerVertex {
    vec4 gl_Position;
};
void main() {
    gl_Position = posIn;
}
Control shader:
#version 430
#extension GL_ARB_tessellation_shader : enable
layout (vertices = 1) out;
uniform float OuterTessFactor;
uniform float InnerTessFactor;
out gl_PerVertex {
    vec4 gl_Position;
} gl_out[];
void main() {
    if (gl_InvocationID == 0) {
        gl_TessLevelOuter[0] = OuterTessFactor;
        gl_TessLevelOuter[1] = OuterTessFactor;
        gl_TessLevelOuter[2] = OuterTessFactor;
        gl_TessLevelOuter[3] = OuterTessFactor;
        gl_TessLevelInner[0] = InnerTessFactor;
        gl_TessLevelInner[1] = InnerTessFactor;
    }
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;
}
Evaluation shader:
#version 430
#extension GL_ARB_tessellation_shader : enable
layout (quads, equal_spacing, ccw) in;
uniform mat4 ProjView;
uniform sampler2D PerlinNoise;
out vec3 PosW;
out vec3 Normal;
out vec4 ColorFrag;
out gl_PerVertex {
    vec4 gl_Position;
};
void main() {
    vec4 pos = gl_in[0].gl_Position;
    pos.xz += gl_TessCoord.xy;
    pos.y = texture(PerlinNoise, pos.xz / vec2(8, 8)).x * 10.0f - 10.0f; // texture(), not the deprecated texture2D(), in core profile
    Normal = vec3(0, 1, 0);
    gl_Position = ProjView * pos;
    PosW = pos.xyz;
    ColorFrag = vec4(pos.x / 64.0f, 0.0f, pos.z / 64.0f, 1.0f);
}
Fragment shader:
#version 430 core
in vec3 PosW;
in vec3 Normal;
in vec4 ColorFrag;
in vec4 PosH;
out vec3 FragColor;
out vec3 FragNormal;
void main() {
    FragNormal = Normal;
    FragColor = ColorFrag.xyz;
}
I have tried hardcoding the different tessellation levels, but that did not help. I recently started out with OpenGL, so please let me know if I am doing something stupid.
So does anyone have any idea what could be causing this "flickering" of certain patches?
Update: I had a friend run the project and he got the same pattern of flickering tessellation, but the failing patches were not drawn at all except when being over-tessellated. He has the same graphics card as I do (AMD HD6950).

You should use triangle or quad tessellation, in which each patch has 3 or 4 vertices. As far as I can see you are using quads (I use them too). In that case, you can set it up like this:
glPatchParameteri(GL_PATCH_VERTICES, 4);
glBindVertexArray(VertexArray);
(TIP: use glDrawElements for your terrain; it gives much better performance for a 2D-displacement-based mesh. See the sketch below.)
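For example, a minimal sketch of the indexed approach, assuming a grid of (w + 1) x (d + 1) shared corner vertices (the names w, d and indexBuffer are mine, not from the question):
// Build one 4-vertex quad patch per grid cell, sharing corner vertices.
// w and d are the number of cells; vertices are laid out row-major.
std::vector<GLuint> indices;
indices.reserve(w * d * 4);
for (unsigned int z = 0; z < d; ++z) {
    for (unsigned int x = 0; x < w; ++x) {
        GLuint i0 = z * (w + 1) + x;   // lower-left corner of the cell
        indices.push_back(i0);         // v0
        indices.push_back(i0 + 1);     // v1
        indices.push_back(i0 + w + 2); // v2, the diagonally opposite corner
        indices.push_back(i0 + w + 1); // v3 (assuming CCW when seen from above)
    }
}
glGenBuffers(1, &indexBuffer);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexBuffer);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, indices.size() * sizeof(GLuint), indices.data(), GL_STATIC_DRAW);
// Then draw all patches with one call:
glPatchParameteri(GL_PATCH_VERTICES, 4);
glDrawElements(GL_PATCHES, (GLsizei)indices.size(), GL_UNSIGNED_INT, nullptr);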
In the control shader, use
layout (vertices = 4) out;
since your patch has 4 control points. The winding order is still important (CCW/CW).
Personally I don't like to use the built-in variables, so from the vertex shader you can send your vertex data to the tessellation control shader like this:
layout (location = 0) out vec3 outPos;
....
outPos.xz = grid.xy;
outPos.y = noise(outPos.xz);
Tess control:
layout (location = 0) in vec3 inPos[];   // outPos (location = 0) from the vertex shader;
                                         // 'collects' the 4 control points into an array, in the order they're sent
layout (location = 0) out vec3 outPos[]; // sends the control points on to the evaluation shader
...
gl_TessLevelOuter[0] = outt[0];
gl_TessLevelOuter[1] = outt[1];
gl_TessLevelOuter[2] = outt[2];
gl_TessLevelOuter[3] = outt[3];
gl_TessLevelInner[0] = inn[0];
gl_TessLevelInner[1] = inn[1];
outPos[ID] = inPos[ID]; // ID = gl_InvocationID
Note that both the in and out vertex data are arrays.
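Assembled, a complete control shader along these lines might look like this (a sketch; I'm assuming the tessellation factors outt/inn arrive as uniforms):
#version 430
layout (vertices = 4) out;
layout (location = 0) in vec3 inPos[];
layout (location = 0) out vec3 outPos[];
uniform float outt[4]; // outer tessellation factors (assumed)
uniform float inn[2];  // inner tessellation factors (assumed)
void main() {
    if (gl_InvocationID == 0) { // the levels only need to be written once per patch
        gl_TessLevelOuter[0] = outt[0];
        gl_TessLevelOuter[1] = outt[1];
        gl_TessLevelOuter[2] = outt[2];
        gl_TessLevelOuter[3] = outt[3];
        gl_TessLevelInner[0] = inn[0];
        gl_TessLevelInner[1] = inn[1];
    }
    // pass this invocation's control point through unchanged
    outPos[gl_InvocationID] = inPos[gl_InvocationID];
}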
The tess eval shader is simple:
layout (quads) in; // edit: don't forget this line
layout (location = 0) in vec3 inPos[];  // the 4 control points
layout (location = 0) out vec3 outPos;  // no longer an array; the next stage is the fragment shader
...
// linear interpolation of the x, y, z coordinates across the quad
vec3 interpolate3D(vec3 v0, vec3 v1, vec3 v2, vec3 v3)
{
    return mix(mix(v0, v1, gl_TessCoord.x), mix(v3, v2, gl_TessCoord.x), gl_TessCoord.y);
}
...
void main() {
    ...
    // the four control points of the quad; every generated vertex is linearly
    // interpolated between them according to gl_TessCoord
    outPos = interpolate3D(inPos[0], inPos[1], inPos[2], inPos[3]);
    gl_Position = mvp * vec4(outPos, 1.0f);
}
A good representation of the quad domain: http://ogldev.atspace.co.uk/www/tutorial30/tutorial30.html.
I think the problem is with your one-vertex patch. I cannot see how a one-vertex patch could be divided into triangles, and I don't know why it happens to work on some hardware. Tessellation is for dividing primitives into simpler primitives, into triangles in the case of OpenGL, since a GPU can handle those easily (3 points always lie in a plane). So the minimum number of patch vertices should be 3, for a triangle. I like quads because they are simpler to index and the memory cost is lower; they get divided into triangles during tessellation as well. http://www.informit.com/articles/article.aspx?p=2120983
Also, there is another type, isoline tessellation (check out the links; the second one is pretty good). A minimal isolines setup is sketched below.
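For reference, a minimal isolines evaluation shader might look like this (my sketch; in this mode gl_TessLevelOuter[0] sets the number of lines and gl_TessLevelOuter[1] the segments per line):
layout (isolines, equal_spacing) in;
layout (location = 0) in vec3 inPos[]; // control points, as in the quad version
uniform mat4 mvp;
void main() {
    // gl_TessCoord.x runs along each line, gl_TessCoord.y selects the line
    vec3 p = mix(inPos[0], inPos[1], gl_TessCoord.x);
    gl_Position = mvp * vec4(p, 1.0f);
}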
All in all, try it with quads or triangles, and set the control vertices to 4 (or 3). My (pretty complex) terrain shader is here, with frustum culling and tessellation-shader culling for a geoclipmap-based terrain; without tessellation it works with vertex morphing in the vertex shader. Maybe some part of this code will be useful: http://speedy.sh/TAvPR/gshader.txt
A scene tessellated at about 4 pixels/triangle runs at 75 FPS (measured with Fraps), with runtime normal calculation, bicubic smoothing and other things, on an AMD HD 5750. It could still be much faster with better code and pre-baked normals :D (it runs at 120 max without the normal calculation).
Oh, and you can send only the x and z coordinates if you displace the vertex in the shader. That is faster too; a sketch follows.
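A minimal sketch of that idea (the noise() height lookup is a placeholder for whatever noise function or texture you use):
// vertex shader: only x/z come in as an attribute, y is displaced here
layout (location = 0) in vec2 gridXZ;
layout (location = 0) out vec3 outPos;
void main() {
    outPos.xz = gridXZ;
    outPos.y = noise(gridXZ); // placeholder height function
}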
(Screenshot: lots of vertices.)

Related

Inverted geometry gBuffer positions for perspective. Orthographic is ok?

I have a deferred renderer which appears to work correctly: depth, colour and shading all come out as expected. However, the position buffer is fine for orthographic, while the geometry appears 'inverted' (or depth-disabled) when using a perspective projection.
I am getting the following buffer outputs for orthographic.
With the final 'shaded' image currently looking correct.
However when I am using a perspective projection I get the following buffers coming out...
And the final image is fine, although I don't incorporate any position-buffer information at the moment (N.B. only doing 'headlight' shading for now).
While the final image appears correct, the depth buffer appears to be ignored for my position buffer (there is no glDisable(GL_DEPTH_TEST) in the code).
The depth and normal buffers look OK to me; it's only the 'position' buffer which appears to be ignoring depth. The render pipeline is exactly the same for ortho and perspective, the only difference being the projection matrix.
I use glm::ortho and glm::perspective, and I calculate my near/far clipping distances on the fly based on the scene AABB. For orthographic my near/far are 1 and 11.4734 respectively, and for perspective they are 11.0875 and 22.5609. The width and height values are the same; the fov is 45 for the perspective projection.
I do have these calls before drawing any geometry...
glEnable(GL_DEPTH_TEST);
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
Which I use for compositing different layers as part of the render pipeline.
Am I doing anything wrong here? or am I misunderstanding something?
Here are my shaders...
Vertex shader of gBuffer...
#version 430 core
layout (std140) uniform MatrixPV
{
    mat4 P;
    mat4 V;
};
layout(location = 0) in vec3 InPoint;
layout(location = 1) in vec3 InNormal;
layout(location = 2) in vec2 InUV;
uniform mat4 M;
out vec4 Position;
out vec3 Normal;
out vec2 UV;
void main()
{
    mat4 VM = V * M;
    gl_Position = P * VM * vec4(InPoint, 1.0);
    Position = P * VM * vec4(InPoint, 1.0);
    Normal = mat3(M) * InNormal;
    UV = InUV;
}
Fragment shader of gBuffer...
#version 430 core
layout(location = 0) out vec4 gBufferPicker;
layout(location = 1) out vec4 gBufferPosition;
layout(location = 2) out vec4 gBufferNormal;
layout(location = 3) out vec4 gBufferDiffuse;
in vec3 Normal;
in vec4 Position;
vec4 Diffuse();
uniform vec4 PickerColour;
void main()
{
    gBufferPosition = Position;
    gBufferNormal = vec4(Normal.xyz, 1.0);
    gBufferPicker = PickerColour;
    gBufferDiffuse = Diffuse();
}
And here is the 'second pass' shader to visualise the position buffer...
#version 430 core
uniform sampler2D debugBufferPosition;
in vec2 UV;
out vec4 frag;
void main()
{
    vec3 val = texture(debugBufferPosition, UV).xyz;
    frag = vec4(val.xyz, 1.0);
}
I haven't used the position-buffer data yet, and I know I can reconstruct positions without storing them in another buffer. However, the positions are useful to me for other reasons, and I would like to know why they come out the way they do under perspective.
What you actually write into the position buffer is the clip-space coordinate:
Position = P * VM * vec4(InPoint, 1.0);
The clip-space coordinate is a homogeneous coordinate, and it is transformed into the normalized device coordinate (which is a Cartesian coordinate) by a perspective divide:
ndc = gl_Position.xyz / gl_Position.w;
With an orthographic projection the w component is 1, but with a perspective projection the w component contains a value which depends on the z component (the depth) of the (Cartesian) view-space coordinate.
I recommend storing the normalized device coordinate in the position buffer, rather than the clip-space coordinate, e.g.:
gBufferPosition = vec4(Position.xyz / Position.w, 1.0);
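Alternatively, many deferred renderers store the view-space position instead, which avoids the divide entirely. A sketch of that variant (my addition, not what the answer above recommends):
// vertex shader: output the view-space position instead of the clip-space one
Position = VM * vec4(InPoint, 1.0);
// fragment shader: write it to the g-buffer unchanged
gBufferPosition = vec4(Position.xyz, 1.0);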

OpenGL - uniform presence causing shader to be bypassed

UPDATE: So it turns out this was due to a bug on the C side of things, causing part of the matrix data to become malformed. The shaders are all fine. So if adding uniforms causes weird things to happen, my advice would be to use a debugger to check the value of ALL uniforms and make sure that they are all being set correctly.
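For example, something like this (my sketch, not part of the original post) lists every active uniform in a program so the locations can be checked against what you think you are setting:
// Enumerate all active uniforms in `program` and print their locations.
GLint count = 0;
glGetProgramiv(program, GL_ACTIVE_UNIFORMS, &count);
for (GLint i = 0; i < count; ++i) {
    char name[256];
    GLsizei length = 0;
    GLint size = 0;
    GLenum type = 0;
    glGetActiveUniform(program, (GLuint)i, sizeof(name), &length, &size, &type, name);
    GLint location = glGetUniformLocation(program, name);
    printf("uniform %s: location %d, size %d, type 0x%X\n", name, location, size, type);
}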
So I am trying to render depth to a cube map to use as a shadow map, but when I add and use a uniform in the fragment shader, everything renders white, as if the shader weren't being used. No warnings or errors are generated when compiling/linking the shader.
The shader program I am using to render the depth map (setting the depth simply to the fragment z position as a test) is as follows:
//vertex shader
#version 430
in layout(location=0) vec4 vertexPositionModel;
uniform mat4 modelToWorldMatrix;
void main() {
    gl_Position = modelToWorldMatrix * vertexPositionModel;
}
//geometry shader
#version 430
layout (triangles) in;
layout (triangle_strip, max_vertices=18) out;
out vec4 fragPositionWorld;
uniform mat4 projectionMatrices[6];
void main() {
    for (int face = 0; face < 6; face++) {
        gl_Layer = face;
        for (int i = 0; i < 3; i++) {
            fragPositionWorld = gl_in[i].gl_Position;
            gl_Position = projectionMatrices[face] * fragPositionWorld;
            EmitVertex();
        }
        EndPrimitive();
    }
}
//Fragment shader
#version 430
in vec4 fragPositionWorld;
void main() {
    gl_FragDepth = abs(fragPositionWorld.z);
}
The main shader samples from the cubemap and simply renders the depth as greyscale colour:
vec3 lightDirection = fragPositionWorld - pointLight.position;
float closestDepth = texture(shadowMap, lightDirection).r;
finalColour = vec4(vec3(closestDepth), 1.0);
The scene is a small cube in a larger cubic room, and renders as expected, dark near z = 0 and the cube projected back onto the wall (The depth map is being rendered from the centre of the room):
Good:
I can move the small cube around and the projection projects correctly onto all the sides of the cubemap. All good so far.
The problem appears when I add a uniform to the fragment shader, i.e.:
#version 430
in vec4 fragPositionWorld;
uniform vec3 lightPos;
void main() {
    gl_FragDepth = min(lightPos.y, 0.5);
}
Everything renders as white, the same as if the shader had failed to compile:
Bad:
gDEBugger reports that the uniform is set correctly (0, 4, 0), but regardless of what lightPos is, gl_FragDepth should be set to a value less than 0.5 and appear as a shade of grey (which is what happens if I set gl_FragDepth = 0.5 directly). So I can only conclude that the fragment shader is not being used for some reason and a default one is used instead. Unfortunately I have no idea why.

Layered rendering of a shadow cubemap

I'm trying to render a shadow cubemap in one pass, using layered rendering.
I've tried to be as thorough as possible:
I have bound a cubemap to both the depth attachment (a GL_DEPTH_COMPONENT32F texture) and color attachment 0 (GL_R32F) using glFramebufferTexture
I made sure to check the framebuffer's completeness once the textures are attached to the FBO; it reports complete
I have tried both geometry shader instancing using "layout(triangles, invocations=6) in;" and without (resorting to a for(int layer = 0; layer < 6; ++layer) loop, setting gl_Layer = layer, first for each primitive, then for each vertex)
Long story short, the first layer (i.e. X+ in this case) gets rendered, but none of the others do, be it in the depth or the color attachment.
It seems documentation on layered rendering is pretty sparse; even the red book spends at most half a page on it... Anyway:
The code:
Shaders:
Vertex:
#version 440 core
layout(location = 0) in vec3 attrPosition;
void main()
{
    gl_Position = vec4(attrPosition, 1.0);
}
Geometry:
#version 440 core
layout(triangles, invocations = 6) in;
layout(triangle_strip, max_vertices = 18) out;
uniform mat4 dkModelMatrix;
uniform mat4 dkViewMatrices[6];
uniform mat4 dkProjectionMatrix;
void main()
{
    gl_Layer = gl_InvocationID;
    for(int i = 0; i < 3; ++i)
    {
        gl_Layer = gl_InvocationID;
        gl_Position = dkProjectionMatrix * dkViewMatrices[gl_InvocationID] * dkModelMatrix * gl_in[i].gl_Position;
        EmitVertex();
    }
    EndPrimitive();
}
Fragment:
#version 440 core
layout(location = 0) out vec4 dkFragCoord;
void main()
{
    dkFragCoord = vec4(vec3(float(gl_Layer) * 0.1 + 0.5), 1.0);
}
C++ (mostly using my engine's classes, which do the bare minimum and have already been tested, in the case of FBOs, with 2D (spot) shadow maps):
Shadowmap-related variables creation : https://gist.github.com/xtrium-lnx/77d8989b3c2370607cfc
Shadowmap rendering : https://gist.github.com/xtrium-lnx/387b97c077525be60bb4

Is it faster to use texelFetch when rendering fonts?

I am writing some font-drawing shaders in OpenGL 3.3. I will render my font into a texture atlas and then generate some display lists for the text I want to draw. I would like the rendering of text to consume the least amount of resources (CPU, GPU memory, GPU time). How can I accomplish this?
Looking at Freetype-gl, I noticed that the author generates 6 indices and 4 vertices per character.
Since I am using OpenGL 3.3, I have some additional freedom. My plan is to generate 1 vertex per character plus one integer "code" per character. The character code can be used in texelFetch operations to retrieve texture coördinates and character-size information. A geometry shader then turns the size information and vertex into a triangle strip.
Is texelFetch going to be slower than sending more vertices/texture coördinates? Is this worth doing, or is there a reason why it's not done in the font libraries I looked at?
Final code:
Vertex shader:
#version 330
uniform sampler2D font_atlas;
uniform sampler1D code_to_texture;
uniform mat4 projection;
uniform vec2 vertex_offset; // in view space.
uniform vec4 color;
uniform float gamma;
in vec2 vertex; // vertex in view space of each character adjusted for kerning, etc.
in int code;
out vec4 v_uv;
void main()
{
    v_uv = texelFetch(code_to_texture, code, 0);
    gl_Position = projection * vec4(vertex_offset + vertex, 0.0, 1.0);
}
Geometry shader:
#version 330
layout (points) in;
layout (triangle_strip, max_vertices = 4) out;
uniform sampler2D font_atlas;
uniform mat4 projection;
in vec4 v_uv[];
out vec2 g_uv;
void main()
{
    vec4 pos = gl_in[0].gl_Position;
    vec4 uv = v_uv[0];
    vec2 size = vec2(textureSize(font_atlas, 0)) * (uv.zw - uv.xy);
    vec2 pos_opposite = pos.xy + (mat2(projection) * size);
    gl_Position = vec4(pos.xy, 0, 1);
    g_uv = uv.xy;
    EmitVertex();
    gl_Position = vec4(pos.x, pos_opposite.y, 0, 1);
    g_uv = uv.xw;
    EmitVertex();
    gl_Position = vec4(pos_opposite.x, pos.y, 0, 1);
    g_uv = uv.zy;
    EmitVertex();
    gl_Position = vec4(pos_opposite.xy, 0, 1);
    g_uv = uv.zw;
    EmitVertex();
    EndPrimitive();
}
Fragment shader:
#version 330
uniform sampler2D font_atlas;
uniform vec4 color;
uniform float gamma;
in vec2 g_uv;
layout (location = 0) out vec4 fragment_color;
void main()
{
    float a = texture(font_atlas, g_uv).r;
    fragment_color.rgb = color.rgb;
    fragment_color.a = color.a * pow(a, 1.0 / gamma);
}
I wouldn't expect a significant performance difference between your proposed method and storing the quad vertex positions and texture coordinates in a vertex buffer. On the one hand, your method requires a smaller vertex buffer and less work for the CPU. On the other hand, the texelFetch calls will hit more-or-less random locations and won't make the best use of the cache; this last point may not be very significant, as I guess that texture won't be very large. Also, the execution model of geometry shaders means they can quickly become the bottleneck of the pipeline.
To answer "is this worth doing?": I suspect not, for performance reasons. Unfortunately you can't tell until you implement it and measure the performance, e.g. with a timer query as sketched below. I think it's quite a cool idea though, so I don't think you'd be wasting your time trying it out.
Maybe you can use an atomic counter to keep track of the current position in the text.
Here is an interesting paper on memory bandwidth and GPU perf...
You can cache the result in an FBO.
For really fast rendering, as you said, you may build a geometry shader that takes points as input and outputs quads, sampling a texture to get additional glyph info.
That appears to be effectively the best solution...

Calculate per Vertex Normals in Geometry Shader after Tesselation

I've succeeded in getting tessellation control and evaluation shaders to work correctly, but the lighting for my scene is still blocky because I've been calculating per-face (triangle) normals instead of per-vertex normals. Now that I'm using tessellation, it seems necessary to calculate new normal vectors for the newly tessellated vertices in either the tess eval or the geometry shader.
In order to get smooth per-vertex normals, I need to calculate per-face normals for every triangle sharing the vertex in question, then compute a weighted average of those face normals. But I'm not sure how to get at all of those triangle faces in the tess eval or geometry shader. The layout (triangles_adjacency) in option for the geometry shader looks promising, but there's not much information about how to use it.
Is it possible to compute smooth per-vertex normals on the GPU in this way, without the use of a normal/bump map? The end goal is smooth per-vertex lighting that benefits from the increased level of tessellation detail.
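For what it's worth, a per-face normal is easy to compute in a geometry shader after tessellation; note, though, that the tessellator does not generate adjacency information, so (as far as I know) triangles_adjacency input is only available when tessellation is disabled. A flat-normal sketch, using the tePosition output of the evaluation shader below:
#version 400
layout (triangles) in;
layout (triangle_strip, max_vertices = 3) out;
in vec3 tePosition[]; // from the evaluation shader
out vec3 gNormal;
void main()
{
    // face normal from the cross product of two triangle edges
    // (flat shading, not the smooth normals asked about)
    vec3 n = normalize(cross(tePosition[1] - tePosition[0],
                             tePosition[2] - tePosition[0]));
    for (int i = 0; i < 3; ++i) {
        gNormal = n;
        gl_Position = gl_in[i].gl_Position;
        EmitVertex();
    }
    EndPrimitive();
}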
Here are my tessellation control and evaluation shaders:
// control
#version 400
layout (vertices = 3) out;
in vec3 vPosition[];
out vec3 tcPosition[];
const float tessLevelInner = 1.0;
const float tessLevelOuter = 1.0;
void main()
{
    tcPosition[gl_InvocationID] = vPosition[gl_InvocationID];
    if (gl_InvocationID == 0) {
        gl_TessLevelInner[0] = tessLevelInner;
        gl_TessLevelOuter[0] = tessLevelOuter;
        gl_TessLevelOuter[1] = tessLevelOuter;
        gl_TessLevelOuter[2] = tessLevelOuter;
    }
}
// eval
#version 400
layout (triangles, equal_spacing, cw) in;
uniform mat4 uProj;
uniform mat4 uModelView;
in vec3 tcPosition[];
out vec3 tePosition;
void main()
{
    vec3 p0 = gl_TessCoord.x * tcPosition[0];
    vec3 p1 = gl_TessCoord.y * tcPosition[1];
    vec3 p2 = gl_TessCoord.z * tcPosition[2];
    tePosition = p0 + p1 + p2;
    gl_Position = uProj * uModelView * vec4(tePosition, 1);
}