I'm using OpenGL to render fractals, which can result in very long draw calls (multiple seconds). If a draw call ends up taking too long, I get this error:
Unhandled exception at 0x69A5E899 (nvoglv32.dll) in Mandelbulb.exe: Fatal program exit requested.
Presumably OpenGL assumes I'm stuck in an infinite loop and therefore aborts the program.
I've isolated the problem, as it goes away if I lower the quality (in the example code, notice that the program runs just fine if maxIterations is set to something reasonable like 1000).
Is there any way to hint to OpenGL that it shouldn't abort?
Maybe using compute shaders instead will work?
Sample code for reproducing (a very crude example, not the focus): make an OpenGL project, draw two triangles that overlap to form a rectangle covering the entire screen, then use this fragment shader to color it.
Vertex:
#version 330 core
layout(location = 0) in vec4 position;
out vec4 pos;
void main()
{
gl_Position = position;
pos = position;
}
Fragment:
#version 330 core
layout(location = 0) out vec4 color;
in vec4 pos;
const int maxIterations = 8000000;
void main()
{
vec2 npos = pos.xy;
vec2 z=vec2(0.0,0.0);
int i = maxIterations;
for (; i > 0; i--)
{
float y2 = z.y * z.y;
float x2 = z.x * z.x;
if (x2 + y2 > 4) break;
z.y = 2*z.x*z.y+npos.y;
z.x = x2 - y2 + npos.x;
}
color = vec4(float(i)/float(maxIterations));
}
Edit:
Trying what #Scheff recommended worked. Sadly, however, I can't accept a comment as an answer.
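In general, one way to stay under the driver's timeout is to split the work across several shorter draw calls, e.g. by advancing the iteration a bounded number of steps per frame and keeping the intermediate state in a floating-point texture that the host ping-pongs between two FBOs. A rough sketch of such a fragment shader (the uniform names u_prevState and u_stepsPerFrame are made up):
#version 330 core
// Progressive pass: instead of running all 8,000,000 iterations in one
// draw call, advance the stored per-pixel state by a bounded number of
// steps each frame and write it back to a floating-point render target.
in vec4 pos;
layout(location = 0) out vec4 state;  // xy = current z, z = iterations done, w = escaped flag
uniform sampler2D u_prevState;        // state texture written by the previous pass
uniform int u_stepsPerFrame;          // small enough that one draw finishes well under the timeout
void main()
{
vec4 s = texture(u_prevState, pos.xy * 0.5 + 0.5);
vec2 z = s.xy;
float done = s.z;
float escaped = s.w;
vec2 c = pos.xy;
for (int k = 0; k < u_stepsPerFrame && escaped < 0.5; ++k)
{
if (dot(z, z) > 4.0) { escaped = 1.0; break; }
z = vec2(z.x * z.x - z.y * z.y + c.x, 2.0 * z.x * z.y + c.y);
done += 1.0;
}
state = vec4(z, done, escaped);
}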
I'm currently working on a tile game in LibGDX and I'm trying to get a "fog of war" effect by obscuring unexplored tiles. The result I get from this is a dynamically generated black texture of the size of the screen that only covers unexplored tiles leaving the rest of the background visible. This is an example of the fog texture rendered on top of a white background:
What I'm now trying to achieve is to dynamically fade the inner borders of this texture to make it look more like a fog that slowly thickens instead of just a bunch of black boxes put together on top of the background.
Googling the problem, I found out I could use shaders to do this, so I tried to learn some GLSL (I'm at the very start with shaders) and came up with this shader:
VertexShader:
//attributes passed from openGL
attribute vec3 a_position;
attribute vec2 a_texCoord0;
//variables visible from java
uniform mat4 u_projTrans;
//variables shared between fragment and vertex shader
varying vec2 v_texCoord0;
void main() {
v_texCoord0 = a_texCoord0;
gl_Position = u_projTrans * vec4(a_position, 1.0);
}
FragmentShader:
//variables shared between fragment and vertex shader
varying vec2 v_texCoord0;
//variables visible from java
uniform sampler2D u_texture;
uniform vec2 u_textureSize;
uniform int u_length;
void main() {
vec4 texColor = texture2D(u_texture, v_texCoord0);
vec2 step = 1.0 / u_textureSize;
if(texColor.a > 0.0) {
int maxNearPixels = (u_length * 2 + 1) * (u_length * 2 + 1) - 1;
for(int i = 0; i <= u_length; i++) {
for(int j = 0; j <= u_length; j++) {
if(i != 0 || j != 0) {
texColor.a -= (1.0 - texture2D(u_texture, v_texCoord0 + vec2(step.x * float(i), step.y * float(j))).a) / float(maxNearPixels);
texColor.a -= (1.0 - texture2D(u_texture, v_texCoord0 + vec2(-step.x * float(i), step.y * float(j))).a) / float(maxNearPixels);
texColor.a -= (1.0 - texture2D(u_texture, v_texCoord0 + vec2(step.x * float(i), -step.y * float(j))).a) / float(maxNearPixels);
texColor.a -= (1.0 - texture2D(u_texture, v_texCoord0 + vec2(-step.x * float(i), -step.y * float(j))).a) / float(maxNearPixels);
}
}
}
}
gl_FragColor = texColor;
}
This is the result I got setting a length of 20:
So the shader I wrote kind of works, but it has terrible performance because it's O(n^2), where n is the length of the fade in pixels (which can be quite high, like 60 or even 80). It also has some problems: the edges are still a bit too sharp (I'd like a smoother transition), and some corners of the border are less faded than others (I'd like the fade to be uniform everywhere).
I'm a little bit lost at this point: is there anything I can do to make it better and faster? As I said, I'm new to shaders, so: is this even the right way to use them?
As others mentioned in the comments, instead of blurring in screen space you should filter in tile space, potentially exploiting the GPU's bilinear filtering. Let's go through it with images.
First, define a texture in which each pixel corresponds to a single tile, black or white depending on the fog at that tile. Here is such a texture blown up:
After applying the screen-to-tile coordinate transformation and sampling that texture with GL_NEAREST interpolation, we get a blocky result similar to what you have:
float t = texture2D(u_tiles, M*uv).r;
gl_FragColor = vec4(t,t,t,1.0);
If instead of GL_NEAREST we switch to GL_LINEAR, we get a somewhat better result:
This still looks a little blocky. To improve on that we can apply a smoothstep:
float t = texture2D(u_tiles, M*uv).r;
t = smoothstep(0.0, 1.0, t);
gl_FragColor = vec4(t,t,t,1.0);
Or here is a version with a linear shade-mapping function:
float t = texture2D(u_tiles, M*uv).r;
t = clamp((t-0.5)*1.5 + 0.5, 0.0, 1.0);
gl_FragColor = vec4(t,t,t,1.0);
Note: these images were generated within a gamma-correct pipeline (i.e. sRGB framebuffer enabled). This is one of those few scenarios, however, where ignoring gamma may give better results, so you're welcome to experiment.
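For reference, a complete fragment shader in this style might look roughly like the sketch below. The names u_tiles and u_screenToTiles are assumptions: u_tiles is the one-texel-per-tile fog texture (created with GL_LINEAR filtering), and u_screenToTiles plays the role of M above, mapping screen coordinates into tile-texture coordinates.
varying vec2 v_texCoord0;            // screen-space UV from the vertex shader
uniform sampler2D u_tiles;           // one texel per tile, GL_LINEAR filtering
uniform mat3 u_screenToTiles;        // screen UV -> tile-texture UV (the "M" above)
void main() {
vec2 tileUV = (u_screenToTiles * vec3(v_texCoord0, 1.0)).xy;
float t = texture2D(u_tiles, tileUV).r;  // bilinear filtering does the fading
t = smoothstep(0.0, 1.0, t);             // soften the remaining blockiness
gl_FragColor = vec4(t, t, t, 1.0);
}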
The shader compiles successfully, but the program crashes as soon as rendering starts. This is the error I get: "no uniform with name 'u_texture' in shader". This is what my shader looks like:
#ifdef GL_ES
precision mediump float;
#endif
uniform float time;
uniform vec2 mouse;
uniform vec2 resolution;
varying vec2 surfacePosition;
#define MAX_ITER 10
void main( void ) {
vec2 p = surfacePosition*4.0;
vec2 i = p;
float c = 0.0;
float inten = 1.0;
for (int n = 0; n < MAX_ITER; n++) {
float t = time * (1.0 - (1.0 / float(n+1)));
i = p + vec2(
cos(t - i.x) + sin(t + i.y),
sin(t - i.y) + cos(t + i.x)
);
c += 1.0/length(vec2(
p.x / (sin(i.x+t)/inten),
p.y / (cos(i.y+t)/inten)
)
);
}
c /= float(MAX_ITER);
gl_FragColor = vec4(vec3(pow(c,1.5))*vec3(0.99, 0.97, 1.8), 1.0);
}
Can someone please help me? I don't know what I'm doing wrong. BTW, this is a shader I found on the internet, so I know it works; the only problem is making it work with libGDX.
libGDX's SpriteBatch assumes that your shader will have a u_texture uniform. To get around this, just add
ShaderProgram.pedantic = false; (see the Javadoc) before passing your shader program to the SpriteBatch.
UPDATE: raveesh is right about the shader compiler removing unused uniforms and attributes, but libGDX wraps the OpenGL shader in its own ShaderProgram class.
Not only should you add the uniform u_texture to your shader program, you should also use it; otherwise it will be optimized away by the shader compiler.
But looking at your shader, you don't seem to need the uniform anyway, so check your program for something like shader.setUniformi("u_texture", 0); and remove that line. It should work fine then.
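If, on the other hand, you do want the texture bound by SpriteBatch to take part in the effect, the fragment shader has to actually sample u_texture so the compiler cannot strip it. A minimal sketch, assuming the vertex shader passes texture coordinates through as v_texCoords (the name libGDX's default shader uses):
#ifdef GL_ES
precision mediump float;
#endif
varying vec2 v_texCoords;        // provided by libGDX's default vertex shader
uniform sampler2D u_texture;     // bound automatically by SpriteBatch
void main(void) {
vec4 texColor = texture2D(u_texture, v_texCoords);
// ...compute the procedural color as in the shader above...
vec3 effect = vec3(0.99, 0.97, 1.8);   // placeholder for the computed effect
gl_FragColor = texColor * vec4(effect, 1.0);
}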
I am trying to understand how memoryBarrier() works in OpenGL 4.4.
I tried the following, once with a texture image and once with a Shader Storage Buffer Object (SSBO).
The basic idea is to create an array of flags for however many objects need to be rendered in my scene and then perform a simple test in the geometry shader.
For each primitive in the GS, if at least one vertex passes the test, it sets the corresponding flag in the array at the location specified by this primitive's object ID (object IDs are passed to the GS as vertex attributes).
I then perform a memoryBarrier() to make sure all threads have written their values.
Next, I have all primitives read from the flags array and only emit a vertex if the flag is set.
Here is some code from my shaders to explain:
// Vertex Shader:
#version 440
uniform mat4 model_view;
uniform mat4 projection;
layout(location = 0) in vec3 in_pos;
layout(location = 1) in vec3 in_color;
layout(location = 2) in int lineID;
out VS_GS_INTERFACE
{
vec4 position;
vec4 color;
int lineID;
} vs_out;
void main(void) {
vec4 pos = vec4(in_pos, 1.0);
vs_out.position = pos;
vs_out.color = vec4(in_color, 1.0);
vs_out.lineID = lineID;
gl_Position = projection * model_view * pos;
}
and here is a simple geometry shader in which I use only a simple test based on lineID (I realize this test doesn't need a shared data structure, but it's just to test program behavior):
#version 440
layout (lines) in;
layout (line_strip, max_vertices = 2) out;
layout (std430, binding = 0) buffer BO {
int IDs[];
};
in VS_GS_INTERFACE
{
vec4 position;
vec4 color;
int lineID;
} gs_in[];
out vec4 f_color;
void main()
{
if(gs_in[0].lineID < 500)
{
IDs[gs_in[0].lineID] = 1;
}
else
{
IDs[gs_in[0].lineID] = -1;
}
memoryBarrier();
// read back the flag value
int flag = IDs[gs_in[0].lineID];
if ( flag > 0)
{
int n;
for (n = 0; n < gl_in.length(); n++)
{
f_color = gs_in[n].color;
gl_Position = gl_in[n].gl_Position;
EmitVertex();
}
}
}
No matter what value I put instead of 500, this code always renders only 2 objects. If I change the rendering condition in the GS to if (flag >= 0), it seems that all objects are rendered, which means the -1 is never visible by the time those IDs are read back by the shader.
Can someone please explain why the writes are not coherently visible to all shader invocations despite the memoryBarrier(), and what would be the most efficient workaround to get this to work?
Thanks.
Using OpenGL 3.3 core profile, I'm rendering a full-screen "quad" (as a single oversized triangle) via gl.DrawArrays(gl.TRIANGLES, 0, 3) with the following shaders.
Vertex shader:
#version 330 core
#line 1
vec4 vx_Quad_gl_Position () {
const float extent = 3;
const vec2 pos[3] = vec2[](vec2(-1, -1), vec2(extent, -1), vec2(-1, extent));
return vec4(pos[gl_VertexID], 0, 1);
}
void main () {
gl_Position = vx_Quad_gl_Position();
}
Fragment shader:
#version 330 core
#line 1
out vec3 out_Color;
vec3 fx_RedTest (const in vec3 vCol) {
return vec3(0.9, 0.1, 0.1);
}
vec3 fx_Grayscale (const in vec3 vCol) {
return vec3((vCol.r * 0.3) + (vCol.g * 0.59) + (vCol.b * 0.11));
}
void main () {
out_Color = fx_RedTest(out_Color);
out_Color = fx_Grayscale(out_Color);
}
Now, the code may look a bit odd and its present purpose may seem useless, but that shouldn't faze the GL driver.
On a GeForce, I get a gray screen as expected, that is, the "grayscale effect" applied to the hard-coded color "red" (0.9, 0.1, 0.1).
However, Intel HD 4000 [driver 9.17.10.2932 (12-12-2012) version -- the newest as of today] always, repeatedly shows nothing but the following constantly-flickering noise pattern:
Now, just to experiment a little, I changed the fx_Grayscale() function around a little bit -- effectively it should be yielding the same visual result, just with slightly different coding:
vec3 fx_Grayscale (const in vec3 vCol) {
vec3 col = vec3(0.9, 0.1, 0.1);
col = vCol;
float x = (col.r * 0.3) + (col.g * 0.59) + (col.b * 0.11);
return vec3(x, x, x);
}
Again, Nvidia does the correct thing whereas Intel HD now always, repeatedly produces a rather different, but still constantly-flickering noise pattern:
Must I suspect (yet another) Intel GL driver bug, or do you see any issues with my GLSL code -- not from a prettiness perspective (it's part of a shader code-gen experimental project) but from a mere spec-correctness point of view?
I think it looks strange to send an "out" color as a parameter to another function. I would rewrite it to something like this:
void main () {
vec3 col = vec3(0.0, 0.0, 0.0);
col = fx_RedTest(col);
col = fx_Grayscale(col);
out_Color = col;
}
Does it make any difference?
First, a screenshot:
As you can see, the tops of the shadows look OK (if you look at the dirt where the tops of the shrubs are projected, it looks more or less correct), but the base of the shadows is way off.
The bottom left corner of the image shows the shadow map I computed. It's a depth-map from the POV of the light, which is also where my character is standing.
Here's another shot, from a different angle:
Any ideas what might be causing it to come out like this? Is the depth of the shrub face too similar to the depth of the ground directly behind it, perhaps? If so, how do I get around that?
I'll post the fragment shader below, leave a comment if there's anything else you need to see.
Fragment Shader
#version 330
in vec2 TexCoord0;
in vec3 Tint0;
in vec4 WorldPos;
in vec4 LightPos;
out vec4 FragColor;
uniform sampler2D TexSampler;
uniform sampler2D ShadowSampler;
uniform bool Blend;
const int MAX_LIGHTS = 16;
uniform int NumLights;
uniform vec3 Lights[MAX_LIGHTS];
const float lightRadius = 100;
float distSq(vec3 v1, vec3 v2) {
vec3 d = v1-v2;
return dot(d,d);
}
float CalcShadowFactor(vec4 LightSpacePos)
{
vec3 ProjCoords = LightSpacePos.xyz / LightSpacePos.w;
vec2 UVCoords;
UVCoords.x = 0.5 * ProjCoords.x + 0.5;
UVCoords.y = 0.5 * ProjCoords.y + 0.5;
float Depth = texture(ShadowSampler, UVCoords).x;
if (Depth < (ProjCoords.z + 0.0001))
return 0.5;
else
return 1.0;
}
void main()
{
float scale;
FragColor = texture2D(TexSampler, TexCoord0.xy);
// transparency
if(!Blend && FragColor.a < 0.5) discard;
// biome blending
FragColor *= vec4(Tint0, 1.0f);
// fog
float depth = gl_FragCoord.z / gl_FragCoord.w;
if(depth>20) {
scale = clamp(1.2-15/(depth-19),0,1);
vec3 destColor = vec3(0.671,0.792,1.00);
vec3 colorDist = destColor - FragColor.xyz;
FragColor.xyz += colorDist*scale;
}
// lighting
scale = 0.30;
for(int i=0; i<NumLights; ++i) {
float dist = distSq(WorldPos.xyz, Lights[i]);
if(dist < lightRadius) {
scale += (lightRadius-dist)/lightRadius;
}
}
scale *= CalcShadowFactor(LightPos);
FragColor.xyz *= clamp(scale,0,1.5);
}
I'm fairly certain this is an offset problem. My shadows look to be about 1 block off, but I can't figure out how to shift them, nor what's causing them to be off.
Looks like "depth map bias" actually:
Not exactly sure how to set this....do I just call glPolygonOffset before rendering the scene? Will try it...
Setting glPolygonOffset to 100,100 amplifies the problem:
I set this just before rendering the shadow map:
GL.Enable(EnableCap.PolygonOffsetFill);
GL.PolygonOffset(100f, 100.0f);
And then disabled it again. I'm not sure if that's how I'm supposed to do it. Increasing the values amplifies the problem....decreasing them to below 1 doesn't seem to improve it though.
Notice also how the shadow map in the lower left changed.
Vertex Shader
#version 330
layout(location = 0) in vec3 Position;
layout(location = 1) in vec2 TexCoord;
layout(location = 2) in mat4 Transform;
layout(location = 6) in vec4 TexSrc; // x=x, y=y, z=width, w=height
layout(location = 7) in vec3 Tint; // x=R, y=G, z=B
uniform mat4 ProjectionMatrix;
uniform mat4 LightMatrix;
out vec2 TexCoord0;
out vec3 Tint0;
out vec4 WorldPos;
out vec4 LightPos;
void main()
{
WorldPos = Transform * vec4(Position, 1.0);
gl_Position = ProjectionMatrix * WorldPos;
LightPos = LightMatrix * WorldPos;
TexCoord0 = vec2(TexSrc.x+TexCoord.x*TexSrc.z, TexSrc.y+TexCoord.y*TexSrc.w);
Tint0 = Tint;
}
While world-aligned cascaded shadow maps are great and used in most new games out there, they're not related to why your shadows have a strange offset with your current implementation.
Actually, it looks like you're sampling the correct texels on the shadow map (the shadows fall exactly where you'd expect them to); it's your depth comparison that's off.
I've added some comments to your code:
vec3 ProjCoords = LightSpacePos.xyz / LightSpacePos.w; // So far so good...
vec2 UVCoords;
UVCoords.x = 0.5 * ProjCoords.x + 0.5; // Right, you're converting X and Y from clip
UVCoords.y = 0.5 * ProjCoords.y + 0.5; // space to texel space...
float Depth = texture(ShadowSampler, UVCoords).x; // I expect we sample a value in [0,1]
if (Depth < (ProjCoords.z + 0.0001)) // Uhoh, we never converted Z's clip space to [0,1]
return 0.5;
else
return 1.0;
So, I suspect you want to compare to ProjCoords.z * 0.5 + 0.5:
if (Depth < (ProjCoords.z * 0.5 + 0.5 + 0.0001))
return 0.5;
else
return 1.0;
Also, that bias factor makes me nervous. Better yet, just take it out for now and deal with it once you get the shadows appearing in the right spots:
const float bias = 0.0;
if (Depth < (ProjCoords.z * 0.5 + 0.5 + bias))
return 0.5;
else
return 1.0;
I might not be entirely right about how to transform ProjCoords.z to match the sampled value; however, this is likely the issue. Also, if you do move to cascaded shadow maps (I recommend world-aligned), I'd strongly recommend drawing frustums representing what each shadow map is viewing -- it makes debugging a whole lot easier.
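Putting that together, a corrected CalcShadowFactor might look something like the sketch below (assuming the shadow map stores depth in [0,1]; the bias is kept at zero for now, as suggested above):
float CalcShadowFactor(vec4 LightSpacePos)
{
// Perspective divide, then remap all three components from [-1,1]
// clip space to [0,1] so they are comparable with the stored depth.
vec3 ProjCoords = LightSpacePos.xyz / LightSpacePos.w;
vec3 ShadowCoords = ProjCoords * 0.5 + 0.5;
float Depth = texture(ShadowSampler, ShadowCoords.xy).x;
const float bias = 0.0;   // reintroduce a small bias once the offset is gone
return (Depth < ShadowCoords.z + bias) ? 0.5 : 1.0;
}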
This is called the "deer in headlights" effect of buffer-mapped shadows. There are several ways to minimize this effect; look for "light space shadow mapping".
NVidia's OpenGL SDK has a "cascaded shadow maps" example. You might want to check it out (I haven't used it myself, though).
The problem could be caused by using an incorrect matrix while rendering the shadows. Your example doesn't show how the light matrices are set. By Murphy's law, I'll have to assume the bug lies in this missing piece of code: since you decided this part isn't important, it probably causes the problem. If the matrix used while testing the shadow differs from the matrix used to render the shadow map, you'll get exactly this problem.
I suggest forgetting about the whole Minecraft thing for a moment and playing around with shadows in a simple application. Make a standalone application with a floor plane and a rotating cube (or teapot or whatever you want), and debug shadow maps there until you get the hang of it. Since you're willing to throw a +100 bounty onto the question, you might as well post the complete code of your standalone sample here, if you still get the problem in the sample. Trying to stick technology you aren't familiar with into the middle of a working(?) engine isn't a good idea anyway. Take it slow, get used to the technique/technology/effect, then integrate it.