I need to debug a GLSL program, but I don't know how to output intermediate results.
Is it possible to produce debug traces (like with printf) in GLSL?
You can't easily communicate back to the CPU from within GLSL. Using glslDevil or other tools is your best bet.
A printf would require getting data back to the CPU from the GPU running the GLSL code. Instead, you can push ahead to the display: rather than outputting text, output something visually distinctive to the screen. For example, you can paint something a specific color only if you reach the point in your code where you want to add a printf. If you need to printf a value, you can set the color according to that value.
void main(){
float bug=0.0;
vec3 tile=texture2D(colMap, coords.st).xyz;
vec4 col=vec4(tile, 1.0);
if(something) bug=1.0;
col.x+=bug;
gl_FragColor=col;
}
I have found Transform Feedback to be a useful tool for debugging vertex shaders. You can use this to capture the values of VS outputs, and read them back on the CPU side, without having to go through the rasterizer.
Here is another link to a tutorial on Transform Feedback.
GLSL Sandbox has been pretty handy to me for shaders.
It is not debugging per se (which, as already answered, is not directly possible), but it is handy for seeing changes in the output quickly.
You can try this: https://github.com/msqrt/shader-printf which is an implementation called appropriately "Simple printf functionality for GLSL."
You might also want to try ShaderToy, and maybe watch a video like this one (https://youtu.be/EBrAdahFtuo) from "The Art of Code" YouTube channel, where you can see some of the techniques that work well for debugging and visualising. I can strongly recommend his channel: he writes some really good stuff and has a knack for presenting complex ideas in novel, highly engaging and easy-to-digest formats (his Mandelbrot video is a superb example of exactly that: https://youtu.be/6IWXkV82oyY).
I hope nobody minds this late reply, but the question ranks high on Google searches for GLSL debugging and much has of course changed in 9 years :-)
PS: Other alternatives could also be NVIDIA Nsight and AMD ShaderAnalyzer, which offer a full stepping debugger for shaders.
If you want to visualize the variations of a value across the screen, you can use a heatmap function similar to this (I wrote it in HLSL, but it is easy to adapt to GLSL):
float4 HeatMapColor(float value, float minValue, float maxValue)
{
#define HEATMAP_COLORS_COUNT 6
float4 colors[HEATMAP_COLORS_COUNT] =
{
float4(0.32, 0.00, 0.32, 1.00),
float4(0.00, 0.00, 1.00, 1.00),
float4(0.00, 1.00, 0.00, 1.00),
float4(1.00, 1.00, 0.00, 1.00),
float4(1.00, 0.60, 0.00, 1.00),
float4(1.00, 0.00, 0.00, 1.00),
};
float ratio=(HEATMAP_COLORS_COUNT-1.0)*saturate((value-minValue)/(maxValue-minValue));
float indexMin=floor(ratio);
float indexMax=min(indexMin+1,HEATMAP_COLORS_COUNT-1);
return lerp(colors[indexMin], colors[indexMax], ratio-indexMin);
}
Then in your pixel shader you just output something like:
return HeatMapColor(myValue, 0.00, 50.00);
You can then get an idea of how the value varies across your pixels:
Of course you can use any set of colors you like.
At the bottom of this answer is an example of GLSL code that outputs the full float value as a color, encoding it as IEEE 754 binary32. I use it as follows (this snippet outputs the yy component of the modelview matrix):
vec4 xAsColor=toColor(gl_ModelViewMatrix[1][1]);
if(bool(1)) // put 0 here to get lowest byte instead of three highest
gl_FrontColor=vec4(xAsColor.rgb,1);
else
gl_FrontColor=vec4(xAsColor.a,0,0,1);
After you get this on screen, you can just take any color picker, format the color as HTML (appending 00 to the rgb value if you don't need higher precision, and doing a second pass to get the lower byte if you do), and you get the hexadecimal representation of the float as IEEE 754 binary32.
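The last step, turning the picked bytes back into a float, can be sketched on the CPU side. This is a minimal C++ sketch, assuming the three high bytes come from the picked RGB color and the low byte from the optional second pass; the function name floatFromColorBytes is hypothetical and not part of the answer's code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Rebuild a float from the four bytes read off the screen.
// byte3 is the most significant byte (sign and high exponent bits),
// byte0 is the least significant byte (use 0 if you skipped the second pass).
// Hypothetical helper: the name and signature are assumptions for illustration.
float floatFromColorBytes(std::uint8_t byte3, std::uint8_t byte2,
                          std::uint8_t byte1, std::uint8_t byte0)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    const std::uint32_t bits = (std::uint32_t(byte3) << 24) |
                               (std::uint32_t(byte2) << 16) |
                               (std::uint32_t(byte1) << 8)  |
                                std::uint32_t(byte0);
    float value;
    std::memcpy(&value, &bits, sizeof value); // reinterpret the bit pattern
    return value;
}
```

For example, a picked color of #3F8000 with a low byte of 00 gives the bit pattern 0x3F800000, which is 1.0.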
Here's the actual implementation of toColor():
const int emax=127;
// Input: x>=0
// Output: base 2 exponent of x if (x!=0 && !isnan(x) && !isinf(x))
// -emax if x==0
// emax+1 otherwise
int floorLog2(float x)
{
if(x==0.) return -emax;
// NOTE: there exist values of x, for which floor(log2(x)) will give wrong
// (off by one) result as compared to the one calculated with infinite precision.
// Thus we do it in a brute-force way.
for(int e=emax;e>=1-emax;--e)
if(x>=exp2(float(e))) return e;
// If we are here, x must be infinity or NaN
return emax+1;
}
// Input: any x
// Output: IEEE 754 biased exponent with bias=emax
int biasedExp(float x) { return emax+floorLog2(abs(x)); }
// Input: any x such that (!isnan(x) && !isinf(x))
// Output: significand AKA mantissa of x if !isnan(x) && !isinf(x)
// undefined otherwise
float significand(float x)
{
// converting int to float so that exp2(genType) gets correctly-typed value
float expo=float(floorLog2(abs(x)));
return abs(x)/exp2(expo);
}
// Input: x\in[0,1)
// N>=0
// Output: Nth byte as counted from the highest byte in the fraction
int part(float x,int N)
{
// All comments about exactness here assume that underflow and overflow don't occur
const float byteShift=256.;
// Multiplication is exact since it's just an increase of exponent by 8
for(int n=0;n<N;++n)
x*=byteShift;
// Cut higher bits away.
// $q \in [0,1) \cap \mathbb Q'.$
float q=fract(x);
// Shift and cut lower bits away. Cutting lower bits prevents potentially unexpected
// results of rounding by the GPU later in the pipeline when transforming to TrueColor
// the resulting subpixel value.
// $c \in [0,255] \cap \mathbb Z.$
// Multiplication is exact since it's just an increase of exponent by 8
float c=floor(byteShift*q);
return int(c);
}
// Input: any x acceptable to significand()
// Output: significand of x split to (8,8,8)-bit data vector
ivec3 significandAsIVec3(float x)
{
ivec3 result;
float sig=significand(x)/2.; // shift all bits to fractional part
result.x=part(sig,0);
result.y=part(sig,1);
result.z=part(sig,2);
return result;
}
// Input: any x such that !isnan(x)
// Output: IEEE 754 defined binary32 number, packed as ivec4(byte3,byte2,byte1,byte0)
ivec4 packIEEE754binary32(float x)
{
int e = biasedExp(x);
// sign to bit 7
int s = x<0. ? 128 : 0;
ivec4 binary32;
binary32.yzw=significandAsIVec3(x);
// clear the implicit integer bit of significand
if(binary32.y>=128) binary32.y-=128;
// put lowest bit of exponent into its position, replacing just cleared integer bit
binary32.y+=128*int(mod(float(e),2.));
// prepare high bits of exponent for fitting into their positions
e/=2;
// pack highest byte
binary32.x=e+s;
return binary32;
}
vec4 toColor(float x)
{
ivec4 binary32=packIEEE754binary32(x);
// Transform color components to [0,1] range.
// Division is inexact, but works reliably for all integers from 0 to 255 if
// the transformation to TrueColor by GPU uses rounding to nearest or upwards.
// The result will be multiplied by 255 back when transformed
// to TrueColor subpixel value by OpenGL.
return vec4(binary32)/255.;
}
I am sharing a fragment shader example that shows how I actually debug.
#version 410 core
uniform sampler2D samp;
in VS_OUT
{
vec4 color;
vec2 texcoord;
} fs_in;
out vec4 color;
void main(void)
{
vec4 sampColor;
if( texture(samp, fs_in.texcoord).x > 0.8f) //Check if the red channel is high (texture2D is not available in the 410 core profile)
sampColor = vec4(1.0f, 1.0f, 1.0f, 1.0f); //If yes, set it to white
else
sampColor = texture(samp, fs_in.texcoord); //else sample from original
color = sampColor;
}
The existing answers are all good stuff, but I wanted to share one more little gem that has been valuable in debugging tricky precision issues in a GLSL shader. With very large int numbers represented as a floating point, one needs to take care to use floor(n) and floor(n + 0.5) properly to implement round() to an exact int. It is then possible to render a float value that is an exact int by the following logic to pack the byte components into R, G, and B output values.
// Break components out of 24 bit float with rounded int value
// scaledWOB = (offset >> 8) & 0xFFFF
float scaledWOB = floor(offset / 256.0);
// c2 = (scaledWOB >> 8) & 0xFF
float c2 = floor(scaledWOB / 256.0);
// c0 = offset - (scaledWOB << 8)
float c0 = offset - floor(scaledWOB * 256.0);
// c1 = scaledWOB - (c2 << 8)
float c1 = scaledWOB - floor(c2 * 256.0);
// Normalize to byte range
vec4 pix;
pix.r = c0 / 255.0;
pix.g = c1 / 255.0;
pix.b = c2 / 255.0;
pix.a = 1.0;
gl_FragColor = pix;
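To read such a value back, the bytes have to be recombined on the CPU. The following is a minimal C++ sketch, assuming the pixel was fetched as 8-bit R, G, B (for example with glReadPixels) and that offset was an exact integer below 2^24. The names splitOffset and offsetFromRgb are hypothetical; splitOffset only mirrors the shader arithmetic so the round trip can be checked.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Rgb { std::uint8_t r, g, b; };

// CPU mirror of the shader logic above: split an exact integer stored
// in a float into three byte-sized components (r = lowest byte).
Rgb splitOffset(float offset)
{
    const float scaledWOB = std::floor(offset / 256.0f);    // offset >> 8
    const float c2 = std::floor(scaledWOB / 256.0f);        // bits 16..23
    const float c0 = offset - scaledWOB * 256.0f;           // bits 0..7
    const float c1 = scaledWOB - c2 * 256.0f;               // bits 8..15
    return { static_cast<std::uint8_t>(c0),
             static_cast<std::uint8_t>(c1),
             static_cast<std::uint8_t>(c2) };
}

// Recombine the bytes read back from the framebuffer into the integer.
std::uint32_t offsetFromRgb(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return std::uint32_t(r) | (std::uint32_t(g) << 8) | (std::uint32_t(b) << 16);
}
```

With offset = 1193046 (0x123456), the split yields r = 0x56, g = 0x34, b = 0x12, and recombining them returns the original value.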
The GLSL Shader source code is compiled and linked by the graphics driver and executed on the GPU.
If you want to debug the shader, then you have to use a graphics debugger like RenderDoc or NVIDIA Nsight.
I found a very nice GitHub library (https://github.com/msqrt/shader-printf).
It lets you use a printf function in a shader file.
Use this:
vec3 dd(vec3 finalColor,vec3 valueToDebug){
//debugging
finalColor.x = (v_uv.y < 0.3 && v_uv.x < 0.3) ? valueToDebug.x : finalColor.x;
finalColor.y = (v_uv.y < 0.3 && v_uv.x < 0.3) ? valueToDebug.y : finalColor.y;
finalColor.z = (v_uv.y < 0.3 && v_uv.x < 0.3) ? valueToDebug.z : finalColor.z;
return finalColor;
}
//on the main function, second argument is the value to debug
colour = dd(colour,vec3(0.0,1.0,1.));
gl_FragColor = vec4(clamp(colour * 20., 0., 1.),1.0);
Do offline rendering to a texture and evaluate the texture's data.
You can find related code by googling for "render to texture" opengl
Then use glReadPixels to read the output into an array and perform assertions on it (since looking through such a huge array in the debugger is usually not really useful).
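The assertion step could look roughly like the following C++ sketch. It assumes the pixels have already been read back into a std::vector<float> (for example via glReadPixels with GL_FLOAT); the helper name firstBadComponent and the choice of what counts as "bad" are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scan a read-back buffer and return the index of the first component
// that is NaN, infinite, or outside [lo, hi]. Returns pixels.size() if
// every component passes. Hypothetical helper, not part of OpenGL.
std::size_t firstBadComponent(const std::vector<float>& pixels,
                              float lo, float hi)
{
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        const float v = pixels[i];
        if (!std::isfinite(v) || v < lo || v > hi)
            return i;
    }
    return pixels.size();
}
```

A test can then assert that firstBadComponent(pixels, lo, hi) equals pixels.size(), and print the offending index otherwise instead of inspecting the whole array in a debugger.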
You might also want to disable clamping so that output values outside the range 0 to 1 are preserved; this is only supported for floating-point textures.
I personally was bothered by the problem of properly debugging shaders for a while. There does not seem to be a good way - If anyone finds a good (and not outdated/deprecated) debugger, please let me know.
In the following shadertoy I illustrate an artefact that occurs when raymarching
https://www.shadertoy.com/view/stdGDl
This is my "scene" (see code fragment below). It renders a primitive "tunnel_fragment" which is an SDF (Signed Distance Function), and uses modulo on the coordinates to calculate "infinite" repetitions of these fragments. It then also calculates which disk we are in (odd/even) to displace them.
I really don't understand why these artefacts appear on the disks (or rings; see tunnel_fragment, where removing a comment turns the disks into rings) when the alternating movement in the x direction becomes large.
The artefacts don't appear when the disk structure moves to the right as a whole; they only appear when the disks alternate and the entire structure becomes more complex.
What am I doing wrong? It's really boggling me.
vec2 scene(in vec3 p)
{
float thick = 0.1;
vec3 cp = p;
// Use modulo to simulate inf disks
vec3 c = vec3(0,0,6.0*thick);
vec3 q = mod(cp+0.5*c,c)-0.5*c;
// Find index of the disk
vec3 disk = (cp+0.5*c) / (c);
float idx = floor(disk.z);
// Do something simple with odd/even disks
// Note: changing this shows the artefacts are always there
if(mod(idx,2.0) == 0.0) {
q.x += sin(disk.z*t)*t*t;
} else {
q.x -= sin(disk.z*t)*t*t;
}
float d = tunnel_fragment(q, vec3(0.0), vec3(0.0, 0.0, 1.0), 2.0, thick, 0.2);
return vec2(d, idx);
}
The problem is illustrated with this diagram:
When the current disk (based on modulo) is offset by more than the spacing between the disks, the distance that you calculate is larger than the distance to the next disk. Consequently, you risk overstepping the next disk.
To solve this you need to either limit the offset (as said -- no more than the spacing between the disks), or sample odd/even disks separately and min() between them.
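The second option can be illustrated with a simplified CPU-side sketch in C++. It assumes a 2D slice (x, z), replaces each disk with a circle of radius r, and uses a constant offset +off for even disks and -off for odd disks; all names (naiveDistance, alternatingDistance, and so on) are hypothetical and this is not the shadertoy's actual scene function.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Signed distance from (x, z) to a circle of radius r centred at (cx, cz).
double circleSdf(double x, double z, double cx, double cz, double r)
{
    return std::hypot(x - cx, z - cz) - r;
}

// Nearest centre along z of a lattice with the given period and phase.
double nearestCentre(double z, double period, double phase)
{
    return std::round((z - phase) / period) * period + phase;
}

// Naive version: only looks at the disk whose cell contains the sample.
double naiveDistance(double x, double z, double spacing, double off, double r)
{
    const double k = std::round(z / spacing);
    const bool even = std::fmod(std::fabs(k), 2.0) == 0.0;
    return circleSdf(x, z, even ? off : -off, k * spacing, r);
}

// Fixed version: evaluate even and odd disks as two separate lattices
// (each with period 2 * spacing) and take the minimum of both.
double alternatingDistance(double x, double z, double spacing, double off, double r)
{
    const double evenZ = nearestCentre(z, 2.0 * spacing, 0.0);
    const double oddZ  = nearestCentre(z, 2.0 * spacing, spacing);
    return std::min(circleSdf(x, z,  off, evenZ, r),
                    circleSdf(x, z, -off, oddZ,  r));
}
```

With spacing 0.6, offset 2.0 and radius 0.5, the sample point (-2.0, 0.1) lies in an even cell, so the naive version reports about 3.5 even though the point sits on the surface of the neighbouring odd disk. The min() of both lattices reports a distance of approximately 0 there, so the ray no longer oversteps.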
A quick summary:
I've a simple Quad tree based terrain rendering system that builds terrain patches which then sample a heightmap in the vertex shader to determine the height of each vertex.
The exact same calculation is done on the CPU for object placement and co.
Super straightforward, but now after adding some systems to procedurally place objects I've discovered that they seem to be misplaced by just a small amount. To debug this I render a few crosses as single models over the terrain. The crosses (red, green, blue lines) represent the height read from the CPU. While the terrain mesh uses a shader to translate the vertices.
(I've also added a simple odd/even gap over each height value to rule out a simple offset issue. So those ugly cliffs are expected, the submerged crosses are the issue)
I'm explicitly using GL_NEAREST to be able to display the "raw" height value:
As you can see the crosses are sometimes submerged under the terrain instead of representing its exact height.
The heightmap is just a simple array of floats on the CPU and on the GPU.
How the data is stored
A simple vector<float> which is uploaded into a GL_RGB32F GL_FLOAT buffer. The floats are not normalized and my terrain usually contains values between -100 and 500.
How is the data accessed in the shader
I've tried a few things to rule out errors. The initial version:
vec2 terrain_heightmap_uv(vec2 position, Heightmap heightmap)
{
return (position + heightmap.world_offset) / heightmap.size;
}
float terrain_read_height(vec2 position, Heightmap heightmap)
{
return textureLod(heightmap.heightmap, terrain_heightmap_uv(position, heightmap), 0).r;
}
Basics of the vertex shader (the full shader code is very long, so I've extracted the part that actually reads the height):
void main()
{
vec4 world_position = a_model * vec4(a_position, 1.0);
vec4 final_position = world_position;
// snap vertex to grid
final_position.x = floor(world_position.x / a_quad_grid) * a_quad_grid;
final_position.z = floor(world_position.z / a_quad_grid) * a_quad_grid;
final_position.y = terrain_read_height(final_position.xz, heightmap);
gl_Position = projection * view * final_position;
}
To rule out the slightly different way the position is determined as the cause, I tested it using hardcoded values that are identical to how C++ reads the height:
return texelFetch(heightmap.heightmap, ivec2((position / 8) + vec2(1024, 1024)), 0).r;
Which gives the exact same result...
How is the data accessed in the application
In C++ the height is read like this:
inline float get_local_height_safe(uint32_t x, uint32_t y)
{
// this macro simply clips x and y to the heightmap bounds
// it does not interfere with the result
BB_TERRAIN_HEIGHTMAP_BOUND_XY_TO_SAFE;
uint32_t i = (y * _size1d) + x;
return buffer->data[i];
}
inline float get_height_raw(glm::vec2 position)
{
position = position + world_offset;
uint32_t x = static_cast<int>(position.x);
uint32_t y = static_cast<int>(position.y);
return get_local_height_safe(x, y);
}
float BB::Terrain::get_height(const glm::vec3 position)
{
return heightmap->get_height_raw({position.x / heightmap_unit_scale, position.z / heightmap_unit_scale});
}
What have I tried:
Comparing the Buffers
I've dumped the first few hundred values from the vector and compared them with the floating point buffer uploaded to the GPU using Nvidia Nsight. They are equal, so there are no rounding/precision errors there.
Sampling method
I've tried texture, textureLod and texelFetch to rule out some issue there, they all give me the same result.
Rounding
The super strange thing: when I round all the height values, they are perfectly aligned, which just screams floating point precision issues.
Position snapping
I've tried rounding, flooring and ceiling the position, to ensure the position always maps to the same texel. I also tried adding an epsilon offset to rule out a positional precision error (probably stupid because the terrain is stable...)
Heightmap sizes
I've tried various heightmaps, also of different sizes.
Heightmap patterns
I've created a heightmap containing a pattern to ensure the position is not just offset.
I have a hexagonal grid that I want to texture. I want to use a single texture with 16 distinct subtextures arranged in a 4x4 grid. Each "node" in the grid has an image type, and I want to smoothly blend between them. My approach for implementing this is to render triangles in pairs, and encode the 4 image types on all vertices in the two faces, as well as a set of 4 weighting factors (which are the barycentric coordinates for the two tris). I can then use those two things to blend smoothly between any combination of image types.
Here is the fragment shader I'm using. The problems arise from the use of int types, but I don't understand why. If I only use the first four sub-textures, I can change idx to be a float and hardcode the Y-coord to be 0, and then it works as I expect.
vec2 offset(int idx) {
vec2 v = vec2(idx % 4, idx / 4);
return v / 4.0;
}
void main(void) {
//
// divide the incoming UVs into one of 16 regions. The
// offset() function should take an integer from 0..15
// and return the offset to that region in the 4x4 map
//
vec2 uv = v_uv / 4.0;
//
// The four texture regions involved at
// this vertex are encoded in vec4 v_txt. The same
// values are stored at all vertices, so this doesn't
// vary across the triangle
//
int ia = int(v_txt.x);
int ib = int(v_txt.y);
int ic = int(v_txt.z);
int id = int(v_txt.w);
//
// Use those indices in the offset function to get the
// texture sample at that point
//
vec4 ca = texture2D(txt, uv + offset(ia));
vec4 cb = texture2D(txt, uv + offset(ib));
vec4 cc = texture2D(txt, uv + offset(ic));
vec4 cd = texture2D(txt, uv + offset(id));
//
// Merge them with the four factors stored in vec4 v_tfact.
// These vary for each vertex
//
fragcolour = ca * v_tfact.x
+ cb * v_tfact.y
+ cc * v_tfact.z
+ cd * v_tfact.w;
}
Here is what's happening:
(My "pair of triangles" are actually about 20 and you can see their structure in the artifacts, but the effect is the same)
This artifacting behaves a bit like z-fighting: moving the scene around makes it all shimmer and shift wildly.
Why doesn't this work as I expect?
One solution I can fall back on is to simply use a 1-dimensional texture map, with all 16 sub-images in a horizontal line. Then I can switch everything to floating point, since I won't need the modulo/integer-divide process to map idx->x,y. But this feels clumsy, and I'd at least like to understand what's going on here.
Here is what it should look like, albeit with only 4 of the sub-images in use:
See OpenGL Shading Language 4.60 Specification - 5.4.1. Conversion and Scalar Constructors
When constructors are used to convert a floating-point type to an integer type, the fractional part of the floating-point value is dropped.
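Hence int(v_txt.x) does not round v_txt.x; it truncates it. An interpolated value that arrives slightly below the intended integer therefore maps to the previous index.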
You have to round the values to the nearest integer before constructing an integral value:
int ia = int(round(v_txt.x));
int ib = int(round(v_txt.y));
int ic = int(round(v_txt.z));
int id = int(round(v_txt.w));
Alternatively add 0.5 before constructing the integral value:
int ia = int(v_txt.x + 0.5);
int ib = int(v_txt.y + 0.5);
int ic = int(v_txt.z + 0.5);
int id = int(v_txt.w + 0.5);
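The effect is easy to reproduce on the CPU. Below is a small C++ sketch, assuming an interpolated index that arrives as 4.9999 instead of 5; the helper names are hypothetical, offset() mirrors the shader function from the question, and std::lround stands in for GLSL's round (they can differ for exact .5 ties, which do not matter here).

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

// Mirrors the shader's offset(): index 0..15 -> origin of a tile in a 4x4 atlas.
Vec2 offset(int idx)
{
    return { static_cast<float>(idx % 4) / 4.0f,
             static_cast<float>(idx / 4) / 4.0f };
}

// What int(v) does in GLSL: the fractional part is dropped.
int truncatedIndex(float v) { return static_cast<int>(v); }

// What int(round(v)) does: snap to the nearest integer first.
int roundedIndex(float v) { return static_cast<int>(std::lround(v)); }
```

For 4.9999, truncatedIndex returns 4 and roundedIndex returns 5, so the two variants sample different tiles: offset(4) is (0.0, 0.25) while offset(5) is (0.25, 0.25). Because the interpolated value can land on either side of the integer from fragment to fragment, the truncating version flickers between neighbouring tiles.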
In GLSL, outputColour = vec3(0, 0, 0.5) outputs an RGB value of (0, 0, 127) instead of (0, 0, 128). Confirmed by GL.ReadPixels() and the Photoshop eyedropper tool.
Currently I'm bypassing the issue with outputColour.b += 0.001, but I'm sure that will come back to haunt me later.
Has anybody experienced this before, and what is the solution?
Whenever OpenGL is asked to convert a float to a normalized integer, the implementation is allowed to do the rounding however it likes. So if it wants to chop off the decimal, that's fine.
If you need to control the rounding for normalization, then control it directly.
// "input" is a reserved word in GLSL, so the parameter is named color
vec4 NormalizeColor(in vec4 color)
{
vec4 denorm = color * 255.0;
vec4 rounded = round(denorm);
return rounded / 255.0;
}
You can replace the call to round with whatever you like.
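The arithmetic behind the 127 versus 128 result can be checked with a short C++ sketch. It assumes an 8-bit target and models only the two conversion choices discussed here (dropping the fraction and rounding to nearest); the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Float-to-normalized-byte conversion that drops the fractional part,
// which is one behaviour an implementation may choose.
int toByteTruncating(float c)
{
    return static_cast<int>(c * 255.0f);
}

// Conversion that rounds to the nearest integer instead.
int toByteRounding(float c)
{
    return static_cast<int>(std::lround(c * 255.0f));
}
```

Since 0.5 * 255 = 127.5, the truncating conversion yields 127 and the rounding conversion yields 128, which matches the value observed in the question.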
Let's say I am rendering 2 samples that will be combined into a single image. The first sample contains values outside the range of a displayable pixel (in this case, greater than 1). But when subtracted by the 2nd sample, it does fall in the range.
I store the samples in framebuffer textures prior to combining them.
I want to be able to store values greater than 1, but those values are being clamped to 1. Can the GLSL fragment shader output such values? Can textures store them? If not, how else can I store them?
According to this page, it is possible:
rendering to screen requires the outputs to be of a displayable format, which is not always the case in a multipass pipeline. Sometimes the textures produced by a pass need to have a floating point format which does not translate directly to colors
But according to the specification, texture floats are clamped to the range [0,1].
The easiest way is to use floating point textures.
var gl = someCanvasElement.getContext("experimental-webgl");
var ext = gl.getExtension("OES_texture_float");
if (!ext) {
alert("no OES_texture_float");
return;
}
Now you can create and render with floating point textures. The next thing to do is to see if you can render to floating point textures.
var tex = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, tex);
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, width, height, 0, gl.RGBA, gl.FLOAT, null);
// texParameteri takes the texture target as its first argument
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
var fb = gl.createFramebuffer();
gl.bindFramebuffer(gl.FRAMEBUFFER, fb);
gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, tex, 0);
var status = gl.checkFramebufferStatus(gl.FRAMEBUFFER);
if (status != gl.FRAMEBUFFER_COMPLETE) {
alert("can not render to floating point textures");
return;
}
Floats are not clamped when using OES_texture_float
If the device doesn't support rendering to a floating point texture then you'd have to encode your results some other way like gil suggests
WebGL2
Note: in WebGL2 floating point textures are always available. On the other hand you still have to check for and enable OES_texture_float_linear if you want to filter floating point textures. Also in WebGL2 you need to enable EXT_color_buffer_float to render to a floating point texture (and you still need to call gl.checkFramebufferStatus since it's up to the driver which combinations of attachments are supported). And further, there's EXT_float_blend for whether or not you can have blending enabled when rendering to a floating point texture.
Fragment shaders can output values outside the [0.0, 1.0] range, but only if the format of the buffer the values are written to supports values outside that range. What is needed to enable this are render targets (renderbuffers or textures attached to an FBO) that store float values.
OpenGL ES 2.0 and lower do not require support for float format textures. OpenGL ES 3.0 and higher do. For example, in ES 3.0 you could use GL_RGBA16F for a RGBA texture with 16-bit float (aka half-float) components, and GL_RGBA32F for 32-bit float components. Both ES 3.0 and 3.1 still do not require support for using these formats as render targets, though, which is what you need for this use case.
ES 2.0 implementations can provide half-float textures by supporting the OES_texture_half_float and float textures by supporting the OES_texture_float extension. To support rendering to half-float textures, they also need EXT_color_buffer_half_float. EXT_color_buffer_float defines rendering to float textures, but is specified to be based on ES 3.0.
In summary:
ES 2.0 and higher can support rendering to 16-bit float textures by supporting both the OES_texture_half_float and EXT_color_buffer_half_float extensions.
ES 3.0 and higher can support rendering to 32-bit float textures by supporting both the OES_texture_float and EXT_color_buffer_float extensions.
If you want to use these features, you will have to test for the presence of these extensions on your device.
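Such a test could be sketched in C++ as follows. The sketch assumes you already have the space-separated extension string (for example from glGetString(GL_EXTENSIONS)) and simply mirrors the extension pairs from the summary above; the helper names are hypothetical, and a real application should still verify framebuffer completeness after attaching the texture.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// True if `name` appears as a whole token in the space-separated list.
// Whole-token matching avoids false hits such as
// "GL_OES_texture_half_float_linear" matching "GL_OES_texture_half_float".
bool hasExtension(const std::string& extensionList, const std::string& name)
{
    std::istringstream tokens(extensionList);
    std::string token;
    while (tokens >> token) {
        if (token == name)
            return true;
    }
    return false;
}

// Pairs taken from the summary above (hypothetical helper names).
bool canRenderToHalfFloat(const std::string& extensionList)
{
    return hasExtension(extensionList, "GL_OES_texture_half_float") &&
           hasExtension(extensionList, "GL_EXT_color_buffer_half_float");
}

bool canRenderToFloat(const std::string& extensionList)
{
    return hasExtension(extensionList, "GL_OES_texture_float") &&
           hasExtension(extensionList, "GL_EXT_color_buffer_float");
}
```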
The key idea here is to encode a float in some unrestricted range using 2 or 4 fixed point 8 bit channels (color channels) in the range [0,1]. This method is generic and applies to WebGL or any other GL system.
Let's say you start with a float value:
float value;
Assuming your machine supports mediump (16-bit float), you can encode value using two 8-bit channels:
float myNormalize(float val)
{
float min = -1.0;
float max = 1.0;
float norm = (val - min) / (max - min);
return norm;
}
vec2 encode_float_as_2bytes(float a)
{
a = myNormalize(a);
vec2 enc = vec2(1.0, 256.0);
enc *= a;
enc = fract(enc);
enc.x -= enc.y * (1.0 / 256.0);
return enc;
}
Here encode_float_as_2bytes(float a) accepts the value to be encoded. The value is first normalized to [0,1] using some bounding values (in my example, the float can take values in [-1, 1]). After normalization, the value is encoded into a vec2.
Now you can write the encoded value to the color buffer:
float a = compute_something(...);
gl_FragColor.xy = encode_float_as_2bytes(a);
Now, when reading the encoded values (either in another shader or using glReadPixels()), you can decode the encoded float and get the value back:
float denormalize(float val)
{
float min = -1.0;
float max = 1.0;
float den = val * (max - min) + min;
return den;
}
float decode_2_bytes(vec2 a)
{
float ret;
ret = a.x * 1.0 + a.y * 1.0/256.0;
ret = denormalize(ret);
return ret;
}
Note that the denormalization values have to match the normalization values (in this example, -1 and 1).
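If the values are read back with glReadPixels() as 8-bit channels, the same decoding can be done on the CPU. This is a minimal C++ sketch that mirrors decode_2_bytes above, assuming the red and green channels hold the two encoded bytes and the range is [-1, 1]; the function name decodeTwoBytes is hypothetical. Because each channel is quantized to 8 bits on the way out, the recovered value is only approximate.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Must use the same bounds as the shader-side normalization.
float denormalize(float val, float min = -1.0f, float max = 1.0f)
{
    return val * (max - min) + min;
}

// CPU mirror of decode_2_bytes: r and g are the 8-bit channel values
// as returned by glReadPixels with GL_UNSIGNED_BYTE.
float decodeTwoBytes(std::uint8_t r, std::uint8_t g)
{
    const float x = static_cast<float>(r) / 255.0f; // back to [0, 1]
    const float y = static_cast<float>(g) / 255.0f;
    const float normalized = x + y / 256.0f;
    return denormalize(normalized);
}
```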
You can find more about float encoding here: http://aras-p.info/blog/2009/07/30/encoding-floats-to-rgba-the-final/