Different results in floating-point calculations on WebGL2 and C++ - c++

I am trying to make calculations on the fragment shader in WebGL2. And I've noticed that the calculations there are not as precise as on C++. I know that the high precision float contains 32 bits either in the fragment shader or in C++.
I am trying to compute 1.0000001^(10000000) and get around 2.8 on C++ and around 3.2 on the shader. Do you know the reason that the fragment shader calculations are not as precise as the same calculations on C++?
code on C++
#include <iostream>
void main()
{
const float NEAR_ONE = 1.0000001;
float result = NEAR_ONE;
for (int i = 0; i < 10000000; i++)
{
result = result * NEAR_ONE;
}
std::cout << result << std::endl; // result is 2.88419
}
Fragment shader code:
#version 300 es
precision highp float;
out vec4 color;
void main()
{
const float NEAR_ONE = 1.0000001;
float result = NEAR_ONE;
for (int i = 0; i < 10000000; i++)
{
result = result * NEAR_ONE;
}
if ((result > 3.2) && (result < 3.3))
{
// The screen is colored by red and this is how we know
// that the value of result is in between 3.2 and 3.3
color = vec4(1.0, 0.0, 0.0, 1.0); // Red
}
else
{
// We never come here.
color = vec4(0.0, 0.0, 0.0, 1.0); // Black
}
}
Update:
Here one can find the html file with the full code for the WebGL2 example

OpenGL ES 3.0 on which WebGL2 is based does not require floating point on the GPU to work the same as it does in C++
From the spec
2.1.1 Floating-Point Computation
The GL must perform a number of floating-point operations during the course of
its operation. In some cases, the representation and/or precision of such operations
is defined or limited; by the OpenGL ES Shading Language Specification for operations in shaders, and in some cases implicitly limited by the specified format
of vertex, texture, or renderbuffer data consumed by the GL. Otherwise, the representation of such floating-point numbers, and the details of how operations on
them are performed, is not specified. We require simply that numbers’ floating point parts contain enough bits and that their exponent fields are large enough so
that individual results of floating-point operations are accurate to about 1 part in
105
. The maximum representable magnitude for all floating-point values must be
at least 232
. x· 0 = 0 ·x = 0 for any non-infinite and non-NaN x. 1 ·x = x· 1 = x.
x + 0 = 0 + x = x. 0
0 = 1. (Occasionally further requirements will be specified.)
Most single-precision floating-point formats meet these requirements.
Just for fun let's do it and print the results. Using WebGL1 so can test on more devices
function main() {
const gl = document.createElement('canvas').getContext('webgl');
const ext = gl.getExtension('OES_texture_float');
if (!ext) { return alert('need OES_texture_float'); }
// not required - long story
gl.getExtension('WEBGL_color_buffer_float');
const fbi = twgl.createFramebufferInfo(gl, [
{ type: gl.FLOAT, minMag: gl.NEAREST, wrap: gl.CLAMP_TO_EDGE, }
], 1, 1);
const vs = `
void main() {
gl_Position = vec4(0, 0, 0, 1);
gl_PointSize = 1.0;
}
`;
const fs = `
precision highp float;
void main() {
const float NEAR_ONE = 1.0000001;
float result = NEAR_ONE;
for (int i = 0; i < 10000000; i++) {
result = result * NEAR_ONE;
}
gl_FragColor = vec4(result);
}
`;
const prg = twgl.createProgram(gl, [vs, fs]);
gl.useProgram(prg);
gl.viewport(0, 0, 1, 1);
gl.drawArrays(gl.POINTS, 0, 1);
const values = new Float32Array(4);
gl.readPixels(0, 0, 1, 1, gl.RGBA, gl.FLOAT, values);
console.log(values[0]);
}
main();
<script src="https://twgljs.org/dist/4.x/twgl.js"></script>
My results:
Intel Iris Pro : 2.884186029434204
NVidia GT 750 M : 3.293879985809326
NVidia GeForce GTX 1060 : 3.2939157485961914
Intel UHD Graphics 617 : 3.292219638824464

The difference is precision. In fact, if you compile the c++ fragment using double (64-bit floating-point, with 53-bit mantissa) instead of float (32-bit floating-point, with 24-bit mantissa), you obtain as result 3.29397, which is the result you get using the shader.

Related

Can't loop over a sampler2D array after specifying WebGL2 context in Three.js

I have been using a sampler2D array in my fragment shader (those are shadow maps, there can be up to 16 of them, so an array is more preferable than using 16 separate variables, of course). Then I added the WebGL2 context (const context = canvas.getContext('webgl2');) to the THREE.WebGLRenderer that I'm using and now I can't get the program to work: it says array index for samplers must be constant integral expressions when I attempt to access the sampler array elements in a loop like this:
uniform sampler2D samplers[MAX_SPLITS];
for (int i = 0; i < MAX_SPLITS; ++i) {
if (i >= splitCount) {
break;
}
if (func(samplers[i])) { // that's where the error happens
...
}
}
Is there really no way around this? Do I have to use sixteen separate variables?
(there is no direct #version directive in the shader but THREE.js seems to add a #version 300 es by default)
You can't use dynamic indexing with samplers in GLSL ES 3.0. From the spec
12.29 Samplers
Should samplers be allowed as l-values? The specification already allows an equivalent behavior:
Current specification:
uniform sampler2D sampler[8];
int index = f(...);
vec4 tex = texture(sampler[index], xy); // allowed
Using assignment of sampler types:
uniform sampler2D s;
s = g(...);
vec4 tex = texture(s, xy); // not allowed
RESOLUTION: Dynamic indexing of sampler arrays is now prohibited by the specification. Restrict indexing of sampler arrays to constant integral expressions.
and
12.30 Dynamic Indexing
For GLSL ES 1.00, support of dynamic indexing of arrays, vectors and matrices was not mandated
because it was not directly supported by some implementations. Software solutions (via program
transforms) exist for a subset of cases but lead to poor performance. Should support for dynamic indexing
be mandated for GLSL ES 3.00?
RESOLUTION: Mandate support for dynamic indexing of arrays except for sampler arrays, fragment
output arrays and uniform block arrays.
Should support for dynamic indexing of vectors and matrices be mandated in GLSL ES 3.00?
RESOLUTION: Yes.
Indexing of arrays of samplers by constant-index-expressions is supported
in GLSL ES 1.00. A constant index-expression is an expression formed from
constant-expressions and certain loop indices, defined for
a subset of loop constructs. Should this functionality be included in GLSL ES 3.00?
RESOLUTION: No. Arrays of samplers may only be indexed by constant-integral-expressions.
Can you use a 2D_ARRAY texture to solve your issue? Put each of your current 2D textures into a layer of a 2D_ARRAY texture then the z coord is just an integer layer index. Advantage, you can use many more layers with a 2D_ARRAY then you get samplers. WebGL2 implementations generally only have 32 samplers but allow hundreds or thousands of layers in a 2D_ARRAY texture.
or use GLSL 1.0
const vs1 = `
void main() { gl_Position = vec4(0); }
`;
const vs3 = `#version 300 es
void main() { gl_Position = vec4(0); }
`;
const fs1 = `
precision highp float;
#define MAX_SPLITS 4
uniform sampler2D samplers[MAX_SPLITS];
uniform int splitCount;
bool func(sampler2D s) {
return texture2D(s, vec2(0)).r > 0.5;
}
void main() {
float v = 0.0;
for (int i = 0; i < MAX_SPLITS; ++i) {
if (i >= splitCount) {
break;
}
if (func(samplers[i])) { // that's where the error happens
v += 1.0;
}
}
gl_FragColor = vec4(v);
}
`;
const fs3 = `#version 300 es
precision highp float;
#define MAX_SPLITS 4
uniform sampler2D samplers[MAX_SPLITS];
uniform int splitCount;
bool func(sampler2D s) {
return texture(s, vec2(0)).r > 0.5;
}
out vec4 color;
void main() {
float v = 0.0;
for (int i = 0; i < MAX_SPLITS; ++i) {
if (i >= splitCount) {
break;
}
if (func(samplers[i])) { // that's where the error happens
v += 1.0;
}
}
color = vec4(v);
}
`;
function main() {
const gl = document.createElement('canvas').getContext('webgl2');
if (!gl) {
return alert('need WebGL2');
}
test('glsl 1.0', vs1, fs1);
test('glsl 3.0', vs3, fs3);
function test(msg, vs, fs) {
const p = twgl.createProgram(gl, [vs, fs]);
log(msg, ':', p ? 'success' : 'fail');
}
}
main();
function log(...args) {
const elem = document.createElement('pre');
elem.textContent = [...args].join(' ');
document.body.appendChild(elem);
}
<script src="https://twgljs.org/dist/4.x/twgl.min.js"></script>

How to do dynamic loop in WebGL GLSL [duplicate]

I have the a webgl blur shader:
precision mediump float;
precision mediump int;
uniform sampler2D u_image;
uniform float blur;
uniform int u_horizontalpass; // 0 or 1 to indicate vertical or horizontal pass
uniform float sigma; // The sigma value for the gaussian function: higher value means more blur
// A good value for 9x9 is around 3 to 5
// A good value for 7x7 is around 2.5 to 4
// A good value for 5x5 is around 2 to 3.5
// ... play around with this based on what you need :)
varying vec4 v_texCoord;
const vec2 texOffset = vec2(1.0, 1.0);
// uniform vec2 texOffset;
const float PI = 3.14159265;
void main() {
vec2 p = v_texCoord.st;
float numBlurPixelsPerSide = blur / 2.0;
// Incremental Gaussian Coefficent Calculation (See GPU Gems 3 pp. 877 - 889)
vec3 incrementalGaussian;
incrementalGaussian.x = 1.0 / (sqrt(2.0 * PI) * sigma);
incrementalGaussian.y = exp(-0.5 / (sigma * sigma));
incrementalGaussian.z = incrementalGaussian.y * incrementalGaussian.y;
vec4 avgValue = vec4(0.0, 0.0, 0.0, 0.0);
float coefficientSum = 0.0;
// Take the central sample first...
avgValue += texture2D(u_image, p) * incrementalGaussian.x;
coefficientSum += incrementalGaussian.x;
incrementalGaussian.xy *= incrementalGaussian.yz;
// Go through the remaining 8 vertical samples (4 on each side of the center)
for (float i = 1.0; i <= numBlurPixelsPerSide; i += 1.0) {
avgValue += texture2D(u_image, p - i * texOffset) * incrementalGaussian.x;
avgValue += texture2D(u_image, p + i * texOffset) * incrementalGaussian.x;
coefficientSum += 2.0 * incrementalGaussian.x;
incrementalGaussian.xy *= incrementalGaussian.yz;
}
gl_FragColor = avgValue / coefficientSum;
}
When I build, I get the following error message:
webgl-renderer.js?2eb3:137 Uncaught could not compile shader:ERROR:
0:38: 'i' : Loop index cannot be compared with non-constant expression
I have also tried to use just the uniform float blur to compare i to. Is there any way to fix this?
The problem is further detailed here: https://www.khronos.org/webgl/public-mailing-list/archives/1012/msg00063.php
The solution that I've found looking around is to only use a constant expression when comparing a loop var. This doesn't fit with what I need to do which is vary how many times I'm looping based on the blur radius.
Any thoughts on this?
This happens because on some hardware, GLSL loops are un-rolled into native GPU instructions. This means there needs to be a hard upper limit to the number of passes through the for loop, that governs how many copies of the loop's inner code will be generated. If you replace numBlurPixelsPerSide with a const float or even a #define directive, and the shader compiler can then determine the number of passes at compile time, and generate the code accordingly. But with a uniform there, the upper limit is not known at compile time.
There's an interesting wrinkle in this rule: You're allowed to break or call an early return out of a for loop, even though the max iterations must be discernible at compile time. For example, consider this tiny Mandelbrot shader. This is hardly the prettiest fractal on GLSL Sandbox, but I chose it for its small size:
precision mediump float;
uniform float time;
uniform vec2 mouse;
uniform vec2 resolution;
varying vec2 surfacePosition;
const float max_its = 100.;
float mandelbrot(vec2 z){
vec2 c = z;
for(float i=0.;i<max_its;i++){ // for loop is here.
if(dot(z,z)>4.) return i; // conditional early return here.
z = vec2(z.x*z.x-z.y*z.y,2.*z.x*z.y)+c;
}
return max_its;
}
void main( void ) {
vec2 p = surfacePosition;
gl_FragColor = vec4(mandelbrot(p)/max_its);
}
In this example, max_its is a const so the compiler knows the upper limit and can un-roll this loop if it needs to. Inside the loop, a return statement offers a way to leave the loop early for pixels that are outside of the Mandelbrot set.
You still don't want to set the max iterations too high, as this can produce a lot of GPU instructions and possibly hurt performance.
Try something like this:
const float MAX_ITERATIONS = 100.0;
// Go through the remaining 8 vertical samples (4 on each side of the center)
for (float i = 1.0; i <= MAX_ITERATIONS; i += 1.0) {
if (i >= numBlurPixelsPerSide){break;}
avgValue += texture2D(u_image, p - i * texOffset) * incrementalGaussian.x;
avgValue += texture2D(u_image, p + i * texOffset) * incrementalGaussian.x;
coefficientSum += 2.0 * incrementalGaussian.x;
incrementalGaussian.xy *= incrementalGaussian.yz;
}
Sometimes you can use my very simple solving of issue.
My fragment of the shader source code:
const int cloudPointsWidth = %s;
for ( int i = 0; i < cloudPointsWidth; i++ ) {
//TO DO something
}
You can see '%' : syntax error above. But I am replace %s to a number in my javascript code before use my shader. For example:
vertexCode = vertexCode.replace( '%s', 10 );
vertexCode is my shader source code.
Everytime if I want to change cloudPointsWidth, I am destroying my old shader and creating new shader with new cloudPointsWidth .
Hope sometimes my solving can to help you.
You can just do a for loop with large constant number and use a break.
for(int i = 0; i < 1000000; ++i)
{
// your code here
if(i >= n){
break;
}
}
I've had similar problem with image downsampling shader. The code is basically the same:
for (int dx = -2 * SCALE_FACTOR; dx < 2 * SCALE_FACTOR; dx += 2) {
for (int dy = -2 * SCALE_FACTOR; dy < 2 * SCALE_FACTOR; dy += 2) {
/* accumulate fragment's color */
}
}
What I've ended up doing is using preprocessor and creating separate shader programs for every SCALE_FACTOR used (luckily, only 4 was needed). To achieve that, a small helper function was implemented to add #define ... statements to shader code:
function insertDefines (shaderCode, defines) {
var defineString = '';
for (var define in defines) {
if (defines.hasOwnProperty(define)) {
defineString +=
'#define ' + define + ' ' + defines[define] + '\n';
}
}
var versionIdx = shaderCode.indexOf('#version');
if (versionIdx == -1) {
return defineString + shaderCode;
}
var nextLineIdx = shaderCode.indexOf('\n', versionIdx) + 1;
return shaderCode.slice(0, nextLineIdx) +
defineString +
shaderCode.slice(nextLineIdx);
}
The implementation is a bit tricky because if the code already has #version preprocessor statement in it, all other statements have to follow it.
Then I've added a check for SCALE_FACROR being defined:
#ifndef SCALE_FACTOR
# error SCALE_FACTOR is undefined
#endif
And in my javascript code I've done something like this:
var SCALE_FACTORS = [4, 8, 16, 32],
shaderCode, // the code of my shader
shaderPrograms = SCALE_FACTORS.map(function (factor) {
var codeWithDefines = insertDefines(shaderCode, { SCALE_FACTOR: factor });
/* compile shaders, link program, return */
});
I use opengl es3 on android and solve this problem by using extension above the beginning of program like this:
#extension GL_EXT_gpu_shader5 : require
I don't know whether it work on webGL, but you can try it.
Hope it can help.
You can also use template litterals to set the length of the loop
onBeforeCompile(shader) {
const array = [1,2,3,4,5];
shader.uniforms.myArray = { value: array };
let token = "#include <begin_vertex>";
const insert = `
uniform float myArray[${array.length}];
for ( int i = 0; i < ${array.length}; i++ ) {
float test = myArray[ i ];
}
`;
shader.vertexShader = shader.vertexShader.replace(token, token + insert);
}

WebGL: Loop index cannot be compared with non-constant expression

I have the a webgl blur shader:
precision mediump float;
precision mediump int;
uniform sampler2D u_image;
uniform float blur;
uniform int u_horizontalpass; // 0 or 1 to indicate vertical or horizontal pass
uniform float sigma; // The sigma value for the gaussian function: higher value means more blur
// A good value for 9x9 is around 3 to 5
// A good value for 7x7 is around 2.5 to 4
// A good value for 5x5 is around 2 to 3.5
// ... play around with this based on what you need :)
varying vec4 v_texCoord;
const vec2 texOffset = vec2(1.0, 1.0);
// uniform vec2 texOffset;
const float PI = 3.14159265;
void main() {
vec2 p = v_texCoord.st;
float numBlurPixelsPerSide = blur / 2.0;
// Incremental Gaussian Coefficent Calculation (See GPU Gems 3 pp. 877 - 889)
vec3 incrementalGaussian;
incrementalGaussian.x = 1.0 / (sqrt(2.0 * PI) * sigma);
incrementalGaussian.y = exp(-0.5 / (sigma * sigma));
incrementalGaussian.z = incrementalGaussian.y * incrementalGaussian.y;
vec4 avgValue = vec4(0.0, 0.0, 0.0, 0.0);
float coefficientSum = 0.0;
// Take the central sample first...
avgValue += texture2D(u_image, p) * incrementalGaussian.x;
coefficientSum += incrementalGaussian.x;
incrementalGaussian.xy *= incrementalGaussian.yz;
// Go through the remaining 8 vertical samples (4 on each side of the center)
for (float i = 1.0; i <= numBlurPixelsPerSide; i += 1.0) {
avgValue += texture2D(u_image, p - i * texOffset) * incrementalGaussian.x;
avgValue += texture2D(u_image, p + i * texOffset) * incrementalGaussian.x;
coefficientSum += 2.0 * incrementalGaussian.x;
incrementalGaussian.xy *= incrementalGaussian.yz;
}
gl_FragColor = avgValue / coefficientSum;
}
When I build, I get the following error message:
webgl-renderer.js?2eb3:137 Uncaught could not compile shader:ERROR:
0:38: 'i' : Loop index cannot be compared with non-constant expression
I have also tried to use just the uniform float blur to compare i to. Is there any way to fix this?
The problem is further detailed here: https://www.khronos.org/webgl/public-mailing-list/archives/1012/msg00063.php
The solution that I've found looking around is to only use a constant expression when comparing a loop var. This doesn't fit with what I need to do which is vary how many times I'm looping based on the blur radius.
Any thoughts on this?
This happens because on some hardware, GLSL loops are un-rolled into native GPU instructions. This means there needs to be a hard upper limit to the number of passes through the for loop, that governs how many copies of the loop's inner code will be generated. If you replace numBlurPixelsPerSide with a const float or even a #define directive, and the shader compiler can then determine the number of passes at compile time, and generate the code accordingly. But with a uniform there, the upper limit is not known at compile time.
There's an interesting wrinkle in this rule: You're allowed to break or call an early return out of a for loop, even though the max iterations must be discernible at compile time. For example, consider this tiny Mandelbrot shader. This is hardly the prettiest fractal on GLSL Sandbox, but I chose it for its small size:
precision mediump float;
uniform float time;
uniform vec2 mouse;
uniform vec2 resolution;
varying vec2 surfacePosition;
const float max_its = 100.;
float mandelbrot(vec2 z){
vec2 c = z;
for(float i=0.;i<max_its;i++){ // for loop is here.
if(dot(z,z)>4.) return i; // conditional early return here.
z = vec2(z.x*z.x-z.y*z.y,2.*z.x*z.y)+c;
}
return max_its;
}
void main( void ) {
vec2 p = surfacePosition;
gl_FragColor = vec4(mandelbrot(p)/max_its);
}
In this example, max_its is a const so the compiler knows the upper limit and can un-roll this loop if it needs to. Inside the loop, a return statement offers a way to leave the loop early for pixels that are outside of the Mandelbrot set.
You still don't want to set the max iterations too high, as this can produce a lot of GPU instructions and possibly hurt performance.
Try something like this:
const float MAX_ITERATIONS = 100.0;
// Go through the remaining 8 vertical samples (4 on each side of the center)
for (float i = 1.0; i <= MAX_ITERATIONS; i += 1.0) {
if (i >= numBlurPixelsPerSide){break;}
avgValue += texture2D(u_image, p - i * texOffset) * incrementalGaussian.x;
avgValue += texture2D(u_image, p + i * texOffset) * incrementalGaussian.x;
coefficientSum += 2.0 * incrementalGaussian.x;
incrementalGaussian.xy *= incrementalGaussian.yz;
}
Sometimes you can use my very simple solving of issue.
My fragment of the shader source code:
const int cloudPointsWidth = %s;
for ( int i = 0; i < cloudPointsWidth; i++ ) {
//TO DO something
}
You can see '%' : syntax error above. But I am replace %s to a number in my javascript code before use my shader. For example:
vertexCode = vertexCode.replace( '%s', 10 );
vertexCode is my shader source code.
Everytime if I want to change cloudPointsWidth, I am destroying my old shader and creating new shader with new cloudPointsWidth .
Hope sometimes my solving can to help you.
You can just do a for loop with large constant number and use a break.
for(int i = 0; i < 1000000; ++i)
{
// your code here
if(i >= n){
break;
}
}
I've had similar problem with image downsampling shader. The code is basically the same:
for (int dx = -2 * SCALE_FACTOR; dx < 2 * SCALE_FACTOR; dx += 2) {
for (int dy = -2 * SCALE_FACTOR; dy < 2 * SCALE_FACTOR; dy += 2) {
/* accumulate fragment's color */
}
}
What I've ended up doing is using preprocessor and creating separate shader programs for every SCALE_FACTOR used (luckily, only 4 was needed). To achieve that, a small helper function was implemented to add #define ... statements to shader code:
function insertDefines (shaderCode, defines) {
var defineString = '';
for (var define in defines) {
if (defines.hasOwnProperty(define)) {
defineString +=
'#define ' + define + ' ' + defines[define] + '\n';
}
}
var versionIdx = shaderCode.indexOf('#version');
if (versionIdx == -1) {
return defineString + shaderCode;
}
var nextLineIdx = shaderCode.indexOf('\n', versionIdx) + 1;
return shaderCode.slice(0, nextLineIdx) +
defineString +
shaderCode.slice(nextLineIdx);
}
The implementation is a bit tricky because if the code already has #version preprocessor statement in it, all other statements have to follow it.
Then I've added a check for SCALE_FACROR being defined:
#ifndef SCALE_FACTOR
# error SCALE_FACTOR is undefined
#endif
And in my javascript code I've done something like this:
var SCALE_FACTORS = [4, 8, 16, 32],
shaderCode, // the code of my shader
shaderPrograms = SCALE_FACTORS.map(function (factor) {
var codeWithDefines = insertDefines(shaderCode, { SCALE_FACTOR: factor });
/* compile shaders, link program, return */
});
I use opengl es3 on android and solve this problem by using extension above the beginning of program like this:
#extension GL_EXT_gpu_shader5 : require
I don't know whether it work on webGL, but you can try it.
Hope it can help.
You can also use template litterals to set the length of the loop
onBeforeCompile(shader) {
const array = [1,2,3,4,5];
shader.uniforms.myArray = { value: array };
let token = "#include <begin_vertex>";
const insert = `
uniform float myArray[${array.length}];
for ( int i = 0; i < ${array.length}; i++ ) {
float test = myArray[ i ];
}
`;
shader.vertexShader = shader.vertexShader.replace(token, token + insert);
}

Gaussian-distributed pseudo-random number generator in GLSL [duplicate]

As the GPU driver vendors don't usually bother to implement noiseX in GLSL, I'm looking for a "graphics randomization swiss army knife" utility function set, preferably optimised to use within GPU shaders. I prefer GLSL, but code any language will do for me, I'm ok with translating it on my own to GLSL.
Specifically, I'd expect:
a) Pseudo-random functions - N-dimensional, uniform distribution over [-1,1] or over [0,1], calculated from M-dimensional seed (ideally being any value, but I'm OK with having the seed restrained to, say, 0..1 for uniform result distribution). Something like:
float random (T seed);
vec2 random2 (T seed);
vec3 random3 (T seed);
vec4 random4 (T seed);
// T being either float, vec2, vec3, vec4 - ideally.
b) Continous noise like Perlin Noise - again, N-dimensional, +- uniform distribution, with constrained set of values and, well, looking good (some options to configure the appearance like Perlin levels could be useful too). I'd expect signatures like:
float noise (T coord, TT seed);
vec2 noise2 (T coord, TT seed);
// ...
I'm not very much into random number generation theory, so I'd most eagerly go for a pre-made solution, but I'd also appreciate answers like "here's a very good, efficient 1D rand(), and let me explain you how to make a good N-dimensional rand() on top of it..." .
For very simple pseudorandom-looking stuff, I use this oneliner that I found on the internet somewhere:
float rand(vec2 co){
return fract(sin(dot(co, vec2(12.9898, 78.233))) * 43758.5453);
}
You can also generate a noise texture using whatever PRNG you like, then upload this in the normal fashion and sample the values in your shader; I can dig up a code sample later if you'd like.
Also, check out this file for GLSL implementations of Perlin and Simplex noise, by Stefan Gustavson.
It occurs to me that you could use a simple integer hash function and insert the result into a float's mantissa. IIRC the GLSL spec guarantees 32-bit unsigned integers and IEEE binary32 float representation so it should be perfectly portable.
I gave this a try just now. The results are very good: it looks exactly like static with every input I tried, no visible patterns at all. In contrast the popular sin/fract snippet has fairly pronounced diagonal lines on my GPU given the same inputs.
One disadvantage is that it requires GLSL v3.30. And although it seems fast enough, I haven't empirically quantified its performance. AMD's Shader Analyzer claims 13.33 pixels per clock for the vec2 version on a HD5870. Contrast with 16 pixels per clock for the sin/fract snippet. So it is certainly a little slower.
Here's my implementation. I left it in various permutations of the idea to make it easier to derive your own functions from.
/*
static.frag
by Spatial
05 July 2013
*/
#version 330 core
uniform float time;
out vec4 fragment;
// A single iteration of Bob Jenkins' One-At-A-Time hashing algorithm.
uint hash( uint x ) {
x += ( x << 10u );
x ^= ( x >> 6u );
x += ( x << 3u );
x ^= ( x >> 11u );
x += ( x << 15u );
return x;
}
// Compound versions of the hashing algorithm I whipped together.
uint hash( uvec2 v ) { return hash( v.x ^ hash(v.y) ); }
uint hash( uvec3 v ) { return hash( v.x ^ hash(v.y) ^ hash(v.z) ); }
uint hash( uvec4 v ) { return hash( v.x ^ hash(v.y) ^ hash(v.z) ^ hash(v.w) ); }
// Construct a float with half-open range [0:1] using low 23 bits.
// All zeroes yields 0.0, all ones yields the next smallest representable value below 1.0.
float floatConstruct( uint m ) {
const uint ieeeMantissa = 0x007FFFFFu; // binary32 mantissa bitmask
const uint ieeeOne = 0x3F800000u; // 1.0 in IEEE binary32
m &= ieeeMantissa; // Keep only mantissa bits (fractional part)
m |= ieeeOne; // Add fractional part to 1.0
float f = uintBitsToFloat( m ); // Range [1:2]
return f - 1.0; // Range [0:1]
}
// Pseudo-random value in half-open range [0:1].
float random( float x ) { return floatConstruct(hash(floatBitsToUint(x))); }
float random( vec2 v ) { return floatConstruct(hash(floatBitsToUint(v))); }
float random( vec3 v ) { return floatConstruct(hash(floatBitsToUint(v))); }
float random( vec4 v ) { return floatConstruct(hash(floatBitsToUint(v))); }
void main()
{
vec3 inputs = vec3( gl_FragCoord.xy, time ); // Spatial and temporal inputs
float rand = random( inputs ); // Random per-pixel value
vec3 luma = vec3( rand ); // Expand to RGB
fragment = vec4( luma, 1.0 );
}
Screenshot:
I inspected the screenshot in an image editing program. There are 256 colours and the average value is 127, meaning the distribution is uniform and covers the expected range.
Gustavson's implementation uses a 1D texture
No it doesn't, not since 2005. It's just that people insist on downloading the old version. The version that is on the link you supplied uses only 8-bit 2D textures.
The new version by Ian McEwan of Ashima and myself does not use a texture, but runs at around half the speed on typical desktop platforms with lots of texture bandwidth. On mobile platforms, the textureless version might be faster because texturing is often a significant bottleneck.
Our actively maintained source repository is:
https://github.com/ashima/webgl-noise
A collection of both the textureless and texture-using versions of noise is here (using only 2D textures):
http://www.itn.liu.se/~stegu/simplexnoise/GLSL-noise-vs-noise.zip
If you have any specific questions, feel free to e-mail me directly (my email address can be found in the classicnoise*.glsl sources.)
Gold Noise
// Gold Noise ©2015 dcerisano#standard3d.com
// - based on the Golden Ratio
// - uniform normalized distribution
// - fastest static noise generator function (also runs at low precision)
// - use with indicated fractional seeding method.
float PHI = 1.61803398874989484820459; // Φ = Golden Ratio
float gold_noise(in vec2 xy, in float seed){
return fract(tan(distance(xy*PHI, xy)*seed)*xy.x);
}
See Gold Noise in your browser right now!
This function has improved random distribution over the current function in #appas' answer as of Sept 9, 2017:
The #appas function is also incomplete, given there is no seed supplied (uv is not a seed - same for every frame), and does not work with low precision chipsets. Gold Noise runs at low precision by default (much faster).
There is also a nice implementation described here by McEwan and #StefanGustavson that looks like Perlin noise, but "does not require any setup, i.e. not textures nor uniform arrays. Just add it to your shader source code and call it wherever you want".
That's very handy, especially given that Gustavson's earlier implementation, which #dep linked to, uses a 1D texture, which is not supported in GLSL ES (the shader language of WebGL).
After the initial posting of this question in 2010, a lot has changed in the realm of good random functions and hardware support for them.
Looking at the accepted answer from today's perspective, this algorithm is very bad in uniformity of the random numbers drawn from it. And the uniformity suffers a lot depending on the magnitude of the input values and visible artifacts/patterns will become apparent when sampling from it for e.g. ray/path tracing applications.
There have been many different functions (most of them integer hashing) being devised for this task, for different input and output dimensionality, most of which are being evaluated in the 2020 JCGT paper Hash Functions for GPU Rendering. Depending on your needs you could select a function from the list of proposed functions in that paper and simply from the accompanying Shadertoy.
One that isn't covered in this paper but that has served me very well without any noticeably patterns on any input magnitude values is also one that I want to highlight.
Other classes of algorithms use low-discrepancy sequences to draw pseudo-random numbers from, such as the Sobol squence with Owen-Nayar scrambling. Eric Heitz has done some amazing research in this area, as well with his A Low-Discrepancy Sampler that Distributes Monte Carlo Errors as a Blue Noise in Screen Space paper.
Another example of this is the (so far latest) JCGT paper Practical Hash-based Owen Scrambling, which applies Owen scrambling to a different hash function (namely Laine-Karras).
Yet other classes use algorithms that produce noise patterns with desirable frequency spectrums, such as blue noise, that is particularly "pleasing" to the eyes.
(I realize that good StackOverflow answers should provide the algorithms as source code and not as links because those can break, but there are way too many different algorithms nowadays and I intend for this answer to be a summary of known-good algorithms today)
Do use this:
highp float rand(vec2 co)
{
highp float a = 12.9898;
highp float b = 78.233;
highp float c = 43758.5453;
highp float dt= dot(co.xy ,vec2(a,b));
highp float sn= mod(dt,3.14);
return fract(sin(sn) * c);
}
Don't use this:
float rand(vec2 co){
return fract(sin(dot(co.xy ,vec2(12.9898,78.233))) * 43758.5453);
}
You can find the explanation in Improvements to the canonical one-liner GLSL rand() for OpenGL ES 2.0
hash:
Nowadays webGL2.0 is there so integers are available in (w)GLSL.
-> for quality portable hash (at similar cost than ugly float hashes) we can now use "serious" hashing techniques.
IQ implemented some in https://www.shadertoy.com/view/XlXcW4 (and more)
E.g.:
const uint k = 1103515245U; // GLIB C
//const uint k = 134775813U; // Delphi and Turbo Pascal
//const uint k = 20170906U; // Today's date (use three days ago's dateif you want a prime)
//const uint k = 1664525U; // Numerical Recipes
vec3 hash( uvec3 x )
{
x = ((x>>8U)^x.yzx)*k;
x = ((x>>8U)^x.yzx)*k;
x = ((x>>8U)^x.yzx)*k;
return vec3(x)*(1.0/float(0xffffffffU));
}
Just found this version of 3d noise for GPU, alledgedly it is the fastest one available:
#ifndef __noise_hlsl_
#define __noise_hlsl_
// hash based 3d value noise
// function taken from https://www.shadertoy.com/view/XslGRr
// Created by inigo quilez - iq/2013
// License Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
// ported from GLSL to HLSL
float hash( float n )
{
return frac(sin(n)*43758.5453);
}
float noise( float3 x )
{
// The noise function returns a value in the range -1.0f -> 1.0f
float3 p = floor(x);
float3 f = frac(x);
f = f*f*(3.0-2.0*f);
float n = p.x + p.y*57.0 + 113.0*p.z;
return lerp(lerp(lerp( hash(n+0.0), hash(n+1.0),f.x),
lerp( hash(n+57.0), hash(n+58.0),f.x),f.y),
lerp(lerp( hash(n+113.0), hash(n+114.0),f.x),
lerp( hash(n+170.0), hash(n+171.0),f.x),f.y),f.z);
}
#endif
A straight, jagged version of 1d Perlin, essentially a random lfo zigzag.
half rn(float xx){
half x0=floor(xx);
half x1=x0+1;
half v0 = frac(sin (x0*.014686)*31718.927+x0);
half v1 = frac(sin (x1*.014686)*31718.927+x1);
return (v0*(1-frac(xx))+v1*(frac(xx)))*2-1*sin(xx);
}
I also have found 1-2-3-4d perlin noise on shadertoy owner inigo quilez perlin tutorial website, and voronoi and so forth, he has full fast implementations and codes for them.
I have translated one of Ken Perlin's Java implementations into GLSL and used it in a couple projects on ShaderToy.
Below is the GLSL interpretation I did:
int b(int N, int B) { return N>>B & 1; }
int T[] = int[](0x15,0x38,0x32,0x2c,0x0d,0x13,0x07,0x2a);
int A[] = int[](0,0,0);
int b(int i, int j, int k, int B) { return T[b(i,B)<<2 | b(j,B)<<1 | b(k,B)]; }
int shuffle(int i, int j, int k) {
return b(i,j,k,0) + b(j,k,i,1) + b(k,i,j,2) + b(i,j,k,3) +
b(j,k,i,4) + b(k,i,j,5) + b(i,j,k,6) + b(j,k,i,7) ;
}
float K(int a, vec3 uvw, vec3 ijk)
{
float s = float(A[0]+A[1]+A[2])/6.0;
float x = uvw.x - float(A[0]) + s,
y = uvw.y - float(A[1]) + s,
z = uvw.z - float(A[2]) + s,
t = 0.6 - x * x - y * y - z * z;
int h = shuffle(int(ijk.x) + A[0], int(ijk.y) + A[1], int(ijk.z) + A[2]);
A[a]++;
if (t < 0.0)
return 0.0;
int b5 = h>>5 & 1, b4 = h>>4 & 1, b3 = h>>3 & 1, b2= h>>2 & 1, b = h & 3;
float p = b==1?x:b==2?y:z, q = b==1?y:b==2?z:x, r = b==1?z:b==2?x:y;
p = (b5==b3 ? -p : p); q = (b5==b4 ? -q : q); r = (b5!=(b4^b3) ? -r : r);
t *= t;
return 8.0 * t * t * (p + (b==0 ? q+r : b2==0 ? q : r));
}
float noise(float x, float y, float z)
{
float s = (x + y + z) / 3.0;
vec3 ijk = vec3(int(floor(x+s)), int(floor(y+s)), int(floor(z+s)));
s = float(ijk.x + ijk.y + ijk.z) / 6.0;
vec3 uvw = vec3(x - float(ijk.x) + s, y - float(ijk.y) + s, z - float(ijk.z) + s);
A[0] = A[1] = A[2] = 0;
int hi = uvw.x >= uvw.z ? uvw.x >= uvw.y ? 0 : 1 : uvw.y >= uvw.z ? 1 : 2;
int lo = uvw.x < uvw.z ? uvw.x < uvw.y ? 0 : 1 : uvw.y < uvw.z ? 1 : 2;
return K(hi, uvw, ijk) + K(3 - hi - lo, uvw, ijk) + K(lo, uvw, ijk) + K(0, uvw, ijk);
}
I translated it from Appendix B from Chapter 2 of Ken Perlin's Noise Hardware at this source:
https://www.csee.umbc.edu/~olano/s2002c36/ch02.pdf
Here is a public shade I did on Shader Toy that uses the posted noise function:
https://www.shadertoy.com/view/3slXzM
Some other good sources I found on the subject of noise during my research include:
https://thebookofshaders.com/11/
https://mzucker.github.io/html/perlin-noise-math-faq.html
https://rmarcus.info/blog/2018/03/04/perlin-noise.html
http://flafla2.github.io/2014/08/09/perlinnoise.html
https://mrl.nyu.edu/~perlin/noise/
https://rmarcus.info/blog/assets/perlin/perlin_paper.pdf
https://developer.nvidia.com/gpugems/GPUGems/gpugems_ch05.html
I highly recommend the book of shaders as it not only provides a great interactive explanation of noise, but other shader concepts as well.
EDIT:
Might be able to optimize the translated code by using some of the hardware-accelerated functions available in GLSL. Will update this post if I end up doing this.
lygia, a multi-language shader library
If you don't want to copy / paste the functions into your shader, you can also use lygia, a multi-language shader library. It contains a few generative functions like cnoise, fbm, noised, pnoise, random, snoise in both GLSL and HLSL. And many other awesome functions as well. For this to work it:
Relays on #include "file" which is defined by Khronos GLSL standard and suported by most engines and enviroments (like glslViewer, glsl-canvas VS Code pluging, Unity, etc. ).
Example: cnoise
Using cnoise.glsl with #include:
#ifdef GL_ES
precision mediump float;
#endif
uniform vec2 u_resolution;
uniform float u_time;
#include "lygia/generative/cnoise.glsl"
void main (void) {
vec2 st = gl_FragCoord.xy / u_resolution.xy;
vec3 color = vec3(cnoise(vec3(st * 5.0, u_time)));
gl_FragColor = vec4(color, 1.0);
}
To run this example I used glslViewer.
Please see below an example how to add white noise to the rendered texture.
The solution is to use two textures: original and pure white noise, like this one: wiki white noise
private static final String VERTEX_SHADER =
"uniform mat4 uMVPMatrix;\n" +
"uniform mat4 uMVMatrix;\n" +
"uniform mat4 uSTMatrix;\n" +
"attribute vec4 aPosition;\n" +
"attribute vec4 aTextureCoord;\n" +
"varying vec2 vTextureCoord;\n" +
"varying vec4 vInCamPosition;\n" +
"void main() {\n" +
" vTextureCoord = (uSTMatrix * aTextureCoord).xy;\n" +
" gl_Position = uMVPMatrix * aPosition;\n" +
"}\n";
private static final String FRAGMENT_SHADER =
"precision mediump float;\n" +
"uniform sampler2D sTextureUnit;\n" +
"uniform sampler2D sNoiseTextureUnit;\n" +
"uniform float uNoseFactor;\n" +
"varying vec2 vTextureCoord;\n" +
"varying vec4 vInCamPosition;\n" +
"void main() {\n" +
" gl_FragColor = texture2D(sTextureUnit, vTextureCoord);\n" +
" vec4 vRandChosenColor = texture2D(sNoiseTextureUnit, fract(vTextureCoord + uNoseFactor));\n" +
" gl_FragColor.r += (0.05 * vRandChosenColor.r);\n" +
" gl_FragColor.g += (0.05 * vRandChosenColor.g);\n" +
" gl_FragColor.b += (0.05 * vRandChosenColor.b);\n" +
"}\n";
The fragment shared contains parameter uNoiseFactor which is updated on every rendering by main application:
float noiseValue = (float)(mRand.nextInt() % 1000)/1000;
int noiseFactorUniformHandle = GLES20.glGetUniformLocation( mProgram, "sNoiseTextureUnit");
GLES20.glUniform1f(noiseFactorUniformHandle, noiseFactor);
FWIW I had the same questions and I needed it to be implemented in WebGL 1.0, so I couldn't use a few of the examples given in previous answers. I tried the Gold Noise mentioned before, but the use of PHI doesn't really click for me. (distance(xy * PHI, xy) * seed just equals length(xy) * (1.0 - PHI) * seed so I don't see how the magic of PHI should be put to work when it gets directly multiplied by seed?
Anyway, I did something similar just without PHI and instead added some variation at another place, basically I take the tan of the distance between xy and some random point lying outside of the frame to the top right and then multiply with the distance between xy and another such random point lying in the bottom left (so there is no accidental match between these points). Looks pretty decent as far as I can see. Click to generate new frames.
(function main() {
const dim = [512, 512];
twgl.setDefaults({ attribPrefix: "a_" });
const gl = twgl.getContext(document.querySelector("canvas"));
gl.canvas.width = dim[0];
gl.canvas.height = dim[1];
const bfi = twgl.primitives.createXYQuadBufferInfo(gl);
const pgi = twgl.createProgramInfo(gl, ["vs", "fs"]);
gl.canvas.onclick = (() => {
twgl.bindFramebufferInfo(gl, null);
gl.useProgram(pgi.program);
twgl.setUniforms(pgi, {
u_resolution: dim,
u_seed: Array(4).fill().map(Math.random)
});
twgl.setBuffersAndAttributes(gl, pgi, bfi);
twgl.drawBufferInfo(gl, bfi);
});
})();
<script src="https://twgljs.org/dist/4.x/twgl-full.min.js"></script>
<script id="vs" type="x-shader/x-vertex">
attribute vec4 a_position;
attribute vec2 a_texcoord;
void main() {
gl_Position = a_position;
}
</script>
<script id="fs" type="x-shader/x-fragment">
precision highp float;
uniform vec2 u_resolution;
uniform vec2 u_seed[2];
void main() {
float uni = fract(
tan(distance(
gl_FragCoord.xy,
u_resolution * (u_seed[0] + 1.0)
)) * distance(
gl_FragCoord.xy,
u_resolution * (u_seed[1] - 2.0)
)
);
gl_FragColor = vec4(uni, uni, uni, 1.0);
}
</script>
<canvas></canvas>

how to implement grayscale rendering in OpenGL?

When rendering a scene of textured polygons, I'd like to be able to switch between rendering in the original colors and a "grayscale" mode. I've been trying to achieve this using blending and color matrix operations; none of it worked (with blending I couldn't find a glBlendFunc() that achieved something remotely resembling to what I wanted, and color matrix operations ...are discussed here).
A solution that comes to mind (but also is rather expensive) is to capture the screen every frame and convert the resulting texture to a grayscale one and display that instead... (Where I said grayscale I actually meant anything with a low saturation, but I'm guessing for most of the possible solutions it won't differ all that much from grayscale).
What other options do I have?
The default OpenGL framebuffer uses the RGB colour-space, which doesn't store an explicit saturation. You need an approach for extracting the saturation, modifying it, and change it back again.
My previous suggestion which simply used the RGB vector length to represent 0 in luminance was incorrect, as it didn't take scaling into account, I apologize.
Credit for the new short snippet goes to the regular user "RTFM_FTW" from ##opengl and ##opengl3 on FreeNode/IRC, and it lets you modify the saturation directly without computing the costly RGB->HSV->RGB conversion, which is exactly what you want. Though the HSV code is inferior with respect to your question, I let it stay.
void main( void )
{
vec3 R0 = texture2DRect( S, gl_TexCoord[0].st ).rgb;
gl_FragColor = vec4( mix( vec3( dot( R0, vec3( 0.2125, 0.7154, 0.0721 ) ) ),
R0, T ), gl_Color.a );
}
If you want more control than just the saturation, you need to convert to HSL or HSV colour-space. As shown below by using a GLSL fragment shader.
Read the OpenGL 3.0 and GLSL 1.30 specification available on http://www.opengl.org/registry to learn how to use GLSL v1.30 functionality.
#version 130
#define RED 0
#define GREEN 1
#define BLUE 2
in vec4 vertexIn;
in vec4 colorIn;
in vec2 tcoordIn;
out vec4 pixel;
Sampler2D tex;
vec4 texel;
const float epsilon = 1e-6;
vec3 RGBtoHSV(vec3 color)
{
/* hue, saturation and value are all in the range [0,1> here, as opposed to their
normal ranges of: hue: [0,360>, sat: [0, 100] and value: [0, 256> */
int sortindex[3] = {RED,GREEN,BLUE};
float rgbArr[3] = float[3](color.r, color.g, color.b);
float hue, saturation, value, diff;
float minCol, maxCol;
int minIndex, maxIndex;
if(color.g < color.r)
swap(sortindex[0], sortindex[1]);
if(color.b < color.g)
swap(sortindex[1], sortindex[2]);
if(color.r < color.b)
swap(sortindex[2], sortindex[0]);
minIndex = sortindex[0];
maxIndex = sortindex[2];
minCol = rgbArr[minIndex];
maxCol = rgbArr[maxIndex];
diff = maxCol - minCol;
/* Hue */
if( diff < epsilon){
hue = 0.0;
}
else if(maxIndex == RED){
hue = ((1.0/6.0) * ( (color.g - color.b) / diff )) + 1.0;
hue = fract(hue);
}
else if(maxIndex == GREEN){
hue = ((1.0/6.0) * ( (color.b - color.r) / diff )) + (1.0/3.0);
}
else if(maxIndex == BLUE){
hue = ((1.0/6.0) * ( (color.r - color.g) / diff )) + (2.0/3.0);
}
/* Saturation */
if(maxCol < epsilon)
saturation = 0;
else
saturation = (maxCol - minCol) / maxCol;
/* Value */
value = maxCol;
return vec3(hue, saturation, value);
}
vec3 HSVtoRGB(vec3 color)
{
float f,p,q,t, hueRound;
int hueIndex;
float hue, saturation, value;
vec3 result;
/* just for clarity */
hue = color.r;
saturation = color.g;
value = color.b;
hueRound = floor(hue * 6.0);
hueIndex = int(hueRound) % 6;
f = (hue * 6.0) - hueRound;
p = value * (1.0 - saturation);
q = value * (1.0 - f*saturation);
t = value * (1.0 - (1.0 - f)*saturation);
switch(hueIndex)
{
case 0:
result = vec3(value,t,p);
break;
case 1:
result = vec3(q,value,p);
break;
case 2:
result = vec3(p,value,t);
break;
case 3:
result = vec3(p,q,value);
break;
case 4:
result = vec3(t,p,value);
break;
default:
result = vec3(value,p,q);
break;
}
return result;
}
void main(void)
{
vec4 srcColor;
vec3 hsvColor;
vec3 rgbColor;
texel = Texture2D(tex, tcoordIn);
srcColor = texel*colorIn;
hsvColor = RGBtoHSV(srcColor.rgb);
/* You can do further changes here, if you want. */
hsvColor.g = 0; /* Set saturation to zero */
rgbColor = HSVtoRGB(hsvColor);
pixel = vec4(rgbColor.r, rgbColor.g, rgbColor.b, srcColor.a);
}
If you're working against a modern-enough OpenGL, I would say pixel shaders is a very suitable solution here. Either by hooking into each polygon's shading as they render, or by doing a single full-screen quad in a second pass that just reads each pixel, converts to grayscale, and writes it back. Unless your resolution, graphics hardware, and target framerate are somehow "extrem", that should be doable these days in most cases.
For most Desktops Render-To-Texture isn't that expensive anymore, all of compiz, aero, etc and effects like bloom or depth of field seen in recent titles depend on it.
Actually you don't convert the screen texture per se to grayscale, you would want to draw a scree-sized quad with the texture and a fragment shader transforming the valures to grayscale.
Another option is to have two sets of fragment shaders for your triangles, one just copying the gl_FrontColor attribute as the fixed function pieline would, and another that writes grayscale values to the screen buffer.
A third option might be indexed color modes, if you set uüp a grayscale palette, but that mode might be deprecated and poorly supported by now; plus you lose a lot of functionality like blending, if I remember correctly.