What is the underlying implementation of normalize() and length() in GLSL? I am trying to gauge the performance of my code and want to know which instructions are executed for certain built-in functions in GLSL. Where can I get more information on the underlying implementation of other built-in functions?
The OpenGL Shading Language spec generally doesn't require a particular implementation of its functions, as long as they give the results it specifies. In GLSL 4.5, for example, see the correctness requirements on page 84, and the functions length() and normalize() on page 150.
Moreover, GLSL doesn't define a binary format for compiled shader code; that format is accordingly implementation dependent.
However, in general, I presume the length function would be implemented using a dot product and a square root, and the normalize function would be implemented by calling the length function and also doing three divisions, or one division and a vector multiplication.
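As a rough illustration only (a sketch of the math, not actual compiler output; the helper names myLength and myNormalize are made up), the two functions could be expanded like this in GLSL:

float myLength(vec3 v) {
    // length(v) is sqrt(dot(v, v)): one dot product and one square root.
    return sqrt(dot(v, v));
}

vec3 myNormalize(vec3 v) {
    // normalize(v) is v / length(v), here written as one division
    // followed by a vector multiplication.
    return v * (1.0 / myLength(v));
}

What a driver actually emits (for example, whether it uses a reciprocal square root instruction) is entirely up to the implementation.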
The answer is wildly GPU/vendor dependent, but you may get a clue by looking at the output of the AMD Shader Analyzer, or by taking a peek into 'binary' dumps of compiled shaders on NVIDIA; last time I checked (a long time ago), they contained assembly code in text form.
Let's say I have a varying variable between any two GLSL shader stages (e.g. the vertex and fragment stage) declared as a vec4:
in/out/varying vec4 texCoord;
What happens if I only use part of that variable (say, through swizzling) in both shaders, i.e. I only write to a part of it in the vertex shader and only read from that same part in the fragment shader?
// vertex shader
texCoord.st = ...
// fragment shader
... = texture2D(..., texCoord.st);
Is that guaranteed (i.e. by the specification) to always produce sane results? It seems reasonable that it does; however, I'm not too well-versed in the intricacies of GLSL language-lawyering and don't know if that varying variable is interpreted as somehow "incomplete" by the compiler/linker because it isn't fully written in the preceding stage. I'm sure the values of texCoord.pq will be undefined anyway, but does that affect the validity of texCoord.st too, or does the whole varying system operate on a pure component level?
I haven't found anything to that effect in the GLSL specification at first glance, and I would prefer answers based either on the actual specification or any other "official" guarantees, rather than statements that it should work on reasonable hardware (unless, of course, this case simply is unspecified or implementation-defined). I would also be interested in any changes to this throughout GLSL's history, including all the way back to its application to deprecated built-in varying variables like gl_TexCoord[] in good old GLSL 1.10.
I'm trying to argue that your code will be fine, as per the specification. However, I'm not sure if you will find my reasoning 100% convincing, because I think that the spec seems somewhat imprecise about this. I'm going to refer to the OpenGL 4.5 Core Profile Specification and the OpenGL Shading language 4.50 specification.
Concerning input and output variables, the GLSL spec establishes the following in section 4.3.4:
Shader input variables are declared with the storage qualifier in. They form the input interface between previous stages of the OpenGL pipeline and the declaring shader. [...] Values from the previous pipeline stage are copied into input variables at the beginning of shader execution.
and 4.3.6, respectively:
Shader output variables are declared with a storage qualifier using the storage qualifier out. They form the output interface between the declaring shader and the subsequent stages of the OpenGL pipeline. [...] During shader execution they will behave as normal unqualified global variables. Their values are copied out to the subsequent pipeline stage on shader exit. Only output variables that are read by the subsequent pipeline stage need to be written; it is allowed to have superfluous declarations of output variables.
Section 5.8 "Assignments" establishes that
Reading a variable before writing (or initializing) it is legal, however the value is undefined.
Since the assignment to .st writes only that sub-vector, we can establish that the variable will contain two initialized and two uninitialized components at the end of the shader invocation, and that the whole vector will be copied to the output.
Section 11.1.2.1 of the GL spec states:
If the output variables are passed directly to the vertex processing stages leading to rasterization, the values of all outputs are expected to be interpolated across the primitive being rendered, unless flatshaded. Otherwise the values of all outputs are collected by the primitive assembly stage and passed on to the subsequent pipeline stage once enough data for one primitive has been collected.
"The values of all outputs" are determined by the shader, and although some components have undefined values, they still have values, and there is no undefined or implementation-defined behavior here. The interpolation formulas for the line and polygon primitives (sections 14.5.1 and 14.6.1) also never mix between the components, so any defined component value will result in a defined value in the interpolated datum.
Section 11.1.2.1 also contains this statement about the vertex shader outputs:
When a program is linked, all components of any outputs written by a vertex shader will count against this limit. A program whose vertex shader writes more than the value of MAX_VERTEX_OUTPUT_COMPONENTS components worth of outputs may fail to link, unless device-dependent optimizations are able to make the program fit within available hardware resources.
Note that this language implies that the full 4 components of a vec4 are counted against the limit as soon as a single component is written to.
On output variables, the specification says:
Their values are copied out to the subsequent pipeline stage on shader exit.
So the question boils down to two things:
What is the value of such an output variable?
That is easily answered. The section on swizzling makes it clear that writing to a swizzle mask will not modify the components that are not part of the swizzle mask. Since you did not write to those components, their values are undefined. So undefined values will be copied out to the subsequent pipeline stage.
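As a concrete illustration of that swizzle-write rule (a trivial sketch, not taken from the spec):

vec4 v;                  // all four components start out undefined
v.st = vec2(0.25, 0.75); // writes only .s and .t
// v.s and v.t are now defined; v.p and v.q were not touched by the
// swizzled assignment and remain undefined.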
Will interpolation of undefined values affect the interpolation of defined values?
No. Interpolation is a component-wise operation. The result of one component's interpolation cannot affect another's.
So this is fine.
I have a GLSL shader that's supposed to output NaNs when a condition is met. I'm having trouble actually making that happen.
Basically I want to do this:
float result = condition ? NaN : whatever;
But GLSL doesn't seem to have a constant for NaN, so that doesn't compile. How do I make a NaN?
I tried making the constant myself:
float NaN = 0.0/0.0; // doesn't work
That works on one of the machines I tested, but not on another. Also it causes warnings when compiling the shader.
Given that the obvious computation didn't work on one of the machines I tried, I get the feeling that doing this correctly is quite tricky and involves knowing a lot of real-world facts about the inconsistencies between various types of GPUs.
Don't use NaNs here.
Section 2.3.4.1 from the OpenGL ES 3.2 Spec states that
The special values Inf and −Inf encode values with magnitudes too large to be represented; the special value NaN encodes “Not A Number” values resulting from undefined arithmetic operations such as 0/0. Implementations are permitted, but not required, to support Inf's and NaN's in their floating-point computations.
So it really seems to depend on the implementation. You should be outputting another value instead of NaN.
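For example, one option (just a sketch; the output layout and names are made up) is to carry an explicit validity flag next to the result instead of encoding "no result" as NaN:

// Fragment shader sketch: .x carries the value, .y is 1.0 when the value
// is valid and 0.0 when the "no result" condition is met.
out vec2 resultAndValid;

void writeResult(bool condition, float whatever) {
    resultAndValid = condition ? vec2(0.0, 0.0) : vec2(whatever, 1.0);
}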
Pass it in as a uniform
Instead of trying to make the NaN in GLSL, make it in JavaScript and then pass it in:
// GLSL: declare the uniform in the shader
uniform float u_NaN;

// JavaScript: look up the location and set it to NaN before drawing
const loc = gl.getUniformLocation(program, "u_NaN");
gl.uniform1f(loc, NaN);
Fool the Optimizer
It seems like the issue is the shader compiler performing an incorrect optimization. Basically, it replaces a NaN expression with 0.0. I have no idea why it would do that... but it does. Maybe the spec allows for undefined behavior?
Based on that assumption, I tried making an obfuscated method that produces a NaN:
float makeNaN(float nonneg) {
    // sqrt() of a negative argument produces NaN; the -1.0 keeps the
    // argument strictly negative even when nonneg is 0.0.
    return sqrt(-nonneg - 1.0);
}
...
float NaN = makeNaN(some_variable_I_know_isnt_negative);
The idea is that the optimizer isn't clever enough to see through this.
And, on the test machine that was failing, this works! I also tried simplifying the function to just return sqrt(-1.0), but that brought back the failure (further reinforcing my belief that the optimizer is at fault).
This is a workaround, not a solution.
A sufficiently clever optimizer could see through the obfuscation and start breaking things again.
I only tested it on a couple of machines, and this is clearly something that varies a lot.
The Unity GLSL compiler will convert 0.0f/0.0f to intBitsToFloat(int(0xFFC00000u)); since intBitsToFloat is supported from OpenGL ES 3.0 onwards, this is a solution that works in WebGL2 but not in WebGL1.
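If you can rely on GLSL ES 3.00 (WebGL2) or desktop GLSL 3.30+, you can construct that bit pattern yourself. A minimal sketch (the function name is just for illustration):

float makeNaNFromBits() {
    // 0xFFC00000 is a quiet-NaN bit pattern (sign bit set, exponent all
    // ones, top mantissa bit set); building it from bits leaves no
    // floating-point arithmetic for the optimizer to fold away.
    return intBitsToFloat(int(0xFFC00000u));
}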
Suppose that I have one shader storage buffer and want to have several views into it, e.g. like this:
layout(std430,binding=0) buffer FloatView { float floats[]; };
layout(std430,binding=0) buffer IntView { int ints[]; };
Is this legal GLSL?
opengl.org says no:
Two blocks cannot use the same index.
However, I could not find such a statement in the GL 4.5 Core Spec or the GLSL 4.50 Spec (or the ARB_shader_storage_buffer_object extension description), and my NVIDIA driver seems to compile such code without errors or warnings.
Does the OpenGL specification expressly forbid this? Apparently not. Or at least, if it does, I can't see where.
But that doesn't mean that it will work cross-platform. When dealing with OpenGL, it's always best to take the conservative path.
If you need to "cast" memory from one representation to another, you should just use separate binding points. It's safer.
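For example (a sketch; the binding numbers are arbitrary), give each view its own binding point and bind the same buffer object to both indices from the API side (e.g. with glBindBufferBase):

layout(std430, binding = 0) buffer FloatView { float floats[]; };
layout(std430, binding = 1) buffer IntView   { int   ints[];   };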
There is some official word on this now. I filed a bug on this issue, and they've read it and decided some things. Specifically, the conclusion was:
There are separate binding namespaces for: atomic counters, images, textures, uniform buffers, and SSBOs.
We don't want to allow aliasing on any of them except atomic counters, where aliasing with different offsets (e.g. sharing a binding) is allowed.
In short, don't do this. Hopefully, the GLSL specification will be clarified in this regard.
This was "fixed" in revision 7 of GLSL 4.5:
It is a compile-time or link-time error to use the same binding number for more than one uniform block or for more than one buffer block.
I say "fixed" because you can still perform aliasing manually via glUniformBlockBinding/glShaderStorageBlockBinding. And the specification doesn't say how this will work exactly.
OpenGL buffer objects support various data types of well-defined width (GL_FLOAT is 32-bit, GL_HALF_FLOAT is 16-bit, GL_INT is 32-bit, ...).
How would one go about ensuring cross platform and futureproof types for OpenGL?
For example, feeding float data from a C++ array to a buffer object and saying its type is GL_FLOAT will not work on platforms where float isn't 32 bits.
While doing some research on this, I noticed a subtle but interesting change in how these types are defined in the GL specs. The change happened between OpenGL 4.1 and 4.2.
Up to OpenGL 4.1, the table that lists the data types (Table 2.2 in the recent spec documents) has the header Minimum Bit Width for the size column, and the table caption says (emphasis added by me):
GL types are not C types. Thus, for example, GL type int is referred to as GLint outside this document, and is not necessarily equivalent to the C type int. An implementation may use more bits than the number indicated in the table to represent a GL type. Correct interpretation of integer values outside the minimum range is not required, however.
Starting with the OpenGL 4.2 spec, the table header changes to Bit Width, and the table caption to:
GL types are not C types. Thus, for example, GL type int is referred to as GLint outside this document, and is not necessarily equivalent to the C type int. An implementation must use exactly the number of bits indicated in the table to represent a GL type.
This influenced the answer to the question. If you go with the latest definition, you can use standard sized type definitions instead of the GL types in your code, and safely assume that they match. For example, you can use int32_t from <cstdint> instead of GLint.
Using the GL types is still the most straightforward solution. Depending on your code architecture and preferences, it might be undesirable, though. If you like to divide your software into components, and want to have OpenGL rendering isolated in a single component while providing a certain level of abstraction, you probably don't want to use GL types all over your code. Yet, once the data reaches the rendering code, it has to match the corresponding GL types.
As a typical example, say you have computational code that produces data you want to render. You may not want to have GLfloat types all over your computational code, because it can be used independent of OpenGL. Yet, once you're ready to display the result of the computation, and want to drop the data into a VBO for OpenGL rendering, the type has to be the same as GLfloat.
There are various approaches you can use. One is what I mentioned above, using sized types from standard C++ header files in your non-rendering code. Similarly, you can define your own typedefs that match the types used by OpenGL. Or, less desirable for performance reasons, you can convert the data where necessary, possibly based on comparing the sizeof() values between the incoming types and the GL types.
What does the iv at the end of glGetShaderiv() stand for?
It describes the parameters returned, in this case a vector of ints. The same nomenclature is used for glTexParameteriv and glTexParameterfv, for example, which take a vector of ints or floats, respectively.
OpenGL has some organized naming conventions regarding the routines they have in the libraries.
Prefixes
All routines have a gl prefix; you will have observed something similar with glu and glut. Some vendor libraries also have their own prefixes; NVIDIA, for example, uses an NV_ prefix in its extension names.
Suffixes
Suffixes usually indicate what kind of arguments a function takes:
Some indicate the dimensionality the function operates on, e.g. 1D, 2D or 3D, as in glTexImage2D.
Some indicate the type of the arguments: glTranslatef takes GLfloat arguments (observe that even the data types follow the same naming convention), while glTranslated takes GLdouble.
A trailing v indicates that the data is supplied through a vector, i.e. an array, which is the usual choice when there are many values to pass at once. Taking the function you mentioned:
glGetShaderiv queries a parameter of a shader object, where the data type is GLint (i) and the result is delivered through a vector (v).
You can use these conventions to quickly identify what kind of arguments a function takes.
It indicates that you want to get a value that is an array (vector) of integers. The function reference can be found here, and as you can see, the return parameter is a GLint *. This is in contrast to functions such as glGetInternalformati64v, which has a return parameter of GLint64 *. There are also functions using the fv suffix to denote floats (glGetFloatv, for example), and possibly other type letters.