I'm going deeper on OpenGL texture mipmapping.
I noticed in the specification that mipmap levels less than zero and greater than log2(maxSize) + 1 are allowed.
Effectively, TexImage2D doesn't specify any errors for the level parameter. So presumably those mipmaps are not accessed automatically by the standard texture access routines.
How could this feature be used effectively?
For the negative case, the glTexImage2D's man page says:
GL_INVALID_VALUE is generated if level is less than 0.
For the greater than log2(maxSize) case, the specification says what happens to those levels in Rasterization/Texturing/Texture Completeness. The short of it is that, yes, they are ignored.
I'm trying to launch a certain kernel using clEnqueueNDRangeKernel(), from within a C++ program. But instead of enqueuing, or returning an error, it gets a floating-point exception signal (SIGFPE).
For IP reasons which I can't go into, it's difficult for me to provide an example triggering this signal. But - there doesn't seem to be any legitimate reason for this to occur. Are there known cases of that function itself actually performing an invalid floating-point operation?
tl;dr: It's a divide-by-zero due to a problem with your dimensions.
With NVIDIA CUDA's OpenCL library (at least with v11.2.152), passing workgroup dimensions and overall grid dimensions of different dimensionality may indeed trigger such a situation. OpenCL users have reported similar behavior in the past.
NVIDIA has deplorably chosen to provide its libraries as closed-source only, so I can only speculate about the reason, but it seems to be the following: When you construct a cl::NDRange with two dimensions, the third value in the internal representation is initialized to 0 (explicitly and necessarily, or just sometimes). Now, if you read the documentation for clEnqueueNDRangeKernel() carefully, you'll notice that both the global dimensions and the local dimensions must have the same dimensionality, i.e. the same number of dimensions. clEnqueueNDRangeKernel() probably assumes this is the case, and calculates the number of grid blocks (number of workgroups) along each dimension using global_dims_it_got[i] / local_dims_it_got[i] for every dimension of the global dimensions it received. Thus, when the global dimensionality is higher than the local one, it ends up dividing by 0. That triggers SIGFPE, which, despite its name, is raised not only for invalid floating-point operations but for any arithmetic error, including integer division by zero.
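The failure mode speculated about above can be sketched without OpenCL at all. This is only an illustration, assuming sizes are stored as three-element arrays whose unused trailing entries are 0; `Dims`, `dims_are_compatible`, and `work_group_count` are hypothetical names, not part of any OpenCL API.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for how a 1- to 3-dimensional range might be stored:
// unused trailing entries are 0, as speculated above.
using Dims = std::array<std::size_t, 3>;

// Returns true if every dimension in use has a nonzero global and local size,
// i.e. global[i] / local[i] can be evaluated safely for all i < work_dim.
bool dims_are_compatible(const Dims& global, const Dims& local, std::size_t work_dim)
{
    if (work_dim < 1 || work_dim > 3) return false;
    for (std::size_t i = 0; i < work_dim; ++i) {
        if (global[i] == 0 || local[i] == 0) return false;
    }
    return true;
}

// Number of work-groups along dimension i. Only call this after
// dims_are_compatible() has returned true, otherwise it may divide by zero.
std::size_t work_group_count(const Dims& global, const Dims& local, std::size_t i)
{
    return global[i] / local[i];
}
```

Running a check like this on your own sizes before enqueueing turns the crash into an ordinary, debuggable failure: a 3-dimensional global size paired with a 2-dimensional local size is rejected instead of reaching the division.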
For interest's sake, I'm curious if glStencilMask and glStencilMaskSeparate (and similar ones) have a default value, or if they're implementation defined, or if they're undefined.
I assume the wise thing to do is always set them from the get go, but I'm curious if they just "work" by coincidence or whether there is in fact a default value set.
Slightly related: I recall reading somewhere that on NVIDIA cards you don't have to set the active texture because it's at zero by default, but AMD cards require you to set it or else you can get junk results. This makes me wonder whether it's the same thing here (the stencil stuff just happens to work for me by chance, and by not setting it I've been playing a dangerous game) or whether this isn't the case.
I looked through the OpenGL spec [section 17.4.2] for the definitions of these functions, but couldn't resolve the answer to my question.
The initial state of glStencilMask is clearly specified. Initially, the mask is all 1's.
OpenGL 4.6 API Core Profile Specification - 17.4.2 Fine Control of Buffer Updates; page 522:
void StencilMask( uint mask );
void StencilMaskSeparate( enum face, uint mask );
control the writing of particular bits into the stencil planes.
The least significant s bits of mask, where s is the number of bits in the stencil buffer, specify an integer mask. Where a 1 appears in this mask, the corresponding bit in the stencil buffer is written; where a 0 appears, the bit is not written.
[...]
In the initial state, the integer masks are all ones, as are the bits controlling depth
value and RGBA component writing.
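To make the quoted rule concrete, here is a small model of how a write mask gates a stencil write. It is a sketch of the bit arithmetic only, not driver code; `masked_stencil_write` and its parameters are hypothetical names, and the 8-bit default for the stencil depth is an assumption.

```cpp
#include <cstdint>

// Models how a write mask gates a stencil write: bits set in `mask` take the
// incoming value, bits cleared in `mask` keep what is already stored.
// `stencil_bits` plays the role of s, the number of bits in the stencil buffer.
std::uint32_t masked_stencil_write(std::uint32_t stored, std::uint32_t incoming,
                                   std::uint32_t mask, unsigned stencil_bits = 8)
{
    const std::uint32_t plane_bits =
        stencil_bits >= 32 ? 0xFFFFFFFFu : ((1u << stencil_bits) - 1u);
    const std::uint32_t m = mask & plane_bits;  // only the low s bits matter
    return ((stored & ~m) | (incoming & m)) & plane_bits;
}
```

With the initial all-ones mask, the incoming value replaces the stored one entirely, which is why stencil writes work without ever calling glStencilMask.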
The OpenGL specification requires that a framebuffer supports at least 8 color attachments. Now, OpenGL uses compile-time constants (at least on my system) for things like GL_COLOR_ATTACHMENTi, and GL_DEPTH_ATTACHMENT follows 32 units after GL_COLOR_ATTACHMENT0. Doesn't this mean that regardless of how beefy the hardware is, it will never be possible to use more than 32 color attachments? To clarify, this compiles perfectly with GLEW on Ubuntu 16.04:
static_assert(GL_COLOR_ATTACHMENT0 + 32==GL_DEPTH_ATTACHMENT,"");
and since it is a static_assert, this would be true for any hardware configuration (unless the driver installer modifies the header files, which would result in non-portable binaries). Wouldn't separate functions for different attachment classes have been better, as that removes the possibility of colliding constants?
It is important to note the difference in spec language. glActiveTexture says this about its parameter:
An INVALID_ENUM error is generated if an invalid texture is specified.
texture is a symbolic constant of the form TEXTUREi, indicating that texture unit i is to be modified. Each TEXTUREi adheres to TEXTUREi = TEXTURE0 + i, where i is in the range zero to k−1, and k is the value of MAX_COMBINED_TEXTURE_IMAGE_UNITS
This text explicitly allows you to compute the enum value, explaining exactly how to do so and what the limits are.
Compare this to what it says about glFramebufferTexture:
An INVALID_ENUM error is generated if attachment is not one of the attachments in table 9.2, and attachment is not COLOR_ATTACHMENTm where m is greater than or equal to the value of MAX_COLOR_ATTACHMENTS.
It looks similar. But note that it doesn't have the language about the value of those enumerators. There's nothing in that description about COLOR_ATTACHMENTm = COLOR_ATTACHMENT0 + m.
As such, it is illegal to use any value other than those specific enums. Now yes, the spec does guarantee elsewhere that COLOR_ATTACHMENTm = COLOR_ATTACHMENT0 + m. But because the guarantee isn't in that section, that section explicitly prohibits the use of any value other than an actual enumerator. Regardless of how you compute it, the result must be an actual enumerator.
So to answer your question, at present, there are only 32 color attachment enumerators. Therefore, MAX_COLOR_ATTACHMENTS has an effective maximum value of 32.
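The arithmetic behind that limit can be checked at compile time. The sketch below redeclares the two token values locally so that it builds without any GL header; the `k...` constants and `color_attachment` are hypothetical names, while the numeric values are those of GL_COLOR_ATTACHMENT0 and GL_DEPTH_ATTACHMENT in the Khronos headers.

```cpp
#include <cstdint>

// Local copies of the token values from the Khronos headers, redeclared here
// so the snippet compiles without any GL header.
constexpr std::uint32_t kColorAttachment0 = 0x8CE0;  // GL_COLOR_ATTACHMENT0
constexpr std::uint32_t kDepthAttachment  = 0x8D00;  // GL_DEPTH_ATTACHMENT

// COLOR_ATTACHMENTi = COLOR_ATTACHMENT0 + i
constexpr std::uint32_t color_attachment(std::uint32_t i)
{
    return kColorAttachment0 + i;
}

// Index 31 is the last one with its own token; index 32 lands on the depth token.
static_assert(color_attachment(31) == 0x8CFF, "last dedicated color token");
static_assert(color_attachment(32) == kDepthAttachment, "collides with depth");
```

Whatever a driver reports for MAX_COLOR_ATTACHMENTS, the value COLOR_ATTACHMENT0 + 32 is already taken by the depth attachment token, so it cannot also name a 33rd color attachment.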
The OpenGL 4.5 spec states in Section 9.2:
... by the framebuffer attachment points named COLOR_ATTACHMENT0 through COLOR_ATTACHMENTn. Each COLOR_ATTACHMENTi adheres to COLOR_ATTACHMENTi = COLOR_ATTACHMENT0 + i
and as a footnote
The header files define tokens COLOR_ATTACHMENTi for i in the range [0, 31]. Most implementations support fewer than 32 color attachments, and it is an INVALID_OPERATION error to pass an unsupported attachment name to a command accepting color attachment names.
My interpretation of this is that, as long as the hardware supports it, it is perfectly fine to use COLOR_ATTACHMENT0 + 32 and so on to address more than 32 attachment points. So there is no real limit on the number of supported color attachments; the constants are just not defined directly. Why it was designed that way can only be answered by people from the Khronos Group.
Suppose that I have one shader storage buffer and want to have several views into it, e.g. like this:
layout(std430,binding=0) buffer FloatView { float floats[]; };
layout(std430,binding=0) buffer IntView { int ints[]; };
Is this legal GLSL?
opengl.org says no:
Two blocks cannot use the same index.
However, I could not find such a statement in the GL 4.5 Core Spec or GLSL 4.50 Spec (or the ARB_shader_storage_buffer_object extension description) and my NVIDIA Driver seems to compile such code without errors or warnings.
Does the OpenGL specification expressly forbid this? Apparently not. Or at least, if it does, I can't see where.
But that doesn't mean that it will work cross-platform. When dealing with OpenGL, it's always best to take the conservative path.
If you need to "cast" memory from one representation to another, you should just use separate binding points. It's safer.
There is some official word on this now. I filed a bug on this issue, and they've read it and decided some things. Specifically, the conclusion was:
There are separate binding namespaces for: atomic counters, images, textures, uniform buffers, and SSBOs.
We don't want to allow aliasing on any of them except atomic counters, where aliasing with different offsets (e.g. sharing a binding) is allowed.
In short, don't do this. Hopefully, the GLSL specification will be clarified in this regard.
This was "fixed" in the revision 7 of GLSL 4.5:
It is a compile-time or link-time error to use the same binding number for more than one uniform block or for more than one buffer block.
I say "fixed" because you can still perform aliasing manually via glUniformBlockBinding/glShaderStorageBlockBinding, and the specification doesn't say exactly how this will work.
Never mind that I'm the one who created the texture in the first place and I should know perfectly well how many mipmaps I loaded/generated for it. I'm doing this for a unit test. There doesn't seem to be a glGetTexParameter parameter to find this out. The closest I've come is something like this:
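// Upper bound on the level index to probe (GL_TEXTURE_MAX_LEVEL).
int max_level;
glGetTexParameteriv( GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, &max_level );
// Stays -1 unless an empty level is found.
int max_mipmap = -1;
for ( int i = 0; i < max_level; ++i )
{
    int width;
    glGetTexLevelParameteriv( GL_TEXTURE_2D, i, GL_TEXTURE_WIDTH, &width );
    // A width of 0 is taken to mean that level i was never specified.
    if ( 0 == width )
    {
        max_mipmap = i-1;
        break;
    }
}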
Anyhow, glGetTexLevelParameter() will return 0 width for a nonexistent mipmap if I'm using an NVIDIA GPU, but with Mesa it generates GL_INVALID_VALUE, which leads me to believe that this is very much the Wrong Thing To Do.
How do I find out which mipmap levels I've populated a texture with?
The spec is kinda fuzzy on this. It says that you will get GL_INVALID_VALUE if the level parameter is "larger than the maximum allowable level-of-detail". Exactly how this is defined is not stated.
The documentation for the function clears it up a bit, saying that it is the maximum possible number of LODs for the largest possible texture (GL_MAX_TEXTURE_SIZE). Other similar functions like the glFramebufferTexture family explicitly state this as the limit for GL_INVALID_VALUE. So I would expect that.
Therefore, Mesa has a bug. However, you could work around this by assuming that either 0 or a GL_INVALID_VALUE error means you've walked off the end of the mipmap array.
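A minimal sketch of that workaround, written against an abstract query so it can be exercised without a GL context. `count_populated_levels` and `query_width` are hypothetical names; in a real program the callable would presumably wrap glGetTexLevelParameteriv followed by a glGetError check, returning std::nullopt when an error was raised. It assumes C++17 for std::optional.

```cpp
#include <optional>

// Counts consecutive populated mip levels starting at level 0.
// `query_width` is a hypothetical callable: given a level, it returns that
// level's width, or std::nullopt if the query raised an error.
template <typename QueryWidth>
int count_populated_levels(int max_level, QueryWidth query_width)
{
    int count = 0;
    for (int level = 0; level <= max_level; ++level) {
        const std::optional<int> width = query_width(level);
        if (!width || *width == 0) break;  // error or empty level: stop here
        ++count;
    }
    return count;
}
```

Treating both outcomes the same way means the loop gives the same answer whether the driver reports a zero width or an error for a level that was never specified.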
That being said, I would suggest employing glTexStorage and never having to even ask the question again. This will forcibly prevent someone from setting MAX_LEVEL to a value that's too large. It's pretty new, from GL 4.2, but it's implemented (or will be very soon) across all non-Intel hardware that's still being supported.
It looks like there is currently no way to query how many mipmap levels a texture has, short of the OP's trial and error with @NicolBolas' invalid value check. For most cases I guess its performance wouldn't matter if the level 0 size doesn't change often.
However, assuming the texture does not have a limited number of levels, the specs give the preferred calculation (note the use of floor, and not ceiling as some examples give):
numLevels = 1 + floor(log2(max(w, h, d)))
What is the dimension reduction rule for each successively smaller mipmap level?
Each successively smaller mipmap level is half the size of the previous level, but if this half value is a fractional value, you should round down to the next largest integer.
...
Note that this extension is compatible with supporting other rules because it merely relaxes the error and completeness conditions for mipmaps. At the same time, it makes sense to provide developers a single consistent rule since developers are unlikely to want to generate mipmaps for different rules unnecessarily. One reasonable rule is sufficient and preferable, and the "floor" convention is the best choice.
[ARB_texture_non_power_of_two]
This can of course be verified with the OP's method, or in my case when I received a GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT with glFramebufferTexture2D(..., numLevels).
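The floor-based formula above can be written with integer halving, which avoids floating-point log2 entirely. This is a sketch under the floor convention quoted from ARB_texture_non_power_of_two; `mip_level_count` and `mip_dimension` are hypothetical helper names.

```cpp
#include <algorithm>

// numLevels = 1 + floor(log2(max(w, h, d))), computed by repeated halving.
// Returns 0 if the largest dimension is not positive.
int mip_level_count(int w, int h, int d = 1)
{
    int largest = std::max({w, h, d});
    if (largest < 1) return 0;
    int levels = 1;
    while (largest > 1) {
        largest /= 2;  // integer division implements the floor convention
        ++levels;
    }
    return levels;
}

// Size of one dimension at a given level under the floor rule, clamped to 1.
int mip_dimension(int base, int level)
{
    if (level >= 31) return 1;  // avoid shifting by the full width of int
    return std::max(1, base >> level);
}
```

For a 5x3 texture this gives 3 levels, with widths 5, 2, 1, whereas a ceiling-based formula would predict a fourth level that does not exist.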
Assuming you're building mipmaps in a standard way, the number of unique images will be something like ceil(log_2(max(width,height)))+1. This can be easily derived by noticing that mipmaps reduce image size by a factor of two each time until there is a single pixel.