OpenGL type mismatches - opengl

Why are there mismatching types in OpenGL?
For example, if I have a vertex buffer object,
GLuint handle = 0;
glGenBuffers(1, &handle); // this function takes (GLsizei, GLuint*)
Now if I want to know the currently bound buffer
glGetIntegerv( GL_ARRAY_BUFFER_BINDING, reinterpret_cast<GLint *>(&handle ) ); // ouch, type mismatch
Why not have a glGetUnsignedIntegerv, or have glGenBuffers take a GLint * instead?

That is because glGetIntegerv is intended to return any integer-valued state from OpenGL. That includes values of type GLint (which may be negative), as well as multi-component values such as GL_VIEWPORT:
GLint viewport[4];
glGetIntegerv(GL_VIEWPORT, viewport);
From one point of view, it is simpler to have just one function for getting values back instead of hundreds, one for each specific parameter.
From the other point of view, of course, it's a bit ugly to cast types.
I have no idea why they didn't use GLint for buffer IDs, though.
Anyway, you shouldn't be calling glGet... functions in the first place. They are slow and often require waiting for the GPU to complete previous commands, meaning the CPU sits idle during that time.
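One way to avoid the glGet round trip is to remember the binding on the application side. The sketch below is only an illustration and not part of OpenGL: BufferBindingCache, BufferId and Target are hypothetical names, the integer aliases stand in for GLuint and GLenum so the snippet compiles without GL headers, and the callback is where a real program would call glBindBuffer. It assumes every bind goes through the cache.

```cpp
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

using BufferId = std::uint32_t; // stand-in for GLuint (assumption)
using Target = std::uint32_t;   // stand-in for GLenum (assumption)

// Hypothetical helper: shadows the buffer bound to each target so the
// application never has to ask the driver for it.
class BufferBindingCache {
public:
    explicit BufferBindingCache(std::function<void(Target, BufferId)> bindFn)
        : bind_(std::move(bindFn)) {}

    // Forwards the bind only when the binding actually changes.
    void bind(Target target, BufferId id) {
        auto it = bound_.find(target);
        if (it != bound_.end() && it->second == id) {
            return; // already bound, skip the redundant call
        }
        bind_(target, id);
        bound_[target] = id;
    }

    // Returns the last buffer bound through this cache, or 0 if none.
    BufferId current(Target target) const {
        auto it = bound_.find(target);
        return it == bound_.end() ? 0u : it->second;
    }

private:
    std::function<void(Target, BufferId)> bind_;
    std::unordered_map<Target, BufferId> bound_;
};
```

In a real program the callback would be something like [](Target t, BufferId id) { glBindBuffer(t, id); }, and cache.current(GL_ARRAY_BUFFER) would take the place of the glGetIntegerv query. The cache also has to be told when a bound buffer is deleted, since that resets the binding to 0.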

Related

How to send an array of int to my shader

I'm making a voxel engine and I can render a chunk. I'm using instanced rendering, meaning that I can render the whole chunk with a single draw call. Every block of a chunk has a single int (from 0 to 4095) that defines its block type (0 for air, 1 for dirt, etc.). I want to be able to render each block by applying the right texture in my fragment shader. My chunk contains a three-dimensional array:
uint8_t blocks[16][16][16]
The problem is that I can't find a way to send my array of ints to the shader. I tried using a VBO but it makes no sense (I didn't get any result). I also tried to send my array with glUniform1iv(), but that failed.
Is it possible to send an array of ints to a shader with glUniformX()?
In order to avoid storing big data, can I set a byte (uint8_t) instead of an int with glUniformX()?
Is there a good way to send that much data to my shader?
Is instanced drawing a good way to draw the same model with different textures/types of blocks?
For all purposes and intents, data of this type should be treated like texture data. This doesn't mean literally uploading it as texture data, but rather that that's the frame of thinking you should be using when considering how to transfer it.
Or, in more basic terms: don't try to pass this data as uniform data.
If you have access to OpenGL 4.3+ (which is a reasonably safe bet for most hardware no older than 6-8 years), then Shader Storage Buffers are going to be the most concise solution:
//GLSL:
layout(std430, binding = 0) buffer terrainData
{
int data[16][16][16];
};
void main() {
int terrainType = data[voxel.x][voxel.y][voxel.z];
//Do whatever
}
//HOST:
struct terrain_data {
int data[16][16][16];
};
//....
terrain_data data = get_terrain_data();
GLuint ssbo;
GLuint binding = 0;//Should be equal to the binding specified in the shader code
glGenBuffers(1, &ssbo);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, ssbo);
glBufferData(GL_SHADER_STORAGE_BUFFER, sizeof(data.data), data.data, GL_STATIC_DRAW); // usage hint: pick whichever matches your update pattern
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, binding, ssbo);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, 0);
Any point after this where you need to update the data, simply bind ssbo, call glBufferData (or your preferred method for updating buffer data), and then you're good to go.
If you're limited to older hardware, you do have some options, but they quickly get clunky:
You can use Uniform Buffers, which behave very similarly to Shader Storage Buffers, but have limited storage space (64 KB in many implementations) and other restrictions that may or may not be relevant to your use case.
You can use textures directly, converting the terrain data to floating-point values (or keeping it as integers, if the hardware supports integer formats internally) and then converting back inside the shader. This is compatible with almost any hardware, but requires extra complexity and calculations in your shader code.
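If you go the SSBO or UBO route but want to keep one byte per block, four block IDs can be packed into each 32-bit word on the host. The following is a minimal sketch of that idea; packBlocks, unpackBlock and the low-byte-first ordering are assumptions made for illustration, not something the question's code defines.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kChunkVoxels = 16 * 16 * 16;
constexpr std::size_t kPackedWords = kChunkVoxels / 4;

// Packs four 8-bit block IDs into each 32-bit word, lowest byte first.
// Voxels are visited in the array's own memory order.
inline std::array<std::uint32_t, kPackedWords>
packBlocks(const std::uint8_t (&blocks)[16][16][16]) {
    std::array<std::uint32_t, kPackedWords> packed{};
    for (std::size_t a = 0; a < 16; ++a) {
        for (std::size_t b = 0; b < 16; ++b) {
            for (std::size_t c = 0; c < 16; ++c) {
                const std::size_t i = (a * 16 + b) * 16 + c; // flat index
                packed[i / 4] |= static_cast<std::uint32_t>(blocks[a][b][c])
                                 << (8 * (i % 4));
            }
        }
    }
    return packed;
}

// Host-side mirror of the shift-and-mask a shader would use to unpack.
inline std::uint32_t unpackBlock(
    const std::array<std::uint32_t, kPackedWords>& packed, std::size_t i) {
    return (packed[i / 4] >> (8 * (i % 4))) & 0xFFu;
}
```

On the GLSL side the buffer would then hold uint data[1024], and the block type for flat index i would be (data[i >> 2] >> ((i & 3) * 8)) & 0xFFu.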
I do second the approach as laid out in @Xirema's answer, but come to a slightly different recommendation. Since your original data type is just uint8_t, using an SSBO or UBO directly requires either wasting 3 bytes per element or manually packing 4 elements into a single uint. From @Xirema's answer:
For all purposes and intents, data of this type should be treated like texture data. This doesn't mean literally uploading it as texture data, but rather that that's the frame of thinking you should be using when considering how to transfer it.
I totally agree with that. Hence I recommend using a Texture Buffer Object (TBO), a.k.a. "Buffer Texture".
Using glTexBuffer() you can basically re-interpret a buffer object as a texture. In your case, you can just pack the uint8_t[16][16][16] array into a buffer and interpret it as GL_R8UI "texture" format, like this:
//GLSL:
uniform usamplerBuffer terrainData;
void main() {
uint terrainType = texelFetch(terrainData, voxel.z * (16*16) + voxel.y * 16 + voxel.x).r;
//Do whatever
}
//HOST:
struct terrain_data {
uint8_t data[16][16][16];
};
//....
terrain_data data = get_terrain_data();
GLuint tbo;
GLuint tex;
glGenBuffers(1, &tbo);
glBindBuffer(GL_TEXTURE_BUFFER, tbo);
glBufferData(GL_TEXTURE_BUFFER, sizeof(terrain_data), data.data, GL_STATIC_DRAW); // usage hint: pick whichever matches your update pattern
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_BUFFER, tex);
glTexBuffer(GL_TEXTURE_BUFFER, GL_R8UI, tbo);
Note that this will not copy the data to some texture object. Accessing the texture means directly accessing the memory of the buffer.
TBOs also have the advantage of having been available since OpenGL 3.1.
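One detail worth checking is that the texelFetch index matches how the host array is laid out. With the formula used in the shader, x varies fastest, so the host has to store a voxel at data[z][y][x]. The sketch below illustrates that correspondence under that assumption; texelIndex and byteAt are hypothetical helper names.

```cpp
#include <cstddef>
#include <cstdint>

struct terrain_data {
    std::uint8_t data[16][16][16];
};

// Same arithmetic as the texelFetch index in the shader.
constexpr std::size_t texelIndex(std::size_t x, std::size_t y, std::size_t z) {
    return z * (16 * 16) + y * 16 + x;
}

// Reads the byte that texel `index` would address once the struct
// has been uploaded to the buffer unchanged.
inline std::uint8_t byteAt(const terrain_data& terrain, std::size_t index) {
    return reinterpret_cast<const unsigned char*>(&terrain)[index];
}
```

For example, after terrain.data[3][2][1] = 42, the byte at texelIndex(1, 2, 3) is 42, which is what the shader would fetch for voxel (x = 1, y = 2, z = 3).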

Uniform buffered array elements are incorrect

I've been writing something using GL3.3 which takes a uniform buffer, and uses the information from it to select sprite tiles in a frag shader. It's working on my desktop, with a Nvidia GTX780, but my AMD based laptop (A6-4455M) has some issues with it. Both are on the latest (or very recent) drivers.
Back to the code: it first sets up a uniform buffer, which consists of two uints and a uint array. These then get filled and are accessed in the shader. At first I got a GL error on the laptop because I was not allocating enough, but a temporary change taking padding into account has sorted that out, and now data is actually being buffered.
The first two uints are no problem. I've also got the array somewhat readable in the shader; there is just one problem: the data is multiplied by four! At the moment the array is just some test data, initialized to its index, so spriteArr[1] == 1, spriteArr[34] == 34, etc. However, when accessing it in the shader, spriteArr[10] gives 40. This goes all the way up to spriteArr[143] == 572. Beyond this, it's something else. I don't know exactly why this is, but it would appear to be an incorrect offset.
I am using the shared uniform layout, and getting the uniform offsets from GL itself, so they should be correct. I did notice that the offsets on the AMD card are much larger, as if it is adding more padding. They are always 0,4,8 on the desktop, but 0,16,32 on the laptop.
If it makes any difference, there is another UBO (binding point 0), which is used for the view and projection matrices. These work as intended. However it is not used in the fragment shader. It is also created before this UBO.
UBO initialisation code:
GLuint spriteUBO;
glGenBuffers(1, &spriteUBO);
glBindBuffer(GL_UNIFORM_BUFFER, spriteUBO);
unsigned maxsize = (2 + 576 + 24) * sizeof(GLuint);
/*Bad I know, but temporary. AMD's driver adds 24 bytes of padding. Nvidias has none.
Not the cause of this problem. At least ensures we have enough allocated. */
glBufferData(GL_UNIFORM_BUFFER, maxsize, NULL, GL_STATIC_DRAW);
glBindBuffer(GL_UNIFORM_BUFFER, 0);
//Set binding point
GLuint spriteUBOIndex = glGetUniformBlockIndex(programID, "SpriteMatchData");
glUniformBlockBinding(programID, spriteUBOIndex, 1);
static const GLchar *unames[] =
{
"width", "height",
//"size",
"spriteArr"
};
GLuint uindices[3];
GLint offsets[3];
glGetUniformIndices(programID,3,unames,uindices);
glGetActiveUniformsiv(programID, 3, uindices, GL_UNIFORM_OFFSET, offsets);
//buffer stuff
glBindBufferBase(GL_UNIFORM_BUFFER, 1, spriteUBO);
glBufferSubData(GL_UNIFORM_BUFFER,offsets[0], sizeof(GLuint), tm.getWidth());
glBufferSubData(GL_UNIFORM_BUFFER, offsets[1], sizeof(GLuint), tm.getHeight());
glBufferSubData(GL_UNIFORM_BUFFER, offsets[2], tm.getTileCount() * sizeof(GLuint), tm.getSpriteArray());
Fragment Shader:
layout (shared) uniform SpriteMatchData{
uint width, height;
uint spriteArr[576];};
Then later on I experiment with the array with something like this:
if(spriteArr[10] == uint(40))
{
debug_colour = vec4(0.0,1.0,0.0,0.0);//green
}
else
{
debug_colour = vec4(1.0,0.0,0.0,0.0); //red
}
With debug_colour turning green in this instance.
Is there any way to sort this out with something that works with both systems? Why is the AMD driver handling this so differently? Could it be a bug in the way it deals with uniform uint arrays?
Why is the AMD driver handling this so differently?
Because that's what you asked for:
layout (shared) uniform SpriteMatchData
You explicitly asked for shared layout. That layout is implementation defined. Therefore, two different implementations are allowed to give you two different layouts. As such, if you want to use SpriteMatchData in a platform-independent way, you must query its layout from the program after linking it.
While you did query the offsets for the values, you did not query the array stride: the byte offset from element to element within the array. There is nothing in the specification that requires that shared layouts tightly pack arrays.
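The stride can be queried the same way as the offsets, by passing GL_UNIFORM_ARRAY_STRIDE to glGetActiveUniformsiv for the array's uniform index. A stride of 16 bytes would explain the symptoms: a tightly packed upload puts element 4*i where the shader looks for element i, so spriteArr[10] reads 40 and valid data ends at index 143 (144 * 4 = 576). The sketch below shows one way to respect a queried stride on the host; spreadByStride is a hypothetical helper, and the stride is passed in as a plain number so the snippet compiles without GL headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies tightly packed 32-bit values into a staging buffer in which
// consecutive elements start `strideBytes` apart. Gaps are zero-filled.
inline std::vector<unsigned char> spreadByStride(const std::uint32_t* src,
                                                 std::size_t count,
                                                 std::size_t strideBytes) {
    assert(strideBytes >= sizeof(std::uint32_t));
    std::vector<unsigned char> staged(count * strideBytes, 0);
    for (std::size_t i = 0; i < count; ++i) {
        std::memcpy(staged.data() + i * strideBytes, &src[i],
                    sizeof(std::uint32_t));
    }
    return staged;
}
```

The staged bytes would then be uploaded with a single glBufferSubData call at the queried array offset. The buffer also has to be large enough for the strided array; the block size can be queried with glGetActiveUniformBlockiv and GL_UNIFORM_BLOCK_DATA_SIZE rather than guessed.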
Really though, there's pretty much no reason not to use std140 layout. You can avoid all of this querying of offsets and simply design C++ structs that can be directly consumed by GLSL.
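As a rough sketch of what such a struct could look like for this block, assuming the shader declaration is changed to layout (std140) uniform SpriteMatchData: under std140, each element of a uint array is padded out to 16 bytes, and the array itself starts on a 16-byte boundary. The type and function names below (Std140Uint, SpriteMatchDataStd140, fillSpriteArr) are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// One array element as std140 lays it out: a 4-byte uint padded to 16 bytes.
struct alignas(16) Std140Uint {
    std::uint32_t value;
};

// Host-side mirror of:
//   layout (std140) uniform SpriteMatchData {
//       uint width, height;
//       uint spriteArr[576];
//   };
struct SpriteMatchDataStd140 {
    std::uint32_t width;       // offset 0
    std::uint32_t height;      // offset 4
    Std140Uint spriteArr[576]; // offset 16, array stride 16
};

static_assert(sizeof(Std140Uint) == 16, "std140 array stride for uint");
static_assert(offsetof(SpriteMatchDataStd140, spriteArr) == 16,
              "array must start on a 16-byte boundary");
static_assert(sizeof(SpriteMatchDataStd140) == 16 + 576 * 16,
              "total block size");

// Copies tightly packed tile indices into the padded array.
inline void fillSpriteArr(SpriteMatchDataStd140& block,
                          const std::uint32_t* tiles, std::size_t count) {
    for (std::size_t i = 0; i < count && i < 576; ++i) {
        block.spriteArr[i].value = tiles[i];
    }
}
```

The whole struct can then be uploaded in one call, for example glBufferData(GL_UNIFORM_BUFFER, sizeof(block), &block, GL_STATIC_DRAW), and the shader keeps indexing spriteArr[i] as before.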

Writing to a floating point OpenGL texture in CUDA via a surface

I'm writing an OpenGL/CUDA (6.5) interop application. I get a compile time error trying to write a floating point value to an OpenGL texture through a surface reference in my CUDA kernel.
Here I give a high level description of how I set up the interop, but I am successfully reading from my texture in my CUDA kernel, so I believe this is done correctly. I have an OpenGL texture declared with
glTexImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, GL_RGB32F_ARB, 512, 512, 0, GL_RGB, GL_FLOAT, NULL);
After creating the texture I call cudaGraphicsGLRegisterImage with cudaGraphicsRegisterFlagsSurfaceLoadStore set. Before running my CUDA kernel, I unbind the texture and call cudaGraphicsMapResources on the cudaGraphicsResource pointers obtained from cudaGraphicsGLRegisterImage. Then I get a cudaArray from cudaGraphicsSubResourceGetMappedArray, create an appropriate resource descriptor for the array, and call cudaCreateSurfaceObject to get a pointer to a cudaSurfaceObject_t. I then call cudaMemcpy with cudaMemcpyHostToDevice to copy the cudaSurfaceObject_t to a buffer on the device allocated by cudaMalloc.
In my CUDA kernel I can read from the surface reference with something like this, and I have verified that this works as expected.
__global__ void cudaKernel(cudaSurfaceObject_t tex) {
int x = blockIdx.x*blockDim.x + threadIdx.x;
int y = blockIdx.y*blockDim.y + threadIdx.y;
float4 sample = surf2Dread<float4>(tex, (int)sizeof(float4)*x, y, cudaBoundaryModeClamp);
In the kernel I want to modify sample and write it back to the texture. The GPU has compute capability 5.0, so this should be possible. I am trying this
surf2Dwrite<float4>(sample, tex, (int)sizeof(float4)*x, y, cudaBoundaryModeClamp);
But I get the error:
error: no instance of overloaded function "surf2Dwrite" matches the argument list
argument types are: (float4, cudaSurfaceObject_t, int, int, cudaSurfaceBoundaryMode)
I can see in
cuda-6.5/include/surface_functions.h
that there are only prototypes for integral versions of surf2Dwrite that accept a void * for the second argument. I do see prototypes for surf2Dwrite which accept a float4 with a templated surface object. However, I'm not sure how I could declare a templated surface object with OpenGL interop. I haven't been able to find anything else on how to do this. Any help is appreciated. Thanks.
It turns out the answer was pretty simple, though I don't know why it works. Instead of calling
surf2Dwrite<float4>(sample, tex, (int)sizeof(float4)*x, y, cudaBoundaryModeClamp);
I needed to call
surf2Dwrite(sample, tex, (int)sizeof(float4)*x, y, cudaBoundaryModeClamp);
To be honest, I'm not sure I fully understand CUDA's use of templating in C++. Does anyone have an explanation?
For a complete example of CUDA writing to a surface that's linked to an OpenGL texture, refer to this project:
https://github.com/nvpro-samples/gl_cuda_interop_pingpong_st
From the CUDA Documentation, here is the definition of surface template functions:
template<class T>
T surf2Dread(cudaSurfaceObject_t surfObj,
int x, int y,
boundaryMode = cudaBoundaryModeTrap);
template<class T>
void surf2Dread(T* data,
cudaSurfaceObject_t surfObj,
int x, int y,
boundaryMode = cudaBoundaryModeTrap);

glReadPixels usage with glPixelStore

I looked at multiple tutorials about glReadPixels but I'm confused:
void glReadPixels(GLint x, GLint y, GLsizei width, GLsizei height, GLenum format, GLenum type, GLvoid * data)
The last argument is a void?
I saw tutorials and they declared the argument as a vector, unsigned char, GLubyte,...
But what does it really mean?
And do you need to call glPixelStoref( , )?
A void* is C/C++ speak for "pointer to block of memory". The purpose of glReadPixels is to take some part of the framebuffer and write that pixel data into your memory. The data parameter is the "your memory" that it writes into.
Exactly what it writes and how much depends on the pixel transfer parameters, format and type. That's why it takes a void*; because it could be writing an array of bytes, an array of ints, an array of floats, etc. It all depends on what those two parameters say.
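As for glPixelStore: it is not required, but the pack parameters affect how many bytes get written. By default GL_PACK_ALIGNMENT is 4, so each row is padded to a multiple of 4 bytes unless you call glPixelStorei(GL_PACK_ALIGNMENT, 1) first. Here is a small sketch for sizing the buffer, assuming type is GL_UNSIGNED_BYTE and the other pack parameters are left at their defaults; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Bytes per row as written by glReadPixels for GL_UNSIGNED_BYTE data:
// the tightly packed row size rounded up to the pack alignment.
inline std::size_t packedRowBytes(std::size_t width,
                                  std::size_t componentsPerPixel,
                                  std::size_t packAlignment = 4) {
    assert(packAlignment == 1 || packAlignment == 2 ||
           packAlignment == 4 || packAlignment == 8);
    const std::size_t tight = width * componentsPerPixel;
    return (tight + packAlignment - 1) / packAlignment * packAlignment;
}

// A buffer of this size is large enough to hold every row, padding included.
inline std::size_t readPixelsBufferBytes(std::size_t width, std::size_t height,
                                         std::size_t componentsPerPixel,
                                         std::size_t packAlignment = 4) {
    return packedRowBytes(width, componentsPerPixel, packAlignment) * height;
}
```

For a 3x2 GL_RGB read, the tight row is 9 bytes, so the default alignment needs 12 bytes per row and 24 in total, while an alignment of 1 needs 18. A std::vector<GLubyte> of that size can be passed to glReadPixels through its data() pointer.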

Get GlProgram binary from Cg vertex shader

I have some Cg Vertex shader and want to get the compiled binary from it to cache.
The way I load the Cg vertex is using glProgramStringARB, the problem with that is that I can't retrieve any value from glGetProgramiv and glGetProgramBinary.
Here is an example of what I'm doing:
CGprogram program = cgCreateProgram(context, CG_SOURCE, source, ...);
const char* programARB = static_cast<char*>(cgGetProgramString(program,
CG_COMPILED_PROGRAM));
GLuint id;
glGenProgramsARB(1, &id);
glBindProgramARB(GL_VERTEX_PROGRAM_ARB, id);
glProgramStringARB(GL_VERTEX_PROGRAM_ARB, GL_PROGRAM_FORMAT_ASCII_ARB,
static_cast<GLsizei>(strlen(programARB)), programARB);
GLint length = -10;
glGetProgramiv(GL_VERTEX_PROGRAM_ARB, GL_PROGRAM_BINARY_LENGTH, &length);
printf("LENGTH: %d\n", length);
I initialized length with -10 just to see if the variable would change with glGetProgramiv call, but I always get the -10 as result.
the problem with that is that I can't retrieve any value from glGetProgramiv and glGetProgramBinary.
Of course you can't. You're confusing ARB_vertex_program with GLSL programs. They're not the same thing.
glGetProgramiv takes a GLSL program object (among other things). Odds are good that OpenGL is giving you a GL_INVALID_VALUE error, since the first argument is almost certainly not a valid program object created by glCreateProgram.
You can't get a program binary for an ARB_vertex_program. You would need to compile your Cg shader to GLSL, then use the standard GLSL compile/link process, and get the binary from that.