Issues with mixing glGetTexImage and imageStore on NVIDIA OpenGL

I wrote some code, too long to paste here, that renders into a 3D one-component float texture via a fragment shader that uses bindless imageLoad and imageStore.
That code is definitely working.
I then needed to work around some GLSL compiler bugs, so I wanted to read the 3D texture above back to the host via glGetTexImage. Yes, I did do a glMemoryBarrierEXT(GL_ALL_BARRIER_BITS).
I did check the texture info via glGetTexLevelParameteriv() and everything I see matches. I did check for OpenGL errors, and have none.
Sadly, though, glGetTexImage never seems to read what was written by the fragment shader. Instead, it only returns the fake values I put in when I called glTexImage3D() to create the texture.
Is that expected behavior? The documentation implies otherwise.
If glGetTexImage actually works that way, how can I read back the data in that 3D texture (resident on the device)? Clearly the driver can do that, as it does when the texture is made non-resident. Surely there's a simple way to do this simple thing...
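(For clarity, the pattern I expect to work is roughly the following sketch; hostBuffer is just a placeholder for the destination array, and the GL_ALL_BARRIER_BITS I actually pass is a superset of the single bit shown, which is the one the spec names for glGetTexImage readback:)

// Sketch: make imageStore writes visible to a later glGetTexImage readback
glMemoryBarrier(GL_TEXTURE_UPDATE_BARRIER_BIT); // GL_ALL_BARRIER_BITS also covers this
glBindTexture(GL_TEXTURE_3D, m_textureid);
glGetTexImage(GL_TEXTURE_3D, 0, GL_RED, GL_FLOAT, hostBuffer); // hostBuffer: placeholder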
I was asking if glGetTexImage was supposed to work that way or not. Here's the code:
void Bindless3DArray::dump_array(Array3D<float> &out)
{
    bool was_mapped = m_image_mapped;
    if (was_mapped)
        unmap_array(); // unmap array so it's accessible to opengl

    out.resize(m_depth, m_height, m_width);
    glBindTexture(GL_TEXTURE_3D, m_textureid); // from glGenTextures()
#if 0
    int w,h,d;
    glGetTexLevelParameteriv(GL_TEXTURE_3D, 0, GL_TEXTURE_WIDTH, &w);
    glGetTexLevelParameteriv(GL_TEXTURE_3D, 0, GL_TEXTURE_HEIGHT, &h);
    glGetTexLevelParameteriv(GL_TEXTURE_3D, 0, GL_TEXTURE_DEPTH, &d);

    int internal_format;
    glGetTexLevelParameteriv(GL_TEXTURE_3D, 0, GL_TEXTURE_INTERNAL_FORMAT, &internal_format);

    int data_type_r, data_type_g;
    glGetTexLevelParameteriv(GL_TEXTURE_3D, 0, GL_TEXTURE_RED_TYPE, &data_type_r);
    glGetTexLevelParameteriv(GL_TEXTURE_3D, 0, GL_TEXTURE_GREEN_TYPE, &data_type_g);

    int size_r, size_g;
    glGetTexLevelParameteriv(GL_TEXTURE_3D, 0, GL_TEXTURE_RED_SIZE, &size_r);
    glGetTexLevelParameteriv(GL_TEXTURE_3D, 0, GL_TEXTURE_GREEN_SIZE, &size_g);
#endif
    glGetTexImage(GL_TEXTURE_3D, 0, GL_RED, GL_FLOAT, &out(0,0,0));
    glBindTexture(GL_TEXTURE_3D, 0);
    CHECK_GLERROR();

    if (was_mapped)
        map_array_to_cuda(); // restore state
}
Here's the code that creates the bindless array:
void Bindless3DArray::allocate(int w, int h, int d, ElementType t)
{
    if (!m_textureid)
        glGenTextures(1, &m_textureid);

    m_type = t;
    m_width = w;
    m_height = h;
    m_depth = d;

    glBindTexture(GL_TEXTURE_3D, m_textureid);
    CHECK_GLERROR();
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MAX_LEVEL, 0); // ensure only 1 miplevel is allocated
    CHECK_GLERROR();

    Array3D<float> foo(d, h, w);
    // DEBUG -- glGetTexImage returns THIS data, not what's on device
    for (int z=0; z<m_depth; ++z)
        for (int y=0; y<m_height; ++y)
            for (int x=0; x<m_width; ++x)
                foo(z,y,x) = 3.14159;

    //-- Texture creation
    if (t == ElementInteger)
        glTexImage3D(GL_TEXTURE_3D, 0, GL_R32UI, w, h, d, 0, GL_RED_INTEGER, GL_INT, 0);
    else if (t == ElementFloat)
        glTexImage3D(GL_TEXTURE_3D, 0, GL_R32F, w, h, d, 0, GL_RED, GL_FLOAT, &foo(0,0,0));
    else
        throw "Invalid type for Bindless3DArray";
    CHECK_GLERROR();

    m_handle = glGetImageHandleNV(m_textureid, 0, true, 0, (t == ElementInteger) ? GL_R32UI : GL_R32F);
    glMakeImageHandleResidentNV(m_handle, GL_READ_WRITE);
    CHECK_GLERROR();

#ifdef USE_CUDA
    checkCuda(cudaGraphicsGLRegisterImage(&m_image_resource, m_textureid, GL_TEXTURE_3D,
                                          cudaGraphicsRegisterFlagsSurfaceLoadStore));
#endif
}
I allocate the array, render to it via an OpenGL fragment program, and then I call dump_array() to read the data back. Sadly, I only get what I loaded in the allocate call.
The render program looks like
void App::clear_deepz()
{
    deepz_clear_program.bind();

    deepz_clear_program.setUniformValue("sentinel", SENTINEL);
    deepz_clear_program.setUniformValue("deepz", deepz_array.handle());
    deepz_clear_program.setUniformValue("sem", semaphore_array.handle());

    run_program();

    glMemoryBarrierEXT(GL_ALL_BARRIER_BITS);
    // glMemoryBarrierEXT(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT);
    // glMemoryBarrierEXT(GL_SHADER_GLOBAL_ACCESS_BARRIER_BIT_NV);

    deepz_clear_program.release();
}
and the fragment program is:
#version 420

in vec4 gl_FragCoord;

uniform float sentinel;
coherent uniform layout(size1x32) image3D deepz;
coherent uniform layout(size1x32) uimage3D sem;

void main(void)
{
    ivec3 coords = ivec3(gl_FragCoord.x, gl_FragCoord.y, 0);
    imageStore(deepz, coords, vec4(sentinel));
    imageStore(sem, coords, ivec4(0));
    discard; // don't write to FBO at all
}

discard; // don't write to FBO at all
That's not all discard means. It does mean that, but it also means that all Image Load/Store writes will be discarded too. Indeed, odds are the compiler will see that statement and just do nothing for the entire fragment shader.
If you want to just execute the fragment shader, you can employ the GL 4.3 feature (available on your NVIDIA hardware) of having an empty framebuffer object. Or you could use a compute shader. If you can't use GL 4.3 yet, then use a write mask to turn off all color writes.
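A minimal sketch of the empty-framebuffer route, assuming GL 4.3; width and height here are placeholders for your render dimensions:

// Sketch: a framebuffer with no attachments, only default dimensions (GL 4.3)
GLuint fbo;
glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferParameteri(GL_FRAMEBUFFER, GL_FRAMEBUFFER_DEFAULT_WIDTH, width);
glFramebufferParameteri(GL_FRAMEBUFFER, GL_FRAMEBUFFER_DEFAULT_HEIGHT, height);
// Rasterization now covers width x height with zero attachments, so only the
// shader's image stores produce output.

// Pre-4.3 alternative: keep a normal FBO but mask off all color writes:
// glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);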

As Nicol mentions above, if you want only the side effects of image load/store, the proper way is to use an empty framebuffer object.
The problem with mixing glGetTexImage() and bindless textures turned out to be a driver bug, fixed as of driver version 335.23. I filed the bug and have confirmed that my code now works properly.
Note that I am using empty framebuffer objects in the code and no longer use discard.

Related

OpenGL ping pong feedback texture not completely clearing itself. Trail is left behind

The goal:
Effectively read and write to the same texture, like how Shadertoy does their buffers.
The setup:
I have a basic feedback system with two textures, each attached to its own framebuffer. As I render to framebuffer 1, I bind texture 2 for sampling in the shader. Then, as I render to framebuffer 2, I bind texture 1 for sampling, and repeat. Finally, I output texture 1 to the whole screen with the default framebuffer and a separate shader.
The issue:
This almost works as intended: I'm able to read from the texture in the shader and also output to it, creating the desired feedback loop.
The problem is that the framebuffers do not seem to clear completely to black.
To test, I made a simple trailing effect.
In shadertoy, the trail completely disappears as intended:
Live in shadertoy
But in my app, the trail begins to disappear, but leaves a small amount behind:
My thought is that I'm either not clearing the framebuffers correctly or not using GLFW's double buffering correctly in this instance. I've tried every combination of clearing the framebuffers, but I must be missing something here.
The code:
Here is the trailing effect shader with a moving circle (Same as above images)
#version 330
precision highp float;

uniform sampler2D samplerA;  // Texture sampler
uniform float uTime;         // current execution time
uniform vec2 uResolution;    // resolution of window

void main()
{
    vec2 uv = gl_FragCoord.xy / uResolution.xy; // Coordinates from 0 - 1
    vec3 tex = texture(samplerA, uv).xyz;       // Read ping-pong texture that we are writing to
    vec2 pos = .3*vec2(cos(uTime), sin(uTime)); // Circle position (circular motion around screen)
    vec3 c = mix(vec3(1.), vec3(0), step(.0, length(uv - pos)-.07)); // Circle color
    tex = mix(c, tex, .981);                    // Replace some circle color with the texture color
    gl_FragColor = vec4(tex, 1.0);              // Output to texture
}
Frame buffer and texture creation:
// -- Generate frame buffer 1 --
glGenFramebuffers(1, &frameBuffer1);
glBindFramebuffer(GL_FRAMEBUFFER, frameBuffer1);
// Generate texture 1
glGenTextures(1, &texture1);
// Bind the newly created texture
glBindTexture(GL_TEXTURE_2D, texture1);
// Create an empty image
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 1920, 1080, 0, GL_RGBA, GL_FLOAT, 0);
// Nearest filtering, for sampling
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
// Attach output texture to frame buffer
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, texture1, 0);
// -- Generate frame buffer 2 --
glGenFramebuffers(1, &frameBuffer2);
glBindFramebuffer(GL_FRAMEBUFFER, frameBuffer2);
// Generate texture 2
glGenTextures(1, &texture2);
// Bind the newly created texture
glBindTexture(GL_TEXTURE_2D, texture2);
// Create an empty image
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 1920, 1080, 0, GL_RGBA, GL_FLOAT, 0);
// Nearest filtering, for sampling
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
// Attach texture 2 to frame buffer 2
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, texture2, 0);
Main loop:
while (programIsRunning) {
    // Draw scene twice, once to frame buffer 1 and once to frame buffer 2
    for (int i = 0; i < 2; i++)
    {
        // Start trailing effect shader program
        glUseProgram(program);
        glViewport(0, 0, platform.windowWidth(), platform.windowHeight());

        // Write to frame buffer 1
        if (i == 0)
        {
            // Bind and clear frame buffer 1
            glBindFramebuffer(GL_FRAMEBUFFER, frameBuffer1);
            glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
            glClear(GL_COLOR_BUFFER_BIT);

            // Bind texture 2 for sampler
            glActiveTexture(GL_TEXTURE0 + 0);
            glBindTexture(GL_TEXTURE_2D, texture2);
            glUniform1i(uniforms.samplerA, 0);
        }
        else // Write to frame buffer 2
        {
            // Bind and clear frame buffer 2
            glBindFramebuffer(GL_FRAMEBUFFER, frameBuffer2);
            glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
            glClear(GL_COLOR_BUFFER_BIT);

            // Bind texture 1 for sampler
            glActiveTexture(GL_TEXTURE0 + 0);
            glBindTexture(GL_TEXTURE_2D, texture1);
            glUniform1i(uniforms.samplerA, 0);
        }

        // Render to screen
        glDrawArrays(GL_TRIANGLES, 0, 6);
    }

    // Start screen shader program
    glUseProgram(screenProgram);

    // Bind default frame buffer
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
    glClear(GL_COLOR_BUFFER_BIT);
    glViewport(0, 0, platform.windowWidth(), platform.windowHeight());

    // Bind texture 1 for sampler (binding texture 2 should be the same?)
    glActiveTexture(GL_TEXTURE0 + 0);
    glBindTexture(GL_TEXTURE_2D, texture1);
    glUniform1i(uniforms.samplerA, 0);

    // Draw final rectangle to screen
    glDrawArrays(GL_TRIANGLES, 0, 6);

    // Swap glfw buffers
    glfwSwapBuffers(platform.window());
}
If this is an issue with clearing I would really like to know why. Changing which frame buffer gets cleared doesn't seem to change anything.
I will keep experimenting in the meantime.
Thank you!
The problem is that you are creating a texture with too little precision for your exponential moving average computation ever to decay all the way to zero.
In your call to:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 1920, 1080, 0, GL_RGBA, GL_FLOAT, 0);
you are using the unsized internal format GL_RGBA (third argument), which will very likely ultimately result in the GL_RGBA8 internal format actually being used. So, all channels will have a precision of 8 bits.
You probably believed that using GL_FLOAT as the argument for the type parameter results in a 32-bit floating-point texture being allocated: it does not. The type parameter tells OpenGL how to interpret your data (the last parameter of the function) when/if you actually specify data to be uploaded. You pass 0/NULL, so the type parameter has no influence on the call at all, as there is no memory to be interpreted as float values.
So, your texture will have a precision of 8 bits per channel and therefore each channel can hold at most 256 different values.
Given that in the RGB image you show the value is 24 for each channel, we can do the math on how OpenGL arrives at this value and why it won't get any lower than that:
First, let's do another round of your exponential moving average between (0, 0, 0) and (24, 24, 24)/255 with a factor of your 0.981:
d = (24, 24, 24)/255 * 0.981
If we had infinite precision, this value d would be 0.09232941176.
Now, let's see what RGB value within the representable range [0, 255] this comes close to: 0.09232941176 * 255 = 23.5439999988.
So, this value is actually (when correctly rounded to the nearest representable value within the [0, 255] discretization) 24 again. And that's where it stays.
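You can verify this arithmetic with a few lines of standalone code; the 24 and 0.981 are just the values from above:

#include <cmath>
#include <cstdio>

int main() {
    // One round of the moving average at 8-bit precision, per channel
    double d = (24.0 / 255.0) * 0.981;    // ideal next value: ~0.09232941176
    double scaled = d * 255.0;            // back in [0, 255]: ~23.544
    long quantized = std::lround(scaled); // nearest representable value: 24 again
    std::printf("%f -> %f -> %ld\n", d, scaled, quantized);
    return 0;
}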
In order to fix this, you likely need to use a higher precision internal texture format, such as GL_RGBA32F (which is actually what ShaderToy itself uses).
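Concretely, that means changing the allocation to something like:

// Sized 32-bit float internal format instead of unsized GL_RGBA
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, 1920, 1080, 0, GL_RGBA, GL_FLOAT, 0);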

OpenGL array textures not rendering at all

I'm currently working to convert a project from using a texture atlas to an array texture, but for the life of me I can't get it working.
Some notes about my environment:
I'm using OpenGL 3.3 core context with GLSL version 3.30
The textures are all 128x128 and rendered perfectly fine when using an atlas (barring the edge artifacts which convinced me to switch)
Problems I believe I've ruled out:
Resolution issues - 128x128, being a power of two, should be fine
Texture loading (it works perfectly as it did before)
Incomplete textures (mipmap issues) - I've gone through the common issues regarding mipmaps and I don't believe OpenGL should be expecting them
Here's my code for creating the array texture:
public void createTextureArray() {
    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

    int handle = glGenTextures();
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D_ARRAY, handle);

    glPixelStorei(GL_UNPACK_ROW_LENGTH, Texture.SIZE);
    glPixelStorei(GL_UNPACK_ALIGNMENT, 4);

    glTexStorage3D(GL_TEXTURE_2D_ARRAY, 1, GL_RGBA8, Texture.SIZE, Texture.SIZE, textures.size());

    try {
        int layer = 0;
        for (Texture tex : textures.values()) {
            // Next few lines are just for loading the texture. They've been ruled out as the issue.
            PNGDecoder decoder = new PNGDecoder(ImageHelper.asInputStream(tex.getImage()));
            ByteBuffer buffer = BufferUtils.createByteBuffer(decoder.getWidth() * decoder.getHeight() * 4);
            decoder.decode(buffer, decoder.getWidth() * 4, PNGDecoder.Format.RGBA);
            buffer.flip();

            glTexSubImage3D(GL_TEXTURE_2D_ARRAY, 0, 0, 0, layer, decoder.getWidth(), decoder.getHeight(), 1,
                    GL_RGBA, GL_UNSIGNED_BYTE, buffer);
            tex.setLayer(layer);
            layer++;
        }
    } catch (IOException ex) {
        ex.printStackTrace();
        System.err.println("Failed to create/load texture array");
        System.exit(-1);
    }
}
The code for creating the VAO/VBO:
private static int prepareVbo(int handle, FloatBuffer vbo) {
    IntBuffer vaoHandle = BufferUtils.createIntBuffer(1);
    glGenVertexArrays(vaoHandle);
    glBindVertexArray(vaoHandle.get());

    glBindBuffer(GL_ARRAY_BUFFER, handle);
    glBufferData(GL_ARRAY_BUFFER, vbo, GL_STATIC_DRAW);

    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D_ARRAY, GraphicsMain.TEXTURE_REGISTRY.atlasHandle);

    glEnableVertexAttribArray(positionAttrIndex);
    glEnableVertexAttribArray(texCoordAttrIndex);

    glVertexAttribPointer(positionAttrIndex, 3, GL_FLOAT, false, 24, 0);
    glVertexAttribPointer(texCoordAttrIndex, 3, GL_FLOAT, false, 24, 12);

    glBindVertexArray(0);
    vaoHandle.rewind();
    return vaoHandle.get();
}
Fragment shader:
#version 330 core

uniform sampler2DArray texArray;

varying vec3 texCoord;

void main() {
    gl_FragColor = texture(texArray, texCoord);
}
(texCoord is working fine; it's being passed from the vertex shader correctly.)
I'm about out of ideas, so being still somewhat new to modern OpenGL, I'd like to know if there's anything glaringly wrong with my code.
Some considerations:
you no longer need power-of-two textures
be sure that every layer has the same number of levels/mipmaps, as the wiki says
the first four glTexParameteri calls affect whatever is bound to GL_TEXTURE_2D_ARRAY at that moment, so you want to move them after glBindTexture (see the sketch after this list)
how can you specify how many textures you want to create with glGenTextures()? If you have the option of a more specific method, use it
GL_UNPACK_ROW_LENGTH, if greater than 0, defines the number of pixels in a row. I suppose then that Texture.SIZE is not really the texture size but the dimension of one side (128 in your case). In any case you don't need to set it; you can skip it
set GL_UNPACK_ALIGNMENT to 4 only if your row length is a multiple of it. Most of the time people set it to 1 before loading a texture to avoid any trouble and then set it back to 4 once done
the last argument of glTexStorage3D is expected to be the number of layers; I hope textures.size() returns that rather than the size (128x128)
glActiveTexture and glBindTexture inside prepareVbo are useless; they are not part of the VAO state
don't use varying in GLSL; it's deprecated, switch to plain in/out variables
you may want to take inspiration from this sample
use samplers; they give you more flexibility
use Debug Output if available, otherwise glGetError(); some silent errors may not show up in the rendered output
you call it prepareVbo, but you initialize both the VAO and the VBO in it
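To illustrate the ordering point, here is a sketch in C-style GL calls (layerCount stands in for textures.size()):

// Sketch: bind first, then set parameters, then allocate storage
GLuint handle;
glGenTextures(1, &handle); // creates exactly one texture name
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D_ARRAY, handle);

// These now affect the texture just bound, not whatever was bound before
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

glPixelStorei(GL_UNPACK_ALIGNMENT, 1); // safe default while uploading
glTexStorage3D(GL_TEXTURE_2D_ARRAY, 1, GL_RGBA8, 128, 128, layerCount);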

GLSL Render to Texture not working

I'm trying to do a compute pass where I render to a texture that will be used in a draw pass later on. My initial implementation was based on shader storage buffer objects and was working nicely. But I want to apply a computation method that takes advantage of the GPU's blend hardware, so I started porting the SSBO implementation to a render-to-texture one. Unfortunately, the code has stopped working: when I read back the texture it contains wrong values.
Here is my texture and frame buffer setup code:
glGenFramebuffers(1, &m_fbo);
glBindFramebuffer(GL_FRAMEBUFFER, m_fbo);

// Create render textures
glGenTextures(NUM_TEX_OUTPUTS, m_renderTexs);
m_texSize = square_approximation(m_numVertices);
cout << "Textures size: " << glm::to_string(m_texSize) << endl;

GLenum drawBuffers[NUM_TEX_OUTPUTS];
for (int i = 0; i < NUM_TEX_OUTPUTS; ++i)
{
    glBindTexture(GL_TEXTURE_2D, m_renderTexs[i]);
    // 1st 0: level, 2nd 0: no border, 3rd 0: no initial data
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, m_texSize.x, m_texSize.y, 0, GL_RGBA, GL_FLOAT, 0);

    // XXX: do we need this?
    // Poor filtering. Needed !
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glBindTexture(GL_TEXTURE_2D, 0);

    // 0: level
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0 + i, GL_TEXTURE_2D, m_renderTexs[i], 0);
    drawBuffers[i] = GL_COLOR_ATTACHMENT0 + i;
}
glDrawBuffers(NUM_TEX_OUTPUTS, drawBuffers);

if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE)
{
    cout << "Error when setting frame buffer" << endl;
    // throw exception?
}
glBindFramebuffer(GL_FRAMEBUFFER, 0);
And this is the code to start the compute pass:
m_shaderProgram.use();
// setup openGL
glPolygonMode(GL_FRONT_AND_BACK, GL_LINE);
glDisable(GL_CULL_FACE);
glDisable(GL_DEPTH_TEST);
glViewport(0, 0, m_texSize.x, m_texSize.y); // setup viewport (equal to textures size)
// make a single patch have the vertex, the bases and the neighbours
glPatchParameteri(GL_PATCH_VERTICES, m_maxNeighbours + 5);
// Wait all writes to shader storage to finish
glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);
glUniform1i(m_shaderProgram.getUniformLocation("curvTex"), m_renderTexs[2]);
glUniform2i(m_shaderProgram.getUniformLocation("size"), m_texSize.x, m_texSize.y);
glUniform2f(m_shaderProgram.getUniformLocation("vertexStep"),
            (umax - umin)/divisoes, (vmax - vmin)/divisoes);
// Bind buffers
glBindFramebuffer(GL_FRAMEBUFFER, m_fbo);
glBindBuffer(GL_ARRAY_BUFFER, m_vbo);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, m_ibo);
glBindBufferBase(GL_UNIFORM_BUFFER, m_mvp_location, m_mvp_ubo);
// Make textures active
for (int i = 0; i < NUM_TEX_OUTPUTS; ++i)
{
    glActiveTexture(GL_TEXTURE0 + i);
    glBindTexture(GL_TEXTURE_2D, m_renderTexs[i]);
}
// no need to pass index array 'cause ibo is bound already
glDrawElements(GL_PATCHES, m_numElements, GL_UNSIGNED_INT, 0);
I then read back the textures using the following:
bool readTex(GLuint tex, void *dest)
{
    glBindTexture(GL_TEXTURE_2D, tex);
    glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_FLOAT, dest);
    glBindTexture(GL_TEXTURE_2D, 0);
    // TODO: check glGetTexImage return values for error
    return true;
}

for (int i = 0; i < NUM_TEX_OUTPUTS; ++i)
{
    if (m_tensors[i] == NULL) {
        m_tensors[i] = new glm::vec4[m_texSize.x*m_texSize.y];
    }
    memset(m_tensors[i], 0, m_texSize.x*m_texSize.y*sizeof(glm::vec4));
    readTex(m_renderTexs[i], m_tensors[i]);
}
Finally, the fragment shader code is:
#version 430
#extension GL_ARB_shader_storage_buffer_object: require

layout(pixel_center_integer) in vec4 gl_FragCoord;

layout(std140, binding=6) buffer EvalBuffer {
    vec4 evalDebug[];
};

uniform ivec2 size;

in TEData {
    vec4 _a;
    vec4 _b;
    vec4 _c;
    vec4 _d;
    vec4 _e;
};

layout(location = 0) out vec4 a;
layout(location = 1) out vec4 b;
layout(location = 2) out vec4 c;
layout(location = 3) out vec4 d;
layout(location = 4) out vec4 e;

void main()
{
    a = _a;
    b = _b;
    c = _c;
    d = _d;
    e = _e;
    evalDebug[gl_PrimitiveID] = gl_FragCoord;
}
The fragment coordinates are correct (each fragment points to an x,y coordinate in the texture), and so are all the input values (_a to _e), but I do not see them written correctly to the textures when reading back. I also tried accessing the texture in the shader to see if it was only a read-back error, but my debug SSBO returned all zeroes.
Am I missing some setup step?
I've tested on both Linux and Windows (a Titan and a GeForce 540M) and I'm using OpenGL 4.3.
As derhass pointed out in the comments above, the problem was with the texture format. I assumed that passing GL_FLOAT as the data type would give me 32-bit floats for each of the RGBA channels. It does not.
As derhass said, the data type parameter here does not change the texture format. I had to change the internalFormat parameter to what I wanted (GL_RGBA32F) so that it would work as expected.
So, after changing glTexImage2D call to:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, m_texSize.x, m_texSize.y, 0, GL_RGBA, GL_FLOAT, 0);
I was able to correctly render the results to the texture and read it back. :)

Problems with GL_LUMINANCE and ATI

I'm trying to use luminance textures on my ATI graphics card.
The problem: I'm not able to correctly retrieve data from my GPU. Whenever I try to read it back (using glReadPixels), all it gives me is an all-ones array (1.0, 1.0, 1.0, ...).
You can test it with this code:
#include <stdio.h>
#include <stdlib.h>
#include <GL/glew.h>
#include <GL/glut.h>

static int arraySize = 64;
static int textureSize = 8;

//static GLenum textureTarget = GL_TEXTURE_2D;
//static GLenum textureFormat = GL_RGBA;
//static GLenum textureInternalFormat = GL_RGBA_FLOAT32_ATI;

static GLenum textureTarget = GL_TEXTURE_RECTANGLE_ARB;
static GLenum textureFormat = GL_LUMINANCE;
static GLenum textureInternalFormat = GL_LUMINANCE_FLOAT32_ATI;

int main(int argc, char** argv)
{
    // create test data and fill arbitrarily
    float* data = new float[arraySize];
    float* result = new float[arraySize];

    for (int i = 0; i < arraySize; i++)
    {
        data[i] = i + 1.0;
    }

    // set up glut to get valid GL context and
    // get extension entry points
    glutInit(&argc, argv);
    glutCreateWindow("TEST1");
    glewInit();

    // viewport transform for 1:1 pixel=texel=data mapping
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluOrtho2D(0.0, textureSize, 0.0, textureSize);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glViewport(0, 0, textureSize, textureSize);

    // create FBO and bind it (that is, use offscreen render target)
    GLuint fboId;
    glGenFramebuffersEXT(1, &fboId);
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fboId);

    // create texture
    GLuint textureId;
    glGenTextures(1, &textureId);
    glBindTexture(textureTarget, textureId);

    // set texture parameters
    glTexParameteri(textureTarget, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(textureTarget, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(textureTarget, GL_TEXTURE_WRAP_S, GL_CLAMP);
    glTexParameteri(textureTarget, GL_TEXTURE_WRAP_T, GL_CLAMP);

    // define texture with floating point format
    glTexImage2D(textureTarget, 0, textureInternalFormat, textureSize, textureSize, 0, textureFormat, GL_FLOAT, 0);

    // attach texture
    glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, textureTarget, textureId, 0);

    // transfer data to texture
    //glDrawBuffer(GL_COLOR_ATTACHMENT0_EXT);
    //glRasterPos2i(0, 0);
    //glDrawPixels(textureSize, textureSize, textureFormat, GL_FLOAT, data);
    glBindTexture(textureTarget, textureId);
    glTexSubImage2D(textureTarget, 0, 0, 0, textureSize, textureSize, textureFormat, GL_FLOAT, data);

    // and read back
    glReadBuffer(GL_COLOR_ATTACHMENT0_EXT);
    glReadPixels(0, 0, textureSize, textureSize, textureFormat, GL_FLOAT, result);

    // print out results
    printf("**********************\n");
    printf("Data before roundtrip:\n");
    printf("**********************\n");
    for (int i = 0; i < arraySize; i++)
    {
        printf("%f, ", data[i]);
    }
    printf("\n\n\n");

    printf("**********************\n");
    printf("Data after roundtrip:\n");
    printf("**********************\n");
    for (int i = 0; i < arraySize; i++)
    {
        printf("%f, ", result[i]);
    }
    printf("\n");

    // clean up
    delete[] data;
    delete[] result;
    glDeleteFramebuffersEXT(1, &fboId);
    glDeleteTextures(1, &textureId);

    system("pause");
    return 0;
}
I also read somewhere on the internet that ATI cards don't support luminance yet. Does anyone know if this is true?
This has nothing to do with luminance values; the problem is with you reading floating point values.
In order to read floating-point data back properly via glReadPixels, you first need to set the color clamping mode. Since you're obviously not using OpenGL 3.0+, you should be looking at the ARB_color_buffer_float extension. That extension provides glClampColorARB, which works pretty much like the core 3.0 version.
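In code that's a one-liner before the readback; a sketch, with the core 3.0 spelling alongside:

// Tell GL not to clamp colors read back by glReadPixels
glClampColorARB(GL_CLAMP_READ_COLOR_ARB, GL_FALSE); // ARB_color_buffer_float
// Core GL 3.0+ equivalent:
// glClampColor(GL_CLAMP_READ_COLOR, GL_FALSE);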
Here's what I found out:
1) If you use GL_LUMINANCE as the texture format (and GL_LUMINANCE_FLOAT32_ATI, GL_LUMINANCE32F_ARB, or GL_RGBA_FLOAT32_ATI as the internal format), glClampColor(..) (or glClampColorARB(..)) doesn't seem to work at all.
I was only able to see the values being actively clamped/not clamped if I set the texture format to GL_RGBA. I don't understand why this happens, since the only glClampColor(..) limitation I've heard of is that it works exclusively with floating-point buffers, which all the chosen internal formats seem to be.
2) If you use GL_LUMINANCE (again, with GL_LUMINANCE_FLOAT32_ATI, GL_LUMINANCE32F_ARB, or GL_RGBA_FLOAT32_ATI as the internal format), it looks like you must "correct" your output buffer by dividing each of its elements by 3. I guess this happens because when you use glTexImage2D(..) with GL_LUMINANCE it internally replicates each array component three times, and when you read GL_LUMINANCE values with glReadPixels(..) it computes each value as the sum of the RGB components (thus, three times what you gave as input). But again, it still gives you clamped values.
3) Finally, if you use GL_RED as the texture format (instead of GL_LUMINANCE), you don't need to correct your input or output buffers, and you get your data back properly. The values are not clamped and you don't need to call glClampColor(..) at all.
So, I guess I'll stick with GL_RED, because in the end what I wanted was an easy way to send and collect floating-point values from my "kernels" without having to worry about offsetting array indexes or anything like this.
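For anyone taking the same route, the changes boil down to something like this sketch, assuming GL 3.0 or ARB_texture_rg for the sized GL_R32F format (the variables are the ones from the test program above):

// Single-channel 32-bit float texture, written and read as GL_RED
glTexImage2D(textureTarget, 0, GL_R32F, textureSize, textureSize, 0, GL_RED, GL_FLOAT, 0);
glTexSubImage2D(textureTarget, 0, 0, 0, textureSize, textureSize, GL_RED, GL_FLOAT, data);
glReadPixels(0, 0, textureSize, textureSize, GL_RED, GL_FLOAT, result);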

What can cause glDrawArrays to generate a GL_INVALID_OPERATION error?

I've been attempting to write a two-pass GPU implementation of the Marching Cubes algorithm, similar to the one detailed in the first chapter of GPU Gems 3, using OpenGL and GLSL. However, the call to glDrawArrays in my first pass consistently fails with a GL_INVALID_OPERATION.
I've looked up all the documentation I can find, and found these conditions under which glDrawArrays can throw that error:
1) GL_INVALID_OPERATION is generated if a non-zero buffer object name is bound to an enabled array or to the GL_DRAW_INDIRECT_BUFFER binding and the buffer object's data store is currently mapped.
2) GL_INVALID_OPERATION is generated if glDrawArrays is executed between the execution of glBegin and the corresponding glEnd.
3) GL_INVALID_OPERATION will be generated by glDrawArrays or glDrawElements if any two active samplers in the current program object are of different types, but refer to the same texture image unit.
4) GL_INVALID_OPERATION is generated if a geometry shader is active and mode is incompatible with the input primitive type of the geometry shader in the currently installed program object.
5) GL_INVALID_OPERATION is generated if mode is GL_PATCHES and no tessellation control shader is active.
6) GL_INVALID_OPERATION is generated if recording the vertices of a primitive to the buffer objects being used for transform feedback purposes would result in either exceeding the limits of any buffer object's size, or in exceeding the end position offset + size - 1, as set by glBindBufferRange.
7) GL_INVALID_OPERATION is generated by glDrawArrays() if no geometry shader is present, transform feedback is active and mode is not one of the allowed modes.
8) GL_INVALID_OPERATION is generated by glDrawArrays() if a geometry shader is present, transform feedback is active and the output primitive type of the geometry shader does not match the transform feedback primitiveMode.
9) GL_INVALID_OPERATION is generated if the bound shader program is invalid.
EDIT 10/10/12: GL_INVALID_OPERATION is generated if transform feedback is in use, and the buffer bound to the transform feedback binding point is also bound to the array buffer binding point. This is the problem I was having, due to a typo in which buffer I bound. While the spec does state that this is illegal, it isn't listed under glDrawArrays as one of the reasons it can throw an error, in any documentation I found.
Unfortunately, no one piece of official documentation I can find covers more than 3 of these. I had to collect this list from numerous sources. Points 7 and 8 actually come from the documentation for glBeginTransformFeedback, and point 9 doesn't seem to be documented at all. I found it mentioned in a forum post somewhere. However, I still don't think this list is complete, as none of these seem to explain the error I'm getting.
1) I'm not mapping any buffers at all in my program, anywhere.
2) I'm using the Core profile, so glBegin and glEnd aren't even available.
3) I have two samplers, and they are of different types, but they're definitely mapped to different textures.
4) A geometry shader is active, but its input layout is layout (points) in, and glDrawArrays is being called with GL_POINTS.
5) I'm not using GL_PATCHES or tessellation shaders of any sort.
6) I've made sure I'm allocating the maximum amount of space my geometry shaders could possibly output. Then I tried quadrupling it. It didn't help.
7) There is a geometry shader. See the next point.
8) Transform feedback is being used, and there is a geometry shader, but the output layout is layout (points) out and glBeginTransformFeedback is called with GL_POINTS.
9) I tried inserting a call to glValidateProgram right before the call to glDrawArrays, and it returned GL_TRUE.
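For completeness, the validation check looked roughly like this sketch (shaderPass1 being the pass-1 program from the code below):

// Validate the program against the current GL state and dump the log
glValidateProgram(shaderPass1);
GLint status = GL_FALSE;
glGetProgramiv(shaderPass1, GL_VALIDATE_STATUS, &status);
if (status != GL_TRUE) {
    char log[1024];
    glGetProgramInfoLog(shaderPass1, sizeof(log), NULL, log);
    fprintf(stderr, "validate failed: %s\n", log);
}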
The actual OpenGL code is here:
const int SECTOR_SIZE = 32;
const int SECTOR_SIZE_CUBED = SECTOR_SIZE * SECTOR_SIZE * SECTOR_SIZE;
const int CACHE_SIZE = SECTOR_SIZE + 3;
const int CACHE_SIZE_CUBED = CACHE_SIZE * CACHE_SIZE * CACHE_SIZE;

MarchingCubesDoublePass::MarchingCubesDoublePass(ServiceProvider* svc, DensityMap* sourceData) {
    this->sourceData = sourceData;
    densityCache = new float[CACHE_SIZE_CUBED];
}

MarchingCubesDoublePass::~MarchingCubesDoublePass() {
    delete[] densityCache; // array form, to match new float[]
}
void MarchingCubesDoublePass::InitShaders() {
    ShaderInfo vertShader, geoShader, fragShader;

    vertShader = svc->shader->Load("data/shaders/MarchingCubesDoublePass-Pass1.vert", GL_VERTEX_SHADER);
    svc->shader->Compile(vertShader);
    geoShader = svc->shader->Load("data/shaders/MarchingCubesDoublePass-Pass1.geo", GL_GEOMETRY_SHADER);
    svc->shader->Compile(geoShader);

    shaderPass1 = glCreateProgram();
    static const char* outputVaryings[] = { "triangle" };
    glTransformFeedbackVaryings(shaderPass1, 1, outputVaryings, GL_SEPARATE_ATTRIBS);
    assert(svc->shader->Link(shaderPass1, vertShader, geoShader));

    uniPass1DensityMap = glGetUniformLocation(shaderPass1, "densityMap");
    uniPass1TriTable = glGetUniformLocation(shaderPass1, "triangleTable");
    uniPass1Size = glGetUniformLocation(shaderPass1, "size");
    attribPass1VertPosition = glGetAttribLocation(shaderPass1, "vertPosition");

    vertShader = svc->shader->Load("data/shaders/MarchingCubesDoublePass-Pass2.vert", GL_VERTEX_SHADER);
    svc->shader->Compile(vertShader);
    geoShader = svc->shader->Load("data/shaders/MarchingCubesDoublePass-Pass2.geo", GL_GEOMETRY_SHADER);
    svc->shader->Compile(geoShader);
    fragShader = svc->shader->Load("data/shaders/MarchingCubesDoublePass-Pass2.frag", GL_FRAGMENT_SHADER);
    svc->shader->Compile(fragShader);

    shaderPass2 = glCreateProgram();
    assert(svc->shader->Link(shaderPass2, vertShader, geoShader, fragShader));

    uniPass2DensityMap = glGetUniformLocation(shaderPass2, "densityMap");
    uniPass2Size = glGetUniformLocation(shaderPass2, "size");
    uniPass2Offset = glGetUniformLocation(shaderPass2, "offset");
    uniPass2Matrix = glGetUniformLocation(shaderPass2, "matrix");
    attribPass2Triangle = glGetAttribLocation(shaderPass2, "triangle");
}
void MarchingCubesDoublePass::InitTextures() {
    for (int x = 0; x < CACHE_SIZE; x++) {
        for (int y = 0; y < CACHE_SIZE; y++) {
            for (int z = 0; z < CACHE_SIZE; z++) {
                densityCache[x + y*CACHE_SIZE + z*CACHE_SIZE*CACHE_SIZE] = sourceData->GetDensity(Vector3(x-1, y-1, z-1));
            }
        }
    }

    glGenTextures(1, &densityTex);
    glBindTexture(GL_TEXTURE_3D, densityTex);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_R, GL_CLAMP_TO_EDGE);
    glTexImage3D(GL_TEXTURE_3D, 0, GL_R32F, CACHE_SIZE, CACHE_SIZE, CACHE_SIZE, 0, GL_RED, GL_FLOAT, densityCache);

    glGenTextures(1, &triTableTex);
    glBindTexture(GL_TEXTURE_RECTANGLE, triTableTex);
    glTexParameteri(GL_TEXTURE_RECTANGLE, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_RECTANGLE, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_RECTANGLE, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_RECTANGLE, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glTexImage2D(GL_TEXTURE_RECTANGLE, 0, GL_R16I, 16, 256, 0, GL_RED_INTEGER, GL_INT, triTable);
}
void MarchingCubesDoublePass::InitBuffers() {
    float* voxelGrid = new float[SECTOR_SIZE_CUBED*3];
    unsigned int index = 0;
    for (int x = 0; x < SECTOR_SIZE; x++) {
        for (int y = 0; y < SECTOR_SIZE; y++) {
            for (int z = 0; z < SECTOR_SIZE; z++) {
                voxelGrid[index*3 + 0] = x;
                voxelGrid[index*3 + 1] = y;
                voxelGrid[index*3 + 2] = z;
                index++;
            }
        }
    }

    glGenBuffers(1, &bufferPass1);
    glBindBuffer(GL_ARRAY_BUFFER, bufferPass1);
    glBufferData(GL_ARRAY_BUFFER, SECTOR_SIZE_CUBED*3*sizeof(float), voxelGrid, GL_STATIC_DRAW);
    glBindBuffer(GL_ARRAY_BUFFER, 0);

    glGenBuffers(1, &bufferPass2);
    glBindBuffer(GL_ARRAY_BUFFER, bufferPass2);
    glBufferData(GL_ARRAY_BUFFER, SECTOR_SIZE_CUBED*5*sizeof(int), NULL, GL_DYNAMIC_COPY);
    glBindBuffer(GL_ARRAY_BUFFER, 0);

    glGenVertexArrays(1, &vaoPass1);
    glBindVertexArray(vaoPass1);
    glBindBuffer(GL_ARRAY_BUFFER, bufferPass1);
    glVertexAttribPointer(attribPass1VertPosition, 3, GL_FLOAT, GL_FALSE, 0, (void*)0);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glEnableVertexAttribArray(attribPass1VertPosition);
    glBindVertexArray(0);

    glGenVertexArrays(1, &vaoPass2);
    glBindVertexArray(vaoPass2);
    glBindBuffer(GL_ARRAY_BUFFER, bufferPass2);
    glVertexAttribIPointer(attribPass2Triangle, 1, GL_INT, 0, (void*)0);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glEnableVertexAttribArray(attribPass2Triangle);
    glBindVertexArray(0);

    glGenQueries(1, &queryNumTriangles);
}
void MarchingCubesDoublePass::Register(Genesis::ServiceProvider* svc, Genesis::Entity* ent) {
    this->svc = svc;
    this->ent = ent;
    svc->scene->RegisterEntity(ent);
    InitShaders();
    InitTextures();
    InitBuffers();
}

void MarchingCubesDoublePass::Unregister() {
    if (!ent->GetBehavior<Genesis::Render>()) {
        svc->scene->UnregisterEntity(ent);
    }
}
void MarchingCubesDoublePass::RenderPass1() {
    glEnable(GL_RASTERIZER_DISCARD);
    glUseProgram(shaderPass1);

    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_3D, densityTex);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_RECTANGLE, triTableTex);

    glUniform1i(uniPass1DensityMap, 0);
    glUniform1i(uniPass1TriTable, 1);
    glUniform1i(uniPass1Size, SECTOR_SIZE);

    glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, bufferPass2);
    glBindVertexArray(vaoPass2);

    glBeginQuery(GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN, queryNumTriangles);
    glBeginTransformFeedback(GL_POINTS);
    GLenum error = glGetError();
    glDrawArrays(GL_POINTS, 0, SECTOR_SIZE_CUBED);
    error = glGetError();
    glEndTransformFeedback();
    glEndQuery(GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN);

    glBindVertexArray(0);
    glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, 0);
    glUseProgram(0);
    glDisable(GL_RASTERIZER_DISCARD);

    glGetQueryObjectuiv(queryNumTriangles, GL_QUERY_RESULT, &numTriangles);
}
void MarchingCubesDoublePass::RenderPass2(Matrix mat) {
    glUseProgram(shaderPass2);

    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_3D, densityTex);

    glUniform1i(uniPass2DensityMap, 0);
    glUniform1i(uniPass2Size, SECTOR_SIZE);
    glUniform3f(uniPass2Offset, 0, 0, 0);
    mat.UniformMatrix(uniPass2Matrix);

    glBindVertexArray(vaoPass2);
    glDrawArrays(GL_POINTS, 0, numTriangles);
    glBindVertexArray(0);

    glUseProgram(0);
}

void MarchingCubesDoublePass::OnRender(Matrix mat) {
    RenderPass1();
    RenderPass2(mat);
}
The actual error is the call to glDrawArrays in RenderPass1. Worth noting that if I comment out the calls to glBeginTransformFeedback and glEndTransformFeedback, then glDrawArrays stops generating the error. So whatever's wrong, it's probably somehow related to transform feedback.
Edit 8/18/12, 9 PM:
I just found the NVIDIA GLExpert feature in gDEBugger, which I wasn't previously familiar with. When I turned it on, it gave somewhat more substantial information on the GL_INVALID_OPERATION, specifically "The current operation is illegal in the current state: Buffer is mapped." So I'm running into point 1, above. Though I have no idea how.
I have no calls to glMapBuffer, or any related function, anywhere in my code. I set gDEBugger to break on any calls to glMapBuffer, glMapBufferARB, glMapBufferRange, glUnmapBuffer and glUnmapBufferARB, and it didn't break anywhere. Then I added code to the start of RenderPass1 to explicitly unmap both buffers. Not only did the error not go away, but the calls to glUnmapBuffer now both generate "The current operation is illegal in the current state: Buffer is unbound or is already unmapped." So if neither of the buffers I'm using is mapped, where is the error coming from?
Edit 8/19/12, 12 AM:
Based on the error messages I'm getting out of GLExpert in gDEBugger, it appears that calling glBeginTransformFeedback causes the buffer bound to GL_TRANSFORM_FEEDBACK_BUFFER to become mapped. Specifically, when I click on the buffer in the "Textures, Buffers and Images Viewer" it outputs the message "The current operation is illegal in the current state: Buffer must be bound and not mapped." However, if I add this between glBeginTransformFeedback and glEndTransformFeedback:
int bufferBinding;
glGetBufferParameteriv(GL_TRANSFORM_FEEDBACK_BUFFER, GL_BUFFER_MAPPED, &bufferBinding);
printf("Transform feedback buffer binding: %d\n", bufferBinding);
it outputs 0, which would indicate that GL_TRANSFORM_FEEDBACK_BUFFER is not mapped. If this buffer is mapped on another binding point, would this still return 0? Why would glBeginTransformFeedback map the buffer, thus rendering it unusable for transform feedback?
The more I learn here, the more confused I'm becoming.
Edit 10/10/12:
As indicated in my reply below to Nicol Bolas' solution, I found the problem, and it's the same one he found: Due to a stupid typo, I was binding the same buffer to both the input and output binding points.
I found it probably two weeks after posting the question. I'd given up in frustration for a time, and eventually came back and basically re-implemented the whole thing from scratch, regularly comparing bits and pieces the older, non-working one. When I was done, the new version worked, and it was when I searched out the differences that I discovered I'd been binding the wrong buffer.
I figured out your problem: you are rendering to the same buffer that you're sourcing your vertex data from.
glBindVertexArray(vaoPass2);
I think you meant vaoPass1
From the spec:
Buffers should not be bound or in use for both transform feedback and other purposes in the GL. Specifically, if a buffer object is simultaneously bound to a transform feedback buffer binding point and elsewhere in the GL, any writes to or reads from the buffer generate undefined values. Examples of such bindings include ReadPixels to a pixel buffer object binding point and client access to a buffer mapped with MapBuffer.
Now, you should get undefined values; I'm not sure that a GL error qualifies, but it probably should be an error.
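A sketch of what the first pass presumably intends, with the input VAO and the feedback output buffer kept distinct:

// Source vertices from the pass-1 VAO; transform feedback writes into
// bufferPass2, which is not bound anywhere else during the draw
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, bufferPass2); // output
glBindVertexArray(vaoPass1);                                    // input -- not vaoPass2
glBeginTransformFeedback(GL_POINTS);
glDrawArrays(GL_POINTS, 0, SECTOR_SIZE_CUBED);
glEndTransformFeedback();
glBindVertexArray(0);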
Another (apparently undocumented) case where glDrawArrays and glDrawElements fail with GL_INVALID_OPERATION:
GL_INVALID_OPERATION is generated if a sampler uniform is set to an invalid texture unit identifier. (I had mistakenly performed glUniform1i(location, GL_TEXTURE0); when I meant glUniform1i(location, 0);.)
Another (undocumented) case where glDraw*() calls can fail with GL_INVALID_OPERATION:
GL_INVALID_OPERATION is generated if a sampler uniform is set to a texture unit bound to a texture of the incorrect type. For example, if a uniform sampler2D is set with glUniform1i(location, 0), but texture unit 0 has a GL_TEXTURE_2D_ARRAY texture bound.