I wish to process an image using GLSL. For instance, for each pixel, output its squared value:
(r,g,b) --> (r^2,g^2,b^2). Then I want to read the result into CPU memory using glReadPixels.
This should be simple. However, most GLSL examples I find cover shaders for image post-processing, so their output values already lie in [0,255]. In my case, I want output values in the range [0^2,255^2], and I don't want them normalized to [0,255].
The main parts of my code are (after some trials and permutations):
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16F, width, height, 0, GL_BGR, GL_FLOAT, NULL);
glReadPixels(0, 0, width, height, GL_RGB, GL_FLOAT, data_float);
I'm not posting my entire code since I think these two lines are where my problem lies.
Edit
Following #Arttu's suggestion, and following this post and this post, my code now reads as follows:
glTexImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, GL_RGBA32F_ARB, width, height, 0, GL_RGBA, GL_FLOAT, NULL);
glReadPixels(0, 0, width, height, GL_RGB, GL_FLOAT, data_float);
Still, this does not solve my problem. If I understand correctly, no matter what, my input values get scaled to [0,1] when I upload them; it's then up to me to multiply later by 255 or by 255^2...
Using a floating-point texture format will keep your values intact without clamping them to any specific range (in this case, within the limits of 16-bit float representation, of course). You didn't specify your OpenGL version, so this assumes 4.3.
You seem to have conflicting format and internalformat. You're specifying internalformat RGBA16F, but format BGR, without the alpha component (glTexImage2D man page). Try the following:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16F, width, height, 0, GL_BGRA, GL_FLOAT, NULL);
glReadPixels(0, 0, width, height, GL_RGBA, GL_FLOAT, data_float);
On the first line you're specifying a 2D texture with a four-component, 16-bit floating-point format, and OpenGL will expect the texture data to be in BGRA order. Since you pass NULL as the last parameter, you're not supplying any image data yet. Remember that the RGBA16F format gives you half values in your shader, which will be implicitly cast to 32-bit floats if you assign them to float or vec* variables.
On the second line, you're downloading image data from the device to data_float, this time in RGBA order.
If this doesn't solve your problem, you'll probably need to include some more code. Also, adding glGetError calls into your code will help you find the call that causes an error. Good luck :)
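To tie these pieces together, here is a minimal sketch of the whole pipeline, assuming an OpenGL context and loader (e.g. GLEW) are already set up; names such as inputTex, tex, fbo, width, and height are placeholders, and shader compilation plus the full-screen draw are omitted:
// Fragment shader: square each channel. With a float render target the results
// are stored as-is, not clamped or renormalized to [0,1].
const char* fragSrc =
    "#version 330 core\n"
    "uniform sampler2D inputTex;\n"
    "in vec2 uv;\n"
    "out vec4 fragColor;\n"
    "void main() {\n"
    "    vec3 v = texture(inputTex, uv).rgb * 255.0; // undo [0,1] normalization of 8-bit input\n"
    "    fragColor = vec4(v * v, 1.0);               // values in [0, 255^2]\n"
    "}\n";
// Float color attachment: shader outputs are kept unclamped.
GLuint tex, fbo;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, width, height, 0, GL_RGBA, GL_FLOAT, NULL);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex, 0);
// ... compile fragSrc into a program, bind the input texture, draw a full-screen quad ...
// Read back the unnormalized float results (4 floats per pixel, assumes #include <vector>).
std::vector<float> data_float(4 * (size_t)width * height);
glReadPixels(0, 0, width, height, GL_RGBA, GL_FLOAT, data_float.data());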
Related
After creating a 2D texture array with
glTexImage3D(GL_TEXTURE_2D_ARRAY, 0, GL_RED, 1024, 1024, 1, 0, GL_RED, GL_UNSIGNED_BYTE, NULL);
I upload image data portion by portion using the function glTexSubImage3D() with
glTexSubImage3D(GL_TEXTURE_2D_ARRAY, 0, 0, 0, 0, 66, 66, 1, GL_RED, GL_UNSIGNED_BYTE, data);
The image gets uploaded but in an incorrect way. It appears to be smeared, as if it's using a different pitch instead of 66 bytes. This is on an NVIDIA card using fairly recent drivers.
Funnily enough, if I make the image 100 pixels wide instead (but not 99), the upload works correctly. Any idea what might be going wrong?
Found the problem. OpenGL has an initial default pixel alignment of 4, even if you specify that the pixel data format is GL_RED.
By changing the row alignment to 1 byte with
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
the problem goes away.
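For reference, here is how the fix fits into the upload path from the question (a sketch; data is assumed to point at tightly packed GL_RED bytes):
// Rows of a 66-pixel-wide GL_RED image are 66 bytes, not a multiple of 4,
// so tell GL that the client data has 1-byte row alignment before uploading.
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
glTexSubImage3D(GL_TEXTURE_2D_ARRAY, 0,
                0, 0, 0,          /* xoffset, yoffset, zoffset (layer)  */
                66, 66, 1,        /* width, height, depth (one layer)   */
                GL_RED, GL_UNSIGNED_BYTE, data);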
I am using offscreen rendering to texture for a simple GPU calculation. I am using
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, texSize, texSize, 0, GL_RGBA, GL_FLOAT, nullptr);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, texture, 0);
to allocate storage for the texture and
glReadPixels(0, 0, texSize, texSize, GL_RGBA, GL_FLOAT, data);
to read out the computed data. The problem is that the output from the fragment shader I am interested in is only vec2, so the first two slots of the color attachment are populated and the other two are garbage. I then need to post-process data to only take two out of each four floats, which takes needless cycles and storage.
If it were one value, I'd use GL_RED; if it were three, I'd use GL_RGB in my glReadPixels. But I couldn't find a format that reads two values. I'm only using GL_RGBA for convenience, as it seems more natural to take 2 floats out of 4 than out of 3.
Is there another way to read all the resulting vec2 values tightly packed? I thought of reading RED only, somehow convincing OpenGL to skip four bytes after each value, and then reading GREEN only into the same array to fill in the gaps. To this end I tried studying glPixelStore, but it does not seem to be meant for this purpose. Is this, or any other way, even possible?
If you only want to read the RG components of the image, you use a transfer format of GL_RG in your glReadPixels command.
However, that's going to be a slow read unless your image also only stores 2 channels. So your image's internal format should be GL_RG32F.
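A minimal sketch of the two-channel setup, assuming the same texture and framebuffer handles as in the question:
// Two-channel float storage: each texel holds exactly one vec2 result.
glTexImage2D(GL_TEXTURE_2D, 0, GL_RG32F, texSize, texSize, 0, GL_RG, GL_FLOAT, nullptr);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, texture, 0);
// ... render; the fragment shader declares "out vec2 result;" ...
// Tightly packed readback: exactly 2 floats per pixel, no post-processing needed.
std::vector<float> data(2 * (size_t)texSize * texSize);
glReadPixels(0, 0, texSize, texSize, GL_RG, GL_FLOAT, data.data());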
I'm specifying cubemap texture for my skybox in the following way:
glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X + 0, 0, GL_RGB, width, height, 0, GL_RGB, GL_UNSIGNED_BYTE, texData(0));
glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X + 1, 0, GL_RGB, width, height, 0, GL_RGB, GL_UNSIGNED_BYTE, texData(1));
glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X + 2, 0, GL_RGB, width, height, 0, GL_RGB, GL_UNSIGNED_BYTE, texData(2));
glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X + 3, 0, GL_RGB, width, height, 0, GL_RGB, GL_UNSIGNED_BYTE, texData(3));
glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X + 4, 0, GL_RGB, width, height, 0, GL_RGB, GL_UNSIGNED_BYTE, texData(4));
glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X + 5, 0, GL_RGB, width, height, 0, GL_RGB, GL_UNSIGNED_BYTE, texData(5));
texData is an unsigned char* vector.
Using the Visual Studio debugger I found that each line takes about 4 ms to run, so using 6 lines to specify the cubemap texture takes about 20-25 ms in total. I update this cubemap texture in each iteration of my main loop, and it is slowing down my main loop considerably. I know skyboxes are traditionally static, but my application needs the skybox to be updated because I'm creating a 360 video viewer.
Is there another way to specify the cubemap texture that could be faster? I have checked OpenGL's docs already but I don't see a faster way.
UPDATE: I replaced glTexImage2D with glTexSubImage2D for all iterations except the 0th iteration and now the total time taken by 6 glTexSubImage2D lines is under 5ms. This is satisfactory for me but I guess I'll leave the question open because technically there's no answer yet.
glTexImage is slower because every time you call it, it allocates memory on the driver side and copies the pixel data from CPU to GPU, which happens over the bus.
glTexSubImage, on the other hand, does not allocate memory each time: it updates the storage that was already allocated by the initial call, so only the pixel copy remains.
I think that, depending on the filtering flags you set, OpenGL might also be creating additional mipmap levels.
Try using glTexStorage2D with levels set to 1.
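A sketch of the allocate-once, update-per-frame pattern (glTexStorage2D needs GL 4.2+ or ARB_texture_storage; cubemap is an assumed texture handle, width, height, and texData come from the question):
// One-time setup: immutable storage for all six faces, a single mip level.
glBindTexture(GL_TEXTURE_CUBE_MAP, cubemap);
glTexStorage2D(GL_TEXTURE_CUBE_MAP, 1, GL_RGB8, width, height);
// Per frame: only copy new pixel data into the existing storage, no reallocation.
for (int face = 0; face < 6; ++face) {
    glTexSubImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X + face, 0,
                    0, 0, width, height,
                    GL_RGB, GL_UNSIGNED_BYTE, texData(face));
}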
Also, try the SOIL library - it has a single function with a simple API to load a cubemap.
One other thing you can try is compressing the textures and then profiling your program - I am sure this option will give you the best performance.
glTexImage2D is slow because it is copying large amounts of data from CPU memory to GPU memory. If the images are coming from a video decoder, they are possibly already in GPU memory. In that case you might be able to use them as textures directly via OpenGL extensions.
These tend to be platform specific though.
I have created a sample application using glew and glut which reads a dds file and displays it. I manually read the dds file (an NPOT (886 x 317) file in R8G8B8) and create the data pointer (unsigned char*).
Then I prepared the texture using
void prepareTexture(int w, int h, unsigned char* data) {
    /* Create and load texture to OpenGL */
    glGenTextures(1, &textureID); /* Texture name generation */
    glBindTexture(GL_TEXTURE_2D, textureID);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB,
                 w, h,
                 0, GL_RGB, GL_UNSIGNED_BYTE,
                 data);
    glGenerateMipmap(GL_TEXTURE_2D);
}
In the above figure, the first image shows the original dds file and the second one is the rendering result of my application, which is obviously wrong. If I resize the image to 1024 x 512, both images look the same.
From the OpenGL Specification
I.3 Non-Power-Of-Two Textures
The restriction of textures to power-of-two dimensions has been relaxed for all texture targets, so that non-power-of-two textures may be specified without generating errors. Non-power-of-two textures was promoted from the ARB_texture_non_power_of_two extension.
From this I understand that, as of OpenGL 2.0, we can use NPOT textures and OpenGL will handle them.
I tried using the DevIL image library to load the dds file but ended up with the same result. If I convert the image to RGBA and change the internal format and format of glTexImage2D to GL_RGBA, I get the correct result even if the dds file is NPOT.
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA,
             w, h,
             0, GL_RGBA, GL_UNSIGNED_BYTE,
             data);
I tried the application on PCs with an NVIDIA card and a Radeon card, and both of them give the same result.
My sample source code can be downloaded from the link
Can anybody tell me what is wrong with my application? Or does OpenGL not allow NPOT textures if the image is in R8G8B8?
This looks like an alignment issue. Add this before the glTexImage2D() call:
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
This value specifies the row alignment of your data in bytes. The default value is 4.
With your texture width of 886 and 3 bytes per pixel for GL_RGB, each row is 886 * 3 = 2658 bytes, which is not a multiple of 4.
With UNPACK_ALIGNMENT at its default value, the row size is rounded up to the next multiple of 4, which is 2660. So 2660 bytes will be read for each row, which explains the increasing shift from row to row: the first row would be correct, the second row 2 bytes off, the third row 4 bytes off, the fourth row 6 bytes off, etc.
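Applied to the prepareTexture function from the question, the fix is a single extra call before the upload (a sketch; the rest of the function stays unchanged):
glPixelStorei(GL_UNPACK_ALIGNMENT, 1); /* rows of 886 * 3 = 2658 bytes are not 4-byte aligned */
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB,
             w, h,
             0, GL_RGB, GL_UNSIGNED_BYTE,
             data);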