Can glReadPixels be used to read layers from GL_TEXTURE_3D? - opengl

I'm trying to read a 3D texture I rendered using an FBO. This texture is so large that glGetTexImage results in GL_OUT_OF_MEMORY error due to failure of nvidia driver to allocate memory for intermediate storage* (needed, I suppose, to avoid changing destination buffer in case of error).
So I then thought of getting this texture layer by layer, using glReadPixels after I render each layer. But glReadPixels doesn't have layer index as a parameter. The only place where it actually appears as something that directs I/O to the particular layer is gl_Layer output in the geometry shader. And that is for the writing stage, not reading.
As I tried simply doing the calls to glReadPixels anyway after I render each layer, I only got the texels for layer 0. So glReadPixels at least doesn't fail to get something.
But the question is: can I get arbitrary layer of a 3D texture using glReadPixels? And if not, what should I use instead, given the above described memory constraints? Do I have to sample the layer from 3D texture in a shader to render the result to a 2D texture, and read this 2D texture afterwards?
*It's not a guess, I've actually tracked it down to a failing malloc call (with the size of the texture as argument) from within the nvidia driver's shared library.

If you have access to GL 4.5 or ARB_get_texture_sub_image, you can employ glGetTextureSubImage. As the function name suggests, it's for querying a sub-section of a texture's image data. This allows you to read slices of the texture without having to get the whole thing in one go.
The extension seems fairly widely supported, available on any implementation that's still being supported by its IHV.

Yes, glReadPixels can read other slices from the 3D texture. One just has to use glFramebufferTextureLayer to attach the correct current slice to the FBO — instead of attaching the full 3D texture as the color attachment. Here's the replacement code for glGetTexImage (a special FBO for this, fboForTextureSaving, should be generated beforehand):
GLint origReadFramebuffer=0, origDrawFramebuffer=0;
gl.glGetIntegerv(GL_READ_FRAMEBUFFER_BINDING, &origReadFramebuffer);
gl.glGetIntegerv(GL_DRAW_FRAMEBUFFER_BINDING, &origDrawFramebuffer);
gl.glBindFramebuffer(GL_FRAMEBUFFER, fboForTextureSaving);
for(int layer=0; layer<depth; ++layer)
{
gl.glFramebufferTextureLayer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
texture, 0, layer);
checkFramebufferStatus("framebuffer for saving textures");
gl.glReadPixels(0,0,w,h,GL_RGBA,GL_FLOAT, subpixels+layer*w*h*4);
}
gl.glBindFramebuffer(GL_READ_FRAMEBUFFER, origReadFramebuffer);
gl.glBindFramebuffer(GL_DRAW_FRAMEBUFFER, origDrawFramebuffer);
Anyway, this is not a long-term solution to the problem. The first reason for GL_OUT_OF_MEMORY errors with large textures is actually not lack of RAM or VRAM. It's subtler: each texture allocated on GPU is mapped to the process' address space (at least on Linux/nvidia). So if your process doesn't malloc even half of the RAM available to it, its address space may be already used by these large mappings. Add to this a bit of memory fragmentation, and you get either GL_OUT_OF_MEMORY, or malloc failure, or std::bad_alloc somewhere even earlier than expected.
The proper long-term solution is to embrace the 64-bit reality and compile your app as 64-bit code. This is what I ended up doing, ditching all this layer-by-layer kludge and simplifying the code quite a bit.

So once you got your 3D texture you can do this:
for (z=0;z<z_resolution_of_your_txr;z++)
{
render_textured_quad(using z slice of 3D texture);
glReadPixels(...);
}
Its best to match the QUAD size ot your 3D texture x,y resolutions and use GL_NEAREST filtering...
This will be slow so if you are not on Intel and want to be more fast you can use render to 2D Texture instead and use glGetTexImage on the target 2D texture instead of glReadPixels.
Here example shaders for rendering slice z:
Vertex:
//------------------------------------------------------------------
#version 420 core
//------------------------------------------------------------------
uniform float aspect;
layout(location=0) in vec2 pos;
out smooth vec2 vpos;
//------------------------------------------------------------------
void main(void)
{
vpos=pos;
gl_Position=vec4(pos.x,pos.y*aspect,0.0,1.0);
}
//------------------------------------------------------------------
Fragment:
//------------------------------------------------------------------
#version 420 core
//------------------------------------------------------------------
uniform float slice=0.25; // <0,1> slice of txr
in smooth vec2 vpos;
uniform sampler3D vol_txr; // 3D texture unit used
out layout(location=0) vec4 frag_col;
void main()
{
frag_col=texture(vol_txr,vec3(0.5*(vpos+1.0),slice));
}
//---------------------------------------------------------------------------
So you need to change the slice uniform before each slice render. The rendering itself is just single QUAD covering the screen <-1,+1> while viewport matches the texture x,y resolution...

Related

How to loop over every pixel in a 3D texture/buffer without using compute shaders

I understand how you would do this with a 2D buffer. Just draw two triangles that make a quad that fully encompass the 2D buffer space. That way when the fragment shader runs it runs for all the pixels in the buffer.
Question: How would this work for a 3D buffer?
You could just write a lot of triangles for each cross-section of the 3D buffer. However, if you had a texture that was 1x1x256 that would mean that you would need to draw 256*2 triangles for each slice to iterate over all of the pixels. I know this is an extreme case and there are ways of optimizing this solution. However, I feel like there is a more elegant solution that I am missing.
What I am trying to do: I am trying to make a 3D fluid solver that iterates through each of the pixels of the 3D texture and computes its velocity, density, etc. I am trying to do this via the fragment shader because I am using OpenGL 3.0 which does not use compute shaders.
#version 330 core
out vec4 FragColor;
uniform sampler3D volume;
void main()
{
// computing the fluid density, velocity, and center of mass
// output the values to the 3D buffer to diffrent color channels:
fragColor = vec4(density, velocity.xy, centerOfMass);
}
At some point in the fragment shader, you're going to write some statement of the form:
vec4 value = texture(my_texture, TexCoords);
Where TexCoords is the location in my_texture that maps to some particular value in the source texture. But... that mapping is entirely up to you. Nobody's making you use gl_FragCoord.xy / textureSize(my_texture). You could just as easily use vec3(gl_FragCoord.x, Y_value, gl_FragCoord.y) / textureSize(my_texture), which puts the Y component of the fragment location in the Z dimension of the texture. Y_value in this case is a value passed from the outside that tells which vertical slice of the 3D texture to use.
Of course, whatever mapping you use to fetch the data must also be used when you write the data. If you're writing via fragment shader outputs, that poses a problem. A 3D texture can only be attached to an FBO as either a single 2D slice or as a layered set of 2D slices, with these slices always being along the Z dimension of the image. So even if you try to read in slices along the Y dimension, it has to be written in Z slices. So you'd be moving around the location of the data, which makes this non-viable.
If you're using image load/store, then you have no problem. You can just write to the appropriate texel (indeed, you can read from it as an image using integer coordinates, so there's no need to divide by the texture's size).

GLSL : accessing framebuffer to get RGB and change it

I'd like to access framebuffer to get RGB and change their values for each pixel. It is because the glReadPixels, and glDrawPixels are too slow to use, so that i should use shaders instead of using them.
Now, I write code, and success to display three-dimensional model using GLSL shaders.
I drew two cubes as follows.
....
glDrawArrays(GL_TRIANGLES, 0, 12*6);
....
and fragment shader :
varying vec3 fragmentColor;
void main()
{
gl_FragColor = vec4(fragmentColor, 1);
}
Then, how can I access to RGB values and change it?
For example, If the pixel values at (u1, v1) on window and (u2, v2) are (0,0,255), then I want to change them to (255,0,0)
With the exception of an OpenGL ES-only extension, fragment shaders cannot just read from the current framebuffer. Otherwise, we wouldn't need blending.
You also can't just render to the image you're reading from in a shader. So if you need to do some sort of post-processing, then that is best done by rendering to a separate image. That is, you do your rendering to image 1, then bind that as a texture and change the FBO so that you're rendering to image 2.
Alternatively, if you have access to OpenGL 4.5/ARB/NV_texture_barrier, then you can use texture barriers to handle this. This permits you a single read/modify/write pass, if you bind the current framebuffer's image as a texture. You'd issue the barrier before doing your read/modify/write, then bind that texture to a sampler while still rendering to that framebuffer.
Also, this requires that the FS read from the exact texel that it would write to. Assuming a viewport anchored at 0,0, the code for this would be texelFetch(sampler, ivec2(gl_FragCoord.xy), 0). You can't read from someone else's texel and modify it.
Obviously you must be rendering to a texture; you cannot use the default framebuffer for this.
Texture barrier could be used for cases where you read from different texels than you write to. But that would require doing something similar to the first case of switching bound images. Though you wouldn't need to change the FBO exactly; you could change the region of the FBO that you render to. That is, so long as you're reading from a different area than you're rendering to, and you use barriers appropriately when switching between those regions, everything is fine.

How do I render multiple textures in modern OpenGL?

I am currently writing a 2d engine for a small game.
The idea was that I could render the whole scene in just one draw call. I thought I could render every 2d image on a quad which means that I could use instancing.
I imagined that my vertex shader could look like this
...
in vec2 pos;
in mat3 model;
in sampler2d tex;
in vec2 uv;
...
I thought I could just load a texture on the gpu and get a handle to it like I would do with a VBO, but it seems it is not that simple.
It seems that I have to call
glActiveTexture(GL_TEXTURE0..N);
for every texture that I want to load. Now this doesn't seem as easy to program as I thought. How do modern game engines render multiple textures?
I read that the texture limit of GL_TEXTURE is dependent on the GPU but it is at least 45. What if I want to render an image that consists of more than 45 textures for example 90?
It seems that I would have to render the first 45 textures and delete all the texture from the gpu and load the other 45 textures from the hdd to the gpu. That doesn't seem very reasonable to do every frame. Especially when I want to to animate a 2D image.
I could easily think that a simple animation of a 2d character could consist of 10 different images. That would mean I could easily over step the texture limit.
A small idea of mine was to combine multiple images in to one mega image and then offset them via uv coordinates.
I wonder if I just misunderstood how textures work in OpenGL.
How would you render multiple textures in OpenGL?
The question is somewhat broad, so this is just a quick overview of some options for using multiple textures in the same draw call.
Bind to multiple texture units
For this approach, you bind each texture to a different texture unit, using the typical sequence:
glActiveTexture(GL_TEXTURE0 + i);
glBindTexture(GL_TEXTURE_2D, tex[i]);
In the shader, you can have either a bunch of separate sampler2D uniforms, or an array of sampler2D uniforms.
The main downside of this is that you're limited by the number of available texture units.
Array textures
You can use array textures. This is done by using the GL_TEXTURE_2D_ARRAY texture target. In many ways, a 2D texture array is similar to a 3D texture. It's basically a bunch of 2D textures stacked on top of each other, and stored in a single texture object.
The downside is that all textures need to have the same size. If they don't, you have to use the largest size for the size of the texture array, and you waste memory for the smaller textures. You'll also have to apply scaling to your texture coordinates if the sizes aren't all the same.
Texture atlas
This is the idea you already presented. You store all textures in a single large texture, and use the texture coordinates to control which texture is used.
While a popular approach, there are some technical challenges with this. You have to be careful at the seams between textures so that they don't bleed into each other when using linear sampling. And while this approach, unlike texture arrays, allows for different texture sizes without wasting memory, allocating regions within the atlas gets a little trickier with variable sizes.
Bindless textures
This is only available as an extension so far: ARB_bindless_texture.
You need to learn about the difference of texture units and texture objects.
Texture units are like "texture cartridges" of the OpenGL rasterizer. The rasterizer has a limited amount of "cartridge" slots (called texture units). To load a texture into a texture unit you first select the unit with glActiveTexture, then you load the texture "cartridge" (the texture object) using glBindTexture.
The amount of texture object you can have is only limited by your systems memory (and storage capabilities), but only a limited amount of textures can be "slotted" into the texture unit at the same time.
Samplers are like "taps" into the texture units. Different samplers within a shader may "tap" into the same texture unit. By setting the sampler uniform to a texture unit you select which unit you want to sample from.
And then you can also have the same texture "slotted" into multiple texture units at the same time.
Update (some clarification)
I read that the texture limit of GL_TEXTURE is dependent on the GPU but it is at least 45. What if I want to render an image that consists of more than 45 textures for example 90?
Normally you don't try to render the whole image with a single drawing call. It's practically impossible to catch all variations on which textures to use in what situation. Normally you write shaders for specific looks of a "material". Say you have a shader simulating paint on some metal. You'd have 3 textures: Metal, Paint and a modulating texture that controls where metal and where paint is visible. The shader would then have 3 sampler uniforms, one for each texture. To render the surface with that appearance you'd
select the shader program to use (glUseProgram)
for each texture activate in turn the texture unit (glActiveTexture(GL_TEXTURE_0+i) and bind the texture ('glBindTexture`)
set the sampler uniforms to the texture units to use (glUniform1i(…, i)).
draw the geometry.

Use shader on texture instead of screen

I've written a simple GL fragment shader which performs an RGB gamma adjustment on an image:
uniform sampler2D tex;
uniform vec3 gamma;
void main()
{
vec3 texel = texture2D(tex, gl_TexCoord[0].st).rgb;
texel = pow(texel, gamma);
gl_FragColor.rgb = texel;
}
The texture paints most of the screen and it's occurred to me that this is applying the adjustment per output pixel on the screen, instead of per input pixel on the texture. Although this doesn't change its appearance, this texture is small compared to the screen.
For efficiency, how can I make the shader process the texture pixels instead of the screen pixels? If it helps, I am changing/reloading this texture's data on every frame anyway, so I don't mind if the texture gets permanently altered.
and it's occurred to me that this is applying the adjustment per output pixel on the screen
Almost. Fragment shaders are executed per output fragment (hence the name). A fragment is a the smallest unit of rasterization, before it's written into a pixel. Every pixel that's covered by a piece of visible rendered geometry is turned into one or more fragments (yes, there may be even more fragments than covered pixels, for example when drawing to an antialiased framebuffer).
For efficiency,
Modern GPUs won't even "notice" the slightly reduced load. This is a kind of microoptimization, that's on the brink of non-measureability. My advice: Don' worry about it.
how can I make the shader process the texture pixels instead of the screen pixels?
You could preprocess the texture, by first rendering it through a texture sized, not antialiased framebuffer object to a intermediate texture. However if your change is nonlinear, and a gamma adjustment is exactly that, then you should not do this. You want to process images in a linear color space and apply nonlinear transformation only as late as possible.

Fast paletted screen blit with OpenGL

A game uses software rendering to draw a full-screen paletted (8-bit) image in memory.
What's the fastest way to put that image on the screen, using OpenGL?
Things I've tried:
glDrawPixels with glPixelMap to specify the palette, and letting OpenGL do the palette mapping. Performance was horrendous (~5 FPS).
Doing palette mapping (indexed -> BGRX) in software, then drawing that using glDrawPixels. Performance was better, but CPU usage is still much higher than using 32-bit DirectDraw with the same software palette mapping.
Should I be using some kind of texture instead?
glDrawPixels with glPixelMap to specify the palette, and letting OpenGL do the palette mapping. Performance was horrendous (~5 FPS).
That's not a surprise. glDrawPixels is not very fast to begin with and glPixelMap will do the index/palette → RGB conversion on the CPU in a surely not very optimized codepath.
Doing palette mapping (indexed -> BGRX) in software, then drawing that using glDrawPixels.
glDrawPixels is about one of the slowest functions in OpenGL there is. This has two main reasons: First it's a codepatch not very much optimized, second it writes directly into the target framebuffer, hence forcing the pipeline into synchronization every time it's called. Also on most GPU it isn't backed by any cache.
What I suggest is you place your indexed image into single channel texture, e.g. GL_R8 (for OpenGL-3 or later) or GL_LUMINANCE8, and your palette into a 1D RGB texture, so that the index used as texture coordinate does look up the color. Using a texture as a LUT is perfectly normal. With this combination you use a fragment shader for in-situ palette index to color conversion.
The fragment shader would look like this
#version 330
uniform sampler2D image;
uniform sampler1D palette;
in vec2 texcoord;
void main()
{
float index = tex2D(image, texcoord).r * 255.f; // 255 for a 8 bit palette
gl_FragColor = texelFetch(palette, index, 0);
}