OpenGL: in-GPU texture to texture copy - c++

I want to copy parts of one texture that is already in video memory to a subregion of another texture that is also already in video memory. Fast. Without going through CPU side memory.
This is how I am trying to do it:
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, src_texId, 0);
glReadBuffer(GL_COLOR_ATTACHMENT0);
glBindTexture(GL_TEXTURE_2D, dst_texId);
glCopyTexSubImage2D(GL_TEXTURE_2D, 0, dst_x, dst_y, src_x, src_y, width, height);
glBindTexture(GL_TEXTURE_2D, 0);
The code compiles, and my destination texture does receive an update, but it does not seem to work correctly: it is updated with bluish junk data. Is my approach wrong?

Related

Loading image to opengl texture qt c++

I'm using OpenGL calls in Qt with C++ in order to display images on the screen. To get the image into a texture, I first read it as a QImage and then use the following code to load it into the texture:
void imageDisplay::loadImage(const QImage &img)
{
    glEnable(GL_TEXTURE_2D);                // Enable texturing
    glBindTexture(GL_TEXTURE_2D, texture);  // Set as the current texture
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_REPLACE);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, img.width(), img.height(),
                 0, GL_BGRA, GL_UNSIGNED_BYTE, img.bits());
    glFinish();
    glDisable(GL_TEXTURE_2D);
}
However, when profiling performance, I suspect this is not the most efficient way of doing this. Performance is critical for the application that I am developing, and I would really appreciate any suggestions on how to improve this part.
(BTW - reading the image is done in a separate module from the one that displays it - is it possible to read and load to a texture, and then move the texture to the displaying object?)
Thank you
Kick that glFinish() out; it hurts performance heavily and you don't need it at all.
It's hard to say without looking at your profiling results, but a few things to try:
You are sending your data via glTexImage2D() in GL_BGRA format and having the driver reformat it. Does it work any faster when you pre-format the bits? (It would be surprising, but you never know what drivers do under the hood.)
QImage imageUpload = imageOriginal.convertToFormat( QImage::Format_ARGB32 ).rgbSwapped();
glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA, imageUpload.width(), imageUpload.height(), 0, GL_RGBA, GL_UNSIGNED_BYTE, imageUpload.bits() );
Are you populating that QImage from a file on disk? How big is it? If the sheer volume of image data is the problem, and it's a static texture, you can try compressing the image to a DXT/S3TC format offline and saving that for later. Then read and upload those bits at runtime instead. This will reduce the number of bytes being transferred to the GPU (and stored in GPU memory) to 1/6 or 1/4 of the original. If you need to generate the QImage at runtime and can't precompress, then this will probably slow things down, so keep that in mind.
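For reference, uploading pre-compressed data goes through glCompressedTexImage2D rather than glTexImage2D. A minimal sketch, assuming DXT5 (S3TC) data you loaded yourself; dxtData and dxtSize are placeholders for whatever your offline tool produced:
// Hypothetical pre-compressed upload: dxtData/dxtSize come from your own
// loader (e.g. parsed out of a .dds file); requires GL_EXT_texture_compression_s3tc.
glBindTexture(GL_TEXTURE_2D, texture);
glCompressedTexImage2D(GL_TEXTURE_2D, 0, GL_COMPRESSED_RGBA_S3TC_DXT5_EXT,
                       width, height, 0, dxtSize, dxtData);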
Is the alpha combine important? I.e., if you remove glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_REPLACE);, does it help? I wonder if that's causing some unwanted churn, i.e. the texture being downloaded to the host, modified, then sent back, or something like that...
I agree that you should remove the glFinish(); unless you have a VERY specific reason for it.
How often are you needing to do this? every frame?
What about the STB image loader?
It gives you raw pixel data in a buffer: it returns a char pointer to a byte buffer which you can use and then release after it has been loaded into OpenGL.
int width, height, comp;
unsigned char *data = stbi_load(filename, &width, &height, &comp, 4); // ask it to load 4 components since it's RGBA // demo only
glGenBuffers(1, &m_vbo);
glBindBuffer(GL_ARRAY_BUFFER, m_vbo);
glBufferData(GL_ARRAY_BUFFER, vertexByteSize, this->Vertices, GL_STATIC_DRAW);
glGenTextures(...)
glBindTexture(GL_TEXTURE_2D, texture);
glTexImage2D( ... );
// you can generate mipmaps here
// Important: free the CPU-side pixel data once it has been uploaded
stbi_image_free(data);
You can find STB here (grab the stb_image.h image loading header):
https://github.com/nothings/stb
Drawing with mipmaps in OpenGL improves performance.
Generate the mipmaps after you have called glTexImage2D:
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_NEAREST);
glGenerateMipmap(GL_TEXTURE_2D);
Good luck

OpenGL: fastest way to draw 2d image

I am writing an interactive path tracer and I was wondering what is the best way to draw the result on screen in modern GL. I have the result of the rendering stored in a pixel buffer that is updated on each pass (+1 spp), and I would like to draw it on screen after each pass. I did some searching and people have suggested drawing a textured quad for displaying 2D images. Does that mean I would create a new texture each time I update? And given that my pixels are updated very frequently, is this still a good idea?
You don't need to create an entirely new texture every time you want to update the content. If the size stays the same, you can reserve the storage once, using glTexImage2D() with NULL as the last argument. E.g. for a 512x512 RGBA texture with 8-bit component precision:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 512, 512, 0,
GL_RGBA, GL_UNSIGNED_BYTE, NULL);
In OpenGL 4.2 and later, you can also use:
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, 512, 512);
You can then update all or parts of the texture with glTexSubImage2D(). For example, to update the whole texture following the example above:
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 512, 512,
                GL_RGBA, GL_UNSIGNED_BYTE, data);
Of course, if only rectangular part(s) of the texture change each time, you can make the updates more selective by choosing the offset and size parameters (xoffset, yoffset, width, height) accordingly.
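For instance, a sketch of updating only a 64x64 region at offset (128, 256), assuming tileData is a tightly packed 64x64 RGBA buffer (the name and numbers are just for illustration):
glTexSubImage2D(GL_TEXTURE_2D, 0, 128, 256, 64, 64,
                GL_RGBA, GL_UNSIGNED_BYTE, tileData);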
Once your current data is in a texture, you can either draw a textured screen-sized quad, or copy the texture to the default framebuffer using glBlitFramebuffer(). You should be able to find plenty of sample code for the first option (a minimal sketch is also included after the blit example below). The code for the second option would look something like this:
// One time during setup.
GLuint readFboId = 0;
glGenFramebuffers(1, &readFboId);
glBindFramebuffer(GL_READ_FRAMEBUFFER, readFboId);
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
GL_TEXTURE_2D, tex, 0);
glBindFramebuffer(GL_READ_FRAMEBUFFER, 0);
// Every time you want to copy the texture to the default framebuffer.
glBindFramebuffer(GL_READ_FRAMEBUFFER, readFboId);
glBlitFramebuffer(0, 0, texWidth, texHeight,
0, 0, winWidth, winHeight,
GL_COLOR_BUFFER_BIT, GL_LINEAR);
glBindFramebuffer(GL_READ_FRAMEBUFFER, 0);
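For completeness, here is a minimal sketch of the first option (the textured quad), assuming a GL 3.3 core context; the shader sources and the program/emptyVao names are illustrative, and the usual compile/link boilerplate is omitted:
// Vertex shader (GLSL 3.30): generates a fullscreen triangle from gl_VertexID,
// so no vertex buffer is needed:
//   #version 330 core
//   out vec2 uv;
//   void main() {
//       vec2 pos = vec2((gl_VertexID << 1) & 2, gl_VertexID & 2);
//       uv = pos;
//       gl_Position = vec4(pos * 2.0 - 1.0, 0.0, 1.0);
//   }
// Fragment shader: samples the texture:
//   #version 330 core
//   in vec2 uv;
//   out vec4 fragColor;
//   uniform sampler2D tex;
//   void main() { fragColor = texture(tex, uv); }

// Every frame, after the texture has been updated:
glUseProgram(program);                  // program linked from the two shaders above
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, tex);
glUniform1i(glGetUniformLocation(program, "tex"), 0);
glBindVertexArray(emptyVao);            // a VAO with no attributes is sufficient here
glDrawArrays(GL_TRIANGLES, 0, 3);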

OpenGL glGeneratemipmap and Framebuffers

I'm wrapping my head around generating mipmaps on the fly, and was reading this post with this code: http://www.g-truc.net/post-0256.html
//Create the mipmapped texture
glGenTextures(1, &ColorbufferName);
glBindTexture(GL_TEXTURE_2D, ColorbufferName);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, 512, 512, 0, GL_RGB, GL_UNSIGNED_BYTE, NULL);
glGenerateMipmap(GL_TEXTURE_2D); // /!\ Allocate the mipmaps /!\
...
//Create the framebuffer object and attach the mipmapped texture
glBindFramebuffer(GL_FRAMEBUFFER, FramebufferName);
glFramebufferTexture2D(
GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, ColorbufferName, 0);
...
//Commands to actually draw something
render();
...
//Generate the mipmaps of ColorbufferName
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, ColorbufferName);
glGenerateMipmap(GL_TEXTURE_2D);
My questions:
1 - Why does glGenerateMipmap need to be called twice in the case of render to texture?
2 - Does it have to be called like this every frame?
If, for example, I import a diffuse 2D texture, I only need to call it once after loading it into OpenGL, like this:
GLCALL(glGenTextures(1, &mTexture));
GLCALL(glBindTexture(GL_TEXTURE_2D, mTexture));
GLint format = (colorFormat == ColorFormat::COLOR_FORMAT_RGB ? GL_RGB : colorFormat == ColorFormat::COLOR_FORMAT_RGBA ? GL_RGBA : GL_RED);
GLCALL(glTexImage2D(GL_TEXTURE_2D, 0, format, textureWidth, textureHeight, 0, format, GL_UNSIGNED_BYTE, &textureData[0]));
GLCALL(glGenerateMipmap(GL_TEXTURE_2D));
GLCALL(glBindTexture(GL_TEXTURE_2D, 0));
I suspect it is because the textures are redrawn every frame and the mipmap generation uses its content in the process but I want confirmation of this.
3 - Also, if I render to my g-buffer and then immediately glBlitFramebuffer it to the default FBO, do I need to bind the texture and call glGenerateMipmap like this?
GLCALL(glBindTexture(GL_TEXTURE_2D, mGBufferTextures[GBuffer::GBUFFER_TEXTURE_DIFFUSE]));
GLCALL(glGenerateMipmap(GL_TEXTURE_2D));
GLCALL(glReadBuffer(GL_COLOR_ATTACHMENT0 + GBuffer::GBUFFER_TEXTURE_DIFFUSE));
GLCALL(glBlitFramebuffer(0, 0, mWindowWidth, mWindowHeight, 0, 0, mWindowWidth, mWindowHeight, GL_COLOR_BUFFER_BIT, GL_LINEAR));
As explained in the post you link to, "[glGenerateMipmap] does actually two things which is maybe the only issue with it: It allocates the mipmaps memory and generate the mipmaps."
Notice that what precedes the first glGenerateMipmap call is a glTexImage2D call with a NULL data pointer. Those two calls combined will simply allocate the memory for all of the texture's levels. The data they contain at this point is garbage.
Once you have an image loaded into the texture's first level, you will have to call glGenerateMipmap a second time to actually fill the smaller levels with downsampled images.
Your guess is right, glGenerateMipmap is called every frame because the image rendered to the texture's first level changes every frame (since it is being rendered to). If you don't call the function, then the smaller mipmaps will never be modified (if you were to map such a texture, you would see your uninitialized smaller mipmap levels when far enough away).
No. Mipmaps are only needed if you intend to map the texture to triangles with a texture filtering mode that uses mipmaps. If you're only dealing with the first level of the texture, you don't need to generate the mipmaps. In fact, if you never map the texture, you can use a renderbuffer instead of a texture in your framebuffer.
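A sketch of that variant, reusing the names from the code above (sizes are placeholders): allocate a color renderbuffer and attach it in place of the texture:
// A color renderbuffer cannot be sampled, but it works fine as a render
// target that you only blit or read back from.
GLuint colorRbo = 0;
glGenRenderbuffers(1, &colorRbo);
glBindRenderbuffer(GL_RENDERBUFFER, colorRbo);
glRenderbufferStorage(GL_RENDERBUFFER, GL_RGBA8, 512, 512);
glBindRenderbuffer(GL_RENDERBUFFER, 0);
glBindFramebuffer(GL_FRAMEBUFFER, FramebufferName);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                          GL_RENDERBUFFER, colorRbo);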

How to copy texture1 to texture2 efficiently?

I want to copy texture1 to texture2.
The most stupid way is copying tex1 data from GPU to CPU, and then copy CPU data to GPU.
The stupid code is as below:
float *data = new float[width*height*4];
glBindTexture(GL_TEXTURE_2D, tex1);
glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_FLOAT, data);
glBindTexture(GL_TEXTURE_2D, tex2);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, width, height, 0, GL_RGBA, GL_FLOAT, data);
As far as I know, there must be a method that supports copying data from one GPU texture to another without the CPU being involved. I have considered using an FBO and rendering a tex1 quad into tex2, but somehow that still seems naive. So what is the most efficient way to implement this?
If you have support for OpenGL 4.3, there is the straightforward glCopyImageSubData for exactly this purpose:
glCopyImageSubData(tex1, GL_TEXTURE_2D, 0, 0, 0, 0,
tex2, GL_TEXTURE_2D, 0, 0, 0, 0,
width, height, 1);
Of course this requires the destination texture to already be allocated with an image of appropriate size and format (using a simple glTexImage2D(..., nullptr), or maybe even better glTexStorage2D if you have GL 4 anyway).
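For example, allocating the destination without uploading any data might look like this (assuming a single-level GL_RGBA32F texture to match the float data above):
glBindTexture(GL_TEXTURE_2D, tex2);
// Mutable storage, no initial data...
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, width, height, 0,
             GL_RGBA, GL_FLOAT, nullptr);
// ...or, on GL 4.2+, immutable storage instead:
// glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA32F, width, height);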
If you don't have that, then rendering one texture into the other using an FBO might still be the best approach. In the end you don't even need to draw with the source texture. You can just attach both textures to an FBO and blit one color attachment over into the other using glBlitFramebuffer (core since OpenGL 3, or with the GL_EXT_framebuffer_blit extension in 2.x, so virtually anywhere FBOs are available in the first place):
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
GL_TEXTURE_2D, tex1, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT1,
GL_TEXTURE_2D, tex2, 0);
glDrawBuffer(GL_COLOR_ATTACHMENT1);
glBlitFramebuffer(0, 0, width, height, 0, 0, width, height,
GL_COLOR_BUFFER_BIT, GL_NEAREST);
Of course if you do that multiple times, it might be a good idea to keep this FBO alive. And likewise this also requires the destination texture image to have the appropriate size and format beforehand. Or you could also use Michael's suggestion of only attaching the source texture to the FBO and doing a good old glCopyTex(Sub)Image2D into the destination texture. You would need to evaluate which one performs better (if either does).
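A sketch of that glCopyTexSubImage2D variant, attaching only the source texture to the same fbo (tex2 must already be allocated):
glBindFramebuffer(GL_READ_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, tex1, 0);
glReadBuffer(GL_COLOR_ATTACHMENT0);
glBindTexture(GL_TEXTURE_2D, tex2);
// Copy from the read framebuffer (i.e. tex1) into the destination texture.
glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, width, height);
glBindFramebuffer(GL_READ_FRAMEBUFFER, 0);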
And if you don't even have that one, then you could still use your approach of reading one texture and writing that data into the other. But instead of using CPU memory as a temporary buffer, use a pixel buffer object (PBO) (core since OpenGL 2.1). You will still have an additional copy, but at least that will (or is likely to be) a GPU-to-GPU copy.
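A sketch of that PBO route (the buffer name is illustrative, GL 2.1+ is assumed, and tex2 must already be allocated with a matching size and format):
// Pack the source texture into a PBO, then unpack it into the destination;
// the pixel data never has to pass through client memory.
GLuint pbo = 0;
glGenBuffers(1, &pbo);
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
glBufferData(GL_PIXEL_PACK_BUFFER, width * height * 4 * sizeof(float),
             NULL, GL_STREAM_COPY);
glBindTexture(GL_TEXTURE_2D, tex1);
glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_FLOAT, NULL);   // writes into the PBO
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);

glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
glBindTexture(GL_TEXTURE_2D, tex2);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                GL_RGBA, GL_FLOAT, NULL);                    // reads from the PBO
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);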

How to prevent FBO rendering corrupting when running multiple OpenGL processes at same time?

I need to be able to render with multiple processes at same time, using OpenGL.
I'm using an FBO to render into a texture. I read the pixels back with glGetTexImage() multiple times in that one process (tiled rendering).
Then I launched multiple programs to run at the same time and noticed that sometimes it works and sometimes it doesn't. Sometimes the whole image is corrupted (it repeats only one tile), sometimes only a small part is corrupted. I also noticed earlier that I was not able to use a 4096x4096 FBO texture for some reason, and the errors from that texture size were the same as this "multiple processes at once" tiling error, so I thought it could be something to do with the program trying to fetch a texture that has not been fully rendered yet. I also noticed that the smaller the texture I use, the more processes I can run at the same time. My graphics card memory is 256 MB, I think. But even with 8 processes and a 1024x1024 texture size it uses only 33 MB of memory at worst, so it can't be a graphics card memory limitation.
The tiling error looks like it doesn't get the new tile pixel data, so it uses the old buffer again.
What can I do to prevent the corruption of my rendering?
Here is my rendering code structure:
for (y...) {
    for (x...) {
        // set viewport & translatef
        glBindFramebuffer(GL_FRAMEBUFFER, fboId);
        // glClear()
        render_tile();
        glFlush();
        glBindFramebuffer(GL_FRAMEBUFFER, 0);
        glBindTexture(GL_TEXTURE_2D, textureId);
        glGetTexImage(GL_TEXTURE_2D, 0, GL_RGB, GL_UNSIGNED_BYTE, pixels);
        glBindTexture(GL_TEXTURE_2D, 0);
        copy_tile_pixels_to_output_image();
    }
}
And here is the FBO initialization (only opengl related commands are shown):
// texture:
glGenTextures(1, &textureId);
glBindTexture(GL_TEXTURE_2D, textureId);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, TEXTURE_WIDTH, TEXTURE_HEIGHT, 0, GL_RGBA, GL_UNSIGNED_BYTE, 0);
glBindTexture(GL_TEXTURE_2D, 0);
// FBO:
glGenFramebuffers(1, &fboId);
glBindFramebuffer(GL_FRAMEBUFFER, fboId);
glGenRenderbuffers(1, &rboId);
glBindRenderbuffer(GL_RENDERBUFFER, rboId);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_STENCIL, TEXTURE_WIDTH, TEXTURE_HEIGHT);
glBindRenderbuffer(GL_RENDERBUFFER, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, textureId, 0);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, rboId);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_STENCIL_ATTACHMENT, GL_RENDERBUFFER, rboId);
checkFramebufferStatus(); // will exit if errors found. none found, however.
glBindFramebuffer(GL_FRAMEBUFFER, 0);
Edit: as datenwolf noted, the problem goes away when using glReadPixels(). But I'm still not sure why, so it would be good to know what's happening under the hood, to be sure it will not cause such errors in any case in the future!
The FBO itself is just an abstract object without its own backing image storage. Basically the FBO consists only of slots into which you can plug image sinks and sources. Textures can act as such, but there are also renderbuffers, which serve much the same purpose but cannot be used as a texture sampling source.
You can read back directly from a bound FBO using glReadPixels.
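In the tile loop above, reading straight from the bound FBO would look roughly like this (a sketch; it assumes the FBO is still bound when you read back):
glBindFramebuffer(GL_FRAMEBUFFER, fboId);
// set viewport, glClear(), render_tile() ...
glReadBuffer(GL_COLOR_ATTACHMENT0);
glReadPixels(0, 0, TEXTURE_WIDTH, TEXTURE_HEIGHT,
             GL_RGB, GL_UNSIGNED_BYTE, pixels);
glBindFramebuffer(GL_FRAMEBUFFER, 0);
copy_tile_pixels_to_output_image();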