Lots of big textures is slow. Best way to speed up? - opengl

I am writing a terrain engine that has a lot of large textures draped over a heightmap. These textures are generated by rendering a bunch of stuff and then using the glCopyTexSubImage2D command a few times.
All is fine and well (on both the speed and quality fronts), but when I make a lot of them (more than 45 textures of about 1 megapixel each) my framerate takes a nosedive from 60 to about 2. My first thought was that something (GPU or CPU) has to be pulling all million pixels per texture each time it is rendered, which would definitely slow things down, so I tried what I thought was the solution: implementing mipmapping.
So I pull all the RGB values into an array and pass it back into gluBuild2DMipmaps. (Is this not wasteful? I ask for some data and then give it right back. Is there a better way to do this with what I have? See below.)
Now the mid-distance textures look terrible and bland, and I am no better off on the speed front.
Is there some way to get more detail on the further out textures while speeding up my rendering? Bear in mind that I am using freeglut and so am rather limited to opengl 2.
[EDIT: some code samples]
The generation of the texture:
//Only executes once as then the texture is defined.
if (TextureNumber == -1)
{
    glGenTextures(1, &TextureNumber);
    glBindTexture(GL_TEXTURE_2D, TextureNumber);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
    // Mipmap modes are invalid for the mag filter; only GL_NEAREST or GL_LINEAR are accepted.
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    // ... Other not directly related stuff
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, TEXTURE_SIZE, TEXTURE_SIZE, 0, GL_RGB, GL_UNSIGNED_BYTE, NULL);
}
//The texture is built up a little bit at a time, over a number of calls.
// Do some rendering
// ...
glBindTexture(GL_TEXTURE_2D, TextureNumber);
//And copy it into the big texture
glCopyTexSubImage2D (GL_TEXTURE_2D, 0, texX * _patch_size, texY * _patch_size, 0, 0, _patch_size, _patch_size);
Finally, this is run once:
unsigned char* dat = new unsigned char[TEXTURE_SIZE * TEXTURE_SIZE * 3];
glBindTexture(GL_TEXTURE_2D, TextureNumber);
// Read the composed texture back to the CPU...
glGetTexImage(GL_TEXTURE_2D, 0, GL_RGB, GL_UNSIGNED_BYTE, dat);
// ...and hand the same data straight back so GLU can build the mipmap chain.
gluBuild2DMipmaps(GL_TEXTURE_2D, GL_RGB, TEXTURE_SIZE, TEXTURE_SIZE, GL_RGB, GL_UNSIGNED_BYTE, dat);
finishedTexture = true;
delete[] dat;
The rendering:
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glColor3f(1,1,1);
glVertexPointer( 3, GL_FLOAT, 0, VertexData);
glTexCoordPointer(2, GL_FLOAT, 0, TextureData);
glBindTexture(GL_TEXTURE_2D, TextureNumber);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
glDrawElements(GL_TRIANGLES,    // mode
               numTri[detail],  // count, i.e. how many indices
               GL_UNSIGNED_INT, // type of the index array
               TriangleData[detail]);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);
glDisableClientState(GL_VERTEX_ARRAY);

The first way to speed up that code is to get rid of all the functions that are 20 years old and convert it to shaders. The fixed-function pipeline can be very suboptimal on modern hardware, and constantly re-sending data to the GPU is probably also killing the performance.
Bear in mind that I am using freeglut and so am rather limited to opengl 2.
No, that's not true. Freeglut is concerned mostly with window and context creation, and you can still use GLLoad or GLEW to get OGL 3.x or 4.x functions.
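For example, a minimal sketch of getting a 3.3 context with freeglut plus GLEW (the version numbers and window name here are assumptions, not from the question):
// Minimal sketch, assuming freeglut 2.6+ and GLEW are available.
#include <GL/glew.h>
#include <GL/freeglut.h>

int main(int argc, char** argv)
{
    glutInit(&argc, argv);
    glutInitContextVersion(3, 3);    // ask freeglut for a 3.3 context
    glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGBA | GLUT_DEPTH);
    glutCreateWindow("terrain");
    if (glewInit() != GLEW_OK)       // loads the 3.x/4.x entry points
        return 1;
    // ... register callbacks and enter glutMainLoop() as usual
    return 0;
}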
A quick list of things I see:
Fixed-function pipeline state is used (glColor)
No VBO is used, combined with the deprecated glVertexPointer (see the sketch after this list)
Perhaps an FBO should be used to fill the textures initially?
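A minimal sketch of the VBO point: VertexData is from the question, numVerts is a placeholder. The vertex data is uploaded once, and the fixed-function pointer then takes a byte offset instead of a client pointer:
GLuint vbo;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, numVerts * 3 * sizeof(GLfloat), VertexData, GL_STATIC_DRAW);

// At draw time, the last argument of glVertexPointer becomes an offset into the VBO:
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glVertexPointer(3, GL_FLOAT, 0, (void*)0);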

Related

Loading image to opengl texture qt c++

I'm using OpenGL calls in Qt with C++ in order to display images on the screen. To get an image into a texture, I first read it as a QImage, and then use the following code to load it into the texture:
void imageDisplay::loadImage(const QImage &img)
{
    glEnable(GL_TEXTURE_2D);               // Enable texturing
    glBindTexture(GL_TEXTURE_2D, texture); // Set as the current texture
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_REPLACE);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, img.width(), img.height(),
                 0, GL_BGRA, GL_UNSIGNED_BYTE, img.bits());
    glFinish();
    glDisable(GL_TEXTURE_2D);
}
However, when profiling performance, I suspect this is not the most efficient way of doing it. Performance is critical for the application I am developing, and I would really appreciate any suggestions on how to improve this part.
(BTW, reading the image is done in a separate module from the one displaying it. Is it possible to read and load into a texture in one module, and then move the texture to the displaying object?)
Thank you
Kick that glFinish() out; it hurts performance heavily, and you don't need it at all.
It's hard to say without looking at your profiling results, but a few things to try:
You are sending your data via glTexImage2D() in GL_BGRA format and having the driver reformat it. Does it work any faster when you pre-format the bits? (It would be surprising, but you never know what drivers do under the hood.)
QImage imageUpload = imageOriginal.convertToFormat(QImage::Format_ARGB32).rgbSwapped();
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, imageUpload.width(), imageUpload.height(), 0, GL_RGBA, GL_UNSIGNED_BYTE, imageUpload.bits());
Are you populating that QImage from a file on disk? How big is it? If the sheer volume of image data is the problem, and it's a static texture, you can try compressing the image to a DXT/S3TC format offline and saving that for later. Then read and upload those bits at runtime instead (see the sketch at the end of this answer). This reduces the number of bytes transferred to the GPU (and stored in GPU memory) to 1/6 or 1/4 of the original. If you need to generate the QImage at runtime and can't precompress, then this will probably slow things down, so keep that in mind.
Is the alpha combine important? That is, if you remove glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_REPLACE);, does it help? I wonder if that's causing some unwanted churn, i.e. the texture being downloaded to the host, modified, then sent back, or something like that...
I agree that you should remove the glFinish(); unless you have a VERY specific reason for it.
How often do you need to do this? Every frame?
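Regarding the precompression suggestion above, a minimal upload sketch, assuming dxtData and dxtSize come from your own offline-compressed file (those names are placeholders, not from the question):
// Requires the GL_EXT_texture_compression_s3tc extension.
glBindTexture(GL_TEXTURE_2D, texture);
glCompressedTexImage2D(GL_TEXTURE_2D, 0, GL_COMPRESSED_RGBA_S3TC_DXT5_EXT,
                       width, height, 0, dxtSize, dxtData);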
What about the STB image loader?
It gives you raw pixel data in a buffer: it returns a pointer to a byte buffer which you can use and then release after it's been loaded into OpenGL.
int width, height, comp;
unsigned char *data = stbi_load(filename, &width, &height, &comp, 4); // ask for 4 components since it's RGBA // demo only
glGenBuffers(1, &m_vbo);
glBindBuffer(GL_ARRAY_BUFFER, m_vbo);
glBufferData(GL_ARRAY_BUFFER, vertexByteSize, this->Vertices, GL_STATIC_DRAW);
glGenTextures(1, &texture);
glBindTexture(GL_TEXTURE_2D, texture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, data);
// you can generate mipmaps here
// Important: release the pixel buffer once it has been uploaded
stbi_image_free(data);
You can find STB here (get the stb_image.h header):
https://github.com/nothings/stb
Drawing with mipmaps in OpenGL improves performance.
Generate the mipmaps after you have called glTexImage2D:
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_NEAREST);
glGenerateMipmap(GL_TEXTURE_2D);
Good luck

How to copy texture1 to texture2 efficiently?

I want to copy texture1 to texture2.
The most stupid way is copying the tex1 data from the GPU to the CPU, and then copying the CPU data back to the GPU.
The stupid code is below:
float *data = new float[width * height * 4];
glBindTexture(GL_TEXTURE_2D, tex1);
glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_FLOAT, data);
glBindTexture(GL_TEXTURE_2D, tex2);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, width, height, 0, GL_RGBA, GL_FLOAT, data);
delete[] data;
As far as I know, there must be a method that supports copying data from one GPU texture to another without the CPU being involved. I have considered using an FBO and rendering a tex1-textured quad into tex2, but somehow I think that is still naive. So what is the most efficient way to implement this?
If you have support for OpenGL 4.3, there is the straightforward glCopyImageSubData for exactly this purpose:
glCopyImageSubData(tex1, GL_TEXTURE_2D, 0, 0, 0, 0,
tex2, GL_TEXTURE_2D, 0, 0, 0, 0,
width, height, 1);
Of course this requires the destination texture to already be allocated with an image of appropriate size and format (using a simple glTexImage2D(..., nullptr), or maybe even better glTexStorage2D if you have GL 4 anyway).
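For instance, a minimal allocation sketch matching the GL_RGBA32F textures from the question:
glBindTexture(GL_TEXTURE_2D, tex2);
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA32F, width, height); // GL 4.2+, immutable storage
// or, without GL 4:
// glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, width, height, 0, GL_RGBA, GL_FLOAT, NULL);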
If you don't have that, then rendering one texture into the other using an FBO might still be the best approach. In the end you don't even need to render with the source texture; you can just attach both textures to an FBO and blit one color attachment over into the other using glBlitFramebuffer (core since OpenGL 3, or with the GL_EXT_framebuffer_blit extension in 2.x, so virtually anywhere FBOs are available in the first place):
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, tex1, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT1,
                       GL_TEXTURE_2D, tex2, 0);
glDrawBuffer(GL_COLOR_ATTACHMENT1); // the read buffer already defaults to GL_COLOR_ATTACHMENT0
glBlitFramebuffer(0, 0, width, height, 0, 0, width, height,
                  GL_COLOR_BUFFER_BIT, GL_NEAREST);
Of course, if you do this multiple times, it might be a good idea to keep the FBO alive. Likewise, this also requires the destination texture image to have the appropriate size and format beforehand. Or you could use Michael's suggestion of only attaching the source texture to the FBO and doing a good old glCopyTex(Sub)Image2D into the destination texture. You'd need to evaluate which performs better (if either does).
And if you don't even have that, then you could still use your approach of reading one texture and writing that data into the other. But instead of using CPU memory as the temporary buffer, use a pixel buffer object (PBO) (core since OpenGL 2.1). You will still have an additional copy, but at least that will be (or is likely to be) a GPU-to-GPU copy.
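A minimal sketch of that PBO round trip (the pbo name is a placeholder), again using the GL_RGBA32F textures from the question:
GLuint pbo;
glGenBuffers(1, &pbo);
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
glBufferData(GL_PIXEL_PACK_BUFFER, width * height * 4 * sizeof(float), NULL, GL_STREAM_COPY);

// Pack tex1 into the PBO; the last argument is now an offset, not a client pointer.
glBindTexture(GL_TEXTURE_2D, tex1);
glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_FLOAT, 0);
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);

// Unpack from the PBO into tex2.
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
glBindTexture(GL_TEXTURE_2D, tex2);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height, GL_RGBA, GL_FLOAT, 0);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);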

Do glGenTextures and glBindTexture need to be called before doing glTexImage2D and glTexSubImage2D?

Hi, I have 35 images to draw in a display, laid out in a 7x5 grid. The images are downloaded from the internet. Each time an image finishes downloading, I try to draw all 35 images; for the ones that haven't downloaded yet, I draw a default tile instead. The problem is that every time an image is downloaded, I am drawing the previously drawn images again too, and I want to reduce that. So I was thinking about doing something like a texture atlas, built manually: I make one big image with glTexImage2D and add subimages to it with glTexSubImage2D.
glGenTextures(1, tex);
glBindTexture(GL_TEXTURE_2D, (*tex));
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
glTexImage2D(GL_TEXTURE_2D, 0, textureImageInfo->format, textureImageInfo->texWidth, textureImageInfo->texHeight, 0, textureImageInfo->format, GL_UNSIGNED_BYTE, NULL);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, textureImageInfo->imageWidth, textureImageInfo->imageHeight, textureImageInfo->format, GL_UNSIGNED_BYTE, textureImageInfo->image);
I call glTexSubImage2D 35 times to add all 35 images to the big texture created by glTexImage2D; I wrote only one call here for easier explanation. Then finally I do:
glEnable(GL_TEXTURE_2D);
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, this->tileCoordList);
glTexCoordPointer(3, GL_FLOAT, 0, this->tileTextureCoordList);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);
glDisable(GL_TEXTURE_2D);
Now what I am confused about is: do I need to generate and bind textures 35 times too, once for each glTexSubImage2D call, or is doing it once enough? The actual problem is that I don't understand what binding the texture has got to do with any of this. Thanks.
There is no such thing as a "subimage" in a texture. There are only images in a texture. glTexImage2D allocates storage for a particular image in the texture, and optionally uploads data to that image. glTexSubImage2D only uploads data to an image. The "sub" means that you can update part of the image, not necessarily all of it.
glTexImage2D is like malloc followed by memcpy. glTexSubImage2D is just a memcpy. That's why you have to call glTexImage2D first.
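Applied to your atlas, that means something like the following sketch (the atlas and tile names are placeholders): generate and allocate once, then just bind and upload as each image arrives.
// One-time "malloc": allocate storage for the whole atlas.
glGenTextures(1, &atlasTexture);
glBindTexture(GL_TEXTURE_2D, atlasTexture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, atlasWidth, atlasHeight, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, NULL);

// Per downloaded image, the "memcpy": bind (so GL knows which texture to
// modify) and upload into that tile's rectangle.
glBindTexture(GL_TEXTURE_2D, atlasTexture);
glTexSubImage2D(GL_TEXTURE_2D, 0, col * tileWidth, row * tileHeight,
                tileWidth, tileHeight, GL_RGBA, GL_UNSIGNED_BYTE, pixels);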

How to copy depth buffer to a texture on the GPU?

I want to get the current depth buffer to a texture, to access it in a shader. For various reasons I can't do a separate depth pass, but would need to copy the already-rendered depth.
glReadPixels would involve the CPU and potentially kill performance, and as far as I know glBlitFramebuffer can't blit depth-to-color, only depth-to-depth.
How to do this on the GPU?
The modern way of doing this would be to use an FBO. Attach a color and a depth texture to it, render, then disable the FBO and use the textures as inputs to a shader that renders to the default framebuffer.
All the details you need about FBOs can be found here.
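For reference, a minimal setup sketch (names assumed; GL 3.0-style calls; colorTex is assumed to be created the same way with a color format):
GLuint fbo, colorTex, depthTex;

// Depth texture that the shader will later sample.
glGenTextures(1, &depthTex);
glBindTexture(GL_TEXTURE_2D, depthTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT24, width, height, 0,
             GL_DEPTH_COMPONENT, GL_UNSIGNED_INT, NULL);

glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, colorTex, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D, depthTex, 0);
// ... render the scene here, then rebind framebuffer 0 and sample both textures
glBindFramebuffer(GL_FRAMEBUFFER, 0);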
Copying the depth buffer to a texture is pretty simple. If you have created a new texture that you haven't called glTexImage* on, you can use glCopyTexImage2D. This will copy pixels from the framebuffer to the texture. To copy depth pixels, you use a GL_DEPTH_COMPONENT format. I'd suggest GL_DEPTH_COMPONENT24.
If you have previously created a texture with a depth component format (ie: anytime after the first frame), then you can copy directly into this image data with glCopyTexSubImage2D.
It also seems as though you're having trouble accessing depth component textures in your shader, since you want to copy depth-to-color (which is not allowed). If you are, then that is a problem you should get fixed.
In any case, copying should be the method of last resort. You should use framebuffer objects whenever possible. Just render directly to your texture.
The best way would be to use FBOs, for better performance and cleaner code.
If you are not interested in that, take a look at this code. It is from the days when I was much younger (and didn't know FBOs existed)!
int shadowMapWidth = 512;
int shadowMapHeight = 512;
glGenTextures(1, &m_depthTexture);
glBindTexture(GL_TEXTURE_2D, m_depthTexture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP);
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT, shadowMapWidth, shadowMapHeight, 0, GL_DEPTH_COMPONENT, GL_UNSIGNED_BYTE, 0);
// Copy the current framebuffer's depth into the texture.
glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, shadowMapWidth, shadowMapHeight);

OpenGL texture randomly not shown

I have got a very, very strange problem in my C++ OpenGL application.
I simply load a texture and apply it to a quadric:
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, width, height, 0, GL_RGB, GL_UNSIGNED_BYTE, image);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
Then
glEnable(GL_TEXTURE_2D);
glBindTexture(GL_TEXTURE_2D, tex);
gluQuadricDrawStyle(quad,GLU_FILL);
gluQuadricTexture(quad,GL_TRUE);
gluCylinder(quad,1,0,2,20,1);
glDisable(GL_TEXTURE_2D);
Now: it works perfectly nine times out of ten, but sometimes the texture isn't shown (the quadric stays white).
The image is loaded correctly, so the problem should be on the OpenGL side. I have tried with several different images too. glGetError always returns GL_NO_ERROR.
Any idea? It is driving me crazy...
Found it :) It was the GLint texture member that wasn't correctly reallocated in the copy constructor.
However, I still don't understand why it worked sometimes...
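For anyone hitting the same bug: a naive copy constructor leaves two objects owning the same GL texture name, and one of them will eventually delete it out from under the other. A minimal sketch of a wrapper (hypothetical, not from the original code) that sidesteps this by forbidding copies:
// Assumes a current GL context whenever instances are constructed/destroyed.
class Texture {
public:
    Texture()  { glGenTextures(1, &tex); }
    ~Texture() { glDeleteTextures(1, &tex); }
    Texture(const Texture&) = delete;             // copying would alias the GL name...
    Texture& operator=(const Texture&) = delete;  // ...and double-delete it later
    GLuint id() const { return tex; }
private:
    GLuint tex;
};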
The code you are using seems valid. Have you...
tried to use a simple quad instead of the quadric?
assured that image is filled correctly?
verified that tex is not altered somewhere else?
assured that no other programs are using OpenGL at the same time?
restarted your computer? ;)