OpenGL textures: why does order matter? - c++

I am trying to set up a simple function that will make it a lot easier for me to texture map geometry in OpenGL, but for some reason, when I try to make a skybox, I get a white box instead of the texture-mapped geometry. I think the problematic code lies within the following:
void MapTexture (char *File, int TextNum) {
if (!TextureImage[TextNum]){
TextureImage[TextNum]=auxDIBImageLoad(File);
glGenTextures(1, &texture[TextNum]);
glBindTexture(GL_TEXTURE_2D, texture[TextNum]);
glTexImage2D(GL_TEXTURE_2D, 0, 3, TextureImage[TextNum]->sizeX, TextureImage[TextNum]->sizeY, 0, GL_RGB, GL_UNSIGNED_BYTE, TextureImage[TextNum]->data);
glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MIN_FILTER,GL_LINEAR);
}
glEnable(GL_TEXTURE_2D);
glBindTexture(GL_TEXTURE_2D, texture[TextNum]);
//glTexImage2D(GL_TEXTURE_2D, 0, 3, TextureImage[TextNum]->sizeX, TextureImage[TextNum]->sizeY, 0, GL_RGB, GL_UNSIGNED_BYTE, TextureImage[TextNum]->data);
//glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MIN_FILTER,GL_LINEAR);
}
The big thing I don't understand is why glBindTexture() must come between glGenTextures() and glTexImage2D(). If I place it anywhere else, it screws everything up. What could be causing this problem? Sorry if it's something simple, I'm brand new to OpenGL.
Below is a screenshot of the whitebox I am talking about:
EDIT
After playing around with the code a bit more, I realized that if I add glTexImage2D() and glTexParameteri() after the last glBindTexture(), then all the textures load. Why is it that without these two lines most textures load, yet a few do not, and why do I have to call glTexImage2D() every frame, but only for a few textures?

Yes, order is definitely important.
glGenTextures creates a texture name.
glBindTexture takes the texture name generated by glGenTextures, so it can't be run before glGenTextures.
glTexImage2D uploads data to the currently bound texture, so it can't be run before glBindTexture.
The client-side interface to OpenGL is a Big Giant Squiggly State Machine. There are an enormous number of parameters and flags that you can change, and you have to be scrupulous to always leave OpenGL in the right state. This usually means popping matrices you push and restoring flags that you modify (at least in OpenGL 1.x).

OpenGL is a state machine, which means that you can pull its levers, turn its knobs, and it will keep those settings, until you change them actively.
However, it also manages its persistent data in objects. Such objects are something abstract, and must not be confused with objects seen on the screen!
To the outside, OpenGL identifies objects by their so-called name, a numerical ID. You create a (list of) name(s) – but not the object(s)! – with glGenTextures for texture objects, which are one kind of OpenGL object.
To manipulate such an object, OpenGL must first be put into a state in which all following calls that manipulate objects of that type apply to one particular object. This is done with glBindTexture. After calling glBindTexture, all following calls that manipulate textures apply to the texture object you've just bound. If a newly generated name is bound for the first time, the object itself is created at that point.
Now OpenGL uses that particular object.
glTexImage2D is just one of several functions that manipulate the data of the currently bound texture.
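To make the name / object / binding rules above concrete, here is a minimal sketch in plain C++. It is a toy model and not OpenGL: ToyGL and its members are made-up names, and the only thing it tries to show is that an upload always lands in whatever object is bound at that moment (name 0 stands in for the default texture).

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of the rules described above. This is NOT OpenGL; every name in
// this block is hypothetical and exists only for illustration.
struct ToyGL {
    unsigned nextName = 1;
    std::map<unsigned, std::string> objects{{0u, ""}};  // name 0: default texture
    unsigned bound = 0;                                  // currently bound name

    // Like glGenTextures: hands out a name, creates no object.
    unsigned genTexture() { return nextName++; }

    // Like glBindTexture: creates the object on first bind, then selects it.
    void bindTexture(unsigned name) {
        objects.emplace(name, "");  // no effect if the object already exists
        bound = name;
    }

    // Like glTexImage2D: writes to whatever object is bound right now.
    void texImage(const std::string& data) { objects[bound] = data; }
};
```

In this model, calling texImage before bindTexture writes into object 0 instead of the texture you just generated, which is the same kind of mistake as uploading before binding in the question.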
Otherwise your function points in the right direction. OpenGL has no real initialization phase; you just do things as you go along. And it makes sense to defer loading of data until you need it. But it also makes sense to make multiple passes over your lists of objects before you actually draw a frame. One of those preparation passes should iterate over all objects (your own, not OpenGL's) to test whether their data is already loaded. If a significant amount of data is still missing, draw a loading screen instead, so that the user doesn't get the impression your program hangs. You could even carry out lengthy loading operations in a separate thread, but with OpenGL this requires some precautions.

Related

OpenGL: Repeated use of transform feedback buffers overwrites already established textures

I have a working implementation of this technique for view frustum culling of instanced geometry. The gist of the technique is that we use a vertex shader to check if the bounds of an object lie within the view frustum, and if they do we output the position of that object, using a transform feedback buffer and a geometry shader, to a texture. We can then, during an actual rendering pass, use that texture, along with a query of how many positions we emitted, to acquire the relevant position data for the object we're rendering, and number of draws to specify in our call to glDrawElementsInstanced. One difference between what I do, and what the article does, is that I emit a full transformation matrix, rather than a simple position vector, to the texture, but I doubt that has any bearing on my problem.
The actual problem: Currently I have this setup so that, for each object type being rendered (i.e. tree, box, rock, whatever), the actual rendering pass follows immediately upon the frustum cull rendering pass. This works, and gives the intended results. What I want to do instead, however, is to go over all my drawcommands and do all the frustum culling for the various objects first, and only thereafter do all the actual rendering, to avoid a bunch of unnecessary state changes (i.e. switching back and forth between shader programs). When I do this, however, I encounter the problem that previously established textures -- the ones I use for reading positions from during the actual rendering passes -- all seem to be overwritten by the latest call to the frustum culling function, meaning that all textures established seemingly contain only the position information from the last frustum cull call.
For example: I render, in order, 4 trees, 10 boxes and 3 rocks, and what I will see instead is a tree, a box, and a rock, at all the (three) positions where I would expect only the 3 rocks to be. I cannot for the life of me figure out why this is, because I quite clearly bind new buffers and textures to the TRANSFORM_FEEDBACK_BUFFER every time I call the function. Why are the previously used textures still receiving the new data from the latest call?
Code, in C, for the frustum culling function:
void fcullidraw(drawcommand *tar) {
/* printf("Fculling %s\n", tar->res->name); */
mesh *rmesh = &tar->res->amod->meshes[0];
/* glDeleteTextures(1, &rmesh->ctex); */
if(rmesh->ctbuf == 0)
glGenBuffers(1, &rmesh->ctbuf);
glBindBuffer(GL_TEXTURE_BUFFER, rmesh->ctbuf);
glBufferData(GL_TEXTURE_BUFFER, sizeof(instancedata) * tar->nodraws, NULL, GL_DYNAMIC_COPY);
if(rmesh->ctex == 0)
glGenTextures(1, &rmesh->ctex);
glBindTexture(GL_TEXTURE_BUFFER, rmesh->ctex);
glTexBuffer(GL_TEXTURE_BUFFER, GL_RGBA32F, rmesh->ctbuf);
if(rmesh->cquery == 0)
glGenQueries(1, &rmesh->cquery);
checkactiveshader(tar->tar, findshader("icull"));
glEnable(GL_RASTERIZER_DISCARD);
glUniform1f(activeshader->radius, tar->res->amesh->bbox.radius);
glUniform3fv(activeshader->extent, 1, (const GLfloat*)&tar->res->amesh->bbox.ext);
glUniform3fv(activeshader->cp, 1, (const GLfloat*)&tar->res->amesh->bbox.cp);
glBindVertexArray(tar->res->amod->meshes[0].vao);
glBindBuffer(GL_ARRAY_BUFFER, tar->res->amod->meshes[0].posarray);
glBufferData(GL_ARRAY_BUFFER, sizeof(mat4_t) * tar->nodraws, tar->posarray, GL_DYNAMIC_DRAW);
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, rmesh->ctbuf);
glBeginTransformFeedback(GL_POINTS);
glBeginQuery(GL_PRIMITIVES_GENERATED, rmesh->cquery);
glDrawArrays(GL_POINTS, 0, tar->nodraws);
glEndQuery(GL_PRIMITIVES_GENERATED);
glEndTransformFeedback();
glDisable(GL_RASTERIZER_DISCARD);
glGetQueryObjectuiv(rmesh->cquery, GL_QUERY_RESULT, &rmesh->visibleinstances);
}
tar and rmesh obviously vary between each call to this function. Do note that I have left in a few lines of comments here containing code to delete the buffers and textures between each rendering cycle, rather than simply overwriting them, but using that code instead has no effect on the error mode.
I'm stumped. I feel that the textures and buffers are well defined and clearly kept separate, so I do not understand how the textures from previous calls to fcullidraw are somehow still bound to and being overwritten by the TransformFeedback, if that is indeed what is happening, and it certainly seems to be, because the earlier objects will read in the entire transformation matrix of the rock quite neatly, with the "right" rotation, translation, and everything.
The article linked does do the operations in the order I want to do them -- i.e. first repeated frustum culls, and then repeated rendering -- and I'm not sure I see what I do differently. Might be some small and obvious thing, and I might be an idiot, but in that case I'd love to know why and how I am that.
EDIT: I pushed on and updated my implementation with a refinement of the original technique, suggested here, which gets rid of the writing-to-texture method altogether, in favor of instead simply writing to a buffer bound to the VAO, and set to update once per rendered instance with a VertexAttribDivisor. This method looks a lot cleaner on the whole, and incidentally had the additional side effect of not having my original problem at all, as I'm no longer writing to and uploading textures. This is, thus, no longer a practical problem for me, but the answer to the theoretical question does still elude me, so if anyone has ideas I'm still all ears.

OpenGL new textures replacing old ones after deletion

I've run into a bit of a confusing problem with OpenGL. It's rather simple, but I've failed to find any directly related information.
What I'm trying to do
I'm creating several new textures every frame, and right after creation I bind them, use them for drawing, and then delete them right after.
The Problem
If I delete every texture right after it was used, the last one to be drawn replaces the previous ones (but their different geometry works as it should). If I batch my deletions after all drawing has been done, it works as expected, but if I do any draw calls at all after deleting the textures, the texture used in the last draw call replaces the old ones (which could be some common permanent sprite texture).
Results from debugging
I've tried using glFlush(), which didn't seem to do anything at all, not deleting the textures at all gives the correct behaviour, and also not drawing anything at all between deleting the textures and calling SwapBuffers() works.
Code
This is not what my code looks like, but this is what the relevant parts boil down to:
GLuint Tex1, Tex2, Tex3; // texture names are GLuint
glGenTextures(1, &Tex1);
glBindTexture(GL_TEXTURE_2D, Tex1);
// ... Fill Texture with data, set correct filtering etc.
glDrawElements(GL_TRIANGLES, ...); // Using Tex1
glGenTextures(1, &Tex2);
glBindTexture(GL_TEXTURE_2D, Tex2);
// ... Fill Texture with data, set correct filtering etc.
glDrawElements(GL_TRIANGLES, ...); // Using Tex2
// I delete some textures here.
glDeleteTextures(1, &Tex1);
glDeleteTextures(1, &Tex2);
// If I comment out this section, everything works correctly
// If I leave it in, this texture replaces Tex1 and Tex2, but
// the geometry is correct for each geometry batch.
glGenTextures(1, &Tex3);
glBindTexture(GL_TEXTURE_2D, Tex3);
// ... Fill Texture with data, set correct filtering etc.
glDrawElements(GL_TRIANGLES, ...); // Using Tex3
glDeleteTextures(1, &Tex3);
// ...
SwapBuffers();
I suspect this might have something to do with OpenGL buffering my draw calls, and by the time they are actually processed the textures are deleted? It doesn't really make sense to me though, why would drawing something else after deleting the previous textures cause this behaviour?
More context
The generated textures are text strings, that may or may not change each frame, right now I create new textures for each string each frame and then render the texture and discard it right after. The bitmap data is generated with Windows GDI.
I'm not really looking for advice on efficiency, ideally I want an answer that can quote the documentation on the expected/correct behaviour for rendering using temporary textures like this, as well as possible common gotchas with this approach.
The expected behavior is clear. You can delete the objects as soon as you are done using them. In your case, after you made the draw calls that use the textures, you can call glDeleteTextures() on those textures. No additional precautions are required from your side.
Under the hood, OpenGL will typically execute the draw calls asynchronously. So the texture will still be used after the draw call returns. But that's not your problem. The driver is responsible for tracking and managing the lifetime of objects to keep them around until they are not used anymore.
The clearest expression of this I found in the spec is on page 28 of the OpenGL 4.5 spec:
If an object is deleted while it is currently in use by a GL context, its name is immediately marked as unused, and some types of objects are automatically unbound from binding points in the current context, as described in section 5.1.2. However, the actual underlying object is not deleted until it is no longer in use.
In your code, this means that the driver can't delete the textures until the GPU has completed the draw calls using them.
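The lifetime rule quoted from the spec can be pictured with a small toy model in plain C++. This is only a sketch of the idea, not how any driver is actually implemented; ToyDriver, ToyTexture and all member names are made up. Queued draws hold their own reference to the texture object, so erasing the name does not destroy an object that pending work still needs.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy model (not OpenGL): deleting a name does not destroy an object that
// queued work still references. All names here are hypothetical.
struct ToyTexture { std::string pixels; };

struct ToyDriver {
    std::map<unsigned, std::shared_ptr<ToyTexture>> names;  // live name table
    std::vector<std::shared_ptr<ToyTexture>> queuedDraws;   // work not yet executed
    unsigned nextName = 1;

    unsigned createTexture(const std::string& pixels) {
        unsigned name = nextName++;
        names[name] = std::make_shared<ToyTexture>(ToyTexture{pixels});
        return name;
    }

    // The queued draw keeps its own reference to the object.
    void draw(unsigned name) { queuedDraws.push_back(names.at(name)); }

    // The name becomes unused at once; the object lives on while referenced.
    void deleteTexture(unsigned name) { names.erase(name); }

    // "The GPU catches up": execute queued work, then drop the references.
    std::vector<std::string> flush() {
        std::vector<std::string> drawn;
        for (const auto& tex : queuedDraws) drawn.push_back(tex->pixels);
        queuedDraws.clear();
        return drawn;
    }
};
```

Under this model, draw / delete / draw / delete followed by a flush still renders each draw with its own texture, which is the behaviour the spec text promises.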
Why that doesn't work in your case is hard to tell. One possibility is always that something in your code unintentionally deletes the texture earlier than it should. With complex software architectures, that happens much more easily than you might think. For example, a really popular cause is that people wrap OpenGL objects in C++ classes, and let those C++ objects go out of scope while the underlying OpenGL object is still in use.
So you should definitely double check (for example by using debug breakpoints or logging) that no code that deletes textures is invoked at unexpected times.
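One way to rule out the C++ wrapper pitfall mentioned above is to make the wrapper move-only, so a stray copy can never delete the texture behind your back. The following is a rough sketch under assumptions: TextureHandle is a hypothetical class, and the deleter is injected as a callback so the example runs without a GL context (real code would call glDeleteTextures there).

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical move-only owner for a texture name. The deleter is injected so
// the sketch runs without a GL context; real code would call glDeleteTextures.
class TextureHandle {
public:
    using Deleter = std::function<void(unsigned)>;

    TextureHandle(unsigned name, Deleter deleter)
        : name_(name), deleter_(std::move(deleter)) {}

    // Copying is disabled: with two owners, the first one to be destroyed
    // would delete a texture the other one still expects to use.
    TextureHandle(const TextureHandle&) = delete;
    TextureHandle& operator=(const TextureHandle&) = delete;

    // Moving transfers ownership and leaves the source empty (name 0).
    TextureHandle(TextureHandle&& other)
        : name_(other.name_), deleter_(std::move(other.deleter_)) {
        other.name_ = 0;
    }

    TextureHandle& operator=(TextureHandle&& other) {
        if (this != &other) {
            release();
            name_ = other.name_;
            deleter_ = std::move(other.deleter_);
            other.name_ = 0;
        }
        return *this;
    }

    ~TextureHandle() { release(); }

    unsigned name() const { return name_; }

private:
    // Deletes the owned texture, if any, exactly once.
    void release() {
        if (name_ != 0 && deleter_) deleter_(name_);
        name_ = 0;
    }

    unsigned name_ = 0;
    Deleter deleter_;
};
```

Logging inside the deleter callback is also a cheap way to see exactly when each texture is deleted.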
The other option is a driver bug. While object lifetime management is not entirely trivial, it is so critical that it's hard to imagine it being broken for a very basic case. But it's certainly possible, and more or less likely depending on vendor and platform.
As a workaround, you could try not deleting the texture objects, and only specifying new data (using glTexImage2D()) for the same objects instead. If the texture size does not change, it would probably be more efficient to only replace the data with glTexSubImage2D() anyway.

OpenGL texture terminology/conceptual confusion

I've found a lot of resources that tell you what to type to get a texture on screen, but would like a higher level conceptual understanding of what the OpenGL API is "doing" and what all of the differences in terminology "mean".
I'm going to do my best to explain what I've picked up, but would love any corrections/additions, or pointers to resources where I can read further (and just a note that I've found the documentation of the actual API calls to just reference themselves in circles and be conceptually lacking).
glGenTextures- this won't actually allocate any memory for the data of a texture on the graphics card (you just tell it "how many" textures you want it to generate, so it doesn't know anything about the size...), but instead sets kind of a "name" aside so you can reference given textures consistently (I've been thinking of it as kind of "allocating a pointer").
glBindTexture- use the "name" generated in glGenTextures to specify that "we're now talking about this texture for future API calls until further notice", and further, we're specifying some metadata about that "pointer" we've allocated saying whether the texture it points to (/will point to) is of type GL_TEXTURE_2D or ..._3D or whatever. (Is it just me, or is it weird that this call has those two seemingly totally different functionalities?)
glTexParameter- sets other specified metadata about the currently "bound" texture. (I like this API as it seems pretty self explanatory and lets you set metadata explicitly... but I wonder why letting OpenGL know that it's a GL_TEXTURE_2D isn't part of THIS call, and not the previous? Especially because you have to specify that it's a GL_TEXTURE_2D every time you call this anyways? And why do you have to do that?)
glTexImage2D- allocates the memory for the actual data for the texture on the graphics card (and optionally uploads it). It further specifies some metadata regarding how it ought be read: its width, height, formatting (GL_RGB, GL_RGBA, etc...). Now again, why do I again have to specify that it's a GL_TEXTURE_2D when I've done it in all the previous calls? Also, I guess I can understand why this includes some metadata (rather than offloading ALL the texture metadata calls to glTexParameter as these are pretty fundamental/non-optional bits of info, but there are also some weird parameters that seem like they oughtn't have made the cut? oh well...)
glActiveTexture- this is the bit that I really don't get... So I guess graphics cards are capable of having only a limited number of "texture units"... what is a texture unit? Is it that there can only be N texture buffers? Or only N texture pointers? Or (this is my best guess...) there can only be N pointers being actively read by a given draw call? And once I get that, where/how often do I have to specify the "Active Texture"? Does glBindTexture associate the bound texture with the currently active texture? Or is it the other way around (bind, then set active)? Or does uploading/allocating the graphics card memory do that?
sampler2D- now we're getting into glsl stuff... So, a sampler is a thing that can reference a texture from within a shader. I can get its location via glGetUniformLocation, so I can set which texture that sampler is referencing- does this correspond to the "Active Texture"? So if I want to talk about the texture I've specified as GL_TEXTURE0, I'd call glUniform1i(location_of_sampler_uniform,0)? Or are those two different things?
I think that's all I got... if I'm obviously missing some intuition or something, please let me know! Thanks!
Let me apologize for answering with what amounts to a giant wall of text. I could not figure out how to format this any less obnoxious way ;)
glGenTextures
this won't actually allocate any memory for the data of a texture on the graphics card (you just tell it "how many" textures you want it to generate, so it doesn't know anything about the size...), but instead sets kind of a "name" aside so you can reference given textures consistently (I've been thinking of it as kind of "allocating a pointer").
You can more or less think of it as "allocating a pointer." What it really does is reserve a name (handle) in the set of textures. Nothing is allocated at all at this point, basically it just flags GL to say "you can't hand out this name anymore." (more on this later).
glBindTexture
use the "name" generated in glGenTextures to specify that "we're now talking about this texture for future API calls until further notice", and further, we're specifying some metadata about that "pointer" we've allocated saying whether the texture it points to (/will point to) is of type GL_TEXTURE_2D or ..._3D or whatever. (Is it just me, or is it weird that this call has those two seemingly totally different functionalities?)
If you will recall, glGenTextures (...) only reserves a name. This function is what takes the reserved name and effectively finalizes it as a texture object (the first time it is called). The type you pass here is immutable, once you bind a name for the first time, it has to use the same type for every successive bind.
Now you have finally finished allocating a texture object, but it has no data store at this point -- it is just a set of states with no data.
glTexParameter
sets other specified metadata about the currently "bound" texture. (I like this API as it seems pretty self explanatory and lets you set metadata explicitly... but I wonder why letting OpenGL know that it's a GL_TEXTURE_2D isn't part of THIS call, and not the previous? Especially because you have to specify that it's a GL_TEXTURE_2D every time you call this anyways? And why do you have to do that?)
I am actually not quite clear what you are asking here -- maybe my explanation of the previous function call will help you? But you are right, this function sets the state associated with a texture object.
glTexImage2D
allocates the memory for the actual data for the texture on the graphics card (and optionally uploads it). It further specifies some metadata regarding how it ought be read: its width, height, formatting (GL_RGB, GL_RGBA, etc...). Now again, why do I again have to specify that it's a GL_TEXTURE_2D when I've done it in all the previous calls? Also, I guess I can understand why this includes some metadata (rather than offloading ALL the texture metadata calls to glTexParameter as these are pretty fundamental/non-optional bits of info, but there are also some weird parameters that seem like they oughtn't have made the cut? oh well...)
This is what allocates the data store and (optionally) uploads texture data (you can supply NULL for the data here and opt to finish the data upload later with glTexSubImage2D (...)).
You have to specify the texture target here because there are half a dozen different types of textures that use 2D data stores. The simplest way to illustrate this is a cubemap.
A cubemap has type GL_TEXTURE_CUBE_MAP, but you cannot upload its texture data using GL_TEXTURE_CUBE_MAP -- that is nonsensical. Instead, you call glTexImage2D (...) while the cubemap is bound to GL_TEXTURE_CUBE_MAP and then you pass something like GL_TEXTURE_CUBE_MAP_POSITIVE_X to indicate which of the 6 2D faces of the cubemap you are referencing.
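As a small illustration of the cubemap case: the six face targets have consecutive enum values, so a face index can be turned into an upload target by simple addition. The sketch below repeats the enum value from the OpenGL headers so that it compiles without them; cubeFaceTarget is a made-up helper name, and the upload loop is shown only as a comment because it needs a live context.

```cpp
#include <cassert>

// Value as defined in the OpenGL headers (GL_TEXTURE_CUBE_MAP_POSITIVE_X),
// repeated here so the sketch compiles without them.
constexpr unsigned kCubeMapPositiveX = 0x8515;

// The six face targets are consecutive: +X, -X, +Y, -Y, +Z, -Z.
// cubeFaceTarget is a hypothetical helper, valid for face indices 0..5.
constexpr unsigned cubeFaceTarget(unsigned face) {
    return kCubeMapPositiveX + face;
}

// With a real context, the upload would look roughly like this (not compiled
// here; tex, size and facePixels are placeholders):
//   glBindTexture(GL_TEXTURE_CUBE_MAP, tex);
//   for (unsigned i = 0; i < 6; ++i)
//       glTexImage2D(cubeFaceTarget(i), 0, GL_RGB8, size, size, 0,
//                    GL_RGB, GL_UNSIGNED_BYTE, facePixels[i]);
```

Note that the bind uses GL_TEXTURE_CUBE_MAP while each upload names one specific face, which is exactly why glTexImage2D takes its own target argument.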
glActiveTexture
this is the bit that I really don't get... So I guess graphics cards are capable of having only a limited number of "texture units"... what is a texture unit? Is it that there can only be N texture buffers? Or only N texture pointers? Or (this is my best guess...) there can only be N pointers being actively read by a given draw call? And once I get that, where/how often do I have to specify the "Active Texture"? Does glBindTexture associate the bound texture with the currently active texture? Or is it the other way around (bind, then set active)? Or does uploading/allocating the graphics card memory do that?
This is an additional level of indirection for texture binding (GL did not always have multiple texture units and you would have to do multiple render passes to apply multiple textures).
Once multi-texturing was introduced, binding a texture actually started to work this way:
glBindTexture (target, name) => ATIU.targets [target].bound = name
Where:
* ATIU is the active texture image unit
* targets is an array of all possible texture types that can be bound to this unit
* bound is the name of the texture bound to ATIU.targets [target]
The rules since OpenGL 3.0 have been, you get a minimum of 16 of these for every shader stage in the system.
This requirement allows you enough binding locations to maintain a set of 16 different textures for each stage of the programmable pipeline (vertex,geometry,fragment -- 3.x, tessellation control / evaluation -- 4.0). Most implementations can only use 16 textures in a single shader invocation (pass, basically), but you have a total of 48 (GL3) or 80 (GL4) places you can select from.
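The binding rule "glBindTexture (target, name) => ATIU.targets [target].bound = name" can be sketched as a toy model in plain C++. Nothing here is OpenGL: ToyContext, Target and the unit count of 16 are illustrative assumptions. The point is only that glActiveTexture selects a unit, and glBindTexture then writes into that unit's slot for the given target.

```cpp
#include <array>
#include <cassert>
#include <map>

// Toy model (not OpenGL) of:
//   glBindTexture(target, name) => ATIU.targets[target].bound = name
// All names and the unit count are hypothetical.
enum class Target { Tex2D, Tex3D, CubeMap };

constexpr int kToyUnitCount = 16;

struct ToyContext {
    std::array<std::map<Target, unsigned>, kToyUnitCount> units;  // per-unit bindings
    int active = 0;                                               // the "ATIU"

    // Like glActiveTexture: only selects which unit later binds affect.
    void activeTexture(int unit) { active = unit; }

    // Like glBindTexture: writes into the active unit's slot for this target.
    void bindTexture(Target target, unsigned name) { units[active][target] = name; }

    // Which name is bound to (unit, target)? 0 means nothing was bound.
    unsigned boundAt(int unit, Target target) const {
        auto it = units[unit].find(target);
        return it == units[unit].end() ? 0u : it->second;
    }
};
```

So the order is "set active, then bind": binding texture 5 on unit 0 and texture 9 on unit 1 leaves both bindings in place, and a sampler uniform set to 1 would read through unit 1.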
sampler2D
now we're getting into glsl stuff... So, a sampler is a thing that can reference a texture from within a shader. I can get its location via glGetUniformLocation, so I can set which texture that sampler is referencing- does this correspond to the "Active Texture"? So if I want to talk about the texture I've specified as GL_TEXTURE0, I'd call glUniform1i(location_of_sampler_uniform,0)? Or are those two different things?
Yes, the samplers in GLSL store indices that correspond to GL_TEXTUREn, where n is the value you have assigned to this uniform.
These are not regular uniforms, mind you, they are called opaque types (the value assigned cannot be changed/assigned from within a shader at run-time). You do not need to know that, but it might help to understand that if the question ever arises:
"Why can't I dynamically select a texture image unit for my sampler at run-time?" :)
In OpenGL 3.3 and later, samplers also exist as state objects of their own. They decouple some of the state that used to be tied directly to texture objects but had nothing to do with interpreting how the texture's data was stored. The decoupled state includes things like texture wrap mode, min/mag filter, and LOD range and bias. Sampler objects store no data.
This decoupling takes place whenever you bind a sampler object to a texture image unit - that will override the aforementioned states that are duplicated by every texture object.
So effectively, a GLSL sampler* references neither a texture nor a sampler; it references a texture image unit (which may have one or both of those things bound to it). GLSL will pull sampler state and texture data accordingly from that unit based on the declared sampler type.
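The override rule can be reduced to a few lines of plain C++. This is a toy model with made-up types (ToyTextureObj, ToySamplerObj, ToyUnit), and it tracks a single "filter" string as a stand-in for all sampling state: the data always comes from the texture bound to the unit, while the sampling state comes from the sampler object if one is bound there, and from the texture otherwise.

```cpp
#include <cassert>
#include <string>

// Toy model (not OpenGL): what a GLSL sampler sees through one texture image
// unit. All type and member names are hypothetical.
struct ToyTextureObj {
    std::string pixels;  // the data store
    std::string filter;  // the texture's own copy of the sampling state
};

struct ToySamplerObj {
    std::string filter;  // sampling state only, no data
};

struct ToyUnit {
    const ToyTextureObj* texture = nullptr;
    const ToySamplerObj* sampler = nullptr;  // optional
};

// Sampling state comes from the sampler object if one is bound to the unit,
// otherwise from the texture object itself.
std::string effectiveFilter(const ToyUnit& unit) {
    if (unit.sampler) return unit.sampler->filter;
    return unit.texture ? unit.texture->filter : std::string();
}
```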

OpenGL: Copying a Frame Buffer Object

I need to gather information about the target used in a texture attachment of an FBO in order to copy it to another FBO.
As far as OpenGL ES 2.0 is concerned, I can use glGetFramebufferAttachmentParameter[if]v() and, since OpenGL ES 2.0 only supports GL_TEXTURE_2D and GL_TEXTURE_CUBE_MAP, the information returned is enough to determine the texture target that was used (when it's not a cube map face, it is a GL_TEXTURE_2D since it can't be anything else).
On the desktop, however, things change:
Because then we have GL_TEXTURE_1D, GL_TEXTURE_2D, GL_TEXTURE_2D_MULTISAMPLE, GL_TEXTURE_RECTANGLE, GL_TEXTURE_3D, and the 6 cube map faces as valid targets for an FBO's texture attachment, and while the 6 cube map faces and GL_TEXTURE_3D targets are easy to tell (since there are specific queries for cube map faces and layered textures), the same does not apply to the remaining targets: GL_TEXTURE_1D, GL_TEXTURE_2D, GL_TEXTURE_2D_MULTISAMPLE, and GL_TEXTURE_RECTANGLE, at least as far as the manual pages are concerned. Therefore, how would I be able to tell which of these 4 targets was used in a texture attachment?
The need to copy an FBO stems from the fact that FBOs are not shared between contexts, the implementation creates screen FBOs in the main thread, and I want to use them in child threads dedicated to each screen so as to not stall the main thread with render loops and thus keep the application responsive to UI events. Caching state is both undesirable and unfeasible in this case; undesirable because it cuts through otherwise distinct concerns of the application when the client library (whose only concern is to serve as a communication API between the application and the OpenGL server) is in a much better position to cache state itself, and unfeasible since in this case I don't even control some of the concerns in my application, as mentioned before.
Right now this is a theoretical question, because the implementation I'm working on only supports OpenGL ES 2.0, but I would rather write future-proof code where I can be certain about the exact texture target used as an FBO attachment than code that works only because the number of available options is limited to the point where I can figure out which option was chosen by excluding those that weren't, an approach that, as demonstrated above, wouldn't work on the feature-rich desktop versions and may not work on future OpenGL ES versions.
OpenGL has no solution for the problem you're having. There is no way to look at a texture object and know what target it is, nor is there a way to know what the textarget parameter of a texture that was attached to an FBO was. Generally speaking, you are expected to keep track of the texture object's target, just as you're expected to keep track of the texture object's name (the GLuint you get back from glGenTextures).
The best way to handle this would be to simply ask the client library what textures and texture targets it adds to its FBO. If you can't get this client library to provide you this information, then you can't do what you need to do.
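If you do control the code that creates the textures, the "keep track of it yourself" advice can be as simple as a lookup table. This is a minimal sketch, assuming a hypothetical TextureTargetRegistry class; in real code record() would be called right after the first glBindTexture for a name, and forget() next to glDeleteTextures.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical bookkeeping: remember the target each texture name was first
// bound to, following the answer's advice to track it on the application side.
class TextureTargetRegistry {
public:
    // Call when a name is bound for the first time.
    void record(unsigned name, unsigned target) { targets_[name] = target; }

    // Call when the texture is deleted.
    void forget(unsigned name) { targets_.erase(name); }

    // Returns true and writes the target if the name is known.
    bool lookup(unsigned name, unsigned& targetOut) const {
        auto it = targets_.find(name);
        if (it == targets_.end()) return false;
        targetOut = it->second;
        return true;
    }

private:
    std::unordered_map<unsigned, unsigned> targets_;
};
```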

How to render offscreen on OpenGL?

My aim is to render an OpenGL scene without a window, directly into a file. The scene may be larger than my screen resolution.
How can I do this?
I want to be able to set the render area to any size, for example 10000x10000, if possible.
It all starts with glReadPixels, which you will use to transfer the pixels stored in a specific buffer on the GPU to the main memory (RAM). As you will notice in the documentation, there is no argument to choose which buffer. As is usual with OpenGL, the current buffer to read from is a state, which you can set with glReadBuffer.
So a very basic offscreen rendering method would be something like the following. I use C++ pseudo code so it will likely contain errors, but should make the general flow clear:
//Before swapping
std::vector<std::uint8_t> data(width*height*4);
glReadBuffer(GL_BACK);
glReadPixels(0,0,width,height,GL_BGRA,GL_UNSIGNED_BYTE,&data[0]);
This will read the current back buffer (usually the buffer you're drawing to). You should call this before swapping the buffers. Note that you can also perfectly read the back buffer with the above method, clear it and draw something totally different before swapping it. Technically you can also read the front buffer, but this is often discouraged as theoretically implementations were allowed to make some optimizations that might make your front buffer contain rubbish.
There are a few drawbacks with this. First of all, we don't really do offscreen rendering, do we? We render to the screen buffers and read from those. We can emulate offscreen rendering by never swapping in the back buffer, but it doesn't feel right. Next to that, the front and back buffers are optimized to display pixels, not to read them back. That's where Framebuffer Objects come into play.
Essentially, an FBO lets you create a non-default framebuffer (as opposed to the default FRONT and BACK buffers) that allows you to draw to a memory buffer instead of the screen buffers. In practice, you can either draw to a texture or to a renderbuffer. The first is optimal when you want to re-use the pixels in OpenGL itself as a texture (e.g. a naive "security camera" in a game), the latter if you just want to render/read-back. With this the code above would become something like this, again pseudo-code, so don't kill me if I mistyped or forgot some statements.
//Somewhere at initialization
GLuint fbo, render_buf;
glGenFramebuffers(1,&fbo);
glGenRenderbuffers(1,&render_buf);
glBindRenderbuffer(GL_RENDERBUFFER, render_buf);
glRenderbufferStorage(GL_RENDERBUFFER, GL_RGBA8, width, height); // sized internal format; BGRA ordering is chosen at read-back
glBindFramebuffer(GL_DRAW_FRAMEBUFFER,fbo);
glFramebufferRenderbuffer(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER, render_buf);
//At deinit:
glDeleteFramebuffers(1,&fbo);
glDeleteRenderbuffers(1,&render_buf);
//Before drawing
glBindFramebuffer(GL_DRAW_FRAMEBUFFER,fbo);
//after drawing
std::vector<std::uint8_t> data(width*height*4);
glBindFramebuffer(GL_READ_FRAMEBUFFER,fbo); // glReadPixels reads from the read framebuffer
glReadBuffer(GL_COLOR_ATTACHMENT0);
glReadPixels(0,0,width,height,GL_BGRA,GL_UNSIGNED_BYTE,&data[0]);
// Return to onscreen rendering:
glBindFramebuffer(GL_FRAMEBUFFER,0); // resets both the draw and the read binding
This is a simple example, in reality you likely also want storage for the depth (and stencil) buffer. You also might want to render to texture, but I'll leave that as an exercise. In any case, you will now perform real offscreen rendering and it might work faster then reading the back buffer.
Finally, you can use pixel buffer objects to make glReadPixels asynchronous. The problem is that glReadPixels blocks until the pixel data is completely transferred, which may stall your CPU. With PBOs the implementation may return immediately, as it controls the buffer anyway. It is only when you map the buffer that the pipeline will block. However, PBOs may be optimized to keep the data solely in RAM, so this block could take a lot less time. The read pixels code would become something like this:
//Init:
GLuint pbo;
glGenBuffers(1,&pbo);
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
glBufferData(GL_PIXEL_PACK_BUFFER, width*height*4, NULL, GL_DYNAMIC_READ);
//Deinit:
glDeleteBuffers(1,&pbo);
//Reading:
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
glReadPixels(0,0,width,height,GL_BGRA,GL_UNSIGNED_BYTE,0); // 0 instead of a pointer, it is now an offset in the buffer.
//DO SOME OTHER STUFF (otherwise this is a waste of your time)
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo); //Might not be necessary...
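pixel_data = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
//... use pixel_data ...
glUnmapBuffer(GL_PIXEL_PACK_BUFFER); // unmap once you are done reading
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0); // unbind, otherwise later glReadPixels calls still write into the PBO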
The part in caps is essential. If you just issue a glReadPixels to a PBO, followed by a glMapBuffer of that PBO, you gained nothing but a lot of code. Sure the glReadPixels might return immediately, but now the glMapBuffer will stall because it has to safely map the data from the read buffer to the PBO and to a block of memory in main RAM.
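A common way to get that "other stuff" for free is to rotate between two or more PBOs: each frame you start a read into one buffer and map the buffer that was filled on an earlier frame. The sketch below shows only the index bookkeeping, with no GL calls. PboSlots and pbo_slots are hypothetical names, and the rotation scheme is an addition to the code above, not part of it.

```cpp
#include <cassert>
#include <cstddef>

// Which PBO to use for what on a given frame when `count` buffers rotate.
struct PboSlots {
    std::size_t read_into;  // bind this PBO before calling glReadPixels
    std::size_t map_from;   // map this PBO; it holds the oldest pending read
    bool map_valid;         // false until map_from has actually been filled once
};

inline PboSlots pbo_slots(std::size_t frame, std::size_t count) {
    assert(count >= 2);  // with a single buffer there is nothing to overlap
    PboSlots s;
    s.read_into = frame % count;
    s.map_from = (frame + 1) % count;   // filled count - 1 frames ago
    s.map_valid = frame + 1 >= count;   // during warm-up nothing is ready yet
    return s;
}
```

With two buffers, frame 0 reads into PBO 0 and maps nothing, frame 1 reads into PBO 1 and maps PBO 0, and so on. The price is that the pixels you get are one frame old.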
Please also note that I use GL_BGRA everywhere. This is because many graphics cards internally use this as the optimal rendering format (or the GL_BGR version without alpha). It should be the fastest format for pixel transfers like this. I'll try to find the nvidia article I read about this a few months back.
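If whatever consumes the pixels expects RGBA order instead, the swap can be done on the CPU after the read-back. A minimal sketch, assuming tightly packed 8-bit BGRA data such as a width*height*4 byte vector; bgra_to_rgba is a hypothetical helper, not an OpenGL function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Convert BGRA to RGBA in place by swapping the blue and red byte of every pixel.
// Assumption: 4 bytes per pixel, no padding between pixels or rows.
inline void bgra_to_rgba(std::vector<std::uint8_t>& pixels) {
    assert(pixels.size() % 4 == 0);
    for (std::size_t i = 0; i + 3 < pixels.size(); i += 4) {
        std::swap(pixels[i], pixels[i + 2]);  // B <-> R; G and A stay where they are
    }
}
```

Applying the function twice gives back the original data, so the same helper also converts RGBA to BGRA.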
When using OpenGL ES 2.0, GL_DRAW_FRAMEBUFFER might not be available; just use GL_FRAMEBUFFER in that case.
I'll assume that creating a dummy window (you don't render to it; it's just there because the API requires you to make one) that you create your main context into is an acceptable implementation strategy.
Here are your options:
Pixel buffers
A pixel buffer, or pbuffer (which isn't a pixel buffer object), is first and foremost an OpenGL context. Basically, you create a window as normal, then pick a pixel format from wglChoosePixelFormatARB (pbuffer formats must be gotten from here). Then, you call wglCreatePbufferARB, giving it your window's HDC and the pixel buffer format you want to use. Oh, and a width/height; you can query the implementation's maximum width/heights.
The default framebuffer for pbuffer is not visible on the screen, and the max width/height is whatever the hardware wants to let you use. So you can render to it and use glReadPixels to read back from it.
You'll need to share your context with the given context if you have created objects in the window context. Otherwise, you can use the pbuffer context entirely separately. Just don't destroy the window context.
The advantage here is greater implementation support (though most drivers that don't support the alternatives are either old drivers for hardware that's no longer supported, or Intel drivers).
The downsides are these. Pbuffers don't work with core OpenGL contexts. They may work for compatibility, but there is no way to give wglCreatePbufferARB information about OpenGL versions and profiles.
Framebuffer Objects
Framebuffer Objects are more "proper" offscreen rendertargets than pbuffers. FBOs are within a context, while pbuffers are about creating new contexts.
FBOs are just a container for images that you render to. The maximum dimensions that the implementation allows can be queried; you can assume it to be GL_MAX_VIEWPORT_DIMS (make sure an FBO is bound before checking this, as it changes based on whether an FBO is bound).
Since you're not sampling textures from these (you're just reading values back), you should use renderbuffers instead of textures. Their maximum size may be larger than that of textures.
The upside is the ease of use. Rather than have to deal with pixel formats and such, you just pick an appropriate image format for your glRenderbufferStorage call.
The only real downside is the narrower band of hardware that supports them. In general, anything that AMD or NVIDIA makes that they still support (right now, GeForce 6xxx or better [note the number of x's], and any Radeon HD card) will have access to ARB_framebuffer_object or OpenGL 3.0+ (where it's a core feature). Older drivers may only have EXT_framebuffer_object support (which has a few differences). Intel hardware is potluck; even if they claim 3.x or 4.x support, it may still fail due to driver bugs.
If you need to render something that exceeds the maximum FBO size of your GL implementation, libtr works pretty well:
The TR (Tile Rendering) library is an OpenGL utility library for doing tiled rendering. Tiled rendering is a technique for generating large images in pieces (tiles).
TR is memory efficient; arbitrarily large image files may be generated without allocating a full-sized image buffer in main memory.
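To make the idea concrete, here is a sketch of how a large image can be cut into tiles that each fit within a size limit. This is not TR's API: Tile and make_tiles are hypothetical names, and a real tiled renderer additionally has to adjust the projection for every tile so that the pieces line up.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One rectangular piece of the final image, in pixels.
struct Tile {
    int x, y;           // lower-left corner inside the full image
    int width, height;  // size of this tile (edge tiles may be smaller)
};

// Split an image_w x image_h image into tiles no larger than max_tile_w x max_tile_h.
inline std::vector<Tile> make_tiles(int image_w, int image_h,
                                    int max_tile_w, int max_tile_h) {
    assert(image_w > 0 && image_h > 0 && max_tile_w > 0 && max_tile_h > 0);
    std::vector<Tile> tiles;
    for (int y = 0; y < image_h; y += max_tile_h) {
        for (int x = 0; x < image_w; x += max_tile_w) {
            // Clamp the last column and row to whatever is left of the image.
            tiles.push_back({x, y,
                             std::min(max_tile_w, image_w - x),
                             std::min(max_tile_h, image_h - y)});
        }
    }
    return tiles;
}
```

Each tile would then be rendered into an FBO of at most the maximum size, read back, and copied into its place in the output file.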
The easiest way is to use something called Frame Buffer Objects (FBO). You will still have to create a window to create an OpenGL context though (but this window can be hidden).
The easiest way to fulfill your goal is to use an FBO for off-screen rendering. You don't need to render to a texture and then fetch the texture image; just render to a buffer and read it back with glReadPixels. This link will be useful: see Framebuffer Object Examples.