I'm working with OpenGL Frame Buffer Objects. I have created a Frame Buffer Object with 2 color textures and a depth texture.
I'm using
glBindFramebuffer(GL_READ_FRAMEBUFFER, ID);
To bind my framebuffer, but on console i'm getting this warning
Redundant State change in glBindFramebuffer call, FBO 1 already bound
How can I check which of my framebuffers is already bound? I mean which OpenGL function allows me to check the ID of the already bound framebuffer so that I can prevent redundant binding.
Hold your horses... Yes, you can get the currently bound draw and read FBOs with:
GLint drawFboId = 0, readFboId = 0;
glGetIntegerv(GL_DRAW_FRAMEBUFFER_BINDING, &drawFboId);
glGetIntegerv(GL_READ_FRAMEBUFFER_BINDING, &readFboId);
and for backwards compatibility, GL_FRAMEBUFFER_BINDING is equivalent to GL_DRAW_FRAMEBUFFER_BINDING:
glGetIntegerv(GL_FRAMEBUFFER_BINDING, &drawFboId);
But for the scenario you describe, you most likely do not want to use this. The diagnostic message tells you that you're making a redundant state change. But querying the current state to compare it with your new value is most likely much worse.
glGet*() calls can cause a certain level of synchronization, and be fairly harmful to performance. They should generally be avoided in performance critical parts of your code.
You have two options that are both likely to be better than what you were planning to do:
Ignore the diagnostic message. The driver will probably detect the redundant change, and avoid unnecessary work anyway. And it can do that much more efficiently than a solution that involves the app making glGet*() calls.
Keep track of the most recently bound FBO in your own code, so that you can filter out redundant changes without using any glGet*() calls.
In any case, what you had in mind would be like the proverbial "putting out fire with gasoline".
It's simply glGetIntegerv(GL_FRAMEBUFFER_BINDING, &result);
Related
I'm facing a problem where the use of an occlusion query in combination with instanced rendering would be desirable.
As far as I understood, something like
glBeginQuery(GL_ANY_SAMPLES_PASSED, occlusionQuery);
glDrawArraysInstanced(mode, i, j, countInstances);
glEndQuery(GL_ANY_SAMPLES_PASSED);
will only tell me, if any of the instances were drawn.
What I would need to know is, what set of instances has been drawn (giving me the IDs of all visible instances). Drawing each instance in an own call is no option for me.
An alternative would be to color-code the instances and detect the visible instances manually.
But is there really no way to solve this problem with a query command and why would it not be possible?
It's not possible for several reasons.
Query objects only contain a single counter value. What you want would require a separate sample passed count for each instance.
Even if query objects stored arrays of sample counts, you can issue more than one draw call in the begin/end scope of a query. So how would OpenGL know which part of which draw call belonged to which query value in the array? You can even change other state within the query scope; uniform bindings, programs, pretty much anything.
The samples-passed count is determined entirely by the rasterizer hardware on the GPU. And the rasterizer neither knows nor cares which instance generated a triangle.
Instancing is a function of the vertex processing and/or vertex specification stages; by the time the rasterizer sees it, that information is gone. Notice that fragment shaders don't even get an instance ID as input, unless you explicitly create one by passing it from your vertex processing stage(s).
However, if you truly want to do this you could use image load/store and its atomic operations. That is, pass the fragment shader the instance in question (as an int data type, with flat interpolation). This FS also uses a uimageBuffer buffer texture, which uses the GL_R32UI format (or you can use an SSBO unbounded array). It then performs an imageAtomicAdd, using the instance value passed in as the index to the buffer. Oh, and you'll need to have the FS explicitly require early tests, so that samples which fail the fragment tests will not execute.
Then use a compute shader to build up a list of rendering commands for the instances which have non-zero values in the array. Then use an indirect rendering call to draw the results of this computation. Now obviously, you will need to properly synchronize access between these various operations. So you'll need to use appropriate glMemoryBarrier calls between each one.
Even if queries worked the way you want them to, this would be overall far more preferable than using a query object. Unless you're reading a query into a buffer object, reading a query object requires a GPU/CPU synchronization of some form. Whereas the above requires some synchronization and barrier operations, but they're all on-GPU operations, rather than synchronizing with the CPU.
I have a need to stream a texture (essentially a camera feed).
With object streaming, the following scenarios seem to be arise:
Is the new object's data store larger, smaller or same size as the old one?
Subset of or whole texture being updated?
Are we streaming a buffer object or texture object (any difference?)
Here are the following approaches I have come across:
Allocate object data store (either BufferData for buffers or TexImage2D for textures) and then each frame, update subset of data with BufferSubData or TexSubImage2D
Nullify/invalidate the object after the last call (eg. draw) that uses the object either with:
Nullify: glTexSubImage2D( ..., NULL), glBufferSubData( ..., NULL)
Invalidate: glBufferInvalidate(), glMapBufferRange with the GL_MAP_INVALIDATE_BUFFER_BIT, glDeleteTextures ?
Simpliy reinvoke BufferData or TexImage2D with the new data
Manually implement object multi-buffering / buffer ping-ponging.
Most immediately, my problem scenario is: entire texture being replaced with new one of same size. How do I implement this? Will (1) implicitly synchronize ? Does (2) avoid the synchronization? Will (3) synchronize or will a new data store for the object be allocated, where our update can be uploaded without waiting for all drawing using the old object state to finish? This passage from the Red Book V4.3 makes be believe so:
Data can also be copied between buffer objects using the
glCopyBufferSubData() function. Rather than assembling chunks of data
in one large buffer object using glBufferSubData(), it is possible to
upload the data into separate buffers using glBufferData() and then
copy from those buffers into the larger buffer using
glCopyBufferSubData(). Depending on the OpenGL implementation, it may
be able to overlap these copies because each time you call
glBufferData() on a buffer object, it invalidates whatever contents
may have been there before. Therefore, OpenGL can sometimes just
allocate a whole new data store for your data, even though a copy
operation from the previous store has not completed yet. It will then
release the old storage at a later opportunity.
But if so, why the need for (2)[nullify/invalidates]?
Also, please discuss the above approaches, and others, and their effectiveness for the various scenarios, while keeping in mind atleast the following issues:
Whether implicit synchronization to object (ie. synchronizing our update with OpenGL's usage) occurs
Memory usage
Speed
I've read http://www.opengl.org/wiki/Buffer_Object_Streaming but it doesn't offer conclusive information.
Let me try to answer at least a few of the questions you raised.
The scenarios you talk about can have a great impact on the performance on the different approaches, especially when considering the first point about the dynamic size of the buffer. In your scenario of video streaming, the size will rarely change, so a more expensive "re-configuration" of the data structures you use might be possible. If the size changes every frame or every few frames, this is typically not feasable. However, if a resonable maximum size limit can be enforced, just using buffers/textures with the maximum size might be a good strategy. Neither with buffers nor with textures you have to use all the space there is (although there are some smaller issues when you do this with texures, like wrap modes).
3.Are we streaming a buffer object or texture object (any difference?)
Well, the only way to efficiently stream image data to or from the GL is to use pixel buffer objects (PBOs). So you always have to deal with buffer objects in the first place, no matter if vertex data, image data or whatever data is to be tranfered. The buffer is just the source for some glTex*Image() call in the texture case, and of course you'll need a texture object for that.
Let's come to your approaches:
In approach (1), you use the "Sub" variant of the update commands. In that case, (parts of or the whole) storage of the existing object is updated. This is likely to trigger an implicit synchronziation ifold data is still in use. The GL has basically only two options: wait for all operations (potentially) depending on that data to complete, or make an intermediate copy of the new data and let the client go on. Both options are not good from a performance point of view.
In approach (2), you have some misconception. The "Sub" variants of the update commands will never invalidate/orphan your buffers. The "non-sub" glBufferData() will create a completely new storage for the object, and using it with NULL as data pointer will leave that storage unintialized. Internally, the GL implementation might re-use some memory which was in use for earlier buffer storage. So if you do this scheme, there is some probablity that you effectively end up using a ring-buffer of the same memory areas if you always use the same buffer size.
The other methods for invalidation you mentiond allow you to also invalidate parts of the buffer and also a more fine-grained control of what is happening.
Approach (3) is basically the same as (2) with the glBufferData() oprhaning, but you just specify the new data directly at this stage.
Approach (4) is the one I actually would recommend, as it is the one which gives the application the most control over what is happening, without having to relies on the GL implementation's specific internal workings.
Without taking synchronization into account, the "sub" variant of the update commands is
more efficient, even if the whole data storage is to be changed, not just some part. That is because the "non-sub" variants of the commands basically recreate the storage and introduce some overhead with this. With manually managing the ring buffers, you can avoid any of that overhead, and you don't have to rely in the GL to be clever, by just using the "sub" variants of the updates functions. At the same time, you can avoid implicit synchroniztion by only updating buffers which aren't in use by th GL any more. This scheme can also nicely be extenden into a multi-threaded scenario. You can have one (or several) extra threads with separate (but shared) GL contexts to fill the buffers for you, and just passing the buffer handlings to the draw thread as soon as the update is complete. You can also just map the buffers in the draw thread and let the be filled by worker threads (wihtout the need for additional GL contexts at all).
OpenGL 4.4 introduced GL_ARB_buffer_storage and with it came the GL_MAP_PERSISTEN_BIT for glMapBufferRange. That will allow you to keep all of the buffers mapped while they are used by the GL - so it allows you to avoid the overhead of mapping the buffers into the address space again and again. You then will have no implicit synchronzation at all - but you have to synchronize the operations manually. OpenGL's synchronization objects (see GL_ARB_sync) might help you with that, but the main burden on synchronization is on your applications logic itself. When streaming videos to the GL, just avoid re-using the buffer which was the source for the glTexSubImage() call immediately and try to delay its re-use as long as possible. You are of course also trading throughput for latency. If you need to minimize latency, you might to have to tweak this logic a bit.
Comparing the approaches for "memory usage" is really hard. There are a lot of of implementation specific details to consider here. A GL implementation might keep some old buffer memories around for some time to fullfill recreation requests of the same size. Also, an GL implementation might make shadow copies of any data at any time. The approaches which don't orphan and recreate storages all the time in principle expose more control of the memory which is in use.
"Speed" itself is also not a very useful metric. You basically have to balance throughput and latency here, according to the requirements of your application.
Following along with the tutorials here to get an introduction to OpenGL 3.3, I understand that vertex and index buffers need to be bound with glBindBuffer() in order to issue commands to them. There is a mention that it is possible to unbind buffers by passing a handle of 0 to glBindBuffer(), which seems like a good idea to prevent accidentally using an incorrect buffer when you are done using it. Is there any reason to not always unbind vertex and index buffers after issuing setup or draw calls?
Due to OpenGL's architecture as a state machine, such topics are often raised - and there is no definitive answer. The buffer bindings do influence the operations of various other GL commands, depending on the binding target.
In some cases, object 0 represenets some "no buffer/default object" case. For example, with GL_PIXEL_UNPACK_BUFFER, using 0 will allow you to transfer pixel data directly from client memory to the GL - so if you want to do that, you absolutely need to unbind any bound PBO at some point. If you do it as early as possible or as late as possible is up to you as the programmer, and depends heavily on the architecture of your software. The general rule should be to avoid unnecessary state changes, and that includs buffer (un)bind operations. But following this route often leads to situations where some state "leaks" out of the scope it really was meant to - and such stuff can be annoying to debug.
In other cases, 0 is not a valid state to do anything. For example, modern GL requires you to use VBOs. There is not really a need to ever have 0 bound as GL_ARRAY_BUFFER, as the only use case for that - specifying some attrib pointer to client memory - is gone. So unbinding a VBO is always a waste of time - the next time you set an attrib pointer, you have to bind some VBO anyways, and if you don't set an attrib pointer, that binding target is completely irrelevant.
I am currently writing an OpenGL program that creates a vertex buffer, uses it exactly once to draw a batch of triangles.
How long do I have to keep it around. Right now I just keep it until the next drawing batch gets started but I'm not sure if this is safe. The documentation in glDeleteBuffers is a bit unclear.
From the looks of it, unlike shaders, it implies that the buffer is deleted immediately.
Does this also happen when the buffer is currently used for rendering or does it delay the actual deletion.
So, what's the safest way to do this without accumulating too many buffers?
You can unbind and delete the buffer object right after the glDraw… call. It's a specified OpenGL requirement, that after a implementation keeps track of all internal references for as long as required and then cleans up internally. This holds not only for glDelete… but to every OpenGL call that modifies data.
The documentation indicates that this "allocates" storage for a texture and its levels. The pseudocode provided seems to indicate that this is for the mipmap levels.
How does usage of glTexStorage relate to glGenerateMipmap? glTexStorage seems to "lock" a texture's storage size. It seems to me this would only serve to make things less flexible. Are there meant to be performance gains to be had here?
It's pretty new and only available in 4.2 so I'm going to try to avoid using it, but I'm confused because its description makes it sound kind of important.
How is storage for textures managed in earlier GL versions? When i call glTexImage2D I effectively erase and free the storage previously associated with the texture handle, yes? and generating mipmaps also automatically handles storage for me as well.
I remember using the old-school glTexSubImage2D method to implement OpenGL 2-style render-to-texture to do some post-process effects in my previous engine experiment. It makes sense that glTexStorage will bring about a more sensible way of managing texture-related resources, now that we have better ways to do RTT.
To understand what glTexStorage does, you need to understand what the glTexImage* functions do.
glTexImage2D does three things:
It allocates OpenGL storage for a specific mipmap layer, with a specific size. For example, you could allocate a 64x64 2D image as mipmap level 2.
It sets the internal format for the mipmap level.
It uploads pixel data to the texture. The last step is optional; if you pass NULL as the pointer value (and no buffer object is bound to GL_PIXEL_UNPACK_BUFFER), then no pixel transfer takes place.
Creating a mipmapped texture by hand requires a sequence of glTexImage calls, one for each mipmap level. Each of the sizes of the mipmap levels needs to be the proper size based on the previous level's size.
Now, if you look at section 3.9.14 of the GL 4.2 specification, you will see two pages of rules that a texture object must follow to be "complete". A texture object that is incomplete cannot be accessed from.
Among those rules are things like, "mipmaps must have the appropriate size". Take the example I gave above: a 64x64 2D image, which is mipmap level 2. It would be perfectly valid OpenGL code to allocate a mipmap level 1 that used a 256x256 texture. Or a 16x16. Or a 10x345. All of those would be perfectly functional as far as source code is concerned. Obviously they would produce nonsense as a texture, since the texture would be incomplete.
Again consider the 64x64 mipmap 2. I create that as my first image. Now, I could create a 128x128 mipmap 1. But I could also create a 128x129 mipmap 1. Both of these are completely consistent with the 64x64 mipmap level 2 (mipmap sizes always round down). While they are both consistent, they're also both different sizes. If a driver has to allocate the full mipmap chain at once (which is entirely possible), which size does it allocate? It doesn't know. It can't know until you explicitly allocate the rest.
Here's another problem. Let's say I have a texture with a full mipmap chain. It is completely texture complete, according to the rules. And then I call glTexImage2D on it again. Now what? I could accidentally change the internal format. Each mipmap level has a separate internal format; if they don't all agree, then the texture is incomplete. I could accidentally change the size of the texture, again making the texture incomplete.
glTexStorage prevents all of these possible errors. It creates all the mipmaps you want up-front, given the base level's size. It allocates all of those mipmaps with the same image format, so you can't screw that up. It makes the texture immutable, so you can't come along and try to break the texture with a bad glTexImage2D call. And it prevents other errors I didn't even bother to cover.
The question isn't "what does glTexStorage do?" The question is "why did we go so long without it."
glTexStorage has no relation to glGenerateMipmap; they are orthogonal functionality. glTexStorage does exactly what it says: it allocates texture storage space. It does not fill that space with anything. So it creates a texture with a given size filled with uninitialized data. Much like glRenderbufferStorage allocates a renderbuffer with a given size filled with uninitialized data. If you use glTexStorage, you need to upload data with glTexSubImage (since glTexImage is forbidden on an immutable texture).
glTexStorage creates space for mipmaps. glGenerateMipmap creates the mipmap data itself (the smaller versions of the base layer). It can also create space for mipmaps if that space doesn't already exist. They're used for two different things.
Before calling glGenerateMipmap, the base mipmap level must be established. (either with mutable or immutable storage).so...,you can using glTexImage2D+glGenerateMipmap only,more simple!