I'm just wondering: when (and maybe how) to clear the data from a VBO. Do you have to clear it always before rewriting the data? Why clear it?
Clearing the buffer (i.e. setting each byte to 0) isn't too useful. Invalidating the buffer is.
Invalidating a section of a buffer means that the contents of that section become invalid, and you must write new content to that section before using it. This allows the OpenGL implementation to avoid waiting until the buffer object is no longer being used in order to upload data to it by giving you a completely 'new' buffer to write to (under the same name). This technique is called buffer orphaning.
To invalidate a buffer, you can call glBufferData with the same size and usage hints but a NULL data pointer, call glMapBufferRange with GL_MAP_INVALIDATE_BUFFER_BIT, or call glInvalidateBufferData if your implementation supports it (OpenGL 4.3 or ARB_invalidate_subdata).
The OpenGL Wiki article for Buffer Object Streaming covers this in more detail, and also offers several other solutions.
To directly answer your question, it is not required that you invalidate or clear a buffer before updating it. You can call glBufferSubData whenever you want to update whatever contents you want. However, doing so without invalidation may cause a pipeline stall as OpenGL waits for the buffer to finish being used before safely updating it.
I'm working on a program that uploads a model's new data to the graphics card using OpenGL, switches to rendering that data, and then removes the old data to free space for other uses.
From my understanding I shouldn't be creating/releasing buffers on the fly as it can lead to memory thrashing.
Is it bad to call glBufferData frequently to add new data to the graphics card? Does this count as creating/releasing buffers?
If you call glBufferData with the same size and usage parameters as it was called previously, then this is effectively invalidating or "orphaning" the buffer. To do anything else, to change the size or usage, is to effectively create a new buffer.
If you aren't streaming data (uploading new data every frame or so), invalidation is not especially useful. If you're no longer using the buffer, and you haven't used it in a while, just leave it there if you're going to need buffer storage again.
And if your models use different sizes, preallocate a large buffer object and have different models use different regions from that one allocation.
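As a rough sketch of that suggestion, the bookkeeping for carving regions out of one big allocation can be a simple bump allocator. The names BufferArena, BufferRegion, arena_alloc and arena_reset are hypothetical, and only byte offsets are tracked here; the GL calls that would use those offsets appear in comments, not in the code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one large preallocated buffer object.
   Only offsets are tracked; no OpenGL calls are made here. */
typedef struct {
    size_t capacity; /* size passed once to glBufferData(..., NULL, usage) */
    size_t used;     /* bytes handed out so far */
} BufferArena;

typedef struct {
    size_t offset;   /* byte offset of the region inside the big buffer */
    size_t size;     /* region size in bytes */
    int    ok;       /* 1 on success, 0 if the arena is full */
} BufferRegion;

/* Round value up to the next multiple of alignment (a power of two). */
size_t align_up(size_t value, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Reserve size bytes for one model. The caller would then upload with
   glBufferSubData(target, region.offset, region.size, data). */
BufferRegion arena_alloc(BufferArena *arena, size_t size, size_t alignment)
{
    BufferRegion region = {0, 0, 0};
    size_t start = align_up(arena->used, alignment);
    if (start > arena->capacity || size > arena->capacity - start)
        return region; /* out of space: the caller decides what to do */
    region.offset = start;
    region.size = size;
    region.ok = 1;
    arena->used = start + size;
    return region;
}

/* Forget all regions, e.g. when every model in the buffer is replaced. */
void arena_reset(BufferArena *arena)
{
    arena->used = 0;
}
```

This version never frees individual regions; a program that swaps models in and out one at a time would probably want a free list instead, which is beyond this sketch.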
I have a huge VBO, and the entire thing changes every frame.
I have heard of different methods of quickly changing buffer data, but only one of them seems like a good idea for my program. However, I don't understand it and can't find any code samples for it.
I have heard people claim that you should call glBufferData with "null" as the data then fill it with your real data each frame. What is the goal of this? What does this look like in code?
It's all in the docs.
https://www.opengl.org/sdk/docs/man/html/glBufferData.xhtml
If you pass NULL to glBufferData(), it looks something like this:
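GLsizeiptr bufferSize = ...;
// Orphan the old storage: same size and usage, NULL data pointer.
glBufferData(GL_ARRAY_BUFFER, bufferSize, NULL, GL_DYNAMIC_DRAW);
// Map the fresh storage and write this frame's data through ptr.
void *ptr = glMapBuffer(GL_ARRAY_BUFFER, GL_WRITE_ONLY);
...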
Ignore most of that function call; the only two important parts are bufferSize and NULL. This tells OpenGL that the buffer has size bufferSize and that its contents are uninitialized / undefined. In practice, this means that OpenGL is free to continue using any previous data in the buffer for as long as it needs to. For example, a previous draw call using the buffer may not have finished yet, and using glBufferData() allows you to get a new piece of memory for the buffer instead of waiting for the implementation to finish using the old piece of memory.
This is an old technique and it works fairly well. There are a couple other common techniques. One such technique is to double buffer, and switch between two VBOs every frame. A more sophisticated technique is to use a persistent buffer mapping, but this requires you to manage memory fences yourself in order for it to work correctly.
Note that if you are uploading data with glBufferData() anyway, then calling glBufferData() beforehand with NULL doesn't actually accomplish anything.
I have a need to stream a texture (essentially a camera feed).
With object streaming, the following scenarios seem to arise:
1. Is the new object's data store larger, smaller or the same size as the old one?
2. Is a subset of the texture or the whole texture being updated?
3. Are we streaming a buffer object or texture object (any difference?)
Here are the following approaches I have come across:
1. Allocate the object's data store (BufferData for buffers or TexImage2D for textures), then update a subset of the data each frame with BufferSubData or TexSubImage2D.
2. Nullify/invalidate the object after the last call (e.g. a draw) that uses it, with either:
   Nullify: glTexSubImage2D( ..., NULL), glBufferSubData( ..., NULL)
   Invalidate: glInvalidateBufferData(), glMapBufferRange with the GL_MAP_INVALIDATE_BUFFER_BIT, glDeleteTextures ?
3. Simply reinvoke BufferData or TexImage2D with the new data.
4. Manually implement object multi-buffering / buffer ping-ponging.
Most immediately, my problem scenario is: the entire texture is replaced with a new one of the same size. How do I implement this? Will (1) implicitly synchronize? Does (2) avoid the synchronization? Will (3) synchronize, or will a new data store for the object be allocated, where our update can be uploaded without waiting for all drawing using the old object state to finish? This passage from the Red Book V4.3 makes me believe so:
Data can also be copied between buffer objects using the
glCopyBufferSubData() function. Rather than assembling chunks of data
in one large buffer object using glBufferSubData(), it is possible to
upload the data into separate buffers using glBufferData() and then
copy from those buffers into the larger buffer using
glCopyBufferSubData(). Depending on the OpenGL implementation, it may
be able to overlap these copies because each time you call
glBufferData() on a buffer object, it invalidates whatever contents
may have been there before. Therefore, OpenGL can sometimes just
allocate a whole new data store for your data, even though a copy
operation from the previous store has not completed yet. It will then
release the old storage at a later opportunity.
But if so, why the need for (2)[nullify/invalidates]?
Also, please discuss the above approaches, and others, and their effectiveness for the various scenarios, while keeping in mind at least the following issues:
Whether implicit synchronization to object (ie. synchronizing our update with OpenGL's usage) occurs
Memory usage
Speed
I've read http://www.opengl.org/wiki/Buffer_Object_Streaming but it doesn't offer conclusive information.
Let me try to answer at least a few of the questions you raised.
The scenarios you describe can have a great impact on the performance of the different approaches, especially the first point about the dynamic size of the buffer. In your video-streaming scenario the size will rarely change, so a more expensive "re-configuration" of the data structures you use may be acceptable. If the size changes every frame or every few frames, this is typically not feasible. However, if a reasonable maximum size can be enforced, just using buffers/textures of that maximum size might be a good strategy. With neither buffers nor textures do you have to use all the space there is (although there are some smaller issues when you do this with textures, such as wrap modes).
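To make the maximum-size idea concrete for textures, here is a minimal sketch of the arithmetic involved. It assumes the frame is uploaded at offset (0, 0) of a texture allocated once at the maximum size; MaxSizeTexture, UsedTexCoords and used_tex_coords are hypothetical names, and no GL calls are made.

```c
#include <assert.h>

/* Hypothetical description of a texture allocated once at maximum size. */
typedef struct {
    int tex_width;
    int tex_height;
} MaxSizeTexture;

/* Upper texture coordinates that cover only the part holding the frame. */
typedef struct {
    float u_max;
    float v_max;
} UsedTexCoords;

/* Assumes the frame occupies texels [0, frame_w) x [0, frame_h), e.g. after
   glTexSubImage2D(target, 0, 0, 0, frame_w, frame_h, format, type, data). */
UsedTexCoords used_tex_coords(const MaxSizeTexture *tex, int frame_w, int frame_h)
{
    UsedTexCoords coords;
    assert(frame_w > 0 && frame_w <= tex->tex_width);
    assert(frame_h > 0 && frame_h <= tex->tex_height);
    coords.u_max = (float)frame_w / (float)tex->tex_width;
    coords.v_max = (float)frame_h / (float)tex->tex_height;
    return coords;
}
```

The quad that displays the frame would use coordinates from (0, 0) to (u_max, v_max) instead of (0, 0) to (1, 1). This is also where the wrap-mode issue shows up: coordinates outside that range sample unused texels rather than wrapping the frame.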
3. Are we streaming a buffer object or texture object (any difference?)
Well, the only way to efficiently stream image data to or from the GL is to use pixel buffer objects (PBOs). So you always have to deal with buffer objects in the first place, no matter whether vertex data, image data or any other data is to be transferred. In the texture case the buffer is just the source for some glTex*Image() call, and of course you'll need a texture object for that.
Let's come to your approaches:
In approach (1), you use the "Sub" variant of the update commands. In that case, the storage of the existing object is updated (in part or as a whole). This is likely to trigger an implicit synchronization if the old data is still in use. The GL basically has only two options: wait for all operations (potentially) depending on that data to complete, or make an intermediate copy of the new data and let the client go on. Neither option is good from a performance point of view.
In approach (2), you have a misconception. The "Sub" variants of the update commands will never invalidate/orphan your buffers. The "non-sub" glBufferData() will create completely new storage for the object, and using it with NULL as the data pointer will leave that storage uninitialized. Internally, the GL implementation might re-use some memory which was in use for earlier buffer storage. So if you follow this scheme, there is some probability that you effectively end up using a ring buffer of the same memory areas if you always use the same buffer size.
The other invalidation methods you mentioned also allow you to invalidate parts of the buffer and give you more fine-grained control over what is happening.
Approach (3) is basically the same as (2) with the glBufferData() orphaning, but you specify the new data directly at this stage.
Approach (4) is the one I would actually recommend, as it gives the application the most control over what is happening, without having to rely on the GL implementation's specific internal workings.
Without taking synchronization into account, the "sub" variant of the update commands is more efficient, even if the whole data storage is to be changed, not just some part. That is because the "non-sub" variants of the commands basically recreate the storage and introduce some overhead in doing so. By managing the ring buffers manually and just using the "sub" variants of the update functions, you can avoid all of that overhead, and you don't have to rely on the GL being clever. At the same time, you can avoid implicit synchronization by only updating buffers which aren't in use by the GL any more. This scheme can also be extended nicely to a multi-threaded scenario. You can have one (or several) extra threads with separate (but shared) GL contexts fill the buffers for you, and just pass the buffer handles to the draw thread as soon as the update is complete. You can also just map the buffers in the draw thread and let them be filled by worker threads (without the need for additional GL contexts at all).
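The rule "only update buffers which aren't in use by the GL any more" can be sketched as plain bookkeeping. This is a model under stated assumptions, not GL code: BufferRing, ring_init, ring_acquire and RING_SIZE are hypothetical names, frames are numbered from 1, and completed_frame stands in for whatever the application learns from sync objects (for example glFenceSync and glClientWaitSync).

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 3 /* assumed number of buffer objects in the ring */

typedef struct {
    /* Frame in which each slot was last submitted to the GL; 0 = never. */
    uint64_t last_used_frame[RING_SIZE];
    int next; /* slot to try next */
} BufferRing;

void ring_init(BufferRing *ring)
{
    for (int i = 0; i < RING_SIZE; ++i)
        ring->last_used_frame[i] = 0;
    ring->next = 0;
}

/* Returns the index of a slot that is safe to overwrite for current_frame,
   or -1 if the GPU may still be reading the next slot in the ring.
   completed_frame is the newest frame the GPU is known to have finished. */
int ring_acquire(BufferRing *ring, uint64_t current_frame, uint64_t completed_frame)
{
    int slot = ring->next;
    assert(current_frame > completed_frame);
    if (ring->last_used_frame[slot] > completed_frame)
        return -1; /* still in flight: wait, skip the update, or grow the ring */
    ring->last_used_frame[slot] = current_frame;
    ring->next = (slot + 1) % RING_SIZE;
    return slot;
}
```

With the returned index, the application would update that slot's buffer object using the "sub" commands and draw from it. A larger RING_SIZE makes the -1 case rarer at the cost of more memory.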
OpenGL 4.4 introduced GL_ARB_buffer_storage and with it came the GL_MAP_PERSISTENT_BIT for glMapBufferRange. That allows you to keep all of the buffers mapped while they are used by the GL, so you avoid the overhead of mapping the buffers into the address space again and again. You then have no implicit synchronization at all, but you have to synchronize the operations manually. OpenGL's synchronization objects (see GL_ARB_sync) might help you with that, but the main burden of synchronization is on your application's logic itself. When streaming videos to the GL, just avoid immediately re-using the buffer which was the source for the glTexSubImage() call, and try to delay its re-use as long as possible. You are of course also trading throughput for latency. If you need to minimize latency, you might have to tweak this logic a bit.
Comparing the approaches for "memory usage" is really hard. There are a lot of implementation-specific details to consider here. A GL implementation might keep some old buffer memory around for some time to fulfill recreation requests of the same size. Also, a GL implementation might make shadow copies of any data at any time. The approaches which don't orphan and recreate storage all the time in principle give you more control over the memory which is in use.
"Speed" itself is also not a very useful metric. You basically have to balance throughput and latency here, according to the requirements of your application.
I am currently writing an OpenGL program that creates a vertex buffer and uses it exactly once to draw a batch of triangles.
How long do I have to keep it around? Right now I just keep it until the next drawing batch gets started, but I'm not sure if this is safe. The documentation for glDeleteBuffers is a bit unclear.
From the looks of it, unlike shaders, it implies that the buffer is deleted immediately.
Does this also happen when the buffer is currently being used for rendering, or is the actual deletion delayed?
So, what's the safest way to do this without accumulating too many buffers?
You can unbind and delete the buffer object right after the glDraw… call. It's a specified OpenGL requirement that an implementation keeps track of all internal references for as long as required and then cleans up internally. This holds not only for glDelete… but for every OpenGL call that modifies data.
Is it possible to glUnmapBuffer a GL_STREAM_DRAW pixel-buffer-object and still keep the data pointed to by the pointer returned previously by glMapBuffer valid for read-only operations using SSE 4.1 streaming loads?
If not, is there any technical reason for this? Or was this "feature" just left out?
The purpose of map and unmap is to say "I need a pointer to this." and "I'm finished using the pointer to this." That's what the functions do.
When you unmap a buffer, the driver is now free to:
Copy the data you wrote (if you wrote anything) to the buffer object, if the mapped pointer is not a direct pointer to that buffer.
Move the buffer object around in GPU memory, if the mapped pointer is a pointer to that object.
Remember: mapping a pointer does not have to return an actual pointer to that buffer. It simply returns a pointer that when read, will have the data stored in that buffer, and when written, the written bytes will be transferred into the buffer upon unmapping.
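As a toy illustration of the first case (the mapped pointer is not a direct pointer to the buffer), here is a model in plain C. It is not how any particular driver works; ModelBuffer, model_init, model_map, model_unmap and model_destroy are hypothetical names chosen for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: NOT how any particular driver works. */
typedef struct {
    unsigned char *storage; /* stands in for the buffer object's real memory */
    size_t size;
    unsigned char *staging; /* non-NULL only while the buffer is "mapped" */
} ModelBuffer;

/* Returns 1 on success, 0 if the allocation failed. */
int model_init(ModelBuffer *buf, size_t size)
{
    buf->storage = calloc(size, 1);
    buf->size = size;
    buf->staging = NULL;
    return buf->storage != NULL;
}

/* Like mapping: hand out a CPU-side copy instead of the real storage. */
void *model_map(ModelBuffer *buf)
{
    assert(buf->staging == NULL);
    buf->staging = malloc(buf->size);
    if (buf->staging != NULL)
        memcpy(buf->staging, buf->storage, buf->size);
    return buf->staging;
}

/* Like unmapping: copy the writes back, then release the staging memory.
   Any pointer obtained from model_map() dangles after this call. */
void model_unmap(ModelBuffer *buf)
{
    assert(buf->staging != NULL);
    memcpy(buf->storage, buf->staging, buf->size);
    free(buf->staging);
    buf->staging = NULL;
}

void model_destroy(ModelBuffer *buf)
{
    free(buf->storage);
    buf->storage = NULL;
}
```

In this model, writes only reach the real storage at unmap time, and the pointer handed out by the map call is freed in the same step. That is one way to see why the old pointer cannot be relied on after unmapping.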
Furthermore, the only reason to do what you're suggesting is because you want to read the data in the buffer. Well, since you just mapped the buffer (presumably for writing, or else you wouldn't have unmapped it), you know what's in it. If you needed CPU access to it, you should have just stored the data locally; you'll get a lot more reliable access to it that way.
And if you do another pixel transfer, reading from that pointer means that OpenGL would have to execute a synchronization, which defeats the whole point of PBOs: asynchronous transfer. That is, when you execute glReadPixels or similar, OpenGL can wait to actually finish the operation until you map the buffer or use glGetBufferSubData.
But if the buffer is mapped for reads, then OpenGL doesn't know when you're going to read from it (since it can't tell when you read from a pointer). So OpenGL can't guarantee the storage within it. In short: undefined behavior. You could get anything at that point.
So what you're talking about doesn't make sense.