When should glInvalidateBufferData be used? - opengl

OpenGL provides the functions glInvalidateBufferData and glInvalidateBufferSubData for invalidating buffers.
What these functions do according to the OpenGL wiki:
When a buffer or region thereof is invalidated, it means that the contents of that buffer are now undefined.
and
The idea is that, by invalidating a buffer or range thereof, the implementation will simply grab a new piece of memory to use for any later operations. Thus, while previously issued GL commands can still read the buffer's original data, you can fill the invalidated buffer with new values (via mapping or glBufferSubData) without causing implicit synchronization.
I'm having a hard time understanding when this function should be called and when it shouldn't. In theory, if the contents of the buffer are going to be overwritten it makes absolutely no difference if the previous contents were trashed or not. Does this mean that a buffer should be invalidated before every write? Or only in certain situations?

In theory, if the contents of the buffer are going to be overwritten it makes absolutely no difference if the previous contents were trashed or not.
You're absolutely right.
In essence, whether you call glBufferData or glInvalidateBufferData, you're achieving the same end goal. The difference is that with glBufferData you're saying "I need this much memory now", which in turn means that the old memory is discarded, whereas with glInvalidateBufferData you're saying "I don't need this memory anymore". The same goes for glInvalidateBufferSubData() vs glBufferSubData() as well as all the other glInvalidate*() functions.
The key case is a buffer whose contents you no longer need, but whose handle you're keeping for later use. Then you can call glInvalidateBufferData to tell the driver that the memory can be released.
The same goes for glInvalidateBufferSubData(). If you suddenly don't need the last half chunk of the memory assigned to the buffer, then now your driver knows that this chunk of memory can be reassigned.
When should glInvalidateBufferData be used?
So, to answer your question: when you have a buffer lying around and you don't need its memory anymore.
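To make the wiki's description more concrete, here is a minimal sketch of the "grab a new piece of memory" idea. It is a toy model in plain C++, not OpenGL code: OrphanableBuffer, PendingRead, and all the functions below are hypothetical names made up for illustration, and a real driver is free to do something entirely different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Toy model only: these types are hypothetical and are not part of OpenGL.
using Storage = std::vector<std::uint8_t>;

struct OrphanableBuffer {
    std::shared_ptr<Storage> storage;  // memory currently attached to the handle
};

// A command that was issued earlier and still has to read the buffer.
// It keeps the storage it was issued against alive.
struct PendingRead {
    std::shared_ptr<const Storage> source;
};

OrphanableBuffer make_buffer(std::size_t size, std::uint8_t fill) {
    return OrphanableBuffer{std::make_shared<Storage>(size, fill)};
}

PendingRead enqueue_read(const OrphanableBuffer& buf) {
    return PendingRead{buf.storage};
}

// Stand-in for glInvalidateBufferData: detach the old memory and attach a
// fresh block of the same size. The toy zero-fills the new block; in real
// OpenGL the contents are simply undefined after invalidation.
void invalidate(OrphanableBuffer& buf) {
    assert(buf.storage != nullptr);
    buf.storage = std::make_shared<Storage>(buf.storage->size());
}

// Stand-in for a later upload (mapping or glBufferSubData).
void fill(OrphanableBuffer& buf, std::uint8_t value) {
    for (std::uint8_t& byte : *buf.storage) byte = value;
}
```

In this model, a PendingRead taken before invalidate() still sees the old bytes after fill() has written the new ones, which is why no waiting is needed between the two.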

Related

Should volatile be used when mapping GPU memory?

Both OpenGL and Vulkan allow you to obtain a pointer to a part of the GPU's memory by using glMapBuffer and vkMapMemory respectively. They both give a void* to the mapped memory. To interpret its contents as some data it has to be cast to an appropriate type. The simplest example could be to cast to a float* to interpret the memory as an array of floats or vectors or similar.
It seems that any kind of memory mapping is undefined behaviour in C++, as it has no notion of memory mapping. However, this isn't really an issue because this topic is outside of the scope of the C++ Standard. However, there is still the question of volatile.
In the linked question the pointer is additionally marked as volatile because the contents of the memory it points at can be modified in a way that the compiler cannot anticipate during compilation. This seems reasonable though I rarely see people use volatile in this context (more broadly, this keyword seems to be barely used at all nowadays).
At the same time in this question the answer seems to be that using volatile is unnecessary. This is due to the fact that the memory they speak of is mapped using mmap and later given to msync, which can be treated as modifying the memory, which is similar to explicitly flushing it in Vulkan or OpenGL. I'm afraid though that this applies to neither OpenGL nor Vulkan.
If the memory is mapped without GL_MAP_FLUSH_EXPLICIT_BIT, or it has VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, then no flushing is needed at all and the memory contents update automagically. Even if the memory is flushed by hand by using vkFlushMappedMemoryRanges or glFlushMappedBufferRange, neither of these functions actually takes the mapped pointer as a parameter, so the compiler couldn't possibly know that they modify the mapped memory's contents.
As such, is it necessary to mark pointers to mapped GPU memory as volatile? I know that technically this is all undefined behaviour, but I am asking what is required in practice on real hardware.
By the way, neither the Vulkan Specification nor the OpenGL Specification mentions the volatile qualifier at all.
EDIT: Would marking the memory as volatile incur a performance overhead?
OK, let's say that we have a compiler that is omniscient about everything that happens in your code. This means that the compiler can follow any pointer, even through the runtime execution of your code perfectly and correctly every time, no matter how you try to hide it. So even if you read a byte at one end of your program, the compiler will somehow remember the exact bytes you've read and anytime you try to read them again, it can choose to not execute that read and just give you the previous value, unless the compiler is aware of something that can change it.
But let's also say that our omniscient compiler is completely oblivious to everything that happens in OpenGL/Vulkan. To this compiler, the graphics API is a black box. Here, there be dragons.
So you get a pointer from the API, read from it, the GPU writes to it, and then you want to read that new data the GPU just wrote. Why would a compiler believe that the data behind that pointer has been altered? After all, the alterations came from outside of the system, from a source that the C++ standard does not recognize.
That's what volatile is for, right?
Well, here's the thing. In both OpenGL and Vulkan, to ensure that you can actually read that data, you need to do something. Even if you map the memory coherently, you have to make an API call to ensure that the GPU process that wrote to the memory has actually executed. For Vulkan, you're waiting on a fence or an event. For OpenGL, you're waiting on a fence or executing a full finish.
Either way, before executing the read from the memory, the omniscient compiler encounters a function call into a black box which as established earlier the compiler knows nothing about. Since the mapped pointer itself came from the same black box, the compiler cannot assume that the black box doesn't have a pointer to that memory. So as far as the compiler is concerned, calling those functions could have written data to that memory.
And therefore, our omniscient-yet-oblivious compiler cannot optimize away such memory accesses. Once we get control back from those functions, the compiler must assume that any memory from any pointer reachable through that address could have been altered.
And if the compiler were able to peer into the graphics API itself, to read and understand what those functions are doing, then it would definitely see things that would tell it, "oh, I should not make assumptions about the state of memory retrieved through these pointers."
This is why you don't need volatile.
Also, note that the same applies to writing data. If you write to persistent, coherent mapped memory, you still have to perform some synchronization action with the graphics API so that your CPU writes do not land while the GPU is still reading that memory. So that's where the compiler knows that it can no longer rely on its knowledge of previously written data.
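The argument above can be sketched with a toy stand-in for the graphics API. Everything in the toy_api namespace is hypothetical and only plays the role of the real calls. In a real program those calls live in a separate library, which is what makes them a black box; here they sit in the same file so the example stays self-contained.

```cpp
#include <cassert>

// Toy stand-in for a graphics API; nothing here is a real OpenGL or Vulkan call.
namespace toy_api {
float g_storage[4] = {0.0f, 0.0f, 0.0f, 0.0f};  // plays the role of mapped GPU memory

// Plays the role of glMapBuffer / vkMapMemory.
float* map() { return g_storage; }

// Plays the role of waiting on a fence. By the time it returns, the "GPU"
// write is visible in the mapped memory.
void wait_for_gpu() { g_storage[0] = 42.0f; }
}  // namespace toy_api

struct Readback {
    float before;
    float after;
};

Readback read_around_sync() {
    float* mapped = toy_api::map();  // plain pointer, no volatile
    Readback result;
    result.before = mapped[0];
    toy_api::wait_for_gpu();   // the API holds this pointer and may have written through it
    result.after = mapped[0];  // so the compiler has to load the value again here
    return result;
}
```

The second read cannot be folded into the first, because a function that may hold the same pointer was called in between. That is the same reasoning the answer applies to the real fence and finish calls.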

Mutable vs Immutable storage

A few closely-related questions regarding buffer objects in OpenGL.
Besides persistent mapping, is there any other reason to allocate an immutable buffer? Even if the user allocates memory for the buffer only once, with mutable buffers he always has the ability to do it again if he needs to. Plus, with mutable buffers you can explicitly specify a usage hint.
How do people usually change data through a mapped pointer? The way I see it, you can either make changes to a single element, or multiple. For single-element changes all I could think of is an operator[] on a mapped pointer as if it was a C-style array. For multi-element changes, the only thing I could think of is a memcpy, but in that case isn't it just better to use glBufferSubData?
Speaking of glBufferSubData, is there truly any difference between calling it and just doing a memcpy on a mapped pointer? I've heard the former does more than 1 memcpy, is it true?
Is there a known reason why you can't specify a usage hint for an immutable buffer?
I know these questions are mostly about performance, and thus can be answered with a simple "just do some profiling and see", but at the time of me asking this it's not so much about performance as it is about design - i.e., I want to know the good practices of choosing between a mutable buffer vs an immutable one, and how should I be modifying their contents.
Even if the user allocates memory for the buffer only once, with mutable buffers he always has the ability to do it again if he needs to.
And that's precisely why you shouldn't use them. Reallocating a buffer object's storage (outside of invalidation) is not a useful thing. Drivers have to do a lot of work to make it feasible.
So having an API that takes away tools you shouldn't use is a good thing.
How do people usually change data through a mapped pointer?
You generally use whatever tool is most appropriate to the circumstance. The point of having a mapped pointer is to access the storage directly, so writing your data elsewhere and copying it in manually is kind of working against that purpose.
Is there a known reason why you can't specify a usage hint for an immutable buffer?
Because the immutable buffer API was written by people who didn't want to have terrible, useless, and pointless parameters. The usage hint on mutable buffers is completely ignored by several implementations because users were so consistently confused about what those hints mean that people used them in strange scenarios.
Immutable buffers instead make you state how you intend to use the buffer, and then hold you to it. If you ask for a static buffer whose contents you will never modify, then you cannot modify it, period. This is prevented at the API level, unlike usage hints, where you could use any buffer in any particular way regardless of the hint.
Hints were a bad idea and needed to die.
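As a rough illustration of "state how you intend to use the buffer, and then be held to it", here is a toy model in plain C++. ImmutableBuffer, make_storage, sub_data, and the flag names are hypothetical; they only mimic the roles of glBufferStorage, glBufferSubData, and GL_DYNAMIC_STORAGE_BIT, where a glBufferSubData call on storage created without that bit fails with GL_INVALID_OPERATION.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy model; the names mirror, but are not, the OpenGL ones.
enum StorageFlags : unsigned {
    kNoFlags = 0u,
    kDynamicStorage = 1u << 0,  // plays the role of GL_DYNAMIC_STORAGE_BIT
};

struct ImmutableBuffer {
    std::vector<std::uint8_t> bytes;
    unsigned flags;
};

// Plays the role of glBufferStorage: size and flags are fixed at creation.
ImmutableBuffer make_storage(const void* data, std::size_t size, unsigned flags) {
    ImmutableBuffer buf{std::vector<std::uint8_t>(size), flags};
    if (data != nullptr && size > 0) std::memcpy(buf.bytes.data(), data, size);
    return buf;
}

// Plays the role of glBufferSubData. Returns false where OpenGL would
// report an error instead of performing the write.
bool sub_data(ImmutableBuffer& buf, std::size_t offset, const void* data, std::size_t size) {
    if ((buf.flags & kDynamicStorage) == 0u) return false;  // storage was declared static
    if (offset > buf.bytes.size() || size > buf.bytes.size() - offset) return false;
    if (size > 0) std::memcpy(buf.bytes.data() + offset, data, size);
    return true;
}
```

Unlike a usage hint, the flag here is enforced: a buffer created with kNoFlags rejects every later sub_data call, whatever the caller meant to do.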

Orphaning: calling glBufferData with a NULL pointer, really?

From OpenGL wiki:
If you will be updating a small section, use glBufferSubData. If you will update the entire VBO, use glBufferData (this information reportedly comes from a nVidia document). However, another approach reputed to work well when updating an entire buffer is to call glBufferData with a NULL pointer, and then glBufferSubData with the new contents. The NULL pointer to glBufferData lets the driver know you don't care about the previous contents so it's free to substitute a totally different buffer, and that helps the driver pipeline uploads more efficiently.
I realize from my research, what they mean by "helps pipeline upload more efficiently" is it allows asynchronous buffer writes which makes dynamic buffers faster.
However, this excerpt and others seem to suggest that the "NULL" argument to the glBufferData is very important, and it's the cause for the orphaning and it's what promotes asynchronous writes.
But my common sense is telling me, it doesn't need to be null, you can put a pointer to some of your data there instead of null. The important part for the orphaning is just the fact that you're calling glBufferData again and therefore reallocating the storage of the buffer, therefore orphaning the old storage, therefore allowing asynchronous writes. Am I correct in thinking this?

OpenGL 2.1: glMapBuffer and usage hints

I've been using glBufferData, and it makes sense to me that you'd have to specify usage hints (e.g. GL_DYNAMIC_DRAW).
However, it was recently suggested to me on Stack Overflow that I use glMapBuffer or glMapBufferRange to modify non-contiguous blocks of vertex data.
When using glMapBuffer, there does not seem to be any point at which you specify a usage hint. So, my questions are as follows:
Is it valid to use glMapBuffer on a given VBO if you've never called glBufferData on that VBO?
If so, how does OpenGL guess the usage, since it hasn't been given a hint?
What are the advantages/disadvantages of glMapBuffer vs glBufferData? (I know they don't do exactly the same thing. But it seems that by getting a pointer with glMapBuffer and then writing to that address, you can do the same thing glBufferData does.)
Is it valid to use glMapBuffer on a given VBO if you've never called glBufferData on that VBO?
No, because to map some memory, it must be allocated first.
If so, how does OpenGL guess the usage, since it hasn't been given a hint?
It doesn't. You must call glBufferData at least once to initialize the buffer object. If you don't want to actually upload data (because you're going to use glMapBuffer), just pass a null pointer for the data pointer. This works just like with glTexImage, where a buffer/texture object is created to be filled later with glBufferSubData/glTexSubImage or, in the case of a buffer object, through a memory mapping as well.
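The allocate-first, map-second order can be sketched with a toy model. MappableBuffer and the three functions below are hypothetical stand-ins for a buffer object, glBufferData, glMapBuffer, and glUnmapBuffer; they are not the real API, and the exact failure behaviour of a real driver may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy model only; not part of OpenGL.
struct MappableBuffer {
    std::vector<unsigned char> store;  // empty until buffer_data is called
    bool mapped = false;
};

// Plays the role of glBufferData: (re)allocates the storage and copies only
// when a data pointer is given. The toy zero-fills on a null pointer; in
// real OpenGL the contents are undefined in that case.
void buffer_data(MappableBuffer& buf, std::size_t size, const void* data) {
    buf.store.assign(size, 0);
    if (data != nullptr && size > 0) std::memcpy(buf.store.data(), data, size);
}

// Plays the role of glMapBuffer: fails (returns nullptr) when there is no
// storage to map or the buffer is already mapped.
void* map_buffer(MappableBuffer& buf) {
    if (buf.store.empty() || buf.mapped) return nullptr;
    buf.mapped = true;
    return buf.store.data();
}

// Plays the role of glUnmapBuffer.
bool unmap_buffer(MappableBuffer& buf) {
    if (!buf.mapped) return false;
    buf.mapped = false;
    return true;
}
```

A typical sequence in this model is buffer_data(buf, size, nullptr), then map_buffer, then writing through the returned pointer, then unmap_buffer. Calling map_buffer on a freshly created MappableBuffer returns nullptr because nothing has been allocated yet.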
What are the advantages/disadvantages of glMapBuffer vs glBufferData? (I know they don't do exactly the same thing. But it seems that by getting a pointer with glMapBuffer and then writing to that address, you can do the same thing glBufferData does.)
glMapBuffer allows you to write to the buffer asynchronously from another thread. And on some implementations, the OpenGL driver may give your process direct access to DMA memory of the GPU or, even better, to the memory of the GPU itself, for example on SoC architectures with integrated graphics.
No, this appears to be invalid. You must call glBufferData, because otherwise OpenGL cannot know the size of your buffer.
As to which is faster, neither I nor the internet at large appears to know a definite answer. Just test it and see.

does glMapBuffer copy data?

I am new to OpenGL.
My question is: what does glMapBuffer do behind the scenes? Does it allocate new host memory, copy the GL object data to it, and return the pointer?
Is it guaranteed to receive the same pointer for subsequent calls to this method? Of course with releasing in between.
Like so often, the answer is "It depends". In some situations glMapBuffer will indeed allocate memory through malloc, copy the data there for your use and glUnmapBuffer releases it.
However, the common way to implement glMapBuffer is through memory mapping. If you don't know what this is, take a look at the documentation of the syscalls mmap (*nix systems like Linux, Mac OS X) or CreateFileMapping (Windows). What happens there is kind of interesting: Modern operating systems manage the running processes' address space in virtual memory. Every time some "memory" is accessed, the OS's memory management uses the accessed address as an index into a translation table, to redirect the operation to system RAM, swap space, etc. (of course the details are quite involved; memory management is one of the more difficult things in a kernel to understand). A driver can install its very own access handler. So a process can mmap something managed by a driver into its address space, and every time it performs an access on this, the driver's mmap handler gets called. This allows the driver to map some GPU memory (through DMA) into the process's address space and do the necessary bookkeeping. In this situation glMapBuffer will create such a memory mapping, and the pointer you receive will point to some address space in your process that has been mapped into DMA memory reserved for the GPU.
It could do any of those. The specific memory behavior of glMapBuffer is implementation-defined.
If you map a buffer for reading, it might give you a pointer to that buffer object's memory. Or it might allocate some memory and copy the data into it. If you map a buffer for writing, it might give you a pointer to that buffer object's memory. Or it might allocate some memory and hand you a pointer to that.
There is no way to know; you can't rely on it doing either one. It ought to give you better performance than glBufferSubData, assuming you can generate your data (from a file or from elsewhere) directly into the memory you get back from glMapBuffer. The worst case would be equal performance to glBufferSubData.
Is it guaranteed to receive the same pointer for subsequent calls to this method?
Absolutely not.