OpenGL 2.1: glMapBuffer and usage hints

I've been using glBufferData, and it makes sense to me that you'd have to specify usage hints (e.g. GL_DYNAMIC_DRAW).
However, it was recently suggested to me on Stack Overflow that I use glMapBuffer or glMapBufferRange to modify non-contiguous blocks of vertex data.
When using glMapBuffer, there does not seem to be any point at which you specify a usage hint. So, my questions are as follows:
Is it valid to use glMapBuffer on a given VBO if you've never called glBufferData on that VBO?
If so, how does OpenGL guess the usage, since it hasn't been given a hint?
What are the advantages/disadvantages of glMapBuffer vs glBufferData? (I know they don't do exactly the same thing. But it seems that by getting a pointer with glMapBuffer and then writing to that address, you can do the same thing glBufferData does.)

Is it valid to use glMapBuffer on a given VBO if you've never called glBufferData on that VBO?
No, because to map some memory, it must be allocated first.
If so, how does OpenGL guess the usage, since it hasn't been given a hint?
It doesn't. You must call glBufferData at least once to initialize the buffer object. If you don't want to actually upload data (because you're going to use glMapBuffer), just pass a null pointer for the data parameter. This works just like glTexImage: the call creates the buffer or texture storage, which you then fill with glBufferSubData or glTexSubImage or, in the case of a buffer object, through a memory mapping.
What are the advantages/disadvantages of glMapBuffer vs glBufferData? (I know they don't do exactly the same thing. But it seems that by getting a pointer with glMapBuffer and then writing to that address, you can do the same thing glBufferData does.)
glMapBuffer allows you to write to the buffer asynchronously from another thread. On some implementations, the OpenGL driver may also give your process direct access to DMA memory of the GPU or, even better, to the memory of the GPU itself, for example on SoC architectures with integrated graphics.

No, this appears to be invalid. You must call glBufferData, because otherwise OpenGL cannot know the size of your buffer.
As to which is faster, neither I nor the internet at large appears to know a definite answer. Just test it and see.

Related

Should volatile be used when mapping GPU memory?

Both OpenGL and Vulkan allow you to obtain a pointer to part of the GPU's memory by using glMapBuffer and vkMapMemory respectively. They both give a void* to the mapped memory. To interpret its contents as some data, it has to be cast to an appropriate type. The simplest example would be a cast to float* to interpret the memory as an array of floats, vectors or similar.
It seems that any kind of memory mapping is undefined behaviour in C++, as the language has no notion of memory mapping. This isn't really an issue, because the topic is outside the scope of the C++ Standard. There is still the question of volatile, though.
In the linked question the pointer is additionally marked as volatile because the contents of the memory it points at can be modified in a way that the compiler cannot anticipate during compilation. This seems reasonable though I rarely see people use volatile in this context (more broadly, this keyword seems to be barely used at all nowadays).
At the same time, in this question the answer seems to be that using volatile is unnecessary. This is because the memory they speak of is mapped using mmap and later given to msync, which can be treated as modifying the memory, similar to explicitly flushing it in Vulkan or OpenGL. I'm afraid, though, that this applies to neither OpenGL nor Vulkan.
If the memory is mapped without GL_MAP_FLUSH_EXPLICIT_BIT, or it has VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, then no flushing is needed at all and the memory contents update automagically. Even if the memory is flushed by hand using vkFlushMappedMemoryRanges or glFlushMappedBufferRange, neither of these functions actually takes the mapped pointer as a parameter, so the compiler couldn't possibly know that they modify the mapped memory's contents.
As such, is it necessary to mark pointers to mapped GPU memory as volatile? I know that technically this is all undefined behaviour, but I am asking what is required in practice on real hardware.
By the way, neither the Vulkan Specification nor the OpenGL Specification mentions the volatile qualifier at all.
EDIT: Would marking the memory as volatile incur a performance overhead?
OK, let's say that we have a compiler that is omniscient about everything that happens in your code. This means that the compiler can follow any pointer, even through the runtime execution of your code perfectly and correctly every time, no matter how you try to hide it. So even if you read a byte at one end of your program, the compiler will somehow remember the exact bytes you've read and anytime you try to read them again, it can choose to not execute that read and just give you the previous value, unless the compiler is aware of something that can change it.
But let's also say that our omniscient compiler is completely oblivious to everything that happens in OpenGL/Vulkan. To this compiler, the graphics API is a black box. Here, there be dragons.
So you get a pointer from the API, read from it, the GPU writes to it, and then you want to read that new data the GPU just wrote. Why would a compiler believe that the data behind that pointer has been altered? After all, the alterations came from outside of the system, from a source that the C++ standard does not recognize.
That's what volatile is for, right?
Well, here's the thing. In both OpenGL and Vulkan, to ensure that you can actually read that data, you need to do something. Even if you map the memory coherently, you have to make an API call to ensure that the GPU process that wrote to the memory has actually executed. For Vulkan, you're waiting on a fence or an event. For OpenGL, you're waiting on a fence or executing a full finish.
Either way, before executing the read from the memory, the omniscient compiler encounters a function call into a black box which, as established earlier, the compiler knows nothing about. Since the mapped pointer itself came from the same black box, the compiler cannot assume that the black box doesn't have a pointer to that memory. So as far as the compiler is concerned, calling those functions could have written data to that memory.
And therefore, our omniscient-yet-oblivious compiler cannot optimize away such memory accesses. Once we get control back from those functions, the compiler must assume that any memory from any pointer reachable through that address could have been altered.
And if the compiler were able to peer into the graphics API itself, to read and understand what those functions are doing, then it would definitely see things that would tell it, "oh, I should not make assumptions about the state of memory retrieved through these pointers."
This is why you don't need volatile.
Also, note that the same applies to writing data. If you write to persistent, coherent mapped memory, you still have to perform some synchronization action with the graphics API so that your CPU isn't writing to memory the GPU is still reading. That synchronization call is where the compiler knows that it can no longer rely on its knowledge of previously written data.

When should glInvalidateBufferData be used?

OpenGL provides the functions glInvalidateBufferData and glInvalidateBufferSubData for invalidating buffers.
What these functions do according to the OpenGL wiki:
When a buffer or region thereof is invalidated, it means that the contents of that buffer are now undefined.
and
The idea is that, by invalidating a buffer or range thereof, the implementation will simply grab a new piece of memory to use for any later operations. Thus, while previously issued GL commands can still read the buffer's original data, you can fill the invalidated buffer with new values (via mapping or glBufferSubData) without causing implicit synchronization.
I'm having a hard time understanding when this function should be called and when it shouldn't. In theory, if the contents of the buffer are going to be overwritten it makes absolutely no difference if the previous contents were trashed or not. Does this mean that a buffer should be invalidated before every write? Or only in certain situations?
In theory, if the contents of the buffer are going to be overwritten it makes absolutely no difference if the previous contents were trashed or not.
You're absolutely right.
In essence, whether you call glBufferData or glInvalidateBufferData, you're achieving the same end goal. The difference is that with glBufferData you're saying "I need this much memory now", which in turn means that the old memory is discarded, whereas with glInvalidateBufferData you're saying "I don't need this memory anymore". The same goes for glInvalidateBufferSubData() vs glBufferSubData() as well as all the other glInvalidate*() functions.
The key case is a buffer that you don't currently need but whose handle you're keeping for later use. Then you can call glInvalidateBufferData to tell the driver that the memory can be released.
The same goes for glInvalidateBufferSubData(). If you suddenly don't need the last half chunk of the memory assigned to the buffer, then now your driver knows that this chunk of memory can be reassigned.
When should glInvalidateBufferData be used?
So, to answer your question: when you have a buffer lying around and you don't need its memory anymore.

Orphaning: calling glBufferData with a NULL pointer, really?

From OpenGL wiki:
If you will be updating a small section, use glBufferSubData. If you will update the entire VBO, use glBufferData (this information reportedly comes from a nVidia document). However, another approach reputed to work well when updating an entire buffer is to call glBufferData with a NULL pointer, and then glBufferSubData with the new contents. The NULL pointer to glBufferData lets the driver know you don't care about the previous contents so it's free to substitute a totally different buffer, and that helps the driver pipeline uploads more efficiently.
I realize from my research, what they mean by "helps pipeline upload more efficiently" is it allows asynchronous buffer writes which makes dynamic buffers faster.
However, this excerpt and others seem to suggest that the "NULL" argument to the glBufferData is very important, and it's the cause for the orphaning and it's what promotes asynchronous writes.
But my common sense is telling me that it doesn't need to be null; you can put a pointer to some of your data there instead. The important part for the orphaning is just the fact that you're calling glBufferData again and therefore reallocating the storage of the buffer, therefore orphaning the old storage, therefore allowing asynchronous writes. Am I correct in thinking this?

glBuffer with heap memory

I have run into an issue I am unsure of how to properly handle. I recently began creating a particle system for my game, and have been using a structure called 'Particle' for my particle data. 'Particle' contains the vertex information for rendering.
The reason I am having issues is that I am pooling my particle structures in heap memory in order to save on large amounts of allocations. However, I am unsure of how to use an array of pointers in glBufferData. I am under the impression that glBufferData requires the actual structure instance rather than a pointer to the structure instance.
I know I can rebuild an array of floats each render just to draw my particles, but is there an OpenGL call like glBufferData which I am missing somewhere that is able to de-reference my pointers as it is going through the data I supply? I would ideally like to prevent having to iterate over the array just to copy the data.
I am under the impression that glBufferData requires the actual structure instance rather than a pointer to the structure instance.
Correct. Effectively, glBufferData creates a flat copy of the data presented to it at the address passed via the data parameter.
which I am missing somewhere that is able to de-reference my pointers as it is going through the data I supply?
You're thinking of client side vertex arrays, and those are among the oldest features of OpenGL. They've been around since OpenGL-1.1, released 19 years ago.
With them you simply don't use a buffer object: you don't call glGenBuffers, glBindBuffer or glBufferData, and instead pass your client side data address directly to glVertexPointer or glVertexAttribPointer. Note that this still expects a contiguous array of vertex data, not an array of pointers.
However, I strongly advise you to actually use buffer objects. The data must be copied to the GPU anyway so that it can be rendered, and doing it through a buffer object enables the OpenGL driver to work more efficiently. Also, in core profile contexts (OpenGL 3.1 and later) the use of buffer objects is no longer optional.
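If the pool is kept as an array of pointers, some gather step into contiguous memory is hard to avoid, since glBufferData copies one flat block. A minimal sketch of that step follows; the Particle layout and the flatten helper are hypothetical, as the question does not show the real structure, and the GL upload appears only as a comment because it needs a live context.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle layout; the real structure is not shown in the question.
struct Particle {
    float x, y, z;     // position
    float r, g, b, a;  // color
};

// Gather pooled particles (addressed through pointers) into one contiguous
// array, which is the layout glBufferData expects behind its data pointer.
std::vector<Particle> flatten(const std::vector<Particle*>& pool) {
    std::vector<Particle> flat;
    flat.reserve(pool.size());
    for (const Particle* p : pool) {
        assert(p != nullptr);  // assumption: every pool slot passed in is live
        flat.push_back(*p);
    }
    return flat;
}

// With a buffer object bound to GL_ARRAY_BUFFER, the upload would then be:
//   glBufferData(GL_ARRAY_BUFFER, flat.size() * sizeof(Particle),
//                flat.data(), GL_STREAM_DRAW);
```

Storing the particles themselves contiguously in the pool (and handing out indices or pointers into that storage) would avoid the per-frame gather, at the cost of changing how the pool is organized.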

does glMapBuffer copy data?

I am new to OpenGL.
My question is: what does glMapBuffer do behind the scenes? Does it allocate new host memory, copy the GL object data to it and return the pointer?
Is it guaranteed that subsequent calls to this method return the same pointer? Of course with unmapping in between.
Like so often, the answer is "It depends". In some situations glMapBuffer will indeed allocate memory through malloc, copy the data there for your use and glUnmapBuffer releases it.
However, the common way to implement glMapBuffer is through memory mapping. If you don't know what this is, take a look at the documentation of the syscalls mmap (*nix systems like Linux, MacOS X) or CreateFileMapping (Windows). What happens there is kind of interesting: modern operating systems manage the running processes' address space in virtual memory. Every time some "memory" is accessed, the OS's memory management uses the accessed address as an index into a translation table to redirect the operation to system RAM, swap space, etc. (of course the details are quite involved; memory management is one of the more difficult things in a kernel to understand). A driver can install its very own access handler. So a process can mmap something managed by a driver into its address space, and every time it performs an access on this, the driver's mmap handler gets called. This allows the driver to map some GPU memory (through DMA) into the process's address space and do the necessary bookkeeping. In this situation glMapBuffer will create such a memory mapping, and the pointer you receive will point to some address space in your process that has been mapped into DMA memory reserved for the GPU.
It could do any of those. The specific memory behavior of glMapBuffer is implementation-defined.
If you map a buffer for reading, it might give you a pointer to that buffer object's memory. Or it might allocate memory and copy the buffer's contents into it. If you map a buffer for writing, it might give you a pointer to that buffer object's memory. Or it might allocate memory and hand that to you.
There is no way to know; you can't rely on it doing either one. It ought to give you better performance than glBufferSubData, assuming you can generate your data (from a file or from elsewhere) directly into the memory you get back from glMapBuffer. The worst case would be equal performance to glBufferSubData.
Is it guaranteed that subsequent calls to this method return the same pointer?
Absolutely not.
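The point about generating data directly into the mapped memory can be sketched as a producer that writes into whatever destination pointer it is given, so no staging copy is needed. The generate_points function below is a hypothetical example, and a plain std::vector can stand in for the mapped pointer when no GL context is available; the actual GL calls are shown as comments only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Write `count` 2D points along the x axis straight into `dst`.
// `dst` must have room for 2 * count floats.
void generate_points(float* dst, std::size_t count) {
    assert(dst != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        dst[2 * i + 0] = static_cast<float>(i);  // x
        dst[2 * i + 1] = 0.0f;                   // y
    }
}

// With a real context and a buffer bound to GL_ARRAY_BUFFER, the flow would be:
//   glBufferData(GL_ARRAY_BUFFER, 2 * count * sizeof(float), NULL,
//                GL_DYNAMIC_DRAW);                  // allocate, no upload
//   void* p = glMapBuffer(GL_ARRAY_BUFFER, GL_WRITE_ONLY);
//   if (p) {
//       generate_points(static_cast<float*>(p), count);
//       glUnmapBuffer(GL_ARRAY_BUFFER);             // p is invalid after this
//   }
```

Because the pointer may differ on every map, the sketch fetches it fresh each time and never keeps it past glUnmapBuffer.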