I am currently working with OpenGL 4.4 and I have a question about the glMapBuffer function. In some other APIs I have used (e.g. Direct3D 12 and Vulkan) you can keep the pointer that the map function returns alive and flush the memory with a separate call, instead of invalidating that pointer.
Is there a way to keep this pointer around for a longer period of time and still update the GPU memory, without invalidating the pointer by calling glUnmapBuffer?
A buffer object with immutable storage (allocated via gl(Named)BufferStorage, core in GL 4.4 or available through ARB_buffer_storage) can be created with the GL_MAP_PERSISTENT_BIT flag. This allows glMap(Named)BufferRange to be given the same flag, thereby persistently mapping that range of the buffer's storage.
A buffer whose storage is persistently mapped can be used in many buffer operations without being unmapped first. However, the onus of synchronizing all data access is now on you, the user. So get comfortable with using fences and double-buffering to access and manipulate the data.
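The steps above can be sketched as follows. This is only an illustration, assuming a GL 4.4 context with loaded function pointers; the buffer name, size, and triple-buffering scheme are illustrative, not prescribed by the API.

```cpp
// Immutable storage, persistently mapped (sketch; size/usage illustrative).
GLuint buf;
glGenBuffers(1, &buf);
glBindBuffer(GL_ARRAY_BUFFER, buf);

const GLsizeiptr size = 3 * 1024 * 1024;     // e.g. a triple-buffered region
const GLbitfield flags = GL_MAP_WRITE_BIT
                       | GL_MAP_PERSISTENT_BIT
                       | GL_MAP_COHERENT_BIT; // writes visible without explicit flush
glBufferStorage(GL_ARRAY_BUFFER, size, nullptr, flags); // immutable storage

// Map once; the pointer stays valid for the lifetime of the mapping.
void* ptr = glMapBufferRange(GL_ARRAY_BUFFER, 0, size, flags);

// Per frame: write into the region the GPU is not currently reading,
// then fence so you know when the GPU is done with it.
GLsync fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
// ... later: glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT, timeoutNs);
```

If you want the separate-flush behavior you know from D3D12/Vulkan, drop GL_MAP_COHERENT_BIT, map with GL_MAP_FLUSH_EXPLICIT_BIT instead, and call glFlushMappedBufferRange after writing.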
Related
OpenGL provides the functions glInvalidateBufferData and glInvalidateBufferSubData for invalidating buffers.
What these functions do according to the OpenGL wiki:
When a buffer or region thereof is invalidated, it means that the contents of that buffer are now undefined.
and
The idea is that, by invalidating a buffer or range thereof, the implementation will simply grab a new piece of memory to use for any later operations. Thus, while previously issued GL commands can still read the buffer's original data, you can fill the invalidated buffer with new values (via mapping or glBufferSubData) without causing implicit synchronization.
I'm having a hard time understanding when this function should be called and when it shouldn't. In theory, if the contents of the buffer are going to be overwritten, it makes absolutely no difference whether the previous contents were trashed or not. Does this mean the function should be called before every write? Or only in certain situations?
In theory, if the contents of the buffer are going to be overwritten it makes absolutely no difference if the previous contents were trashed or not.
You're absolutely right.
In essence, whether you call glBufferData or glInvalidateBufferData, you achieve the same end result. The difference is that with glBufferData you are saying "I need this much memory now", which implies that the old memory is discarded, whereas with glInvalidateBufferData you are saying "I don't need this memory anymore". The same goes for glInvalidateBufferSubData() vs glBufferSubData(), as well as the other glInvalidate*() functions.
The key case is when you have a buffer that you don't currently need, but you are keeping the handle around for later use. Then you can call glInvalidateBufferData to signal that the memory can be released.
The same goes for glInvalidateBufferSubData(). If you suddenly don't need the last half of the memory assigned to the buffer, the driver now knows that this chunk of memory can be reassigned.
When should glInvalidateBufferData be used?
So, to answer your question: when you have a buffer lying around and you don't need its contents anymore.
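A common concrete use is "orphaning" a streaming buffer before refilling it each frame, so writes don't synchronize with draws that are still reading the old contents. A minimal sketch, assuming GL 4.3+ for glInvalidateBufferData; the buffer name, size, and data pointer are illustrative:

```cpp
// Orphan a streaming buffer before refilling it (sketch).
glBindBuffer(GL_ARRAY_BUFFER, streamBuf);
glInvalidateBufferData(streamBuf);     // old contents now undefined; the driver
                                       // is free to hand us a fresh allocation
glBufferSubData(GL_ARRAY_BUFFER, 0, size, newData); // fill without implicit sync
```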
I am using a board with integrated GPU and CPU memory. I am also using an external matrix library (Blitz++). I would like to be able to grab the pointer to my data from the matrix object and pass it into a CUDA kernel. After doing some digging, it sounds like I want to use some form of zero-copy by calling cudaHostGetDevicePointer. What I am unsure about is the allocation of the memory: do I have to create the pointer using cudaHostAlloc? I do not want to have to rewrite Blitz++ to use cudaHostAlloc if I don't have to.
My code currently works, but copies the matrix data every time. That copy is not needed on boards with integrated memory.
The pointer has to be created (i.e. allocated) with cudaHostAlloc, even on integrated systems like Jetson.
The reason for this is that the GPU requires (zero-copy) memory to be pinned, i.e. removed from the host demand-paging system. Ordinary allocations are subject to demand-paging, and may not be used as zero-copy i.e. mapped memory for the GPU.
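A minimal host-side sketch of the zero-copy path, assuming the CUDA runtime API; `myKernel`, `n`, `blocks`, and `threads` are hypothetical stand-ins for your own kernel and launch configuration:

```cpp
#include <cuda_runtime.h>

// Zero-copy (mapped pinned memory) sketch.
cudaSetDeviceFlags(cudaDeviceMapHost);   // must precede context creation

float* hostPtr = nullptr;
cudaHostAlloc(&hostPtr, n * sizeof(float), cudaHostAllocMapped); // pinned + mapped

// ... fill hostPtr[0..n) on the CPU ...

float* devPtr = nullptr;
cudaHostGetDevicePointer(&devPtr, hostPtr, 0);

// myKernel<<<blocks, threads>>>(devPtr, n); // kernel reads host memory directly

cudaDeviceSynchronize();
cudaFreeHost(hostPtr);
```

On systems with unified virtual addressing, devPtr will typically equal hostPtr, but you should still obtain it through cudaHostGetDevicePointer for portability.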
I have run into an issue I am unsure of how to properly handle. I recently began creating a particle system for my game, and have been using a structure called 'Particle' for my particle data. 'Particle' contains the vertex information for rendering.
The reason I am having issues is that I am pooling my particle structures in heap memory to save on large numbers of allocations. However, I am unsure how to use an array of pointers with glBufferData; I am under the impression that glBufferData requires the actual structure instances rather than pointers to them.
I know I could rebuild an array of floats each render just to draw my particles, but is there an OpenGL call like glBufferData, which I am missing somewhere, that is able to dereference my pointers as it goes through the data I supply? Ideally I would like to avoid iterating over the array just to copy the data.
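For reference, the usual workaround is exactly the gather step you are trying to avoid: copy the pooled structs into one contiguous staging array and hand that single address to glBufferData. A minimal sketch; the `Particle` field layout here is illustrative, not your actual struct:

```cpp
#include <cstddef>
#include <vector>

// Illustrative particle vertex data; the real layout is whatever your renderer uses.
struct Particle {
    float x, y, z;      // position
    float r, g, b, a;   // color
};

// Gather a pool of particle pointers into one contiguous array that can be
// handed to glBufferData in a single call.
std::vector<Particle> flatten(const std::vector<Particle*>& pool) {
    std::vector<Particle> flat;
    flat.reserve(pool.size());
    for (const Particle* p : pool)
        flat.push_back(*p);             // flat copy of each struct
    return flat;
}

// Then, with a GL context:
// glBufferData(GL_ARRAY_BUFFER,
//              flat.size() * sizeof(Particle),
//              flat.data(), GL_STREAM_DRAW);
```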
I am under the impression that glBufferData requires the actual structure instances rather than pointers to them.
Correct. Effectively glBufferData creates a flat copy of the data presented to it at the address pointed to by the data parameter.
which I am missing somewhere that is able to de-reference my pointers as it is going through the data I supply?
You're thinking of client-side vertex arrays, and those are among the oldest features of OpenGL. They have been around since OpenGL 1.1, released 19 years ago.
You simply don't use a buffer object, i.e. you don't call glGenBuffers, glBindBuffer, or glBufferData, and you pass your client-side data address directly to glVertexPointer or glVertexAttribPointer.
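A minimal sketch of a client-side vertex array draw (legacy/compatibility profile only); `Vertex`, `verts`, and `vertexCount` are hypothetical names, and the array must stay valid and contiguous until the draw call has executed:

```cpp
// Client-side vertex array: no buffer object, the raw pointer goes to GL.
glBindBuffer(GL_ARRAY_BUFFER, 0);           // ensure no buffer object is bound
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, sizeof(Vertex), &verts[0].x);
glDrawArrays(GL_TRIANGLES, 0, vertexCount);
```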
However, I strongly advise you to actually use buffer objects. The data must be copied to the GPU anyway so that it can be rendered, and going through a buffer object lets the OpenGL driver work more efficiently. Also, in the OpenGL core profile (3.2 and later) the use of buffer objects is no longer optional.
I want to customize std::vector class in order to use an OpenGL buffer object as storage.
Is it possible to do so without relying on a specific implementation of the STL, by writing a custom allocator and/or subclassing the container?
My problem is how to create wrapper methods for glMapBuffer/glUnmapBuffer in order to use the buffer object for rendering while leaving the container in a consistent state.
Is it possible to do so without relying on a specific implementation of the STL, by writing a custom allocator and/or subclassing the container?
You can, but that doesn't make it a good idea. And even that you can is dependent on your compiler/standard library.
Before C++11, allocators cannot have state. They cannot have useful members, because containers are not required to actually use the allocator instance you pass them; they are allowed to create their own. So you can set the type of allocator, but you cannot give it a specific allocator instance and expect that instance to always be used.
So your allocator cannot just create a buffer object and store the object internally. It would have to use a global (or private static or whatever) buffer. And even then, multiple instances would be using the same buffer.
You could get around this by having the allocator store (in private static variables) a series of buffer objects and mapped pointers. This would allow you to allocate a buffer object of a particular size, and you get a mapped pointer back. The deallocator would use the pointer to figure out which buffer object it came from and do the appropriate cleanup for it.
Of course, this would be utterly useless for actually doing anything with those buffers. You can't use a buffer that is currently mapped. And if your allocator deletes the buffer once the vector is done with the memory, then you can never actually use that buffer object to do something.
Also, don't forget: unmapping a buffer can fail for unspecified reasons. If it does fail, you have no way of knowing that it did, because the unmap call is wrapped up in the allocator. And destructors shouldn't throw exceptions.
C++11 does make it so that allocators can have state. Which means that it is more or less possible. You can have the allocator survive the std::vector that built the data, and therefore, you can query the allocator for the buffer object post-mapping. You can also store whether the unmap failed.
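A minimal sketch of a stateful C++11 allocator. To stay self-contained it is backed by plain `::operator new` rather than a GL buffer, but it demonstrates the point above: since C++11, the specific allocator instance you hand the container really is used, so per-instance state (here a counter; in your case a buffer object name) survives:

```cpp
#include <cstddef>
#include <vector>

// Minimal stateful C++11 allocator (sketch). The per-instance counter shows
// that the container uses the instance you pass it, which is what would let
// a real version carry a buffer object name.
template <typename T>
struct CountingAllocator {
    using value_type = T;
    std::size_t* allocations;   // per-instance state

    explicit CountingAllocator(std::size_t* counter) : allocations(counter) {}
    template <typename U>
    CountingAllocator(const CountingAllocator<U>& o) : allocations(o.allocations) {}

    T* allocate(std::size_t n) {
        ++*allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>& a, const CountingAllocator<U>& b)
{ return a.allocations == b.allocations; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>& a, const CountingAllocator<U>& b)
{ return !(a == b); }
```

Usage: construct the allocator with a pointer to your state, pass it to the container, and query the state after the container is gone.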
That still doesn't make it a good idea. It'll be much easier overall to just use a regular old std::vector and use glBufferSubData to upload it. After all, mapping a buffer with READ_WRITE almost guarantees that it's going to be regular memory rather than a GPU address. Which means that unmapping is just going to perform a DMA, which glBufferSubData does. You won't gain much performance by mapping.
Reallocation with buffer objects is going to be much more painful. Since the std::vector object is the one that decides how much extra memory to store, you can't play games like allocating a large buffer object and then just expanding the amount of memory that the container uses. Every time the std::vector decides that it needs more memory, you're going to have to create a new buffer object name, and the std::vector will do an element-wise copy from mapped memory to mapped memory.
Not very fast.
Really, what you want is to just make your own container class. It isn't that hard. And it'll be much easier to control when it is mapped and when it is not.
I want to customize std::vector class in order to use an OpenGL buffer object as storage.
While this certainly is possible, I strongly discourage doing so. Mapped buffer objects must be unmapped before they can be used by OpenGL as data input. Thus such a derived, let's call it glbuffervector would have to map/unmap the buffer object for each and every access. Also taking an address of a dereferenced element will not work, since after dereferencing the buffer object would be unmapped again.
Instead of trying to make a vector that stores in a buffer object, I'd implement a referencing container, which can be created from an existing buffer object, together with a layout, so that iterators can be obtained. Following an RAII scheme the buffer object would be mapped creating an instance, and unmapped with instance deallocation.
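The RAII scheme above might be sketched like this, assuming GL 4.5 DSA entry points (for older versions, substitute glMapBufferRange/glUnmapBuffer on a bound target); the class name and interface are illustrative:

```cpp
// RAII mapping wrapper (sketch): the buffer is mapped on construction
// and unmapped when the instance goes out of scope.
template <typename T>
class MappedRange {
    GLuint      buf_;
    T*          ptr_;
    std::size_t count_;
public:
    MappedRange(GLuint buf, std::size_t count, GLbitfield access)
        : buf_(buf)
        , ptr_(static_cast<T*>(glMapNamedBufferRange(
              buf, 0, count * sizeof(T), access)))
        , count_(count) {}

    ~MappedRange() { glUnmapNamedBuffer(buf_); } // note: unmap can fail (see above)

    MappedRange(const MappedRange&) = delete;    // one owner per mapping
    MappedRange& operator=(const MappedRange&) = delete;

    T* begin() { return ptr_; }
    T* end()   { return ptr_ + count_; }
};
```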
Is it possible to do so without relying on a specific implementation of the STL, by writing a custom allocator and/or subclassing the container?
If you are using Microsoft Visual C++, there is a blog post describing how to define a custom STL allocator: "The Mallocator".
I think writing custom allocators is STL-implementation-specific.
I am new to OpenGL.
My question is: what does glMapBuffer do behind the scenes? Does it allocate new host memory, copy the GL object's data into it, and return the pointer?
Is it guaranteed to return the same pointer for subsequent calls to this function? Of course with unmapping in between.
Like so often, the answer is "It depends". In some situations glMapBuffer will indeed allocate memory through malloc, copy the data there for your use and glUnmapBuffer releases it.
However, the common way to implement glMapBuffer is through memory mapping. If you don't know what this is, take a look at the documentation of the syscall mmap (*nix systems like Linux and macOS) or CreateFileMapping (Windows). What happens there is kind of interesting: modern operating systems manage each running process's address space in virtual memory. Every time some "memory" is accessed, the OS's memory management uses the accessed address as an index into a translation table to redirect the operation to system RAM, swap space, etc. (Of course the details are quite involved; memory management is one of the more difficult parts of a kernel to understand.) A driver can install its very own access handler, so a process can mmap something managed by a driver into its address space, and every time it performs an access on it, the driver's mmap handler gets called. This allows the driver to map some GPU memory (through DMA) into the process's address space and do the necessary bookkeeping. In this situation glMapBuffer will create such a memory mapping, and the pointer you receive will point to some address range in your process that has been mapped into DMA memory reserved for the GPU.
It could do any of those. The specific memory behavior of glMapBuffer is implementation-defined.
If you map a buffer for reading, it might give you a pointer to the buffer object's memory, or it might allocate separate memory and copy the buffer's contents into it. If you map a buffer for writing, it might give you a pointer to the buffer object's memory, or it might allocate fresh memory and hand that to you.
There is no way to know; you can't rely on it doing either one. It ought to give you better performance than glBufferSubData, assuming you can generate your data (from a file or from elsewhere) directly into the memory you get back from glMapBuffer. The worst case would be performance equal to glBufferSubData.
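The "generate directly into the mapped memory" pattern might look like this; `readFileInto`, `buf`, and `size` are hypothetical stand-ins for your own data source and buffer:

```cpp
// Stream data straight into whatever glMapBuffer hands back (sketch).
glBindBuffer(GL_ARRAY_BUFFER, buf);
glBufferData(GL_ARRAY_BUFFER, size, nullptr, GL_STREAM_DRAW); // allocate only
void* dst = glMapBuffer(GL_ARRAY_BUFFER, GL_WRITE_ONLY);
if (dst) {
    readFileInto(dst, size);        // generate the data in place
    if (glUnmapBuffer(GL_ARRAY_BUFFER) == GL_FALSE) {
        // the data store became corrupt (rare, e.g. a display mode switch);
        // the upload must be redone
    }
}
```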
Is it guaranteed to return the same pointer for subsequent calls to this function?
Absolutely not.