glUnmapBuffer while keeping glMapBuffer memory valid as read-only - c++

Is it possible to glUnmapBuffer a GL_STREAM_DRAW pixel-buffer-object and still keep the data pointed to by the pointer returned previously by glMapBuffer valid for read-only operations using SSE 4.1 streaming loads?
If not, is there any technical reason for this? Or was this "feature" just left out?

The purpose of map and unmap is to say "I need a pointer to this." and "I'm finished using the pointer to this." That's what the functions do.
When you unmap a buffer, the driver is now free to:
Copy the data you wrote (if you wrote anything) to the buffer object, if the mapped pointer is not a direct pointer to that buffer.
Move the buffer object around in GPU memory, if the mapped pointer is a pointer to that object.
Remember: mapping a pointer does not have to return an actual pointer to that buffer. It simply returns a pointer that when read, will have the data stored in that buffer, and when written, the written bytes will be transferred into the buffer upon unmapping.
Furthermore, the only reason to do what you're suggesting is because you want to read the data in the buffer. Well, since you just mapped the buffer (presumably for writing, or else you wouldn't have unmapped it), you know what's in it. If you needed CPU access to it, you should have just stored the data locally; you'll get a lot more reliable access to it that way.
Furthermore, if you do another pixel transfer and then read from that pointer, OpenGL would have to synchronize, which defeats the whole point of a PBO: asynchronous transfer. That is, when you execute glReadPixels or whatever, OpenGL can defer actually finishing that operation until you map the buffer or call glGetBufferSubData.
But if the buffer is mapped for reads, then OpenGL doesn't know when you're going to read from it (since it can't tell when you read from a pointer). So OpenGL can't guarantee the storage within it. In short: undefined behavior. You could get anything at that point.
So what you're talking about doesn't make sense.
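For completeness, the intended PBO read-back pattern looks roughly like this. It is only a sketch: it assumes a current GL context, previously defined width and height, and a cpuCopy byte array large enough to hold the image; error checking is omitted.

GLuint pbo;
glGenBuffers(1, &pbo);
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
glBufferData(GL_PIXEL_PACK_BUFFER, width * height * 4, nullptr, GL_STREAM_READ);
// Start the asynchronous transfer into the PBO; this returns immediately.
glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
// ... do other CPU/GPU work here to hide the transfer latency ...
// Map only when the data is actually needed; this may block until the transfer finishes.
const void* src = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
if (src) {
    memcpy(cpuCopy, src, width * height * 4); // read (e.g. with SSE streaming loads) while mapped
    glUnmapBuffer(GL_PIXEL_PACK_BUFFER);      // after this, 'src' must not be used again
}
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);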

Related

What does ID3D11Device::CreateBuffer do under the hood?

I know this function creates a "buffer." But what exactly is a buffer? Is it a COM object in memory? If so, then in my understanding this function takes a descriptor and some initial data, creates the COM object in memory, and sets the ID3D11Buffer pointer pointed to by the input ID3D11Buffer** to the interface of the newly created COM object. Once the COM object is created, the initial data is no longer needed and we can delete it. And once we call ID3D11Buffer::Release(), the underlying COM object will be destroyed. Is my understanding correct?
CreateBuffer returns a COM interface object, ID3D11Buffer*. As long as it has a non-zero reference count (it starts at 1; each call to AddRef adds 1 and each call to Release subtracts 1), whatever resources it controls remain active.
As to where exactly the resources are allocated, it really depends. You may find this article interesting as it covers different ways Direct3D allocates resources.
UPDATE: You should also read this Microsoft Docs introduction to the subset of COM used by DirectX.
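To make that concrete, a typical call looks roughly like this. It is only a sketch: 'device' (an ID3D11Device*) and 'vertices' (a CPU-side array) are assumed to exist, and error handling is minimal.

D3D11_BUFFER_DESC desc = {};
desc.ByteWidth = sizeof(vertices);          // size of the buffer in bytes
desc.Usage = D3D11_USAGE_IMMUTABLE;         // filled once at creation, GPU read-only afterwards
desc.BindFlags = D3D11_BIND_VERTEX_BUFFER;  // how the buffer will be bound to the pipeline
D3D11_SUBRESOURCE_DATA init = {};
init.pSysMem = vertices;                    // initial data; no longer needed after the call returns
ID3D11Buffer* buffer = nullptr;
HRESULT hr = device->CreateBuffer(&desc, &init, &buffer);
if (SUCCEEDED(hr)) {
    // ... use the buffer; its reference count starts at 1 ...
    buffer->Release();                      // count drops to 0, the underlying resources are freed
}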
In the general case, a buffer is a continuous, managed, area of memory.
Memory is a large set of addresses of read/writable elements (one element per address, of course); say, 2^30 addresses of 8-bit elements make a 1 GiB memory.
If there is only a single program and it uses these addresses statically (e.g. addresses from 0x1000 to 0x2000 are used to store the list of items) then memory doesn't need to be managed and in this context a buffer is just a continuous range of addresses.
However, if there are multiple programs or a program memory usage is dynamic (e.g. it depends on how many items it's been asked to read from input) then memory must be managed.
You must keep track of which ranges are already in use and which are not. So a buffer becomes a continuous range of addresses with their attributes (e.g. if it's in use or not).
The attributes of a buffer can vary a lot between different memory allocators. In general, we say that a buffer is managed because we let the memory allocator handle it: find a suitable free range, mark it as used, tell it whether it may move the buffer afterward, and mark it free when we are finished.
This is true for every memory that is shared, so it is certainly true for the main memory (RAM) and the graphic memory.
The latter is the memory inside the graphics card, which (from the CPU's point of view) is accessed just like main memory.
What CreateBuffer returns is a COM object in main memory that contains the metadata necessary to handle the buffer just allocated.
It doesn't contain the buffer itself, because this COM object is always in main memory while the buffer usually is not (it lives in graphics memory).
CreateBuffer asks the graphics driver to find a suitable range of free addresses in the requested memory and to fill in some metadata.
Before the CPU can access the main memory it is necessary to set up some metadata tables (the page tables) as part of its protection mechanism.
This is also true if the CPU needs to access the graphics memory (with possibly a few extra steps for managing the MMIO, if necessary).
The GPU also has page tables, so if main memory has to be accessed by the GPU, these page tables must also be set up.
You see that it's important to know how the buffer will be used.
Another thing to consider is that GPUs use highly optimized memory formats - for example, the buffer used for a surface can be pictured as a rectangular area of memory.
The same is true for the buffer used by a texture.
However, the two are stored differently: the surface is stored linearly, each row after another, while the texture buffer is tiled (it's as if it were made of many, say, 16x16 surfaces stored linearly one after the other).
This makes sampling and filtering faster.
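To make the difference concrete, here is a purely illustrative sketch of how a byte offset could be computed for a linear layout versus a 16x16-tiled one; real GPU tiling formats are more elaborate and vendor-specific, and the 4-bytes-per-pixel assumption is mine.

// Illustrative only: assumes a 4-byte-per-pixel image whose width is a multiple of 16.
size_t linear_offset(size_t x, size_t y, size_t width) {
    return (y * width + x) * 4;                                 // rows stored one after another
}
size_t tiled_offset_16x16(size_t x, size_t y, size_t width) {
    const size_t T = 16;                                        // tile dimension
    size_t tiles_per_row = width / T;
    size_t tile_index    = (y / T) * tiles_per_row + (x / T);   // which tile
    size_t within_tile   = (y % T) * T + (x % T);               // position inside the tile
    return (tile_index * T * T + within_tile) * 4;
}

With the tiled layout, texels that are close in both x and y end up close in memory, which is what makes sampling and filtering cache-friendly.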
Also, some GPUs may need texture images in a specific area of memory, vertex buffers in another, and so on.
So it's important to give the graphics driver all the information it needs to make the best choice when allocating a buffer.
Once the buffer has been found, the driver (or the D3D runtime) will initialize the buffer if requested.
It can do this by copying the data or by aliasing through the page tables (if the pitch allows for it) and eventually using some form of Copy-On-Write.
However it does that, the source data are not needed anymore (see this).
The COM object returned by CreateBuffer is a convenient proxy; when it is disposed of, via the usual COM AddRef/Release mechanism, it also asks the graphics driver to deallocate the buffer.

Passing buffer memory mapped pointer to glTex(Sub)Image2D. Is texture upload asynchronous?

Suppose I map a buffer, with
map_ptr = glMapBuffer(..) (The target shouldn't matter, but let's say it's GL_TEXTURE_BUFFER)
Next I upload texture data with:
glTexImage2D(..., map_ptr), passing map_ptr as my texture data. (I don't have a GL_PIXEL_UNPACK_BUFFER bound)
Semantically, this involves copying the data from the buffer's data store to the texture object's data store, and the operation can be accomplished with a GPU DMA copy.
But what actually happens? Is the data copied entirely on the GPU, or does the CPU read and cache the mapped memory, and then write back to GPU at a separate GPU memory location? I.e. is the copy asynchronous, or does the CPU synchronously coordinate the copy, utilizing CPU cycles?
Is the answer to that implementation dependent? Does it depend on whether the OpenGL driver is intelligent enough to recognize the data pointer passed to glTexImage2D a GPU memory mapped pointer, and that a round-trip to the CPU is unnecessary? If so, how common is this feature in prevalent drivers today?
Also, what about the behaviour for an OpenCL buffer whose memory was mapped, i.e:
map_ptr = clEnqueueMapBuffer(..) (OpenCL buffer mapped memory)
and map_ptr was passed to glTexImage2D?
What you do there is simply undefined behavior as per the spec.
Pointer values returned by MapBufferRange may not be passed as parameter
values to GL commands. For example, they may not be used to specify array
pointers, or to specify or query pixel or texture image data; such
actions produce undefined results, although implementations may not
check for such behavior for performance reasons.
Let me quote from the GL_ARB_vertex_buffer_object extension spec, which originally introduced buffer objects and mapping operations (emphasis mine):
Are any GL commands disallowed when at least one buffer object is mapped?
RESOLVED: NO. In general, applications may use whatever GL
commands they wish when a buffer is mapped. However, several
other restrictions on the application do apply: the
application must not attempt to source data out of, or sink
data into, a currently mapped buffer. Furthermore, the
application may not use the pointer returned by Map as an
argument to a GL command.
(Note that this last restriction is unlikely to be enforced in
practice, but it violates reasonable expectations about how
the extension should be used, and it doesn't seem to be a very
interesting usage model anyhow. Maps are for the user, not
for the GL.)
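The defined way to get the asynchronous upload the question is after is to leave the buffer unmapped, bind it as GL_PIXEL_UNPACK_BUFFER, and pass a byte offset instead of a client pointer. A minimal sketch, assuming 'pbo' already contains the pixel data and has been unmapped, and that width and height are defined:

glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
// With an unpack buffer bound, the last argument is a byte offset into the
// buffer's data store, not a pointer into client memory.
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, nullptr);      // offset 0
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
// The driver is now free to perform the copy GPU-side, asynchronously.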

When to clear a vertex buffer object

I'm just wondering: when (and maybe how) should you clear the data from a VBO? Do you always have to clear it before rewriting the data? Why clear it?
Clearing the buffer (i.e. setting each byte to 0) isn't too useful. Invalidating the buffer is.
Invalidating a section of a buffer means that the contents of that section become invalid, and you must write new content to that section before using it. This allows the OpenGL implementation to avoid waiting until the buffer object is no longer being used in order to upload data to it by giving you a completely 'new' buffer to write to (under the same name). This technique is called buffer orphaning.
To invalidate a buffer, you can either call glBufferData with the same size and usage hints, but with a NULL data pointer, use glMapBufferRange with the GL_MAP_INVALIDATE_BUFFER_BIT, or glInvalidateBufferData if your GPU supports it.
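For example, the three options look roughly like this (a sketch; 'vbo' and 'size' are assumed to match the buffer's original allocation):

// Option 1: orphan the buffer by re-specifying the same size and usage with a NULL pointer.
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, size, nullptr, GL_STREAM_DRAW);
// Option 2: map with the invalidate flag and write the new contents.
void* ptr = glMapBufferRange(GL_ARRAY_BUFFER, 0, size,
                             GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
// ... fill 'ptr' with the new data ...
glUnmapBuffer(GL_ARRAY_BUFFER);
// Option 3: explicit invalidation (GL 4.3 or ARB_invalidate_subdata).
glInvalidateBufferData(vbo);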
The OpenGL Wiki article for Buffer Object Streaming covers this in more detail, and also offers several other solutions.
To directly answer your question, it is not required that you invalidate or clear a buffer before updating it. You can call glBufferSubData whenever you want to update whatever contents you want. However, doing so without invalidation may cause a pipeline stall as OpenGL waits for the buffer to finish being used before safely updating it.

Read from a socket without the associated memcpy from kernel space to user space

In Linux, is there a way to read from a socket while avoiding the implicit memcpy of the data from kernel space to user space?
That is, instead of doing
ssize_t n = read(socket_fd, buffer, count);
which obviously requires the kernel to do a memcpy from the network buffer into my supplied buffer, I would do something like
ssize_t n = fancy_read(socket_fd, &buffer, count);
and on return have buffer pointing to non memcpy()'ed data received from the network.
Initially I thought the AF_PACKET socket family could be of help, but it cannot.
Nevertheless, it is technically possible, as nothing prevents you from implementing a kernel module that handles a system call returning a user-mapped pointer to kernel data (even if that is not very safe).
There are a couple of questions regarding the call you would like to have:
Memory management: how would you know the memory can still be accessed after the fancy_read system call has returned?
How would you tell the kernel that it may eventually free that memory? There would need to be some form of memory management in place, and if you wanted the kernel to give you a safe pointer to non-memcpy'ed memory, a lot of changes would need to go into the kernel to enable this feature. Just imagine that none of that data could be freed before you say it can; the kernel would need to keep track of all of these returned pointers.
All of this could be done in a lot of ways, so basically yes, it is possible, but you need to take many things into consideration.
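To make the memory-management problem concrete, a hypothetical interface would have to look something like the sketch below. fancy_read and fancy_release are invented names (no such system calls exist); the point is that the kernel must be told explicitly when its pages may be reclaimed.

void* buffer = nullptr;
// Hypothetical: the kernel maps its receive pages into the process and hands back a pointer.
ssize_t n = fancy_read(socket_fd, &buffer, count);
if (n > 0) {
    // ... read the data in place, without any copy into a user-supplied buffer ...
    fancy_release(socket_fd, buffer, n);   // hypothetical: tell the kernel the pages may be reused
}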

Is there any chance to get access violation from memcpy, if src buffer is accessed(read/write) from other thread

I got an access violation from my application.
CallStack:
0da0ccfc 77c46fa3 ntdll!KiUserExceptionDispatcher+0xe
0da0d004 4dfeee3a msvcrt!memcpy+0x33
0da0d45c 4dfdbc4b MyLibrary!MyClass::MyFunc+0x8d [MyFile.cpp # 574]
[MyFile.cpp # 574] memcpy( m_pMyPointer, m_pSrcPointer, m_nDataSize);
Here I'm sure of the following things: m_pMyPointer is valid and no other thread will read or write to this memory. The size of the m_pMyPointer allocation is greater than m_nDataSize. m_pSrcPointer may be accessed from another thread (read or write). There is very little chance that the size of m_pSrcPointer is less than m_nDataSize.
My question is: is there any chance of getting an access violation from memcpy( m_pMyPointer, m_pSrcPointer, m_nDataSize) if another thread tries to read/write m_pSrcPointer, given that memcpy() only reads from m_pSrcPointer and does not write to it?
I would exclude that. Concurrent read access to a memory area is by definition thread-safe. When one thread writes to a location of memory which is read by another, you lose thread-safety in the sense that the result is unpredictable, but you still should not get an access violation (on most sane platforms, including x86).
Most likely, the size of the valid memory area pointed by either m_pMyPointer or m_pSrcPointer is smaller than m_nDataSize.
However, if you suspect that the same piece of memory is read and written by different threads at the same time, it means you are at the very least missing a locking scheme there.
If the concurrent threads change only the data in the buffer, you should not get any AV by copying from/to the buffer.
If the concurrent threads change the pointer to the buffer or the variable containing the size of the buffer (number of bytes or elements), you can get an AV easily from copying to/from the buffer using these pointer and size variables. Here you're entering the land of undefined behavior.
There is a small possibility: if the write to m_srcPtr isn't atomic, or if the other thread is writing to one of the other members and you haven't told us about it (m_nDataSize sounds like a likely candidate, for instance).
If the write to m_srcPtr isn't atomic, depending on your architecture, you could temporarily have an invalid pointer between the 2 writes to the pointer. If m_nDataSize is updated at "the same time" as m_srcPtr, you have plenty of opportunity for bad things to happen.
Note that this is highly architecture dependent.
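One way to rule out this class of problems is to make sure the pointer and the size are only ever read and written together under the same lock. A minimal sketch, with member names borrowed from the question and a mutex added as an assumed new member:

#include <cstddef>
#include <cstring>
#include <mutex>

class MyClass {
public:
    void MyFunc() {
        std::lock_guard<std::mutex> lock(m_srcMutex);
        // The pointer and the size are guaranteed to be consistent here.
        std::memcpy(m_pMyPointer, m_pSrcPointer, m_nDataSize);
    }
    void UpdateSource(void* newPtr, std::size_t newSize) {  // hypothetical writer-thread entry point
        std::lock_guard<std::mutex> lock(m_srcMutex);
        m_pSrcPointer = newPtr;                             // pointer and size always change together
        m_nDataSize   = newSize;
    }
private:
    std::mutex  m_srcMutex;       // assumed new member guarding the two fields below
    void*       m_pMyPointer  = nullptr;
    void*       m_pSrcPointer = nullptr;
    std::size_t m_nDataSize   = 0;
};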