OpenGL: Updating vertex buffer with glBufferData

I'm using OpenGL to implement some kind of batched drawing. For this I create a vertex buffer to store data.
Note: this buffer generally will update on each frame, but will never decrease size (but still can increase).
My question is: is it technically correct to use glBufferData (with a streaming, write-only usage hint) to update it, instead of e.g. glMapBuffer? I suppose there's no need to map it, since the full data is updated, so I just send the full pack at once. And if the current buffer size is less than what I'm sending, it will automatically grow, won't it? I'm just not sure about the way it really works (maybe it will recreate the buffer on each call, no?).

It would be better to have a buffer with a fixed size and not recreate it every frame.
You can achieve this by:
- creating the buffer with a maximum size, for instance space for 1000 verts
- updating only the beginning of the buffer with new data: if you changed data for 500 verts, then fill only the first half of the buffer, e.g. using glMapBuffer
- changing the count of drawn vertices when drawing: you can, for instance, use only some range of verts (e.g. from 200 to 500) out of the whole 1000-vert buffer, with glDrawArrays(mode, first, count)

Ideas from the comments:
- glMapBufferRange and glBufferSubData could also help
- also consider double buffering of buffers; see http://hacksoflife.blogspot.com/2010/02/double-buffering-vbos.html

Hope that helps; a minimal sketch of the fixed-size approach follows.
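Here is a minimal sketch of that fixed-capacity approach; Vertex, vbo, verts, and vertCount are illustrative names not taken from the question, and the VAO setup is assumed to be done elsewhere:

// One-time setup: allocate the full capacity up front, with no initial data.
const GLsizeiptr kCapacity = 1000 * sizeof(Vertex);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, kCapacity, nullptr, GL_STREAM_DRAW);

// Per frame: upload only the vertices that actually changed...
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferSubData(GL_ARRAY_BUFFER, 0, vertCount * sizeof(Vertex), verts.data());
// (glMapBufferRange with GL_MAP_INVALIDATE_RANGE_BIT would work here too.)

// ...and draw only that range.
glDrawArrays(GL_TRIANGLES, 0, (GLsizei)vertCount);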

In addition to what fen and datenwolf said, see Chapter 22 of OpenGL Insights; in particular, it includes timings for a variety of hardware & techniques.

Related

How to sample Renderbuffer depth information and process it in CPU code, without causing an impact on performance?

I am trying to sample a few fragments' depth data that I need to use in my client code (that runs on CPU).
I tried a glReadPixels() on my Framebuffer Object, but it turns out it stalls the render pipeline as it transfers data from video memory to main memory through the CPU, thus causing unbearable lag (please correct me if I am wrong).
I read about Pixel Buffer Objects, that we can use them as copies of other buffers and, very importantly, perform the glReadPixels() operation without stalling the pipeline, at the cost of reading slightly outdated information. (That's OK for me.)
But I am unable to understand how to use Pixel Buffers.
What I've learnt is that we need to sample data from a texture to store it in a Pixel Buffer. But I am trying to sample from a Renderbuffer, which I've read is not possible.
So here's my problem: I want to sample the depth information stored in my Renderbuffer, store it in RAM, process it, and do other stuff, without causing any issues to the rendering pipeline. If I use a depth texture instead of a renderbuffer, I don't know how to use it for depth testing.
Is it possible to copy the entire Renderbuffer to the Pixelbuffer and perform read operations on it?
Is there any other way to achieve what I am trying to do?
Thanks!
glReadPixels can also transfer from a framebuffer to a standard GPU-side buffer object. If you generate a buffer and bind it to the GL_PIXEL_PACK_BUFFER target, the data pointer argument to glReadPixels is instead interpreted as an offset into that buffer object (so it should probably be 0 unless you are doing something clever).
Once you've copied the pixels you need into a buffer object, you can transfer or map or whatever back to the CPU at a time convenient for you.
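Here is a minimal sketch of that flow; pbo, width, and height are illustrative names, and the framebuffer with the depth attachment is assumed to be bound for reading:

// Setup: create a pack buffer big enough for the depth data.
GLuint pbo;
glGenBuffers(1, &pbo);
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
glBufferData(GL_PIXEL_PACK_BUFFER, width * height * sizeof(GLfloat),
             nullptr, GL_STREAM_READ);

// With a pack buffer bound, the last argument is an offset, not a CPU pointer.
glReadPixels(0, 0, width, height, GL_DEPTH_COMPONENT, GL_FLOAT, 0);

// Later (e.g. a frame or two afterwards), map the buffer and read the data;
// by then the transfer has usually completed, so the map does not stall.
const GLfloat* depth = (const GLfloat*)glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
if (depth) {
    // ... use depth[y * width + x] ...
    glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
}
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);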

How to update vertex buffer data frequently in DirectX 11?

I am trying to update my vertex buffer data with the Map function in DirectX. Though it does update the data once, if I iterate over it the model disappears. I am actually trying to manipulate vertices in real time by user input, and to do so I have to update the vertex buffer every frame while a vertex is selected.
Perhaps this happens because the Map function disables GPU access to the vertices until the Unmap function is called. So if the access is blocked every frame, it kind of makes sense for it to not be able to render the mesh. However, when I update the vertex every frame and then stop after some time, theoretically the mesh should show up again, but it doesn't.
I know that the proper way to update data every frame is to use constant buffers, but manipulating vertices with constant buffers might not be a good idea, and I don't think there is any other way to update the vertex data. I expect dynamic vertex buffers to be able to handle being updated every frame.
D3D11_MAPPED_SUBRESOURCE mappedResource;
ZeroMemory(&mappedResource, sizeof(D3D11_MAPPED_SUBRESOURCE));
// Disable GPU access to the vertex buffer data.
pRenderer->GetDeviceContext()->Map(pVBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &mappedResource);
// Update the vertex buffer here.
memcpy((Vertex*)mappedResource.pData + index, pData, sizeof(Vertex));
// Reenable GPU access to the vertex buffer data.
pRenderer->GetDeviceContext()->Unmap(pVBuffer, 0);
As has already been answered, the key issue is that you are using Discard (which means you won't be able to retrieve the previous contents from the GPU), so I thought I would add a little in terms of options.
The question I have is whether you require performance or the convenience of having the data in one location?
There are a few configurations you can try:
1. Set up your buffer to have both CPU read and write access. This means you will be pushing and pulling your buffer up and down the bus. It also causes performance issues on the GPU, such as blocking (waiting for the data to be moved back onto the GPU). I personally don't use this in my editor.
2. If memory is not the issue, keep a copy of your buffer on the CPU side, and each frame map with Discard and block-copy the data across. This is performant, but also memory intensive. You obviously have to manage the data partitioning and indexing into this space. I don't use this, but I toyed with it; too much effort!
3. You bite the bullet: you map the buffer as per 2, and write each vertex object into the mapped buffer. I do this, and unless the buffer is freaking huge, I haven't had issues with it in my own editor.
4. Use a compute shader to update the buffer: create an unordered access view for it and pass the updates via a constant buffer. A bit of a sledgehammer to crack a walnut, and it still doesn't change the fact that you may need to pull the data back off the GPU, as per item 1.
There are some variations on managing the buffer you can also play with, such as interleaving (two copies, one in use on the GPU while the other is being written to). There are also some rather ornate mechanisms, such as building the content of the buffer in another thread and then flagging the update.
At the end of the day, DX 11 doesn't offer the ability (someone might know better) to edit the data in GPU memory directly; there is a lot of shifting between CPU and GPU.
Good luck with whichever technique you choose.
Mapping a buffer with the D3D11_MAP_WRITE_DISCARD flag causes the entire buffer content to become invalid, so you cannot use it to update just a single vertex. Keep a copy of the buffer on the CPU side instead, and then update the entire buffer on the GPU side once per frame.
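A minimal sketch of that pattern, reusing pVBuffer, pRenderer, and Vertex from the question (vertices and editedVertex are illustrative names):

// CPU-side authoritative copy of the geometry.
std::vector<Vertex> vertices;
// ... apply the user's edit on the CPU ...
vertices[selectedIndex] = editedVertex;

// Once per frame: discard the old GPU contents and rewrite all of them.
D3D11_MAPPED_SUBRESOURCE mapped = {};
HRESULT hr = pRenderer->GetDeviceContext()->Map(
    pVBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
if (SUCCEEDED(hr)) {
    // DISCARD invalidated everything, so copy the whole array, not one element.
    memcpy(mapped.pData, vertices.data(), vertices.size() * sizeof(Vertex));
    pRenderer->GetDeviceContext()->Unmap(pVBuffer, 0);
}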
If you develop for UWP, use of Map/Unmap may also result in synchronization problems, because ID3D11DeviceContext methods are not thread safe: https://learn.microsoft.com/en-us/windows/win32/direct3d11/overviews-direct3d-11-render-multi-thread-intro
If you update the buffer from one thread and render from another, you may get various errors. In that case you must use some synchronization mechanism, such as critical sections. An example is here: https://developernote.com/2015/11/synchronization-mechanism-in-directx-11-and-xaml-winrt-application/
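For illustration, a minimal sketch of such synchronization using a std::mutex in place of a Win32 critical section (all names are illustrative):

std::mutex contextMutex;  // guards all use of the immediate context

// Update thread:
void UpdateVertexBuffer(ID3D11DeviceContext* ctx, ID3D11Buffer* vb,
                        const Vertex* data, size_t count) {
    std::lock_guard<std::mutex> lock(contextMutex);
    D3D11_MAPPED_SUBRESOURCE mapped = {};
    if (SUCCEEDED(ctx->Map(vb, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped))) {
        memcpy(mapped.pData, data, count * sizeof(Vertex));
        ctx->Unmap(vb, 0);
    }
}

// The render thread takes the same lock around its Draw calls on that context.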

Calling glBufferSubData several times per frame

I'm doing rendering where I call glBufferSubData several times per frame.
Here is how I do it in my code:
glBindBuffer(GL_ARRAY_BUFFER,_vboID);
//Buffering the data
glBufferData(GL_ARRAY_BUFFER,_vboDATA[type].size()*sizeof(vboData),nullptr,GL_DYNAMIC_DRAW);
glBufferSubData(GL_ARRAY_BUFFER,0,_vboDATA[type].size()*sizeof(vboData),_vboDATA[type].data());
glBindBuffer(GL_ARRAY_BUFFER,0);
I'm asking because I have different types of elements, and each has its own vector to represent VBO data. (I'm using one VBO, for colors, UVs, and positions, and one VAO in my program.) I'm doing the rendering like:
1.) Load vector of element 1 to buffer
2.) Render element 1
3.) Load vector of element 2 to buffer
4.) Render element 2
...
Is it proper to do it like that?
The proper way depends on your usage scenario. When the data is mostly static, the best way is to upload the data once to the GPU. This can be done either by creating a VBO for each object or by packing all objects together into a single VBO.
In case the data is fully dynamic (changing every frame), you will have to upload it anyway, so using just one buffer can be fine. But you should still try to get by with just uploading the data (glBufferSubData) and avoid allocating new memory (glBufferData) over and over. You can, for example, choose an initial size for the buffer that is large enough to fit all your data, as in the sketch below. Another option would be a persistently mapped buffer.
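As a sketch, reusing _vboID, _vboDATA, and vboData from the question (maxElements is an illustrative capacity chosen at init time):

// At init: allocate generous storage once, with no initial data.
glBindBuffer(GL_ARRAY_BUFFER, _vboID);
glBufferData(GL_ARRAY_BUFFER, maxElements * sizeof(vboData), nullptr, GL_DYNAMIC_DRAW);

// Per element type, per frame: reuse the storage, upload only the data.
glBindBuffer(GL_ARRAY_BUFFER, _vboID);
glBufferSubData(GL_ARRAY_BUFFER, 0,
                _vboDATA[type].size() * sizeof(vboData),
                _vboDATA[type].data());
glDrawArrays(GL_TRIANGLES, 0, (GLsizei)_vboDATA[type].size());
glBindBuffer(GL_ARRAY_BUFFER, 0);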

CPU/GPU Shared Buffer in Direct3D12

I have no experience with Direct3D, so I may just be looking in the wrong places. However, I would like to convert a program I have written in OpenGL (using FreeGLUT) to a Windows IoT compatible UWP app (running Direct3D 12, 'cause it's cool). I'm trying to port my program to a Raspberry Pi 3 and I don't want to convert it to Linux.
Through the examples provided by Microsoft I have figured out most of what I believe I need to know to get started, but I can't figure out how to share a dynamic data buffer between the CPU and GPU.
What I want to know how to do:
Create a CPU/GPU shared circular buffer
Read and Draw with the GPU
Write / Replace sections with the CPU
Quick semi-pseudo code:
while (!buffer.inUse()) {                      // wait until buffer is not in use
    updateBuffer(buffer.id, data, start, end); // insert data into buffer
    drawToScreen(buffer.id);                   // draw using vertex data in buffer
}
This was previously done in OpenGL by simply using glBegin()/glEnd() and glVertex3f() for each value in an array when it wasn't being written to.
Update: I basically want a Direct3D12 equivalent of OpenGLs VBO editing using glBufferSubData(). If that makes more sense.
Update 2: I found that I can get away with discarding the vertex buffer every frame and re-uploading a new buffer to the GPU. There's a fair amount of overhead, as one would expect with transferring 10,000-200,000 doubles every frame. So I'm trying to find a way to use constant buffers to pass the 5-10 updated vertices into the shader, so I can copy from the constant buffer into the vertex buffer using the shader and not have to use Map/Unmap every frame. This way my circular buffer on the CPU is independent of the buffer being used on the GPU, but they will both share the same information through periodic updates. I'll do some more looking and post another, more specific question on shaders if I don't find a solution.
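For reference, the closest Direct3D 12 analogue of the glBufferSubData workflow is a vertex buffer in an upload heap kept persistently mapped. A minimal sketch, assuming the d3dx12.h helpers and an existing device (all other names are illustrative):

// Create the vertex buffer in an upload heap (CPU-writable, GPU-readable).
Microsoft::WRL::ComPtr<ID3D12Resource> vb;
CD3DX12_HEAP_PROPERTIES heapProps(D3D12_HEAP_TYPE_UPLOAD);
CD3DX12_RESOURCE_DESC desc = CD3DX12_RESOURCE_DESC::Buffer(vertexBytes);
device->CreateCommittedResource(&heapProps, D3D12_HEAP_FLAG_NONE, &desc,
    D3D12_RESOURCE_STATE_GENERIC_READ, nullptr, IID_PPV_ARGS(&vb));

// Upload-heap resources may stay mapped for their whole lifetime.
void* mapped = nullptr;
CD3DX12_RANGE readRange(0, 0);  // we never read this buffer on the CPU
vb->Map(0, &readRange, &mapped);

// Per update: overwrite only the changed region, but only after a fence
// confirms the GPU has finished reading it (the circular-buffer part).
memcpy(static_cast<uint8_t*>(mapped) + startByte, newData, newByteCount);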

Read framebuffer texture like a 1D array

I am doing some gpgpu calculations with GL and want to read my results from the framebuffer.
My framebuffer texture is logically a 1D array, but I made it 2D to have a bigger area. Now I want to read from any arbitrary pixel in the framebuffer texture, with any given length.
That means all calculations are already done on the GPU side, and I only need to pass certain data to the CPU, data that may wrap across the row boundaries of the texture.
Is this possible? If yes, is it slower or faster than calling glReadPixels on the whole image and then cutting out what I need?
EDIT
Of course I know about OpenCL/CUDA, but they are not desired because I want my program to run out of the box on (almost) any platform.
Also, I know that glReadPixels is very slow, and one reason might be that it offers functionality I do not need (operating in 2D). That is why I asked for a more basic function that might be faster.
Reading the whole framebuffer with glReadPixels just to discard all of it except a few pixels or lines would be grossly inefficient. But glReadPixels lets you specify a rect within the framebuffer, so why not just restrict it to fetching the few rows of interest (as in the sketch below)? You may end up fetching some extra data at the start and end of the first and last rows, but I suspect that overhead is minimal compared with making multiple calls.
Possibly writing your data to the framebuffer in tiles and/or using Morton order might help structure it so that a tighter bounding box can be found and the extra data retrieved is minimised.
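A minimal sketch of the row-range fetch, assuming the 1D data is laid out row-major in a width-by-height single-channel float texture (all names are illustrative):

// Fetch only the rows that cover the 1D range [first, first + count).
int firstRow = first / width;
int lastRow  = (first + count - 1) / width;
int rows     = lastRow - firstRow + 1;

std::vector<float> scratch(width * rows);
glReadPixels(0, firstRow, width, rows, GL_RED, GL_FLOAT, scratch.data());

// The wanted elements start at this offset inside the fetched rows.
const float* result = scratch.data() + (first - firstRow * width);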
You can use a pixel buffer object (PBO) to transfer pixel data from the framebuffer to the PBO, then use glMapBufferARB to read the data directly:
http://www.songho.ca/opengl/gl_pbo.html