Multithreaded loading of data to the GPU in OpenGL - C++

I am attempting to write a multithreaded "load from disk" algorithm for my C++/OpenGL game engine.
My current situation is as follows:
Thread #0(main thread): Core engine functions, Rendering work.
Thread #1: Spatial processing (physics, etc.; not really relevant to the problem).
Thread #2: background loading from disk
The algorithm loads Entity details from an XML file on disk, which contains graphics information, such as model and texture files for rendering.
The idea is that the engine should be capable of loading entities from disk, and then from memory to the GPU without blocking the main thread. However, at the moment, it can only load into main memory.
Whenever I load data on thread 2, I have to notify thread 0 that the load from disk is complete. Then the engine code running on thread 0 makes the necessary GL calls to send data from memory to the GPU, and sends the entities to the renderer.
I am aware that making OpenGL calls on a context from multiple threads at once is undefined behaviour and will cause the program to crash.
I am also aware that it is possible to create shared GL contexts, and that on each thread where you want to make GL calls you must first make that thread's context current, then make the calls.
As I understand it, making a GL context current will make all other contexts on other threads inactive, which leads us back to the undefined behaviour/crash situation.
I would prefer that the populating of VAO/VBO/texture objects be done on thread 2. I think this could be achieved by creating a context on thread 2 and making it the currently active context there, but I am unsure how this would affect rendering. Would I have to stop rendering whilst this is being done? If so, then I don't see any benefit, as it may as well be done on the main thread.
Is there a way to create and populate buffer objects on thread 2, whilst not interfering with rendering operations on thread 0?
To Clarify:
Thread 2 will never perform rendering, only loading of data onto the GPU.
Thread 0 only works with already populated buffers. It will only be given data that already exists on the GPU.

Related

Synchronising fences and texture unloading

My OpenGL application displays a 3D view where images are loaded on demand based on what is visible. The load operation sends a load request to the texture loader thread, which then does this:
...
<Open image and read into an OpenGL texture = tex>
// Is this lock needed if this is one of multiple loader threads that each
// pass textures to the main thread? I.e. is glClientWaitSync() thread-safe?
{
    std::lock_guard<std::mutex> lk(mOpenGLClientWaitSyncMutex);
    // We need to wait on a fence before alerting the primary thread that the
    // texture is ready. Note: the second argument of glClientWaitSync() is a
    // flags bitfield; GL_SYNC_GPU_COMMANDS_COMPLETE belongs in glFenceSync(),
    // so GL_SYNC_FLUSH_COMMANDS_BIT is what should be passed here.
    glClientWaitSync(mSync, GL_SYNC_FLUSH_COMMANDS_BIT, 0);
    // Pass texture to MAIN THREAD
    textureLoaderRequest->mContentViewer->setTexture(tex, requestFrameIndex);
}
...
Note: a shared OpenGL context is created for each loader thread and shares objects with the main context.
I have the following areas of uncertainty:
a) OpenGL Fences.
I must admit I am still chewing over the subject of OpenGL fences. At the moment there is only one texture loader thread plus the main thread. If I had multiple texture loader threads, would I need to synchronise their fence->glClientWaitSync() calls? Or am I being daft here, and would the glClientWaitSync() in each loader thread provide the necessary guards on its own for multiple threads setting textures to be drawn on the main thread? I did some reading, added the lock_guard above, and did a bit of testing, and it didn't seem to affect anything (but that doesn't usually mean anything, hence my question).
b) Texture Unloading.
My goal is to have another thread that performs the unloads, to keep the UI smooth whilst not delaying the IO-related loader thread(s). Because I am creating the texture on the loader thread, what sort of context arrangement is required so that the unloader thread can unload it? At the moment they are all 'attached' to the main context, so I guess my question is whether the sibling contexts need some sort of "direct" sharing arrangement, or whether they are already shared due to the association with the main OpenGL context. How would you tackle such a problem?

OpenGL: multithreaded glFlushMappedBufferRange?

I know that multithreaded OpenGL is a delicate topic, and I am not trying to render from multiple threads here. Nor am I trying to create multiple contexts and share objects via share lists. I have a single context, and I issue draw commands and GL state changes only from the main thread.
However, I am dynamically updating parts of a VBO in every frame. I only write to the VBO, I do not need to read it on the CPU side. I use glMapBufferRange so I can compute the changed data on the fly and don't need an additional copy (which would be created by the blocking glBufferSubData).
It works, and now I would like to multithread the data update (since it needs to update a lot of vertices at a steady 90 fps) and use a persistently mapped buffer (using GL_MAP_PERSISTENT_BIT). This will require issuing glFlushMappedBufferRange whenever a worker thread finishes updating its part of the mapped buffer.
Is it fine to call glFlushMappedBufferRange from a separate thread? The ranges the different threads operate on do not overlap. Is there any overhead or implicit synchronisation involved in doing so?
No, you need to call glFlushMappedBufferRange on the thread that owns the OpenGL context.
To overcome this you have two options:
Get the OpenGL context and make it current in the worker thread, which means the OpenGL thread has to relinquish the context first for this to work.
Push the relevant range into a thread-safe queue and let the OpenGL thread pop each range from it and call glFlushMappedBufferRange.
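The second option can be sketched as follows. FlushRange and FlushQueue are illustrative names; the workers write directly into the persistently mapped memory, and only the bookkeeping crosses threads, so the actual glFlushMappedBufferRange call stays on the GL thread:

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// A byte range of the persistently mapped buffer a worker finished writing.
struct FlushRange {
    std::ptrdiff_t offset;  // byte offset into the mapped buffer
    std::ptrdiff_t length;  // number of bytes written by the worker
};

// Workers push ranges; the GL thread drains them once per frame.
class FlushQueue {
public:
    void push(FlushRange r) {
        std::lock_guard<std::mutex> lock(mutex_);
        ranges_.push_back(r);
    }
    // Called on the GL thread; returns all pending ranges in one batch.
    std::vector<FlushRange> drain() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<FlushRange> out;
        out.swap(ranges_);
        return out;
    }
private:
    std::mutex mutex_;
    std::vector<FlushRange> ranges_;
};

// On the GL thread, each frame (the buffer must be mapped with
// GL_MAP_FLUSH_EXPLICIT_BIT for explicit flushing to apply):
//   for (const FlushRange& r : queue.drain())
//       glFlushMappedBufferRange(GL_ARRAY_BUFFER, r.offset, r.length);
```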

A way of generating chunks

I'm making a game and I'm currently working on the generation of the map.
The map is generated procedurally with some algorithms; there are no problems with this.
The problem is that my map can be huge, so I've thought about cutting the map into chunks.
My chunks are OK; they're 512*512 pixels each. The only problem is that I have to generate a texture (actually a RenderTexture from SFML) for each one. It takes around 0.5ms to generate, so it makes the game freeze each time I generate a chunk.
I've thought about a way to fix this: I've made a kind of thread pool with a factory. I just have to send a task to it and it creates the chunk.
Now that it's all implemented, it raises opengl warnings like :
"An internal OpenGL call failed in RenderTarget.cpp (219) : GL_INVALID_OPERATION, the specified operation is not allowed in the current state".
I don't know if this is a good way of dealing with chunks. I've also thought about saving the chunks to images/files, but I fear that it would take too much time to save/load them.
Do you know a better way to deal with this kind of "infinite" maps ?
It is an invalid operation because you must have a context bound in each thread. More importantly, all of the GL window-system APIs enforce a strict 1:1 mapping between threads and contexts: no thread may have more than one context bound, and no context may be bound to more than one thread. What you would need to do is use shared contexts (one context for drawing and one for each worker thread); things like buffer objects and textures will be shared between all shared contexts, but the state machine and container objects like FBOs and VAOs will not.
Are you using tiled rendering for this map, or is this just one giant texture?
If you do not need to update individual sub-regions of your "chunk" images you can simply create new textures in your worker threads. The worker threads can create new textures and give them data while the drawing thread goes about its business. Only after a worker thread finishes would you actually try to draw using one of the chunks. This may increase the overall latency between the time a chunk starts loading and eventually appears in the finished scene but you should get a more consistent framerate.
If you need to use a single texture for this, I would suggest you double buffer your texture. Have one that you use in the drawing thread and another one that your worker threads issue glTexSubImage2D (...) on. When the worker thread(s) finish updating their regions of the texture you can swap the texture you use for drawing and updating. This will reduce the amount of synchronization required, but again increases the latency before an update eventually appears on screen.
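The double-buffering suggestion above reduces to a small index swap. This is a sketch: the texture names are placeholders, and the glTexSubImage2D update and the draw are assumed to happen on the worker and drawing threads respectively, with the swap performed only once the update is known to be complete:

```cpp
#include <atomic>
#include <cstdint>

// Two GL texture names: the draw thread always samples textures[front],
// the worker updates textures[1 - front], and swap() exchanges the roles.
struct DoubleBufferedTexture {
    uint32_t textures[2] = {0, 0};  // hypothetical GL texture names
    std::atomic<int> front{0};      // index the draw thread reads from

    uint32_t drawTexture() const { return textures[front.load()]; }
    uint32_t updateTexture() const { return textures[1 - front.load()]; }

    // Called once the worker has finished writing (and its GL work is
    // complete); after this the draw thread picks up the updated texture.
    void swap() { front.store(1 - front.load()); }
};
```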
Things to try:
Make your chunks smaller.
Generate the chunks in a separate thread, but pass them to the GPU from the main thread.
Pass the data to the GPU a small piece at a time, spread over a second or two.

Is there a way to safely bind texture using OpenGL in a worker thread of Qt GUI application?

I am currently working on a GUI software project for visualizing 3D scenes using Qt. The GUI allows the user to load batches of 3D data files such as .obj (with some .mtl support) and .stl, as well as 2D image files, into the scene as SceneObject-class objects, which are rendered on a QGLWidget-derived widget.
When I load them in batches on the main GUI thread, however, the long loading time causes the GUI to freeze, which is ugly. I have tried performing the loading on a separate thread, but there is one big catch: when loading .obj textures or image files, I also perform binding using OpenGL's glBindTexture() immediately after loading each image or texture, so that I only need to save texture IDs in each SceneObject instance. When I tried to perform the load in a worker thread, the whole program would just crash.
I have read that each thread can only access one GL context at a time, and that switching the context across threads is one possible but dangerous way to achieve what I wanted to do. Another possible way would be to perform texture binding on the GUI thread after loading is completed, but that would mean a complete re-design of my SceneObject class :(
Can anyone give me some advice on how to implement a loading thread for loading assets into a OpenGL scene?
I will also perform binding using OpenGL glBindTexture() immediately after loading each image or texture so that I only need to save texture IDs in each SceneObject instance. When I tried to perform the load in a worker thread, the whole program would just crash.
An OpenGL context may be active in only one thread at a time. In general, multithreaded OpenGL operation usually becomes a major nightmare to get right. In your case, what you intend to do is delegate resource loading. In the old days, before there were buffer objects, you'd have done this by creating a helper context and sharing its "lists" with the main context.
Today we have something better: buffer objects. A buffer object allows you to send data to OpenGL in an asynchronous way. It goes along the lines of:
    glGenBuffers(...);
    glBindBuffer(...);
    glBufferData(..., size, usage);
    void *p = glMapBuffer(...);
    memcpy(p, data, size);
    glUnmapBuffer(...);
    glTexImage2D(...) / glDrawPixels(...) / etc.
The important part to understand is that the address space allocated by glMapBuffer is shared across threads. So you can tell the OpenGL context in the main thread to map a buffer object, send a signal to your worker thread with the allocation, let the worker thread fill in the data, and upon finishing have it send a signal back to the OpenGL context thread to unmap.
EDIT for multithreading
So to do this you'd implement some signal handlers on both sides (pseudocode):
signal OpenGLThread::mapPixelBufferObjectForWrite(ImageLoader il):
    glGenBuffers(1, &self.bufferId)
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, self.bufferId)
    glBufferData(GL_PIXEL_UNPACK_BUFFER, il.unpackedImageSize, NULL, GL_STATIC_DRAW)
    BufferObjectMapping map(glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY))
    send_signal(target_thread = workerthread,
                target_signal_handler = workerthread::loadImageToBuffer(map, il),
                signal_on_finish = self.unmapPixelBufferObjectWriteFinishedGenTexture(map, il))

signal IOThread::loadImageToBuffer(BufferObjectMapping map, ImageLoader il):
    /* ... */

signal OpenGLThread::unmapPixelBufferObjectWriteFinishedGenTexture(BufferObjectMapping map, ImageLoader il):
    if map.mapping_target == GL_PIXEL_UNPACK_BUFFER:
        glBindBuffer(GL_PIXEL_UNPACK_BUFFER, self.bufferId)
        glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER)
        glGenTextures(1, &il.textureId)
        glBindTexture(il.target, il.textureId)
        for mipmaplevel in il.levels:
            glTexImage2D(il.target, mipmaplevel, il.internalformat, il.width, il.height, il.border, il.format, il.type, mipmaplevel.offset)

What is a good way to load textures dynamically in OpenGL?

Currently I am loading an image into memory on a second thread, and then during the display loop (if there is a texture load required), loading the texture.
I discovered that I could not load the texture on the second thread because OpenGL didn't like that; perhaps this is possible and I just did something wrong, so please correct me if it is.
On the other hand, if my failure was valid: how do I load a texture without disrupting the rendering loop? Currently the textures take around 1 second to load from memory, and although this isn't a major issue, it can be slightly irritating for the user.
You can load a texture from disk to memory on any thread you like, using any tool you wish for reading the files.
However, when you bind it to OpenGL, it's going to need to be handled on the same thread as the rendering for that OpenGL context. That being said, this discussion suggests that using a PBO in a second thread is an option, and can speed up the process.
You can certainly load the texture from disk into RAM in any number of threads you like, but OpenGL won't upload to VRAM from multiple threads, for the reason mentioned in Reed's answer.
Given that loading from disk is the slowest part, that's the bit you'll probably want to thread. The loading thread(s) build up a queue of textures to be uploaded; this queue is then consumed by the thread that owns the GL context (mind your access to that queue from the various threads, however). You could also consider a non-threaded approach of uploading N textures per frame, where N is a number that doesn't slow the rendering down too much.
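The budgeted, non-threaded variant mentioned at the end could look like this. PendingImage and uploadSome are illustrative names, and the GL upload calls are shown as comments since they must run on the thread that owns the context:

```cpp
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// CPU-side image that finished loading/decoding and awaits GPU upload.
struct PendingImage {
    int width = 0, height = 0;
    std::vector<unsigned char> pixels;  // decoded RGBA data
};

// Upload at most `budget` textures this frame so a burst of finished loads
// never stalls a single frame. Returns how many were uploaded.
std::size_t uploadSome(std::deque<PendingImage>& pending, std::size_t budget) {
    std::size_t uploaded = 0;
    while (uploaded < budget && !pending.empty()) {
        PendingImage img = std::move(pending.front());
        pending.pop_front();
        // On the GL-context-owning thread:
        //   glGenTextures(1, &tex);
        //   glBindTexture(GL_TEXTURE_2D, tex);
        //   glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, img.width, img.height,
        //                0, GL_RGBA, GL_UNSIGNED_BYTE, img.pixels.data());
        ++uploaded;
    }
    return uploaded;
}
```

Calling uploadSome(pending, N) once per frame from the render loop gives the "N textures per frame" behaviour; tune N until frame times stay acceptable.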