Is glDrawElements synchronized on back buffer? - opengl

If I call glDrawElements with the draw target being the back buffer, and then I call glReadPixels, is it guaranteed that I will read what was drawn?
In other words, is glDrawElements a blocking call?
Note: I am observing a weird issue here that may be caused by glDrawElements not being blocking...

In other words, is glDrawElements a blocking call?
That's not how OpenGL works.
The OpenGL memory model is built on the "as if" rule. Certain exceptions aside, everything in OpenGL will function as if all of the commands you issued have already completed. In effect, everything will work as if every command blocked until it completed.
However, this does not mean that the OpenGL implementation actually works this way. It just has to do everything to make it appear to work that way.
Therefore, glDrawElements is generally not a blocking call; however, glReadPixels (when reading to client memory) is a blocking call. Because the results of a pixel transfer directly to client memory must be available when glReadPixels has returned, the implementation must check to see if there are any outstanding rendering commands going to the framebuffer being read from. If there are, then it must block until those rendering commands have completed. Then it can execute the read and store the data in your client memory.
If you were reading to a buffer object, there would be no need for glReadPixels to block. No memory accessible to the client will be modified by the function, since you're reading into a buffer object. So the driver can issue the readback asynchronously. However, if you issue some command that depends on the contents of this buffer (like mapping it for reading or using glGetBufferSubData), then the OpenGL implementation must stall until the reading operation is done.
In short, OpenGL tries to delay blocking for as long as possible. Your job, to ensure performance, is to help OpenGL to do so by not forcing an implicit synchronization unless absolutely necessary. Sync objects can help with this.
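To make the "as if" rule concrete, here is a minimal sketch in plain C of a toy command queue. It is not OpenGL and not how any real driver is written; ToyGL, toy_draw and toy_read_pixels are hypothetical names invented for this illustration. The draw call only records work and returns, while the read into client memory has to run everything still pending before it can copy pixels out.

```c
#include <stddef.h>

#define TOY_QUEUE_CAP 16
#define TOY_FB_SIZE   4

/* Toy model only: a "driver" with a framebuffer and a queue of pending draws. */
typedef struct {
    int    framebuffer[TOY_FB_SIZE]; /* stands in for GPU memory */
    int    queued[TOY_QUEUE_CAP];    /* pending draws (the color each one fills) */
    size_t count;                    /* number of pending draws */
    size_t executed;                 /* draws the "GPU" has actually run */
} ToyGL;

/* Plays the role of glDrawElements: records the command and returns at once.
   Returns 0 on success, -1 if the toy queue is full. */
int toy_draw(ToyGL *gl, int color)
{
    if (gl->count == TOY_QUEUE_CAP)
        return -1;
    gl->queued[gl->count++] = color;
    return 0;
}

/* Runs every pending command in submission order. */
static void toy_drain(ToyGL *gl)
{
    for (size_t i = 0; i < gl->count; ++i) {
        for (size_t p = 0; p < TOY_FB_SIZE; ++p)
            gl->framebuffer[p] = gl->queued[i];
        gl->executed++;
    }
    gl->count = 0;
}

/* Plays the role of glReadPixels into client memory: the caller's buffer must
   be valid on return, so all pending draws have to finish first. */
void toy_read_pixels(ToyGL *gl, int *out)
{
    toy_drain(gl); /* the implicit synchronization */
    for (size_t p = 0; p < TOY_FB_SIZE; ++p)
        out[p] = gl->framebuffer[p];
}
```

In this model, two calls to toy_draw leave executed at 0. The following toy_read_pixels runs both draws and returns the color of the second one, which is exactly what the program would have observed if every call had blocked.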

Related

OpenGL commands - sequential or parallel

I'm reading this document
and I have a question about this sentence:
While OpenGL explicitly requires that commands are completed in order,
that does not mean that two (or more) commands cannot be concurrently
executing. As such, it is possible for shader invocations from one
command to be executing in tandem with shader invocations from other
commands.
Does this mean that, for example, when I issue two consecutive glDrawArrays calls it is possible that the second call is processed immediately before the first one has finished?
My first idea was that the OpenGL calls merely map to internal commands of the gpu and that the OpenGL call returns immediately without those commands completed, thus enabling the second OpenGL call to issue its own internal commands. The internal commands created by the OpenGL calls can then be parallelized.
What it says is that the exact order in which the commands are executed, and any concurrency between them, is left to the judgement of the implementation. The only constraint is that the final result must look exactly as if all the commands had been executed one after another, in the very order they were called by the client program.
EDIT: Certain OpenGL calls cause an implicit or explicit synchronization, for example reading back pixels or waiting for a synchronization event.

What is the purpose of clEnqueueAcquireGLObjects?

I did google for the question, and got from this link
clEnqueueAcquireGLObjects
Acquire OpenCL memory objects that have been created from OpenGL objects.
These objects need to be acquired before they can be used by any OpenCL commands queued to a command-queue.
I really don't understand why these objects need to be acquired. In my opinion, the reason for acquiring them is NOT OpenGL/OpenCL synchronization, because that synchronization can be achieved with glFinish and clFinish.
I mean, if clEnqueueAcquireGLObjects/clEnqueueReleaseGLObjects are used, then glFinish/clFinish are redundant, and vice-versa.
I mean, if clEnqueueAcquireGLObjects/clEnqueueReleaseGLObjects are used, then glFinish/clFinish are redundant, and vice-versa.
You're thinking about this in entirely the wrong way.
glFinish causes OpenGL to perform a full CPU synchronization, such that the implementation will have completed all commands afterwards. clFinish does something similar for OpenCL.
The fact that you called one or the other has absolutely no effect on what a different system does. OpenGL has no idea that OpenCL exists, and vice-versa. glFinish has nothing to do with clFinish and vice-versa. So while OpenGL may have finished making some modification to an object, OpenCL has no idea that these modifications took place.
The purpose of acquiring and releasing OpenGL objects is for OpenCL and OpenGL to talk to one another. When objects are acquired, OpenCL tells OpenGL, "Hey, see these objects? They're mine now, so give them to me." This means that the OpenGL/OpenCL driver will do whatever mechanics are necessary to transfer access control over those objects to OpenCL.
For example, if an object has been paged out of GPU memory, OpenCL acquiring it may need to make it resident again. OpenCL and OpenGL have two separate sets of records that refer to this memory; by acquiring the object, you synchronize the OpenCL data with changes made by OpenGL. And so forth.
Notice that these mechanics have nothing at all to do with synchronizing GPU operations. They are about making the objects accessible to OpenCL.
If your OpenCL implementation doesn't have cl_khr_gl_event, then you must use OpenGL's synchronization mechanism to ensure that those objects are no longer in use before you acquire them. The two functions aren't redundant; they're doing different things to ensure the integrity of the system.
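A minimal sketch of why both steps are needed, written as a toy model in plain C. Nothing here is the OpenGL or OpenCL API: SharedObject and the toy_ functions are hypothetical names, and the comments only say which real call each one stands in for. The model assumes that a GL write stays pending until a finish, and that acquiring copies whatever GL has committed into the state CL sees. In a real system the result of skipping the synchronization is simply not defined; the model makes it show up as stale data so that the effect is visible.

```c
#include <stdbool.h>

/* Toy model only: one object shared between a "GL side" and a "CL side". */
typedef struct {
    int  committed;   /* value GL has actually finished writing */
    int  pending;     /* value of a queued, not yet executed GL write */
    bool has_pending; /* true while a GL write is still in flight */
    bool owned_by_cl; /* true between acquire and release */
    int  cl_view;     /* what CL commands see */
} SharedObject;

/* Stands in for a GL command that modifies the object: it is only queued. */
void toy_gl_write(SharedObject *o, int value)
{
    o->pending = value;
    o->has_pending = true;
}

/* Stands in for glFinish: all queued GL work completes. */
void toy_gl_finish(SharedObject *o)
{
    if (o->has_pending) {
        o->committed = o->pending;
        o->has_pending = false;
    }
}

/* Stands in for clEnqueueAcquireGLObjects: hands access to CL and brings
   CL's records up to date with what GL has committed. It does NOT wait
   for GL work that is still in flight. */
void toy_cl_acquire(SharedObject *o)
{
    o->cl_view = o->committed;
    o->owned_by_cl = true;
}

/* Stands in for clEnqueueReleaseGLObjects: hands access back to GL. */
void toy_cl_release(SharedObject *o)
{
    o->committed = o->cl_view;
    o->owned_by_cl = false;
}

/* A CL command reading the object; fails if the object was never acquired. */
bool toy_cl_read(const SharedObject *o, int *out)
{
    if (!o->owned_by_cl)
        return false;
    *out = o->cl_view;
    return true;
}
```

In this model, toy_gl_write followed directly by toy_cl_acquire lets CL read the old value, while inserting toy_gl_finish before the acquire lets CL read the new one. Without any acquire, toy_cl_read fails regardless of how many times the GL side has finished.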

Do opengl functions cause main thread to freeze?

So when you call OpenGL functions, like glDraw or glBufferData, does it cause the thread of the program to stop and wait for GL to finish the calls?
If not, then how does GL handle calling important functions like glDraw, and then immediately afterwards having a setting changed that affects the draw calls?
No, they (mostly) do not. The majority of GL functions are buffered when called and actually executed later. This means that you cannot think of the CPU and the GPU as two processors working in lockstep. Usually, the CPU issues a batch of GL calls that get buffered, and the GPU executes them once they are delivered to it. As a consequence, you cannot reliably measure how long a specific GL function took to execute just by comparing the time before and after its call.
If you want to do that, you need to first run a glFinish() so it will actually wait for all previously buffered GL calls to execute, and then you can start counting, execute the calls that you want to benchmark, call glFinish again to make sure these calls executed as well, and then finish the benchmark.
On the other hand, I said "mostly". This is because reading functions will actually NEED to synchronize with the GPU to show real results and so, in this case, they DO wait and freeze the main thread.
edit: I think the explanation itself answers your second question, but just in case: because all calls are buffered in order, a draw is carried out with the settings that were in effect when it was issued, and a setting changed afterwards only applies to successive calls.
It strictly depends on the OpenGL call in question and the OpenGL state. When you make OpenGL calls, the implementation first queues them up internally and then executes them asynchronously to the calling program's execution. One important concept in OpenGL is the synchronization point: an operation in the work queue that requires the OpenGL call to block until certain conditions are met.
OpenGL objects (textures, buffer objects, etc.) are purely abstract, and by specification the handle of an object in the client program always refers to the data the object has at the time of the OpenGL call that uses it. So take for example this sequence:
glBindTexture(GL_TEXTURE_2D, texID);
glTexImage2D(..., image_1);
draw_textured_quad();
glTexImage2D(..., image_2);
draw_textured_quad();
The first draw_textured_quad may return long before anything has been drawn. However, by making the calls, OpenGL creates an internal reference to the data currently held by the texture. So when glTexImage2D is called a second time, which may happen before the first quad has been drawn, OpenGL must internally create a secondary texture object that is to become texture texID and to be used by the second call of draw_textured_quad. If glTexSubImage2D were called instead, it would even have to make a modified copy of the data.
OpenGL calls will only block if the result of the call modifies client-side memory and depends on data generated by previous OpenGL calls. In other words, as you make OpenGL calls, the implementation internally builds a dependency tree to keep track of what depends on what. When a synchronization point must block, it blocks at least until all of its dependencies are met.
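The texture sequence above can be sketched as a toy model in plain C. This is only an illustration under stated assumptions, not real driver code: Storage, Texture, PendingDraw and the toy_ functions are hypothetical names, and reference counting is just one way an implementation could keep the old data alive. A real driver is free to do something else, such as stalling, as long as the result looks the same.

```c
#include <stdlib.h>

/* Toy model only: one version of a texture's data, kept alive by refcount. */
typedef struct { int refs; int pixel; } Storage;
typedef struct { Storage *current; } Texture;  /* what the texture name refers to now */
typedef struct { Storage *src; } PendingDraw;  /* a queued draw and the data it captured */

static Storage *storage_new(int pixel)
{
    Storage *s = malloc(sizeof *s);
    if (s) { s->refs = 1; s->pixel = pixel; }
    return s;
}

static void storage_unref(Storage *s)
{
    if (s && --s->refs == 0)
        free(s);
}

/* Plays the role of glTexImage2D: the texture gets fresh storage. The old
   storage survives for as long as a pending draw still references it. */
int toy_tex_image(Texture *t, int pixel)
{
    Storage *s = storage_new(pixel);
    if (!s)
        return -1;
    storage_unref(t->current);
    t->current = s;
    return 0;
}

/* Plays the role of draw_textured_quad: returns immediately, but captures a
   reference to the data the texture holds at the time of the call. */
PendingDraw toy_draw_quad(const Texture *t)
{
    PendingDraw d = { t->current };
    if (d.src)
        d.src->refs++;
    return d;
}

/* The "GPU" runs the draw later and returns the pixel it sampled (-1 if none). */
int toy_execute(PendingDraw *d)
{
    int pixel = d->src ? d->src->pixel : -1;
    storage_unref(d->src);
    d->src = NULL;
    return pixel;
}

/* Releases the texture's own reference to its current storage. */
void toy_tex_delete(Texture *t)
{
    storage_unref(t->current);
    t->current = NULL;
}
```

Queuing a draw, re-specifying the texture, queuing a second draw and only then executing both gives the first draw the first image and the second draw the second image, even though neither draw ran before the texture was changed.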

pointer from glMapBufferARB used in another context

I want to do parallel rendering with 2 GPUs, so I need to read pixels back from GPU1 and then draw them on GPU2.
I created two windows, one on each screen, and each screen is connected to its own GPU. There are two threads, one associated with each window.
However, the readpixel+drawpixel step is a bottleneck, so I am considering an asynchronous PBO method: 2 PBOs for reading back and 2 PBOs for drawing, used alternately.
My question is:
Can the pointer returned from glMapBufferARB be used in another thread and with a different GPU?
If not, I must copy the data to main memory and then copy it to the other GPU, and the bottleneck will be the CPU->GPU copy. Is there any better idea?
Yes, a pointer from glMapBuffer can be used by any thread, even one without a GL context. Just remember to synchronize the threads, and don't call glUnmapBuffer before the other thread has finished its job with the pointer.

Do work instead of waiting for glMapBuffer

I'm using OpenGL for some GPGPU processing, so I have different threads giving work to an OpenGL processing thread.
After each "work-item" I need to call glReadPixels and glMapBuffer in order to transfer data back to the host from the PBO. The problem with this, however, is that glMapBuffer blocks the thread and no useful work can be done until the DMA transfer has finished, even though the GPU is idle. The usual way to solve this is to create a pipeline with a time depth of the longest DMA transfer. However, as I'm working on a low-latency system, this is suboptimal.
Is there a way to maybe wait for glMapBuffer on a separate thread or maybe get some notification as to when the DMA transfer has finished in order to reduce the latency as much as possible?
Do some additional work in threads other than the one glMapBuffer blocks in? You can have multiple OpenGL contexts, each active in its own thread; if they're configured to share objects, they can operate simultaneously.
However, a DMA transfer actually means work for the GPU. At the very least, the bandwidth to it is fully consumed, so you might end up with even worse performance.