Using a framebuffer as a vertex buffer without moving the data to the CPU - opengl

In OpenGL, is there a way to use framebuffer data as vertex data without moving the data through the CPU? Ideally, a framebuffer object could be recast as a vertex buffer object directly on the GPU. I'd like to use the fragment shader to generate a mesh and then render that mesh.

There are a couple of ways you could go about this. The first has already been mentioned by spudd86 (except you need to use GL_PIXEL_PACK_BUFFER; that's the one that's written to by glReadPixels).
The other is to use a framebuffer object and then read from its texture in your vertex shader, mapping from a vertex ID (which you would have to manage) to a texture location. If this is a one-time operation, though, I'd go with copying it over to a PBO, binding it to GL_ARRAY_BUFFER, and then just using it as a VBO.
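A rough sketch of the vertex-texture-fetch variant, assuming GL 3.0 / GLSL 1.30 so gl_VertexID and texelFetch are available (positionTex, texWidth and mvp are placeholder names):
static const char* vsSource =
    "#version 130\n"
    "uniform sampler2D positionTex; // color texture attached to the FBO\n"
    "uniform mat4 mvp;\n"
    "uniform int texWidth;          // width of positionTex\n"
    "void main() {\n"
    "    // map the vertex id to a texel and use that texel as the position\n"
    "    ivec2 texel = ivec2(gl_VertexID % texWidth, gl_VertexID / texWidth);\n"
    "    gl_Position = mvp * texelFetch(positionTex, texel, 0);\n"
    "}\n";
// Then draw with e.g. glDrawArrays(GL_POINTS, 0, texWidth * texHeight) and no
// vertex attribute arrays enabled; every vertex pulls its data from the texture.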

Just use the functions to do the copy and let the driver figure out how to do what you want. Chances are that, as long as you copy directly into the vertex buffer, it won't actually do a copy but will just make your VBO a reference to the data.
The main thing to be careful of is that some drivers may not like you using a buffer you told them was for vertex data with an operation meant for pixel data...
Edit: something like the following may or may not work... (IIRC the spec says it should)
GLuint vbo;
glGenBuffersARB(1, &vbo);
glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, vbo);
// allocate storage large enough for w*h pixels in the chosen format
glBufferDataARB(GL_PIXEL_PACK_BUFFER_ARB, w * h * 4 * sizeof(GLubyte), NULL, GL_STATIC_DRAW_ARB);
// use appropriate pixel formats and size; a 0 pointer means
// "pack into the bound GL_PIXEL_PACK_BUFFER at offset 0"
glReadPixels(0, 0, w, h, GL_RGBA, GL_UNSIGNED_BYTE, 0);
glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, 0);
// now reuse the same buffer object as a vertex buffer
glEnableClientState(GL_VERTEX_ARRAY);
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vbo);
// set the vertex pointer to match whatever format you read back, then draw stuff
Edited to correct the buffer bindings (thanks, Phineas).

The specification for the pixel_buffer_object extension (ARB_pixel_buffer_object / EXT_pixel_buffer_object) gives an example demonstrating how to render to a vertex array under "Usage Examples".
The following extensions are helpful for solving this problem (a short setup sketch follows):
ARB_texture_float - floating-point internal formats to use for the color buffer attachment
ARB_color_buffer_float - disable automatic clamping for fragment colors and glReadPixels
ARB_pixel_buffer_object - operations for transferring pixel data to buffer objects
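For example, a rough sketch of the setup these extensions enable, written with the GL 3.0 core equivalents (w and h are placeholders for your buffer dimensions):
// Floating-point color attachment so fragment outputs keep full precision
GLuint tex, fbo;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, w, h, 0, GL_RGBA, GL_FLOAT, NULL);

glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex, 0);

// Disable [0,1] clamping for fragment colors and for glReadPixels readback
glClampColor(GL_CLAMP_FRAGMENT_COLOR, GL_FALSE);
glClampColor(GL_CLAMP_READ_COLOR, GL_FALSE);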

If you can do your work in a vertex/geometry shader, you can use transform feedback to write directly into a buffer object. This also gives you the option of skipping the rasterizer and fragment shading.
Transform feedback is available as EXT_transform_feedback (and the ARB equivalent), or in core since GL 3.0.
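A minimal sketch of the core GL 3.0 path (the program, buffer, and varying names are placeholders), with rasterization skipped via GL_RASTERIZER_DISCARD:
// Before linking: declare which vertex shader output gets captured
const char* varyings[] = { "outPosition" };   // hypothetical varying name
glTransformFeedbackVaryings(program, 1, varyings, GL_INTERLEAVED_ATTRIBS);
glLinkProgram(program);

// Capture the results into tfBuffer instead of rasterizing anything
glEnable(GL_RASTERIZER_DISCARD);
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, tfBuffer);
glBeginTransformFeedback(GL_POINTS);
glDrawArrays(GL_POINTS, 0, vertexCount);
glEndTransformFeedback();
glDisable(GL_RASTERIZER_DISCARD);

// tfBuffer now holds the shader output and can be bound as GL_ARRAY_BUFFER for drawing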

Related

Is it necessary to bind all VBOs (and textures) each frame?

I'm following a basic tutorial on OpenGL 3.0. What is not clear to me is why (or whether) I have to bind, enable and unbind/disable all vertex buffers and textures each frame.
To me it seems like too many gl**** calls, which I guess have some overhead. For example, here you see several blocks like this each frame:
// do this for each mesh in scene
// vertexes
glEnableVertexAttribArray(0);
glBindBuffer(GL_ARRAY_BUFFER, vertex_buffer);
glVertexAttribPointer( 0, 3, GL_FLOAT,GL_FALSE,0,(void*)0);
// normals
glEnableVertexAttribArray(1);
glBindBuffer(GL_ARRAY_BUFFER, normal_buffer );
glVertexAttribPointer( 1, 3, GL_FLOAT,GL_FALSE,0,(void*)0);
// UVs
glEnableVertexAttribArray(2);
glBindBuffer(GL_ARRAY_BUFFER, uv_buffer );
glVertexAttribPointer( 2, 2, GL_FLOAT,GL_FALSE,0,(void*)0);
// ...
glDrawArrays(GL_TRIANGLES, 0, nVerts );
// ...
glDisableVertexAttribArray(0);
glDisableVertexAttribArray(1);
glDisableVertexAttribArray(2);
Imagine you have not just one but 100 different meshes, each with its own VBOs for vertices, normals and UVs. Should I really do this procedure each frame for each of them? Sure, I can encapsulate that complexity into some functions/objects, but I worry about the overhead of all these gl**** function calls.
Is it not possible to move some part of this machinery from the per-frame loop into scene setup?
Also, I read that a VAO is a way to pack the corresponding VBOs for one object together, and that binding the VAO automatically binds the corresponding VBOs. So I was thinking that maybe one VAO per mesh (not per instance) is how it should be done - but according to this answer it does not seem so?
First things first: Your concerns about GL call overhead have been addressed with the introduction of Vertex Array Objects (see @Criss' answer). However, the real problem with your train of thought is that you equate VBOs with geometry meshes, i.e. give each geometry its own VBO.
That's not how you should see and use VBOs. VBOs are chunks of memory, and you can put the data of several objects into a single VBO; you don't have to draw the whole thing, you can limit draw calls to subsets of a VBO. And you can coalesce geometries with similar or even identical drawing setup and draw them all at once with a single draw call, either by having the right vertex index list or by use of instancing (see the sketch below).
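A minimal sketch of drawing two meshes that share one VBO/IBO pair (the counts and the name sharedVao are placeholders):
// Two meshes packed into one VBO + one IBO.
// Mesh A occupies the first meshAIndexCount indices, mesh B the rest.
glBindVertexArray(sharedVao);   // VAO referencing the shared buffers

// Draw mesh A
glDrawElements(GL_TRIANGLES, meshAIndexCount, GL_UNSIGNED_INT, (void*)0);

// Draw mesh B; its indices start right after mesh A's in the index buffer.
// (With GL 3.2+ you could instead use glDrawElementsBaseVertex and keep
//  each mesh's indices starting at zero.)
glDrawElements(GL_TRIANGLES, meshBIndexCount, GL_UNSIGNED_INT,
               (void*)(meshAIndexCount * sizeof(GLuint)));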
When it comes to the binding state of textures… well, yeah, that's a bit more annoying. You really have to do the whole binding dance when switching textures. That's why, in general, you sort geometry by texture/shader before drawing, so that the number of texture switches is minimized.
The last 3 or 4 generations of GPUs (as of late 2016) do support bindless textures though, where you can access textures through a 64-bit handle (effectively the address of the relevant data structure in some address space) in the shader. However, bindless textures have not yet made it into the core OpenGL standard, and you have to use extensions (ARB_bindless_texture, or the older NV_bindless_texture) to make use of them.
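A rough sketch of the ARB_bindless_texture flow, assuming tex is an existing texture object and program declares a bindless sampler uniform named "tex":
// Get a 64-bit handle for the texture and make it resident so shaders may sample it
GLuint64 handle = glGetTextureHandleARB(tex);
glMakeTextureHandleResidentARB(handle);

// Hand the handle to the shader, e.g. through a uniform
// (in GLSL: layout(bindless_sampler) uniform sampler2D tex;)
glUniformHandleui64ARB(glGetUniformLocation(program, "tex"), handle);

// When you no longer need it
glMakeTextureHandleNonResidentARB(handle);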
Another interesting approach (popularized by Id Tech 4) is virtual textures. You can allocate sparsely populated texture objects that are huge in their addressable size, but with only part of them actually populated with data. During program execution you determine which areas of the texture are required and swap in the required data on demand.
You should use a vertex array object (generated by glGenVertexArrays). Thanks to it, you don't have to perform those calls every time. A vertex array object stores:
Calls to glEnableVertexAttribArray or glDisableVertexAttribArray.
Vertex attribute configurations via glVertexAttribPointer.
Vertex buffer objects associated with vertex attributes by calls to glVertexAttribPointer.
Maybe this will be a better tutorial.
So you can generate the VAO, bind it, perform the calls and unbind. Now, in the drawing loop, you just have to bind the VAO.
Example:
glUseProgram(shaderId);
glBindVertexArray(vaoId);
glDrawArrays(GL_TRIANGLES, 0, 3);
glBindVertexArray(0);
glUseProgram(0);
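For completeness, a rough sketch of the one-time setup, reusing the buffer names from the question:
GLuint vaoId;
glGenVertexArrays(1, &vaoId);
glBindVertexArray(vaoId);

glEnableVertexAttribArray(0);                       // positions
glBindBuffer(GL_ARRAY_BUFFER, vertex_buffer);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, (void*)0);

glEnableVertexAttribArray(1);                       // normals
glBindBuffer(GL_ARRAY_BUFFER, normal_buffer);
glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 0, (void*)0);

glEnableVertexAttribArray(2);                       // UVs
glBindBuffer(GL_ARRAY_BUFFER, uv_buffer);
glVertexAttribPointer(2, 2, GL_FLOAT, GL_FALSE, 0, (void*)0);

glBindVertexArray(0);                               // the VAO now remembers all of the above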

Render to color attachment with DSA

I want to render into a framebuffer with DSA. However, I only got it working when manually binding the framebuffer. Is there a way to do it without binding it?
This is how I thought it would work:
glNamedFramebufferDrawBuffer(m_fbo, GL_COLOR_ATTACHMENT0);
drawCall();
This only works if I call glBindFramebuffer(GL_FRAMEBUFFER, m_fbo); beforehand. How do I do it correctly?
Also, what is the equivalent to glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT); for framebuffers with DSA? Again, I can only clear if I previously bound the framebuffer.
The array of draw buffers is not a global state, but rather it is stored per-framebuffer. You are probably familiar with the mechanics of Vertex Array Objects, which maintain separate sets of vertex attribute pointers; draw buffers are analogous to attribute pointers in this situation.
When you make a call to glNamedFramebufferDrawBuffer(m_fbo, ...), you are modifying the state of m_fbo's array without first having to bind m_fbo. You are not actually telling OpenGL to direct its color output to m_fbo's GL_COLOR_ATTACHMENT0 - that only happens when you bind m_fbo.
In fact, if you think about this critically, this is the only logical way it can work. If you could arbitrarily source buffers from different framebuffer objects, then that would violate validation (completeness). For instance, FBO0 might have a multi-sampled color attachment with 4 samples and FBO1 might have a single-sampled depth attachment. Those are two incompatible buffers, but the only time that is validated is when you try to attach those two images to the same FBO.
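As for the glClear part of the question, a minimal sketch assuming GL 4.5: the glClearNamedFramebuffer* entry points clear attachments without binding, but rendering into the FBO still requires binding it as the draw framebuffer.
// Clear color attachment 0 and the depth buffer of m_fbo without binding it
const GLfloat clearColor[4] = { 0.0f, 0.0f, 0.0f, 1.0f };
const GLfloat clearDepth = 1.0f;
glClearNamedFramebufferfv(m_fbo, GL_COLOR, 0, clearColor);
glClearNamedFramebufferfv(m_fbo, GL_DEPTH, 0, &clearDepth);

// Rendering still targets whatever is bound to GL_DRAW_FRAMEBUFFER
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, m_fbo);
drawCall();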

How to draw many textured quads faster, and retain glScissor (or something like it)?

I'm using OpenGL 4 and C++11.
Currently I make a whole bunch of individual calls to glDrawElements using separate VAOs, each with a separate VBO and IBO.
I do this because the texture coords change for each one, and my vertex data features the texture coords. I understand that there's some redundant position information in this vertex data; however, it's always -1,-1,1,1 because I use a translation and a scale matrix in my vertex shader to then position and scale the vertex data.
The VAO, VBO, IBO, position and scale matrix and texture ID are stored in an object. It's one object per quad.
Currently, some of the drawing would occur like this:
Draw a quad object via glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_INT, 0). The bound VBO holds just -1,-1,1,1 for positions, and the IBO draws me a quad. The VBO also contains the texture coords into a common texture (the same texture is used to texture all drawn quads). Matrix transformations in the shader position and scale it.
Repeat with another quad object.
glEnable(GL_SCISSOR_TEST) is called, and the position information of the previous quad is used in a call to glScissor.
The next quad object is drawn; only the parts of it visible within the previous quad are actually shown.
Draw another quad object
The performance I'm getting now is acceptable but I want it faster because I've only scratched the surface of what I have in mind. So I'm looking at optimizing. So far I've read that I should:
Remove the position information from my vertex data and just keep texture coords. Instead bind a single position VBO at the start of drawing quads so it's used by all of them.
But I'm unsure how this would work, because I can only have one VBO active at any one time.
Would I then have to call glBufferSubData and update the texture coordinates prior to drawing each quad? Would this be better or worse for performance (a call to glBindVertexArray for every object versus a call to glBufferSubData)?
Would I still pass the position and scale as matrices to the shader, or would I take that opportunity to also update the position info of the vertices as well as the texture coords? Which would be faster?
Create one big VBO, with or without an IBO, and update the vertex data for the position of each quad within it (rather than using a translation and scale matrix). It seems like this would be difficult to manage.
Even if I did manage to do this, I would only have a single glDraw call, which sounds fast. Is this true? What sort of performance impact does a single glBindVertexArray call have over multiple ones?
I don't think there's any way to use this method to implement something like the glScissor call that I'm making now?
Another option I've read about is instancing. So I draw the quad however many times I need it, which means I would pass the shader an array of translation matrices and an array of texture coords?
Would this be a lot faster?
I think I could do something like the glScissor test by passing an additional array of booleans which defines whether the current quad should be only drawn within the bounds of the previous one. However, I think this means that for each gl_InstanceID I would have to traverse all previous instances looking for true and false values, and it seems like it would be slow.
I'm trying to save time by not implementing all of these individually. Hopefully an expert can point me towards which is probably better. If anyone has an even better idea, please let me know.
You can have multiple VBOs attached to different attributes!
The following sequence binds 2 VBOs to attributes 0 and 1. Note that glBindBuffer() only binds the buffer temporarily; the actual VBO-to-attribute assignment is made by glVertexAttribPointer().
glBindBuffer(GL_ARRAY_BUFFER,buf1);
glVertexAttribPointer(0, ...);
glEnableVertexAttribArray(0);
glBindBuffer(GL_ARRAY_BUFFER,buf2);
glVertexAttribPointer(1, ...);
glEnableVertexAttribArray(1);
The fastest way to provide quad positions and sizes is to use a texture and sample it inside the vertex shader. Of course, you'd need at least an RGBA (x, y, width, height) texture with 16 bits per channel. But then you can update quad positions using glTexSubImage2D(), or you could even render into it via an FBO.
Everything other than that will perform slower. Of course, if you want, we can elaborate on using uniforms, attributes in VBOs, or using attributes without enabled arrays for them.
Putting it all together:
use a single VBO; store the quad id in it (int) plus your texturing data
prepare an x,y,w,h texture and define a mapping from quad id to a texcoord in it, i.e. u = quad_id & 0xFF, v = quad_id >> 8 (for a 256x256 texture, max 65536 quads)
use the vertex shader to sample displacement and size from that texture for the given quad_id stored in an attribute (or use gl_VertexID/4 or gl_VertexID/6)
fill the VBO and the texture
draw everything with a single glDrawArrays or glDrawElements call (a sketch of the position/size texture follows)
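A rough sketch of that position/size texture (the names quadTex, quad_id and x/y/w/h are placeholders):
// One texel per quad, storing (x, y, width, height): 256x256 -> up to 65536 quads
GLuint quadTex;
glGenTextures(1, &quadTex);
glBindTexture(GL_TEXTURE_2D, quadTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16F, 256, 256, 0, GL_RGBA, GL_FLOAT, NULL);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

// Whenever a quad moves or resizes, update its single texel
GLfloat quadData[4] = { x, y, w, h };        // this quad's position and size
GLint u = quad_id & 0xFF;                    // column, as in the mapping above
GLint v = quad_id >> 8;                      // row
glTexSubImage2D(GL_TEXTURE_2D, 0, u, v, 1, 1, GL_RGBA, GL_FLOAT, quadData);
In the vertex shader you would then texelFetch this texture at (u, v) derived from the quad id to get the quad's displacement and size.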

Pass stream hint to existing texture?

I have a texture that was created by another part of my code (with Qt5's bindTexture, but this isn't relevant).
How can I set an OpenGL hint that this texture will be frequently updated?
glBindTexture(GL_TEXTURE_2D, textures[0]);
//Tell opengl that I plan on streaming this texture
glBindTexture(GL_TEXTURE_2D, 0);
There is no mechanism for indicating that a texture will be updated repeatedly; that kind of usage hint exists only for buffer objects (e.g., VBOs, etc.) through the usage parameter. However, there are two possibilities:
Attach your texture to a framebuffer object and update it that way. That's probably the most efficient method to do what you're asking. The memory associated with the texture remains resident on the GPU, and you can update it at rendering speeds.
Try using a pixel buffer object (commonly called a PBO, and has an OpenGL buffer type of GL_PIXEL_UNPACK_BUFFER) as the buffer that Qt writes its generated texture into, and mark that buffer as GL_DYNAMIC_DRAW. You'll still need to call glTexImage*D() with the buffer offset of the PBO (i.e., probably zero) for each update, but that approach may afford some efficiency over just blasting texels to the pipe directly through glTexImage*D().
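A rough sketch of the second (PBO) approach, assuming an RGBA8 texture of width x height and that newPixels points at the freshly generated image data (both placeholder names; textures[0] is the texture from the question). glTexSubImage2D is used here instead of glTexImage*D, but the offset-instead-of-pointer rule is the same:
// One-time setup: a PBO sized for the texture, hinted as frequently updated
GLuint pbo;
glGenBuffers(1, &pbo);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
glBufferData(GL_PIXEL_UNPACK_BUFFER, width * height * 4, NULL, GL_DYNAMIC_DRAW);

// Per update: copy the new texels into the PBO, then source the texture upload from it
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
glBufferSubData(GL_PIXEL_UNPACK_BUFFER, 0, width * height * 4, newPixels);
glBindTexture(GL_TEXTURE_2D, textures[0]);
// With a PBO bound to GL_PIXEL_UNPACK_BUFFER, the last argument is a byte offset, not a pointer
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, 0);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);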
There is no such hint. OpenGL defines functionality, not performance. Just upload to it whenever you need to.

Equiv of glDrawPixels that operates on GPU memory?

glDrawPixels(GLsizei width, GLsizei height, GLenum format, GLenum type, const void *pixels);
Is there a function like this, except instead of accessing CPU memory, it accesses GPU memory? [Either a texture or a framebuffer object]
Let's cover all the bases here.
First, a direct answer: yes, there is such a function. It's called glDrawPixels. I'm not kidding.
glDrawPixels can certainly read from GPU memory, provided that you use a buffer object as the source of its pixel data (commonly called a "pixel buffer object"). Buffer objects are (theoretically, at least) in GPU memory, so they qualify.
However, you add onto this "Either a texture or a framebuffer object". Under this qualification, you're asking, "is there a way to copy pixel data from one texture/framebuffer to the current framebuffer?"
Yes. glBlitFramebuffer can do that. It blits from the GL_READ_FRAMEBUFFER to the GL_DRAW_FRAMEBUFFER. And since you can add images from textures to FBOs, you can copy from images just fine. You can even copy from the default framebuffer to some renderbuffer or texture.
You can also employ glCopyImageSubData, which copies pixel rectangles from one image to another. It's a lot more convenient than glBlitFramebuffer if all you're doing is copying pixel data. This is quite new at present (GL 4.3, or ARB_copy_image). It cannot be used to copy data to the default framebuffer.
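A minimal sketch of both paths (the FBO and texture names and the width/height values are placeholders):
// Copy a width x height region from srcFbo's read buffer into the default framebuffer
glBindFramebuffer(GL_READ_FRAMEBUFFER, srcFbo);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);                 // 0 = default framebuffer
glBlitFramebuffer(0, 0, width, height,                     // source rectangle
                  0, 0, width, height,                     // destination rectangle
                  GL_COLOR_BUFFER_BIT, GL_NEAREST);

// Or, with GL 4.3 / ARB_copy_image: copy texel data directly between two textures
glCopyImageSubData(srcTex, GL_TEXTURE_2D, 0, 0, 0, 0,
                   dstTex, GL_TEXTURE_2D, 0, 0, 0, 0,
                   width, height, 1);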
If it is in a texture:
set up orthographic frustum
disable blending, depth test, etc.
bind texture
draw screen-aligned textured quad with correct texture coordinates
I use this, for example, in Compositor::_drawPixels.
glDrawPixels can read from a Buffer Object. Just do a
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, XXX)
before calling glDrawPixels.
Caveat: glDrawPixels is deprecated...
Use glBlitFramebuffer, which operates on framebuffer objects (Link). And this is not deprecated.
You can take advantage of format conversion, scaling and multisampling.