Accessing the RGB and depth buffers from a fragment shader : GLSL - opengl

This is a follow-up to this question:
GLSL : accessing framebuffer to get RGB and change it
Is it possible to develop a program based on GLSL as follows?
1. draw object 1
2. get the depth buffer using a shader (also save the RGB)
3. draw object 1 and object 2 simultaneously
4. get the depth again
5. check whether the depths differ (the depth from step 2 vs. the depth from step 4)
6. draw object 1:
: where the depth is unchanged, draw the original RGB
: where the depth has changed, draw with a different RGB
I have confirmed that this algorithm, which detects where object 1 is hidden by another object, works using GLUT functions.
I used the glReadBuffer and glDrawBuffer functions, but those are too slow, so I want to do it with GLSL instead.

If the only goal is to render object1 with a different color where it is hidden behind object2 (i.e. the pixels of object1 where the depth has changed), I would go for a completely different approach:
Draw object2 with depth writes only (glDrawBuffer(GL_NONE) or glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE)).
Draw object1 with the depth test set to glDepthFunc(GL_GREATER) and the coloring you want when object1 is behind object2.
Draw object1 with the depth test set to glDepthFunc(GL_LESS) and the coloring you want when object1 is in front of object2.
In contrast to the algorithm you described in the question, there is no need for any read-back operations or additional framebuffers.
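If it helps, here is a minimal C++ sketch of that sequence. drawObject1(), drawObject2() and setColor() are hypothetical placeholders for your own rendering code, and it assumes the depth buffer was cleared to 1.0 as usual:

// Pass 1: object2 writes depth only, no colour.
glEnable(GL_DEPTH_TEST);
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
drawObject2();                                   // hypothetical helper
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);

// Pass 2: only fragments of object1 that lie behind object2 pass GL_GREATER.
glDepthFunc(GL_GREATER);
setColor(hiddenColor);                           // hypothetical "hidden" colour
drawObject1();

// Pass 3: fragments of object1 in front of object2 (or not covered by it
// at all) pass GL_LESS and get the normal colour.
glDepthFunc(GL_LESS);
setColor(normalColor);                           // hypothetical normal colour
drawObject1();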

OpenGL: How to use an invisible mask to hide objects behind it

I have an OpenGL problem to solve. I have an object/mesh A, an object/mesh B and a background texture C.
Initially the framebuffer is filled with background texture C. We draw both A & B in the framebuffer. We want to keep object A visible, and object B always invisible.
In the beginning, A is in front of B. As things rotate, at a certain angle B ends up in front of A according to the depth test; but since B is always invisible, B's part should be filled with background C.
Does anyone know a simple approach to solve this issue?
Is a stencil test a good approach? Basically, give object B a color, compare B's color with background C, and show background C where the test fails.
Does anyone have any sample code I can read?
The easiest solution is to:
draw C;
draw B with the colour mask preventing writes to the frame buffer (but don't touch the depth mask, so that writes are still made to the depth buffer);
draw A, subject to the depth test.
The specific thing to use is glColorMask: if you supply GL_FALSE for each channel, subsequent geometry won't write any colour output. But assuming you haven't touched glDepthMask, it will still write depth output.
So, you've probably currently got the code:
drawBackground(C);
render(A);
render(B);
You'd just adapt that to:
drawBackground(C);
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
render(B);
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
render(A);

How to draw many textured quads faster, and retain glScissor (or something like it)?

I'm using OpenGL 4 and C++11.
Currently I make a whole bunch of individual calls to glDrawElements, using separate VAOs, each with a separate VBO and an IBO.
I do this because the texture coords change for each, and my vertex data includes the texture coords. I understand that there's some redundant position information in this vertex data; however, it's always -1,-1 to 1,1 because I use a translation and a scale matrix in my vertex shader to position and scale the vertex data.
The VAO, VBO, IBO, position and scale matrix and texture ID are stored in an object. It's one object per quad.
Currently, some of the drawing would occur like this:
Draw a quad object via glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_INT, 0). The bound VBO holds the -1,-1 to 1,1 positions along with the texture coords of a common texture (the same texture is used for all drawn quads), and the IBO draws me a quad. Matrix transformations in the shader position it.
Repeat with another quad object
glEnable(GL_SCISSOR_TEST) is called and the position information of the previous quad is used in a call to glScissor
Next quad object is drawn; only the parts of it visible from the previous quad are actually shown.
Draw another quad object
The performance I'm getting now is acceptable but I want it faster because I've only scratched the surface of what I have in mind. So I'm looking at optimizing. So far I've read that I should:
Remove the position information from my vertex data and just keep texture coords. Instead bind a single position VBO at the start of drawing quads so it's used by all of them.
But I'm unsure how this would work, because I can only have one VBO active at any one time.
Would I then have to call glBufferSubData and update the texture coordinates prior to drawing each quad? Would this be better performance or worse (a call to glBindVertexArray for every object or a call to glBufferSubData?)
Would I still pass the position and scale as matrices to the shader, or would I take that opportunity to also update the position info of the vertices as well as the texture coords? Which would be faster?
Create one big VBO with or without an IBO and update the vertex data for the position (rather than use a transformation and scale matrix) of each quad within this. It seems like this would be difficult to manage.
Even if I did manage to do this, I would only have a single draw call, which sounds fast. Is this true? What sort of performance impact does a single glBindVertexArray call have compared to multiple?
I don't think there's any way to use this method to implement something like the glScissor call that I'm making now?
Another option I've read about is instancing. So I draw the quad however many times I need it, which means I would pass the shader an array of translation matrices and an array of texture coords?
Would this be a lot faster?
I think I could do something like the glScissor test by passing an additional array of booleans which defines whether the current quad should be only drawn within the bounds of the previous one. However, I think this means that for each gl_InstanceID I would have to traverse all previous instances looking for true and false values, and it seems like it would be slow.
I'm trying to save time by not implementing all of these individually. Hopefully an expert can point me towards which is probably better. If anyone has an even better idea, please let me know.
You can have multiple VBOs attached to different attributes!
The following sequence binds two VBOs to attributes 0 and 1. Note that glBindBuffer() only binds the buffer temporarily; the actual VBO-to-attribute assignment is made by glVertexAttribPointer():
glBindBuffer(GL_ARRAY_BUFFER, buf1);
glVertexAttribPointer(0, ...);
glEnableVertexAttribArray(0);

glBindBuffer(GL_ARRAY_BUFFER, buf2);
glVertexAttribPointer(1, ...);
glEnableVertexAttribArray(1);
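To make that concrete, here is a filled-in version; the component counts and types are assumptions (buf1 holding tightly packed vec2 positions, buf2 tightly packed vec2 texture coordinates), not something taken from your code:

// Attribute 0: vec2 positions from buf1, tightly packed floats.
glBindBuffer(GL_ARRAY_BUFFER, buf1);
glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, nullptr);
glEnableVertexAttribArray(0);

// Attribute 1: vec2 texture coordinates from buf2.
glBindBuffer(GL_ARRAY_BUFFER, buf2);
glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 0, nullptr);
glEnableVertexAttribArray(1);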
The fastest way to provide quad positions and sizes is to put them in a texture and sample it inside the vertex shader. Of course, you'd need at least an RGBA (x, y, width, height) texture with 16 bits per channel. But then you can update quad positions using glTexSubImage2D(), or you could even render into it via an FBO.
Everything other than that will perform more slowly. Of course, if you want, we can elaborate on using uniforms, attribs in VBOs, or using attribs without enabled arrays for them.
Putting it all together (a sketch follows this list):
use a single VBO; store the quad id in it (an int) plus your texturing data
prepare the (x, y, w, h) texture and define a mapping from quad id to its texcoords, i.e. u = quad_id & 0xFF, v = quad_id >> 8 (a 256x256 texture holds at most 65536 quads)
use the vertex shader to sample displacement and size from that texture for the given quad_id stored in an attribute (or use vertex_ID/4 or vertex_ID/6)
fill the VBO and the texture
draw everything with a single glDrawArrays or glDrawElements call
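Here is a rough C++/GLSL sketch of the texture side of this scheme, assuming a 256x256 GL_RGBA16F texture holding (x, y, w, h) per quad; the names (createQuadDataTexture, updateQuad, quadId, the attribute layout) are illustrative only, not from the answer above:

#include <GL/glew.h>  // or whichever GL loader you already use

// One texel per quad: RGBA16F = (x, y, width, height).
GLuint createQuadDataTexture() {
    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16F, 256, 256, 0,
                 GL_RGBA, GL_FLOAT, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    return tex;
}

// Rewrite one quad's placement; the texel address follows the mapping
// u = id & 0xFF, v = id >> 8 described above.
void updateQuad(GLuint tex, int quadId, float x, float y, float w, float h) {
    float xywh[4] = { x, y, w, h };
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, quadId & 0xFF, quadId >> 8,
                    1, 1, GL_RGBA, GL_FLOAT, xywh);
}

// Vertex shader (GLSL) that fetches placement by the per-vertex quad id:
const char* kQuadVS = R"(
#version 330 core
layout(location = 0) in vec2 corner;    // unit-quad corner in [-1, 1]
layout(location = 1) in float quadId;   // stored per vertex in the VBO
uniform sampler2D quadData;             // the (x, y, w, h) texture above
void main() {
    int id = int(quadId);
    vec4 xywh = texelFetch(quadData, ivec2(id & 0xFF, id >> 8), 0);
    gl_Position = vec4(xywh.xy + corner * xywh.zw, 0.0, 1.0);
}
)";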

OpenGL - Occlusion query depth buffer?

I've just started getting into the topic of occlusion queries in OpenGL, but I'm a bit confused about how they actually work.
In most examples I've found, the depth and color masks are deactivated before drawing with the occlusion query (Because we don't need to actually 'draw' anything), in essence somewhat like this:
glDepthMask(GL_FALSE);
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);

glBeginQuery(GL_ANY_SAMPLES_PASSED, query1);
// Draw Object 1
glEndQuery(GL_ANY_SAMPLES_PASSED);

glBeginQuery(GL_ANY_SAMPLES_PASSED, query2);
// Draw Object 2
glEndQuery(GL_ANY_SAMPLES_PASSED);
// etc.

glDepthMask(GL_TRUE);
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
(It's assumed that objects are drawn front to back, so object 1 is in front of object 2. The above code is just pseudo code for the sake of this question. The result of the queries would be retrieved at a later time.)
Now, to know if object 2 is actually occluded by object 1, it would need to keep the fragment information from query 1 somehow (I'm assuming in some sort of depth buffer). But we've disabled drawing to the depth and color buffers, which means nothing is drawn, which means it shouldn't store anything anywhere?
Is there a special 'query' buffer? If so, is there a way to access it? Is it in any way connected to the currently bound texture or frame buffer? Do I need to clear it? Am I misunderstanding how occlusion queries actually work?
Now, to know if object 2 is actually occluded by object 1, it would need to keep the fragment information from query 1 somehow
Why would it? Occlusion queries store a counter of the number of samples that pass the depth test, which is a single integer.
Since you've disabled writing to the color and depth buffers, the only thing drawing the objects will do is increment the occlusion query counter*. Object 1 can't possibly occlude object 2, because drawing object 1 doesn't change the depth buffer.
* Unless you have a stencil buffer or are doing something like image load/store in shaders
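For completeness, a minimal sketch of issuing one query and reading it back later; drawObject() stands in for the actual draw calls:

GLuint query;
glGenQueries(1, &query);

// Count whether any samples pass the depth test while drawing.
glBeginQuery(GL_ANY_SAMPLES_PASSED, query);
drawObject();                          // hypothetical draw call
glEndQuery(GL_ANY_SAMPLES_PASSED);

// Later (ideally a frame or so afterwards, to avoid stalling the pipeline):
GLuint available = GL_FALSE;
while (available == GL_FALSE)
    glGetQueryObjectuiv(query, GL_QUERY_RESULT_AVAILABLE, &available);

GLuint anyPassed = 0;
glGetQueryObjectuiv(query, GL_QUERY_RESULT, &anyPassed);
// anyPassed != 0: at least one fragment passed, so the object is not fully
// occluded. (GL_SAMPLES_PASSED would give the actual sample count instead.)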

What is the most efficient method of rendering sprites in DirectX 10?

I am currently experimenting with various ways of displaying 2D sprites in DirectX 10. I began by using the ID3DX10Sprite interface to batch draw my sprites in a single call. Eventually, however, I wanted a little more control over how my sprites were rendered, so I decided to look into quad-based sprite rendering (ie each sprite being represented by a quad with a texture applied).
I started out simple: I created a single vertex buffer consisting of 4 vertices that was applied once before the sprites were drawn. I then looped through my sprites, setting the appropriate properties to be passed into the shader and making a draw call for each sprite, like so: d3dDevice->Draw(4, 0);. Though it worked, the draw call for every sprite bugged me, so I looked for a more efficient method.
After searching around, I learned about object instancing and decided to try it out. Everything went well until I tried implementing the most important part of sprites: textures. In short, though I had a texture array (declared at the top of my shader like so: Texture2D textures[10];) that could be successfully sampled within my pixel shader using literals/constants as indexes, I could not figure out how to control which textures were applied to which instances via a texture index.
The idea would be for me to pass in a texture index per instance, that could then be used to sample the appropriate texture in the array within the pixel shader. However, after searching around more, I could not find an example of how it could be done (and found many things suggesting that it could not be done without moving to DirectX 11).
Is that to say that the only way to successfully render sprites via object instancing in DirectX 10 is to render them in batches based on texture? So, for example, if my scene consists of 100 sprites with 20 different textures (each texture referenced by 5 sprites), then it would take 20 separate draw calls to display the scene, and I would only be sending 5 sprites at a time.
In the end, I am rather at a loss. I have done a lot of searching, and seem to be coming up with conflicting information. For example, in this article in paragraph 6 it states:
Using DirectX 10, it is possible to apply different textures in the array to different instances of the same object, thus making them look different
In addition, on page 3 of this whitepaper, it mentions the option to:
Read a custom texture per instance from a texture array
However, I cannot seem to find a concrete example of how the shader can be setup to access a texture array using a per instance texture index.
In the end, the central question is: What is the most efficient method of rendering sprites using DirectX 10?
If the answer is instancing, then is it possible to control which texture is applied to each specific instance within the shader--thereby making it possible to send in much larger batches of sprites along with their appropriate texture index with only a single draw call? Or must I be content with only instancing sprites with the same texture at a time?
If the answer is returning to the use of the provided DX10 Sprite interface, then is there a way for me to have more control over how it is rendered?
As a side note, I have also looked into using a Geometry Shader to create the actual quad, so I would only have to pass in a series of points instead of managing a vertex and instance buffer. Again, though, unless there is a way to control which textures are applied to the generated quads, then I'm back to only batching sprites by textures.
There are a few ways (as usual) to do what you describe.
Please note that using
Texture2D textures[10];
will not allow you to use a variable index for lookup in the pixel shader (technically, this declaration allocates one slot per texture).
So what you need instead is a Texture2DArray. This is a bit like a volume texture, but the z component is a whole slice index and there's no filtering across slices.
You will need to generate this texture array, though. An easy way is, at startup, to issue one full-screen quad draw call per texture, drawing each texture into one slice of the array (you can create a RenderTargetView for a specific slice). The shader is a simple passthrough here.
To create a texture array (the code is in SlimDX, but the options are similar elsewhere):
var texBufferDesc = new Texture2DDescription
{
ArraySize = TextureCount,
BindFlags = BindFlags.RenderTarget | BindFlags.ShaderResource,
CpuAccessFlags = CpuAccessFlags.None,
Format = format,
Height = h,
Width = w,
OptionFlags = ResourceOptionFlags.None,
SampleDescription = new SampleDescription(1,0),
Usage = ResourceUsage.Default,
};
Then the shader resource view looks like this:
ShaderResourceViewDescription srvd = new ShaderResourceViewDescription()
{
ArraySize = TextureCount,
FirstArraySlice = 0,
Dimension = ShaderResourceViewDimension.Texture2DArray,
Format = format,
MipLevels = 1,
MostDetailedMip = 0
};
Finally, to get a render target for a specific slice:
RenderTargetViewDescription rtd = new RenderTargetViewDescription()
{
ArraySize = 1,
FirstArraySlice = SliceIndex,
Dimension = RenderTargetViewDimension.Texture2DArray,
Format = this.Format
};
Bind that to your passthrough shader, set the desired texture as input and the slice as output, and draw a full-screen quad (or full-screen triangle).
Please note that this texture can also be saved in DDS format (which saves you from regenerating it every time you start your program).
Looking up your texture then works like this:
Texture2DArray myarray;
In the pixel shader:
myarray.Sample(mySampler, float3(uv, SliceIndex));
Now, about rendering sprites, you also have the option of GS expansion.
You create a vertex buffer containing only the position/size/texture index/whatever else you need, with one vertex per sprite.
Send a draw call with n sprites (the topology needs to be set to point list).
Pass the data through from the vertex shader to the geometry shader.
Expand each point into a quad in the geometry shader. You can find an example doing that, ParticlesGS, in the Microsoft SDK; it's a bit overkill for your case since you only need the rendering part, not the animation. If you need some cleaned-up code, let me know and I'll quickly make a DX10-compatible sample (in my case I use StructuredBuffers instead of a VertexBuffer).
Using a pre-made quad and passing the above data in a per-instance VertexBuffer is also possible, but if you have a high number of sprites it will easily blow up your graphics card (by high I mean something like over 3 million particles, which is not much by today's standards, but if you're under half a million sprites you'll be totally fine ;)
Include the texture index within the instance buffer and use this to select the correct texture from the texture array per instance:
struct VS
{
    float3 Position : POSITION;
    float2 TexCoord : TEXCOORD0;
    float  TexIndex : TexIndex; // From the instance buffer, not the vertex buffer
};

Then pass this value on through to the pixel shader:

struct PS
{
    float4 Position : SV_POSITION;
    float3 TexCoord : TEXCOORD0;
};

...

vout.TexCoord = float3(vin.TexCoord, vin.TexIndex);

Off-screen multiple render targets using Frame Buffer Object (FBO) or?

Situation: generating N samples of a shape and the corresponding edges (using a Sobel filter or my own) with different transformations and rotations, while the viewport (size = 600*600) and camera remain constant; i.e. there will be N samples + N corresponding edges.
I am thinking of doing it like this:
Use one FBO with 2 renderbuffers [i.e. the size of each buffer will be (N*600) * 600]: the 1st for the N shapes and the 2nd for the edges of the corresponding shapes.
Questions:
Which is the best way to achieve the above?
Though the viewport size is 600*600 pixels, the shape will only occupy around 50*50 pixels. So is there an efficient way to apply edge detection only to the bounding box/AABB region in the 2nd buffer? And is there an efficient way to read back only the 2N bounding boxes (N samples + N corresponding edges)?
1: I'm not sure what you'd call the "best way". Use Multiple Render Targets: create two 600*N textures, attach them both to the FBO, select them with glDrawBuffers, and in your fragment shader do something like this:
layout(location = 0) out vec3 color;
layout(location = 1) out vec3 edges;
When writing to "color" and "edges", you'll effectively write into your textures. A minimal setup for this is sketched below.
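Here is that sketch on the application side, assuming texColor and texEdges are two already-created textures of the same size (completeness checks omitted):

GLuint fbo;
glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);

// One colour attachment per fragment shader output.
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, texColor, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT1,
                       GL_TEXTURE_2D, texEdges, 0);

// Route shader output 0 to "color" and output 1 to "edges".
const GLenum bufs[2] = { GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1 };
glDrawBuffers(2, bufs);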
2: You shouldn't do this. Compute your bounding boxes on the CPU and project them (i.e. multiply each corner by your ModelViewProjection matrix) to get the bounding boxes in 2D; see the sketch below.
By the way: compute your bounding boxes first, so that you won't need 600*600 textures but only about 50*50.
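A sketch of that projection step, using GLM for the math; the function and its arguments are illustrative, and it assumes the box lies entirely in front of the camera (otherwise the divide by w is not safe):

#include <glm/glm.hpp>

// Project the 8 world-space corners of a box and return its 2D extent in
// normalized device coordinates.
void projectAABB(const glm::mat4& mvp, const glm::vec3 corners[8],
                 glm::vec2& ndcMin, glm::vec2& ndcMax)
{
    ndcMin = glm::vec2( 1e9f);
    ndcMax = glm::vec2(-1e9f);
    for (int i = 0; i < 8; ++i) {
        glm::vec4 clip = mvp * glm::vec4(corners[i], 1.0f);
        glm::vec2 ndc  = glm::vec2(clip) / clip.w;  // perspective divide
        ndcMin = glm::min(ndcMin, ndc);
        ndcMax = glm::max(ndcMax, ndc);
    }
}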
EDIT: You usually restrict the drawn zone with glViewport. But there is only one viewport, and you need several. You can try the viewport array extension and live on the bleeding edge, or pass the AABBs in a texture, or not worry about it until performance actually matters...
Oh, and you can't use Sobel just like that... Sobel requires reading all the texels around the current one, which you can't do while you're still rendering those very texels. Either make a two-pass algorithm without MRTs (first color, then edges), or don't use Sobel and guess the edges in the shader (I don't really see how).
Like Calvin said, you have to first render your object into the first framebuffer and then bind that as a texture (use a texture attachment rather than a renderbuffer) for the second pass that finds the edges, since edge detection usually needs access to a pixel's surrounding pixels.
Regarding your second question, you could probably use the stencil buffer. Just draw your shapes in the first pass and let them write a reference value into the stencil buffer. Then do the edge detection (usually by rendering a screen-sized quad with the corresponding fragment shader) and configure the stencil test to only pass where the stencil buffer contains the reference value. This way (assuming early-z hardware, which is quite common now) the fragment shader will only be executed on the pixels the shape has actually been drawn onto.
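A rough sketch of that stencil flow, assuming the framebuffer has an 8-bit stencil attachment; drawShapes() and drawFullScreenQuad() are placeholders for the two passes:

// Pass 1: draw the shapes, tagging every covered pixel with reference 1.
glEnable(GL_STENCIL_TEST);
glClearStencil(0);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT);
glStencilFunc(GL_ALWAYS, 1, 0xFF);          // always pass, reference = 1
glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE);  // write ref where shapes render
drawShapes();                               // hypothetical shape pass

// Pass 2: edge detection only where the stencil equals the reference.
glStencilFunc(GL_EQUAL, 1, 0xFF);
glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);     // leave the stencil untouched
drawFullScreenQuad();                       // hypothetical edge-detect pass
glDisable(GL_STENCIL_TEST);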