Shader framebuffer readback - glsl

I was wondering if there is support in the newer shader models to read back a pixel value from the target framebuffer. I assume that this is already done in later (non-programmable) stages of the drawing pipeline, which made me hope that this feature might have been added to the programmable pipeline.
I am aware that it is possible to draw to a texture-bound framebuffer and then send this texture to the shader; I was just hoping for a more elegant way to achieve the same functionality.

As Andrew notes, framebuffer access is logically a separate stage from the fragment shader, so reading the framebuffer in the fragment shader is impossible. The reason for this (to answer Andrew's question) is a combination of performance and the ordering requirements of the graphics pipeline. The way the rendering pipeline is defined, framebuffer blending operations MUST occur in the same order as the triangles/primitives that entered the beginning of the pipeline. The fragment shaders, on the other hand, can run in any order. By keeping them as separate stages, the GPU is free to run fragment shaders as fast as it can, as their inputs become available, without having to synchronize between them. As long as it maintains enough buffer space to hold on to the outputs of the fragment shaders, so that they can be accumulated and the framebuffer blends and writes can occur in order, all is well, because the results of any given fragment shader are not visible until after the blending stage.
If there were a way for the fragment shader to read the framebuffer, it would require some sort of synchronization to ensure that those reads happen in order, greatly slowing things down.

No. As you mention, rendering to a texture is the way to achieve that functionality.
If you take a look at a block diagram of a GPU pipeline, you'll see that the blending stage - which is what combines fragment shader output with the framebuffer - is separate from the fragment shader and is fixed-function.
I'm not a GPU designer, so I can only speculate about the reason for this. Presumably it is to keep framebuffer access fast and to insulate the fragment shader stage from the framebuffer so that it can be better parallelised. There are probably also issues regarding multi-sampling, and so on.
(Not to mention that fixed-function blending is "good enough" in most cases.)
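For completeness, a minimal sketch of the render-to-texture approach in OpenGL; the helper functions (drawScene, drawFullscreenQuad), the postProcessProgram, and the uSceneTex uniform are placeholders I've made up for illustration:

    // Pass 1: render the scene into a color texture attached to an FBO.
    GLuint colorTex = 0, fbo = 0;
    glGenTextures(1, &colorTex);
    glBindTexture(GL_TEXTURE_2D, colorTex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, colorTex, 0);
    drawScene();  // placeholder: normal scene rendering (add a depth attachment as needed)

    // Pass 2: draw to the default framebuffer, sampling the result of pass 1.
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    glUseProgram(postProcessProgram);  // placeholder program that samples uSceneTex
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, colorTex);
    glUniform1i(glGetUniformLocation(postProcessProgram, "uSceneTex"), 0);
    drawFullscreenQuad();              // placeholder helper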

Actually, I think this is now doable with Direct3D 11 SM 5.0 (I didn't test it, though).
You can bind a UAV to a 5.0 pixel shader, allowing read and write operations on it, using the OMSetRenderTargetsAndUnorderedAccessViews method.
In that case the back buffer of the swap chain into which you render has to be created with the DXGI_USAGE_UNORDERED_ACCESS flag (I guess).
This is used in the DirectX SDK OIT11 sample.
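A rough, untested sketch of what that binding could look like, based on the documented signature of OMSetRenderTargetsAndUnorderedAccessViews (the view and context variables are placeholders):

    // Bind one render target (slot 0) plus one UAV starting at slot 1, so a
    // SM 5.0 pixel shader can read and write the UAV (e.g. a RWTexture2D).
    ID3D11RenderTargetView*    rtv = backBufferRTV;  // placeholder RTV
    ID3D11UnorderedAccessView* uav = readWriteUAV;   // placeholder UAV
    UINT initialCount = (UINT)-1;                    // keep any existing hidden counter

    context->OMSetRenderTargetsAndUnorderedAccessViews(
        1, &rtv,           // NumRTVs, render target views
        depthStencilView,  // may be nullptr
        1,                 // UAVStartSlot (must follow the RTV slots)
        1, &uav,           // NumUAVs, unordered access views
        &initialCount);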

It is possible to read back the contents of the framebuffer in the fragment shader with the shader_framebuffer_fetch extension (e.g. EXT_shader_framebuffer_fetch). The support can be added to the GPU with some performance loss. In fact, these days I'm working on adding support for this extension to the OpenGL ES 2.0 driver of a well-known GPU brand in the consumer electronics market.
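For illustration, assuming the EXT variant of the extension, which exposes the current framebuffer value as gl_LastFragData[0] in an ES 2.0 fragment shader, usage looks roughly like this (a sketch; uTint is a made-up uniform):

    // GL ES 2.0 fragment shader source (embedded as a C++ string) that reads
    // the previous framebuffer color via EXT_shader_framebuffer_fetch.
    const char* fsSource = R"(
        #extension GL_EXT_shader_framebuffer_fetch : require
        precision mediump float;
        uniform vec4 uTint;                       // placeholder uniform
        void main() {
            vec4 dst = gl_LastFragData[0];        // read back the framebuffer value
            gl_FragColor = mix(dst, uTint, 0.5);  // custom blend done in the shader
        }
    )";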

You can draw to a texture TEX (using a render target view) and then bind that as an input to another shader (using a shader resource view). TEX is then a pseudo-framebuffer.

Related

Writing to depth buffer from OpenGL compute shader

Generally, on modern desktop OpenGL hardware, what is the best way to fill a depth buffer from a compute shader and then use that depth buffer for graphics-pipeline rendering with triangles, etc.?
Specifically, I am wondering about concerns regarding Hi-Z. I also wonder whether it's better to do the compute shader modifications to the depth buffer before or after the graphics rendering.
If the compute shader is run after the graphics rendering, I assume the depth buffer will typically be decompressed behind the scenes. But I worry that, done the other way around, the depth buffer may be left in a decompressed/non-optimal state for the graphics pipeline.
As far as I know, you cannot bind textures with any of the depth formats as images, and thus cannot write to depth-format textures in compute shaders. See the glBindImageTexture documentation; it lists the formats with which your texture format must be compatible. Depth formats are not among them, and the specification says the depth formats are not compatible with the normal formats.
Texture copying functions have the same compatibility restrictions, so you can't even, for example, write to a normal texture in the compute shader and then copy to a depth texture. glCopyImageSubData does not explicitly have that restriction, but I haven't tried it and it's not part of the core profile anymore.
What might work is writing to a normal texture, then rendering a fullscreen triangle and setting gl_FragDepth to values read from the texture, but that's an additional fullscreen pass.
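A sketch of that fullscreen pass (my own illustration, untested; uDepthSource is a placeholder for the texture the compute shader wrote):

    // Fragment shader for a fullscreen triangle that copies values from a
    // texture (previously filled by the compute shader) into gl_FragDepth.
    const char* depthCopyFS = R"(
        #version 430
        uniform sampler2D uDepthSource;  // placeholder: R32F texture written by the compute shader
        void main() {
            gl_FragDepth = texelFetch(uDepthSource, ivec2(gl_FragCoord.xy), 0).r;
        }
    )";
    // Draw the fullscreen triangle with depth writes forced through, e.g.:
    // glEnable(GL_DEPTH_TEST); glDepthFunc(GL_ALWAYS); glDepthMask(GL_TRUE);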
I don't quite understand your second question - if your compute shader stuff modifies the depth buffer, the result will most likely be different depending on whether you do it before or after regular rendering because different parts will be visible or occluded.
But maybe that question is moot, since it seems you cannot manually write into depth buffers at all - which might also answer your third question: by not writing into depth buffers, you cannot mess with their compression :)
Please note that I'm no expert in this; I had a similar problem and looked at the docs/spec myself, so this all might be wrong :) Please let me know if you manage to write to depth buffers with compute shaders!

unity3d, multiple render targets - different behavior in Direct3D/OpenGL

I'm writing a shader for Unity3d. The shader uses multiple render targets to render a post-processing effect.
However, I've run into interesting issue.
When Unity3d runs in Direct3D mode, by default all standard shaders write data only into the first color buffer (i.e. the one with index 0). That is, if I attach 3 color buffers to a camera and call Camera.Render, the color buffer with index 0 will contain the rendered scene, and all the other buffers will remain untouched unless some shader specifically writes to them. My shader relies on that behavior (I use the buffers with indexes 1 and 2 to accumulate data needed for the post-process effect).
However, in OpenGL mode the standard Unity3d shaders write to ALL color buffers at once. That is, if I attach multiple render buffers to a camera and call Camera.Render, all 3 buffers will contain a copy of the rendered scene.
That breaks my shader in OpenGL mode.
How can I fix that? I need to render the whole scene in one go, and only objects that have a specific shader should modify the additional color buffers.
I need to render the scene in one go because using layer masks causes Unity to recalculate projector shadows for ALL lights, and I need the shadows to be correct.
Advice?
Sadly, it turned out that "not writing into one of the render targets" is undocumented behavior in OpenGL. The standard Unity shader, when compiled for the forward rendering path, produces a gl_FragData[0] = ...; assignment and writes into only one buffer, which triggers the undocumented behavior and causes the mess.
In order to fix that problem, I would need to make Unity write data explicitly into the additional render targets in its standard shaders. Unfortunately, this cannot be done, because there is no "entry point" to hook into the standard shader and write additional data into the other color buffers. The closest thing to that is the "finalcolor" modifier, but it does not actually allow writing into additional buffers from a CG shader (that would require writing the additional data from the fragment shader, which is inaccessible from a surface shader); it is only possible to modify one color.
I decided to rewrite a portion of the shader (so it won't trigger the undocumented behavior in OpenGL) and gave up on having Unity shadowmap support in the effect. As far as I know, there are no other options short of modifying the Unity engine (which requires "special arrangements" and source code access) or replacing the entire lighting system with my own.
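For reference, the difference in plain GLSL terms is between assigning only gl_FragData[0] (which, as described above, leaves the other attachments in an undocumented state) and writing every attachment explicitly. A sketch of the explicit form, outside of Unity's shader pipeline and with illustrative values:

    // Plain GLSL fragment shader that writes every MRT attachment explicitly,
    // instead of assigning only gl_FragData[0] and leaving the rest alone.
    const char* mrtFS = R"(
        #version 120
        void main() {
            gl_FragData[0] = vec4(1.0, 0.0, 0.0, 1.0);  // scene color (illustrative)
            gl_FragData[1] = vec4(0.0);                 // post-process accumulation A
            gl_FragData[2] = vec4(0.0);                 // post-process accumulation B
        }
    )";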

Relation between depth-only FBO and fragment shader

I've been wondering what happens to the fragment shader stage when binding a depth-only FBO (only the GL_DEPTH_ATTACHMENT gets attached and glDrawBuffer(GL_NONE) is called). Because any color is discarded:
does OpenGL simply process vertices the regular way, call the rasterizer, apply the fragment shader for rasterized fragments, but discard any result
or does it do smarter things, like process vertices until the optional geometry shader, then cut the fragment shader part and use a dummy fragment shader in order to discard useless color computations?
Because of vendor-implementation details, I guess it might vary, but I’d like to have a better idea about that topic.
In my experience, the fragment shader will still run even if it has no outputs. This can be used for example to draw shadow maps with punch-through alpha textures, using discard.
If it does have outputs (or more outputs than are bound), then they should just be ignored. I'd imagine that a smart driver could easily skip the fragment shader entirely if it doesn't contain any discard statements.
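As an illustration of the punch-through case mentioned above (my own sketch, not from the answer; uAlphaTex and vTexCoord are assumed names):

    // Depth-only shadow-map fragment shader with punch-through alpha: it has
    // no color output, but still runs so it can discard transparent texels.
    const char* shadowAlphaFS = R"(
        #version 330
        uniform sampler2D uAlphaTex;  // placeholder alpha/diffuse texture
        in vec2 vTexCoord;
        void main() {
            if (texture(uAlphaTex, vTexCoord).a < 0.5)
                discard;              // hole in the shadow caster
            // depth is written by the rasterizer as usual
        }
    )";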
Also perhaps look into Separate Shader Objects (https://www.opengl.org/registry/specs/ARB/separate_shader_objects.txt). It allows you to disable the stages manually.
I've read (though never personally tested) that a complete lack of a color buffer causes strange undefined behavior, as each OpenGL implementation had to answer this question in reverse: "What /should/ we make it do when there's no color buffer?" - and there is no official answer commonly used across all implementations.
The official documentation carefully avoids mentioning this situation generally.
As such, it is just recommended that you simply... not do that, and instead always have a color buffer, even if you don't use it.

D3D11 Writing to buffer in geometry shader

I have some working OpenGL code that I was asked to port to Direct3D 11.
In my code I am using Shader Storage Buffer Objects (SSBOs) to read and write data in a geometry shader.
I am pretty new to Direct3D programming. Thanks to Google, I've been able to identify the D3D equivalent of SSBOs: RWStructuredBuffer (I think).
The problem is that I am not at all sure I can use them in a geometry shader in D3D11, which, from what I understand, can generally only use up to 4 "stream out" targets (are these some sort of transform feedback buffer?).
The question is: is there any way with D3D11/11.1 to do what I'm doing in OpenGL (that is writing to SSBOs from the geometry shader)?
UPDATE:
Just found this page: http://msdn.microsoft.com/en-us/library/windows/desktop/hh404562%28v=vs.85%29.aspx
If I understand correctly the section "Use UAVs at every pipeline stage", it seems that accessing such buffers is allowed in all shader stages.
Then I discovered that DX11.1 is available only on Windows 8, but some features have also been ported to Windows 7.
Is this part of Direct3D included in those features available on Windows 7?
RWBuffers are not related to the geometry shader outputting geometry; they are found mostly in compute shaders and, to a lesser extent, in pixel shaders, and as you spotted, the other stages need D3D 11.1 and Windows 8.
What you are looking for is stream output. The API to bind buffers to the output of the geometry shader stage is ID3D11DeviceContext::SOSetTargets, and the buffers need to be created with the D3D11_BIND_STREAM_OUTPUT flag.
Also, outputting geometry with a geometry shader was an addition in D3D10; in D3D11, it is often possible to achieve something at least as efficient and simpler with compute shaders. That's not absolute advice, of course.
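A rough sketch of that setup (untested; the buffer size is a placeholder, and the geometry shader itself must be created with CreateGeometryShaderWithStreamOutput and a matching output declaration):

    // Create a buffer that the geometry shader can stream its output into.
    D3D11_BUFFER_DESC desc = {};
    desc.ByteWidth = 1024 * 1024;  // placeholder size
    desc.Usage     = D3D11_USAGE_DEFAULT;
    desc.BindFlags = D3D11_BIND_STREAM_OUTPUT | D3D11_BIND_VERTEX_BUFFER;

    ID3D11Buffer* soBuffer = nullptr;
    device->CreateBuffer(&desc, nullptr, &soBuffer);

    // Bind it as the stream-output target before issuing the draw call.
    UINT offset = 0;
    context->SOSetTargets(1, &soBuffer, &offset);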
The geometry shader is processed once per assembled primitive and can generate one or more primitives as a result.
The output of the geometry shader can be redirected towards an output buffer instead of passed on further for rasterization.
See this overview diagram of the pipeline and this description of the pipeline stages.
A geometry shader has access to other resources, bound via the GSSetShaderResources method on the device context. However, these are generally resources that are "fixed" at shader execution time such as constants and textures. The data that varies for each execution of the geometry shader is the input primitive to the shader.
I've just been pointed to this page:
http://nvidia.custhelp.com/app/answers/detail/a_id/3196/~/fermi-and-kepler-directx-api-support
In short, NVIDIA does not support the feature on cards older than Maxwell.
This pretty much answers my question. :/

Quick question about glColorMask and how it works

I want to render the depth buffer to do some nice shadow mapping. My drawing code, though, consists of many shader switches. If I set glColorMask(0,0,0,0) and leave all shader programs, textures and the rest as they are, and just render the depth buffer, will it be 'OK'? I mean, if glColorMask disables the "write of color components", does that mean that per-fragment shading IS NOT going to be performed?
For rendering a shadow map, you will normally want to bind a depth texture (preferably square and power-of-two, because stereo drivers take this as a hint!) to an FBO and use exactly one shader (as simple as possible) for everything. You do not want to attach a color buffer, because you are not interested in color at all, and it puts unnecessary extra pressure on the ROPs (plus, some hardware can render at double speed or more with depth-only). You do not want to switch between many shaders.
Depending on whether you do "classic" shadow mapping or something more sophisticated such as exponential shadow maps, the shader that you will use is either as simple as it can be (constant color, and no depth write), or performs some (moderately complex) calculations on depth; but you normally do not want to perform any colour calculations, since those would be needless calculations whose results will not be visible in any way.
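A minimal sketch of that kind of depth-only setup in OpenGL (the size and format are illustrative assumptions):

    // Depth-only FBO for shadow mapping: a depth texture and no color buffer.
    GLuint depthTex = 0, shadowFBO = 0;
    glGenTextures(1, &depthTex);
    glBindTexture(GL_TEXTURE_2D, depthTex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT24, 1024, 1024, 0,
                 GL_DEPTH_COMPONENT, GL_FLOAT, nullptr);

    glGenFramebuffers(1, &shadowFBO);
    glBindFramebuffer(GL_FRAMEBUFFER, shadowFBO);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                           GL_TEXTURE_2D, depthTex, 0);
    glDrawBuffer(GL_NONE);  // no color attachment, no color writes
    glReadBuffer(GL_NONE);

    // Render the shadow casters with one simple depth-only shader,
    // then bind depthTex as the shadow map in the main pass.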
No, the fragment operations will be performed anyway, but their result will be squashed by your zero color mask.
If you don't want some fragment operations to be performed, use a shader program which has an empty fragment shader attached and set the draw buffer to GL_NONE.
There is another way to disable fragment processing - enabling GL_RASTERIZER_DISCARD - but you won't get even the depth values in this case :)
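A small sketch of both options (depthOnlyProgram and drawGeometry are placeholders):

    // Option 1: keep rasterization and depth writes, but skip color work:
    // a program with an empty fragment shader plus a disabled draw buffer.
    glUseProgram(depthOnlyProgram);  // placeholder program with an empty fragment shader
    glDrawBuffer(GL_NONE);

    // Option 2: discard primitives before rasterization entirely
    // (no color and no depth output, e.g. for transform feedback only).
    glEnable(GL_RASTERIZER_DISCARD);
    drawGeometry();                  // placeholder draw call
    glDisable(GL_RASTERIZER_DISCARD);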
No, the shader programs execute independently of the fixed-function pipeline. Setting glColorMask will have no effect on the shader programs.