D3D11: Writing to a buffer in a geometry shader

I have some working OpenGL code that I was asked to port to Direct3D 11.
In my code I am using Shader Storage Buffer Objects (SSBOs) to read and write data in a geometry shader.
I am pretty new to Direct3D programming. Thanks to Google I've been able to identify what I think is the D3D equivalent of SSBOs: RWStructuredBuffer.
The problem is that I am not at all sure I can use them in a geometry shader in D3D11, which, from what I understand, can generally only write to up to 4 "stream out" buffers (are these some sort of transform feedback buffer?).
The question is: is there any way with D3D11/11.1 to do what I'm doing in OpenGL (that is writing to SSBOs from the geometry shader)?
UPDATE:
Just found this page: http://msdn.microsoft.com/en-us/library/windows/desktop/hh404562%28v=vs.85%29.aspx
If I understand the section "Use UAVs at every pipeline stage" correctly, it seems that accessing such buffers is allowed in all shader stages.
Then I discovered that D3D 11.1 is only available on Windows 8, although some features have also been ported to Windows 7.
Is this part of Direct3D included in those features available on Windows 7?

RWBuffers are not related to the geometry shader outputting geometry; they are found mostly in compute shaders and, to a lesser extent, in pixel shaders, and, as you spotted, the other stages need D3D 11.1 and Windows 8.
What you are looking for is stream output. The API to bind buffers to the output of the geometry shader stage is ID3D11DeviceContext::SOSetTargets, and the buffers need to be created with the flag D3D11_BIND_STREAM_OUTPUT.
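As a rough sketch of what that setup looks like on the API side (untested; the geometry shader bytecode, the POSITION-only output layout and the buffer size are placeholders, not taken from the question):

    #include <d3d11.h>

    // Create a buffer the geometry shader can stream its output into, create the
    // geometry shader with a stream-output declaration, and bind the buffer.
    void SetupStreamOutput(ID3D11Device* device, ID3D11DeviceContext* context,
                           const void* gsBytecode, SIZE_T gsSize,
                           ID3D11Buffer** soBuffer, ID3D11GeometryShader** gs)
    {
        D3D11_BUFFER_DESC desc = {};
        desc.ByteWidth = 1024 * 1024;                              // placeholder size
        desc.Usage     = D3D11_USAGE_DEFAULT;
        desc.BindFlags = D3D11_BIND_STREAM_OUTPUT | D3D11_BIND_VERTEX_BUFFER;
        device->CreateBuffer(&desc, nullptr, soBuffer);

        // Describe which GS output values land in which stream-output slot.
        D3D11_SO_DECLARATION_ENTRY soDecl[] = {
            // stream, semantic, index, start component, component count, output slot
            { 0, "POSITION", 0, 0, 3, 0 },
        };
        UINT strides[] = { 3 * sizeof(float) };
        device->CreateGeometryShaderWithStreamOutput(
            gsBytecode, gsSize,
            soDecl, 1, strides, 1,
            0,      // stream 0 is also rasterized; pass D3D11_SO_NO_RASTERIZED_STREAM to skip that
            nullptr, gs);

        // Bind the buffer to the stream-output stage.
        UINT offsets[] = { 0 };
        context->SOSetTargets(1, soBuffer, offsets);
    }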
Also, outputting geometry from a geometry shader was an addition in D3D10; in D3D11 it is often possible to achieve something at least as efficient, and simpler, with compute shaders. That's not an absolute recommendation, of course.

The geometry shader is processed once per assembled primitive and can generate one or more primitives as a result.
The output of the geometry shader can be redirected towards an output buffer instead of passed on further for rasterization.
See this overview diagram of the pipeline and this description of the pipeline stages.
A geometry shader has access to other resources, bound via the GSSetShaderResources method on the device context. However, these are generally resources that are "fixed" at shader execution time such as constants and textures. The data that varies for each execution of the geometry shader is the input primitive to the shader.
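For example, binding a texture or structured-buffer view so the geometry shader can read it might look like this (a hypothetical helper; the slot number is arbitrary):

    // Bind one shader resource view to slot t0 of the geometry shader stage.
    void BindResourceToGS(ID3D11DeviceContext* context, ID3D11ShaderResourceView* srv)
    {
        context->GSSetShaderResources(0, 1, &srv);   // slot 0 == register t0 in HLSL
        // Constant buffers are bound analogously with GSSetConstantBuffers.
    }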

I've just been pointed to this page:
http://nvidia.custhelp.com/app/answers/detail/a_id/3196/~/fermi-and-kepler-directx-api-support .
In short, NVIDIA does not support the feature on cards older than Maxwell.
This pretty much answers my question. :/

Related

Writing to the depth buffer from an OpenGL compute shader

Generally on modern desktop OpenGL hardware what is the best way to fill a depth buffer from a compute shader and then use that depth buffer for graphics pipeline rendering with triangles etc?
Specifically I am wondering about concerns regarding Hi-Z. I also wonder whether it's better to do the compute shader modifications to the depth buffer before or after the graphics rendering?
If the compute shader is run after the graphics rendering, I assume the depth buffer will typically be decompressed behind the scenes. But I worry that, done the other way around, the depth buffer may be left in a decompressed/non-optimal state for the graphics pipeline?
As far as I know, you cannot bind textures with any of the depth formats as images, and thus cannot write to depth-format textures in compute shaders. See the glBindImageTexture documentation; it lists the formats your texture format must be compatible with. Depth formats are not among them, and the specification says the depth formats are not compatible with the normal formats.
Texture copying functions have the same compatibility restrictions, so you can't even, for example, write to a normal texture in the compute shader and then copy to a depth texture. glCopyImageSubData does not explicitly have that restriction, but I haven't tried it and it's not part of the core profile anymore.
What might work is writing to a normal texture, then rendering a fullscreen triangle and setting gl_FragDepth to values read from the texture, but that's an additional fullscreen pass.
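A minimal sketch of that fallback (the uniform name and texture format are assumptions; depth writes must be enabled for the pass):

    // GLSL source for the extra fullscreen pass: read the depth values the compute
    // shader wrote into a regular (e.g. r32f) texture and emit them via gl_FragDepth.
    const char* kDepthCopyFragmentShader = R"(
        #version 430
        uniform sampler2D uComputedDepth;   // hypothetical name for the r32f texture
        void main() {
            gl_FragDepth = texelFetch(uComputedDepth, ivec2(gl_FragCoord.xy), 0).r;
        }
    )";
    // Draw a single fullscreen triangle with this shader, glDepthFunc(GL_ALWAYS),
    // glDepthMask(GL_TRUE), and optionally glColorMask all false so only depth changes.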
I don't quite understand your second question - if your compute shader stuff modifies the depth buffer, the result will most likely be different depending on whether you do it before or after regular rendering because different parts will be visible or occluded.
But maybe that question is moot since it seems you cannot manually write into depth buffers at all - which might also answer your third question - by not writing into depth buffers you cannot mess with the compression of it :)
Please note that I'm no expert in this; I had a similar problem and looked at the docs/spec myself, so this all might be wrong :) Please let me know if you manage to write to depth buffers with compute shaders!

Reading current framebuffer

Is there a way to read a fragment from the framebuffer that is currently being rendered?
So, I'm looking for a way to read color information from the fragment at the position that the current fragment will probably overwrite; that is, the exact position of the fragment that was previously rendered.
I found that gl_FragData and gl_LastFragData are added to shaders by certain EXT_ extensions, but if they are what I need, could somebody explain how to use them?
I am looking either for a OpenGL or OpenGL ES 2.0 solution.
EDIT:
All the time I was searching for a solution that would allow me to have some kind of read-and-write "uniform" accessible from shaders. For anyone out there searching for a similar thing: OpenGL 4.3+ supports image and buffer storage types. They allow both reading and writing to them simultaneously, and in combination with compute shaders they prove to be a very powerful tool.
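To illustrate that (a sketch only; the binding point, image format and the "brighten in place" operation are arbitrary), a GL 4.3+ shader can read and write the same image:

    // GLSL compute shader snippet using image load/store.
    const char* kReadWriteImageShader = R"(
        #version 430
        layout(local_size_x = 8, local_size_y = 8) in;
        layout(binding = 0, rgba8) uniform image2D uImage;
        void main() {
            ivec2 p = ivec2(gl_GlobalInvocationID.xy);
            vec4 c = imageLoad(uImage, p);      // read...
            imageStore(uImage, p, c * 1.1);     // ...and write the same texel
        }
    )";
    // The texture is bound on the API side with something like:
    // glBindImageTexture(0, tex, 0, GL_FALSE, 0, GL_READ_WRITE, GL_RGBA8);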
Your question seems rather confused.
Part of your question (the first sentence) asks if you can read from the framebuffer in the fragment shader. The answer is, generally no. There is an OpenGL ES 2.0 extension that lets you do so, but it's only supported on some hardware. In desktop GL 4.2+, you can use arbitrary image load/store to get the same effect. But you can't render to that image anymore; you have to write your data using image storing functions.
gl_LastFragData is pretty simple: it's the color from the sample in the framebuffer that will be overwritten by this fragment shader. You can do with it what you wish, if it is available.
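If the extension is present (this sketch assumes the OpenGL ES 2.0 EXT_shader_framebuffer_fetch flavor; the uniform is hypothetical), usage is essentially:

    // Fragment shader: blend the incoming color with whatever is already in the
    // framebuffer at this pixel, read through gl_LastFragData.
    const char* kFramebufferFetchShader = R"(
        #extension GL_EXT_shader_framebuffer_fetch : require
        precision mediump float;
        uniform vec4 uTint;
        void main() {
            vec4 previous = gl_LastFragData[0];   // color currently in the framebuffer
            gl_FragColor = mix(previous, uTint, 0.5);
        }
    )";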
The second part of your question (the second paragraph) is a completely different question. There, you're asking about fragments that were potentially never written to the framebuffer. You can't read fragments; you can only read images. And if a fragment fails the depth test, then its data was never rendered to an image. So you can't read it.
With most NVIDIA hardware you can use the GL_NV_texture_barrier extension to read from a texture that's currently bound to a framebuffer. But bear in mind that you won't be able to read data any more recent than what was produced in the previous draw call.

What is the difference between OpenGL and GLSL?

I recently started programming with OpenGL. I've written code creating basic primitives and have used shaders in WebGL. I've googled the subject extensively, but it's still not that clear to me. Basically, here's what I want to know: is there anything that can be done in GLSL that can't be done in plain OpenGL, or does GLSL just do things more efficiently?
The short version is: OpenGL is an API for rendering graphics, while GLSL (which stands for GL shading language) is a language that gives programmers the ability to modify pipeline shaders. To put it another way, GLSL is a (small) part of the overall OpenGL framework.
To understand where GLSL fits into the big picture, consider a very simplified graphics pipeline.
Vertexes specified ---(vertex shader)---> transformed vertexes ---(primitive assembly)---> primitives ---(rasterization)---> fragments ---(fragment shader)---> output pixels
The shaders (here, just the vertex and fragment shaders) are programmable. You can do all sorts of things with them. You could just swap the red and green channels, or you could implement a bump mapping to make your surfaces appear much more detailed. Writing these shaders is an important part of graphics programming. Here's a link with some nice examples that should help you see what you can accomplish with custom shaders: http://docs.unity3d.com/Documentation/Components/SL-SurfaceShaderExamples.html.
In the not-too-distant past, the only way to program them was to use GPU assembler. In OpenGL's case, the language is known as ARB assembler. Because of the difficulty of this, the OpenGL folks gave us GLSL. GLSL is a higher-level language that can be compiled and run on graphics hardware. So to sum it all up, programmable shaders are an integral part of the OpenGL framework (or any modern graphics API), and GLSL makes it vastly easier to program them.
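To make the relationship concrete, here is a rough sketch (error checking and context creation omitted; the trivial red shader is just an example) of the OpenGL API side compiling a piece of GLSL:

    // OpenGL (the C API) creates and compiles the shader object;
    // GLSL is the language of the source string it is fed.
    const char* fragSrc =
        "#version 330 core\n"
        "out vec4 color;\n"
        "void main() { color = vec4(1.0, 0.0, 0.0, 1.0); }\n";   // plain red

    GLuint fs = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(fs, 1, &fragSrc, nullptr);
    glCompileShader(fs);
    // ...then glAttachShader/glLinkProgram into a program and glUseProgram it.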
As also covered by Mattsill's answer, the GL Shading Language, or GLSL, is the part of OpenGL that enables the creation of algorithms called shaders in/for OpenGL. Shaders run on the GPU.
Shaders make decisions about factors such as the color of parts of surfaces and the way surfaces share information such as reflected light. Vertex shaders, geometry shaders, tessellation shaders and pixel shaders are types of shader that can be written in GLSL.
Q1:
Is there anything that can be done in GLSL that can't be done in plain OpenGL?
A:
You may be able to use just OpenGL without the GLSL parts, but if you want your own surface properties you'll probably want a shader to make this reasonably simple and performant, created in something like GLSL. There are some examples below.
Q2:
Or does GLSL just do things more efficiently?
A:
Pixel shaders specifically are very parallel, calculating values independently for every cell of a 2D grid, but they also come with significant caveats, like not being able to handle "if"-statement-like conditions very performantly, so it's a case of using the different kinds of shaders to their strengths, on surfaces described and dealt with in the rest of OpenGL.
Q3:
I suspect you want to know if just using GLSL is an option, and I can only answer this with my knowledge of one kind of shader, Pixel Shaders. The rest of this answer covers "just" using GLSL as a possible option:
A:
While GLSL is a part of OpenGL, you can use the rest of OpenGL to set up the environment and write your program almost entirely as a pixel shader, where each element of the pixel shader colours a pixel of the whole screen.
For example:
(Note that WebGL has a tendency to hog the CPU to the point of stalling the whole system, and Windows 8.1 lets it do so; Chrome seems better at viewing these links than Firefox.)
No, this is not a video clip of real water:
https://www.shadertoy.com/view/Ms2SD1
The only external resources fed to this snail are some easily generated textures:
https://www.shadertoy.com/view/ld3Gz2
Rendering using noisy fractal clouds of points:
https://www.shadertoy.com/view/Xtc3RS
https://www.shadertoy.com/view/MsdGzl
A perfect sphere: 1 polygon, 1 surface, no edges or vertices:
https://www.shadertoy.com/view/ldS3DW
A particle-system-like simulation with cars on a racetrack, using a second narrow but long pixel shader as a table of data about car positions:
https://www.shadertoy.com/view/Md3Szj
Random values are fairly straightforward:
fract(sin(p)*10000.)
I've found the language in some respects hard to work with, and it may or may not be particularly practical to use GLSL in this way for a large project such as a game or simulation. However, as these demos show, a computer game does not have to look like a computer game, and this sort of approach should be an option, perhaps used with generated content and/or external data.
As I understand it, to perform reasonably, pixel shaders in OpenGL:
Have to be loaded into a small piece of memory.
Do not handle well:
"if"-statement-like conditions.
Recursion or while-loop-like flow control.
Are restricted to a small pool of valid instructions and data types.
Including "sin", mod, vector multiplication, floats and half-precision floats.
Lack high-level features like objects or lambdas.
And effectively calculate values all at once, in parallel.
A consequence of all this is that code looks more like lines of closed-form equations and lacks algorithms or higher-level structures, using modular arithmetic for something akin to conditions.

How big can a WebGL fragment shader be?

I'm raytracing in the WebGL fragment shader, planning on using a dynamically generated fragment shader that contains the objects in my scene. As I add an object to my scene, I will be adding some lines to the fragment shader, so it could get pretty large. How big can it get and still work? Is it dependent on the graphics card?
The "safe" answer is that it depends on your hardware and drivers, but in practical terms they can be pretty crazy big. What JustSid said about performance does apply (bigger shader == slower shader) but it sounds like you're not exactly aiming for 60FPS here.
For a more in-depth breakdown of shader limits, check out http://en.wikipedia.org/wiki/High_Level_Shader_Language. The page is about Direct X shaders, but all the shader constraints apply to GLSL as well.
There are all kinds of limits on how big shaders can be. It's up to the driver/GPU, but it's also up to the machine and WebGL. Some WebGL implementations (Chrome) run an internal timer; if a single GL command takes too long they'll kill WebGL ("Rats, WebGL hit a snag"). This can happen when a shader takes too long to compile.
Here's an example of a silly 100k shader where I ran a model through a script to generate a mesh in the shader itself. It runs for me on macOS on my MacBook Pro. It also runs on my iPhone 6+. But when I try it on Windows, the DirectX driver takes too long trying to optimize it and Chrome kills it.
It's fun to goof off and put geometry in your shader, and it's fun to goof off with signed distance fields and ray-marching-type stuff, but those techniques are more of a puzzle / plaything. They are not in any way performant. As a typical example, this SDF-based shader runs at about 1 frame per second fullscreen on my laptop, and yet my laptop is capable of displaying all of Los Santos from GTA V at a reasonable framerate.
If you want performance you should really be using more standard techniques, putting your geometry in buffers and doing forward or deferred rendering.
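In OpenGL terms (the WebGL API is nearly identical), the "standard" path looks roughly like this sketch, with the geometry living in a GPU buffer rather than inside the shader:

    // Put vertex data in a buffer once, then draw from it.
    float triangle[] = { -0.5f, -0.5f,   0.5f, -0.5f,   0.0f, 0.5f };   // 3 vertices, 2D
    GLuint vbo = 0;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, sizeof(triangle), triangle, GL_STATIC_DRAW);
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, nullptr);
    glDrawArrays(GL_TRIANGLES, 0, 3);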
You don't want to create massive fragment shaders since they tend to drain the performance very fast. You should, if possible, do any calculation either already on the CPU or in the vertex shader, but not for every pixel.

Shader framebuffer readback

I was wondering if there is support in the newer shader models to read back a pixel value from the target framebuffer. I assume that this is already done in later (non-programmable) stages of the drawing pipeline, which made me hope that this feature might have been added to the programmable pipeline.
I am aware that it is possible to draw to a texture bound framebuffer and then send this texture to the shader, I was just hoping for a more elegant way to achieve the same functionality.
As Andrew notes, the framebuffer access is logically a separate stage from the fragment shader, so reading the framebuffer in the fragment shader is impossible. The reason for this (to answer Andrew's question) is a combination of performance and the ordering requirements of the graphics pipeline. The way the rendering pipeline is defined, framebuffer blending operations MUST occur in the same order as the triangles/primitives that went into the beginning of the pipeline. The fragment shaders, on the other hand, can happen in any order. So by having them be separate stages, the GPU is free to run fragment shaders as fast as it can, as their inputs become available, without having to synchronize between them. As long as it maintains enough buffer space to hold on to the outputs of the fragment shaders, so that they can be accumulated and allow the framebuffer blends and writes to occur in order, all is well, as the results of any given fragment shader are not visible until after the blending stage.
If there was a way for the fragment shader to read the framebuffer, it would require some sort of synchronization to ensure that those reads happen in order, thus greatly slowing things down.
No. As you mention, rendering to a texture is the way to achieve that functionality.
If you take a look at a block diagram of a GPU pipeline, you'll see that the blending stage - which is what combines fragment shader output with the framebuffer - is separate from the fragment shader and is fixed-function.
I'm not a GPU designer - so I can only speculate the reason for this. Presumably it is to keep framebuffer access fast and insulate the fragment shader stage from the frame buffer so that it can be better parallelised. There are probably also issues regarding multi-sampling, and so on.
(Not to mention that fixed-function blending is "good enough" in most cases.)
Actually I think this is now doable with Direct3D 11 SM 5.0 (I didn't test it, though).
You can bind a UAV to a SM 5.0 pixel shader, allowing read and write operations on it, using the method OMSetRenderTargetsAndUnorderedAccessViews.
In that case the back buffer of the swap chain you render into has to be created with the flag DXGI_USAGE_UNORDERED_ACCESS (I guess).
This is used in the DirectX SDK OIT11 sample.
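A rough sketch of that binding (untested; the slot numbers are illustrative, and the UAV is assumed to have been created over a resource with D3D11_BIND_UNORDERED_ACCESS):

    // Bind one render target plus one UAV so a SM 5.0 pixel shader can read and
    // write the UAV (declared e.g. as a RWTexture2D or RWStructuredBuffer in HLSL).
    void BindRTVAndUAV(ID3D11DeviceContext* context,
                       ID3D11RenderTargetView* rtv,
                       ID3D11UnorderedAccessView* uav)
    {
        UINT initialCounts[] = { UINT(-1) };   // -1 leaves any append/consume counter untouched
        context->OMSetRenderTargetsAndUnorderedAccessViews(
            1, &rtv, nullptr,   // one RTV, no depth-stencil view
            1, 1, &uav,         // UAVs share slots with RTVs, so this UAV goes in slot 1
            initialCounts);
    }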
It is possible to read back the contents of the framebuffer in the fragment shader with the shader framebuffer fetch extension (EXT_shader_framebuffer_fetch). The support can be added to the GPU with some performance loss. In fact, these days I'm working on adding support for this extension to the OpenGL ES 2.0 driver of a well-known GPU brand in the consumer electronics market.
You can draw to a texture TEX (using a render target view) and then bind that as an input to another shader (using a shader resource view). TEX is then a pseudo-framebuffer.