I've just implemented deferred rendering/shading for the first time, and I was surprised to see the big performance gap between forward and deferred rendering.
When I run my application with forward rendering I get a pretty decent frame rate while running in Release mode:
FORWARD RENDERING
However, when I run it with deferred rendering it gives me a rather surprising output:
DEFERRED RENDERING
I'm well aware that deferred rendering is NOT something you coat an application with to make it go "faster". I consider it a technique that can be optimized in numerous ways, and I understand that it has a larger memory footprint than forward rendering.
However...
I've currently got ONE point light in the scene and one hundred cubes created with hardware instancing. The light is moving back and forth on the Z-axis casting light on the cubes.
The problem is that the light is very laggy when moving. It's so laggy that the application doesn't register the keyboard input. Honestly, I was not prepared for this, and I assume that I'm doing something terribly bad in my implementation.
So far I've changed the texture format on the G-buffers from DXGI_FORMAT_R32G32B32A32_FLOAT to DXGI_FORMAT_R16G16B16A16_FLOAT just to see if it had any visual impact, but it did not.
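For reference, each G-buffer target is created roughly along these lines (a simplified sketch, not my exact code; mDevice, screenWidth and screenHeight are placeholder names):

// One G-buffer target: bound as a render target during the geometry pass
// and as a shader resource during the lighting pass.
D3D11_TEXTURE2D_DESC texDesc = {};
texDesc.Width            = screenWidth;
texDesc.Height           = screenHeight;
texDesc.MipLevels        = 1;
texDesc.ArraySize        = 1;
texDesc.Format           = DXGI_FORMAT_R16G16B16A16_FLOAT; // was DXGI_FORMAT_R32G32B32A32_FLOAT
texDesc.SampleDesc.Count = 1;
texDesc.Usage            = D3D11_USAGE_DEFAULT;
texDesc.BindFlags        = D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE;

ID3D11Texture2D*          gBufferTex = nullptr;
ID3D11RenderTargetView*   gBufferRTV = nullptr;
ID3D11ShaderResourceView* gBufferSRV = nullptr;

mDevice->CreateTexture2D(&texDesc, nullptr, &gBufferTex);
mDevice->CreateRenderTargetView(gBufferTex, nullptr, &gBufferRTV);
mDevice->CreateShaderResourceView(gBufferTex, nullptr, &gBufferSRV);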
Any suggestions? Thank you!
SIDE NOTE
I'm using the Visual Studio Graphics Diagnostics for debugging my DirectX applications
Solved it!
I enabled the DirectX debug layer and I got a sea of D3D11 warnings
D3D11 WARNING: ID3D11DeviceContext::OMSetRenderTargets: Resource being set to OM RenderTarget slot 1 is still bound on input! [ STATE_SETTING WARNING #9: DEVICE_OMSETRENDERTARGETS_HAZARD]
D3D11 WARNING: ID3D11DeviceContext::OMSetRenderTargets[AndUnorderedAccessViews]: Forcing PS shader resource slot 1 to NULL. [ STATE_SETTING WARNING #7: DEVICE_PSSETSHADERRESOURCES_HAZARD]
D3D11 WARNING: ID3D11DeviceContext::OMSetRenderTargets: Resource being set to OM RenderTarget slot 2 is still bound on input! [ STATE_SETTING WARNING #9: DEVICE_OMSETRENDERTARGETS_HAZARD]
D3D11 WARNING: ID3D11DeviceContext::OMSetRenderTargets[AndUnorderedAccessViews]: Forcing PS shader resource slot 2 to NULL. [ STATE_SETTING WARNING #7: DEVICE_PSSETSHADERRESOURCES_HAZARD]
D3D11 WARNING: ID3D11DeviceContext::OMSetRenderTargets: Resource being set to OM RenderTarget slot 3 is still bound on input! [ STATE_SETTING WARNING #9: DEVICE_OMSETRENDERTARGETS_HAZARD]
D3D11 WARNING: ID3D11DeviceContext::OMSetRenderTargets[AndUnorderedAccessViews]: Forcing PS shader resource slot 3 to NULL. [ STATE_SETTING WARNING #7: DEVICE_PSSETSHADERRESOURCES_HAZARD]
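For anyone wondering, the debug layer that emits these warnings is enabled at device creation; a minimal sketch, assuming D3D11CreateDeviceAndSwapChain is used and swapChainDesc/mSwapChain/mDevice/mDeviceContext are just illustrative names:

// Enable the D3D11 debug layer in debug builds so hazard warnings like the
// ones above show up in the output window.
UINT createFlags = 0;
#if defined(_DEBUG)
createFlags |= D3D11_CREATE_DEVICE_DEBUG;
#endif

D3D11CreateDeviceAndSwapChain(
    nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr,
    createFlags, nullptr, 0, D3D11_SDK_VERSION,
    &swapChainDesc, &mSwapChain, &mDevice, nullptr, &mDeviceContext);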
So at the end of the render function, i.e. after
mSwapChain->Present( 0, 0 );
I added
mDeviceContext->ClearState();
just to reset the device context, which sets all input/output resource slots to NULL, as explained on MSDN.
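An alternative to a full ClearState() would be to explicitly unbind the G-buffer SRVs before rebinding those textures as render targets for the next frame's geometry pass; a rough sketch (slot count and member names are illustrative):

// Unbind the G-buffer SRVs from the pixel shader stage so the textures are no
// longer bound on input when they are rebound as render targets.
ID3D11ShaderResourceView* nullSRVs[4] = { nullptr, nullptr, nullptr, nullptr };
mDeviceContext->PSSetShaderResources(0, 4, nullSRVs);

// Now OMSetRenderTargets won't hit the read/write hazard and force slots to NULL.
mDeviceContext->OMSetRenderTargets(4, mGBufferRTVs, mDepthStencilView);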
Related
Rendering behaves very strangely: over time the fps starts to fall very hard, by up to 70%. I've tried reducing the number of objects to render and simplifying the shaders (1-3 operations), but it didn't solve the problem. When profiling the CPU in Visual Studio I see that IDXGISwapChain::Present takes more and more time, even though the scene is static and nothing in it changes.
To give you an example, it goes like this:
Run the application: 60 fps
Wait 30 seconds: 50 fps
Another 30 seconds: 30 fps
Minimize the application and wait 30 seconds (I add a delay such as Sleep(100) while the application is inactive)
Return to the application: 60 fps
Wait 30 seconds: 50 fps
30 more seconds: 30 fps
I have simplified the shaders to just a couple of operations each (1-3).
This is happening on different PCs. It also doesn't depend on the complexity of the scene; the initial fps will just be higher. I tried everything I found on Stack Overflow, but nothing solved the problem.
I only get one warning in debug mode, but I don't think it's the cause of it all:
D3D11 WARNING: ID3D11DeviceContext::DrawIndexed: The Pixel Shader expects a Render Target View bound to slot 0, but none is bound. This is OK, as writes of an unbound Render Target View are discarded. It is also possible the developer knows the data will not be used anyway. This is only a problem if the developer actually intended to bind a Render Target View here. [ EXECUTION WARNING #3146081: DEVICE_DRAW_RENDERTARGETVIEW_NOT_SET]
I am just learning "Introduction to 3D Game Programming with DirectX 12". When running the example code from "Initialize Direct3D" (chapter 4) and trying to use 4x MSAA, something goes wrong, as shown in the figure below. Please help me.
wrong figure
In keeping with the DirectX 12 design philosophy of "no magic runtime behavior", you aren't allowed to create a backbuffer as MSAA. This is because the video output hardware can't actually present MSAA backbuffers, so they have to be resolved to a single pixel each at some point in the pipeline. In DirectX 11, this was done 'behind the scenes' when you created an MSAA backbuffer. In DirectX 12, you are responsible for creating the MSAA render target texture yourself and then performing the resolve to the backbuffer -or- running some other postprocess that does the resolve. The setup is exactly the same, just more verbose and explicit and under application control instead of being 'magic'.
See the SimpleMSAA12 sample.
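The resolve step itself is just a pair of barriers around ResolveSubresource; a rough sketch, assuming the d3dx12.h helpers, an R8G8B8A8_UNORM backbuffer, and placeholder names msaaTarget, backBuffer, and commandList:

// Transition the MSAA render target to RESOLVE_SOURCE and the backbuffer to
// RESOLVE_DEST, resolve, then return the backbuffer to the PRESENT state.
D3D12_RESOURCE_BARRIER barriers[2] =
{
    CD3DX12_RESOURCE_BARRIER::Transition(msaaTarget,
        D3D12_RESOURCE_STATE_RENDER_TARGET, D3D12_RESOURCE_STATE_RESOLVE_SOURCE),
    CD3DX12_RESOURCE_BARRIER::Transition(backBuffer,
        D3D12_RESOURCE_STATE_PRESENT, D3D12_RESOURCE_STATE_RESOLVE_DEST)
};
commandList->ResourceBarrier(2, barriers);

commandList->ResolveSubresource(backBuffer, 0, msaaTarget, 0, DXGI_FORMAT_R8G8B8A8_UNORM);

const D3D12_RESOURCE_BARRIER toPresent = CD3DX12_RESOURCE_BARRIER::Transition(backBuffer,
    D3D12_RESOURCE_STATE_RESOLVE_DEST, D3D12_RESOURCE_STATE_PRESENT);
commandList->ResourceBarrier(1, &toPresent);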
With DirectX 12 you also aren't allowed to create sRGB format backbuffers. You can create sRGB render target views that will perform the gamma while writing to the backbuffer. There were some bugs in the older debug layers and Windows 10 runtime when doing direct resolves of sRGB MSAA render targets to non-sRGB backbuffers. These are also noted in the sample above.
Note that UWP apps have exactly the same behavior as DirectX 12 for Win32 desktop apps. This is because both UWP apps and DirectX 12 are required to use the DXGI_SWAP_EFFECT_FLIP_* swap effects instead of the legacy DXGI_SWAP_EFFECT_* swap modes.
BTW, if you had enabled DXGI debugging, you'd have gotten some specific debug diagnostic output when you tried to create the 4x MSAA backbuffer:
DXGI ERROR: IDXGIFactory::CreateSwapChain: Flip model swapchains (DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL and DXGI_SWAP_EFFECT_FLIP_DISCARD) do not support multisampling. [ MISCELLANEOUS ERROR #102: ]
Take a look at this blog post for details on enabling DXGI debugging in your project, or take a look at the implementation of DeviceResources.
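For reference, a minimal sketch of turning on the DXGI info queue in a debug build (requires dxgi1_3.h/dxgidebug.h and linking dxguid.lib; the helper name is just illustrative):

#include <dxgi1_3.h>
#include <dxgidebug.h>
#include <wrl/client.h>

void EnableDxgiDebug()
{
#if defined(_DEBUG)
    // Break into the debugger whenever DXGI reports an error or corruption message,
    // such as the flip-model/multisampling error above.
    Microsoft::WRL::ComPtr<IDXGIInfoQueue> dxgiInfoQueue;
    if (SUCCEEDED(DXGIGetDebugInterface1(0, IID_PPV_ARGS(dxgiInfoQueue.GetAddressOf()))))
    {
        dxgiInfoQueue->SetBreakOnSeverity(DXGI_DEBUG_ALL, DXGI_INFO_QUEUE_MESSAGE_SEVERITY_ERROR, TRUE);
        dxgiInfoQueue->SetBreakOnSeverity(DXGI_DEBUG_ALL, DXGI_INFO_QUEUE_MESSAGE_SEVERITY_CORRUPTION, TRUE);
    }
#endif
}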
For my university course I have been given a number of example base applications that show different features and things that can be achieved using DirectX 11. On the university computers and some others these all run fine. However, on my laptop (not the most powerful of machines), some applications work fine while others run but show only a blank window with just the screen background colour. Looking in the output window, there are a large number of D3D11 errors stating that a vertex shader is not bound, hence nothing is visible on screen.
Two of those errors that appear in the debug window (they don't appear in the error list) are:
D3D11 ERROR: ID3D11DeviceContext::DrawAuto: A Vertex Shader is always required when drawing, but none is currently bound. [ EXECUTION ERROR #341: DEVICE_DRAW_VERTEX_SHADER_NOT_SET]
D3D11 ERROR: ID3D11DeviceContext::DrawAuto: Rasterization Unit is enabled (PixelShader is not NULL or Depth/Stencil test is enabled and RasterizedStream is not D3D11_SO_NO_RASTERIZED_STREAM) but position is not provided by the last shader before the Rasterization Unit. [ EXECUTION ERROR #362: DEVICE_DRAW_POSITION_NOT_PRESENT]
These programs do, however, run correctly on other machines. Is there some kind of configuration I need to do on my laptop, or is it likely due to its lower technical capabilities, even though some applications do work?
I've been trying to develop a video capture display application with DirectX 9 under Windows 7 using a vertex shader and a pixel shader (very basic ones). However, the image being displayed shows some tearing, always at the same location on the screen. The specs are the following:
Video is being captured via a webcam
Display is not in fullscreen mode
Refresh rate of screen is 60Hz
D3DPRESENT_INTERVAL_ONE is being used to force a good refresh rate (found on some forum; it doesn't work, though)
I tried all of the available values for this last parameter, only to realize that D3DPRESENT_INTERVAL_ONE gives me consistent tearing (always at the same position on screen).
I know that "enabling" V-sync could maybe solve my problem, but I can't seem to find any info about this on the web (yes, I know, DirectX 9 is getting outdated), so any help would be much appreciated!
Use D3DPRESENT_INTERVAL_DEFAULT if it doesn't give you tearing.
This flag also enables V-sync. From documentation:
D3DPRESENT_INTERVAL_DEFAULT uses the default system timer resolution whereas the D3DPRESENT_INTERVAL_ONE calls timeBeginPeriod to enhance system timer resolution. This improves the quality of vertical sync, but consumes slightly more processing time. Both parameters attempt to synchronize vertically.
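Concretely, it's the PresentationInterval member of the present parameters you pass at device creation; a minimal sketch with the other members omitted (d3d9, hWnd and device are placeholders):

// Ask Direct3D 9 to synchronize Present() with the vertical retrace.
D3DPRESENT_PARAMETERS pp = {};
pp.Windowed             = TRUE;
pp.SwapEffect           = D3DSWAPEFFECT_DISCARD;
pp.BackBufferFormat     = D3DFMT_UNKNOWN;              // current desktop format in windowed mode
pp.PresentationInterval = D3DPRESENT_INTERVAL_DEFAULT; // v-synced presentation

// d3d9->CreateDevice(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, hWnd,
//                    D3DCREATE_HARDWARE_VERTEXPROCESSING, &pp, &device);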
I am working on an OpenGL-based 2D CAD software which requires heavy use of a hardware OpenGL accelerator (pushing 250 million vertices per second at times). Here is my problem: whenever the viewport is stagnant for more than 10 seconds, the OpenGL accelerator (GeForce 9800 GT in this case) goes into an inactive mode. When the viewport is rendered again after the inactive period, I get 1/4th the normal framerate, and this lasts for 3-4 seconds before the 3D accelerator wakes up and kicks back into full speed.
Question :
How do I prevent this from happening ?
Is there an OpenGL way to prevent GPUs from going into inactive mode?
Thank you for your replies.
Gary
There are several ways you can keep a GPU busy, but the most sure-fire way to guarantee it is doing something, and not just deferring your commands, is to actually draw something. glClear() and every glDraw* command constitute actual drawing commands. Throw in a glFinish() at the end of the draw to guarantee execution of the GL command stream.
Presumably you don't want to see this drawing, so create a new framebuffer object, create a small RGBA texture (say 256 on a side), then attach the texture to color attachment point 0.
When you want to keep the GPU busy, draw to this offscreen buffer.
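A minimal sketch of that setup, assuming a context with framebuffer object support (all names are illustrative):

// One-time setup: a small 256x256 RGBA texture attached to an FBO as color attachment 0.
GLuint keepAliveFbo = 0, keepAliveTex = 0;

void InitKeepAliveTarget()
{
    glGenTextures(1, &keepAliveTex);
    glBindTexture(GL_TEXTURE_2D, keepAliveTex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 256, 256, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);

    glGenFramebuffers(1, &keepAliveFbo);
    glBindFramebuffer(GL_FRAMEBUFFER, keepAliveFbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, keepAliveTex, 0);
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
}

// Call this periodically while the viewport is idle to keep the GPU awake.
void KeepGpuBusy()
{
    glBindFramebuffer(GL_FRAMEBUFFER, keepAliveFbo);
    glViewport(0, 0, 256, 256);
    glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
    glClear(GL_COLOR_BUFFER_BIT);         // an actual drawing command
    // ...a trivial glDraw* call could go here as well...
    glFinish();                            // guarantee execution of the GL command stream
    glBindFramebuffer(GL_FRAMEBUFFER, 0);  // restore the default framebuffer
}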
This is all with the assumption that you can't, for instance, just change your boot-args or control panel settings to modulate power management behavior on the card. Every OS has different semantics here.