Whole screen capture and render in DirectX [PERFORMANCE] - c++

I need some way to get the screen data and pass it to a DX9 surface/texture in my application, and render it at at least 25 fps at 1600*900 resolution; 30 would be better.
I tried BitBlt'ing, but even that alone leaves me at 20 fps, and after loading the data into a texture and rendering it I am at 11 fps, which is far behind what I need.
GetFrontBufferData is out of the question.
Here is something about using the Windows Media API, but I am not familiar with it. The sample saves the data straight into a file; maybe it can be set up to hand you individual frames, but I haven't found good enough documentation to try that on my own.
My code:
m_memDC.BitBlt(0, 0, m_Rect.Width(), m_Rect.Height(), // m_Rect is the area to be captured
               &m_dc, m_Rect.left, m_Rect.top, SRCCOPY);
// At 20-25 fps after this if I comment out the rest.
// The DC/HBITMAP setup and memory allocation are done once at the beginning.
GetDIBits(m_hDc, (HBITMAP)m_hBmp.GetSafeHandle(),
          0L,                     // start scan line
          (DWORD)m_Rect.Height(), // number of scan lines
          m_lpData,               // LPBYTE
          (LPBITMAPINFO)m_bi,     // address of BITMAPINFO
          (DWORD)DIB_RGB_COLORS); // use RGB for the color table
// At 17-20 fps.
IDirect3DSurface9* tmp;
m_pImageBuffer[0]->GetSurfaceLevel(0, &tmp); // m_pImageBuffer is a texture of the same
                                             // size as the bitmap, to prevent stretching
hr = D3DXLoadSurfaceFromMemory(tmp, NULL, NULL,
                               (LPVOID)m_lpData,
                               D3DFMT_X8R8G8B8,
                               m_Rect.Width() * 4,
                               NULL,
                               &r, // SetRect(&r, 0, 0, m_Rect.Width(), m_Rect.Height());
                               D3DX_DEFAULT, 0);
// At 12-14 fps.
IDirect3DSurface9* frameS;
hr = m_pFrameTexture->GetSurfaceLevel(0, &frameS); // the texture that is rendered
pd3dDevice->StretchRect(tmp, NULL, frameS, NULL, D3DTEXF_NONE);
// At 11 fps.
I found out that for a 512*512 square it runs at 30 fps (and at e.g. 490*450 at 20-25), so I tried dividing the screen into smaller tiles, but that didn't seem to work well.
If there is something missing in the code, please say so rather than just voting down. Thanks.

Starting with Windows 8, there is a new desktop duplication API that can be used to capture the screen in video memory, including mouse cursor changes and which parts of the screen actually changed or moved. This is far more performant than any of the GDI or D3D9 approaches out there and is really well-suited to doing things like encoding the desktop to a video stream, since you never have to pull the texture out of GPU memory. The new API is available by enumerating DXGI outputs and calling DuplicateOutput on the screen you want to capture. Then you can enter a loop that waits for the screen to update and acquires each frame in turn.
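Enumerating the outputs and entering the acquire loop looks roughly like the sketch below. This is a minimal sketch with most error handling omitted; it assumes an ID3D11Device already exists and always duplicates the first output.

#include <d3d11.h>
#include <dxgi1_2.h>

IDXGIOutputDuplication* CreateDuplication(ID3D11Device* device)
{
    IDXGIDevice* dxgiDevice = nullptr;
    device->QueryInterface(__uuidof(IDXGIDevice), (void**)&dxgiDevice);
    IDXGIAdapter* adapter = nullptr;
    dxgiDevice->GetAdapter(&adapter);
    dxgiDevice->Release();

    IDXGIOutput* output = nullptr;
    adapter->EnumOutputs(0, &output); // first monitor; pick the one you need
    adapter->Release();

    IDXGIOutput1* output1 = nullptr;
    output->QueryInterface(__uuidof(IDXGIOutput1), (void**)&output1);
    output->Release();

    IDXGIOutputDuplication* duplication = nullptr;
    output1->DuplicateOutput(device, &duplication);
    output1->Release();
    return duplication;
}

void CaptureLoop(IDXGIOutputDuplication* duplication)
{
    for (;;)
    {
        DXGI_OUTDUPL_FRAME_INFO info;
        IDXGIResource* resource = nullptr;
        HRESULT hr = duplication->AcquireNextFrame(500, &info, &resource);
        if (hr == DXGI_ERROR_WAIT_TIMEOUT)
            continue;   // nothing on screen changed within the timeout
        if (FAILED(hr))
            break;      // e.g. DXGI_ERROR_ACCESS_LOST: recreate the duplication

        ID3D11Texture2D* frame = nullptr;
        resource->QueryInterface(__uuidof(ID3D11Texture2D), (void**)&frame);
        // ... copy or encode 'frame' here; it never leaves GPU memory ...
        frame->Release();
        resource->Release();
        duplication->ReleaseFrame();
    }
}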
To encode the frames to a video, I'd recommend taking a look at Media Foundation. Take a look specifically at the Sink Writer for the simplest method of encoding the video frames. Basically, you just have to wrap the D3D textures you get for each video frame into IMFSample objects. These can be passed directly into the sink writer. See the MFCreateDXGISurfaceBuffer and MFCreateVideoSampleFromSurface functions for more information. For the best performance, typically you'll want to use a codec like H.264 that has good hardware encoding support (on most machines).
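The wrapping step is small. Below is a hedged sketch of packaging a captured ID3D11Texture2D as an IMFSample suitable for IMFSinkWriter::WriteSample; the function name and timestamp parameters are illustrative, and MFStartup is assumed to have been called.

#include <mfapi.h>
#include <mfreadwrite.h>

HRESULT WrapFrame(ID3D11Texture2D* texture, LONGLONG time100ns,
                  LONGLONG duration100ns, IMFSample** outSample)
{
    // Expose the GPU texture as a media buffer without copying it to system memory.
    IMFMediaBuffer* buffer = nullptr;
    HRESULT hr = MFCreateDXGISurfaceBuffer(__uuidof(ID3D11Texture2D), texture,
                                           0 /*subresource*/, FALSE, &buffer);
    if (FAILED(hr)) return hr;

    IMFSample* sample = nullptr;
    hr = MFCreateSample(&sample);
    if (SUCCEEDED(hr)) hr = sample->AddBuffer(buffer);
    if (SUCCEEDED(hr)) hr = sample->SetSampleTime(time100ns); // units of 100 ns
    if (SUCCEEDED(hr)) hr = sample->SetSampleDuration(duration100ns);
    buffer->Release();
    if (FAILED(hr)) { if (sample) sample->Release(); return hr; }

    *outSample = sample; // hand to IMFSinkWriter::WriteSample, then Release
    return S_OK;
}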
For full disclosure, I work on the team that owns the desktop duplication API at Microsoft, and I've personally written apps that capture the desktop (and video, games, etc.) to a video file at 60fps using this technique, as well as a lot of other scenarios. This is also used to do screen streaming, remote assistance, and lots more within Microsoft.

If you don't like the FrontBuffer, try the BackBuffer:
LPDIRECT3DSURFACE9 surface = nullptr;
pd3dDevice->GetBackBuffer(0, 0, D3DBACKBUFFER_TYPE_MONO, &surface);
To save it to a file, use
D3DXSaveSurfaceToFile(filename, D3DXIFF_JPG, surface, NULL, NULL);

Related

How to capture DirectX animations as videos

I am trying to record an animation that I created using DirectX 11 as a video, which I can present whenever needed (without re-rendering). I am still learning about DirectX and the Windows API.
This is what I've done so far: I was able to capture animation frames using DirectXTK, following this post. After that I use OpenCV to collect the frames from disk and create a video. Is there a way to merge these two steps? That way I'd be able to append frames to the video file right after capturing each image.
Code for animation capture:
static int Frame_Number;

void D3D::screenCapture() {
    // For each call to Present(), do the following:
    // Get the device.
    HRESULT gd = m_swapChain->GetDevice(__uuidof(ID3D11Device), (void**)&m_device);
    assert(gd == S_OK);
    // Get the immediate context.
    m_device->GetImmediateContext(&m_deviceContext);
    // Get a pointer to the back buffer.
    ID3D11Texture2D* backbufferTex;
    HRESULT gb = m_swapChain->GetBuffer(0, __uuidof(ID3D11Texture2D), (LPVOID*)&backbufferTex);
    assert(gb == S_OK);
    // Build the output path.
    std::wstringstream Image_Directory;
    Image_Directory << L"path to directory/screenShots" << Frame_Number << L".JPG";
    // Capture the frame. GUID_ContainerFormatJpeg is also available from <wincodec.h>,
    // and SaveWICTextureToFile comes from DirectXTK's ScreenGrab.h.
    static const GUID GUID_ContainerFormatJpeg =
        { 0x19e4a5aa, 0x5662, 0x4fc5, { 0xa0, 0xc0, 0x17, 0x58, 0x02, 0x8e, 0x10, 0x57 } };
    HRESULT hr = DirectX::SaveWICTextureToFile(m_deviceContext, backbufferTex,
                                               GUID_ContainerFormatJpeg,
                                               Image_Directory.str().c_str());
    assert(hr == S_OK);
    backbufferTex->Release(); // GetBuffer adds a reference; release it each frame
    Frame_Number = Frame_Number + 1;
}
I call this function after I present the rendered scene to the screen, and afterwards I use a Python script to create a video from the captured frames.
This is not optimal, especially when rendering many animations; it takes forever, and I would like to eliminate the reading from and writing to disk. Is there a way to get the frames that go through SaveWICTextureToFile and push them into a video sequentially?
How could one accomplish this?
I would really appreciate any help or pointers.
It's possible, but relatively hard; it takes many pages of code.
Here's a tutorial written by Microsoft. You're going to need to change the following there.
Integrate with Direct3D. To do that, call MFCreateDXGIDeviceManager, then IMFDXGIDeviceManager::ResetDevice, then pass that IMFDXGIDeviceManager interface in the MF_SINK_WRITER_D3D_MANAGER attribute when creating the sink writer (sketched below). Also, don't forget to set MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS to TRUE; you don't want software encoders, they are way too slow.
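Here is a minimal sketch of that setup, assuming the D3D11 device was created with the D3D11_CREATE_DEVICE_VIDEO_SUPPORT flag and MFStartup has been called; the function name is illustrative:

#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>

HRESULT CreateD3DAwareSinkWriter(ID3D11Device* device, const wchar_t* outputPath,
                                 IMFSinkWriter** writer)
{
    // Tie the Media Foundation pipeline to our D3D11 device.
    UINT resetToken = 0;
    IMFDXGIDeviceManager* manager = nullptr;
    HRESULT hr = MFCreateDXGIDeviceManager(&resetToken, &manager);
    if (FAILED(hr)) return hr;
    hr = manager->ResetDevice(device, resetToken);
    if (FAILED(hr)) { manager->Release(); return hr; }

    // Hand the manager to the sink writer and insist on hardware transforms.
    IMFAttributes* attrs = nullptr;
    MFCreateAttributes(&attrs, 2);
    attrs->SetUnknown(MF_SINK_WRITER_D3D_MANAGER, manager);
    attrs->SetUINT32(MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS, TRUE);

    hr = MFCreateSinkWriterFromURL(outputPath, nullptr, attrs, writer);
    attrs->Release();
    manager->Release();
    return hr;
}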
Replace the video codec: use MFVideoFormat_H264 instead of MFVideoFormat_WMV3, and a *.mp4 extension for the output file.
The sample code encodes video frames supplied in system memory. Instead, you should supply your video frames in VRAM. Every time your D3D app renders a frame, create a new D3D texture, copy your render target into that new texture with CopyResource, then call MFCreateDXGISurfaceBuffer; this creates an IMFMediaBuffer object referencing a video frame in VRAM (see the sketch below). You can then submit that video frame to the sink writer, and it should do the rest automagically.
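A hedged sketch of that per-frame copy, with illustrative names; the texture stays in default (GPU) usage, so no readback to system memory happens:

#include <d3d11.h>

ID3D11Texture2D* CopyRenderTarget(ID3D11Device* device, ID3D11DeviceContext* ctx,
                                  ID3D11Texture2D* renderTarget)
{
    D3D11_TEXTURE2D_DESC desc;
    renderTarget->GetDesc(&desc);
    desc.BindFlags = 0;               // plain texture, no RTV/SRV needed
    desc.MiscFlags = 0;
    desc.Usage = D3D11_USAGE_DEFAULT; // stays in VRAM
    desc.CPUAccessFlags = 0;

    ID3D11Texture2D* copy = nullptr;
    if (FAILED(device->CreateTexture2D(&desc, nullptr, &copy)))
        return nullptr;
    ctx->CopyResource(copy, renderTarget); // GPU-side copy
    return copy; // wrap with MFCreateDXGISurfaceBuffer and submit to the sink writer
}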
If you manage to implement that correctly, the Media Foundation framework will use a proprietary hardware transform to convert the RGBA textures into NV12 textures on the GPU, then use another proprietary hardware transform to encode the NV12 into H.264, download the encoded video samples from VRAM to system RAM as soon as they are ready, and append those encoded samples to the MPEG-4 container file on disk.
Both of the above transforms are implemented by the GPU, in hardware. All three GPU vendors have hardware for this, and they ship the Media Foundation transform DLLs that use it as part of their GPU drivers.

How to decode a video file straight to a Direct3D11 texture using Windows Media Foundation?

I'd like to decode the contents of a video file to a Direct3D11 texture and avoid the copies back and forth to CPU memory. Ideally, the library will play the audio itself and call back into my code whenever a video frame has been decoded.
On the surface, Windows Media Foundation's IMFPMediaPlayer (i.e. MFPCreateMediaPlayer() and IMFPMediaPlayer::CreateMediaItemFromURL()) seems like a good match, except that the player decodes straight to the app's HWND. The documentation implies that I can add a custom video sink, but I have not been able to find documentation or sample code on how to do that. Please point me in the right direction.
Currently, I am using libVLC to accomplish the above, but it only provides the video surface in CPU memory, which can become a bottleneck for my use-case.
Thanks.
Take a look at the source code from my project 'Stackoverflow': MFVideoEVR.
This program shows how to set up the EVR (enhanced video renderer) and how to provide video samples to it using a Source Reader.
The key is providing the video samples, so you can use them for your own purpose.
This program provides samples through IMFVideoSampleAllocator, which is for DirectX9 textures. You need to change the source code to use IMFVideoSampleAllocatorEx instead.
About MFCreateVideoSampleAllocatorEx:
This function creates an allocator for DXGI video surfaces. The buffers created by this allocator expose the IMFDXGIBuffer interface.
So, to retrieve the texture, use IMFDXGIBuffer::GetResource:
You can use this method to get a pointer to the ID3D11Texture2D interface of the surface. If the buffer is locked, the method returns MF_E_INVALIDREQUEST.
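In practice that lookup is a few lines; here is a hedged sketch (the function name is illustrative) that pulls the D3D11 texture out of a decoded IMFSample:

HRESULT GetTextureFromSample(IMFSample* sample, ID3D11Texture2D** texture)
{
    IMFMediaBuffer* buffer = nullptr;
    HRESULT hr = sample->GetBufferByIndex(0, &buffer);
    if (FAILED(hr)) return hr;

    IMFDXGIBuffer* dxgiBuffer = nullptr;
    hr = buffer->QueryInterface(__uuidof(IMFDXGIBuffer), (void**)&dxgiBuffer);
    if (SUCCEEDED(hr))
    {
        // Returns the underlying VRAM texture; no copy to system memory.
        hr = dxgiBuffer->GetResource(__uuidof(ID3D11Texture2D), (void**)texture);
        dxgiBuffer->Release();
    }
    buffer->Release();
    return hr;
}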
You will also have to manage sound through IMFSourceReader.
With this approach, there is no copy back to system memory.
PS: You don't mention the video format (H.265, H.264, MPEG-2, others?). Media Foundation doesn't handle every video format natively.

DXGI Desktop Duplication: encoding frames to send them over the network

I'm trying to write an app which will capture a video stream of the screen and send it to a remote client. I've found out that the best way to capture a screen on Windows is to use DXGI Desktop Duplication API (available since Windows 8). Microsoft provides a neat sample which streams duplicated frames to screen. Now, I've been wondering what is the easiest, but still relatively fast way to encode those frames and send them over the network.
The frames come from AcquireNextFrame with a surface that contains the desktop bitmap and metadata which contains dirty and move regions that were updated. From here, I have a couple of options:
Extract a bitmap from a DirectX surface and then use an external library like ffmpeg to encode series of bitmaps to H.264 and send it over RTSP. While straightforward, I fear that this method will be too slow as it isn't taking advantage of any native Windows methods. Converting D3D texture to a ffmpeg-compatible bitmap seems like unnecessary work.
From this answer: convert the D3D texture to an IMFSample and use Media Foundation's SinkWriter to encode the frame. I found this tutorial on video encoding, but I haven't yet found a way to immediately get each encoded frame and send it, instead of dumping all of them to a video file.
Since I haven't done anything like this before, I'm asking if I'm moving in the right direction. In the end, I want to have a simple, preferably low latency desktop capture video stream, which I can view from a remote device.
Also, I'm wondering if I can make use of dirty and move regions provided by Desktop Duplication. Instead of encoding the frame, I can send them over the network and do the processing on the client side, but this means that my client has to have DirectX 11.1 or higher available, which is impossible if I would want to stream to a mobile platform.
You can use the IMFTransform interface for H.264 encoding. Once you get an IMFSample from the ID3D11Texture2D, just pass it to IMFTransform::ProcessInput and get the encoded IMFSample from IMFTransform::ProcessOutput, as sketched below.
Refer to this example for the encoding details.
Once you have the encoded IMFSamples, you can send them one by one over the network.
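A hedged sketch of that input/output exchange, assuming 'encoder' is an H.264 IMFTransform already configured with input and output media types (the function name is illustrative):

HRESULT EncodeOne(IMFTransform* encoder, IMFSample* input, IMFSample** encoded)
{
    HRESULT hr = encoder->ProcessInput(0, input, 0);
    if (FAILED(hr)) return hr;

    // Allocate an output sample. A hardware MFT that reports
    // MFT_OUTPUT_STREAM_PROVIDES_SAMPLES allocates its own; pass pSample = nullptr then.
    MFT_OUTPUT_STREAM_INFO info = {};
    encoder->GetOutputStreamInfo(0, &info);
    IMFSample* outSample = nullptr;
    MFCreateSample(&outSample);
    IMFMediaBuffer* outBuffer = nullptr;
    MFCreateMemoryBuffer(info.cbSize, &outBuffer);
    outSample->AddBuffer(outBuffer);
    outBuffer->Release();

    MFT_OUTPUT_DATA_BUFFER output = {};
    output.pSample = outSample;
    DWORD status = 0;
    hr = encoder->ProcessOutput(0, 1, &output, &status);
    if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT)
    {
        outSample->Release(); // encoder is still buffering; feed the next frame
        *encoded = nullptr;
        return S_FALSE;
    }
    if (FAILED(hr)) { outSample->Release(); return hr; }
    *encoded = outSample; // one H.264 sample, ready to packetize and send
    return S_OK;
}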

Why do DirectX fullscreen applications give black screenshots?

You may know that trying to capture DirectX fullscreen applications the GDI way (using BitBlt()) gives a black screenshot.
My question is rather simple but I couldn't find any answer: why? I mean technically, why does it give a black screenshot?
I'm reading a DirectX tutorial here: http://www.directxtutorial.com/Lesson.aspx?lessonid=9-4-1. It's written:
[...] the function BeginScene() [...] does something called locking, where the buffer in the video RAM is 'locked', granting you exclusive access to this memory.
Is this the reason? VRAM is locked so GDI can't access it and it gives a black screenshot?
Or is there another reason? Like DirectX directly "talks" to the graphic card and GDI doesn't get it?
Thank you.
The reason is simple: performance.
The idea is to render the scene as much as possible on the GPU, out of lock-step with the CPU. You use the CPU to send the rendering buffers to the GPU (vertices, indices, shaders, etc.), which is overall really cheap because they're small, and then you do whatever else you want: physics, multiplayer sync, etc. The GPU can crunch the data and render it on its own.
If you require the scene to be drawn on the window, you have to interrupt the GPU, ask for the rendering buffer bytes (LockRect), ask for the graphics object for the window (more interference with the GPU), render it, and free every lock. You just lost any gain you had from rendering on the GPU out of sync with the CPU. Even worse when you think of all the CPU cores just sitting idle because you're busy "rendering" (more like waiting on buffer transfers).
So what graphics drivers do is paint the rendering area with a magic color and tell the GPU the position of the scene; the GPU then takes care of overlaying the scene over the displayed screen based on the magic-color pixels (sort of a multi-pass pixel shader that samples from the second texture where the first texture has a certain color at x,y, but not that slow). You get completely out-of-sync rendering, but when you ask the OS for its video memory, you get the magic color where the scene is, because that is what it actually uses.
Reference: http://en.wikipedia.org/wiki/Hardware_overlay
I believe it is actually due to double buffering. I'm not 100% sure, but that was the case when I tested screenshots in OpenGL: I noticed that the DC on my window was not always the same; one game used two different DCs. For other games I wasn't sure what was going on. The DC was the same, but SwapBuffers was called so many times that I don't think GDI was even fast enough to screenshot it; sometimes I would get half a screenshot and half black.
However, when I hooked into the client, I was able to just ask for the pixels like normal, no GDI or anything. I think there is a reason why we don't use GDI when drawing in games that use DirectX or OpenGL.
You can always look at the various ways to capture the screen here: http://www.codeproject.com/Articles/5051/Various-methods-for-capturing-the-screen
Anyway, I use the following for grabbing data from DirectX:
HRESULT DXI_Capture(IDirect3DDevice9* Device, const char* FilePath)
{
    IDirect3DSurface9* RenderTarget = nullptr;
    HRESULT result = Device->GetBackBuffer(0, 0, D3DBACKBUFFER_TYPE_MONO, &RenderTarget);
    if (SUCCEEDED(result))
        result = D3DXSaveSurfaceToFile(FilePath, D3DXIFF_PNG, RenderTarget, nullptr, nullptr);
    SafeRelease(RenderTarget);
    return result;
}
Then, in my hooked EndScene, I call it like so:
HRESULT Direct3DDevice9Proxy::EndScene()
{
    DXI_Capture(ptr_Direct3DDevice9, "C:/Users/School/Desktop/Screenshot.png");
    return ptr_Direct3DDevice9->EndScene();
}
You can either use Microsoft Detours for hooking the EndScene of some external application, or you can use a wrapper DLL; a Detours sketch follows.
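Here is a hedged sketch of the Detours route; the vtable-address acquisition is left out and the names are illustrative:

#include <windows.h>
#include <detours.h>
#include <d3d9.h>

typedef HRESULT (WINAPI* EndScene_t)(IDirect3DDevice9*);
static EndScene_t TrueEndScene = nullptr;

HRESULT WINAPI HookedEndScene(IDirect3DDevice9* device)
{
    DXI_Capture(device, "C:/Users/School/Desktop/Screenshot.png"); // grab the frame
    return TrueEndScene(device);                                   // then let D3D continue
}

void InstallHook(void* endSceneAddress) // address read from the device's vtable
{
    TrueEndScene = (EndScene_t)endSceneAddress;
    DetourTransactionBegin();
    DetourUpdateThread(GetCurrentThread());
    DetourAttach(&(PVOID&)TrueEndScene, HookedEndScene);
    DetourTransactionCommit();
}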

Combining Direct3D, Axis to make multiple IP camera GUI

Right now, what I'm trying to do is make a new GUI, essentially software using DirectX (more exactly, Direct3D), that displays streaming images from Axis IP cameras.
For the time being I figured the flow of the entire program would be like this:
1. Get the Axis program to fetch the streaming images.
2. Pass the images to the Direct3D program.
3. Display them on the screen.
Currently I have made a somewhat basic Direct3D app that loads and displays video frames from AVI videos (for testing). I don't know how to load images directly from videos using DirectX, so I used OpenCV to save frames from the video and had DX upload them. Very slow.
Right now I have some unclear things:
1. How to get an Axis program that works in C++ (I'm going to look up examples later; probably no big deal).
2. How to upload images directly from the Axis IP camera program.
So, do you have any recommendations or suggestions on how to make my program work more efficiently? Anything, just let me know.
Well, you may find it faster to use DirectShow and add a custom renderer at the far end that copies the decompressed video data directly into a Direct3D texture.
It's well worth double buffering that texture: have texture 0 displaying and texture 1 being uploaded to, then swap the two over when a new frame is available (i.e. display texture 1 while uploading to texture 0).
This way you decouple the video frame rate from the rendering frame rate, which makes dropped frames a little easier to handle; a sketch of the scheme follows.
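A minimal sketch of that double-buffering scheme, assuming 32-bit pixels and textures created with D3DUSAGE_DYNAMIC (all names illustrative):

IDirect3DTexture9* textures[2];
int displayIndex = 0; // the texture the render loop is currently drawing

void OnNewVideoFrame(const BYTE* pixels, int srcPitch, int width, int height)
{
    int uploadIndex = 1 - displayIndex; // upload to the hidden texture
    D3DLOCKED_RECT lr;
    if (SUCCEEDED(textures[uploadIndex]->LockRect(0, &lr, nullptr, D3DLOCK_DISCARD)))
    {
        for (int y = 0; y < height; ++y) // pitches may differ, so copy row by row
            memcpy((BYTE*)lr.pBits + y * lr.Pitch, pixels + y * srcPitch, width * 4);
        textures[uploadIndex]->UnlockRect(0);
        displayIndex = uploadIndex; // swap: the fresh frame is displayed next
    }
}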
I use in-place updates of Direct3D textures (using IDirect3DTexture9::LockRect) and it works very fast. Which part of your program is slow?
To capture images from Axis cameras you can use the iPSi C++ library: http://sourceforge.net/projects/ipsi/
It can be used for capturing images and for controlling camera zoom and rotation (if available).