Hardware-accelerated scaling MFT in Windows 7 - C++

I am looking for hardware-accelerated (GPU-based) video scaling. I found an extensive discussion in the following threads:
How to use hardware video scalers? and
Hardware Accelerated Image Scaling in windows using C++
I would like to stick with MFT-based scaling because I am also using the H.264 Encoder MFT in my application.
We have two options for an MFT-based solution:
1. Video Resizer DSP
2. Video Processor MFT
But both of these rely on MF_SA_D3D_AWARE. As mentioned below:
A video MFT has the attribute MF_SA_D3D_AWARE, which can be used to query whether it supports DirectX 3D hardware acceleration, and this can be enabled by sending it the MFT_MESSAGE_SET_D3D_MANAGER message.
And MF_SA_D3D_AWARE is only supported from Windows 8 onwards.
Is there any MFT for scaling that uses hardware acceleration on Windows 7?
I haven't investigated whether the other two options (MFCreateVideoRenderer and IDirectXVideoProcessor::VideoProcessBlt) mentioned in How to use hardware video scalers? are supported on Windows 7. For now I am focusing on the MFT option as a priority.

Under Windows 7, I would recommend using IDXVAHD_VideoProcessor.
There is a sample here: DXVA-HD Sample
But I think that if you use a plain Direct3D 9 device with a Direct3D 9 texture, the scaling result will be the same. There is no reason the dedicated scaling path would only apply to video file processing; I think it is the same for both games and video files.
The only thing I noticed is that you can set up the constriction mode, DXVAHD_BLT_STATE_CONSTRICTION_DATA, which applies to downscaling rather than upscaling.
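To make the recommendation concrete, here is a minimal sketch of scaling a frame with DXVA-HD. The device pointer, input/output surfaces and the 1080p-to-720p sizes are assumptions for illustration; a real application would also check formats against the reported capabilities.

```cpp
#include <d3d9.h>
#include <dxvahd.h>
#include <wrl/client.h>
#include <vector>
using Microsoft::WRL::ComPtr;

// Scale a 1920x1080 frame to 1280x720 with IDXVAHD_VideoProcessor.
HRESULT ScaleWithDxvaHd(IDirect3DDevice9Ex* pD3D9Device,
                        IDirect3DSurface9* pSourceSurface,
                        IDirect3DSurface9* pTargetSurface)
{
    DXVAHD_CONTENT_DESC desc = {};
    desc.InputFrameFormat = DXVAHD_FRAME_FORMAT_PROGRESSIVE;
    desc.InputFrameRate   = { 30, 1 };
    desc.InputWidth       = 1920;
    desc.InputHeight      = 1080;
    desc.OutputFrameRate  = { 30, 1 };
    desc.OutputWidth      = 1280;   // target (scaled) size
    desc.OutputHeight     = 720;

    ComPtr<IDXVAHD_Device> device;
    HRESULT hr = DXVAHD_CreateDevice(pD3D9Device, &desc,
                                     DXVAHD_DEVICE_USAGE_PLAYBACK_NORMAL,
                                     nullptr, &device);
    if (FAILED(hr)) return hr;

    DXVAHD_VPDEVCAPS devCaps = {};
    hr = device->GetVideoProcessorDeviceCaps(&devCaps);
    if (FAILED(hr)) return hr;
    std::vector<DXVAHD_VPCAPS> vpCaps(devCaps.VideoProcessorCount);
    device->GetVideoProcessorCaps(devCaps.VideoProcessorCount, vpCaps.data());

    ComPtr<IDXVAHD_VideoProcessor> vp;
    hr = device->CreateVideoProcessor(&vpCaps[0].VPGuid, &vp);
    if (FAILED(hr)) return hr;

    // The stream's destination rectangle is what performs the scaling.
    DXVAHD_STREAM_STATE_DESTINATION_RECT_DATA dst = { TRUE, { 0, 0, 1280, 720 } };
    vp->SetVideoProcessStreamState(0, DXVAHD_STREAM_STATE_DESTINATION_RECT,
                                   sizeof(dst), &dst);

    DXVAHD_STREAM_DATA stream = {};
    stream.Enable       = TRUE;
    stream.InputSurface = pSourceSurface;
    return vp->VideoProcessBltHD(pTargetSurface, 0, 1, &stream);
}
```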

Related

Media Foundation: Custom Topology with Direct3D 11

I am having to build a video topology manually, which includes loading and configuring the mpeg2videoextension (decoder); otherwise the default topology loader fails to resolve the video stream automatically. I am using the default topology loader to resolve the rest of the topology.
Since I am loading the decoder manually, the docs say that I am responsible for giving the decoder the hardware acceleration manager. (This decoder is D3D11 aware.) If I create a DXGI device and then create the manager in code, I can pass the manager to the decoder, and it seems to work.
The docs also say, however that "In a Media Session scenario, the video renderer creates the Direct3D 11 device."
If this is the case, how do I get a handle to that device? I assume I should be using that device in the device manager to pass into the decoder.
I'm going around in circles. All of the sample code uses IDirect3DDeviceManager9, and I am unable to get those samples to work. So I decided to use Direct3D 11, but I can't find any sample code that uses it.
Can someone point me in the right direction?
Microsoft does not give a good solution for this challenge. Indeed, the standard video renderer for Media Foundation is the EVR, and it is "aware" of Direct3D 9 only, so you cannot combine it with the decoder through a common DXGI device manager. Newer Microsoft applications use a different Direct3D 11 aware renderer which is not published as an API: you can take advantage of these rendering services as part of wrapping APIs such as the UWP or HTML5 media element playing video. The MPEG-2 decoder extension primarily targets these scenarios, leaving you with a problem if you are plugging it into older Media Foundation topologies.
I can think of a few solutions to this problem, none of which sounds exactly perfect:
Stop using the EVR and use DX11VideoRenderer instead: Microsoft gives a starting point with this sample, and you are on your own to establish the required wiring to share the DXGI device manager.
Use multiple Direct3D devices and transfer video frames between the two; there is graphics API interop to help make the transfer efficient, but overall this looks like clumsy work as of 2020, even though it is doable. This path is more or less acceptable if you can take the performance hit of transferring through system memory, which makes things a tad easier to implement.
Stop using the MPEG-2 decoder extension and implement your own hardware-assisted decoder on top of the lower-level DXVA2 API, without a fallback to software; in this case you have more control over using GPU services and fitting the renderer's device.
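For reference, the wiring between a manually created Direct3D 11 device and a D3D11-aware decoder looks roughly like the sketch below. `pDecoder` stands for the already-loaded decoder transform; this is essentially what the topology loader would otherwise do for you.

```cpp
#include <d3d10.h>
#include <d3d11.h>
#include <mfapi.h>
#include <mftransform.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Create a D3D11 device with video support and hand it to a D3D11-aware MFT.
HRESULT AttachDeviceManager(IMFTransform* pDecoder,
                            ComPtr<IMFDXGIDeviceManager>& deviceManager)
{
    ComPtr<ID3D11Device> device;
    HRESULT hr = D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr,
                                   D3D11_CREATE_DEVICE_VIDEO_SUPPORT,
                                   nullptr, 0, D3D11_SDK_VERSION,
                                   &device, nullptr, nullptr);
    if (FAILED(hr)) return hr;

    // The decoder uses the device from its own threads.
    ComPtr<ID3D10Multithread> mt;
    if (SUCCEEDED(device.As(&mt)))
        mt->SetMultithreadProtected(TRUE);

    UINT token = 0;
    hr = MFCreateDXGIDeviceManager(&token, &deviceManager);
    if (FAILED(hr)) return hr;
    hr = deviceManager->ResetDevice(device.Get(), token);
    if (FAILED(hr)) return hr;

    // Hand the shared device manager to the D3D11-aware decoder.
    return pDecoder->ProcessMessage(MFT_MESSAGE_SET_D3D_MANAGER,
                                    reinterpret_cast<ULONG_PTR>(deviceManager.Get()));
}
```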

UWP Hardware Video Decoding - DirectX12 vs Media Foundation

I would like to use DirectX 12 to load each frame of an H.264 file into a texture and render it. There is, however, little to no information on doing this, and the Microsoft website has only limited, superficial documentation.
Media Foundation has plenty of examples and offers hardware-enabled decoding. Is Media Foundation a wrapper around DirectX, or is it doing something else?
If not, how much less optimized would the Media Foundation equivalent be in comparison to a DX 12 approach?
Essentially, what are the big differences between Media Foundation and DirectX12 Video Decoding?
I am already using DirectX 12 in my engine so this is specifically regarding DX12.
Thanks in advance.
Hardware video decoding comes from the DXVA (DXVA2) API. Its DirectX 11 evolution is the Direct3D 11 video device, part of the D3D11 API. Microsoft provides wrappers over the hardware-accelerated decoders in the form of Media Foundation primitives, such as the H.264 Video Decoder. This decoder offers hardware decoding capabilities as well as a fallback to software decoding.
Note that even though Media Foundation is available for UWP development, your options are limited and you are not offered primitives like the mentioned transform directly. However, if you use higher-level APIs (the Media Foundation Source Reader API in particular), you can leverage hardware-accelerated video decoding in your UWP application.
The Media Foundation implementation offers interoperability with Direct3D 11, in particular for video encoding/decoding, but not with Direct3D 12. You will not be able to use Media Foundation and DirectX 12 together out of the box. You will either have to implement Direct3D 11/12 interop to transfer the data between the APIs or, where applicable, use shared access to the same GPU data.
Alternatively, you can step down to the underlying ID3D12VideoDevice::CreateVideoDecoder, which is a further evolution of the mentioned DXVA2 and Direct3D 11 video decoding APIs, with similar usage.
Unfortunately, while Media Foundation is notorious for poor documentation and a hard start to development, Direct3D 12 video decoding has zero information, and you will have to enjoy the feeling of being a pioneer.
Either way, all of the mentioned options are relatively thin wrappers over the hardware-assisted video decoding implementation, with the same great performance. I would recommend taking the Media Foundation path and implementing 11/12 interop if and when it becomes necessary.
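As a sketch of the recommended Source Reader path, the reader can be pointed at a shared D3D11 device manager so that the hardware decoder is inserted automatically. The file name, frame format and the externally created device manager are assumptions for illustration.

```cpp
#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Create a Source Reader configured for hardware (D3D11) video decoding.
HRESULT CreateHardwareReader(IMFDXGIDeviceManager* deviceManager,
                             ComPtr<IMFSourceReader>& reader)
{
    ComPtr<IMFAttributes> attrs;
    HRESULT hr = MFCreateAttributes(&attrs, 3);
    if (FAILED(hr)) return hr;
    attrs->SetUnknown(MF_SOURCE_READER_D3D_MANAGER, deviceManager);
    attrs->SetUINT32(MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS, TRUE);
    attrs->SetUINT32(MF_SOURCE_READER_ENABLE_ADVANCED_VIDEO_PROCESSING, TRUE);

    hr = MFCreateSourceReaderFromURL(L"video.mp4", attrs.Get(), &reader);
    if (FAILED(hr)) return hr;

    // Ask for NV12 output; the reader picks a hardware decoder where available.
    ComPtr<IMFMediaType> type;
    MFCreateMediaType(&type);
    type->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
    type->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_NV12);
    return reader->SetCurrentMediaType(
        (DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, nullptr, type.Get());
}
```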
You will get a lot of D3D12 errors caused by Media Foundation if you pass a D3D12 device to IMFDXGIDeviceManager::ResetDevice.
The errors can be avoided if you call IMFSourceReader::ReadSample slowly. It doesn't matter whether you use this method in sync or async mode; how slow it has to be depends on the machine running the program. I use ::Sleep(1) between ReadSample calls in sync mode when playing a stream from the network, and ::Sleep(3) in sync mode when playing a local MP4 file on my machine.
Don't ask who I am. My name is 'the pioneer'.
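The throttled synchronous loop described above can be sketched as follows; note this is a workaround rather than a documented fix, and the Sleep value is machine-dependent per the answer.

```cpp
#include <windows.h>
#include <mfreadwrite.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Synchronous ReadSample loop throttled with a short Sleep between calls.
void ReadLoop(IMFSourceReader* reader)
{
    for (;;) {
        DWORD streamIndex = 0, flags = 0;
        LONGLONG timestamp = 0;
        ComPtr<IMFSample> sample;
        HRESULT hr = reader->ReadSample(
            (DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM,
            0, &streamIndex, &flags, &timestamp, &sample);
        if (FAILED(hr) || (flags & MF_SOURCE_READERF_ENDOFSTREAM))
            break;
        if (sample) {
            // ... hand the decoded sample to the renderer ...
        }
        ::Sleep(1); // ~1 ms for a network stream, ~3 ms for a local file here
    }
}
```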

Encoding RGB to H.264

What I am trying to do is record the screen on Windows XP and Windows 7. I obtained the bitmap using DirectX's CreateOffscreenPlainSurface and GetFrontBufferData. I need to encode the bitmaps into H.264 video. The problem is that the captured bitmap is in the D3DFMT_A8R8G8B8 format, but the H.264 Video Encoder only supports MFVideoFormat_I420, MFVideoFormat_IYUV, MFVideoFormat_NV12, MFVideoFormat_YUY2 and MFVideoFormat_YV12 as input. My question is: do I need to convert the format myself (I would rather not)? Are there any better solutions for this?
The input format corresponds to MFVideoFormat_ARGB32.
The stock OS component that handles the conversion is the Video Processor MFT. I don't see availability information in the footer of the MSDN article; however, I am under the impression that this MFT ships with Windows Vista onward, just like the whole Media Foundation API.
On Windows XP there was a similar Color Converter DSP, which offers very similar services and exposes a very similar DirectX Media Object (DMO) interface. It is available in all more recent operating systems; however, it is software-only and never leverages the GPU for the conversion.
Either of these can handle the requested format conversion for you.
Also, for reference, the H.264 Video Encoder was only introduced in Windows 7.
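To illustrate, here is a sketch that configures the Color Converter DSP (usable through IMFTransform, and the option that covers the older systems in the question) for an ARGB32-to-NV12 conversion. The frame size is a placeholder, and a real application would check each HRESULT and may also need to set attributes such as the interlace mode.

```cpp
#include <mfapi.h>
#include <mftransform.h>
#include <wmcodecdsp.h>   // CLSID_CColorConvertDMO
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Configure the Color Converter DSP (as an IMFTransform) for ARGB32 -> NV12.
HRESULT CreateRgbToNv12Converter(UINT32 width, UINT32 height,
                                 ComPtr<IMFTransform>& converter)
{
    HRESULT hr = CoCreateInstance(CLSID_CColorConvertDMO, nullptr,
                                  CLSCTX_INPROC_SERVER,
                                  IID_PPV_ARGS(&converter));
    if (FAILED(hr)) return hr;

    ComPtr<IMFMediaType> inType;
    MFCreateMediaType(&inType);
    inType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
    inType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_ARGB32);
    MFSetAttributeSize(inType.Get(), MF_MT_FRAME_SIZE, width, height);
    hr = converter->SetInputType(0, inType.Get(), 0);
    if (FAILED(hr)) return hr;

    ComPtr<IMFMediaType> outType;
    MFCreateMediaType(&outType);
    outType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
    outType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_NV12);
    MFSetAttributeSize(outType.Get(), MF_MT_FRAME_SIZE, width, height);
    return converter->SetOutputType(0, outType.Get(), 0);
}
// Per captured frame: call ProcessInput with the ARGB32 sample, ProcessOutput
// for the NV12 sample, then feed that to the H.264 Video Encoder MFT.
```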

How to use Intel Hardware MJPEG Decoder MFT in MediaFoundation SourceReader for Window Desktop application?

I'm developing a USB camera streaming desktop application using the Media Foundation Source Reader technique. The camera has USB 3.0 support and delivers 60 fps at 1080p resolution in the MJPG video format.
I used the software MJPEG Decoder MFT to convert MJPG to YUY2 frames, and then converted those to RGB32 frames to draw on the window. Instead of 60 fps, I am able to render only 30 fps when using this software decoder. I posted a question on this site and got the suggestion to use the Intel Hardware MJPEG Decoder MFT to solve the frame drop issue.
I faced the error 0xC00D36B5 (MF_E_NOTACCEPTING) when calling IMFTransform::ProcessInput(). To solve this error, MSDN suggests using the IMFTransform interface asynchronously. So I used the IMFMediaEventGenerator interface to get an event for every in/out sample. I can successfully process only one input sample; after that, IMFMediaEventGenerator::GetEvent() continuously returns the MF_E_NO_EVENTS_AVAILABLE error (GetEvent() is synchronous).
I tried to configure an asynchronous callback for the SourceReader as well as for the IMFTransform, but IMFAsyncCallback::Invoke is never invoked, hence I planned to use the GetEvent method.
Am I missing anything? If yes, can someone guide me on using the Intel hardware decoder in my project?
The Intel Hardware MJPEG Decoder MFT is an asynchronous MFT, and if you are managing it directly, you are responsible for applying the asynchronous model. You seem to be doing this, but you don't provide enough information to nail the problem down. Yes, you are supposed to use the event model described in the ProcessInput and ProcessOutput sections of the article linked above. Since you already get the first frame, you should debug further to make it work with smooth continuous processing.
When you use APIs like the Media Session or the Source Reader, Media Foundation itself deals with the MFTs; it is capable of synchronous and asynchronous consumption as appropriate. In that case, however, you don't make the IMFTransform calls yourself, and even from your vague description it appears you are doing it the wrong way.
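The event model for an asynchronous MFT can be sketched roughly as below. `GetNextMjpegSample()` is a hypothetical helper standing in for your sample source; note that a blocking GetEvent (flags = 0) is used, whereas passing MF_EVENT_FLAG_NO_WAIT returns MF_E_NO_EVENTS_AVAILABLE whenever the queue happens to be empty.

```cpp
#include <mfapi.h>
#include <mftransform.h>
#include <mferror.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

IMFSample* GetNextMjpegSample(); // hypothetical: next compressed input sample

// Event-driven processing loop for an asynchronous decoder MFT.
void RunAsyncDecoder(IMFTransform* pDecoder)
{
    // Asynchronous MFTs must be unlocked before use.
    ComPtr<IMFAttributes> attrs;
    pDecoder->GetAttributes(&attrs);
    attrs->SetUINT32(MF_TRANSFORM_ASYNC_UNLOCK, TRUE);

    ComPtr<IMFMediaEventGenerator> events;
    pDecoder->QueryInterface(IID_PPV_ARGS(&events));

    pDecoder->ProcessMessage(MFT_MESSAGE_NOTIFY_BEGIN_STREAMING, 0);
    pDecoder->ProcessMessage(MFT_MESSAGE_NOTIFY_START_OF_STREAM, 0);

    for (;;) {
        ComPtr<IMFMediaEvent> ev;
        // Flags = 0 blocks until the MFT asks for input or has output ready.
        if (FAILED(events->GetEvent(0, &ev)))
            break;
        MediaEventType type = MEUnknown;
        ev->GetType(&type);

        if (type == METransformNeedInput) {
            // Feed exactly one input sample per METransformNeedInput event.
            pDecoder->ProcessInput(0, GetNextMjpegSample(), 0);
        } else if (type == METransformHaveOutput) {
            MFT_OUTPUT_DATA_BUFFER out = {};
            DWORD status = 0;
            pDecoder->ProcessOutput(0, 1, &out, &status);
            if (out.pSample) { /* render the decoded frame */ out.pSample->Release(); }
            if (out.pEvents) out.pEvents->Release();
        }
    }
}
```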

How to connect two kinect v.2 sensor to one computer

I'm updating an application which uses three Kinect v1 sensors with SDK 1.8 connected to the same computer.
I am now moving my application to Kinect v2 to improve the performance of my system. The latest version of the Microsoft SDK, 2.0, does not support connecting multiple sensors.
The only solution I have tried that works is to use three different PCs, one per Kinect v2, and exchange data over an Ethernet connection.
The problem with this solution is that it is too expensive.
The minimum specs of Kinect v2 require an expensive PC, whereas I was hoping to use this solution with small, inexpensive computers like the Raspberry Pi 2.
My questions are:
Do you know any hack to connect multiple Kinect v2 sensors to the same computer?
Do you know any low-cost, Raspberry Pi-like solution that meets the minimum Kinect v2 requirements? (http://www.microsoft.com/en-us/kinectforwindows/purchase/sensor_setup.aspx)
If you only need the video and depth data, you could investigate using https://github.com/OpenKinect/libfreenect2
I can imagine the maximum frame rate could be a bit lower than what you would get on an Intel i5 system with USB 3.0.
The rest of the high requirements is mostly needed for skeleton tracking, which won't be available anyway, as it is not present in libfreenect2.
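As an illustration, grabbing color and depth frames with libfreenect2 looks roughly like the sketch below, modeled on the library's own example program; exact API details may differ between library versions.

```cpp
#include <libfreenect2/libfreenect2.hpp>
#include <libfreenect2/frame_listener_impl.h>

int main()
{
    libfreenect2::Freenect2 freenect2;
    libfreenect2::Freenect2Device* dev = freenect2.openDefaultDevice();
    if (!dev) return 1;

    // Listen for color and depth only; there is no skeleton tracking here.
    libfreenect2::SyncMultiFrameListener listener(
        libfreenect2::Frame::Color | libfreenect2::Frame::Depth);
    dev->setColorFrameListener(&listener);
    dev->setIrAndDepthFrameListener(&listener);
    dev->start();

    libfreenect2::FrameMap frames;
    listener.waitForNewFrame(frames);
    libfreenect2::Frame* rgb   = frames[libfreenect2::Frame::Color];
    libfreenect2::Frame* depth = frames[libfreenect2::Frame::Depth];
    // ... use rgb->data and depth->data ...
    listener.release(frames);

    dev->stop();
    dev->close();
    return 0;
}
```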