UWP, Media Foundation, choosing specific encoder - c++

I would like to choose a specific encoder in Media Foundation under UWP using C++/CX. Currently I use a SinkWriter and let the system choose a default encoder.
This code returns a "class not registered" error under UWP, but it works in a Win32 console app:
CoInitializeEx(NULL, COINIT_APARTMENTTHREADED | COINIT_DISABLE_OLE1DDE);
MFStartup(MF_VERSION);
IMFTransform* mtf;
CLSID id;
CLSIDFromString(L"{966F107C-8EA2-425D-B822-E4A71BEF01D7}", &id); // "NVIDIA HEVC Encoder MFT"
//CLSIDFromString(L"{F2F84074-8BCA-40BD-9159-E880F673DD3B}", &id); // "H265 Encoder MFT"
//CLSIDFromString(L"{BC10864D-2B34-408F-912A-102B1B867B6C}", &id); // "Intel® Hardware H265 Encoder MFT"
//HRESULT hr = CoCreateInstance(id, nullptr, CLSCTX_INPROC_SERVER, IID_IMFTransform, (void **)&mtf);
HRESULT hr = CoCreateInstance(id, nullptr, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&mtf));
I also noticed that MFTEnumEx() is not defined in the header files under UWP, so I can't enumerate the encoders.
I noticed there is C# documentation allowing something like this:
auto codecQuery = ref new Windows::Media::Core::CodecQuery();
But it seems it is not available when using C++/CX.
I would also like to ask the SinkWriter what encoder it actually chose, but this code does not work because ICodecAPI is undefined:
IMFTransform* pEncoder = NULL;
mWriter->GetServiceForStream(MF_SOURCE_READER_FIRST_VIDEO_STREAM, GUID_NULL, IID_IMFTransform, (void**)&pEncoder);
if (pEncoder)
{
ICodecAPI* pCodecApi = NULL;
hr = pEncoder->QueryInterface<ICodecAPI>(&pCodecApi);
}
Please help me choose an encoder, or find out which encoder was chosen.

Media Foundation does not let you specify the encoder through the Sink Writer API. You can only tell it whether or not to use hardware encoders, via the MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS attribute:
Enables the source reader or sink writer to use hardware-based Media Foundation transforms (MFTs).
Once the Sink Writer is set up, you can use IMFSinkWriterEx::GetTransformForStream to enumerate the transforms the API prepared for processing and pick the encoder out of that enumeration. This tells you which encoder is actually used.
The Sink Writer API reserves the right to decide which encoder to use. Typically it prefers a certified, compatible encoder, especially if you enable a Direct3D scenario.
I am not sure which of these are available in C++/CX, but your code snippets suggest that the mentioned API is available.
To use an encoder of your choice, you are supposed to use the Media Foundation Media Session API rather than the Sink Writer.

Thank you, Roman. I tried GetTransformForStream. With the NVIDIA driver I get these attributes for the IMFTransform:
{206B4FC8-FCF9-4C51-AFE3-9764369E33A0}=1,
{2FB866AC-B078-4942-AB6C-003D05CDA674}=NVIDIA HEVC Encoder MFT,
FRIENDLY_NAME_Attribute=NVIDIA HEVC Encoder MFT,
{3AECB0CC-035B-4BCC-8185-2B8D551EF3AF}=VEN_10DE,
MAJOR_TYPE=Video,
{53476A11-3F13-49FB-AC42-EE2733C96741}=1,
{86A355AE-3A77-4EC4-9F31-01149A4E92DE}=1,
{88A7CB15-7B07-4A34-9128-E64C6703C4D3}=8,
{E3F2E203-D445-4B8C-9211-AE390D3BA017}=2303214,
{E5666D6B-3422-4EB6-A421-DA7DB1F8E207}=1,
{F34B9093-05E0-4B16-993D-3E2A2CDE6AD3}=860522,
SUBTYPE=Base,
{F81A699A-649A-497D-8C73-29F8FED6AD7A}=1,
When disabling nvidia driver I only get:
{86A355AE-3A77-4EC4-9F31-01149A4E92DE}=1
I wonder whether that last transform is a list of several transforms. How do I get them? Can I traverse the topology from the Sink Writer?
My pc has the following codecs I could use:
{966F107C-8EA2-425D-B822-E4A71BEF01D7} // "NVIDIA HEVC Encoder MFT"
{F2F84074-8BCA-40BD-9159-E880F673DD3B} // "H265 Encoder MFT"
{BC10864D-2B34-408F-912A-102B1B867B6C} // "Intel® Hardware H265 Encoder MFT"
In the NVIDIA case I get a meaningful string, but apparently not otherwise (Intel or software).
Now I will also try looking into the Media Session API, as you suggested.

Related

MFTEnumEx cannot find MFAudioFormat_MP3 decoder on Windows 7?

As an offshoot question from: IMFTransform SetInputType()/SetOutputType() fails
When I try to enumerate MP3 decoders on Windows 7, it fails to find any. However, it appears to find one when I set a partial media type on an IMFSourceReader for an MP3 file created by MFCreateSourceReaderFromURL.
I have tried:
MFT_REGISTER_TYPE_INFO outType{ MFMediaType_Audio, MFAudioFormat_Float }; // And MFAudioFormat_PCM, MFAudioFormat_Float
MFT_REGISTER_TYPE_INFO inType{ MFMediaType_Audio, MFAudioFormat_MP3 };
IMFActivate** decoders;
UINT32 decoderCount;
HRESULT hr;
hr = MFTEnumEx(MFT_CATEGORY_AUDIO_DECODER, MFT_ENUM_FLAG_SYNCMFT, &inType, &outType, &decoders, &decoderCount);
SUCCEEDED(hr);
I believe I have tried all the different flags to MFTEnumEx but decoderCount still gives zero?
Windows 7 SP1 decoder:
MP3 Decoder MFT
MFT_TRANSFORM_CLSID_Attribute: {BBEEA841-0A63-4F52-A7AB-A9B3A84ED38A} (Type VT_CLSID)
MF_TRANSFORM_FLAGS_Attribute: MFT_ENUM_FLAG_SYNCMFT
MFT_INPUT_TYPES_Attributes: MFAudioFormat_MP3
MFT_OUTPUT_TYPES_Attributes: MFAudioFormat_PCM
The decoder does not advertise support for MFAudioFormat_Float output (even though it can support it once instantiated). When you enumerate decoders and limit the output to MFAudioFormat_Float, the decoder is excluded. Newer versions of the OS might have an updated decoder with more output format options.
If you did it this way:
MFT_REGISTER_TYPE_INFO outType { MFMediaType_Audio, MFAudioFormat_PCM };
or passed nullptr as the output media type, the decoder would be enumerated.
Also, the Source Reader API, generally speaking, uses the same MFTEnum logic to fit the actual source media type to the requested media type.
When enumerating, also pay attention to the flags: it might not matter to you whether the MFT is synchronous, but your API call asks to skip asynchronous ones.
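To make the filtering argument concrete, here is a toy model in portable C++. It is not the real MFTEnumEx; `RegisteredMft`, `EnumerateModel` and `Windows7Mp3Registry` are made-up names, and the subtypes are plain strings instead of GUIDs. It only mimics the rule described above: a transform is listed when every filter you pass matches something it advertises, and a missing filter matches everything.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Simplified stand-in for what a transform registers about itself.
struct RegisteredMft {
    std::string name;
    std::vector<std::string> inputSubtypes;
    std::vector<std::string> outputSubtypes;
};

// True when no filter was given, or when the wanted subtype is advertised.
inline bool MatchesFilter(const std::vector<std::string>& advertised,
                          const std::optional<std::string>& wanted) {
    if (!wanted) return true;
    for (const auto& subtype : advertised)
        if (subtype == *wanted) return true;
    return false;
}

// Returns the names of all registered transforms that pass both filters.
inline std::vector<std::string> EnumerateModel(
        const std::vector<RegisteredMft>& registry,
        const std::optional<std::string>& inputSubtype,
        const std::optional<std::string>& outputSubtype) {
    std::vector<std::string> names;
    for (const auto& mft : registry)
        if (MatchesFilter(mft.inputSubtypes, inputSubtype) &&
            MatchesFilter(mft.outputSubtypes, outputSubtype))
            names.push_back(mft.name);
    return names;
}

// The Windows 7 SP1 registration quoted above: MP3 in, PCM out only.
inline std::vector<RegisteredMft> Windows7Mp3Registry() {
    std::vector<RegisteredMft> registry;
    registry.push_back({"MP3 Decoder MFT", {"MFAudioFormat_MP3"}, {"MFAudioFormat_PCM"}});
    return registry;
}
```

With this model, asking for MP3 in and Float out yields an empty list, while asking for PCM out, or passing `std::nullopt` as the output filter, yields the MP3 decoder. That is the same outcome as in the question.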

FFMPEG with C++ accessing a webcam

I have searched all around and can not find any examples or tutorials on how to access a webcam using ffmpeg in C++. Any sample code or any help pointing me to some documentation, would greatly be appreciated.
Thanks in advance.
I have been working on this for months now. Your first "issue" is that ffmpeg (libavcodec and other ffmpeg libs) does NOT access webcams, or any other device.
For a basic USB webcam, or an audio/video capture card, you first need driver software to access that device. For Linux, these drivers fall under the Video4Linux (V4L2, as it is known) category, which are modules that are part of most distros. If you are working with MS Windows, then you need to get an SDK that allows you to access the device. MS may have something for accessing generic devices (but in my experience they are not very capable, if they work at all). If you've made it this far, then you now have raw frames (video and/or audio).
THEN you get to the ffmpeg part, libavcodec, which takes the raw frames (audio and/or video) and encodes them into streams, which ffmpeg can then mux into your final container.
I have searched, but have found very few examples of all of these, and most are piecemeal.
If you don't actually need to code this yourself, the command-line ffmpeg, as well as vlc, can access these devices, capture and save to files, and even stream.
That's the best I can do for now.
ken
For Windows, use dshow.
For Linux (like Ubuntu), use Video4Linux (V4L2).
FFmpeg can take input from V4L2 and process it.
To find the USB video device path, type: ls /dev/video*
E.g.: /dev/video(n), where n = 0 / 1 / 2 ...
AVInputFormat: the struct that holds information about the input device format / media device format.
av_find_input_format("v4l2") [Linux]
avformat_open_input(&AVFormatContext, "/dev/video(n)", AVInputFormat, NULL)
If the return value is != 0, there was an error.
Now you have accessed the camera using FFmpeg and can continue the operation.
Sample code is below.
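extern "C" {
#include <libavdevice/avdevice.h>
#include <libavformat/avformat.h>
}
#include <iostream>

int CaptureCam()
{
    avdevice_register_all(); // registers input devices such as v4l2
    // FFmpeg older than 4.0 also needs avcodec_register_all() and
    // av_register_all() here; both calls were removed in FFmpeg 5.
    const char *dev_name = "/dev/video0"; // here mine is video0, it may vary.
    auto *inputFormat = av_find_input_format("v4l2");
    AVDictionary *options = NULL;
    av_dict_set(&options, "framerate", "20", 0);
    AVFormatContext *pAVFormatContext = NULL;
    // open the video source; pass &options so the frame rate request is used
    const int ret = avformat_open_input(&pAVFormatContext, dev_name, inputFormat, &options);
    av_dict_free(&options); // frees whatever entries were not consumed
    if (ret != 0)
    {
        std::cout << "\nOops, couldn't open video source\n\n";
        return -1;
    }
    std::cout << "\nSuccess!\n";
    // ... continue here, e.g. read packets with av_read_frame() ...
    avformat_close_input(&pAVFormatContext);
    return 0;
} // end function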
Note: the header file <libavdevice/avdevice.h> must be included.
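If you would rather find the device path from code than from the shell, the `ls /dev/video*` step can be approximated with C++17 `std::filesystem`. This is only a sketch: `ListVideoDevices` is a name I made up, it does no V4L2 query at all, it just lists directory entries whose name starts with "video".

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

// Returns the sorted paths of all entries in `dir` whose file name starts
// with "video", roughly what `ls /dev/video*` prints. Returns an empty list
// when `dir` cannot be read.
inline std::vector<std::string> ListVideoDevices(
        const std::filesystem::path& dir = "/dev") {
    std::vector<std::string> devices;
    std::error_code ec;
    for (std::filesystem::directory_iterator it(dir, ec), end;
         !ec && it != end; it.increment(ec)) {
        const std::string name = it->path().filename().string();
        if (name.rfind("video", 0) == 0) // prefix match
            devices.push_back(it->path().string());
    }
    std::sort(devices.begin(), devices.end());
    return devices;
}
```

Each returned string can be passed as the device name to avformat_open_input. The sort is lexicographic, so "video10" comes before "video2"; an entry showing up here also does not guarantee that it is a capture device.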
This really doesn't answer the question, as I don't have a pure ffmpeg solution for you. However, I personally use Qt for webcam access. It is C++ and will have a much better API for accomplishing this. It does add a very large dependency to your code, however.
It definitely depends on the webcam - for example, at work we use IP cameras that deliver a stream of jpeg data over the network. USB will be different.
You can look at the DirectShow samples, eg PlayCap (but they show AmCap and DVCap samples too). Once you have a directshow input device (chances are whatever device you have will be providing this natively) you can hook it up to ffmpeg via the dshow input device.
And having spent 5 minutes browsing the ffmpeg site to get those links, I see this...

Get encoder name from SinkWriter or ICodecAPI or IMFTransform

I'm using the SinkWriter in order to encode video using media foundation.
After I initialize the SinkWriter, I would like to get the underlying encoder it uses, and print out its name, so I can see what encoder it uses. (In my case, the encoder is most probably the H.264 Video Encoder included in MF).
I can get references to the encoder's ICodecAPI and IMFTransform interface (using pSinkWriter->GetServiceForStream), but I don't know how to get the encoder's friendly name using those interfaces.
Does anyone know how to get the encoder's friendly name from the sinkwriter? Or from its ICodecAPI or IMFTransform interface?
This is far from an efficient solution, and I am not 100% sure it works, but what could be done is:
1) At start-up enumerate all the codecs that could be used (as i understand in this case H264 encoders) and subscribe to setting change event
MFT_REGISTER_TYPE_INFO TransformationOutput = { MFMediaType_Video, MFVideoFormat_H264 };
UINT32 nFlags = 0; // reserved for MFTEnum, must be zero (MFT_ENUM_FLAG_* values are for MFTEnumEx)
UINT32 nCount = 0;
CLSID* pClsids = NULL;
HRESULT hr = MFTEnum( MFT_CATEGORY_VIDEO_ENCODER, nFlags, NULL, &TransformationOutput, NULL, &pClsids, &nCount);
// Ok here we assume nCount is 1 and we got the MS encoder
ICodecAPI *pMsEncoder = NULL;
hr = CoCreateInstance(pClsids[0], NULL, CLSCTX_INPROC_SERVER, __uuidof(ICodecAPI), (void**)&pMsEncoder);
// nCodecIds is supposed to be an array of identifiers to distinguish the sender
hr = pMsEncoder->RegisterForEvent(&CODECAPI_AVEncVideoOutputFrameRate, (LONG_PTR)&nCodecIds[0]);
CoTaskMemFree(pClsids); // the CLSID array is allocated by MFTEnum and freed by the caller
2) Not 100% sure if the frame rate setting is also set when the input media type for the stream is set, but anyhow you can try to set the same property on the ICodecAPI you retrieved from the SinkWriter. Then after getting the event you should be able to identify the codec by comparing lParam1 to the value passed. But still this is very poor since it relies on the fact that all the encoders support the event notification and requires unneeded parameter changing if my hypothesis about the event being generated on stream construction is wrong.
An IMFTransform by itself does not give you the encoder's friendly name.
One option is to check the transform's output type and compare it against well-known GUIDs to identify the encoder; in particular, the H264 Encoder MFT will have an output subtype of MFVideoFormat_H264.
Another option is to obtain the CLSID of the encoder (IMFTransform does not give it to you directly, but you might get it otherwise, such as via IMFActivate, by querying the MFT_TRANSFORM_CLSID_Attribute attribute, or via the IPersist* interfaces). Then you could look up the friendly name in the registry, or enumerate the transforms and find yours in that list by comparing CLSIDs.
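The "find yours in that list by comparing CLSIDs" step could look roughly like this once you have the CLSIDs as strings (for example from StringFromCLSID). This is a portable sketch, not Media Foundation code: `EncoderEntry`, `FindFriendlyName` and `SampleEncoders` are hypothetical names, and the two sample entries are just placeholder CLSID/name pairs of the kind an enumeration would give you.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// One row of an encoder list, e.g. filled from an enumeration at start-up.
struct EncoderEntry {
    std::string clsid;        // registry-style GUID string, braces included
    std::string friendlyName;
};

inline std::string ToUpperAscii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::toupper(c));
    });
    return s;
}

// Returns the friendly name registered for `clsid`, or an empty string when
// the CLSID is not in the list. GUID strings are compared case-insensitively.
inline std::string FindFriendlyName(const std::vector<EncoderEntry>& encoders,
                                    const std::string& clsid) {
    const std::string wanted = ToUpperAscii(clsid);
    for (const auto& entry : encoders)
        if (ToUpperAscii(entry.clsid) == wanted) return entry.friendlyName;
    return {};
}

// Placeholder data standing in for a real enumeration result.
inline std::vector<EncoderEntry> SampleEncoders() {
    return {
        {"{966F107C-8EA2-425D-B822-E4A71BEF01D7}", "NVIDIA HEVC Encoder MFT"},
        {"{F2F84074-8BCA-40BD-9159-E880F673DD3B}", "H265 Encoder MFT"},
    };
}
```

An empty result means the CLSID was not in your list, in which case a registry lookup would be the fallback.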

How to set bitrate of IVP8Encoder filter in a DirectShow application

How do I set the bitrate of the VP8 encoder filter in a DirectShow application (C++ code)? My graph looks like this:
Webcam ---> WebM VP8 encoder ---> AVI mux ---> file writer (.avi)
I'm able to set the bitrate in GraphEdit by right-clicking the VP8 encoder and choosing Properties, but I want to set it from C++ code in a DirectShow application. I'm new to DirectShow, so please provide sample code. Thanks in advance.
The subject suggests that you already have the IVP8Encoder interface on hand (which is also in line with the fact that you have the IDL files and their derivatives).
IVP8Encoder::SetTargetBitrate is the method that does the thing.
//Target data rate
//
//Target bandwidth to use for this stream, in kilobits per second.
//The value 0 means "use the codec default".
HRESULT SetTargetBitrate([in] int Bitrate);
HRESULT GetTargetBitrate([out] int* pBitrate);
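Since the method takes kilobits per second and treats 0 as "use the codec default", a small conversion helper can avoid passing bits per second by mistake. This is a sketch under assumptions: `ToKilobitsPerSecond` is a made-up helper, and I am assuming 1 kilobit = 1000 bits here.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Converts bits per second to whole kilobits per second, rounding to nearest.
// Zero or negative input returns 0 (the codec default). Any positive input
// returns at least 1, so a small bitrate is not silently turned into "default".
inline int ToKilobitsPerSecond(std::int64_t bitsPerSecond) {
    if (bitsPerSecond <= 0) return 0;
    std::int64_t kbps = bitsPerSecond / 1000 + (bitsPerSecond % 1000 >= 500 ? 1 : 0);
    if (kbps < 1) kbps = 1;
    const std::int64_t maxInt = std::numeric_limits<int>::max();
    if (kbps > maxInt) kbps = maxInt;
    return static_cast<int>(kbps);
}
```

With an IVP8Encoder pointer on hand (say `pVP8Encoder`, a hypothetical variable name), the call would then be `pVP8Encoder->SetTargetBitrate(ToKilobitsPerSecond(1500000));`, made before the graph is run.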

Sound from mic vs sound from speaker

I want to capture audio from both the mic and the speaker, separately. How can I distinguish between them? I can capture one or the other using the Wave API, e.g., waveInOpen().
When I enumerate the devices using waveInGetNumDevs() and waveInGetDevCaps()/waveOutGetDevCaps(), there seems to be no information related to a particular endpoint device (e.g., mic or speaker). I only see the following, which are adapter devices:
HD Read Audio Input
HD Read Audio Output
Webcam ...
I have no real knowledge of the Windows API, so my answer probably isn't the best, and there may be even better ways.
HRESULT hr = CoInitialize(NULL);
IMMDeviceEnumerator *pEnum = NULL;
hr = CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_ALL, __uuidof(IMMDeviceEnumerator), (void**)&pEnum);
if(SUCCEEDED(hr))
{
IMMDeviceCollection *pDevices = NULL;
// Enumerate all active endpoints. Pass eCapture for inputs only (e.g. mic)
// or eRender for outputs only (e.g. speaker) instead of eAll.
hr = pEnum->EnumAudioEndpoints(eAll, DEVICE_STATE_ACTIVE, &pDevices);
}
With that you'd be able to distinguish between input (capture) and output (render).
(That's what you wanted right?)
The code is taken from this article. You may look at it for the correct API calls and libraries, it even might give you some more information.
Hope that's helpful.