Playing audio through WASAPI - IAudioRenderClient and getting static - C++

I'm trying to learn how to play audio through my default playback device using WASAPI. My test is capturing audio from my default device and sending it to the playback device to render so that I can hear myself talk through my speakers.
The data coming from my capture device appears to be fine, but when I send it to the IAudioRenderClient buffer, I sometimes get pure static and sometimes just silence.
If you think you need to see my capture device logic, just ask. I haven't posted it since it appears to me to be a playback device issue.
My devices work properly in other apps, so it's not a device problem.
Can anyone tell me what I'm doing wrong?
Playback Device Init
#define REFTIMES_PER_SEC 48000
bool AudioManager::InitPlaybackDevice()
{
    HRESULT hr;
    WAVEFORMATEX* pWfx = nullptr;
    BYTE* pData = nullptr;
    UINT32 nNumFramesPadding = 0;

    // Query the Enumerator for the default playback device
    hr = m_pEnumerator->GetDefaultAudioEndpoint(eRender, eCommunications, &m_pPlaybackDevice);
    EXIT_ON_ERROR_SHUTDOWN(hr);
    hr = m_pPlaybackDevice->Activate(IID_IAudioClient, CLSCTX_ALL, NULL, (void**)&m_pAudioRenderClient);
    EXIT_ON_ERROR_SHUTDOWN(hr);
    hr = m_pAudioRenderClient->GetMixFormat(&pWfx);
    EXIT_ON_ERROR_SHUTDOWN(hr);
    hr = m_pAudioRenderClient->Initialize(AUDCLNT_SHAREMODE_SHARED, 0, REFTIMES_PER_SEC, 0, pWfx, NULL);
    EXIT_ON_ERROR_SHUTDOWN(hr);
    hr = m_pAudioRenderClient->GetBufferSize(&m_nPlaybackBufferFrameCount);
    EXIT_ON_ERROR_SHUTDOWN(hr);
    hr = m_pAudioRenderClient->GetService(IID_IAudioRenderClient, (void**)&m_pAudioRenderClientObj);
    EXIT_ON_ERROR_SHUTDOWN(hr);
    hr = m_pAudioRenderClient->GetCurrentPadding(&nNumFramesPadding);
    EXIT_ON_ERROR_SHUTDOWN(hr);
    hr = m_pAudioRenderClientObj->GetBuffer(m_nPlaybackBufferFrameCount - nNumFramesPadding, &pData);
    EXIT_ON_ERROR_SHUTDOWN(hr);
    hr = m_pAudioRenderClientObj->ReleaseBuffer(m_nPlaybackBufferFrameCount - nNumFramesPadding, AUDCLNT_BUFFERFLAGS_SILENT);
    EXIT_ON_ERROR_SHUTDOWN(hr);
    hr = m_pAudioRenderClient->Start();
    EXIT_ON_ERROR(hr);

    return true;
}
Sending audio data to the playback device
bool AudioManager::SendSampleToPlaybackDevice(BYTE* i_pData, UINT32 i_nSize)
{
    if (m_pAudioRenderClientObj != nullptr)
    {
        HRESULT hr;
        BYTE* pData = nullptr;
        UINT32 nNumFramesPadding = 0;
        int nNumFrames = 0;

        hr = m_pAudioRenderClient->GetCurrentPadding(&nNumFramesPadding);
        EXIT_ON_ERROR(hr);
        nNumFrames = min(i_nSize, m_nPlaybackBufferFrameCount - nNumFramesPadding);
        hr = m_pAudioRenderClientObj->GetBuffer(nNumFrames, &pData);
        EXIT_ON_ERROR(hr);
        if (pData != nullptr)
            memcpy(pData, i_pData, nNumFrames);
        hr = m_pAudioRenderClientObj->ReleaseBuffer(nNumFrames, 0);
        EXIT_ON_ERROR(hr);
        return true;
    }
    return false;
}

Related

How to use the Sink Writer to write raw PCM data to an AAC file; the ultimate goal is to use the Sink Writer to accept RGB data and PCM data into an overall video file

I have been able to use the Sink Writer to read RGB32 data from a file and generate an MP4 video; please refer to the example on the official website: https://learn.microsoft.com/zh-cn/windows/win32/medfound/tutorial--using-the-sink-writer-to-encode-video
Now I want to do something similar and write generated PCM data to an AAC or MP3 file, but there are some problems with the code:
HRESULT InitializeSinkWriter(IMFSinkWriter** ppWriter, DWORD* pStreamIndex)
{
    HRESULT hr = S_OK;
    *ppWriter = NULL;
    *pStreamIndex = NULL;
    DWORD streamIndex;
    IMFSinkWriter* pSinkWriter = NULL;
    IMFMediaType* pMediaTypeOut = NULL;
    IMFMediaType* pMediaTypeIn = NULL;

    // Set the input media type.
    if (SUCCEEDED(hr))
    {
        hr = MFCreateMediaType(&pMediaTypeIn);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeIn->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeIn->SetGUID(MF_MT_SUBTYPE, _recorderConf.audioInPutFormat);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeIn->SetUINT32(MF_MT_AUDIO_NUM_CHANNELS, _recorderConf.audioChannelNum);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeIn->SetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, _recorderConf.audioSamplePerSec);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeIn->SetUINT32(MF_MT_AUDIO_BITS_PER_SAMPLE, _recorderConf.audioBitPerSample);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeIn->SetUINT32(MF_MT_AUDIO_AVG_BYTES_PER_SECOND, _recorderConf.audioBytesPerSecond);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeIn->SetUINT32(MF_MT_ALL_SAMPLES_INDEPENDENT, _recorderConf.audioSampleIndependent);
    }
    if (SUCCEEDED(hr))
    {
        hr = pSinkWriter->SetInputMediaType(streamIndex, pMediaTypeIn, NULL);
    }
}
hr = pSinkWriter->SetInputMediaType(streamIndex, pMediaTypeIn, NULL);
The hr return value for this line is:
0xc00d36b4: The data specified for the media type is invalid, inconsistent, or not supported by this object.
I want to know what went wrong and what is the right way to do it.

Capturing Bluetooth audio data with WASAPI

I'm working on a small project where I need to mix incoming data from my internal microphone and a Bluetooth headset (Bose). I thought I'd use WASAPI for it. However, whilst I can easily read out my internal microphone, this is not the case for my Bluetooth headphones.
I followed the example given by the Docs almost to the letter; however, I made a small change to be able to choose my own "input device" (internal microphone or headset) using the ID it has been given.
The method is the following:
HRESULT RecordAudioStreamBLE(MyAudioSink *pMySink, LPWSTR pwszID)
{
    HRESULT hr;
    REFERENCE_TIME hnsRequestedDuration = REFTIMES_PER_SEC;
    REFERENCE_TIME hnsActualDuration;
    UINT32 bufferFrameCount;
    UINT32 numFramesAvailable;
    IMMDeviceEnumerator *pEnumerator = NULL;
    IMMDevice *pDevice = NULL;
    IAudioClient *pAudioClient = NULL;
    IAudioCaptureClient *pCaptureClient = NULL;
    WAVEFORMATEX *pwfx = NULL;
    UINT32 packetLength = 0;
    BOOL bDone = FALSE;
    BYTE *pData;
    DWORD flags;

    hr = CoInitialize(0);
    hr = CoCreateInstance(
        CLSID_MMDeviceEnumerator, NULL,
        CLSCTX_ALL, IID_IMMDeviceEnumerator,
        (void**)&pEnumerator);
    EXIT_ON_ERROR(hr)
    hr = pEnumerator->GetDevice(pwszID, &pDevice);
    EXIT_ON_ERROR(hr)
    hr = pDevice->Activate(IID_IAudioClient, CLSCTX_ALL,
        NULL, (void**)&pAudioClient);
    EXIT_ON_ERROR(hr)
    hr = pAudioClient->GetMixFormat(&pwfx);
    EXIT_ON_ERROR(hr)
    hr = pAudioClient->Initialize(AUDCLNT_SHAREMODE_SHARED,
        0, hnsRequestedDuration,
        0, pwfx, NULL);
    EXIT_ON_ERROR(hr)
    // Get the size of the allocated buffer.
    hr = pAudioClient->GetBufferSize(&bufferFrameCount);
    EXIT_ON_ERROR(hr)
    hr = pAudioClient->GetService(IID_IAudioCaptureClient,
        (void**)&pCaptureClient);
    EXIT_ON_ERROR(hr)
    // Notify the audio sink which format to use.
    hr = pMySink->SetFormat(pwfx);
    EXIT_ON_ERROR(hr)
    // Calculate the actual duration of the allocated buffer.
    hnsActualDuration = (double)REFTIMES_PER_SEC * bufferFrameCount / pwfx->nSamplesPerSec;
    hr = pAudioClient->Start(); // Start recording.
    EXIT_ON_ERROR(hr)

    // Each loop fills about half of the shared buffer.
    while (bDone == FALSE)
    {
        // Sleep for half the buffer duration.
        Sleep(hnsActualDuration/REFTIMES_PER_MILLISEC/2);
        hr = pCaptureClient->GetNextPacketSize(&packetLength);
        EXIT_ON_ERROR(hr)
        printf("packet size = %d\n", packetLength);
        while (packetLength != 0)
        {
            // Get the available data in the shared buffer.
            hr = pCaptureClient->GetBuffer(&pData,
                &numFramesAvailable,
                &flags, NULL, NULL);
            EXIT_ON_ERROR(hr)
            if (flags & AUDCLNT_BUFFERFLAGS_SILENT)
            {
                pData = NULL; // Tell CopyData to write silence.
            }
            // Copy the available capture data to the audio sink.
            hr = pMySink->CopyData(pData, numFramesAvailable, &bDone);
            EXIT_ON_ERROR(hr)
            hr = pCaptureClient->ReleaseBuffer(numFramesAvailable);
            EXIT_ON_ERROR(hr)
            hr = pCaptureClient->GetNextPacketSize(&packetLength);
            EXIT_ON_ERROR(hr)
        }
    }
    hr = pAudioClient->Stop(); // Stop recording.
    EXIT_ON_ERROR(hr)

Exit:
    printf("%s\n", hr);
    CoTaskMemFree(pwfx);
    SAFE_RELEASE(pEnumerator)
    SAFE_RELEASE(pDevice)
    SAFE_RELEASE(pAudioClient)
    SAFE_RELEASE(pCaptureClient)
    return hr;
}
When doing this with the ID for the internal microphone, everything works normally. However, if the ID of the Bluetooth device is used, I get the following output (I first list all the active devices and choose the ID in the terminal):
Endpoint 0: "S24D330 (Intel(R) Display Audio)" ({0.0.0.00000000}.{0f483f83-6e29-482e-94b5-fb9cc257a03d})
Endpoint 1: "Hoofdtelefoon (LE-My headphone Hands-Free AG Audio)" ({0.0.0.00000000}.{75c8e0c3-2e44-4538-940d-f8c2ae6424ca})
Endpoint 2: "Hoofdtelefoon (My headphone Stereo)" ({0.0.0.00000000}.{849b4cc2-ed72-4d40-9725-d1ce6b4abfa0})
Endpoint 3: "Luidsprekers (2- High Definition Audio Device)" ({0.0.0.00000000}.{cb8d7625-257e-4fd0-84b8-26de6aeb1e1b})
Endpoint 4: "Microfoon (2- High Definition Audio Device)" ({0.0.1.00000000}.{5edec961-7a46-4554-bdcd-43fb7d9a9d9a})
Endpoint 5: "Hoofdtelefoon (LE-My headphone Hands-Free AG Audio)" ({0.0.1.00000000}.{f937a0fa-1475-495b-81be-7aec0c1c7ea5})
Which input would you like?5
samples per second 16000
packet size = 0
packet size = 0
packet size = 0
packet size = 0
packet size = 0
packet size = 0
packet size = 0
packet size = 0
packet size = 0
^C
The packet size is printed in the loop of the method shown above; it shows that the packet size is 0 every time.
Does anybody know how to fix this, and just get "regular" data out of it?
Do I maybe need to use a different API? Speed is key however.
With kind regards
Many Bluetooth headsets support both the A2DP profile for stereo playback and the Hands-Free Profile for bidirectional mono communication, but not at the same time.
If one of these protocols is active, the other will stall.
I suspect the problem in your case is that something is playing to the A2DP endpoint "Hoofdtelefoon (My headphone Stereo)" and that is causing all activity on the two HF endpoints to stall.
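One way to test that theory (a rough sketch I'm adding here, not part of the original code) is to ask the A2DP render endpoint whether any audio session is currently active on it via IAudioSessionManager2; pA2dpDevice below is assumed to be the IMMDevice for the "My headphone Stereo" endpoint, obtained with GetDevice just like the capture device:
#include <audiopolicy.h>

// Sketch only: returns true if any audio session is active on the given
// render endpoint (e.g. the A2DP "My headphone Stereo" device).
bool HasActiveRenderSession(IMMDevice* pA2dpDevice)
{
    IAudioSessionManager2* pMgr = NULL;
    IAudioSessionEnumerator* pSessions = NULL;
    bool active = false;
    HRESULT hr = pA2dpDevice->Activate(__uuidof(IAudioSessionManager2),
        CLSCTX_ALL, NULL, (void**)&pMgr);
    if (SUCCEEDED(hr))
        hr = pMgr->GetSessionEnumerator(&pSessions);
    if (SUCCEEDED(hr))
    {
        int count = 0;
        pSessions->GetCount(&count);
        for (int i = 0; i < count && !active; i++)
        {
            IAudioSessionControl* pCtl = NULL;
            if (SUCCEEDED(pSessions->GetSession(i, &pCtl)))
            {
                AudioSessionState state = AudioSessionStateInactive;
                if (SUCCEEDED(pCtl->GetState(&state)))
                    active = (state == AudioSessionStateActive);
                pCtl->Release();
            }
        }
    }
    if (pSessions) pSessions->Release();
    if (pMgr) pMgr->Release();
    return active;
}
If this returns true while the Hands-Free capture loop keeps printing packet size = 0, that supports the explanation above: stop whatever is rendering to the stereo endpoint and the Hands-Free capture should start delivering packets.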

How to use IAudioClient3 (WASAPI) with Real-Time Work Queue API

I'm working on a lowest-possible-latency MIDI synthesizer software. I'm aware of ASIO and other alternatives, but as they have apparently made significant improvements to the WASAPI stack (in shared mode, at least), I'm curious to try it out. I first wrote a simple event-driven version of the program, but as that's not the recommended way to do low-latency audio on Windows 10 (according to the docs), I'm trying to migrate to the Real-Time Work Queue API.
The documentation on Low Latency Audio states that it is recommended to use the Real-Time Work Queue API or MFCreateMFByteStreamOnStreamEx with WASAPI so that the OS can manage work items in a way that avoids interference from non-audio subsystems. This seems like a good idea, but the latter option seems to require some managed code (demonstrated in this WindowsAudioSession example), which I know nothing about and would preferably avoid (also, the header Robytestream.h, which has the definitions for IRandomAccessStream, isn't found on my system either).
The RTWQ example included in the docs is incomplete (doesn't compile as such), and I have made the necessary additions to make it compilable:
class my_rtqueue : IRtwqAsyncCallback {
public:
    IRtwqAsyncResult* pAsyncResult;
    RTWQWORKITEM_KEY workItemKey;
    DWORD WorkQueueId;

    STDMETHODIMP GetParameters(DWORD* pdwFlags, DWORD* pdwQueue)
    {
        HRESULT hr = S_OK;
        *pdwFlags = 0;
        *pdwQueue = WorkQueueId;
        return hr;
    }

    //-------------------------------------------------------
    STDMETHODIMP Invoke(IRtwqAsyncResult* pAsyncResult)
    {
        HRESULT hr = S_OK;
        IUnknown* pState = NULL;
        WCHAR className[20];
        DWORD bufferLength = 20;
        DWORD taskID = 0;
        LONG priority = 0;
        BYTE* pData;
        hr = render_info.renderclient->GetBuffer(render_info.buffer_framecount, &pData);
        ERROR_EXIT(hr);
        update_buffer((unsigned short*)pData, render_info.framesize_bytes / (2*sizeof(unsigned short))); // 2 channels, sizeof(unsigned short) == 2
        hr = render_info.renderclient->ReleaseBuffer(render_info.buffer_framecount, 0);
        ERROR_EXIT(hr);
        return S_OK;
    }

    STDMETHODIMP QueryInterface(const IID &riid, void **ppvObject) {
        return 0;
    }
    ULONG AddRef() {
        return 0;
    }
    ULONG Release() {
        return 0;
    }

    HRESULT queue(HANDLE event) {
        HRESULT hr;
        hr = RtwqPutWaitingWorkItem(event, 1, this->pAsyncResult, &this->workItemKey);
        return hr;
    }

    my_rtqueue() : workItemKey(0) {
        HRESULT hr = S_OK;
        IRtwqAsyncCallback* callback = NULL;
        DWORD taskId = 0;
        WorkQueueId = RTWQ_MULTITHREADED_WORKQUEUE;
        //WorkQueueId = RTWQ_STANDARD_WORKQUEUE;
        hr = RtwqLockSharedWorkQueue(L"Pro Audio", 0, &taskId, &WorkQueueId);
        ERROR_THROW(hr);
        hr = RtwqCreateAsyncResult(NULL, reinterpret_cast<IRtwqAsyncCallback*>(this), NULL, &pAsyncResult);
        ERROR_THROW(hr);
    }

    int stop() {
        HRESULT hr;
        if (pAsyncResult)
            pAsyncResult->Release();
        if (0xFFFFFFFF != this->WorkQueueId) {
            hr = RtwqUnlockWorkQueue(this->WorkQueueId);
            if (FAILED(hr)) {
                printf("Failed with RtwqUnlockWorkQueue 0x%x\n", hr);
                return 0;
            }
        }
        return 1;
    }
};
And so, the actual WASAPI code (HRESULT error checking is omitted for clarity):
void thread_main(LPVOID param) {
    HRESULT hr;
    REFERENCE_TIME hnsRequestedDuration = 0;
    IMMDeviceEnumerator* pEnumerator = NULL;
    IMMDevice* pDevice = NULL;
    IAudioClient3* pAudioClient = NULL;
    IAudioRenderClient* pRenderClient = NULL;
    WAVEFORMATEX* pwfx = NULL;
    HANDLE hEvent = NULL;
    HANDLE hTask = NULL;
    UINT32 bufferFrameCount;
    BYTE* pData;
    DWORD flags = 0;

    hr = RtwqStartup();
    // also, hr is checked for errors every step of the way
    hr = CoInitialize(NULL);
    hr = CoCreateInstance(
        CLSID_MMDeviceEnumerator, NULL,
        CLSCTX_ALL, IID_IMMDeviceEnumerator,
        (void**)&pEnumerator);
    hr = pEnumerator->GetDefaultAudioEndpoint(
        eRender, eConsole, &pDevice);
    hr = pDevice->Activate(
        IID_IAudioClient, CLSCTX_ALL,
        NULL, (void**)&pAudioClient);

    WAVEFORMATEX wave_format = {};
    wave_format.wFormatTag = WAVE_FORMAT_PCM;
    wave_format.nChannels = 2;
    wave_format.nSamplesPerSec = 48000;
    wave_format.nAvgBytesPerSec = 48000 * 2 * 16 / 8;
    wave_format.nBlockAlign = 2 * 16 / 8;
    wave_format.wBitsPerSample = 16;

    UINT32 DP, FP, MINP, MAXP;
    hr = pAudioClient->GetSharedModeEnginePeriod(&wave_format, &DP, &FP, &MINP, &MAXP);
    printf("DefaultPeriod: %u, Fundamental period: %u, min_period: %u, max_period: %u\n", DP, FP, MINP, MAXP);

    hr = pAudioClient->InitializeSharedAudioStream(AUDCLNT_STREAMFLAGS_EVENTCALLBACK, MINP, &wave_format, 0);

    my_rtqueue* workqueue = NULL;
    try {
        workqueue = new my_rtqueue();
    }
    catch (...) {
        hr = E_ABORT;
        ERROR_EXIT(hr);
    }

    hr = pAudioClient->GetBufferSize(&bufferFrameCount);
    PWAVEFORMATEX wf = &wave_format;
    UINT32 current_period;
    pAudioClient->GetCurrentSharedModeEnginePeriod(&wf, &current_period);

    INT32 FrameSize_bytes = bufferFrameCount * wave_format.nChannels * wave_format.wBitsPerSample / 8;
    printf("bufferFrameCount: %u, FrameSize_bytes: %d, current_period: %u\n", bufferFrameCount, FrameSize_bytes, current_period);

    hr = pAudioClient->GetService(
        IID_IAudioRenderClient,
        (void**)&pRenderClient);

    render_info.framesize_bytes = FrameSize_bytes;
    render_info.buffer_framecount = bufferFrameCount;
    render_info.renderclient = pRenderClient;

    hEvent = CreateEvent(nullptr, false, false, nullptr);
    if (hEvent == INVALID_HANDLE_VALUE) { ERROR_EXIT(0); }
    hr = pAudioClient->SetEventHandle(hEvent);

    const size_t num_samples = FrameSize_bytes / sizeof(unsigned short);

    DWORD taskIndex = 0;
    hTask = AvSetMmThreadCharacteristics(TEXT("Pro Audio"), &taskIndex);
    if (hTask == NULL) {
        hr = E_FAIL;
    }

    hr = pAudioClient->Start(); // Start playing.
    running = 1;
    while (running) {
        workqueue->queue(hEvent);
    }

    workqueue->stop();
    hr = RtwqShutdown();
    delete workqueue;
    running = 0;
    return 1;
}
This seems to kind of work (i.e. audio is being output), but on every other invocation of my_rtqueue::Invoke(), IAudioRenderClient::GetBuffer() returns an HRESULT of 0x88890006 (-> AUDCLNT_E_BUFFER_TOO_LARGE), and the actual audio output is certainly not what I intend it to be.
What issues are there with my code? Is this the right way to use RTWQ with WASAPI?
Turns out there were a number of issues with my code, none of which really had anything to do with Rtwq. The biggest issue was me assuming that the shared mode audio stream was using 16-bit integer samples, when in reality my audio was set up for 32-bit float format (WAVE_FORMAT_IEEE_FLOAT). The currently active shared mode format, period etc. should be fetched like this:
WAVEFORMATEX *wavefmt = NULL;
UINT32 current_period = 0;
hr = pAudioClient->GetCurrentSharedModeEnginePeriod((WAVEFORMATEX**)&wavefmt, &current_period);
wavefmt now contains the output format info of the current shared mode. If the wFormatTag field is equal to WAVE_FORMAT_EXTENSIBLE, one needs to cast the WAVEFORMATEX to WAVEFORMATEXTENSIBLE and inspect its SubFormat to see what the actual format is (a small check for this is sketched after the next snippet). After this, one needs to fetch the minimum period supported by the current setup, like so:
UINT32 DP, FP, MINP, MAXP;
hr = pAudioClient->GetSharedModeEnginePeriod(wavefmt, &DP, &FP, &MINP, &MAXP);
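For completeness, here is a minimal sketch of that format check (my own illustration; it assumes the float case shows up either as WAVE_FORMAT_IEEE_FLOAT directly or as WAVE_FORMAT_EXTENSIBLE with SubFormat equal to KSDATAFORMAT_SUBTYPE_IEEE_FLOAT):
#include <mmreg.h>
#include <ks.h>
#include <ksmedia.h>

// Sketch: true if the shared-mode engine format is 32-bit float.
bool IsFloatFormat(const WAVEFORMATEX* fmt)
{
    if (fmt->wFormatTag == WAVE_FORMAT_IEEE_FLOAT)
        return true;
    if (fmt->wFormatTag == WAVE_FORMAT_EXTENSIBLE)
    {
        const WAVEFORMATEXTENSIBLE* ext = reinterpret_cast<const WAVEFORMATEXTENSIBLE*>(fmt);
        return ext->SubFormat == KSDATAFORMAT_SUBTYPE_IEEE_FLOAT;
    }
    return false;
}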
Then initialize the audio stream with the new InitializeSharedAudioStream function:
hr = pAudioClient->InitializeSharedAudioStream(AUDCLNT_STREAMFLAGS_EVENTCALLBACK, MINP, wavefmt, NULL);
... get the buffer's actual size:
hr = pAudioClient->GetBufferSize(&render_info.buffer_framecount);
and use GetCurrentPadding in the Get/ReleaseBuffer logic:
UINT32 pad = 0;
hr = render_info.audioclient->GetCurrentPadding(&pad);
int actual_size = (render_info.buffer_framecount - pad);

hr = render_info.renderclient->GetBuffer(actual_size, &pData);
if (SUCCEEDED(hr)) {
    update_buffer((float*)pData, actual_size);
    hr = render_info.renderclient->ReleaseBuffer(actual_size, 0);
    ERROR_EXIT(hr);
}
The documentation for IAudioClient::Initialize states the following about shared mode streams (I assume it also applies to the new IAudioClient3):
Each time the thread awakens, it should call IAudioClient::GetCurrentPadding to determine how much data to write to a rendering buffer or read from a capture buffer. In contrast to the two buffers that the Initialize method allocates for an exclusive-mode stream that uses event-driven buffering, a shared-mode stream requires a single buffer.
Using GetCurrentPadding solves the problem with AUDCLNT_E_BUFFER_TOO_LARGE, and feeding the buffer with 32-bit float samples instead of 16-bit integers makes the output sound fine on my system (although the effect was quite funky!).
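For reference, a minimal sketch of what a float-format fill routine could look like; update_buffer here is only an assumed shape for the helper used above, writing the requested number of frames of interleaved stereo 32-bit float samples (a quiet 440 Hz test tone, assuming the 48000 Hz mix rate):
#include <cmath>

static double g_phase = 0.0;

// Sketch: fill `frames` frames of interleaved stereo float samples.
// Assumes a 48000 Hz, 2-channel, 32-bit float shared-mode mix format.
void update_buffer(float* out, UINT32 frames)
{
    const double freq = 440.0;
    const double sample_rate = 48000.0;
    for (UINT32 i = 0; i < frames; ++i)
    {
        float s = (float)(0.1 * sin(g_phase)); // keep the level low
        out[2 * i + 0] = s; // left
        out[2 * i + 1] = s; // right
        g_phase += 2.0 * 3.141592653589793 * freq / sample_rate;
    }
}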
If someone comes up with better/more correct ways to use the Rtwq API, I would love to hear them.

Why does Media Foundation InitializeSinkWriter (SetInputMediaType) only accept the WMV3 format?

Taken from the MSDN help pages, InitializeSinkWriter works fine as long as the video encoding and video input formats are WMV3/RGB32; however, if I change them to WMV1, MPEG2, etc., then SetInputMediaType fails.
AFAIK I have WMV1 installed as a codec according to Sherlock the Codec Detective program.
Here is the code that causes the issue:
(to find the problem code, search for "problem" in the source comments; there is a lot of boilerplate code that is irrelevant)
// Format constants
const UINT32 VIDEO_WIDTH = 640;
const UINT32 VIDEO_HEIGHT = 480;
const UINT32 VIDEO_FPS = 30;
const UINT64 VIDEO_FRAME_DURATION = 10 * 1000 * 1000 / VIDEO_FPS;
const UINT32 VIDEO_BIT_RATE = 800000;
const GUID VIDEO_ENCODING_FORMAT = MFVideoFormat_WMV1 ; // problem here, must be WMV3
const GUID VIDEO_INPUT_FORMAT = MFVideoFormat_WMV3 ; // problem here if not wmv3 too
const UINT32 VIDEO_PELS = VIDEO_WIDTH * VIDEO_HEIGHT;
const UINT32 VIDEO_FRAME_COUNT = 20 * VIDEO_FPS;
HRESULT InitializeSinkWriter(IMFSinkWriter **ppWriter, DWORD *pStreamIndex)
{
    *ppWriter = NULL;
    *pStreamIndex = NULL;
    IMFSinkWriter *pSinkWriter = NULL;
    IMFMediaType *pMediaTypeOut = NULL;
    IMFMediaType *pMediaTypeIn = NULL;
    DWORD streamIndex;

    HRESULT hr = MFCreateSinkWriterFromURL(L"output.wmv", NULL, NULL, &pSinkWriter);

    // Set the output media type.
    if (SUCCEEDED(hr))
    {
        hr = MFCreateMediaType(&pMediaTypeOut);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeOut->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeOut->SetGUID(MF_MT_SUBTYPE, VIDEO_ENCODING_FORMAT);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeOut->SetUINT32(MF_MT_AVG_BITRATE, VIDEO_BIT_RATE);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeOut->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive);
    }
    if (SUCCEEDED(hr))
    {
        hr = MFSetAttributeSize(pMediaTypeOut, MF_MT_FRAME_SIZE, VIDEO_WIDTH, VIDEO_HEIGHT);
    }
    if (SUCCEEDED(hr))
    {
        hr = MFSetAttributeRatio(pMediaTypeOut, MF_MT_FRAME_RATE, VIDEO_FPS, 1);
    }
    if (SUCCEEDED(hr))
    {
        hr = MFSetAttributeRatio(pMediaTypeOut, MF_MT_PIXEL_ASPECT_RATIO, 1, 1);
    }
    if (SUCCEEDED(hr))
    {
        hr = pSinkWriter->AddStream(pMediaTypeOut, &streamIndex);
    }

    // Set the input media type.
    if (SUCCEEDED(hr))
    {
        hr = MFCreateMediaType(&pMediaTypeIn);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeIn->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeIn->SetGUID(MF_MT_SUBTYPE, VIDEO_INPUT_FORMAT);
    }
    if (SUCCEEDED(hr))
    {
        hr = pMediaTypeIn->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive);
    }
    if (SUCCEEDED(hr))
    {
        hr = MFSetAttributeSize(pMediaTypeIn, MF_MT_FRAME_SIZE, VIDEO_WIDTH, VIDEO_HEIGHT);
    }
    if (SUCCEEDED(hr))
    {
        hr = MFSetAttributeRatio(pMediaTypeIn, MF_MT_FRAME_RATE, VIDEO_FPS, 1);
    }
    if (SUCCEEDED(hr))
    {
        hr = MFSetAttributeRatio(pMediaTypeIn, MF_MT_PIXEL_ASPECT_RATIO, 1, 1);
    }
    if (SUCCEEDED(hr))
    {
        // Problem here! Codec issue with wmv1, mpeg, etc.
        hr = pSinkWriter->SetInputMediaType(streamIndex, pMediaTypeIn, NULL);
    }
    else {
        puts("setattributeratio failed");
    }

    // Tell the sink writer to start accepting data.
    if (SUCCEEDED(hr))
    {
        hr = pSinkWriter->BeginWriting();
    }
    else {
        puts("setinputmediatype failed"); // <-- HR result problem here
    }

    // Return the pointer to the caller.
    if (SUCCEEDED(hr))
    {
        *ppWriter = pSinkWriter;
        (*ppWriter)->AddRef();
        *pStreamIndex = streamIndex;
    }
    else {
        puts("beginwriting failed");
    }

    SafeRelease(&pSinkWriter);
    SafeRelease(&pMediaTypeOut);
    SafeRelease(&pMediaTypeIn);
    return hr;
}
InitializeSinkWriter is called with this code:
void main()
{
    DWORD streamidx = 0;
    const WCHAR *SAMPLE_FILE = L"sample.wmv";
    IMFSourceReader *pReader = NULL;
    IMFSinkWriter *pWriter = NULL;

    puts("Initializing...");
    HRESULT hr = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED);
    if (SUCCEEDED(hr))
    {
        hr = MFStartup(MF_VERSION);
        if (SUCCEEDED(hr))
        {
            // problem here !
            hr = InitializeSinkWriter(&pWriter, &streamidx);
            if (SUCCEEDED(hr))
            {
                // more code would go here...
            }
            else {
                puts("InitializeSinkWriter failed"); // this is called
            }
            SafeRelease(&pWriter);
            MFShutdown();
        }
        CoUninitialize();
    }
    puts("Finished...");
}
This is a standard Windows 7 computer I am using, so if it only accepts WMV3 as the encoder or input type, does that mean I have to install codecs? This seems absurd, since popular formats like WMV1 and MPEG should already be installed, and Sherlock Codec Detective says they are.
There is no support in Windows Media Foundation for the codecs you are trying (even though some third-party software may report the availability of other codecs for other APIs).
See:
Supported Media Formats in Media Foundation - the Encoder column in the table under Video Codecs
Windows Media Video 9 Encoder - Output Formats - there is no WMV1 there
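If you want to verify this on your own machine, one option (a sketch of mine, not part of the question's code) is to ask Media Foundation which registered encoder MFTs can produce a given output subtype via MFTEnumEx; on a stock Windows 7 install this should report a WMV3/WVC1 encoder but none for WMV1:
#include <mfapi.h>
#include <mfidl.h>
#include <mftransform.h>
#pragma comment(lib, "mfplat.lib")

// Sketch: count the encoder MFTs that can output the given video subtype.
UINT32 CountEncodersFor(const GUID& subtype)
{
    MFT_REGISTER_TYPE_INFO outInfo = { MFMediaType_Video, subtype };
    IMFActivate** ppActivate = NULL;
    UINT32 count = 0;
    HRESULT hr = MFTEnumEx(MFT_CATEGORY_VIDEO_ENCODER,
                           MFT_ENUM_FLAG_ALL,
                           NULL,      // any input type
                           &outInfo,  // must be able to output this subtype
                           &ppActivate, &count);
    if (SUCCEEDED(hr))
    {
        for (UINT32 i = 0; i < count; i++)
            ppActivate[i]->Release();
        CoTaskMemFree(ppActivate);
    }
    return SUCCEEDED(hr) ? count : 0;
}

// e.g. after MFStartup:
//   printf("WMV3 encoders: %u\n", CountEncodersFor(MFVideoFormat_WMV3));
//   printf("WMV1 encoders: %u\n", CountEncodersFor(MFVideoFormat_WMV1));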

Capture a frame from video using DirectShow filters in C++

I have taken code from the net to capture a frame from a video file and modified it to capture all frames and store them as BMP images.
HRESULT GrabVideoBitmap(PCWSTR pszVideoFile)
{
    IGraphBuilder *pGraph = NULL;
    IMediaControl *pControl = NULL;
    IMediaEventEx *pEvent = NULL;
    IBaseFilter *pGrabberF = NULL;
    ISampleGrabber *pGrabber = NULL;
    IBaseFilter *pSourceF = NULL;
    IEnumPins *pEnum = NULL;
    IPin *pPin = NULL;
    IBaseFilter *pNullF = NULL;
    long evCode;
    wchar_t temp[10];
    wchar_t framename[50] = IMAGE_FILE_PATH; // L"D:\\sampleframe";
    BYTE *pBuffer = NULL;

    HRESULT hr = CoInitialize(NULL);
    if (FAILED(hr))
        return 0;

    hr = CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER,
        IID_PPV_ARGS(&pGraph));
    hr = pGraph->QueryInterface(IID_PPV_ARGS(&pControl));
    hr = pGraph->QueryInterface(IID_PPV_ARGS(&pEvent));

    // Create the Sample Grabber filter.
    hr = CoCreateInstance(CLSID_SampleGrabber, NULL, CLSCTX_INPROC_SERVER,
        IID_PPV_ARGS(&pGrabberF));
    hr = pGraph->AddFilter(pGrabberF, L"Sample Grabber");
    hr = pGrabberF->QueryInterface(IID_PPV_ARGS(&pGrabber));

    // Displays the metadata of the file
    DisplayFileInfo((wchar_t*)pszVideoFile); // to display video information

    AM_MEDIA_TYPE mt;
    ZeroMemory(&mt, sizeof(mt));
    mt.majortype = MEDIATYPE_Video;
    mt.subtype = MEDIASUBTYPE_RGB24;
    hr = pGrabber->SetMediaType(&mt);

    hr = pGraph->AddSourceFilter(pszVideoFile, L"Source", &pSourceF);
    hr = pSourceF->EnumPins(&pEnum);
    while (S_OK == pEnum->Next(1, &pPin, NULL))
    {
        hr = ConnectFilters(pGraph, pPin, pGrabberF);
        SafeRelease(&pPin);
        if (SUCCEEDED(hr))
        {
            break;
        }
    }

    hr = CoCreateInstance(CLSID_NullRenderer, NULL, CLSCTX_INPROC_SERVER,
        IID_PPV_ARGS(&pNullF));
    hr = pGraph->AddFilter(pNullF, L"Null Filter");
    hr = ConnectFilters(pGraph, pGrabberF, pNullF);

    hr = pGrabber->SetOneShot(TRUE);
    hr = pGrabber->SetBufferSamples(TRUE);

    hr = pControl->Run();
    hr = pEvent->WaitForCompletion(INFINITE, &evCode);

    for (int i = 0; i < 10; i++)
    {
        // Find the required buffer size.
        long cbBuffer;
        hr = pGrabber->GetCurrentBuffer(&cbBuffer, NULL);
        pBuffer = (BYTE*)CoTaskMemAlloc(cbBuffer);
        hr = pGrabber->GetCurrentBuffer(&cbBuffer, (long*)pBuffer);
        hr = pGrabber->GetConnectedMediaType(&mt);

        // Examine the format block.
        if ((mt.formattype == FORMAT_VideoInfo) &&
            (mt.cbFormat >= sizeof(VIDEOINFOHEADER)) &&
            (mt.pbFormat != NULL))
        {
            swprintf(temp, 5, L"%d", i);
            wcscat_s(framename, temp);
            wcscat_s(framename, L".bmp");
            VIDEOINFOHEADER *pVih = (VIDEOINFOHEADER*)mt.pbFormat;
            hr = WriteBitmap((PCWSTR)framename, &pVih->bmiHeader,
                mt.cbFormat - SIZE_PREHEADER, pBuffer, cbBuffer);
            wcscpy_s(framename, IMAGE_FILE_PATH);
        }
        else
        {
            // Invalid format.
            hr = VFW_E_INVALIDMEDIATYPE;
        }
        FreeMediaType(mt);
    }

done:
    CoTaskMemFree(pBuffer);
    SafeRelease(&pPin);
    SafeRelease(&pEnum);
    SafeRelease(&pNullF);
    SafeRelease(&pSourceF);
    SafeRelease(&pGrabber);
    SafeRelease(&pGrabberF);
    SafeRelease(&pControl);
    SafeRelease(&pEvent);
    SafeRelease(&pGraph);
    return hr;
}
The input video file has 132 frames, but only 68 images are generated. Also, the last frame of the video is captured for the last 38 images.
I think the DirectShow graph is running continuously and WriteBitmap() is missing frames.
How do I get control in DirectShow to capture one frame, write it to a BMP file, then capture the next frame, and so capture all the frames as BMP images?
Thanks
Arun
Your approach is wrong. Currently, you set the sample grabber to one shot and after that you wait for graph completion, so it only works for capturing a single frame.
You need to capture the frames inside an ISampleGrabberCB callback on your pGrabber: implement the ISampleGrabberCB interface and use ISampleGrabber::SetCallback on your pGrabber filter to point it to your implementation. After that you can capture the frames inside either the SampleCB or BufferCB method.
http://www.infognition.com/blog/2013/accessing_raw_video_in_directshow.html
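Here is a minimal sketch of that callback approach (my illustration, assuming the ISampleGrabberCB declaration from qedit.h is available; the actual BMP writing would reuse your existing WriteBitmap helper):
// Sketch of an ISampleGrabberCB implementation that sees every frame.
class FrameGrabberCB : public ISampleGrabberCB
{
    LONG m_ref = 1;
    LONG m_frame = 0;
public:
    // Minimal IUnknown for a locally owned object.
    STDMETHODIMP QueryInterface(REFIID riid, void** ppv)
    {
        if (riid == IID_IUnknown || riid == IID_ISampleGrabberCB)
        {
            *ppv = this;
            AddRef();
            return S_OK;
        }
        *ppv = NULL;
        return E_NOINTERFACE;
    }
    STDMETHODIMP_(ULONG) AddRef()  { return InterlockedIncrement(&m_ref); }
    STDMETHODIMP_(ULONG) Release() { return InterlockedDecrement(&m_ref); }

    // Called once per media sample when SetCallback(..., 1) is used.
    STDMETHODIMP BufferCB(double SampleTime, BYTE* pBuffer, long BufferLen)
    {
        LONG n = InterlockedIncrement(&m_frame);
        // pBuffer holds BufferLen bytes of RGB24 for frame n;
        // write it out here, e.g. with the existing WriteBitmap helper.
        return S_OK;
    }
    STDMETHODIMP SampleCB(double SampleTime, IMediaSample* pSample)
    {
        return E_NOTIMPL; // not used in this sketch
    }
};

// Hooked up instead of SetOneShot/SetBufferSamples, before running the graph:
//   FrameGrabberCB cb;
//   hr = pGrabber->SetCallback(&cb, 1);   // 1 = call BufferCB for each sample
//   hr = pControl->Run();
//   hr = pEvent->WaitForCompletion(INFINITE, &evCode);
//   pGrabber->SetCallback(NULL, 1);       // detach before cb goes out of scope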