Windows Media Player DSP Plugin Format Negotiation - c++

I am writing an audio DSP plugin for Windows Media Player, with the plugin acting as a DMO. I am trying to get WMP to send me the audio data as mono 22.05 kHz audio. However, no matter what I do, the player resamples all audio to stereo 44.1 kHz data. Even if the file I'm playing is a 22.05 kHz wave file, I still get 44.1 kHz audio in my plugin.
I specify the data my plugin can handle via the GetInputType/GetOutputType functions, but no matter what I do, by the time SetInputType/SetOutputType is called the format is back to 44.1 kHz. Does anyone have an idea of what is happening? I tried writing ValidateMediaType to only accept the sample rate I want, but then I just get no data at all. My GetInputType function is below:
STDMETHODIMP CWMPIPSpeaker::GetInputType (
    DWORD dwInputStreamIndex,
    DWORD dwTypeIndex,
    DMO_MEDIA_TYPE *pmt)
{
    HRESULT hr = S_OK;
    if ( 0 != dwInputStreamIndex )
    {
        return DMO_E_INVALIDSTREAMINDEX ;
    }
    // only support one preferred type
    if ( 0 != dwTypeIndex )
    {
        return DMO_E_NO_MORE_ITEMS;
    }
    if ( NULL == pmt )
    {
        return E_POINTER;
    }
    // allocate the format block; bail out if the allocation fails
    hr = MoInitMediaType(pmt, sizeof( WAVEFORMATEX ) );
    if ( FAILED( hr ) )
    {
        return hr;
    }
    // describe 16-bit mono PCM at 22.05 kHz
    WAVEFORMATEX* format = ((WAVEFORMATEX*)pmt->pbFormat);
    format->nChannels = 1;
    format->nSamplesPerSec = 22050;
    format->wFormatTag = WAVE_FORMAT_PCM;
    format->wBitsPerSample = 16;
    format->cbSize = 0;
    format->nBlockAlign = (format->nChannels * format->wBitsPerSample) / 8;
    format->nAvgBytesPerSec = format->nBlockAlign * format->nSamplesPerSec;
    pmt->formattype = FORMAT_WaveFormatEx;
    pmt->lSampleSize = format->nBlockAlign;
    pmt->bFixedSizeSamples = TRUE;
    pmt->majortype = MEDIATYPE_Audio;
    pmt->subtype = MEDIASUBTYPE_PCM;
    return hr;
}

Well, unfortunately it appears the problem isn't me. I'm archiving this here for future reference because of all the trouble this issue caused me. I found a detailed report on the problem on an MSDN blog, and it appears that in Vista and later you cannot negotiate media types for DMO plugins, by design. I can't say I agree with this decision, but it means that I must do the conversion myself if I want down-sampled data.
Hopefully this helps anyone else who runs into this "feature".
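For anyone who ends up doing the conversion in the plugin, here is a minimal sketch of one way to do it. It assumes the input is interleaved 16-bit stereo PCM at 44.1 kHz and that an exact 2:1 rate reduction is wanted. The function name StereoToHalfRateMono is made up for illustration and is not part of any WMP or DMO API. Averaging two consecutive frames is only a crude box filter, so a real plugin would probably want a proper low-pass filter before decimating.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert interleaved 16-bit stereo PCM to mono at half
// the sample rate (e.g. 44.1 kHz stereo -> 22.05 kHz mono).
// `frames` is the number of stereo frames (one left + one right sample each).
// Each output sample is the average of two consecutive stereo frames, i.e. of
// four input samples. A trailing odd frame is dropped.
std::vector<std::int16_t> StereoToHalfRateMono(const std::int16_t* in,
                                               std::size_t frames)
{
    std::vector<std::int16_t> out;
    out.reserve(frames / 2);
    for (std::size_t f = 0; f + 1 < frames; f += 2) {
        const std::int16_t* p = in + f * 2;  // L0, R0, L1, R1
        // Sum in 32 bits so four 16-bit samples cannot overflow.
        const std::int32_t sum = std::int32_t(p[0]) + p[1] + p[2] + p[3];
        out.push_back(static_cast<std::int16_t>(sum / 4));
    }
    return out;
}
```

In a DMO this would typically be applied to the input buffer before the rest of the processing, and the output would have to be converted back to the negotiated format before it is handed to the player.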

Related

DirectShow CSourceStream::FillBuffer unpredictable number of calls after Pause and Seek to the first frame

I have a DirectShow file source filter which has audio and frame output pins. It is written in C++ based on this tutorial on MSDN. My filter opens the video using the Medialooks MFormats SDK and provides raw data to the output pins. When the pins are rendered, both connect directly to renderer filters.
The problem occurs when I run the graph, pause the video and seek to frame number 0. After a call to the ChangeStart method of the output frame pin, FillBuffer is sometimes called three times, and frame 1 is shown on the screen instead of frame 0. When it is called two times, the correct frame (frame 0) is shown.
The output pins inherit from the CSourceStream and CSourceSeeking classes. Here are the FillBuffer and ChangeStart methods of the output frame pin:
FillBuffer Method
HRESULT frame_pin::FillBuffer(IMediaSample *sample)
{
    CheckPointer(sample, E_POINTER);
    BYTE *frame_buffer;
    sample->GetPointer(&frame_buffer);
    // Check if the downstream filter is changing the format.
    CMediaType *mt;
    HRESULT hr = sample->GetMediaType(reinterpret_cast<AM_MEDIA_TYPE**>(&mt));
    if (hr == S_OK)
    {
        auto new_width = reinterpret_cast<VIDEOINFOHEADER2*>(mt->pbFormat)->bmiHeader.biWidth;
        auto old_width = reinterpret_cast<VIDEOINFOHEADER2*>(m_mt.pbFormat)->bmiHeader.biWidth;
        if(new_width != old_width)
            format_changed_ = true;
        SetMediaType(mt);
        DeleteMediaType(mt);
    }
    ASSERT(m_mt.formattype == FORMAT_VideoInfo2);
    VIDEOINFOHEADER2 *vih = reinterpret_cast<VIDEOINFOHEADER2*>(m_mt.pbFormat);
    CComPtr<IMFFrame> mf_frame;
    {
        CAutoLock lock(&shared_state_);
        if (source_time_ >= m_rtStop)
            return S_FALSE;
        // mf_reader_ is a member external SDK instance which gets the frame data with this function call
        hr = mf_reader_->SourceFrameConvertedGetByNumber(&av_props_, frame_number_, -1, &mf_frame, CComBSTR(L""));
        if (FAILED(hr))
            return hr;
        REFERENCE_TIME start, stop = 0;
        start = stream_time_;
        stop = static_cast<REFERENCE_TIME>(tc_.get_stop_time() / m_dRateSeeking);
        sample->SetTime(&start, &stop);
        stream_time_ = stop;
        source_time_ += (stop - start);
        frame_number_++;
    }
    if (format_changed_)
    {
        CComPtr<IMFFrame> mf_frame_resized;
        mf_frame->MFResize(eMFCC_YUY2, std::abs(vih->bmiHeader.biWidth), std::abs(vih->bmiHeader.biHeight), 0, &mf_frame_resized, CComBSTR(L""), CComBSTR(L""));
        mf_frame = mf_frame_resized;
    }
    MF_FRAME_INFO mf_frame_info;
    mf_frame->MFAllGet(&mf_frame_info);
    memcpy(frame_buffer, reinterpret_cast<BYTE*>(mf_frame_info.lpVideo), mf_frame_info.cbVideo);
    sample->SetActualDataLength(static_cast<long>(mf_frame_info.cbVideo));
    sample->SetSyncPoint(TRUE);
    sample->SetPreroll(FALSE);
    if (discontinuity_)
    {
        sample->SetDiscontinuity(TRUE);
        discontinuity_ = FALSE;
    }
    return S_OK;
}
ChangeStart Method
HRESULT frame_pin::ChangeStart()
{
    {
        CAutoLock lock(CSourceSeeking::m_pLock);
        tc_.reset();
        stream_time_ = 0;
        source_time_ = m_rtStart;
        frame_number_ = static_cast<int>(m_rtStart / frame_lenght_);
    }
    update_from_seek();
    return S_OK;
}
From the Microsoft DirectShow documentation:
The CSourceSeeking class is an abstract class for implementing
seeking in source filters with one output pin.
CSourceSeeking is not recommended for a filter with more than one
output pin. The main issue is that only one pin should respond to
seeking requests. Typically this requires communication among the pins
and the filter.
And you have two output pins in your source filter.
The CSourceSeeking class can be extended to manage more than one output pin with custom coding. When seek commands come in, they arrive at both output pins, so you need to decide which pin controls seeking and ignore the seek commands arriving at the other pin.
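As a rough illustration of "one pin controls seeking", here is a minimal sketch of a helper that both pins could share. SeekArbiter, its methods and the integer pin ids are all made up for this example and are not DirectShow classes. It also leaves out the locking that a real filter would need, since seek calls and the streaming threads run concurrently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper shared by all output pins of the source filter.
// Only the designated controlling pin is allowed to apply a seek; requests
// arriving at any other pin are reported as "ignore".
class SeekArbiter {
public:
    explicit SeekArbiter(int controlling_pin)
        : controlling_pin_(controlling_pin) {}

    // Returns true if the calling pin should apply the seek (and the new
    // start position is recorded), false if the request must be ignored.
    bool ShouldHandleSeek(int pin_id, std::int64_t start)
    {
        if (pin_id != controlling_pin_)
            return false;
        start_ = start;
        ++seek_count_;
        return true;
    }

    std::int64_t start() const { return start_; }
    int seek_count() const { return seek_count_; }

private:
    int controlling_pin_;
    std::int64_t start_ = 0;   // last accepted start position (100 ns units)
    int seek_count_ = 0;       // number of seeks actually applied
};
```

With something like this, each pin's ChangeStart would first ask the arbiter whether it should act. The controlling pin would then reposition the shared reader once, and the other pin would pick up the recorded start position instead of seeking again.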

Media Foundation Webcam live capture freezes in low light condition

We are building video communication software. We are using Media Foundation to obtain the live stream, and we use IMFSourceReader to perform the capture.
The sequence of calls looks like this:
hr = pAttributes->SetString(MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_SYMBOLIC_LINK, m_pwszSymbolicLink);
hr = MFCreateDeviceSourceActivate(pAttributes, &avdevice);
hr = avdevice->ActivateObject(__uuidof(IMFMediaSource), (void**) &m_mediaSource);
hr = m_mediaSource->CreatePresentationDescriptor(&pPD);
hr = pPD->GetStreamDescriptorByIndex(m_streamIdx, &fSelected, &pSD);
// we select the best native MediaType enumerating the source reader
hr = pHandler->SetCurrentMediaType(m_bestNativeType);
hr = pAttributes->SetUINT32(MF_READWRITE_DISABLE_CONVERTERS, FALSE);
hr = pAttributes->SetUINT32(MF_SOURCE_READER_ENABLE_ADVANCED_VIDEO_PROCESSING, TRUE);
hr = MFCreateSourceReaderFromMediaSource(m_mediaSource, pAttributes, &m_reader);
Then we start to read the frame SYNCHRONOUSLY in a separate thread using
m_reader->ReadSample()
When we need to stop the device or reconfigure it, we stop the thread (by setting a flag and exiting the thread). Then we call the following:
hr = m_mediaSource->Stop();
m_mediaSource->Shutdown();
SafeRelease(&m_mediaSource);
SafeRelease(&m_reader);
The software can be out of a call. In that case it captures the webcam video in VGA format and displays it on screen. In a call, it selects the best capture format depending on the negotiated call quality and restarts the capture.
The issue we are experiencing is the following: some cameras sometimes freeze in low light conditions (low fps output). It can happen right at the beginning of the call or during the call.
When it freezes, one of two things can happen (we are not sure which one):
m_reader->ReadSample() fails repeatedly with the MF_E_OPERATION_CANCELLED error code
m_reader->ReadSample() returns very often, producing more than 80 frames per second, all showing the same frozen image.
When we hang up, the device is reconfigured back to VGA capture and works fine.
Has anyone struggled with the same issue in Media Foundation?
You wrote that the web camera "freezes", i.e. it produces a low frame rate while capturing in low light. This happens because the camera controller, in automatic mode, takes more time to expose the sensor: it improves image quality by increasing the frame duration. So it is a feature of the hardware. It is possible to switch this camera parameter from automatic mode to manual mode:
Code::Result VideoCaptureDevice::setParametrs(CamParametrs parametrs){
    ResultCode::Result result = ResultCode::VIDEOCAPTUREDEVICE_SETPARAMETRS_ERROR;
    if(pLocalSource)
    {
        unsigned int shift = sizeof(Parametr);
        Parametr *pParametr = (Parametr *)(&settings);
        Parametr *pPrevParametr = (Parametr *)(&prevParametrs);
        CComPtrCustom<IAMVideoProcAmp> pProcAmp;
        HRESULT hr = pLocalSource->QueryInterface(IID_PPV_ARGS(&pProcAmp));
        if (SUCCEEDED(hr))
        {
            for(unsigned int i = 0; i < 10; i++)
            {
                if(pPrevParametr[i].CurrentValue != pParametr[i].CurrentValue || pPrevParametr[i].Flag != pParametr[i].Flag)
                    hr = pProcAmp->Set(VideoProcAmp_Brightness + i, pParametr[i].CurrentValue, pParametr[i].Flag);
            }
        }
        else
        {
            result = ResultCode::VIDEOCAPTUREDEVICE_SETPARAMETRS_SETVIDEOPROCESSOR_ERROR;
            goto finish;
        }
        CComPtrCustom<IAMCameraControl> pProcControl;
        hr = pLocalSource->QueryInterface(IID_PPV_ARGS(&pProcControl));
        if (SUCCEEDED(hr))
        {
            for(unsigned int i = 0; i < 7; i++)
            {
                if(pPrevParametr[10 + i].CurrentValue != pParametr[10 + i].CurrentValue || pPrevParametr[10 + i].Flag != pParametr[10 + i].Flag)
                    hr = pProcControl->Set(CameraControl_Pan+i, pParametr[10 + i].CurrentValue, pParametr[10 + i].Flag);
            }
        }
        else
        {
            result = ResultCode::VIDEOCAPTUREDEVICE_SETPARAMETRS_SETVIDEOCONTROL_ERROR;
            goto finish;
        }
        result = ResultCode::OK;
        prevParametrs = parametrs.settings;
    }
finish:
    if(result != ResultCode::OK)
        DebugPrintOut::getInstance().printOut(L"VIDEO CAPTURE DEVICE: Parametrs of video device cannot be set!!!\n");
    return result;
}
where:
struct Parametr
{
    long CurrentValue;
    long Min;
    long Max;
    long Step;
    long Default;
    long Flag;
    Parametr();
};
struct CamParametrs
{
    Parametr Brightness;
    Parametr Contrast;
    Parametr Hue;
    Parametr Saturation;
    Parametr Sharpness;
    Parametr Gamma;
    Parametr ColorEnable;
    Parametr WhiteBalance;
    Parametr BacklightCompensation;
    Parametr Gain;
    Parametr Pan;
    Parametr Tilt;
    Parametr Roll;
    Parametr Zoom;
    Parametr Exposure;
    Parametr Iris;
    Parametr Focus;
};
You can find more code on this site:
Capturing Live-video from Web-camera on Windows 7 and Windows 8
However, using IMFSourceReader this way may not be effective. The Media Foundation model is asynchronous: after sending a request to the media source, the code must listen for the media source's response, which carries a new frame or some other information. Calling m_reader->ReadSample() directly can be ineffective, which is what you ran into. ReadSample() works well for reading frames from a video file, where the delay is very low, but for a web camera I advise using a topology bound to a session, as in my code: Capturing Live-video from Web-camera on Windows 7 and Windows 8
Regards,
Evgeny Pereguda
The question leaves the impression that you do things in a somewhat chaotic way, and the resulting freeze is not necessarily caused by Media Foundation or the camera.
Using a media source and a source reader is certainly the right way to access a camera, and it provides an efficient way to capture video, both synchronously and asynchronously.
However, your incomplete code snippets show that you create a media source, then a source reader, and then keep dealing with the media source directly. You are not supposed to do this. Once you have created a source reader, it manages the media source for you: you don't need the Stop and Shutdown calls. Calling those and other methods yourself might cause confusion that results in incorrect source reader behavior.
That is, either you deal with the media source directly, or you plug it into a Media Session or a Source Reader and use that higher-level API.
Also note that if and when you experience a freeze, you should break in with a debugger and locate the threads that show where the freeze occurs.

How can I select an audio input device and capture audio in directshow

I am using DirectShow to develop a program on Windows Embedded CE 6.0.
I write the program in C/C++.
The program needs to deal with multiple audio input devices.
I am able to get the available audio input devices in DirectShow,
but I don't know how to specify an input device and capture audio from it.
Is there any way to do it?
Thanks!
// First, create the audio capture filter
IBaseFilter * pDevice = NULL;
CoCreateInstance(CLSID_AudioCapture, NULL, CLSCTX_INPROC,IID_IBaseFilter, (void**)&pDevice);
// Then enumerate the pins to get the audio input names from the filter
IEnumPins * pinEnum = NULL;
IPin * pin = NULL;
ULONG fetchCount = 0;
PIN_INFO pinInfo;
pDevice->EnumPins(&pinEnum);
while (SUCCEEDED(pinEnum->Next(1, &pin, &fetchCount)) && fetchCount)
{
    pin->QueryPinInfo(&pinInfo);
    if (pinInfo.dir == PINDIR_INPUT)
    {
        // get name from pinInfo.achName
    }
    // QueryPinInfo adds a reference to the owning filter; release it
    if (pinInfo.pFilter)
        pinInfo.pFilter->Release();
    pin->Release();
}
pinEnum->Release();

Printing Raw Data in Terminal Server

Here is the scenario:
I have a Windows Server 2008 with Terminal Server (No Domain Controller, No join to Domain)
I have a client machine with Windows XP SP3 updated (.NET 3.0 SP1 and .NET 4.0)
I'm using Embarcadero C++Builder (BCB6)
I have a ticket printer (Thermal Printer, POS Printer, Epson, Zebra, etc.)
When I connect to the terminal server, the printer works OK. I tested printing a test page.
When I use my software to send the raw data in the terminal server on the local computer, I get this error:
Windows Presentation Foundation terminal server print W has encountered a
problem and needs to close. We are sorry for the inconvenience.
I followed the advice from this support page with no luck.
I used to print directly to LPT1:, but with Windows Server 2008 it's getting harder to make this work, so we have to change the way we print to this kind of printer.
Here is the code that I'm using. I tested it locally and it works fine, but on the terminal server it doesn't work:
bool TForm1::RawDataToPrinter(char* szPrinterName, char* lpData, unsigned int dwCount )
{
    HANDLE hPrinter;
    TDocInfo1 DocInfo;
    bool bStatus = false;
    int dwJob = 0;
    unsigned long dwBytesWritten = 0;
    // Open a handle to the printer.
    bStatus = OpenPrinter( szPrinterName, &hPrinter, NULL );
    if( bStatus )
    {
        // Fill in the structure with info about this "document."
        DocInfo.pDocName = "My Document";
        DocInfo.pOutputFile = NULL;
        DocInfo.pDatatype = "RAW";
        // Inform the spooler that the application will be sending document data to the printer.
        dwJob = StartDocPrinter( hPrinter, 1, (LPBYTE)&DocInfo );
        if ( dwJob > 0 )
        {
            // Start a page; keep the real result instead of forcing it to true.
            bStatus = StartPagePrinter( hPrinter );
            if( bStatus )
            {
                // Send the data to the printer.
                bStatus = WritePrinter( hPrinter, lpData, dwCount, &dwBytesWritten );
                EndPagePrinter ( hPrinter );
            }
            // Inform the spooler that the document is ending.
            EndDocPrinter( hPrinter );
        }
        // Close the printer handle.
        ClosePrinter( hPrinter );
    }
    // Check to see if correct number of bytes were written.
    if (!bStatus || (dwBytesWritten != dwCount))
        bStatus = false;
    else
        bStatus = true;
    return bStatus;
}
I copied this code from an example on Microsoft Support. I also tried changing "RAW" to "TEXT", but I get the same error.
I also tried this code, because it uses GDI to print:
long pageline;
char prueba[255];
Printer()->SetPrinter(ListBox1->Items->Strings[ListBox1->ItemIndex].c_str(), "WINSPOOL", "", NULL);
Printer()->BeginDoc();
pageline = 0;
while(pageline < Memo1->Lines->Count)
{
    Printer()->Canvas->TextOut(10, (10 + Printer()->Canvas->TextHeight("Hi! There")) * pageline, Memo1->Lines->Strings[pageline]);
    pageline++;
}
Printer()->EndDoc();
This is an example that I found in the Embarcadero forum.
I also checked TsWpfWrp.exe. I tried replacing it with the one from the server, but that does nothing: it no longer raises the error, but it doesn't send any data either.
Is there another way to do this? Do I have something wrong in the code?
I would appreciate any help or insight.
I found the problem: it is the Easy Print driver. In RAW mode it expects data that follows the XPS specification, but I was sending only text.
I disabled Easy Print to put the printer in fallback mode (or something like that), in which the terminal server first looks for the installed driver and only then for Easy Print (this can be verified in the advanced options of the printer properties).
Now it works, thanks.

Filling CMediaType and IMediaSample from AVPacket for h264 video

I have searched and found almost nothing, so I would really appreciate some help with my question.
I am writing a DirectShow source filter which uses libav to read h264 packets from a YouTube FLV file and send them downstream. But I can't find the appropriate libav structure fields to implement the filter's GetMediaType() and FillBuffer() correctly. Some libav fields are null. As a consequence, the h264 decoder crashes when it attempts to process the received data.
Where am I going wrong? In working with libav or with the DirectShow interfaces? Maybe h264 requires additional processing when working with libav, or I fill in the reference time incorrectly? Does anyone have links that are useful for writing a DirectShow h264 source filter with libav?
Part of GetMediaType():
VIDEOINFOHEADER *pvi = (VIDEOINFOHEADER*) toMediaType->AllocFormatBuffer(sizeof(VIDEOINFOHEADER));
pvi->AvgTimePerFrame = UNITS_PER_SECOND / m_pFormatContext->streams[m_streamNo]->codec->sample_rate; //sample_rate is 0
pvi->dwBitRate = m_pFormatContext->bit_rate;
pvi->rcSource = videoRect;
pvi->rcTarget = videoRect;
//Bitmap
pvi->bmiHeader.biSize = sizeof(BITMAPINFOHEADER);
pvi->bmiHeader.biWidth = videoRect.right;
pvi->bmiHeader.biHeight = videoRect.bottom;
pvi->bmiHeader.biPlanes = 1;
pvi->bmiHeader.biBitCount = m_pFormatContext->streams[m_streamNo]->codec->bits_per_raw_sample;//or should here be bits_per_coded_sample
pvi->bmiHeader.biCompression = FOURCC_H264;
pvi->bmiHeader.biSizeImage = GetBitmapSize(&pvi->bmiHeader);
Part of FillBuffer():
//Get buffer pointer
BYTE* pBuffer = NULL;
if (pSamp->GetPointer(&pBuffer) < 0)
    return S_FALSE;
//Get next packet
AVPacket* pPacket = m_mediaFile.getNextPacket();
if (pPacket->data == NULL)
    return S_FALSE;
//Check packet and buffer size
if (pSamp->GetSize() < pPacket->size)
    return S_FALSE;
//Copy from packet to sample buffer
memcpy(pBuffer, pPacket->data, pPacket->size);
//Set media sample time
REFERENCE_TIME start = m_mediaFile.timeStampToReferenceTime(pPacket->pts);
REFERENCE_TIME duration = m_mediaFile.timeStampToReferenceTime(pPacket->duration);
REFERENCE_TIME end = start + duration;
pSamp->SetTime(&start, &end);
pSamp->SetMediaTime(&start, &end);
P.S. I've debugged my filter with hax264 decoder and it crashes on call to libav deprecated function img_convert().
Here is the MSDN link you need to build a correct H.264 media type: H.264 Video Types
You have to fill the right fields with the right values.
The AM_MEDIA_TYPE should contain the right MEDIASUBTYPE for h264.
And these are plain wrong:
pvi->bmiHeader.biWidth = videoRect.right;
pvi->bmiHeader.biHeight = videoRect.bottom;
You should use a width and height that are independent of rcSource/rcTarget, because those rectangles are only indicators and may be completely zero if you take them from some other filter.
pvi->bmiHeader.biBitCount = m_pFormatContext->streams[m_streamNo]->codec->bits_per_raw_sample;//or should here be bits_per_coded_sample
This only makes sense if biWidth*biHeight*biBitCount/8 is the true size of the sample, which I doubt.
pvi->bmiHeader.biCompression = FOURCC_H264;
This must also be passed in the AM_MEDIA_TYPE in the subtype parameter.
pvi->bmiHeader.biSizeImage = GetBitmapSize(&pvi->bmiHeader);
This fails because the FOURCC is unknown to the function, and the bit count is plain wrong for this sample because it is not a full uncompressed frame.
You also have to look at how the data stream is handled by the downstream h264 filter. That seems to be flawed.
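One more detail from the question: AvgTimePerFrame is computed from codec->sample_rate, which is an audio field and is 0 for a video stream, so the division cannot give a usable value. A frame duration is normally derived from the stream's frame rate instead, and timestamps are converted using the stream's time base. The sketch below shows only that arithmetic in plain C++, with no libav or DirectShow types. The function names are made up, and it assumes the frame rate and time base are available as numerator/denominator pairs (libav exposes such rationals on the stream, for example as avg_frame_rate and time_base, but check this against the version you build with).

```cpp
#include <cassert>
#include <cstdint>

// DirectShow reference time is expressed in 100 ns units.
constexpr std::int64_t kUnitsPerSecond = 10000000;

// Hypothetical helper: duration of one frame in 100 ns units for a frame rate
// of fps_num / fps_den frames per second. Returns 0 if the rate is unknown.
std::int64_t AvgTimePerFrame(std::int64_t fps_num, std::int64_t fps_den)
{
    if (fps_num <= 0 || fps_den <= 0)
        return 0;
    return kUnitsPerSecond * fps_den / fps_num;
}

// Hypothetical helper: convert a timestamp counted in units of
// tb_num / tb_den seconds into 100 ns units. Returns 0 for an invalid
// time base. Very large timestamps could overflow the intermediate product.
std::int64_t ToReferenceTime(std::int64_t ts, std::int64_t tb_num,
                             std::int64_t tb_den)
{
    if (tb_den <= 0)
        return 0;
    return ts * tb_num * kUnitsPerSecond / tb_den;
}
```

For example, a 25 fps stream gives a frame duration of 400000 units, and a timestamp of 1000 in a 1/1000 time base corresponds to one second, i.e. 10000000 units.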