DirectShow CSourceStream::FillBuffer unpredictable number of calls after Pause and Seek to the first frame - c++

I have a DirectShow file source filter with audio and video frame output pins. It is written in C++ based on this tutorial on MSDN. My filter opens the video with the Medialooks MFormats SDK and provides raw data to the output pins. When rendered, both pins connect directly to renderer filters.
The problem occurs when I run the graph, pause the video, and seek to frame 0. After the call to the ChangeStart method of the output frame pin, FillBuffer is sometimes called three times, and frame 1 is shown on the screen instead of frame 0. When it is called twice, the correct frame (frame 0) is shown.
The output pins inherit from the CSourceStream and CSourceSeeking classes. Here are the FillBuffer and ChangeStart methods of the output frame pin:
FillBuffer Method
HRESULT frame_pin::FillBuffer(IMediaSample *sample)
{
CheckPointer(sample, E_POINTER);
BYTE *frame_buffer;
sample->GetPointer(&frame_buffer);
// Check if the downstream filter is changing the format.
CMediaType *mt;
HRESULT hr = sample->GetMediaType(reinterpret_cast<AM_MEDIA_TYPE**>(&mt));
if (hr == S_OK)
{
auto new_width = reinterpret_cast<VIDEOINFOHEADER2*>(mt->pbFormat)->bmiHeader.biWidth;
auto old_witdh = reinterpret_cast<VIDEOINFOHEADER2*>(m_mt.pbFormat)->bmiHeader.biWidth;
if(new_width != old_witdh)
format_changed_ = true;
SetMediaType(mt);
DeleteMediaType(mt);
}
ASSERT(m_mt.formattype == FORMAT_VideoInfo2);
VIDEOINFOHEADER2 *vih = reinterpret_cast<VIDEOINFOHEADER2*>(m_mt.pbFormat);
CComPtr<IMFFrame> mf_frame;
{
CAutoLock lock(&shared_state_);
if (source_time_ >= m_rtStop)
return S_FALSE;
// mf_reader_ is a member external SDK instance which gets the frame data with this function call
hr = mf_reader_->SourceFrameConvertedGetByNumber(&av_props_, frame_number_, -1, &mf_frame, CComBSTR(L""));
if (FAILED(hr))
return hr;
REFERENCE_TIME start, stop = 0;
start = stream_time_;
stop = static_cast<REFERENCE_TIME>(tc_.get_stop_time() / m_dRateSeeking);
sample->SetTime(&start, &stop);
stream_time_ = stop;
source_time_ += (stop - start);
frame_number_++;
}
if (format_changed_)
{
CComPtr<IMFFrame> mf_frame_resized;
mf_frame->MFResize(eMFCC_YUY2, std::abs(vih->bmiHeader.biWidth), std::abs(vih->bmiHeader.biHeight), 0, &mf_frame_resized, CComBSTR(L""), CComBSTR(L""));
mf_frame = mf_frame_resized;
}
MF_FRAME_INFO mf_frame_info;
mf_frame->MFAllGet(&mf_frame_info);
memcpy(frame_buffer, reinterpret_cast<BYTE*>(mf_frame_info.lpVideo), mf_frame_info.cbVideo);
sample->SetActualDataLength(static_cast<long>(mf_frame_info.cbVideo));
sample->SetSyncPoint(TRUE);
sample->SetPreroll(FALSE);
if (discontinuity_)
{
sample->SetDiscontinuity(TRUE);
discontinuity_ = FALSE;
}
return S_OK;
}
ChangeStart Method
HRESULT frame_pin::ChangeStart()
{
{
CAutoLock lock(CSourceSeeking::m_pLock);
tc_.reset();
stream_time_ = 0;
source_time_ = m_rtStart;
frame_number_ = static_cast<int>(m_rtStart / frame_lenght_);
}
update_from_seek();
return S_OK;
}

From the Microsoft DirectShow documentation:
The CSourceSeeking class is an abstract class for implementing
seeking in source filters with one output pin.
CSourceSeeking is not recommended for a filter with more than one
output pin. The main issue is that only one pin should respond to
seeking requests. Typically this requires communication among the pins
and the filter.
And you have two output pins in your source filter.
The CSourceSeeking class can be extended to manage more than one output pin with custom coding. Seek commands travel upstream from each renderer, so they will arrive at both of your output pins; you need to decide which pin controls seeking and ignore seek commands arriving at the other pin.
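A minimal sketch of that idea, written as portable C++ rather than against the DirectShow base classes. The `SeekCoordinator` class, the integer pin IDs, and the `generation()` counter are all hypothetical names introduced for illustration; in a real filter this state would live in the filter object and be called from each pin's seeking code.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical helper, not part of the DirectShow base classes.
using RefTime = std::int64_t;  // 100-ns units, like REFERENCE_TIME

// Shared by all output pins of one filter. Only the pin chosen as
// "master" is allowed to change the start position.
class SeekCoordinator {
public:
    explicit SeekCoordinator(int master_pin) : master_pin_(master_pin) {}

    // Returns true if the seek was applied, false if it was ignored
    // because it arrived on a pin that does not control seeking.
    bool request_seek(int pin_id, RefTime start) {
        assert(start >= 0);
        std::lock_guard<std::mutex> lock(mutex_);
        if (pin_id != master_pin_)
            return false;
        start_ = start;
        ++generation_;  // lets every pin notice that a new seek happened
        return true;
    }

    RefTime start() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return start_;
    }

    unsigned generation() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return generation_;
    }

private:
    mutable std::mutex mutex_;
    const int master_pin_;
    RefTime start_ = 0;
    unsigned generation_ = 0;
};
```

With a coordinator created as `SeekCoordinator seek(0)`, a seek arriving on pin 0 is applied once, and the same seek arriving on pin 1 is dropped, so both streams restart from a single agreed position.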

Related

Windows MFT (Media Foundation Transform) decoder not returning proper sample time or duration

To decode an H.264 stream with a Windows Media Foundation Transform (MFT), the workflow is currently something like this:
IMFSample sample;
sample->SetTime(time_in_ns);
sample->SetDuration(duration_in_ns);
sample->AddBuffer(buffer);
// Feed IMFSample to decoder
mDecoder->ProcessInput(0, sample, 0);
// Get output from decoder.
/* create outputsample that will receive content */ { ... }
MFT_OUTPUT_DATA_BUFFER output = {0};
output.pSample = outputsample;
DWORD status = 0;
HRESULT hr = mDecoder->ProcessOutput(0, 1, &output, &status);
if (output.pEvents) {
    // We must release this, as per the IMFTransform::ProcessOutput()
    // MSDN documentation.
    output.pEvents->Release();
    output.pEvents = nullptr;
}
if (hr == MF_E_TRANSFORM_STREAM_CHANGE) {
    // Type change, probably geometric aperture change.
    // Reconfigure decoder output type, so that GetOutputMediaType()
} else if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT) {
    // Not enough input to produce output.
} else if (!output.pSample) {
    return S_OK;
} else {
    // Process output
}
}
When we have fed all data to the MFT decoder, we must drain it:
mDecoder->ProcessMessage(MFT_MESSAGE_COMMAND_DRAIN, 0);
Now, one thing about the WMF H264 decoder is that it will typically not output anything before it has been fed more than 30 compressed H264 frames, regardless of the size of the H264 sliding window. Latency is very high...
I'm encountering an issue that is very troublesome.
I have a video made only of keyframes. It has only 15 frames, each 2 s long, and the first frame has a non-zero presentation time (this stream is from live content, so the first frame's time is typically in epoch time).
So without draining the decoder, nothing will come out of the decoder, as it hasn't received enough frames.
However, once the decoder is drained, the decoded frames will come out. HOWEVER, the MFT decoder has set all durations to only 33.6 ms, and the presentation time of the first sample coming out is always 0.
The original duration and presentation time have been lost.
If you provide over 30 frames to the h264 decoder, then both duration and pts are valid...
I haven't yet found a way to get the WMF decoder to output samples with the proper value.
It appears that if you have to drain the decoder before it has output any samples by itself, then it's totally broken...
Has anyone experienced such problems? How did you get around it?
Thank you in advance
Edit: a sample of the video is available on http://people.mozilla.org/~jyavenard/mediatest/fragmented/1301869.mp4
Playing this video with Firefox causes it to play extremely quickly due to the problems described above.
I'm not sure that your work flow is correct. I think you should do something like this:
do
{
    ...
    hr = mDecoder->ProcessInput(0, sample, 0);
    if(FAILED(hr))
        break;
    ...
    hr = mDecoder->ProcessOutput(0, 1, &output, &status);
    if(FAILED(hr) && hr != MF_E_TRANSFORM_NEED_MORE_INPUT)
        break;
}
while(hr == MF_E_TRANSFORM_NEED_MORE_INPUT);
if(SUCCEEDED(hr))
{
    // You have a valid decoded frame here
}
The idea is to keep calling ProcessInput/ProcessOutput while ProcessOutput returns MF_E_TRANSFORM_NEED_MORE_INPUT, which means that the decoder needs more input. I think that with this loop you won't need to drain the decoder.
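The feed-and-pull pattern can be tried out without Media Foundation. The sketch below uses a stand-in `MockDecoder` (hypothetical, not an MFT) that holds back a fixed number of frames to imitate decoder latency; `Frame`, `decode_all`, and the `latency` parameter are likewise made up for illustration. Note that this sketch still drains at end of stream, since a decoder that buffers more frames than the stream contains has nothing to return before that, which is the situation the question describes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

struct Frame {
    std::int64_t pts;       // presentation time
    std::int64_t duration;  // frame duration
};

// Stand-in for a decoder with internal latency (hypothetical, not an MFT).
class MockDecoder {
public:
    explicit MockDecoder(std::size_t latency) : latency_(latency) {}

    void process_input(const Frame& f) { queue_.push_back(f); }

    // Returns false when more input is needed (the analogue of
    // MF_E_TRANSFORM_NEED_MORE_INPUT), true when `out` holds a frame.
    bool process_output(Frame& out) {
        if (queue_.empty())
            return false;
        if (!draining_ && queue_.size() <= latency_)
            return false;
        out = queue_.front();
        queue_.pop_front();
        return true;
    }

    void drain() { draining_ = true; }

private:
    std::size_t latency_;
    bool draining_ = false;
    std::deque<Frame> queue_;
};

// Feed one input, then pull outputs until the decoder asks for more input.
std::vector<Frame> decode_all(MockDecoder& dec, const std::vector<Frame>& input) {
    std::vector<Frame> decoded;
    Frame f{};
    for (const Frame& in : input) {
        dec.process_input(in);
        while (dec.process_output(f))
            decoded.push_back(f);
    }
    dec.drain();  // end of stream: flush whatever is still buffered
    while (dec.process_output(f))
        decoded.push_back(f);
    return decoded;
}
```

In this mock the timestamps pass through untouched, so it only illustrates the control flow of the loop, not the timestamp behaviour of the real decoder.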

Media Foundation Webcam live capture freezes in low light condition

We are building video communication software. We are using Media Foundation to obtain the live stream. We use the IMFSourceReader to perform the capture.
The sequence of calls looks like:
hr = pAttributes->SetString(MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_SYMBOLIC_LINK, m_pwszSymbolicLink);
hr = MFCreateDeviceSourceActivate(pAttributes, &avdevice);
hr = avdevice->ActivateObject(__uuidof(IMFMediaSource), (void**) &m_mediaSource);
hr = m_mediaSource->CreatePresentationDescriptor(&pPD);
hr = pPD->GetStreamDescriptorByIndex(m_streamIdx, &fSelected, &pSD);
// we select the best native MediaType enumerating the source reader
hr = pHandler->SetCurrentMediaType(m_bestNativeType);
hr = pAttributes->SetUINT32(MF_READWRITE_DISABLE_CONVERTERS, FALSE);
hr = pAttributes->SetUINT32(MF_SOURCE_READER_ENABLE_ADVANCED_VIDEO_PROCESSING, TRUE);
hr = MFCreateSourceReaderFromMediaSource(m_mediaSource, pAttributes, &m_reader);
Then we start to read the frame SYNCHRONOUSLY in a separate thread using
m_reader->ReadSample()
When we need to stop the device or reconfigure it, we stop the thread (by setting a flag and exiting the thread). We then call the following:
hr = m_mediaSource->Stop();
m_mediaSource->Shutdown();
SafeRelease(&m_mediaSource);
SafeRelease(&m_reader);
The software can be out of a call. In that state, it captures the webcam video in VGA format and displays it on screen. In a call, it selects the best capture format depending on the negotiated call quality and restarts the capture.
The issues that we are experiencing are the following: some cameras sometimes freeze in low light conditions (low fps output). It can happen right away at the beginning of the call or during the call.
When it freezes, one of two things can happen (not sure which one):
m_reader->ReadSample() fails repeatedly with the MF_E_OPERATION_CANCELLED error code
m_reader->ReadSample() returns very often, producing more than 80 frames per second, all showing the same frozen image.
When we hang up, the device is reconfigured back to VGA capture and works fine.
Has anyone struggled with the same issue in Media Foundation?
You wrote that the web camera "freezes", that is, it produces a low frame rate while capturing images in low light. This happens because the camera's controller, in automatic mode, takes more time to expose the sensor: it improves image quality by increasing the frame duration. So it is a feature of the hardware. It is possible to switch this camera parameter from automatic mode to manual mode:
ResultCode::Result VideoCaptureDevice::setParametrs(CamParametrs parametrs){
ResultCode::Result result = ResultCode::VIDEOCAPTUREDEVICE_SETPARAMETRS_ERROR;
if(pLocalSource)
{
unsigned int shift = sizeof(Parametr);
Parametr *pParametr = (Parametr *)(&settings);
Parametr *pPrevParametr = (Parametr *)(&prevParametrs);
CComPtrCustom<IAMVideoProcAmp> pProcAmp;
HRESULT hr = pLocalSource->QueryInterface(IID_PPV_ARGS(&pProcAmp));
if (SUCCEEDED(hr))
{
for(unsigned int i = 0; i < 10; i++)
{
if(pPrevParametr[i].CurrentValue != pParametr[i].CurrentValue || pPrevParametr[i].Flag != pParametr[i].Flag)
hr = pProcAmp->Set(VideoProcAmp_Brightness + i, pParametr[i].CurrentValue, pParametr[i].Flag);
}
}
else
{
result = ResultCode::VIDEOCAPTUREDEVICE_SETPARAMETRS_SETVIDEOPROCESSOR_ERROR;
goto finish;
}
CComPtrCustom<IAMCameraControl> pProcControl;
hr = pLocalSource->QueryInterface(IID_PPV_ARGS(&pProcControl));
if (SUCCEEDED(hr))
{
for(unsigned int i = 0; i < 7; i++)
{
if(pPrevParametr[10 + i].CurrentValue != pParametr[10 + i].CurrentValue || pPrevParametr[10 + i].Flag != pParametr[10 + i].Flag)
hr = pProcControl->Set(CameraControl_Pan+i, pParametr[10 + i].CurrentValue, pParametr[10 + i].Flag);
}
}
else
{
result = ResultCode::VIDEOCAPTUREDEVICE_SETPARAMETRS_SETVIDEOCONTROL_ERROR;
goto finish;
}
result = ResultCode::OK;
prevParametrs = parametrs.settings;
}finish:
if(result != ResultCode::OK)
DebugPrintOut::getInstance().printOut(L"VIDEO CAPTURE DEVICE: Parametrs of video device cannot be set!!!\n");
return result;
}
where:
struct Parametr
{
long CurrentValue;
long Min;
long Max;
long Step;
long Default;
long Flag;
Parametr();
};
struct CamParametrs
{
Parametr Brightness;
Parametr Contrast;
Parametr Hue;
Parametr Saturation;
Parametr Sharpness;
Parametr Gamma;
Parametr ColorEnable;
Parametr WhiteBalance;
Parametr BacklightCompensation;
Parametr Gain;
Parametr Pan;
Parametr Tilt;
Parametr Roll;
Parametr Zoom;
Parametr Exposure;
Parametr Iris;
Parametr Focus;
};
You can find more code on the site:
Capturing Live-video from Web-camera on Windows 7 and Windows 8
However, using IMFSourceReader may not be effective here. The Media Foundation model uses asynchronous interaction: after sending a request to the media source, the code must listen for the media source's response, which carries a new frame or some other information. Calling m_reader->ReadSample() directly cannot be effective for this, as you have found. ReadSample() can work well for reading frames from a video file, where the delay can be very low, but for a web camera I would advise using a topology with session binding, as in my code in Capturing Live-video from Web-camera on Windows 7 and Windows 8
Regards,
Evgeny Pereguda
The question description leaves the impression that you do things in a somewhat chaotic way, and the resulting freeze is not necessarily caused by Media Foundation or the camera.
Using a media source and a source reader is certainly the right way to access a camera, and it provides an efficient way to capture video, both synchronously and asynchronously.
However, your incomplete code snippets show that you create a media source, then a source reader, and then keep dealing with the media source directly. Well, you are not supposed to do this. Once you have created a source reader, it will manage the media source for you: you don't need the Stop and Shutdown calls. Calling those and other methods yourself might cause confusion that results in incorrect source reader behavior.
That is, either you deal with a media source directly, or you plug it into a Media Session or Source Reader and use this higher-level API.
Also note that if/when you experience a freeze, you will want to break in with a debugger and locate the threads that indicate where the freeze occurs.

Add Enhanced Video Renderer Stream

I have a DirectShow program that uses the EVR. I would like to add another video stream that basically inserts a picture-in-picture box over the main video stream, but I can't quite figure out how to do it:
// When this is called, the graph is already running with the EVR
// displaying a web cam in stream 0
HRESULT CVideoControl::AddVideoStream(wchar_t* file)
{
HRESULT hr;
CComPtr<IMFMediaSink> sink;
CComPtr<IMFStreamSink> stream;
//hr = pEVR->QueryInterface(__uuidof(IMFMediaSink), (void **) &sink); <- FAILS
hr = MFCreateVideoRenderer(__uuidof(IMFMediaSink), (void **) &sink);
hr = sink->AddStreamSink(1234, NULL, &stream);
CComPtr<IMFGetService> service;
hr = pEVR->QueryInterface(&service);
CComPtr<IMFVideoMixerControl> mixer;
hr = service->GetService(MR_VIDEO_MIXER_SERVICE, IID_PPV_ARGS(&mixer));
MFVideoNormalizedRect rect = { .25, .25, .5, .5 };
hr = mixer->SetStreamOutputRect(1234, &rect);
hr = m_pGraph->RenderFile(file, NULL);
return hr;
}
Everything returns S_OK except the SetStreamOutputRect, which returns "The stream number provided was invalid."
I'm also dubious about the MFCreateVideoRenderer call, as this is a DirectShow program, not a Media Foundation one.
I'm pretty sure I am way oversimplifying this, but can't find much documentation on this. Any suggestions?
https://msdn.microsoft.com/en-us/library/windows/desktop/aa965247(v=vs.85).aspx
In a DirectShow program you need to create the EVR with CoCreateInstance and then use its IEVRFilterConfig interface, as explained in the link above:
The EVR filter starts with one input pin, which corresponds to the reference stream. To add pins for substreams, query the filter for the IEVRFilterConfig interface and call IEVRFilterConfig::SetNumberOfStreams. Call this method before connecting any input pins. Pin 0 is always the reference stream. Connect this pin before any other pins, because the format of the reference stream might limit which substream formats are available.

Video Renderer Filter rejects sample

Currently my filter just forwards data from one input pin to a renderer-filter. I am testing it in graphstudio.
Now, everything seems to work just fine, except that in the Deliver method of my output pin the call to the connected input pin returns a sample-rejected error code (VFW_E_SAMPLE_REJECTED, 0x8004022B).
According to MSDN this can happen if one of the following is true:
The pin is flushing (see Flushing).
The pin is not connected.
The filter is stopped.
Some other error occurred
I don't think the first one is true: the pin can't be flushing for all input samples.
The second one cannot be true because the filters have been connected.
The third one is unlikely: why would the filter be stopped?
So I think it must be some other error, but I couldn't find much helpful information.
HRESULT MCMyOutputPin::Deliver(IMediaSample* sample)
{
HRESULT hr = NO_ERROR;
myLogger->LogDebug("In Outputpin Deliver", L"D:\\TEMP\\yc.log");
if (sample->GetActualDataLength() > 0)
{
hr = m_pInputPin->Receive(sample);
sample->AddRef();
}
return hr;
//Forward to filter
}
As you can see, I made sure to use the IMemAllocator provided by the input pin:
HRESULT MCMyOutputPin::DecideAllocator(IMemInputPin *pPin, IMemAllocator **ppAlloc)
{
ALLOCATOR_PROPERTIES *pprops = new ALLOCATOR_PROPERTIES;
/*HRESULT hr = pPin->GetAllocatorRequirements(pprops);
if (FAILED(hr))*/
//return hr;
HRESULT hr = pPin->GetAllocator(ppAlloc);
if (hr == VFW_E_NO_ALLOCATOR)
{
hr = InitAllocator(ppAlloc);
if (FAILED(hr))
return hr;
}
hr = DecideBufferSize(*ppAlloc, pprops);
if (FAILED(hr))
return hr;
hr = pPin->NotifyAllocator(*ppAlloc, TRUE);
if (FAILED(hr))
{
return hr;
}
*ppAlloc = m_pAllocator;
m_pAllocator->AddRef();
return hr;
}
Here is where I get the sample in my input pin from the preceding filter:
HRESULT CMyInputPin::Receive(IMediaSample *pSample)
{
mylogger->LogDebug("In Inputpin Receive", L"D:\\TEMP\\yc.log");
//Forward to filter
filter->acceptFilterInput(pinname, pSample);
return S_OK;
}
This calls acceptFilterInput in my filter:
void MyFilter::acceptFilterInput(LPCWSTR pinname, IMediaSample* sample)
{
//samplesPin0.insert(samplesPin0.end(), sample);
mylogger->LogDebug("In acceptFIlterInput", L"D:\\TEMP\\yc.log");
outpin->Deliver(sample);
}
The Deliver method is already posted above.
So many questions asked recently, and you still don't ask them the right way. Here is the checklist to check your questions against before posting.
You have a rejection? What is the error code, then?
Video renderers are picky about their input, for performance reasons. So if you are connecting to a video renderer, you have to do everything correctly. Even if you can cut corners with other filters, that does not work with video renderers.
My guess is that you are ignoring the rule that media samples on a pin connection have to belong to the agreed allocator. The VMR will only accept samples from its own allocator (effectively backed by video surfaces). One does not simply "forward" a media sample from an input pin, which belongs to another allocator, to the VMR's input. My best bet is that this is the problem you are having. You have to copy the data instead of passing the media sample pointer between pins (or you have to propagate the VMR's allocator, which is a pretty advanced task).
Additionally, the VMR/EVR have specific requirements for video stride. Since I see a direct connection between the VMR and your filter, I suspect you might be ignoring them, in which case you will face this problem later, but you can start reading MSDN right away: Handling Format Changes from the Video Renderer.
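Both points (copying the data rather than forwarding the sample, and respecting the renderer's stride) come down to a row-by-row copy between two buffers whose strides differ. Here is a minimal portable sketch; `copy_with_stride` and its parameters are hypothetical names, and in a real filter the destination pointer would come from a sample obtained from the renderer's allocator, with the destination stride typically taken from the media type the renderer proposes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `rows` rows of `row_bytes` bytes each from `src` to `dst`.
// Each buffer has its own stride (distance in bytes between row starts),
// so padding at the end of destination rows is left untouched.
void copy_with_stride(std::uint8_t* dst, std::ptrdiff_t dst_stride,
                      const std::uint8_t* src, std::ptrdiff_t src_stride,
                      std::size_t row_bytes, std::size_t rows) {
    assert(dst != nullptr && src != nullptr);
    for (std::size_t y = 0; y < rows; ++y) {
        std::memcpy(dst, src, row_bytes);
        dst += dst_stride;
        src += src_stride;
    }
}
```

A single `memcpy` of the whole frame is only equivalent when both strides equal `row_bytes`; when the renderer's surface is wider than the image, the per-row copy is what keeps the picture from appearing skewed.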

AVI created with AVIStreamWrite has incorrect length and playback speed

I'm trying to write to an AVI file using AVIStreamWrite, but the resulting AVI file is a bit messed up. The images in the AVI contain the proper image and colors, but the duration and speed of the video are off. I recorded a video that should have been around 7 seconds, and looking at the file properties in Windows Explorer it showed a duration of about 2 seconds. When I played it in Media Player it was too short and seemed to be playing very rapidly (motion in the video was like fast forward). I also can't seem to seek within the video using Media Player.
Here is what I'm doing...
//initialization
HRESULT AVIWriter::Init()
{
HRESULT hr = S_OK;
_hAVIFile = NULL;
_videoStream = NULL;
_frameCount = 0;
AVIFileInit();
::DeleteFileW(_filename);
hr = AVIFileOpen(&_hAVIFile,_filename,OF_WRITE|OF_CREATE, NULL);
if (hr != AVIERR_OK)
{
::cout << "AVI ERROR";
return 0;
}
/**************************************/
// Create a raw video stream in the file
::ZeroMemory(&_streamInfo, sizeof(_streamInfo));
_streamInfo.fccType = streamtypeVIDEO; // stream type
_streamInfo.fccHandler = 0; // No compressor
_streamInfo.dwScale = 1;
_streamInfo.dwRate = _aviFps; //this is 30
_streamInfo.dwSuggestedBufferSize = 0;
_streamInfo.dwSampleSize = 0;
SetRect( &_streamInfo.rcFrame, 0, 0,_bmi.biWidth , _bmi.biHeight );
hr = AVIFileCreateStream( _hAVIFile, // file pointer
&_videoStream,// returned stream pointer
&_streamInfo); // stream header
hr = AVIStreamSetFormat(_videoStream, 0,
&_bmi,
sizeof(_bmi));
return hr;
}
//call this when I receive a frame from my camera
HRESULT AVIWriter::AddFrameToAVI(BYTE* buffer)
{
HRESULT hr;
long size = _bmi.biHeight * _bmi.biWidth * 3;
hr = AVIStreamWrite(_videoStream, // stream pointer
_frameCount++, // time of this frame
1, // number to write
buffer, // pointer to data
size,// size of this frame
AVIIF_KEYFRAME, // flags....
NULL,
NULL);
return hr;
}
//call this when I am done
void AVIWriter::CloseAVI()
{
AVIStreamClose(_videoStream);
AVIFileClose(_hAVIFile);
AVIFileExit();
}
Now as a test I used DirectShow's GraphEdit to create a graph consisting of a VideoCapture Filter for this same camera and an AVI mux and created an avi file. The resulting AVI file was fine. The frame rate was 30 fps, the same that I am using. I queried both avi files (my 'bad' one and the 'good' one created with GraphEdit) using a call to AVIStreamInfo and the stream info was pretty much the same for both files. I would have expected either the samples per second or number of frames to be way off for my 'bad' avi but it wasn't. Am I doing something wrong that would cause my AVI to have the incorrect length and seem to play back at an increased speed?? I'm new to using VFW so any help is appreciated. Thanks
The frame time in the file will eventually be _frameCount / _aviFps. So either you are dropping frames and they don't reach AVIStreamWrite, or, if you prefer to skip a few frames in the file, you need to increment _frameCount accordingly so that it jumps over the skipped frames.
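One way to keep the sample position in step with real time is to derive it from the capture timestamp instead of counting calls. The sketch below is an assumption about how that could look, not code from the question: `frame_index_for` is a hypothetical helper, `elapsed_ms` is the time since the first frame, and `rate` and `scale` mirror the stream's dwRate and dwScale.

```cpp
#include <cassert>
#include <cstdint>

// Maps the time elapsed since the first captured frame to a sample
// position for a stream that plays at rate/scale frames per second.
// The result is rounded to the nearest frame slot.
std::int64_t frame_index_for(std::int64_t elapsed_ms,
                             std::uint32_t rate, std::uint32_t scale) {
    assert(scale != 0 && elapsed_ms >= 0);
    const std::int64_t r = rate;
    const std::int64_t s = scale;
    // index = elapsed_seconds * rate / scale
    return (elapsed_ms * r + 500 * s) / (1000 * s);
}
```

With dwRate = 30 and dwScale = 1, a frame that arrives 7000 ms after the first one lands at position 210, so the file spans 7 seconds even if far fewer frames were actually written. For comparison, a 2-second file at 30 fps holds about 60 frames, which suggests that only around 60 frames reached AVIStreamWrite during the 7-second recording.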