Detect end of video with IMediaSeeking - C++

I am playing a video to capture some screenshots using DirectShow.
I am doing this in a loop by calling IMediaControl->Run, IVMRWindowlessControl->GetCurrentImage, and then IMediaSeeking->SetPositions.
The problem is that I cannot detect when the video is over. IMediaSeeking->SetPositions always returns the same value (S_FALSE). IMediaControl->Run also always returns S_FALSE. I have also tried IMediaEvent->GetEvent after the call to IMediaControl->Run to check for EC_COMPLETE, but it always returns EC_CLOCK_CHANGED instead.
How can I detect the end of the video? Thanks
UPDATE: Doing something like
long eventCode = 0;
LONG_PTR ptrParam1 = 0;
LONG_PTR ptrParam2 = 0;
long timeoutMs = INFINITE;
while (SUCCEEDED(pEvent->GetEvent(&eventCode, &ptrParam1, &ptrParam2, timeoutMs)))
{
    if (eventCode == EC_COMPLETE)
    {
        break;
    }
    // Free the event data.
    hr = pEvent->FreeEventParams(eventCode, ptrParam1, ptrParam2);
    if (FAILED(hr))
    {
        break;
    }
}
blocks after a few events: 0x53 (EC_VMR_RENDERDEVICE_SET), 0x0D (EC_CLOCK_CHANGED), 0x0E (EC_PAUSED); the next call to GetEvent blocks while the video is rendered (played frame by frame) in my IVideoWindow.

You should indeed be using IMediaEvent->GetEvent; however, note that you will be receiving various events, not only EC_CLOCK_CHANGED. Keep receiving and you will eventually get EC_COMPLETE. Step 6: Handle Graph Events on MSDN explains this in detail.
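For reference, a minimal sketch of such a loop, assuming pEvent is the graph's IMediaEvent (the finite timeout is an assumption that keeps the loop from blocking forever on an empty queue):
// Sketch: wait for EC_COMPLETE, draining all other events.
bool completed = false;
while (!completed)
{
    long evCode = 0;
    LONG_PTR param1 = 0, param2 = 0;
    // 100 ms timeout; GetEvent returns E_ABORT when it times out.
    HRESULT hr = pEvent->GetEvent(&evCode, &param1, &param2, 100);
    if (hr == E_ABORT)
        continue; // no event yet; keep waiting (or do other work here)
    if (FAILED(hr))
        break;
    if (evCode == EC_COMPLETE)
        completed = true;
    // Always free the event parameters, including for EC_COMPLETE.
    pEvent->FreeEventParams(evCode, param1, param2);
}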

Check the state of the filter graph with IMediaControl::GetState and see whether it is stopped. You can also get the duration of the video from IMediaSeeking::GetDuration, which you may find helpful.
Another option is to use event signaling; this event processing can be moved off the main thread.
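A minimal sketch of the event-signaling approach, assuming pEvent is a valid IMediaEvent* for the running graph:
// Sketch: off-thread waiting on the graph's event handle.
void WaitForCompletion(IMediaEvent* pEvent)
{
    OAEVENT oaEvent = 0;
    if (FAILED(pEvent->GetEventHandle(&oaEvent)))
        return;
    HANDLE hGraphEvent = reinterpret_cast<HANDLE>(oaEvent);
    for (;;)
    {
        // The handle is signaled while events are queued.
        WaitForSingleObject(hGraphEvent, INFINITE);
        long evCode = 0;
        LONG_PTR p1 = 0, p2 = 0;
        while (pEvent->GetEvent(&evCode, &p1, &p2, 0) == S_OK)
        {
            const bool done = (evCode == EC_COMPLETE);
            pEvent->FreeEventParams(evCode, p1, p2);
            if (done)
                return; // end of stream reached
        }
    }
}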

Related

IMFTransform::ProcessInput() and MF_E_TRANSFORM_NEED_MORE_INPUT

I have code that decodes AAC-encoded audio using IMFTransform. It works well for various test inputs, but I observed that in some cases IMFTransform::ProcessOutput() returns MF_E_TRANSFORM_NEED_MORE_INPUT when, according to my reading of the MS documentation, it should return a valid data sample.
Basically the code has the following structure:
IMFTransform* transformer;
MFT_OUTPUT_DATA_BUFFER output_data_buffer;
...
bool try_to_get_output = false;
for (;;) {
    if (try_to_get_output) {
        // Try to get the output sample.
        try_to_get_output = false;
        output_data_buffer.dwStatus = 0;
        ...
        hr = transformer->ProcessOutput(...&output_data_buffer);
        if (success) {
            // process sample
            if (output_data_buffer.dwStatus & MFT_OUTPUT_DATA_BUFFER_INCOMPLETE) {
                // We have more data
                try_to_get_output = true;
            }
        } else if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT) {
            Log("Unnecessary ProcessOutput()");
        } else {
            // Process other errors
        }
        continue;
    }
    // Send more encoded AAC data to the MFT.
    hr = transformer->ProcessInput(...);
}
What happens is that ProcessOutput() sets MFT_OUTPUT_DATA_BUFFER_INCOMPLETE in MFT_OUTPUT_DATA_BUFFER.dwStatus, but the following ProcessOutput() then always returns MF_E_TRANSFORM_NEED_MORE_INPUT, contradicting the documentation.
Again, so far it seems harmless and things work. But then what exactly does the AAC decoder want to tell the caller by setting MFT_OUTPUT_DATA_BUFFER_INCOMPLETE?
This might be a small glitch in the decoder implementation. It is quite possible that if you happened to drain the MFT it would spit out some data, so the incomplete flag might indicate, a bit confusingly, that there is some data even though it is not immediately accessible.
However, the overall idea is to keep calling ProcessOutput, pulling output data for as long as possible until you get MF_E_TRANSFORM_NEED_MORE_INPUT, and then proceed with feeding new input (or draining). That is, I would say MF_E_TRANSFORM_NEED_MORE_INPUT is much more important than MFT_OUTPUT_DATA_BUFFER_INCOMPLETE. After all, this is what Microsoft's own code built on top of MFTs does.
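A minimal sketch of that pattern, reusing the question's transformer variable (GetNextInputSample is a hypothetical helper standing in for your input queue):
// Sketch: always exhaust the MFT's output before feeding more input.
bool draining = false;
for (;;) {
    MFT_OUTPUT_DATA_BUFFER out = {};
    DWORD status = 0;
    // ... allocate out.pSample here if the MFT does not allocate its own
    HRESULT hr = transformer->ProcessOutput(0, 1, &out, &status);
    if (SUCCEEDED(hr)) {
        // Consume out.pSample, release out.pEvents if set, then loop again;
        // keep pulling regardless of MFT_OUTPUT_DATA_BUFFER_INCOMPLETE.
        continue;
    }
    if (hr != MF_E_TRANSFORM_NEED_MORE_INPUT)
        break; // real error (or MF_E_TRANSFORM_STREAM_CHANGE to handle)
    if (draining)
        break; // drained and nothing left: done
    IMFSample* in = GetNextInputSample(); // hypothetical helper
    if (!in) {
        // Out of input: drain so the decoder flushes buffered samples.
        transformer->ProcessMessage(MFT_MESSAGE_COMMAND_DRAIN, 0);
        draining = true;
        continue;
    }
    hr = transformer->ProcessInput(0, in, 0);
    in->Release();
    if (FAILED(hr))
        break;
}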
Also keep in mind that the AAC decoder is an "old", "first generation" MFT, so over the years its updates may have diverged a bit from the current docs.

WIN API - Program gets stuck while a button is selected

I am creating simple software using WINAPI that reads data from a sensor connected to a computer via USB.
In this software, I am implementing some functions like read mode, test mode, etc.
The problem I am facing is that the program gets stuck when I select the button for continuous reading; the code follows below:
case WM_COMMAND:
    switch (wp)
    {
    case START_BUTTON:
        printf("START_BUTTON");
        while (SendDlgItemMessage(hWnd, START_BUTTON, BM_GETCHECK, TRUE, 0) == BST_CHECKED)
        {
            char* var = USB_Read();   // Get data from the sensor
            SetWindowText(hLux, var); // Display the data
            if (SendDlgItemMessage(hWnd, START_BUTTON, BM_GETCHECK, TRUE, 0) != BST_CHECKED) // Check if the button is no longer selected
                break;
        }
        break;
    }
    break;
I know that the problem is in the while loop: when I press the button, the whole program gets stuck. The data is displayed correctly, but the other controls appear frozen.
The question is: how can I display the data continuously and still have access to the other controls at the same time?
You have to create a thread of execution that reads the USB while Start is checked.
So we create a thread, started during program initialization, which runs continuously and reads the USB whenever it finds the button checked.
Now in the message loop you simply check or uncheck the button.
DWORD WINAPI ThreadFunction(LPVOID lpParam)
{
    (void)lpParam; // silence the unused-parameter warning
    while (TRUE)   // Once created, the thread runs forever
    {
        // If the button is checked, read the USB on each iteration
        if (SendDlgItemMessage(hWnd, START_BUTTON, BM_GETCHECK, 0, 0) == BST_CHECKED)
        {
            char* var = USB_Read();   // Get data from the sensor
            SetWindowText(hLux, var); // Display the data
            Sleep(1);                 // Why? To avoid pegging the CPU
        }
    }
}
.....
// WinMain
DWORD dwThreadId; // thread ID in case you'll need it
// Create and start the thread
CreateThread(
    NULL,           // default security attributes
    0,              // use default stack size
    ThreadFunction, // thread function name
    NULL,           // argument to thread function
    0,              // use default creation flags
    &dwThreadId);   // returns the thread identifier
......
case WM_COMMAND:
    switch (wp)
    {
    case START_BUTTON:
        printf("START_BUTTON");
        if (SendDlgItemMessage(hWnd, START_BUTTON, BM_GETCHECK, 0, 0) == BST_CHECKED)
            SendDlgItemMessage(hWnd, START_BUTTON, BM_SETCHECK, BST_UNCHECKED, 0);
        else
            SendDlgItemMessage(hWnd, START_BUTTON, BM_SETCHECK, BST_CHECKED, 0);
        break;
    }
    break;
EDIT: I modified the program to check/uncheck the radio button.
Please note the use of the Sleep function with a minimal value of 1 ms. It gives control back to the OS and smooths CPU usage. If the function that reads the USB already blocks on enough OS synchronization primitives, it can be omitted (check CPU usage).
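One caveat about the sketch above: calling SetWindowText from the worker thread implies a cross-thread SendMessage under the hood. If you would rather keep all UI updates on the main thread, a common variant is to post the data to the window instead. This is a minimal sketch, assuming a private WM_APP-based message (not part of the original answer) and that the buffer returned by USB_Read stays valid until the UI thread consumes it:
#define WM_NEW_SENSOR_DATA (WM_APP + 1) // hypothetical private message

// Worker thread: hand the string over to the UI thread.
char* var = USB_Read();
PostMessage(hWnd, WM_NEW_SENSOR_DATA, 0, (LPARAM)var);

// Window procedure: the UI thread owns all control updates.
case WM_NEW_SENSOR_DATA:
    SetWindowText(hLux, (char*)lParam);
    break;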

DirectShow CSourceStream::FillBuffer unpredictable number of calls after Pause and Seek to the first frame

I have a DirectShow file source filter which has audio and frame output pins. It is written in C++ based on this tutorial on MSDN. My filter opens the video using the Medialooks MFormats SDK and provides raw data to the output pins. The two pins connect directly to renderer filters when they are rendered.
The problem occurs when I run the graph, pause the video, and seek to frame number 0. After a call to the ChangeStart method in the output frame pin, sometimes FillBuffer is called three times and frame 1 is shown on the screen instead of frame 0. When it is called twice, it shows the correct frame, frame 0.
The output pins inherit from the CSourceStream and CSourceSeeking classes. Here are my FillBuffer and ChangeStart methods of the output frame pin:
FillBuffer Method
HRESULT frame_pin::FillBuffer(IMediaSample *sample)
{
    CheckPointer(sample, E_POINTER);
    BYTE *frame_buffer;
    sample->GetPointer(&frame_buffer);
    // Check if the downstream filter is changing the format.
    CMediaType *mt;
    HRESULT hr = sample->GetMediaType(reinterpret_cast<AM_MEDIA_TYPE**>(&mt));
    if (hr == S_OK)
    {
        auto new_width = reinterpret_cast<VIDEOINFOHEADER2*>(mt->pbFormat)->bmiHeader.biWidth;
        auto old_width = reinterpret_cast<VIDEOINFOHEADER2*>(m_mt.pbFormat)->bmiHeader.biWidth;
        if (new_width != old_width)
            format_changed_ = true;
        SetMediaType(mt);
        DeleteMediaType(mt);
    }
    ASSERT(m_mt.formattype == FORMAT_VideoInfo2);
    VIDEOINFOHEADER2 *vih = reinterpret_cast<VIDEOINFOHEADER2*>(m_mt.pbFormat);
    CComPtr<IMFFrame> mf_frame;
    {
        CAutoLock lock(&shared_state_);
        if (source_time_ >= m_rtStop)
            return S_FALSE;
        // mf_reader_ is a member instance of the external SDK which gets the frame data with this call
        hr = mf_reader_->SourceFrameConvertedGetByNumber(&av_props_, frame_number_, -1, &mf_frame, CComBSTR(L""));
        if (FAILED(hr))
            return hr;
        REFERENCE_TIME start, stop = 0;
        start = stream_time_;
        stop = static_cast<REFERENCE_TIME>(tc_.get_stop_time() / m_dRateSeeking);
        sample->SetTime(&start, &stop);
        stream_time_ = stop;
        source_time_ += (stop - start);
        frame_number_++;
    }
    if (format_changed_)
    {
        CComPtr<IMFFrame> mf_frame_resized;
        mf_frame->MFResize(eMFCC_YUY2, std::abs(vih->bmiHeader.biWidth), std::abs(vih->bmiHeader.biHeight), 0, &mf_frame_resized, CComBSTR(L""), CComBSTR(L""));
        mf_frame = mf_frame_resized;
    }
    MF_FRAME_INFO mf_frame_info;
    mf_frame->MFAllGet(&mf_frame_info);
    memcpy(frame_buffer, reinterpret_cast<BYTE*>(mf_frame_info.lpVideo), mf_frame_info.cbVideo);
    sample->SetActualDataLength(static_cast<long>(mf_frame_info.cbVideo));
    sample->SetSyncPoint(TRUE);
    sample->SetPreroll(FALSE);
    if (discontinuity_)
    {
        sample->SetDiscontinuity(TRUE);
        discontinuity_ = FALSE;
    }
    return S_OK;
}
ChangeStart Method
HRESULT frame_pin::ChangeStart()
{
    {
        CAutoLock lock(CSourceSeeking::m_pLock);
        tc_.reset();
        stream_time_ = 0;
        source_time_ = m_rtStart;
        frame_number_ = static_cast<int>(m_rtStart / frame_lenght_);
    }
    update_from_seek();
    return S_OK;
}
From the Microsoft DirectShow documentation:
The CSourceSeeking class is an abstract class for implementing seeking in source filters with one output pin.
CSourceSeeking is not recommended for a filter with more than one output pin. The main issue is that only one pin should respond to seeking requests. Typically this requires communication among the pins and the filter.
And you have two output pins in your source filter.
The CSourceSeeking class can be extended to manage more than one output pin with custom coding. Seek commands will arrive through both output pins, so you will need to decide which pin controls seeking and ignore seek commands arriving at the other pin.
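A minimal sketch of that idea, using the ChangeStart method from the question (is_seeking_pin_ is an illustrative flag you would set on the pin designated as the seeking controller when the pins are created):
// Sketch: only the controlling pin acts on seek requests.
HRESULT frame_pin::ChangeStart()
{
    if (!is_seeking_pin_)
        return S_OK; // ignore seeks arriving at the secondary pin
    {
        CAutoLock lock(CSourceSeeking::m_pLock);
        tc_.reset();
        stream_time_ = 0;
        source_time_ = m_rtStart;
        frame_number_ = static_cast<int>(m_rtStart / frame_lenght_);
    }
    // The controlling pin should also propagate the new position to the
    // sibling pin (through the filter) before flushing both pins.
    update_from_seek();
    return S_OK;
}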

Windows MFT (Media Foundation Transform) decoder not returning proper sample time or duration

To decode an H264 stream with the Windows Media Foundation Transform, the workflow is currently something like this:
IMFSample* sample;
sample->SetTime(time_in_ns);
sample->SetDuration(duration_in_ns);
sample->AddBuffer(buffer);

// Feed the IMFSample to the decoder
mDecoder->ProcessInput(0, sample, 0);

// Get output from the decoder.
/* create outputsample that will receive content */ { ... }
MFT_OUTPUT_DATA_BUFFER output = {0};
output.pSample = outputsample;
DWORD status = 0;
HRESULT hr = mDecoder->ProcessOutput(0, 1, &output, &status);
if (output.pEvents) {
    // We must release this, as per the IMFTransform::ProcessOutput()
    // MSDN documentation.
    output.pEvents->Release();
    output.pEvents = nullptr;
}
if (hr == MF_E_TRANSFORM_STREAM_CHANGE) {
    // Type change, probably geometric aperture change.
    // Reconfigure the decoder output type so that GetOutputMediaType()
    // reflects the new format.
} else if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT) {
    // Not enough input to produce output.
} else if (!output.pSample) {
    return S_OK;
} else {
    // Process output
}
When we have fed all data to the MFT decoder, we must drain it:
mDecoder->ProcessMessage(MFT_MESSAGE_COMMAND_DRAIN, 0);
Now, one thing about the WMF H264 decoder is that it typically will not output anything before it has been fed over 30 compressed H264 frames, regardless of the size of the H264 sliding window. The latency is very high...
I'm encountering an issue that is very troublesome.
With a video made only of keyframes and containing only 15 frames, each 2 s long, the first frame has a non-zero presentation time (the stream comes from live content, so the first frame's timestamp is typically epoch-based).
So without draining the decoder, nothing comes out of it, as it hasn't received enough frames.
However, once the decoder is drained, the decoded frames do come out. HOWEVER, the MFT decoder has set all durations to only 33.6 ms, and the presentation time of the first sample coming out is always 0.
The original duration and presentation time have been lost.
If you provide over 30 frames to the h264 decoder, then both duration and pts are valid...
I haven't yet found a way to get the WMF decoder to output samples with the proper value.
It appears that if you have to drain the decoder before it has output any samples by itself, then it's totally broken...
Has anyone experienced such problems? How did you get around it?
Thank you in advance
Edit: a sample of the video is available on http://people.mozilla.org/~jyavenard/mediatest/fragmented/1301869.mp4
Playing this video with Firefox causes it to play extremely quickly due to the problems described above.
I'm not sure that your workflow is correct. I think you should do something like this:
do
{
    ...
    hr = mDecoder->ProcessInput(0, sample, 0);
    if (FAILED(hr))
        break;
    ...
    hr = mDecoder->ProcessOutput(0, 1, &output, &status);
    if (FAILED(hr) && hr != MF_E_TRANSFORM_NEED_MORE_INPUT)
        break;
} while (hr == MF_E_TRANSFORM_NEED_MORE_INPUT);

if (SUCCEEDED(hr))
{
    // You have a valid decoded frame here
}
The idea is to keep calling ProcessInput/ProcessOutput while ProcessOutput returns MF_E_TRANSFORM_NEED_MORE_INPUT, which means the decoder needs more input. I think that with this loop you won't need to drain the decoder.
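If the input does run out before the decoder produces anything (as with the 15-frame stream in the question), the drain message from the question is still the way to flush it. A sketch reusing the question's names (mDecoder, outputsample):
// Sketch: after the last input sample, drain and collect the remaining frames.
hr = mDecoder->ProcessMessage(MFT_MESSAGE_COMMAND_DRAIN, 0);
do
{
    MFT_OUTPUT_DATA_BUFFER output = {0};
    DWORD status = 0;
    output.pSample = outputsample; // reuse or reallocate per iteration
    hr = mDecoder->ProcessOutput(0, 1, &output, &status);
    if (output.pEvents)
    {
        output.pEvents->Release();
        output.pEvents = nullptr;
    }
    // Each SUCCEEDED(hr) iteration yields one decoded frame.
} while (SUCCEEDED(hr));
// The loop ends with MF_E_TRANSFORM_NEED_MORE_INPUT once fully drained.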

Media Foundation Webcam live capture freezes in low light condition

We are building video communication software. We are using Media Foundation to obtain the live stream, and we use IMFSourceReader to perform the capture.
The sequence of calls looks like:
hr = pAttributes->SetString(MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_SYMBOLIC_LINK, m_pwszSymbolicLink);
hr = MFCreateDeviceSourceActivate(pAttributes, &avdevice);
hr = avdevice->ActivateObject(__uuidof(IMFMediaSource), (void**) &m_mediaSource);
hr = m_mediaSource->CreatePresentationDescriptor(&pPD);
hr = pPD->GetStreamDescriptorByIndex(m_streamIdx, &fSelected, &pSD);
// we select the best native media type by enumerating the source reader
hr = pHandler->SetCurrentMediaType(m_bestNativeType);
hr = pAttributes->SetUINT32(MF_READWRITE_DISABLE_CONVERTERS, FALSE);
hr = pAttributes->SetUINT32(MF_SOURCE_READER_ENABLE_ADVANCED_VIDEO_PROCESSING, TRUE);
hr = MFCreateSourceReaderFromMediaSource(m_mediaSource, pAttributes, &m_reader);
Then we start to read frames synchronously in a separate thread using
m_reader->ReadSample()
When we need to stop the device or reconfigure it, we stop the thread (by setting a flag and exiting it), then call the following:
hr = m_mediaSource->Stop();
m_mediaSource->Shutdown();
SafeRelease(&m_mediaSource);
SafeRelease(&m_reader);
The software can be out of a call: there, it captures the webcam video in VGA format and displays it on screen. In a call, it selects the best capture format depending on the negotiated call quality and restarts the capture.
The issue we are experiencing is the following: some cameras sometimes freeze in low-light conditions (low fps output). It can happen right away at the beginning of the call or during the call.
When it freezes, one of two things happens (we are not sure which one):
m_reader->ReadSample() fails repeatedly with the MF_E_OPERATION_CANCELLED error code
m_reader->ReadSample() returns frequently, producing more than 80 frames per second of the same frozen image
When we hang up, the device is reconfigured back to VGA capture and works fine.
Has anyone struggled with the same issue in Media Foundation?
You wrote that the web camera "freezes", i.e. produces a low frame rate while capturing in low-light conditions. The cause is that in automatic mode the camera controller takes more time to expose the sensor: increasing the frame duration improves image quality. So it is a deliberate feature of the hardware. It is possible to switch this behavior of the camera from automatic mode to manual control of the parameter:
ResultCode::Result VideoCaptureDevice::setParametrs(CamParametrs parametrs){
    ResultCode::Result result = ResultCode::VIDEOCAPTUREDEVICE_SETPARAMETRS_ERROR;
    if(pLocalSource)
    {
        unsigned int shift = sizeof(Parametr);
        Parametr *pParametr = (Parametr *)(&settings);
        Parametr *pPrevParametr = (Parametr *)(&prevParametrs);
        CComPtrCustom<IAMVideoProcAmp> pProcAmp;
        HRESULT hr = pLocalSource->QueryInterface(IID_PPV_ARGS(&pProcAmp));
        if (SUCCEEDED(hr))
        {
            for(unsigned int i = 0; i < 10; i++)
            {
                if(pPrevParametr[i].CurrentValue != pParametr[i].CurrentValue || pPrevParametr[i].Flag != pParametr[i].Flag)
                    hr = pProcAmp->Set(VideoProcAmp_Brightness + i, pParametr[i].CurrentValue, pParametr[i].Flag);
            }
        }
        else
        {
            result = ResultCode::VIDEOCAPTUREDEVICE_SETPARAMETRS_SETVIDEOPROCESSOR_ERROR;
            goto finish;
        }
        CComPtrCustom<IAMCameraControl> pProcControl;
        hr = pLocalSource->QueryInterface(IID_PPV_ARGS(&pProcControl));
        if (SUCCEEDED(hr))
        {
            for(unsigned int i = 0; i < 7; i++)
            {
                if(pPrevParametr[10 + i].CurrentValue != pParametr[10 + i].CurrentValue || pPrevParametr[10 + i].Flag != pParametr[10 + i].Flag)
                    hr = pProcControl->Set(CameraControl_Pan + i, pParametr[10 + i].CurrentValue, pParametr[10 + i].Flag);
            }
        }
        else
        {
            result = ResultCode::VIDEOCAPTUREDEVICE_SETPARAMETRS_SETVIDEOCONTROL_ERROR;
            goto finish;
        }
        result = ResultCode::OK;
        prevParametrs = parametrs.settings;
    }
finish:
    if(result != ResultCode::OK)
        DebugPrintOut::getInstance().printOut(L"VIDEO CAPTURE DEVICE: Parametrs of video device cannot be set!!!\n");
    return result;
}
where:
struct Parametr
{
    long CurrentValue;
    long Min;
    long Max;
    long Step;
    long Default;
    long Flag;
    Parametr();
};

struct CamParametrs
{
    Parametr Brightness;
    Parametr Contrast;
    Parametr Hue;
    Parametr Saturation;
    Parametr Sharpness;
    Parametr Gamma;
    Parametr ColorEnable;
    Parametr WhiteBalance;
    Parametr BacklightCompensation;
    Parametr Gain;
    Parametr Pan;
    Parametr Tilt;
    Parametr Roll;
    Parametr Zoom;
    Parametr Exposure;
    Parametr Iris;
    Parametr Focus;
};
You can find more code on the site:
Capturing Live-video from Web-camera on Windows 7 and Windows 8
However, using IMFSourceReader may not be effective. The Media Foundation model uses asynchronous interaction: after sending a request to the media source, the code must listen for the media source's response with a new frame or some other info. Directly calling m_reader->ReadSample() cannot be effective, which is what you ran into. m_reader->ReadSample() can be effective when reading frames from a video file, where the delay can be very low, but for a web camera I advise using a topology and session binding, as in my code at Capturing Live-video from Web-camera on Windows 7 and Windows 8.
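For completeness, here is a minimal sketch of the asynchronous source reader route (the ReaderCallback class is illustrative; the MF_SOURCE_READER_ASYNC_CALLBACK attribute and the IMFSourceReaderCallback interface are the standard Media Foundation mechanism):
// Sketch: asynchronous capture with IMFSourceReaderCallback.
class ReaderCallback : public IMFSourceReaderCallback
{
    // IUnknown (QueryInterface/AddRef/Release) omitted for brevity.
public:
    STDMETHODIMP OnReadSample(HRESULT hrStatus, DWORD streamIndex,
                              DWORD streamFlags, LONGLONG timestamp,
                              IMFSample* sample)
    {
        if (SUCCEEDED(hrStatus) && sample)
        {
            // Process the captured frame here.
        }
        // Request the next frame; no thread ever blocks in ReadSample.
        m_reader->ReadSample(MF_SOURCE_READER_FIRST_VIDEO_STREAM,
                             0, nullptr, nullptr, nullptr, nullptr);
        return S_OK;
    }
    STDMETHODIMP OnFlush(DWORD) { return S_OK; }
    STDMETHODIMP OnEvent(DWORD, IMFMediaEvent*) { return S_OK; }
    IMFSourceReader* m_reader = nullptr;
};

// Before MFCreateSourceReaderFromMediaSource:
// pAttributes->SetUnknown(MF_SOURCE_READER_ASYNC_CALLBACK, pCallback);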
Regards,
Evgeny Pereguda
The question description leaves the impression that you are doing things in a somewhat chaotic way, and that the resulting freeze is not necessarily caused by Media Foundation or the camera.
Use of a media source and a source reader is certainly the right way to access a camera, and it provides an efficient way to capture video, both synchronously and asynchronously.
However, your incomplete code snippets show that you create a media source, then a source reader, and then keep dealing with the media source directly. You are not supposed to do that: once you have created the source reader, it manages the media source for you, and you don't need the Stop and Shutdown calls. Calling those and other methods on the source might confuse the source reader into incorrect behavior.
That is, either you deal with the media source yourself, or you plug it into a Media Session or Source Reader and use that higher-level API, as sketched below.
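Under that assumption, teardown becomes as simple as releasing the reader and dropping your own reference:
// Sketch: let the source reader shut the media source down.
SafeRelease(&m_reader);      // releasing the reader shuts down the source
SafeRelease(&m_mediaSource); // just drop the reference; no Stop/Shutdown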
Also note that if/when you experience a freeze, you will want to break in with a debugger and locate the threads whose call stacks indicate where the freeze occurs.