Media Foundation set video interlacing and decode - C++

I have an MOV file and I want to decode it and have all frames as separate images.
So I try to configure an uncompressed media type in the following way:
// configure the source reader
IMFSourceReader* m_pReader;
MFCreateSourceReaderFromURL(filePath, NULL, &m_pReader);
// get the compressed media type
IMFMediaType* pFileVideoMediaType;
m_pReader->GetCurrentMediaType(MF_SOURCE_READER_FIRST_VIDEO_STREAM, &pFileVideoMediaType);
// create new media type for uncompressed type
IMFMediaType* pTypeUncomp;
MFCreateMediaType(&pTypeUncomp);
// copy all settings from compressed to uncompressed type
pFileVideoMediaType->CopyAllItems(pTypeUncomp);
// set the uncompressed video attributes
pTypeUncomp->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB8);
pTypeUncomp->SetUINT32(MF_MT_ALL_SAMPLES_INDEPENDENT, TRUE);
pTypeUncomp->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive);
// set the new uncompressed type to source reader
m_pReader->SetCurrentMediaType(MF_SOURCE_READER_FIRST_VIDEO_STREAM, 0, pTypeUncomp);
// get the full uncompressed media type
m_pReader->GetCurrentMediaType(MF_SOURCE_READER_FIRST_VIDEO_STREAM, &pTypeUncomp);
I noticed that even though I explicitly set MF_MT_INTERLACE_MODE to MFVideoInterlace_Progressive, the final media type still reports the old mode, MFVideoInterlace_MixedInterlaceOrProgressive.
Afterwards, I loop through all samples and look at their size:
IMFSample* videoSample = nullptr;
IMFMediaBuffer* mbuffer = nullptr;
LONGLONG llTimeStamp;
DWORD streamIndex, flags;
m_pReader->ReadSample(
MF_SOURCE_READER_FIRST_VIDEO_STREAM,
0, // Flags.
&streamIndex, // Receives the actual stream index.
&flags, // Receives status flags.
&llTimeStamp, // Receives the time stamp.
&videoSample); // Receives the sample or NULL.
videoSample->ConvertToContiguousBuffer(&mbuffer);
BYTE* videoData = nullptr;
DWORD sampleBufferLength = 0;
mbuffer->Lock(&videoData, nullptr, &sampleBufferLength);
cout << sampleBufferLength << endl;
And I get quite different sizes for the samples: from 31 bytes up to 18,000 bytes.
Even changing the format to MFVideoFormat_RGB32 does not affect the sample sizes.
This question seems to describe the same issue, but its solution does not fix it for me.
Any help on why I can't change the interlace mode, and on how to properly decode video frames and get the image data out of the samples?
Many thanks in advance.

In order to make SourceReader convert the samples to RGB you need to create it like this:
IMFAttributes* pAttr = NULL;
MFCreateAttributes(&pAttr, 1);
pAttr->SetUINT32(MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS, TRUE);
pAttr->SetUINT32(MF_SOURCE_READER_ENABLE_VIDEO_PROCESSING, TRUE);
IMFSourceReader* m_pReader;
throwIfFailed(MFCreateSourceReaderFromURL(filePath, pAttr, &m_pReader), "Can't create source reader from url");
pAttr->Release();
Later, you shouldn't break out of the read loop when MF_SOURCE_READERF_CURRENTMEDIATYPECHANGED occurs; handle the flag and keep reading (see the sketch below). Then all samples will have the same size.
Alternatively, you can use the MFVideoFormat_NV12 subtype, and then you won't need to specify the MF_SOURCE_READER_ENABLE_VIDEO_PROCESSING attribute when creating the reader.
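For illustration, here is a minimal sketch of a read loop that handles MF_SOURCE_READERF_CURRENTMEDIATYPECHANGED without breaking out (variable names follow the question's code; error handling trimmed):
while (true)
{
    IMFSample* videoSample = nullptr;
    DWORD streamIndex = 0, flags = 0;
    LONGLONG llTimeStamp = 0;
    HRESULT hr = m_pReader->ReadSample(MF_SOURCE_READER_FIRST_VIDEO_STREAM, 0,
                                       &streamIndex, &flags, &llTimeStamp, &videoSample);
    if (FAILED(hr) || (flags & MF_SOURCE_READERF_ENDOFSTREAM))
        break;
    if (flags & MF_SOURCE_READERF_CURRENTMEDIATYPECHANGED)
    {
        // The decoder renegotiated the output type; re-query it and keep reading
        // instead of breaking out of the loop.
        IMFMediaType* pNewType = nullptr;
        m_pReader->GetCurrentMediaType(MF_SOURCE_READER_FIRST_VIDEO_STREAM, &pNewType);
        // ... update cached width/height/stride from pNewType ...
        pNewType->Release();
    }
    if (videoSample)
    {
        // ... ConvertToContiguousBuffer / Lock / copy the frame as in the question ...
        videoSample->Release();
    }
}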

Related

Media Foundation: Getting a MediaSink from a SinkWriter

I'm trying to add an MP4 file sink to a Topology. When my MediaSource is already MP4, I use MFCreateMPEG4MediaSink and MF_MPEG4SINK_SPSPPS_PASSTHROUGH. When my MediaSource isn't MP4 (so raw YUV from a webcam), I want to use MFCreateSinkWriterFromURL so that I don't have to figure out MP4 headers and other complex stuff.
According to the MSDN Docs I should be able to use GetServiceForStream to get at the MediaSink, since the input type is different from the output type. However it always returns MF_E_UNSUPPORTED_SERVICE.
How can I get the underlying MediaSink out of a MediaSinkWriter?
Alternatively, how can I easily create a MP4 media sink for an arbitrary topology?
HRESULT CreateVideoFileSink(
IMFStreamDescriptor *pSourceSD, // Pointer to the stream descriptor.
LPCWSTR pFilename, // Name of file to save to.
IMFStreamSink **ppStream) // Receives a pointer to the stream sink.
{
HRESULT hr = S_OK;
CComPtr<IMFAttributes> pAttr;
CComPtr<IMFMediaTypeHandler> pHandler;
CComPtr<IMFMediaType> pType;
CComPtr<IMFMediaSink> pSink;
CComPtr<IMFStreamSink> pStream;
CComPtr<IMFSinkWriter> pSinkWriter;
CComPtr<IMFByteStream> pByteStream;
*ppStream = nullptr;
// Get the media type handler for the stream.
IFR(pSourceSD->GetMediaTypeHandler(&pHandler));
// Get the major media type.
GUID guidMajorType;
IFR(pHandler->GetMajorType(&guidMajorType));
IFR(MFCreateAttributes(&pAttr, 1));
IFR(pAttr->SetUINT32(MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS, TRUE));
// Create an output file
if (MFMediaType_Video == guidMajorType)
{
GUID guidSubType;
IFR(pHandler->GetCurrentMediaType(&pType));
IFR(pType->GetGUID(MF_MT_SUBTYPE, &guidSubType));
if (MFVideoFormat_H264 == guidSubType)
{
// ... use MFCreateMPEG4MediaSink
}
else
{
IFR(MFCreateSinkWriterFromURL(pFilename, nullptr, pAttr, &pSinkWriter));
DWORD streamIdx;
IFR(pSinkWriter->AddStream(pType, &streamIdx));
IFR(pSinkWriter->GetServiceForStream(MF_SINK_WRITER_MEDIASINK, GUID_NULL, IID_PPV_ARGS(&pSink)));
IFR(pSink->GetStreamSinkByIndex(streamIdx, &pStream));
}
}
else
{
// Don't use this stream
IFR(E_FAIL);
}
// Return IMFStreamSink pointer to caller.
*ppStream = pStream.Detach();
return S_OK;
}
Figured it out right after writing the question - of course. The SinkWriter doesn't have a MediaSink until you call BeginWriting.
IFR(MFCreateSinkWriterFromURL(pFilename, nullptr, pAttr, &pSinkWriter));
DWORD streamIdx;
IFR(pSinkWriter->AddStream(pType, &streamIdx));
IFR(pSinkWriter->BeginWriting()); // <<----
IFR(pSinkWriter->GetServiceForStream(MF_SINK_WRITER_MEDIASINK, GUID_NULL, IID_PPV_ARGS(&pSink)));
IFR(pSink->GetStreamSinkByIndex(streamIdx, &pStream));
(Make sure you don't let the SinkWriter get Released while you're using the StreamSink)

Media Foundation Audio/Video capturing to MPEG4FileSink produces incorrect duration

I am working on a media streaming application using the Media Foundation framework. I've used some samples from the internet and from Anton Polinger's book. Unfortunately, after saving the streams into an mp4 file, the file's metadata is corrupted: it has an incorrect duration (for instance 30 hours, matching how long my PC had been running) and a wrong bitrate. After a long struggle I fixed it for a single stream (video or audio), but when I try to record both audio and video the problem comes back. Something is wrong with my topology, but I can't understand what; perhaps there are some experts here?
I get the audio and video sources, wrap them into an IMFCollection, and create an aggregate source with MFCreateAggregateSource.
I create source nodes for each source in the aggregate source:
Com::IMFTopologyNodePtr TopologyBuilder::CreateSourceNode(Com::IMFStreamDescriptorPtr streamDescriptor)
{
HRESULT hr = S_OK;
Com::IMFTopologyNodePtr pNode;
// Create the topology node, indicating that it must be a source node.
hr = MFCreateTopologyNode(MF_TOPOLOGY_SOURCESTREAM_NODE, &pNode);
THROW_ON_FAIL(hr, "Unable to create topology node for source");
// Associate the node with the source by passing in a pointer to the media source,
// and indicating that it is the source
hr = pNode->SetUnknown(MF_TOPONODE_SOURCE, _sourceDefinition->GetMediaSource());
THROW_ON_FAIL(hr, "Unable to set source as object for topology node");
// Set the node presentation descriptor attribute of the node by passing
// in a pointer to the presentation descriptor
hr = pNode->SetUnknown(MF_TOPONODE_PRESENTATION_DESCRIPTOR, _sourceDefinition->GetPresentationDescriptor());
THROW_ON_FAIL(hr, "Unable to set MF_TOPONODE_PRESENTATION_DESCRIPTOR to node");
// Set the node stream descriptor attribute by passing in a pointer to the stream
// descriptor
hr = pNode->SetUnknown(MF_TOPONODE_STREAM_DESCRIPTOR, streamDescriptor);
THROW_ON_FAIL(hr, "Unable to set MF_TOPONODE_STREAM_DESCRIPTOR to node");
return pNode;
}
After that I connect each source to a transform (H264 encoder and AAC encoder) and to the MPEG4FileSink:
void TopologyBuilder::CreateFileSinkOutputNode(PCWSTR filePath)
{
HRESULT hr = S_OK;
DWORD sink_count;
Com::IMFByteStreamPtr byte_stream;
Com::IMFTransformPtr transform;
LPCWSTR lpcwstrFilePath = filePath;
hr = MFCreateFile(
MF_ACCESSMODE_WRITE, MF_OPENMODE_FAIL_IF_NOT_EXIST, MF_FILEFLAGS_NONE,
lpcwstrFilePath, &byte_stream);
THROW_ON_FAIL(hr, L"Unable to create and open file");
// Video stream
Com::IMFMediaTypePtr in_mf_video_media_type = _sourceDefinition->GetCurrentVideoMediaType();
Com::IMFMediaTypePtr out_mf_media_type = CreateMediaType(MFMediaType_Video, MFVideoFormat_H264);
hr = CopyType(in_mf_video_media_type, out_mf_media_type);
THROW_ON_FAIL(hr, L"Unable to copy type parameters");
if (GetSubtype(in_mf_video_media_type) != MEDIASUBTYPE_H264)
{
transform.Attach(CreateAndInitCoderMft(MFT_CATEGORY_VIDEO_ENCODER, out_mf_media_type));
THROW_ON_NULL(transform);
}
if (transform)
{
Com::IMFMediaTypePtr transformMediaType;
hr = transform->GetOutputCurrentType(0, &transformMediaType);
THROW_ON_FAIL(hr, L"Unable to get current output type");
UINT32 pcbBlobSize = 0;
hr = transformMediaType->GetBlobSize(MF_MT_MPEG_SEQUENCE_HEADER, &pcbBlobSize);
THROW_ON_FAIL(hr, L"Unable to get blob size of MF_MT_MPEG_SEQUENCE_HEADER");
std::vector<UINT8> blob(pcbBlobSize);
hr = transformMediaType->GetBlob(MF_MT_MPEG_SEQUENCE_HEADER, &blob.front(), blob.size(), NULL);
THROW_ON_FAIL(hr, L"Unable to get blob MF_MT_MPEG_SEQUENCE_HEADER");
hr = out_mf_media_type->SetBlob(MF_MT_MPEG_SEQUENCE_HEADER, &blob.front(), blob.size());
THROW_ON_FAIL(hr, L"Unable to set blob of MF_MT_MPEG_SEQUENCE_HEADER");
}
// Audio stream
Com::IMFMediaTypePtr out_mf_audio_media_type;
Com::IMFTransformPtr transformAudio;
Com::IMFMediaTypePtr mediaTypeTmp = _sourceDefinition->GetCurrentAudioMediaType();
Com::IMFMediaTypePtr in_mf_audio_media_type;
if (mediaTypeTmp != NULL)
{
std::unique_ptr<MediaTypesFactory> factory(new MediaTypesFactory());
if (!IsMediaTypeSupportedByAacEncoder(mediaTypeTmp))
{
UINT32 channels;
hr = mediaTypeTmp->GetUINT32(MF_MT_AUDIO_NUM_CHANNELS, &channels);
THROW_ON_FAIL(hr, L"Unable to get MF_MT_AUDIO_NUM_CHANNELS fron source media type");
in_mf_audio_media_type = factory->CreatePCM(factory->DEFAULT_SAMPLE_RATE, channels);
}
else
{
in_mf_audio_media_type.Attach(mediaTypeTmp.Detach());
}
out_mf_audio_media_type = factory->CreateAAC(in_mf_audio_media_type, factory->HIGH_ENCODED_BITRATE);
GUID subType = GetSubtype(in_mf_audio_media_type);
if (GetSubtype(in_mf_audio_media_type) != MFAudioFormat_AAC)
{
// add encoder to Aac
transformAudio.Attach(CreateAndInitCoderMft(MFT_CATEGORY_AUDIO_ENCODER, out_mf_audio_media_type));
}
}
Com::IMFMediaSinkPtr pFileSink;
hr = MFCreateMPEG4MediaSink(byte_stream, out_mf_media_type, out_mf_audio_media_type, &pFileSink);
THROW_ON_FAIL(hr, L"Unable to create mpeg4 media sink");
Com::IMFTopologyNodePtr pOutputNodeVideo;
hr = MFCreateTopologyNode(MF_TOPOLOGY_OUTPUT_NODE, &pOutputNodeVideo);
THROW_ON_FAIL(hr, L"Unable to create output node");
hr = pFileSink->GetStreamSinkCount(&sink_count);
THROW_ON_FAIL(hr, L"Unable to get stream sink count from mediasink");
if (sink_count == 0)
{
THROW_ON_FAIL(E_UNEXPECTED, L"Sink count should be greater than 0");
}
Com::IMFStreamSinkPtr stream_sink_video;
hr = pFileSink->GetStreamSinkByIndex(0, &stream_sink_video);
THROW_ON_FAIL(hr, L"Unable to get stream sink by index");
hr = pOutputNodeVideo->SetObject(stream_sink_video);
THROW_ON_FAIL(hr, L"Unable to set stream sink as output node object");
hr = _pTopology->AddNode(pOutputNodeVideo);
THROW_ON_FAIL(hr, L"Unable to add file sink output node");
pOutputNodeVideo = AddEncoderIfNeed(_pTopology, transform, in_mf_video_media_type, pOutputNodeVideo);
_outVideoNodes.push_back(pOutputNodeVideo);
Com::IMFTopologyNodePtr pOutputNodeAudio;
if (in_mf_audio_media_type != NULL)
{
hr = MFCreateTopologyNode(MF_TOPOLOGY_OUTPUT_NODE, &pOutputNodeAudio);
THROW_ON_FAIL(hr, L"Unable to create output node");
Com::IMFStreamSinkPtr stream_sink_audio;
hr = pFileSink->GetStreamSinkByIndex(1, &stream_sink_audio);
THROW_ON_FAIL(hr, L"Unable to get stream sink by index");
hr = pOutputNodeAudio->SetObject(stream_sink_audio);
THROW_ON_FAIL(hr, L"Unable to set stream sink as output node object");
hr = _pTopology->AddNode(pOutputNodeAudio);
THROW_ON_FAIL(hr, L"Unable to add file sink output node");
if (transformAudio)
{
Com::IMFTopologyNodePtr outputTransformNodeAudio;
AddTransformNode(_pTopology, transformAudio, pOutputNodeAudio, &outputTransformNodeAudio);
_outAudioNode = outputTransformNodeAudio;
}
else
{
_outAudioNode = pOutputNodeAudio;
}
}
}
When the output type is applied to the audio transform, it has 15 attributes instead of 8, including MF_MT_AVG_BITRATE, which as I understand should only apply to video. In my case it is 192000, which differs from the MF_MT_AVG_BITRATE on the video stream.
My AAC media type is created by this method:
HRESULT MediaTypesFactory::CopyAudioTypeBasicAttributes(IMFMediaType * in_media_type, IMFMediaType * out_mf_media_type) {
HRESULT hr = S_OK;
static const GUID AUDIO_MAJORTYPE = MFMediaType_Audio;
static const GUID AUDIO_SUBTYPE = MFAudioFormat_PCM;
out_mf_media_type->SetUINT32(MF_MT_AUDIO_BITS_PER_SAMPLE, AUDIO_BITS_PER_SAMPLE);
WAVEFORMATEX *in_wfx = NULL;
UINT32 wfx_size = 0;
hr = MFCreateWaveFormatExFromMFMediaType(in_media_type, &in_wfx, &wfx_size);
DEBUG_ON_FAIL(hr);
hr = out_mf_media_type->SetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, in_wfx->nSamplesPerSec);
DEBUG_ON_FAIL(hr);
hr = out_mf_media_type->SetUINT32(MF_MT_AUDIO_NUM_CHANNELS, in_wfx->nChannels);
DEBUG_ON_FAIL(hr);
hr = out_mf_media_type->SetUINT32(MF_MT_AUDIO_AVG_BYTES_PER_SECOND, in_wfx->nAvgBytesPerSec);
DEBUG_ON_FAIL(hr);
hr = out_mf_media_type->SetUINT32(MF_MT_AUDIO_BLOCK_ALIGNMENT, in_wfx->nBlockAlign);
DEBUG_ON_FAIL(hr);
CoTaskMemFree(in_wfx); // MFCreateWaveFormatExFromMFMediaType allocates the WAVEFORMATEX; free it
return hr;
}
It would be awesome if somebody could help me or explain where I am wrong.
Thanks.
In my project CaptureManager I faced a similar problem while writing code for recording live video from many web cams into one file. After a long time researching Media Foundation I found two important facts:
1. Live sources - web cams and microphones - do not start from 0. According to the specification (Live Sources: "The first sample should have a time stamp of zero."), their samples should start from a zero time stamp, but live sources set the current system time instead.
2. I see from your code that you use the Media Session - an object with the IMFMediaSession interface. I think you create it with the MFCreateMediaSession function. That function creates the default version of the session, which is optimized for playing media from a file, whose samples start from 0 by default.
In my view, the main problem is that the default Media Session does not check the time stamps of media samples from the source, because samples from a media file start from zero or from the StartPosition. Live sources, however, do not start from 0 - they should, or must, but do not.
So, my advice: write a class implementing IMFTransform that acts as a "proxy" transform between the source and the encoder. This proxy transform must fix the time stamps of the media samples coming from the live source: when it receives the first media sample from the live source, it saves that sample's actual time stamp as a reference time and sets the sample's time stamp to zero; for every following sample from that live source, it subtracts the reference time from the sample's time stamp and writes the result back to the sample.
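A minimal sketch of that rebasing step, assuming it runs on every sample the proxy transform receives (the member variables here are illustrative, not from any real project):
HRESULT RebaseLiveSampleTime(IMFSample* pSample)
{
    LONGLONG hnsTime = 0;
    HRESULT hr = pSample->GetSampleTime(&hnsTime);
    if (FAILED(hr))
        return hr;
    if (!m_haveReference)               // first sample from the live source
    {
        m_hnsReference = hnsTime;       // remember its system-clock-based time stamp
        m_haveReference = true;
    }
    // Shift the whole stream so it starts at zero, as the file sink expects.
    return pSample->SetSampleTime(hnsTime - m_hnsReference);
}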
Also, check the code that calls IMFFinalizableMediaSink (finalizing the sink is what writes the final MP4 metadata).
Regards.
MP4 metadata might under some conditions be initialized incorrectly (e.g. like this), however in the scenario you describe the problem is likely to be the payload data and not the way you set up the pipeline in the first place.
Decoders and converters typically pass sample time stamps through by copying them from input to output, so they do not signal a failure if something is wrong - you still get output written into the file that looks sensible. The sink might have trouble processing your data if you have sample time issues, very long recordings, or overflow bugs, especially with rates expressed using large numerators/denominators. What matters is what sample times the sources produce.
You might want to try shorter recordings, and also video-only and audio-only recordings; that might help identify which stream supplies the data that leads to the problem.
Additionally, you might want to inspect the atoms/boxes of the resulting MP4 file to identify whether the header boxes hold incorrect data or the data itself is stamped incorrectly, on which track, and how exactly (especially whether it starts okay and then has weird gaps in the middle).

Media Foundation onReadSample wrong size of returned sample

I am working on translating a capture library from DirectShow to Media Foundation. The capture library seemed to work quite well, but I face a problem with an integrated webcam on a tablet running Windows 8 32-bit.
When enumerating the capture formats (as explained in the Media Foundation documentation), I got the following supported formats for the camera:
0 : MFVideoFormat_NV12, resolution : 448x252, framerate : 30000x1001
1 : MFVideoFormat_YUY2, resolution : 448x252, framerate : 30000x1001
2 : MFVideoFormat_NV12, resolution : 640x360, framerate : 30000x1001
3 : MFVideoFormat_YUY2, resolution : 640x360, framerate : 30000x1001
4 : MFVideoFormat_NV12, resolution : 640x480, framerate : 30000x1001
5 : MFVideoFormat_YUY2, resolution : 640x480, framerate : 30000x1001
I then set the capture format, in this case the one at index 5, using the following function, as described in the example:
hr = pHandler->SetCurrentMediaType(pType);
This function executed without error. The camera should thus be configured to capture in YUY2 with a resolution of 640*480.
In the onReadSample callback, I should receive a sample with a buffer of size :
640 * 480 * sizeof(unsigned char) * 2 = 614400 // YUY2 uses 2 bytes per pixel
However, I got a sample with a buffer of size 169344. Below is part of the callback function.
HRESULT SourceReader::OnReadSample(
HRESULT hrStatus,
DWORD dwStreamIndex,
DWORD dwStreamFlags,
LONGLONG llTimeStamp,
IMFSample *pSample // Can be NULL
)
{
EnterCriticalSection(&m_critsec);
if (pSample)
{
DWORD expectedBufferSize = 640*480*1*2; // = 614400 (hard code for the example)
IMFMediaBuffer* buffer = NULL;
hr = pSample->ConvertToContiguousBuffer(&buffer);
if (FAILED(hr))
{
//...
goto done;
}
DWORD byteLength = 0;
BYTE* pixels = NULL;
hr = buffer->Lock(&pixels, NULL, &byteLength);
//byteLength is 169344 instead of 614400
if (byteLength > 0 && byteLength == expectedBufferSize)
{
//do something with the image, but we never get here because byteLength is wrong
}
//...
Any advice on why I get a sample of size 169344?
Thanks in advance
Thanks Mgetz for your answer.
I checked the value of MF_MT_INTERLACE_MODE of the media type and it appears that the video stream contains progressive frames. The value of MF_MT_INTERLACE_MODE returns MFVideoInterlace_Progressive.
hr = pHandler->SetCurrentMediaType(m_pType);
if(FAILED(hr)){
//
}
else
{
//get info about interlacing
UINT32 interlaceFormat = MFVideoInterlace_Unknown;
m_pType->GetUINT32(MF_MT_INTERLACE_MODE, &interlaceFormat);
//...
So the video stream is not interlaced. I then checked the value of MFSampleExtension_Interlaced in the onReadSample callback to see whether the individual samples are interlaced, and it appears that they are.
if (pSample && m_bCapture)
{
//check if interlaced
UINT32 isSampleInterlaced = 0;
pSample->GetUINT32(MFSampleExtension_Interlaced, &isSampleInterlaced);
if(isSampleInterlaced)
{
//enters here
}
How is it possible that the stream is progressive while the samples are interlaced? I double-checked the value of MF_MT_INTERLACE_MODE in the onReadSample callback as well, and it still reports MFVideoInterlace_Progressive.
Concerning your first suggestion, I didn't find a way to force the MFT_INPUT_STREAM_WHOLE_SAMPLES flag on the input stream.
Thanks in advance
I still face the issue and I am now investigating the different streams available.
According to the documentation, each media source provides a presentation descriptor from which we can get the streams available. To get the presentation descriptor, we have to call:
HRESULT hr = pSource->CreatePresentationDescriptor(&pPD);
I then request the streams available using the IMFPresentationDescriptor::GetStreamDescriptorCount function:
DWORD nbrStream;
pPD->GetStreamDescriptorCount(&nbrStream);
When requesting this information for the front webcam on an ACER tablet running Windows 8, I found that three streams are available. I looped over these streams, requested their media type handlers, and checked the major type. All three streams have MFMediaType_Video as their major type, so they are all video streams. When listing the media types available on the different streams, I found that all of them support capture at 640x480 (some of the streams have more media types available).
I tried selecting each of the streams with the appropriate format type (the framework did not return any error), but I still do not receive the correct sample in the callback function...
Any advice to progress on the issue?
Finally found the issue: I had to set the media type on the source reader directly, using SourceReader->SetCurrentMediaType(..). That did the trick!
Thanks for your help!
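For reference, a minimal sketch of that call (the reader member name is hypothetical, and m_pType is the 640x480 YUY2 type selected during enumeration):
// Set the type on the IMFSourceReader itself instead of on the stream descriptor's media type handler.
HRESULT hr = m_pSourceReader->SetCurrentMediaType(
    MF_SOURCE_READER_FIRST_VIDEO_STREAM,
    NULL,      // reserved
    m_pType);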
Without knowing what the input media type descriptor is, we can largely only speculate, but the most likely answer is that you are saying you can handle the stream even though MFT_INPUT_STREAM_WHOLE_SAMPLES is not set on the input stream.
The next most likely cause is interlacing, in which case each frame would be complete but not at the full resolution you are assuming. Regardless, you should verify the ENTIRE media type descriptor before accepting it.
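A small sketch of that check, assuming pType is the media type actually in effect when the sample arrives (not the one you asked for): read back the negotiated frame size and compute the buffer size Media Foundation expects, instead of hard-coding 640*480*2.
UINT32 width = 0, height = 0, cbImage = 0;
GUID subtype = GUID_NULL;
MFGetAttributeSize(pType, MF_MT_FRAME_SIZE, &width, &height);   // negotiated resolution
pType->GetGUID(MF_MT_SUBTYPE, &subtype);                        // negotiated pixel format
MFCalculateImageSize(subtype, width, height, &cbImage);         // expected frame size in bytes
// Compare cbImage with the length reported by IMFMediaBuffer::Lock.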

AVI created with AVIStreamWrite has incorrect length and playback speed

I'm trying to write to an AVI file using AVIStreamWrite, but the resulting AVI file is a bit messed up. The images in the AVI contain the proper content and colors, but the duration and speed of the video are off. I recorded a video that should have been around 7 seconds; the file properties in Windows Explorer show a duration of about 2 seconds. When I play it in Media Player it is too short and seems to play very rapidly (motion in the video is like fast forward). I also can't seem to seek within the video using Media Player.
Here is what I'm doing...
//initialization
HRESULT AVIWriter::Init()
{
HRESULT hr = S_OK;
_hAVIFile = NULL;
_videoStream = NULL;
_frameCount = 0;
AVIFileInit();
::DeleteFileW(_filename);
hr = AVIFileOpen(&_hAVIFile,_filename,OF_WRITE|OF_CREATE, NULL);
if (hr != AVIERR_OK)
{
::cout << "AVI ERROR";
return 0;
}
/**************************************/
// Create a raw video stream in the file
::ZeroMemory(&_streamInfo, sizeof(_streamInfo));
_streamInfo.fccType = streamtypeVIDEO; // stream type
_streamInfo.fccHandler = 0; // No compressor
_streamInfo.dwScale = 1;
_streamInfo.dwRate = _aviFps; //this is 30
_streamInfo.dwSuggestedBufferSize = 0;
_streamInfo.dwSampleSize = 0;
SetRect( &_streamInfo.rcFrame, 0, 0,_bmi.biWidth , _bmi.biHeight );
hr = AVIFileCreateStream( _hAVIFile, // file pointer
&_videoStream,// returned stream pointer
&_streamInfo); // stream header
hr = AVIStreamSetFormat(_videoStream, 0,
&_bmi,
sizeof(_bmi));
return hr;
}
//call this when I receive a frame from my camera
HRESULT AVIWriter::AddFrameToAVI(BYTE* buffer)
{
HRESULT hr;
long size = _bmi.biHeight * _bmi.biWidth * 3;
hr = AVIStreamWrite(_videoStream, // stream pointer
_frameCount++, // time of this frame
1, // number to write
buffer, // pointer to data
size,// size of this frame
AVIIF_KEYFRAME, // flags....
NULL,
NULL);
return hr;
}
//call this when I am done
void AVIWriter::CloseAVI()
{
AVIStreamClose(_videoStream);
AVIFileClose(_hAVIFile);
AVIFileExit();
}
Now as a test I used DirectShow's GraphEdit to create a graph consisting of a VideoCapture Filter for this same camera and an AVI mux and created an avi file. The resulting AVI file was fine. The frame rate was 30 fps, the same that I am using. I queried both avi files (my 'bad' one and the 'good' one created with GraphEdit) using a call to AVIStreamInfo and the stream info was pretty much the same for both files. I would have expected either the samples per second or number of frames to be way off for my 'bad' avi but it wasn't. Am I doing something wrong that would cause my AVI to have the incorrect length and seem to play back at an increased speed?? I'm new to using VFW so any help is appreciated. Thanks
The frame time in the file will eventually be _frameCount / _aviFps, so either you are dropping frames and they never reach AVIStreamWrite, or, if you deliberately skip a few frames, you need to increment _frameCount accordingly so that the indices jump over the skipped frames.
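If you want the file's timeline to follow wall-clock capture time even when frames are dropped, a minimal sketch is to derive the sample index from a capture time stamp instead of a plain counter (the captureTimeMs parameter and _firstFrameTimeMs member are hypothetical; they assume the camera callback provides a millisecond time stamp and that _firstFrameTimeMs is initialized to (DWORD)-1 in Init()):
HRESULT AVIWriter::AddFrameToAVI(BYTE* buffer, DWORD captureTimeMs)
{
    if (_firstFrameTimeMs == (DWORD)-1)
        _firstFrameTimeMs = captureTimeMs;   // remember when the recording started
    // Frame index on a fixed timeline of _aviFps frames per second (dwRate/dwScale).
    LONG frameIndex = (LONG)(((LONGLONG)(captureTimeMs - _firstFrameTimeMs) * _aviFps) / 1000);
    long size = _bmi.biHeight * _bmi.biWidth * 3;   // 24-bit RGB frame
    return AVIStreamWrite(_videoStream, frameIndex, 1, buffer, size,
                          AVIIF_KEYFRAME, NULL, NULL);
}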

Filling CMediaType and IMediaSample from AVPacket for h264 video

I have searched and have found almost nothing, so I would really appreciate some help with my question.
I am writing a DirectShow source filter which uses libav to read H.264 packets from YouTube's FLV files and send them downstream. But I can't find the appropriate libav structure fields to correctly implement the filter's GetMediaType() and FillBuffer(). Some libav fields are null. As a consequence, the H.264 decoder crashes when it tries to process the received data.
Where am I wrong - in the way I use libav, or in the DirectShow interfaces? Maybe H.264 requires additional processing when working with libav, or maybe I fill the reference times incorrectly? Does someone have any links useful for writing a DirectShow H.264 source filter with libav?
Part of GetMediaType():
VIDEOINFOHEADER *pvi = (VIDEOINFOHEADER*) toMediaType->AllocFormatBuffer(sizeof(VIDEOINFOHEADER));
pvi->AvgTimePerFrame = UNITS_PER_SECOND / m_pFormatContext->streams[m_streamNo]->codec->sample_rate; //sample_rate is 0
pvi->dwBitRate = m_pFormatContext->bit_rate;
pvi->rcSource = videoRect;
pvi->rcTarget = videoRect;
//Bitmap
pvi->bmiHeader.biSize = sizeof(BITMAPINFOHEADER);
pvi->bmiHeader.biWidth = videoRect.right;
pvi->bmiHeader.biHeight = videoRect.bottom;
pvi->bmiHeader.biPlanes = 1;
pvi->bmiHeader.biBitCount = m_pFormatContext->streams[m_streamNo]->codec->bits_per_raw_sample;//or should here be bits_per_coded_sample
pvi->bmiHeader.biCompression = FOURCC_H264;
pvi->bmiHeader.biSizeImage = GetBitmapSize(&pvi->bmiHeader);
Part of FillBuffer():
//Get buffer pointer
BYTE* pBuffer = NULL;
if (pSamp->GetPointer(&pBuffer) < 0)
return S_FALSE;
//Get next packet
AVPacket* pPacket = m_mediaFile.getNextPacket();
if (pPacket->data == NULL)
return S_FALSE;
//Check packet and buffer size
if (pSamp->GetSize() < pPacket->size)
return S_FALSE;
//Copy from packet to sample buffer
memcpy(pBuffer, pPacket->data, pPacket->size);
//Set media sample time
REFERENCE_TIME start = m_mediaFile.timeStampToReferenceTime(pPacket->pts);
REFERENCE_TIME duration = m_mediaFile.timeStampToReferenceTime(pPacket->duration);
REFERENCE_TIME end = start + duration;
pSamp->SetTime(&start, &end);
pSamp->SetMediaTime(&start, &end);
P.S. I've debugged my filter with the hax264 decoder, and it crashes on a call to the deprecated libav function img_convert().
Here is the MSDN link you need to build a correct H.264 media type: H.264 Video Types
You have to fill the right fields with the right values.
The AM_MEDIA_TYPE should contain the right MEDIASUBTYPE for h264.
And these are plain wrong :
pvi->bmiHeader.biWidth = videoRect.right;
pvi->bmiHeader.biHeight = videoRect.bottom;
You should use a width/height that is independent of rcSource/rcTarget, because those rectangles are only indicators and may be completely zero if you take them from some other filter.
pvi->bmiHeader.biBitCount = m_pFormatContext->streams[m_streamNo]->codec->bits_per_raw_sample;//or should here be bits_per_coded_sample
This only makes sense if biWidth*biHeight*biBitCount/8 is the true size of the sample. I do not think so ...
pvi->bmiHeader.biCompression = FOURCC_H264;
This must also be passed in the AM_MEDIA_TYPE in the subtype parameter.
pvi->bmiHeader.biSizeImage = GetBitmapSize(&pvi->bmiHeader);
This fails because the FourCC is unknown to the function, and the bit count is plain wrong for this sample anyway, since a compressed packet is not a full frame.
You have to take a look at how the data stream is handled by the downstream h264 filter. This seems to be flawed.
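For comparison, here is a rough sketch of a media type along the lines of the "H.264 Video Types" article: MEDIASUBTYPE_H264 with a VIDEOINFOHEADER2 format block and variable-size samples, without trying to describe the compressed data as a bitmap (the width, height, and frame rate values are assumptions you would take from the libav stream):
CMediaType mt;
mt.SetType(&MEDIATYPE_Video);
mt.SetSubtype(&MEDIASUBTYPE_H264);          // the subtype carries the codec, not biCompression alone
mt.SetFormatType(&FORMAT_VideoInfo2);
mt.SetTemporalCompression(TRUE);
mt.SetVariableSize();                       // samples are H.264 access units, not full frames
VIDEOINFOHEADER2* pvi2 = (VIDEOINFOHEADER2*)mt.AllocFormatBuffer(sizeof(VIDEOINFOHEADER2));
ZeroMemory(pvi2, sizeof(VIDEOINFOHEADER2));
pvi2->AvgTimePerFrame         = 10000000LL / 25;            // 100-ns units; assumed 25 fps
pvi2->bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
pvi2->bmiHeader.biWidth       = codedWidth;                 // from the libav codec context
pvi2->bmiHeader.biHeight      = codedHeight;
pvi2->bmiHeader.biCompression = MAKEFOURCC('H','2','6','4');
// biBitCount and biSizeImage stay 0 for compressed, variable-size samples.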