Playback speed of video encoded with IMFSinkWriter changes based on width - c++

I'm making a screen recorder (without audio) using Win32's Sink Writer (IMFSinkWriter) to encode a series of bitmaps into an MP4 file.
For some reason, the video playback speed increases (seemingly) proportionally with the video width.
From this post, I've gathered that it's most likely because I'm calculating the buffer size incorrectly. The difference is that their playback issue was fixed by correcting the audio buffer size calculation; since I don't encode any audio at all, I'm not sure what to take from it.
I've also tried to read up on how the buffer works, but I'm really at a loss as to how the buffer size could cause different playback speeds.
Here is a pastebin with the entirety of the code; I can't narrow the problem down any further than the buffer size and/or the frame timestamp/duration.
That is: depending on the width stored in the member variable m_width (measured in pixels), the playback speed changes; the greater the width, the faster the video plays, and vice versa.
Here are two video examples:
3840x1080 and 640x1080, notice the system clock.
Imgur does not retain the original resolution of the files, but I double-checked before uploading, and the program does indeed create files at the claimed resolutions.
rtStart and rtDuration are defined as follows, and are both private members of the MP4File class:
LONGLONG rtStart = 0;
UINT64 rtDuration;
MFFrameRateToAverageTimePerFrame(m_FPS, 1, &rtDuration);
(MFFrameRateToAverageTimePerFrame yields the average frame duration in 100-ns units, so for 30 FPS rtDuration is 333333.)
This is where rtStart is updated and the individual bits of the bitmap are passed to the frame writer.
I moved the LPVOID buffer into the private members in the hope of improving performance; now there's no need for a heap allocation every time a frame is appended.
HRESULT MP4File::AppendFrame(HBITMAP frame)
{
    HRESULT hr = S_OK;
    if (m_isInitialFrame)
    {
        hr = InitializeMovieCreation();
        if (FAILED(hr))
            return hr;
        m_isInitialFrame = false;
    }
    if (m_hHeap && m_lpBitsBuffer) // Make sure the buffer is initialized
    {
        BITMAPINFO bmpInfo;
        bmpInfo.bmiHeader.biBitCount = 0;
        bmpInfo.bmiHeader.biSize = sizeof(BITMAPINFOHEADER);
        // The first GetDIBits call fills in the BITMAPINFOHEADER; the second
        // copies the bitmap's pixel data into the buffer used by WriteFrame.
        GetDIBits(m_hDC, frame, 0, 0, NULL, &bmpInfo, DIB_RGB_COLORS);
        bmpInfo.bmiHeader.biCompression = BI_RGB;
        GetDIBits(m_hDC, frame, 0, bmpInfo.bmiHeader.biHeight, m_lpBitsBuffer, &bmpInfo, DIB_RGB_COLORS);
        hr = WriteFrame();
        if (SUCCEEDED(hr))
        {
            rtStart += rtDuration;
        }
    }
    return m_writeFrameResult = hr;
}
And lastly, the frame writer, which loads the bits into a media buffer and then writes the sample to the Sink Writer.
HRESULT MP4File::WriteFrame()
{
    IMFSample *pSample = NULL;
    IMFMediaBuffer *pBuffer = NULL;
    const LONG cbWidth = 4 * m_width;
    const DWORD cbBufferSize = cbWidth * m_height;
    BYTE *pData = NULL;
    // Create a new memory buffer.
    HRESULT hr = MFCreateMemoryBuffer(cbBufferSize, &pBuffer);
    // Lock the buffer and copy the video frame to the buffer.
    if (SUCCEEDED(hr))
    {
        hr = pBuffer->Lock(&pData, NULL, NULL);
    }
    if (SUCCEEDED(hr))
    {
        hr = MFCopyImage(
            pData,                  // Destination buffer.
            cbWidth,                // Destination stride.
            (BYTE*)m_lpBitsBuffer,  // First row in source image.
            cbWidth,                // Source stride.
            cbWidth,                // Image width in bytes.
            m_height                // Image height in pixels.
        );
    }
    if (pBuffer)
    {
        pBuffer->Unlock();
    }
    // Set the data length of the buffer.
    if (SUCCEEDED(hr))
    {
        hr = pBuffer->SetCurrentLength(cbBufferSize);
    }
    // Create a media sample and add the buffer to the sample.
    if (SUCCEEDED(hr))
    {
        hr = MFCreateSample(&pSample);
    }
    if (SUCCEEDED(hr))
    {
        hr = pSample->AddBuffer(pBuffer);
    }
    // Set the time stamp and the duration.
    if (SUCCEEDED(hr))
    {
        hr = pSample->SetSampleTime(rtStart);
    }
    if (SUCCEEDED(hr))
    {
        hr = pSample->SetSampleDuration(rtDuration);
    }
    // Send the sample to the Sink Writer.
    if (SUCCEEDED(hr))
    {
        hr = m_pSinkWriter->WriteSample(m_streamIndex, pSample);
    }
    SafeRelease(&pSample);
    SafeRelease(&pBuffer);
    return hr;
}
A couple of details about the encoding:
Framerate: 30 FPS
Bitrate: 15,000,000 bps
Output encoding format: H.264 (MP4)
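These settings presumably end up on the Sink Writer's output media type roughly as follows (a sketch reconstructed from the values above; the actual initialization code is in the pastebin):
IMFMediaType *pOutType = NULL;
MFCreateMediaType(&pOutType);
pOutType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);   // video stream
pOutType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_H264);     // H.264 output
pOutType->SetUINT32(MF_MT_AVG_BITRATE, 15000000);         // 15 Mbps
pOutType->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive);
MFSetAttributeSize(pOutType, MF_MT_FRAME_SIZE, m_width, m_height);
MFSetAttributeRatio(pOutType, MF_MT_FRAME_RATE, 30, 1);   // 30 FPS
MFSetAttributeRatio(pOutType, MF_MT_PIXEL_ASPECT_RATIO, 1, 1);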

To me, this behavior makes sense.
See https://github.com/mofo7777/Stackoverflow/tree/master/ScreenCaptureEncode
My program uses DirectX9 instead of GetDIBits, but the behaviour is the same. Try this program with different screen resolutions to confirm this behaviour. And I can confirm that with my program, the video playback speed increases proportionally with the video width (and also with the video height).
Why? More data to copy means more time spent per frame, and your sample time/sample duration end up wrong.
Using 30 FPS means one frame every 33.3333333 ms:
Do GetDIBits, MFCopyImage and WriteSample finish in exactly 33.3333333 ms? No.
Do you write each frame at exactly 33.3333333 ms intervals? No.
So just doing rtStart += rtDuration is wrong, because you don't capture and write the screen at exactly that time. And GetDIBits/DirectX9 are not able to process at 30 FPS, trust me. Why else would Microsoft have provided Windows Desktop Duplication (Windows 8/10 only)?
The key is latency.
Do you know how long GetDIBits, MFCopyImage and WriteSample take? You should, to understand the problem. Usually they take more than 33.3333333 ms, and the time is variable.
You must know it to give the encoder a correct FPS, and you will also need to call WriteSample at the right time.
If you use MF_MT_FRAME_RATE with 5-10 FPS instead of 30 FPS, you will see the result is more realistic, but still not optimal.
For example, use an IMFPresentationClock to determine the correct WriteSample time.
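As a minimal sketch of that idea (my illustration, not the answer's code, using QueryPerformanceCounter rather than an IMFPresentationClock): stamp each sample with the real elapsed capture time instead of accumulating a fixed rtDuration.
// Once, when recording starts:
LARGE_INTEGER qpcFreq, qpcStart, qpcNow;
QueryPerformanceFrequency(&qpcFreq);
QueryPerformanceCounter(&qpcStart);
LONGLONG rtPrev = 0;
// Then, for every captured frame:
QueryPerformanceCounter(&qpcNow);
// Convert elapsed ticks to the 100-ns units IMFSample expects.
LONGLONG rtNow = (qpcNow.QuadPart - qpcStart.QuadPart) * 10000000LL / qpcFreq.QuadPart;
hr = pSample->SetSampleTime(rtNow);              // when the frame was really captured
hr = pSample->SetSampleDuration(rtNow - rtPrev); // approximate duration: gap since last frame
rtPrev = rtNow;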

Related

WICConvertBitmapSource + CopyPixels results in blue image

I'm trying to use WIC to load an image into an in-memory buffer for further processing, then write it back to a file when done. Specifically:
Load the image into an IWICBitmapFrameDecode.
The loaded IWICBitmapFrameDecode reports that its pixel format is GUID_WICPixelFormat24bppBGR. I want to work in 32bpp RGBA, so I call WICConvertBitmapSource.
Call CopyPixels on the converted frame to get a memory buffer.
Write the memory buffer back into an IWICBitmapFrameEncode using WritePixels.
This results in a recognizable image, but the resulting image is mostly blueish, as if the red channel is being interpreted as blue.
If I call WriteSource to write the converted frame directly, instead of writing the memory buffer, it works. If I call CopyPixels from the original unconverted frame (and update my stride and pixel formats accordingly), it works. It's only the combination of WICConvertBitmapSource plus the use of a memory buffer (CopyPixels + WritePixels) that causes the problem, but I can't figure out what I'm doing wrong.
Here's my code.
int main() {
    IWICImagingFactory *pFactory;
    IWICBitmapDecoder *pDecoder = NULL;
    CoInitializeEx(NULL, COINIT_MULTITHREADED);
    CoCreateInstance(
        CLSID_WICImagingFactory,
        NULL,
        CLSCTX_INPROC_SERVER,
        IID_IWICImagingFactory,
        (LPVOID*)&pFactory
    );
    // Load the image.
    pFactory->CreateDecoderFromFilename(L"input.png", NULL, GENERIC_READ, WICDecodeMetadataCacheOnDemand, &pDecoder);
    IWICBitmapFrameDecode *pFrame = NULL;
    pDecoder->GetFrame(0, &pFrame);
    // pFrame->GetPixelFormat shows that the image is 24bpp BGR.
    // Convert to 32bpp RGBA for easier processing.
    IWICBitmapSource *pConvertedFrame = NULL;
    WICConvertBitmapSource(GUID_WICPixelFormat32bppRGBA, pFrame, &pConvertedFrame);
    // Copy the 32bpp RGBA image to a buffer for further processing.
    UINT width, height;
    pConvertedFrame->GetSize(&width, &height);
    const unsigned bytesPerPixel = 4;
    const unsigned stride = width * bytesPerPixel;
    const unsigned bitmapSize = width * height * bytesPerPixel;
    BYTE *buffer = new BYTE[bitmapSize];
    pConvertedFrame->CopyPixels(nullptr, stride, bitmapSize, buffer);
    // Insert image buffer processing here. (Not currently implemented.)
    // Create an encoder to turn the buffer back into an image file.
    IWICBitmapEncoder *pEncoder = NULL;
    pFactory->CreateEncoder(GUID_ContainerFormatPng, nullptr, &pEncoder);
    IStream *pStream = NULL;
    SHCreateStreamOnFileEx(L"output.png", STGM_WRITE | STGM_CREATE, FILE_ATTRIBUTE_NORMAL, true, NULL, &pStream);
    pEncoder->Initialize(pStream, WICBitmapEncoderNoCache);
    IWICBitmapFrameEncode *pFrameEncode = NULL;
    pEncoder->CreateNewFrame(&pFrameEncode, NULL);
    pFrameEncode->Initialize(NULL);
    WICPixelFormatGUID pixelFormat = GUID_WICPixelFormat32bppRGBA;
    pFrameEncode->SetPixelFormat(&pixelFormat);
    pFrameEncode->SetSize(width, height);
    pFrameEncode->WritePixels(height, stride, bitmapSize, buffer);
    pFrameEncode->Commit();
    pEncoder->Commit();
    pStream->Commit(STGC_DEFAULT);
    return 0;
}
The PNG encoder only supports GUID_WICPixelFormat32bppBGRA for 32bpp, as specified in the PNG native codec documentation. When you call it with GUID_WICPixelFormat32bppRGBA, it will not do channel switching. It will just use your pixels as if they were BGRA, not RGBA, and will not tell you there's a problem.
I don't know what you're trying to do, but in your example you could just replace GUID_WICPixelFormat32bppRGBA with GUID_WICPixelFormat32bppBGRA in the call to WICConvertBitmapSource (and also in the definition of the last pixelFormat variable, to keep your source code consistent, though it doesn't change the encoder's behavior).
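Concretely, the suggested fix amounts to these two changes in the code above:
// Convert to the 32bpp layout the PNG encoder actually accepts (BGRA, not RGBA).
WICConvertBitmapSource(GUID_WICPixelFormat32bppBGRA, pFrame, &pConvertedFrame);
// ...and keep the declared encoder format consistent with it:
WICPixelFormatGUID pixelFormat = GUID_WICPixelFormat32bppBGRA;
pFrameEncode->SetPixelFormat(&pixelFormat);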
PS: you can use WIC to save files directly; there's no need to create the stream with another API. See my answer here: Capture screen using DirectX

What are pixel size limits of GDI bitmaps/DCs in 32-bit and 64-bit processes?

I'm coding a Win32 application that performs low-level printing, for which I'm dealing with GDI bitmaps and device contexts.
The code basically does this (pseudo-code):
HDC hCompDc = ::CreateCompatibleDC(hDC);
BITMAPINFOHEADER infoHeader = {0};
infoHeader.biSize = sizeof(infoHeader);
infoHeader.biWidth = nWidth;
infoHeader.biHeight = -nHeight; // Negative: top-down DIB, so the document is right side up
infoHeader.biPlanes = 1;
infoHeader.biBitCount = 24;
infoHeader.biCompression = BI_RGB;
BITMAPINFO info;
info.bmiHeader = infoHeader;
// Use a file on disk to store the large bitmap data
HANDLE hFileScatch = ::CreateFile(strTempPath,
    GENERIC_READ | GENERIC_WRITE, 0, NULL,
    CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
UINT64 ncbLineSz = nWidth * 3;
if (ncbLineSz & 0x3)
    ncbLineSz = (ncbLineSz & ~0x3) + 0x4; // Each scanline must be DWORD-aligned
UINT64 uiSzBmp = ncbLineSz * nHeight;
HANDLE hFileMapObj = ::CreateFileMapping(hFileScatch,
    NULL, PAGE_READWRITE, (DWORD)(uiSzBmp >> 32), (DWORD)(uiSzBmp), NULL);
BYTE* pMemory = 0;
HBITMAP hBitmap = ::CreateDIBSection(hDC, &info, DIB_RGB_COLORS, (void**)&pMemory, hFileMapObj, 0);
::SelectObject(hCompDc, hBitmap);
// Do drawing on hCompDc using other GDI APIs
// ...
// And print
::SetAbortProc(hPrintDC, _printerAbortProc);
::StartDoc(hPrintDC, &docInfo);
::StartPage(hPrintDC);
::BitBlt(hPrintDC,
    rcPrintArea.left,
    rcPrintArea.top,
    rcRenderArea.Width(),
    rcRenderArea.Height(),
    hCompDc, // Source DC
    rcRenderArea.left,
    rcRenderArea.top,
    SRCCOPY);
::EndPage(hPrintDC);
// Clean up, etc.
// ...
This works, but only up to a point. Let me explain:
Since I'm dealing with printer resolutions of around 300 dpi (vs. 96 dpi for a typical screen), the bitmaps turn out pretty large.
For instance, a bitmap of 3900x86625 pixels requires a DIB section of around 966 MB, which is why I substitute CreateDIBSection's default memory allocation in the code above with a file mapping object backed by a file on disk. Otherwise, I understand that a chunk of memory that large could hit the limit of contiguous address space in a 32-bit process. But that is not what happens.
When I run my code with a bitmap of that size in a 32-bit process, the bitmap is created OK using the file mapping object backed by a file on disk, but the resulting image comes out completely blank, while the exact same code compiled as a 64-bit process produces a normal-looking bitmap.
In light of that, I was curious whether these GDI objects have specific image size restrictions, for 32-bit and 64-bit processes?
And a side question: is there any way to know that such a size limit was reached while drawing?
PS. My actual code has all the error checks in place for all Win32 APIs, and while the 32-bit process described above produced a blank bitmap, not a single GDI API failed.

Media Foundation: Cannot change a FPS on webcam

I'm trying to replace DirectShow ("DS") code with Media Foundation ("MF") in my app, and I've hit one problem: I cannot set the FPS I need on a webcam using MF. MF only lets me set 30 fps; if I try to set 25 fps, I always get the error 0xC00D5212 from SetCurrentMediaType(). In DS I could change that parameter.
My code:
ASSERT(m_pReader); // IMFSourceReader *m_pReader;
IMFMediaType *pNativeType = NULL;
IMFMediaType *pType = NULL;
UINT32 w = 1280;
UINT32 h = 720;
UINT32 fps = 25; // or 30
DWORD dwStreamIndex = MF_SOURCE_READER_FIRST_VIDEO_STREAM;
// Find the native format of the stream.
HRESULT hr = m_pReader->GetNativeMediaType(dwStreamIndex, 0, &pNativeType);
if (FAILED(hr))
{
    //error
}
GUID majorType, subtype;
// Find the major type.
hr = pNativeType->GetGUID(MF_MT_MAJOR_TYPE, &majorType);
if (FAILED(hr))
{
    //error
}
// Define the output type.
hr = MFCreateMediaType(&pType);
if (FAILED(hr))
{
    //error
}
hr = pType->SetGUID(MF_MT_MAJOR_TYPE, majorType);
if (FAILED(hr))
{
    //error
}
// Select a subtype.
if (majorType == MFMediaType_Video)
{
    subtype = MFVideoFormat_RGB24;
}
else
{
    //error
}
hr = pType->SetGUID(MF_MT_SUBTYPE, subtype);
if (FAILED(hr))
{
    //error
}
hr = MFSetAttributeSize(pType, MF_MT_FRAME_SIZE, w, h);
if (FAILED(hr))
{
    //error
}
// The frame rate is a numerator/denominator ratio.
hr = MFSetAttributeRatio(pType, MF_MT_FRAME_RATE, fps, 1);
if (FAILED(hr))
{
    //error
}
hr = m_pReader->SetCurrentMediaType(dwStreamIndex, NULL, pType);
if (FAILED(hr))
{   // hr = 0xC00D5212
    //!!!!!error - if fps == 25
}
return hr;
Thanks for any help.
It may be that the camera does not support flexible frame rate values and can only work with a fixed set of supported rates, for example: 10, 15, 20, 24, 30 fps. You should be able to enumerate the supported media types and choose the one that works for you; those media types typically include the frame rate options.
Even though Media Foundation and DirectShow video capture eventually end up in the same backend, there might be discrepancies in behavior. Specifically, you are working with a higher-level Media Foundation API that internally interfaces with a media source, and it may be that the frame rate mismatch leads to the confusing 0xC00D5212 MF_E_TOPO_CODEC_NOT_FOUND ("No suitable transform was found to encode or decode the content") even though technically the driver can capture in the respective mode.
See also:
Get all supported FPS values of a camera in Microsoft Media Foundation
Media Foundation Video/Audio Capture Capabilities
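To illustrate the enumeration approach, here is a minimal sketch (my illustration, assuming the same m_pReader source reader as in the question):
IMFMediaType *pCandidate = NULL;
for (DWORD i = 0;
     SUCCEEDED(m_pReader->GetNativeMediaType(MF_SOURCE_READER_FIRST_VIDEO_STREAM, i, &pCandidate));
     ++i)
{
    UINT32 num = 0, den = 0;
    if (SUCCEEDED(MFGetAttributeRatio(pCandidate, MF_MT_FRAME_RATE, &num, &den)) && den != 0)
    {
        // Inspect num/den here; if this type's frame rate suits you,
        // pass it to m_pReader->SetCurrentMediaType(...) instead of a built type.
    }
    pCandidate->Release();
    pCandidate = NULL;
}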
I've added a timer to imitate FPS control in my code: at the start I set 30 fps, and then I skip frames according to the FPS scale my app needs.
Thank you for the help.

WIC Direct2D CreateBitmapFromMemory: limitations on width and height?

CreateBitmapFromMemory executes successfully when _nWidth is equal to or less than 644. If the value exceeds this, the returned HRESULT is -2003292276 (0x88982F8C).
Do limits exist on the width and height?
#include <d2d1.h>
#include <d2d1helper.h>
#include <wincodecsdk.h> // Use this for WIC Direct2D functions
void test()
{
    IWICImagingFactory *m_pIWICFactory = NULL;
    ID2D1Factory *m_pD2DFactory = NULL;
    IWICBitmap *m_pEmbeddedBitmap = NULL;
    ID2D1Bitmap *m_pD2DBitmap = NULL;
    unsigned char *pImageBuffer = new unsigned char[1024*1024];
    HRESULT hr = S_OK;
    int _nHeight = 300;
    int _nWidth = 644;
    // If _nWidth exceeds 644, CreateBitmapFromMemory returns an error:
    //_nWidth = 648;
    if (m_pIWICFactory == 0)
    {
        hr = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED | COINIT_DISABLE_OLE1DDE);
        // Create WIC factory
        hr = CoCreateInstance(
            CLSID_WICImagingFactory,
            NULL,
            CLSCTX_INPROC_SERVER,
            IID_PPV_ARGS(&m_pIWICFactory)
        );
        if (SUCCEEDED(hr))
        {
            // Create D2D factory
            hr = D2D1CreateFactory(D2D1_FACTORY_TYPE_SINGLE_THREADED, &m_pD2DFactory);
        }
    }
    hr = m_pIWICFactory->CreateBitmapFromMemory(
        _nHeight,                     // height
        _nWidth,                      // width
        GUID_WICPixelFormat24bppRGB,  // pixel format of the NEW bitmap
        _nWidth*3,                    // calculated from width and bpp information
        1024*1024,                    // height x width
        pImageBuffer,                 // name of the .c array
        &m_pEmbeddedBitmap            // pointer to pointer to whatever an IWICBitmap is.
    );
    if (!SUCCEEDED(hr)) {
        const char *buffer = "Error in CreateBitmapFromMemory\n";
    }
}
The error code is 0x88982F8C, WINCODEC_ERR_INSUFFICIENTBUFFER, and the reason should now be obvious:
the first parameter is the width and the second is the height; you have them in the wrong order. All in all, you provide inconsistent arguments, resulting in a buffer that is too small for the bitmap you describe.
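The corrected call would look like this (a sketch; the buffer size is derived from the stride instead of being hard-coded):
hr = m_pIWICFactory->CreateBitmapFromMemory(
    _nWidth,                      // width in pixels (first!)
    _nHeight,                     // height in pixels (second!)
    GUID_WICPixelFormat24bppRGB,  // pixel format of the new bitmap
    _nWidth * 3,                  // stride: bytes per scanline
    _nWidth * 3 * _nHeight,       // buffer size: stride x height
    pImageBuffer,
    &m_pEmbeddedBitmap
);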
Are you sure you passed the correct pixel format to CreateBitmapFromMemory? You hard-code it to GUID_WICPixelFormat24bppRGB; I think this is the root cause. You should make sure this format is the same as the format of the source bitmap you are copying the data from. Try using the GetPixelFormat function to get the correct format instead of hard-coding it.
There is an upper limit on the dimensions of images on the GPU.
Call GetMaximumBitmapSize on the render target.
http://msdn.microsoft.com/query/dev11.query?appId=Dev11IDEF1&l=EN-US&k=k(GetMaximumBitmapSize);k(DevLang-C%2B%2B);k(TargetOS-Windows)&rd=true
What you get back is the maximum number of pixels in either dimension, vertical or horizontal.
For larger images you'd have to load them into a software render target, such as a bitmap render target, and then render what you want from that.
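As a small sketch (assuming an existing ID2D1RenderTarget* pRenderTarget):
// Largest allowed width or height, in pixels, for bitmaps on this render target.
UINT32 maxSize = pRenderTarget->GetMaximumBitmapSize();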

IMovieControl::Run fails on Windows XP?

Actually, it only fails the second time it's called. I'm using a windowless control to play video content, where the video being played could change while the control is still on screen. Once the graph is built the first time, we switch media by stopping playback, replacing the SOURCE filter, and running the graph again. This works fine under Vista, but when running on XP, the second call to Run() returns E_UNEXPECTED.
The initialization goes something like this:
// Get the interface for DirectShow's GraphBuilder
mGB.CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER);
// Create the Video Mixing Renderer and add it to the graph
ATL::CComPtr<IBaseFilter> pVmr;
pVmr.CoCreateInstance(CLSID_VideoMixingRenderer9, NULL, CLSCTX_INPROC);
mGB->AddFilter(pVmr, L"Video Mixing Renderer 9");
// Set the rendering mode and number of streams
ATL::CComPtr<IVMRFilterConfig9> pConfig;
pVmr->QueryInterface(IID_IVMRFilterConfig9, (void**)&pConfig);
pConfig->SetRenderingMode(VMR9Mode_Windowless);
pVmr->QueryInterface(IID_IVMRWindowlessControl9, (void**)&mWC);
And here's what we do when we decide to play a movie. RenderFileToVideoRenderer is borrowed from dshowutil.h in the DirectShow samples area.
// Release the source filter, if it exists, so we can replace it.
IBaseFilter *pSource = NULL;
if (SUCCEEDED(mpGB->FindFilterByName(L"SOURCE", &pSource)) && pSource)
{
    mpGB->RemoveFilter(pSource);
    pSource->Release();
    pSource = NULL;
}
// Render the file.
hr = RenderFileToVideoRenderer(mpGB, mPlayPath.c_str(), FALSE);
// QueryInterface for DirectShow interfaces
hr = mpGB->QueryInterface(&mMC);
hr = mpGB->QueryInterface(&mME);
hr = mpGB->QueryInterface(&mMS);
// Read the default video size
hr = mpWC->GetNativeVideoSize(&lWidth, &lHeight, NULL, NULL);
if (hr != E_NOINTERFACE)
{
    if (FAILED(hr))
    {
        return hr;
    }
    // Play video at native resolution, anchored at top-left corner.
    RECT r;
    r.left = 0;
    r.top = 0;
    r.right = lWidth;
    r.bottom = lHeight;
    hr = mpWC->SetVideoPosition(NULL, &r);
}
// Run the graph to play the media file
if (mMC)
{
    hr = mMC->Run();
    if (FAILED(hr))
    {
        // We get here the second time this code is executed.
        return hr;
    }
    mState = Running;
}
if (mME)
{
    mME->SetNotifyWindow((OAHWND)m_hWnd, WM_GRAPHNOTIFY, 0);
}
Anybody know what's going on here?
Try calling IMediaControl::StopWhenReady before removing the source filter.
Wherever you call QueryInterface directly, you can use CComQIPtr<> to wrap the QI for you. That way you won't have to call Release, as it will be called automatically.
The syntax looks like this: CComQIPtr<IMediaControl> mediaControl = pGraph;
In FindFilterByName(), instead of passing a raw pointer, pass a CComPtr, again so you won't have to call Release explicitly. A sketch of both suggestions follows below.
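Applied to the code above, that style looks roughly like this (a sketch assuming the same member names):
// CComQIPtr performs the QueryInterface in its constructor, and
// Release is called automatically when the pointer goes out of scope.
CComQIPtr<IMediaControl> mediaControl(mpGB);
CComQIPtr<IMediaEventEx> mediaEvent(mpGB);
// FindFilterByName with a smart pointer: no explicit Release needed.
CComPtr<IBaseFilter> source;
if (SUCCEEDED(mpGB->FindFilterByName(L"SOURCE", &source)) && source)
{
    mpGB->RemoveFilter(source);
}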
Never got a resolution on this. The production solution was to just call IGraphBuilder::Release and rebuild the entire graph from scratch. There's a CPU spike and a slight redraw delay when switching videos, but it's less pronounced than we'd feared.