Resolution issue when creating an .mp4 video file from a camera stream using SourceReader - C++

I'm developing a Windows desktop camera application using the SourceReader technique. I have completed video streaming and still capture.
Now I'm working on capturing an .mp4 video file from a USB camera. I'm able to capture a video file at the following resolutions: 640 x 480, 1280 x 720 and 1920 x 1080.
I run into an issue when I set the video resolution higher than 1920 x 1080: the call to SetInputMediaType on the IMFSinkWriter object returns the HRESULT error code 0xc00d36b.
The video subtype I use for encoding is MFVideoFormat_H264.
Is there any other subtype available for encoding an .mp4 file besides MFVideoFormat_H264?
Why can't I capture an .mp4 file at a resolution higher than Full HD? Am I missing anything needed to encode the video file? If so, please give me some guidance on solving this issue.
Thanks in advance.

The likely bottleneck is the maximum resolution supported by the video encoder. You are presumably using the encoder implicitly as part of the Sink Writer. The Sink Writer on its own does not limit the resolution, but if the encoder is unable to handle a specific media type then encoding is impossible. Specifically, in Windows 7 the resolution is (or might be) limited to 1920x1088.
Also, you lost one digit from the error code.
See also:
IMFSinkWriter can't export a large size video of mp4
The max resolution for mp4(h264) encoder
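To make the limit from the answer above concrete, here is a minimal sketch of a pre-flight check you could run before calling SetInputMediaType. The names EncoderLimit, kAssumedWin7H264Limit and fitsEncoder are hypothetical, and the 1920x1088 figure is simply the Windows 7 number quoted in the answer, not a value queried from the encoder.

```cpp
#include <cstdint>

// Hypothetical description of an encoder's frame-size ceiling.
struct EncoderLimit {
    std::uint32_t maxWidth;
    std::uint32_t maxHeight;
};

// Assumption: the Windows 7 figure quoted in the answer (1920x1088).
// A real application should not hard-code this for other OS versions.
constexpr EncoderLimit kAssumedWin7H264Limit{1920, 1088};

// Returns true when a frame of the given size fits within the limit.
constexpr bool fitsEncoder(std::uint32_t width, std::uint32_t height,
                           EncoderLimit limit = kAssumedWin7H264Limit) {
    return width > 0 && height > 0 &&
           width <= limit.maxWidth && height <= limit.maxHeight;
}
```

With this, fitsEncoder(1920, 1080) passes while fitsEncoder(2560, 1440) does not, so the application can fall back to a lower resolution or another subtype before the Sink Writer rejects the media type.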

Roman gave a great answer, but just to add on the topic of other codecs: if you check the MPEG-4 File Sink page on MSDN, you will see that it mentions MJPG support as well (although it is not clear whether that is only available from Windows 8 or has simply been improved in Windows 8), so you should be able to use MFVideoFormat_MJPG too. I assume that should not have any size limitations, but of course the size of the resulting .mp4 file will grow drastically.

Related

Handling Image data from IMFSourceReader and IMFSample

I am attempting to use the IMFSourceReader to read and decode a .mp4 file. I have configured the source reader to decode to MFVideoFormat_NV12 by setting a partial media type and calling IMFSourceReader::SetCurrentMediaType and loaded a video with dimensions of 1266x544.
While processing I receive the MF_SOURCE_READERF_CURRENTMEDIATYPECHANGED flag with a new dimension of 1280x544 and a MF_MT_MINIMUM_DISPLAY_APERTURE of 1266x544.
I believe the expectation is to then use either the Video Resizer DSP or the Video Processor MFT. However, it is my understanding that the Video Processor MFT requires Windows 8.1 while I am on Windows 7, and the Video Resizer DSP does not support MFVideoFormat_NV12.
What is the correct way to crop out the extra data added by the source reader to display only the data within the minimum display aperture for MFVideoFormat_NV12?
The new media type says this: "the video is 1266x544, as you expected/requested, but I have to carry it in 1280x544 textures because this is how the GPU wants it to work".
Generally speaking this does not require further scaling or cropping: you already have the frames you need. If you are reading them out of sample objects, which is what I believe you are trying to do, just use the increased stride (1280 bytes between consecutive rows).
If you are using this as a texture, presenting it somewhere or using it as a part of rendering, you would just use adjusted coordinates (0, 0) - (1266, 544) ignoring the remainder, as opposed to using full texture.
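As a rough sketch of the stride approach described above, the following copies only the visible region out of a padded NV12 buffer into a tightly packed one. The function name packNv12 is hypothetical, and it assumes even width and height and that the interleaved UV plane starts right after paddedHeight rows of luma.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies the visible width x height region of an NV12 frame whose rows are
// `stride` bytes apart into a tightly packed NV12 buffer.
// Assumptions: width and height are even, and the UV plane begins after
// `paddedHeight` rows of luma in the source buffer.
std::vector<std::uint8_t> packNv12(const std::uint8_t* src, std::size_t stride,
                                   std::size_t paddedHeight,
                                   std::size_t width, std::size_t height) {
    std::vector<std::uint8_t> dst(width * height * 3 / 2);
    std::uint8_t* out = dst.data();

    // Luma plane: one byte per pixel, `height` rows.
    for (std::size_t y = 0; y < height; ++y) {
        std::memcpy(out, src + y * stride, width);
        out += width;
    }

    // Chroma plane: interleaved U/V pairs, one pair per 2x2 pixel block,
    // so each row again holds `width` bytes and there are height / 2 rows.
    const std::uint8_t* uv = src + stride * paddedHeight;
    for (std::size_t y = 0; y < height / 2; ++y) {
        std::memcpy(out, uv + y * stride, width);
        out += width;
    }
    return dst;
}
```

For the case in the question this would be called as packNv12(data, 1280, 544, 1266, 544), where data is the locked sample buffer.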

Encoding RGB to H.264

I am trying to record the screen on Windows XP and Windows 7. I get the bitmap by using DirectX's CreateOffscreenPlainSurface and GetFrontBufferData. I need to encode the bitmap into H.264 video. The problem is that the captured bitmap is in the D3DFMT_A8R8G8B8 format, but the H.264 Video Encoder only supports MFVideoFormat_I420, MFVideoFormat_IYUV, MFVideoFormat_NV12, MFVideoFormat_YUY2 and MFVideoFormat_YV12 as input. My question is: do I need to convert the format myself (I would rather not)? Are there any better solutions for this?
The input format corresponds to MFVideoFormat_ARGB32.
The stock OS component that handles the conversion is the Video Processor MFT. I don't see availability information in the footer of the MSDN article; however, I am under the impression that this MFT comes with Windows Vista, just like the whole Media Foundation API.
On Windows XP there is the similar Color Converter DSP, which offers very similar services through the closely related DirectX Media Object (DMO) interface. It is also available in all more recent operating systems; however, it is software only and never uses the GPU for the conversion.
Both of these can handle the requested format conversion for you.
Also for the reference, H.264 Video Encoder was introduced with Windows 7 only.
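The answer above recommends the stock components, and that is what you should normally use. Purely to illustrate what the conversion involves if you had to do it by hand, here is a minimal software sketch from 32-bit BGRA (the in-memory byte order of D3DFMT_A8R8G8B8 on little-endian machines) to NV12. The function name bgraToNv12 is hypothetical, and the BT.601 limited-range integer coefficients are an assumption; the encoder may expect a different matrix or range.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts a 32-bit BGRA image (byte order B, G, R, A) to NV12.
// Assumptions: BT.601 limited-range coefficients, even width and height,
// `pitch` is the byte distance between consecutive source rows.
std::vector<std::uint8_t> bgraToNv12(const std::uint8_t* bgra, std::size_t pitch,
                                     std::size_t width, std::size_t height) {
    std::vector<std::uint8_t> nv12(width * height * 3 / 2);
    std::uint8_t* yPlane = nv12.data();
    std::uint8_t* uvPlane = yPlane + width * height;

    for (std::size_t y = 0; y < height; y += 2) {
        for (std::size_t x = 0; x < width; x += 2) {
            int sumR = 0, sumG = 0, sumB = 0;
            for (std::size_t dy = 0; dy < 2; ++dy) {
                for (std::size_t dx = 0; dx < 2; ++dx) {
                    const std::uint8_t* p = bgra + (y + dy) * pitch + (x + dx) * 4;
                    const int b = p[0], g = p[1], r = p[2];
                    // One luma sample per pixel.
                    yPlane[(y + dy) * width + (x + dx)] = static_cast<std::uint8_t>(
                        ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
                    sumR += r;
                    sumG += g;
                    sumB += b;
                }
            }
            // Average the 2x2 block (rounded), then derive one U/V pair from it.
            const int r = (sumR + 2) / 4, g = (sumG + 2) / 4, b = (sumB + 2) / 4;
            // 32896 = 128 (rounding) + 128 * 256 (chroma offset); adding it
            // before the shift keeps the shifted value non-negative.
            std::uint8_t* uv = uvPlane + (y / 2) * width + x;
            uv[0] = static_cast<std::uint8_t>((-38 * r - 74 * g + 112 * b + 32896) >> 8);
            uv[1] = static_cast<std::uint8_t>((112 * r - 94 * g - 18 * b + 32896) >> 8);
        }
    }
    return nv12;
}
```

A per-pixel loop like this is slow for full-screen capture, which is one more reason to prefer the OS-provided converters where they are available.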

Best way to load in a video and to grab images using c++

I am looking for a fast way to load in a video file and to create images from them at certain intervals ( every second, every minute, every hour, etc.).
I tried using DirectShow, but it just ran too slow for me to start the video file and move to a certain location to get data and to save it out to an image. Even if I disabled the reference clock. Tried OpenCV, but it has trouble opening the AVI file unless I know the exact codec information. So if I know a way to get the codec information out from OpenCV I may give it another shot. I tried to use FFMPEG, but I don't have as much control over it as well as I would wish.
Any advice would be greatly appreciated. This is being developed on a Windows box since it has to be hosted on a Windows box.
MPEG-4 is not an intra-coded format, so you can't just jump to a random frame and decode it on its own, as most frames only encode the differences from one or more other frames. I suspect your decoding is slow because you land on a frame that requires several other frames it depends on to be decoded first.
One way to improve performance would be to determine which frames are keyframes (or sometimes also called 'sync' points) and limit your decoding to those frames, since these can be decoded on their own.
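A minimal sketch of the keyframe idea, assuming you already have a list of frame timestamps with sync-point flags: pick the last keyframe at or before each time you want to sample. The FrameInfo struct and keyframeAtOrBefore function are hypothetical; a real application would fill the list from whatever index its demuxer exposes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical index entry, filled from the container's frame index.
struct FrameInfo {
    std::int64_t timestampMs;
    bool isKeyframe;
};

// Returns the index of the last keyframe at or before targetMs,
// or -1 if there is none. `frames` must be sorted by timestamp.
std::ptrdiff_t keyframeAtOrBefore(const std::vector<FrameInfo>& frames,
                                  std::int64_t targetMs) {
    // Find the first frame strictly after the target time.
    auto it = std::upper_bound(
        frames.begin(), frames.end(), targetMs,
        [](std::int64_t t, const FrameInfo& f) { return t < f.timestampMs; });

    // Walk backwards until a frame that can be decoded on its own is found.
    while (it != frames.begin()) {
        --it;
        if (it->isKeyframe) return it - frames.begin();
    }
    return -1;
}
```

Grabbing the returned keyframe instead of the exact target frame trades a little timing accuracy for not having to decode the frames in between.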
I'm not very familiar with DirectShow capabilities, but I would expect it has some API to expose sync points.
Also, I should mention that the QuickTime SDK on Windows is possibly another good option that you have for decoding frames from movies. You should first test that your AVI movies are played correctly in the QuickTime Player. And the QT SDK does expose sync points, see the section Finding Interesting Times in the QT SDK documentation.
ffmpeg's libavformat might work for ya...

How to read .avi files C++

I want to read in an .avi video file for a program that I am making. I have the file location saved as a string. Are there any good tutorials on using .avi files in C++, or does anyone know how to read one in? Is it the same as normal files?
I have a previously asked SO question that goes into better detail but here is what I want to do:
I am making a program that will detect faces (through OpenCV). As of now I have been given a video processor program that will detect each face in a frame, and return the frame as an image and the CvRect of each face. I want to take these faces and test them to validate that they are all actually faces.
After I have all the faces (tested) I want to then take the images and test them together. I test the faces on each frame for size and distance changes. If the faces pass this for a frame length of two seconds, then I want to crop the face and make it the subject of each frame.
After each frame is cropped I then want to save the new video file for the user.
Hopefully that helps. If anyone needs a better explanation please let me know.
First of all, a little background.
What is AVI?
AVI stands for Audio Video Interleave. It is a special case of the RIFF (Resource Interchange File Format). AVI is defined by Microsoft and it is the most common format for audio/video data.
I assume you want to read an AVI file and decode the compressed video frames. An AVI file is just like any other file: you can use fread() (in C) or iostream (in C++) to open it and read its contents. But the contents of an AVI file are video frames in a compressed format. The compression allows video content to be packed into much less space. To make any sense of this compressed data you have to decode it. You will have to study the standard that describes how the AVI data is encoded, then extract and decode the frames. The resulting raw video data can then be fed to a video device for display.
It seems you are staying within OpenCV so things are easy. If OpenCV is compiled properly it is capable of delegating io/coding/decoding to other libraries. Quicktime and others for example, but best is to use ffmpeg. You open, read and decode everything using the OpenCV API which gives you the video frame by frame.
Make sure your OpenCV is compiled with ffmpeg support and then read the OpenCV tutorial on how to read/write AVI files. It's really easy.
Getting OpenCV to be built with ffmpeg support might be hard though. You might want to switch to an older version of OpenCV if you can't get ffmpeg running with the current one.
Personally, I would not spend time trying to read the video yourself; delegate the task to OpenCV instead. That's how it is supposed to be used.

encoding camera with audio source in realtime with WMAsfWriter - jitter problem

I build a DirectShow graph consisting of my video capture filter
(grabbing the screen) and the default audio input filter, both connected
through a splitter to the WM ASF Writer output filter and to the VMR9 renderer.
This means I want to have realtime audio/video encoding to disk
together with preview. The problem is that no matter what WM profile I
choose (even a very low resolution profile) the output video file is
always jittery: every few frames there is a delay. The audio is OK;
there is no jitter in audio. The CPU usage is low (< 10%), so I believe
this is not a problem of lack of CPU resources. I think I'm
time-stamping my frames correctly.
What could be the reason?
Below is a link to a recorded video showing the problem:
http://www.youtube.com/watch?v=b71iK-wG0zU
Thanks
Dominik Tomczak
I have had this problem in the past. Your problem is the volume of data being written to disk. Writing to a faster drive is a great and simple solution to this problem. The other thing I've done is to place a video compressor into the graph. You need to make sure both input streams are using the same reference clock. I have had a lot of problems using this compressor scheme and keeping a good preview. My preview's frame rate dies even if I use an Infinite Tee rather than a Smart Tee; the result written to disk was fine, though. It's also worth noting that the more powerful the machine I ran it on, the less of an issue this was, so if you need both preview and recording, the compressor may not be much of a win over simply putting a faster hard disk in the machine.
I don't think this is the issue. The volume of data written is less than 1 MB/s (average compression ratio during encoding). I found the reason: when I build the graph without audio input (the WM ASF Writer has only a video input pin) and my video capture pin is connected through a Smart Tee to the preview pin and to the WM ASF Writer video input pin, there is no glitch in the output movie. I reckon this is a problem with audio to video synchronization in my graph. The same happens when I build the graph in GraphEdit. Without audio, no glitch. With audio, there is a constant glitch every 1 s. I wonder whether I time-stamp my frames wrongly, but I think I'm doing it correctly. What is the general solution for audio to video synchronization in DirectShow graphs?