OpenCV + FFmpeg: Error when recording lossless video - c++

I'm trying to record a lossless video from a webcam using opencv.
I would like to use the FFV1 codec for this.
I open my video writter like this:
theVideoWriter.open(filename,CV_FOURCC('F','F','V','1'), 30, cv::Size(1280,720), true);
I can successfully open the video writter but while recording I get following error message:
[ffv1 # 26d78020] Provided packet is too small, needs to be 8310784
The resulting video is not playable. Other FFmpeg codecs like FMP4 work fine.
What does that error mean and how can I fix it?

Related

Ffmpeg video output is 0 seconds with correct filesize when uploading to google cloud bucket

I've made a C++ program that lives in gke and takes some videos as input using ffmpeg, then does something with that input using opengl(not relevant), then finally encodes those edited videos as a single output. Normally the program works perfectly fine on my local machine, it encodes just as I want it to with no warnings or valgrind errors whatsoever. Then, after encoding the said video, I want my program to upload that video to the google cloud storage. This is where the problem comes, I have tried 2 methods for this: First, I tried using curl to upload to the cloud using a signed url. Second, I tried mounting the google storage using gcsfuse(I was already mounting the bucket to access the inputs in question). Both of those methods yielded undefined, weird behaviour's ranging from: Outputing a 0byte or 44byte file, (This is the most common one:) encoding in the correct file size ~500mb but the video is 0 seconds long, outputing a 0.4 second video or just encoding the desired output normally (really rare).
From the logs I can't see anything unusual, everything seems to work fine and ffmpeg does not give any errors or warnings, so does valgrind. Everything seems to work normally, even when I use curl to upload the video to the cloud the output is perfectly fine when it first encodes it (before sending it with curl) but the video gets messed up when curl uploads it to the cloud.
I'm using the muxing.c example of ffmpeg to encode my video with the only difference being:
void video_encoder::fill_yuv_image(AVFrame *frame, struct SwsContext *sws_context) {
const int in_linesize[1] = { 4 * width };
//uint8_t* dest[4] = { rgb_data, NULL, NULL, NULL };
sws_context = sws_getContext(
width, height, AV_PIX_FMT_RGBA,
width, height, AV_PIX_FMT_YUV420P,
SWS_BICUBIC, 0, 0, 0);
sws_scale(sws_context, (const uint8_t * const *)&rgb_data, in_linesize, 0,
height, frame->data, frame->linesize);
}
rgb_data is the data I got after editing the inputs. Again, this works fine and I don't think there are any errors here.
I'm not sure where the error is and since the code is huge I can't provide a replicable example. I'm just looking for someone to point me to the right direction.
Running the cloud's output in mplayer wields this result (This is when the video is the right size but is 0 seconds long, the most common one.):
MPlayer 1.4 (Debian), built with gcc-11 (C) 2000-2019 MPlayer Team
do_connect: could not connect to socket
connect: No such file or directory
Failed to open LIRC support. You will not be able to use your remote control.
Playing /media/c36c2633-d4ee-4d37-825f-88ae54b86100.
libavformat version 58.76.100 (external)
libavformat file format detected.
[mov,mp4,m4a,3gp,3g2,mj2 # 0x7f2cba1168e0]moov atom not found
LAVF_header: av_open_input_stream() failed
libavformat file format detected.
[mov,mp4,m4a,3gp,3g2,mj2 # 0x7f2cba1168e0]moov atom not found
LAVF_header: av_open_input_stream() failed
RAWDV file format detected.
VIDEO: [DVSD] 720x480 24bpp 29.970 fps 0.0 kbps ( 0.0 kbyte/s)
X11 error: BadMatch (invalid parameter attributes)
Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object file: No such file or directory
[vdpau] Error when calling vdp_device_create_x11: 1
==========================================================================
Opening video decoder: [ffmpeg] FFmpeg's libavcodec codec family
libavcodec version 58.134.100 (external)
[dvvideo # 0x7f2cb987a380]Requested frame threading with a custom get_buffer2() implementation which is not marked as thread safe. This is not supported anymore, make your callback thread-safe.
Selected video codec: [ffdv] vfm: ffmpeg (FFmpeg DV)
==========================================================================
Load subtitles in /media/
==========================================================================
Opening audio decoder: [libdv] Raw DV Audio Decoder
Unknown/missing audio format -> no sound
ADecoder init failed :(
Opening audio decoder: [ffmpeg] FFmpeg/libavcodec audio decoders
[dvaudio # 0x7f2cb987a380]Decoder requires channel count but channels not set
Could not open codec.
ADecoder init failed :(
ADecoder init failed :(
Cannot find codec for audio format 0x56444152.
Audio: no sound
Starting playback...
[dvvideo # 0x7f2cb987a380]could not find dv frame profile
Error while decoding frame!
[dvvideo # 0x7f2cb987a380]could not find dv frame profile
Error while decoding frame!
V: 0.0 2/ 2 ??% ??% ??,?% 0 0
Exiting... (End of file)
Edit: Since the code runs on a VM, I'm using xvfb-run ro start my application, but again even when using xvfb-run it works completely fine on when not encoding to the cloud.
Apparently, I'm assuming for security reasons, the google cloud storage does not allow us to do multiple continuous operations on a file, just a singular read/write operation. So I found a workaround by encoding my video to a local file inside the pod and then doing a copy operation to the cloud.

Use Source Reader to get H264 samples from webcam source

When using the Source Reader I can use it to get decoded YUV samples using an mp4 file source (example code).
How can I do the opposite with a webcam source? Use the Source Reader to provide encoded H264 samples? My webcam supports RGB24 and I420 pixel formats and I can get H264 samples if I manually wire up the H264 MFT transform. But it seems as is the Source Reader should be able to take care of the transform for me. I get an error whenever I attempt to set MF_MT_SUBTYPE of MFVideoFormat_H264 on the Source Reader.
Sample snippet is shown below and the full example is here.
// Get the first available webcam.
CHECK_HR(MFCreateAttributes(&videoConfig, 1), "Error creating video configuration.");
// Request video capture devices.
CHECK_HR(videoConfig->SetGUID(
MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE,
MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID), "Error initialising video configuration object.");
CHECK_HR(videoConfig->SetGUID(MF_MT_SUBTYPE, WMMEDIASUBTYPE_I420),
"Failed to set video sub type to I420.");
CHECK_HR(MFEnumDeviceSources(videoConfig, &videoDevices, &videoDeviceCount), "Error enumerating video devices.");
CHECK_HR(videoDevices[WEBCAM_DEVICE_INDEX]->GetAllocatedString(MF_DEVSOURCE_ATTRIBUTE_FRIENDLY_NAME, &webcamFriendlyName, &nameLength),
"Error retrieving video device friendly name.\n");
wprintf(L"First available webcam: %s\n", webcamFriendlyName);
CHECK_HR(videoDevices[WEBCAM_DEVICE_INDEX]->ActivateObject(IID_PPV_ARGS(&pVideoSource)),
"Error activating video device.");
CHECK_HR(MFCreateAttributes(&pAttributes, 1),
"Failed to create attributes.");
// Adding this attribute creates a video source reader that will handle
// colour conversion and avoid the need to manually convert between RGB24 and RGB32 etc.
CHECK_HR(pAttributes->SetUINT32(MF_SOURCE_READER_ENABLE_VIDEO_PROCESSING, 1),
"Failed to set enable video processing attribute.");
CHECK_HR(pAttributes->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video), "Failed to set major video type.");
// Create a source reader.
CHECK_HR(MFCreateSourceReaderFromMediaSource(
pVideoSource,
pAttributes,
&pVideoReader), "Error creating video source reader.");
MFCreateMediaType(&pSrcOutMediaType);
CHECK_HR(pSrcOutMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video), "Failed to set major video type.");
CHECK_HR(pSrcOutMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_H264), "Error setting video sub type.");
CHECK_HR(pSrcOutMediaType->SetUINT32(MF_MT_AVG_BITRATE, 240000), "Error setting average bit rate.");
CHECK_HR(pSrcOutMediaType->SetUINT32(MF_MT_INTERLACE_MODE, 2), "Error setting interlace mode.");
CHECK_HR(pVideoReader->SetCurrentMediaType((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, NULL, pSrcOutMediaType),
"Failed to set media type on source reader.");
CHECK_HR(pVideoReader->GetCurrentMediaType((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, &pFirstOutputType),
"Error retrieving current media type from first video stream.");
std::cout << "Source reader output media type: " << GetMediaTypeDescription(pFirstOutputType) << std::endl << std::endl;
Output:
bind returned success
First available webcam: Logitech QuickCam Pro 9000
Failed to set media type on source reader. Error: C00D5212.
finished.
Source Reader does not look like suitable API here. It is API to implement "half of pipeline" which includes necessary decoding but not encoding. The other half is Sink Writer API which is capable to handle encoding, and which can encode H.264.
Or your another option, unless you are developing a UWP project, is Media Session API which implements a pipeline end to end.
Even though technically (in theory) you could have an encoding MFT as a part of Source Reader pipeline, Source Reader API itself is insufficiently flexible to add encoding style tansforms based on requested media types.
So, one solution could be to have Source Reader to read with necessary decoding (such as up to having RGB32 or NV12 video frames), then Sink Writer to manage encoding with respectively appropriate media sink on its end (or Sample Grabber as media sink). Another solution is to put Media Foundation primitives into Media Session pipeline which can manage both decoding and encoding parts, connected together.
Now, your use case is clearer.
For me, your MFWebCamRtp is the best optimized way of doing : WebCam Source Reader -> Encoding -> RTP Streaming.
But you are experiencing presentation clock issues, synchronization issues, or unsynchronized audio video issues. Am I right ?
So you tried Sample Grabber Sink, and now Source Reader, like I suggested to you. Of course, you can think that a Media Session will be able to do it better.
I think so, but extra work will be needed.
Here is what I would do in your case :
Code a custom RTP Sink
Create a topology with webcam source, h264 encoder, your custom RTP Sink
Add your topology to a MediaSession
Use the MediaSession to play the process
If you want a networkstream sink sample, see this : MFSkJpegHttpStreamer
This is old, but it's a good start. This program also uses winsock, like your.
You should be aware that RTP protocol uses UDP. A very good way to have synchronization issues... Definitely your main problem, as I guess.
What I think. You are trying to compensate for the weaknesses of the RTP protocol (UDP), with a management of the audio / video synchronization of MediaFoundation. I think you will just fail with this approach.
I think your main problem is RTP protocol.
EDIT
No I'm not having synchronisation issues. The Source Reader and Sample Grabber both provide correct timestamps which I can use in the RTP header. Likewise no problems with RTP/UDP etc. that's the bit I do know about. My questions are originating from a desire to understand the most efficient (least amount of plumbing code) and flexible solution. And yes it does look like a custom sink writer is the optimal solution.
Again things are clearer. If you need help with a custom RTP sink, I'll be there.

record video & audio to mp4 file on windows pc

I have received video packets(h264) and audio packets(aac) from network(somewhere) separately. And i can get the ntptime of every packet(a or v).
now I want write them to a mp4 file in 25 fps, how can l do?
I have tried to do it using ffmpeg, but failed. please help me, thanks.

FFMPEG with C++ accessing a webcam

I have searched all around and can not find any examples or tutorials on how to access a webcam using ffmpeg in C++. Any sample code or any help pointing me to some documentation, would greatly be appreciated.
Thanks in advance.
I have been working on this for months now. Your first "issue" is that ffmpeg (libavcodec and other ffmpeg libs) does NOT access web cams, or any other device.
For a basic USB webcam, or audio/video capture card, you first need driver software to access that device. For linux, these drivers fall under the Video4Linux (V4L2 as it is known) category, which are modules that are part of most distros. If you are working with MS Windows, then you need to get an SDK that allows you to access the device. MS may have something for accessing generic devices, (but from my experience, they are not very capable, if they work at all) If you've made it this far, then you now have raw frames (video and/or audio).
THEN you get to the ffmpeg part - libavcodec - which takes the raw frames (audio and/or video) and encodes them into a streams, which ffmpeg can then mux into your final container.
I have searched, but have found very few examples of all of these, and most are piece-meal.
If you don't need to actually code of this yourself, the command line ffmpeg, as well as vlc, can access these devices, capture and save to files, and even stream.
That's the best I can do for now.
ken
For windows use dshow
For Linux (like ubuntu) use Video4Linux (V4L2).
FFmpeg can take input from V4l2 and can do the process.
To find the USB video path type : ls /dev/video*
E.g : /dev/video(n) where n = 0 / 1 / 2 ….
AVInputFormat – Struct which holds the information about input device format / media device format.
av_find_input_format ( “v4l2”) [linux]
av_format_open_input(AVFormatContext , “/dev/video(n)” , AVInputFormat , NULL)
if return value is != 0 then error.
Now you have accessed the camera using FFmpeg and can continue the operation.
sample code is below.
int CaptureCam()
{
avdevice_register_all(); // for device
avcodec_register_all();
av_register_all();
char *dev_name = "/dev/video0"; // here mine is video0 , it may vary.
AVInputFormat *inputFormat =av_find_input_format("v4l2");
AVDictionary *options = NULL;
av_dict_set(&options, "framerate", "20", 0);
AVFormatContext *pAVFormatContext = NULL;
// check video source
if(avformat_open_input(&pAVFormatContext, dev_name, inputFormat, NULL) != 0)
{
cout<<"\nOops, could'nt open video source\n\n";
return -1;
}
else
{
cout<<"\n Success !";
}
} // end function
Note : Header file < libavdevice/avdevice.h > must be included
This really doesn't answer the question as I don't have a pure ffmpeg solution for you, However, I personally use Qt for webcam access. It is C++ and will have a much better API for accomplishing this. It does add a very large dependency on your code however.
It definitely depends on the webcam - for example, at work we use IP cameras that deliver a stream of jpeg data over the network. USB will be different.
You can look at the DirectShow samples, eg PlayCap (but they show AmCap and DVCap samples too). Once you have a directshow input device (chances are whatever device you have will be providing this natively) you can hook it up to ffmpeg via the dshow input device.
And having spent 5 minutes browsing the ffmpeg site to get those links, I see this...

Encoding video on H.263 to send over RTP

I'm developing an application to send video over RTP to a client that can play only H.263 (1996) and H263+ (1998).
To do this i've encoded the video using libav following these steps: (this is only part of the code)
av_register_all();
avformat_network_init();
Fmt = av_guess_format("rtp", NULL, NULL);
...
st = add_video_stream(FmtCtx, CODEC_ID_H263);
...
avio_open(&FmtCtx->pb, rtp_url, URL_WRONLY)
To finally enter a loop where i encode the video, the problem is that the stream generated by this program is encoded in H.263-2000 (or H.263++) which the other side cannot undertand, even though i use CODEC_ID_H263 or CODEC_ID_H263P in the initialization the same thing happens.
Is it possible to encode in those old H.263 versions using libav? i havent managed to do it not even using ffmpeg commands. The stream is always h.263-2000 (PT=96)