Saving H.264 RTP stream without re-encoding? - c++

My C++ application receives an H.264 RTP video stream.
Right now it decodes the stream, saves it into a YUV file, and later I use ffmpeg to re-encode the file into something suitable to watch on a Windows PC (e.g. MPEG-4 AVI).
Shouldn't it be possible to save the H.264 stream into an AVI (or similar) container without having to decode and re-encode it? That would require some H.264 decoder on the PC to watch it, but it should be much more efficient.
How could that be done? Are there any libraries supporting that?

Using ffmpeg is correct, but the answers posted so far don't look right to me.
The correct switch should be:
-vcodec copy
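For example, given an SDP file that describes the incoming stream (see the sample SDP further down), something like the following remuxes without transcoding; note that recent ffmpeg builds may additionally require -protocol_whitelist file,udp,rtp:
ffmpeg -i stream.sdp -vcodec copy output.avi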

Your program could pipe the RTP stream itself through ffmpeg - even invoking it using popen3().
It seems that you need to use an intermediate SDP file - I speculate that you could create that file as a named pipe or with tmpfile(), have your application write to it, and let ffmpeg read it as an intermediary.
The command line would be something like:
int p[3];
const char* const out_fmt = "avi";
// "-f","sdp" tells ffmpeg that the input is described by an SDP file
const char* cmd[] = {"ffmpeg","-f","sdp","-i",temp_sdp_filename,"-vcodec","copy","-f",out_fmt,"-",NULL};
if(-1 == popen3(p,cmd)) ...
// write the RTP that you receive to p[STDIN_FILENO]
// read the AVI from p[STDOUT_FILENO]
// read any messages and error text from p[STDERR_FILENO]
I believe that in this circumstance ffmpeg is clever enough to repackage the container (RTP stream vs AVI) without transcoding the video and audio (that is what the -vcodec copy switch does); therefore, you'd have no loss of quality and it'd be blazingly fast.
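For reference, the intermediate SDP file describing an H.264 RTP stream could look something like this (the address, port and payload type are examples and must match your actual stream):
v=0
o=- 0 0 IN IP4 127.0.0.1
s=H.264 RTP stream
c=IN IP4 127.0.0.1
t=0 0
m=video 5004 RTP/AVP 96
a=rtpmap:96 H264/90000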

Related

Use Source Reader to get H264 samples from webcam source

Using the Source Reader I can get decoded YUV samples from an mp4 file source (example code).
How can I do the opposite with a webcam source, i.e. use the Source Reader to provide encoded H264 samples? My webcam supports the RGB24 and I420 pixel formats, and I can get H264 samples if I manually wire up the H264 MFT transform. But it seems as if the Source Reader should be able to take care of the transform for me. I get an error whenever I attempt to set an MF_MT_SUBTYPE of MFVideoFormat_H264 on the Source Reader.
A sample snippet is shown below and the full example is here.
// Get the first available webcam.
CHECK_HR(MFCreateAttributes(&videoConfig, 1), "Error creating video configuration.");
// Request video capture devices.
CHECK_HR(videoConfig->SetGUID(
MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE,
MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID), "Error initialising video configuration object.");
CHECK_HR(videoConfig->SetGUID(MF_MT_SUBTYPE, WMMEDIASUBTYPE_I420),
"Failed to set video sub type to I420.");
CHECK_HR(MFEnumDeviceSources(videoConfig, &videoDevices, &videoDeviceCount), "Error enumerating video devices.");
CHECK_HR(videoDevices[WEBCAM_DEVICE_INDEX]->GetAllocatedString(MF_DEVSOURCE_ATTRIBUTE_FRIENDLY_NAME, &webcamFriendlyName, &nameLength),
"Error retrieving video device friendly name.\n");
wprintf(L"First available webcam: %s\n", webcamFriendlyName);
CHECK_HR(videoDevices[WEBCAM_DEVICE_INDEX]->ActivateObject(IID_PPV_ARGS(&pVideoSource)),
"Error activating video device.");
CHECK_HR(MFCreateAttributes(&pAttributes, 1),
"Failed to create attributes.");
// Adding this attribute creates a video source reader that will handle
// colour conversion and avoid the need to manually convert between RGB24 and RGB32 etc.
CHECK_HR(pAttributes->SetUINT32(MF_SOURCE_READER_ENABLE_VIDEO_PROCESSING, 1),
"Failed to set enable video processing attribute.");
CHECK_HR(pAttributes->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video), "Failed to set major video type.");
// Create a source reader.
CHECK_HR(MFCreateSourceReaderFromMediaSource(
pVideoSource,
pAttributes,
&pVideoReader), "Error creating video source reader.");
MFCreateMediaType(&pSrcOutMediaType);
CHECK_HR(pSrcOutMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video), "Failed to set major video type.");
CHECK_HR(pSrcOutMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_H264), "Error setting video sub type.");
CHECK_HR(pSrcOutMediaType->SetUINT32(MF_MT_AVG_BITRATE, 240000), "Error setting average bit rate.");
CHECK_HR(pSrcOutMediaType->SetUINT32(MF_MT_INTERLACE_MODE, 2), "Error setting interlace mode.");
CHECK_HR(pVideoReader->SetCurrentMediaType((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, NULL, pSrcOutMediaType),
"Failed to set media type on source reader.");
CHECK_HR(pVideoReader->GetCurrentMediaType((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, &pFirstOutputType),
"Error retrieving current media type from first video stream.");
std::cout << "Source reader output media type: " << GetMediaTypeDescription(pFirstOutputType) << std::endl << std::endl;
Output:
bind returned success
First available webcam: Logitech QuickCam Pro 9000
Failed to set media type on source reader. Error: C00D5212.
finished.
The Source Reader does not look like a suitable API here. It is an API to implement "half of the pipeline", which includes the necessary decoding but not encoding. The other half is the Sink Writer API, which is capable of handling encoding, and which can encode H.264.
Another option, unless you are developing a UWP project, is the Media Session API, which implements a pipeline end to end.
Even though technically (in theory) you could have an encoding MFT as part of a Source Reader pipeline, the Source Reader API itself is not flexible enough to add encoding-style transforms based on requested media types.
So, one solution could be to have the Source Reader read with the necessary decoding (such as up to RGB32 or NV12 video frames), and then a Sink Writer to manage the encoding with an appropriate media sink on its end (or a Sample Grabber as the media sink). Another solution is to put the Media Foundation primitives into a Media Session pipeline, which can manage both the decoding and encoding parts connected together.
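As a rough illustration of the first solution, a minimal sketch of a Source Reader feeding a Sink Writer could look like this (the CHECK_HR macro is the one from the question; the output file name, frame size, frame rate and bitrate are assumptions you would adapt):
IMFSinkWriter *pWriter = NULL;
IMFMediaType *pOutType = NULL, *pInType = NULL;
DWORD sinkStream = 0;
CHECK_HR(MFCreateSinkWriterFromURL(L"capture.mp4", NULL, NULL, &pWriter), "Failed to create sink writer.");
// Encoded (output) type: the Sink Writer loads an H.264 encoder MFT for us.
CHECK_HR(MFCreateMediaType(&pOutType), "Failed to create output media type.");
CHECK_HR(pOutType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video), "Failed to set major type.");
CHECK_HR(pOutType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_H264), "Failed to set sub type.");
CHECK_HR(pOutType->SetUINT32(MF_MT_AVG_BITRATE, 240000), "Failed to set bitrate.");
CHECK_HR(pOutType->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive), "Failed to set interlace mode.");
CHECK_HR(MFSetAttributeSize(pOutType, MF_MT_FRAME_SIZE, 640, 480), "Failed to set frame size.");
CHECK_HR(MFSetAttributeRatio(pOutType, MF_MT_FRAME_RATE, 30, 1), "Failed to set frame rate.");
CHECK_HR(pWriter->AddStream(pOutType, &sinkStream), "Failed to add stream to sink writer.");
// Uncompressed (input) type: whatever the Source Reader currently delivers.
CHECK_HR(pVideoReader->GetCurrentMediaType((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, &pInType), "Failed to get reader media type.");
CHECK_HR(pWriter->SetInputMediaType(sinkStream, pInType, NULL), "Failed to set sink writer input type.");
CHECK_HR(pWriter->BeginWriting(), "Failed to begin writing.");
// Pump decoded samples from the reader into the encoding writer.
while (true) {
    DWORD streamFlags = 0;
    LONGLONG llTimestamp = 0;
    IMFSample *pSample = NULL;
    CHECK_HR(pVideoReader->ReadSample(MF_SOURCE_READER_FIRST_VIDEO_STREAM, 0, NULL, &streamFlags, &llTimestamp, &pSample), "Failed to read sample.");
    if (streamFlags & MF_SOURCE_READERF_ENDOFSTREAM) break;
    if (pSample) {
        CHECK_HR(pWriter->WriteSample(sinkStream, pSample), "Failed to write sample.");
        pSample->Release();
    }
}
CHECK_HR(pWriter->Finalize(), "Failed to finalise sink writer.");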
Now, your use case is clearer.
For me, your MFWebCamRtp is the most optimized way of doing: WebCam Source Reader -> Encoding -> RTP Streaming.
But you are experiencing presentation clock issues, synchronization issues, or unsynchronized audio/video issues. Am I right?
So you tried the Sample Grabber Sink, and now the Source Reader, as I suggested. Of course, you may think that a Media Session will be able to do it better.
I think so, but extra work will be needed.
Here is what I would do in your case:
Code a custom RTP sink
Create a topology with the webcam source, an H264 encoder, and your custom RTP sink (a rough sketch of the wiring follows this list)
Add your topology to a MediaSession
Use the MediaSession to run the topology
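As a rough sketch of steps 2-4 (an illustration only: pVideoSource, pEncoderMFT and pRtpStreamSink are placeholders for your objects, and the presentation/stream descriptor plumbing on the source node is elided):
IMFMediaSession *pSession = NULL;
IMFTopology *pTopology = NULL;
IMFTopologyNode *pSrcNode = NULL, *pEncNode = NULL, *pOutNode = NULL;
CHECK_HR(MFCreateMediaSession(NULL, &pSession), "Failed to create media session.");
CHECK_HR(MFCreateTopology(&pTopology), "Failed to create topology.");
// Source node: the webcam media source (also needs MF_TOPONODE_PRESENTATION_DESCRIPTOR
// and MF_TOPONODE_STREAM_DESCRIPTOR set on it).
CHECK_HR(MFCreateTopologyNode(MF_TOPOLOGY_SOURCESTREAM_NODE, &pSrcNode), "Failed to create source node.");
CHECK_HR(pSrcNode->SetUnknown(MF_TOPONODE_SOURCE, pVideoSource), "Failed to set source on node.");
// Transform node: the H264 encoder MFT.
CHECK_HR(MFCreateTopologyNode(MF_TOPOLOGY_TRANSFORM_NODE, &pEncNode), "Failed to create encoder node.");
CHECK_HR(pEncNode->SetObject(pEncoderMFT), "Failed to set encoder on node.");
// Output node: the stream sink of your custom RTP sink.
CHECK_HR(MFCreateTopologyNode(MF_TOPOLOGY_OUTPUT_NODE, &pOutNode), "Failed to create output node.");
CHECK_HR(pOutNode->SetObject(pRtpStreamSink), "Failed to set RTP sink on node.");
CHECK_HR(pTopology->AddNode(pSrcNode), "Failed to add source node.");
CHECK_HR(pTopology->AddNode(pEncNode), "Failed to add encoder node.");
CHECK_HR(pTopology->AddNode(pOutNode), "Failed to add output node.");
CHECK_HR(pSrcNode->ConnectOutput(0, pEncNode, 0), "Failed to connect source to encoder.");
CHECK_HR(pEncNode->ConnectOutput(0, pOutNode, 0), "Failed to connect encoder to sink.");
CHECK_HR(pSession->SetTopology(0, pTopology), "Failed to set topology.");
PROPVARIANT var;
PropVariantInit(&var);
CHECK_HR(pSession->Start(&GUID_NULL, &var), "Failed to start session.");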
If you want a network-stream sink sample, see this: MFSkJpegHttpStreamer
It is old, but it's a good start. That program also uses Winsock, like yours.
You should be aware that the RTP protocol uses UDP, which is a very good way to get synchronization issues. I would guess this is your main problem.
Here is what I think: you are trying to compensate for the weaknesses of the RTP protocol (UDP) with MediaFoundation's audio/video synchronization management. I think you will just fail with this approach.
I think your main problem is the RTP protocol.
EDIT
No, I'm not having synchronisation issues. The Source Reader and Sample Grabber both provide correct timestamps, which I can use in the RTP header. Likewise, there are no problems with RTP/UDP etc.; that's the bit I do know about. My questions originate from a desire to understand the most efficient (least amount of plumbing code) and most flexible solution. And yes, it does look like a custom sink writer is the optimal solution.
Again, things are clearer. If you need help with a custom RTP sink, I'll be there.

Write H.264 stream in buffer to a streamable mp4 using ffmpeg

I wrote code to create an H.264 stream; it has a loop that generates H.264 encoded frames.
while(true) {
...
x264_encoder_encode(encoder, &buffer, &i_buffer, &pic_in, &pic_out);
...
/*TODO: Write one frame in the buffer to a streamable mp4 file*/
}
Each iteration, an H.264 encoded frame is generated and stored in the buffer. How can I write it into a streamable mp4 file directly from the buffer?
I spent lots of time searching for a solution. All I can find is how to read a stream from a file using
avformat_open_input(&fmtCtx, in_filename, 0, 0)
Is there any way to read directly from a buffer, without a file?
MP4 is actually not streamable. In other words, you can't do it at all. I ran into that very problem.
The reason it won't work is that when you open an mp4 file, the player needs all sorts of parameters, which by default get saved at the end of the file (in MP4 this index is the moov atom). When you create an MP4, you can forcibly save that info at the start; however, to know what those parameters are, you need all the data first. And without those parameters, the software trying to load the mp4 fails very early on. The same is true for some other formats, such as webm for video and .m4a or .wav for audio.
What you have to do is stream the actual H.264, possibly using RTSP, or a format of your own if you're in control of both sides.
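If you control both ends, one commonly used carrier is MPEG-TS, which needs no trailing index and can be cut at any packet boundary. A minimal sketch of muxing the x264 output through a custom write callback, so no file is involved (assumptions: an FFmpeg 4.x-era API, and a hypothetical on_ts_chunk() that hands the muxed bytes to your network code; very recent FFmpeg versions declare the callback's buffer const):
extern "C" {
#include <libavformat/avformat.h>
}
// Hypothetical sink: receives each chunk of muxed MPEG-TS bytes.
static int on_ts_chunk(void *opaque, uint8_t *buf, int buf_size) {
    // send buf over the network / append it to your stream here
    return buf_size;
}
AVFormatContext *oc = NULL;
avformat_alloc_output_context2(&oc, NULL, "mpegts", NULL);
// Custom IO: the muxer writes through on_ts_chunk() instead of a file.
int io_size = 32 * 1024;
unsigned char *io_buf = (unsigned char *)av_malloc(io_size);
oc->pb = avio_alloc_context(io_buf, io_size, 1 /*write*/, NULL, NULL, on_ts_chunk, NULL);
oc->flags |= AVFMT_FLAG_CUSTOM_IO;
AVStream *st = avformat_new_stream(oc, NULL);
st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
st->codecpar->codec_id = AV_CODEC_ID_H264;
st->codecpar->width = 1280;  // example dimensions
st->codecpar->height = 720;
st->time_base = AVRational{ 1, 90000 };
avformat_write_header(oc, NULL);
// In the x264 loop: wrap each encoded frame in an AVPacket
// (data/size/pts/dts, stream_index = st->index) and call
// av_interleaved_write_frame(oc, &pkt).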

ffmpeg c++ API encode mpegts with KLV data stream

I need to encode an mpegts video using the ffmpeg C++ API. The output video shall have two streams: the first one shall be of type AVMEDIA_TYPE_VIDEO; the second one shall be of type AVMEDIA_TYPE_DATA and shall contain a set of KLV data.
I have written my own KLV library to manage the KLV format.
However, I'm not able to create a new video "from scratch" by combining the two streams. Following the implementation in FFMPEG C api h.264 encoding / MPEG2 ts streaming problems, I can successfully encode an mpegts video with a single video stream.
However, I'm not able to add a new AVMEDIA_TYPE_DATA stream to the output video: as soon as I add a data stream using methods like avformat_new_stream(...), neither the data stream nor the video stream is produced, and the output file is empty.
Can anyone suggest a tutorial page or a sample on how to properly add a data stream to my output video in mpegts format?
Thanks a lot!
I was able to get a KLV stream added to a muxed output by starting with the "muxing.c" example that comes with the FFmpeg source, and modifying it as follows.
First, I created the AVStream as follows, where "oc" is the AVFormatContext (muxer) variable:
AVStream *klv_stream = avformat_new_stream(oc, NULL);
klv_stream->codec->codec_type = AVMEDIA_TYPE_DATA;
klv_stream->codec->codec_id = AV_CODEC_ID_TIMED_ID3;
klv_stream->time_base = AVRational{ 1, 30 };
klv_stream->id = oc->nb_streams - 1;
Then, during the encoding/muxing loop:
AVPacket pkt;
av_init_packet(&pkt);
pkt.data = (uint8_t*)GetKlv(pkt.size);
auto res = write_frame(oc, &video_st.st->time_base, klv_stream, &pkt);
free(pkt.data);
(The GetKlv() function returns a malloc()'ed array of binary data that would be replaced by whatever you're using to get your encoded KLV. It sets pkt.size to the length of the data.)
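For reference, the write_frame() helper from the muxing.c example is approximately the following (it rescales timestamps from the codec time base to the stream time base before handing the packet to the muxer):
static int write_frame(AVFormatContext *fmt_ctx, const AVRational *time_base, AVStream *st, AVPacket *pkt)
{
    // Rescale output packet timestamps from codec to stream time base.
    av_packet_rescale_ts(pkt, *time_base, st->time_base);
    pkt->stream_index = st->index;
    // Write the compressed frame to the media file.
    return av_interleaved_write_frame(fmt_ctx, pkt);
}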
With this modification, and specifying a ".ts" target file, I get a three-stream file that plays just fine in VLC. The KLV stream has a stream_type of 0x15, indicating synchronous KLV.
Note the codec_id value of AV_CODEC_ID_TIMED_ID3. According to the libavformat source file "mpegtsenc.c", a value of AV_CODEC_ID_OPUS should result in stream_type 6, for asynchronous KLV (no accompanying PTS or DTS). This is actually important for my application, but I'm unable to get it to work -- the call to avformat_write_header() throws a division by zero error. If I get that figured out, I'll add an update here.

ffmpeg: how to save h264 raw data as mp4 file

I encode H.264 data with libavcodec.
For example:
while (1) {
...
avcodec_encode_video(pEnc->pCtx, OutBuf, ENC_OUTSIZE, pEnc->pYUVFrame);
...
}
If I directly save the OutBuf data as a .264 file, it can't be played by a player. Now I want to save OutBuf as an mp4 file. Does anyone know how to do this with the ffmpeg lib? Thanks.
You use avformat_write_header, av_interleaved_write_frame, av_write_trailer and friends.
Their usage is shown in the muxing example of FFmpeg.
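In condensed form, the pattern looks roughly like this (a sketch only: error handling is elided, "out.mp4" and pEnc->pCtx are borrowed from the question, and the newer send/receive encoding API is assumed in place of avcodec_encode_video):
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
}
AVFormatContext *oc = NULL;
avformat_alloc_output_context2(&oc, NULL, NULL, "out.mp4");
AVStream *st = avformat_new_stream(oc, NULL);
// The mp4 muxer needs SPS/PPS in extradata; before avcodec_open2() set:
//   if (oc->oformat->flags & AVFMT_GLOBALHEADER)
//       pEnc->pCtx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
avcodec_parameters_from_context(st->codecpar, pEnc->pCtx);
st->time_base = pEnc->pCtx->time_base;
avio_open(&oc->pb, "out.mp4", AVIO_FLAG_WRITE);
avformat_write_header(oc, NULL);
// Inside the encode loop, after avcodec_send_frame(pEnc->pCtx, frame):
AVPacket *pkt = av_packet_alloc();
while (avcodec_receive_packet(pEnc->pCtx, pkt) == 0) {
    av_packet_rescale_ts(pkt, pEnc->pCtx->time_base, st->time_base);
    pkt->stream_index = st->index;
    av_interleaved_write_frame(oc, pkt); // takes ownership of the packet data
}
// When done:
av_write_trailer(oc); // writes the mp4 index (moov atom)
avio_closep(&oc->pb);
avformat_free_context(oc);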
See a similar topic: Raw H264 frames in mpegts container using libavcodec with also writing to a file (different container, same API)
See also the links in the answer here: FFMpeg encoding RGB images to H264

Encoding video on H.263 to send over RTP

I'm developing an application to send video over RTP to a client that can play only H.263 (1996) and H.263+ (1998).
To do this I've encoded the video using libav, following these steps (this is only part of the code):
av_register_all();
avformat_network_init();
Fmt = av_guess_format("rtp", NULL, NULL);
...
st = add_video_stream(FmtCtx, CODEC_ID_H263);
...
avio_open(&FmtCtx->pb, rtp_url, URL_WRONLY)
and finally enter a loop where I encode the video. The problem is that the stream generated by this program is encoded as H.263-2000 (or H.263++), which the other side cannot understand; the same thing happens whether I use CODEC_ID_H263 or CODEC_ID_H263P in the initialization.
Is it possible to encode in those old H.263 versions using libav? I haven't managed to do it, not even with ffmpeg commands. The stream is always H.263-2000 (PT=96).
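One avenue worth checking (an assumption on my part, not a confirmed fix): H.263 (1996) clients expect RFC 2190 packetization on static payload type 34, whereas libavformat's RTP muxer defaults to the RFC 4629 packetization that is advertised as H263-1998/2000 on dynamic PT 96. Builds of libavformat whose RTP muxer exposes the rfc2190 flag of its rtpflags private option let you request the old packetization, e.g.:
AVDictionary *opts = NULL;
av_dict_set(&opts, "rtpflags", "rfc2190", 0);
// Hand the muxer option over when writing the header.
avformat_write_header(FmtCtx, &opts);
av_dict_free(&opts);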