How do I properly unwrap FLV video into raw and valid h264 segments for gstreamer buffers?

I have written an RTMP server in Rust that allows RTMP publishers to connect and push a video stream, and RTMP clients to connect and watch those video streams, all successfully.
When a video RTMP packet comes in, I attempt to unwrap the video from the FLV container via:
// TODO: The FLV spec has the AVCPacketType and composition time as the first parts of the
// AVCVIDEOPACKET. It's unclear if these two fields are part of h264 or FLV specific.
let flv_tag = data.split_to(1);

let is_sequence_header;
let codec = if flv_tag[0] & 0x07 == 0x07 {
    is_sequence_header = data[0] == 0x00;
    VideoCodec::H264
} else {
    is_sequence_header = false;
    VideoCodec::Unknown
};

let is_keyframe = flv_tag[0] & 0x10 == 0x10;
After this runs, data contains the AVCVIDEOPACKET with the FLV tag removed. When I send this video on to other RTMP clients I just prepend the correct FLV tag and send it off.
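For reference, here is a minimal sketch of how that header could be peeled off the front of the AVCVIDEOPACKET (the field layout comes from the FLV spec; the function name, and the assumption that data is a bytes::BytesMut, are mine):

use bytes::BytesMut;

// Sketch: split the AVCPacketType (1 byte) and CompositionTime
// (signed 24-bit big-endian integer) off the front of an AVCVIDEOPACKET.
fn split_avc_header(data: &mut BytesMut) -> (u8, i32) {
    let packet_type = data.split_to(1)[0]; // 0 = sequence header, 1 = NALU
    let ct = data.split_to(3);
    // Assemble the 24-bit value, then sign-extend it to 32 bits.
    let composition_time =
        ((i32::from(ct[0]) << 16 | i32::from(ct[1]) << 8 | i32::from(ct[2])) << 8) >> 8;
    (packet_type, composition_time)
}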
Now I am trying to pass the video packets to gstreamer in order to do in-process transcoding. To do this I set up an appsrc ! avdec_h264 pipeline, and gave the appsrc element the following caps:
video_source.set_caps(Some(
    &Caps::builder("video/x-h264")
        .field("alignment", "nal")
        .field("stream-format", "byte-stream")
        .build(),
));
Now when an RTMP publisher sends a video packet, I take the (attempted) unwrapped video packet and pass it to my appsrc via
pub fn push_video(&self, data: Bytes, timestamp: RtmpTimestamp) {
    let mut buffer = Buffer::with_size(data.len()).unwrap();
    {
        let buffer = buffer.get_mut().unwrap();
        buffer.set_pts(ClockTime::MSECOND * timestamp.value as u64);

        let mut samples = buffer.map_writable().unwrap();
        samples.as_mut_slice().copy_from_slice(&data);
    }

    self.video_source.push_buffer(buffer).unwrap();
}
When this occurs, the following GStreamer debug output appears:
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Pushing packet #0 (is_sequence_header:true, is_keyframe=true)
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Connection 63397d56-16fb-4b54-a622-d991b5ad2d8e sent audio data
0:00:05.531722000 7516 000001C0C04011C0 INFO GST_EVENT gstevent.c:973:gst_event_new_segment: creating segment event bytes segment start=0, offset=0, stop=-1, rate=1.000000, applied_rate=1.000000, flags=0x00, time=0, base=0, position 0, duration -1
0:00:05.533525000 7516 000001C0C04011C0 INFO basesrc gstbasesrc.c:3018:gst_base_src_loop:<video_source> marking pending DISCONT
0:00:05.535385000 7516 000001C0C04011C0 WARN videodecoder gstvideodecoder.c:2818:gst_video_decoder_chain:<video_decode> Received buffer without a new-segment. Assuming timestamps start from 0.
0:00:05.537381000 7516 000001C0C04011C0 INFO GST_EVENT gstevent.c:973:gst_event_new_segment: creating segment event time segment start=0:00:00.000000000, offset=0:00:00.000000000, stop=99:99:99.999999999, rate=1.000000, applied_rate=1.000000, flags=0x00, time=0:00:00.000000000, base=0:00:00.000000000, position 0:00:00.000000000, duration 99:99:99.999999999
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Pushing packet #1 (is_sequence_header:false, is_keyframe=true)
0:00:05.563445000 7516 000001C0C04011C0 INFO libav :0:: Invalid NAL unit 0, skipping.
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Pushing packet #2 (is_sequence_header:false, is_keyframe=false)
0:00:05.579274000 7516 000001C0C04011C0 ERROR libav :0:: No start code is found.
0:00:05.581338000 7516 000001C0C04011C0 ERROR libav :0:: Error splitting the input into NAL units.
0:00:05.583337000 7516 000001C0C04011C0 WARN libav gstavviddec.c:2068:gst_ffmpegviddec_handle_frame:<video_decode> Failed to send data for decoding
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Pushing packet #3 (is_sequence_header:false, is_keyframe=false)
0:00:05.595253000 7516 000001C0C04011C0 ERROR libav :0:: No start code is found.
0:00:05.597204000 7516 000001C0C04011C0 ERROR libav :0:: Error splitting the input into NAL units.
0:00:05.599262000 7516 000001C0C04011C0 WARN libav gstavviddec.c:2068:gst_ffmpegviddec_handle_frame:<video_decode> Failed to send data for decoding
Based on this I figured the errors might be caused by the non-data portions of the AVCVIDEOPACKET, which are not part of the h264 stream but FLV specific. So I tried ignoring the first 4 bytes (the AVCPacketType and CompositionTime fields) of each packet I wrote to the buffer:
pub fn push_video(&self, data: Bytes, timestamp: RtmpTimestamp) {
    let mut buffer = Buffer::with_size(data.len() - 4).unwrap();
    {
        let buffer = buffer.get_mut().unwrap();
        buffer.set_pts(ClockTime::MSECOND * timestamp.value as u64);

        let mut samples = buffer.map_writable().unwrap();
        samples.as_mut_slice().copy_from_slice(&data[4..]);
    }

    self.video_source.push_buffer(buffer).unwrap();
}
This gave me essentially the same logging output and errors. The failure is reproducible with the h264parse element as well.
What am I missing in the unwrapping process to pass raw h264 video to gstreamer?
Edit:
Realizing I misread the pad template, I tried the following caps instead:
video_source.set_caps(Some(
    &Caps::builder("video/x-h264")
        .field("alignment", "au")
        .field("stream-format", "avc")
        .build(),
));
This also failed with pretty similar output.

I think I finally figured this out.
The first thing is that I need to remove the AVCVIDEOPACKET headers (the AVCPacketType and CompositionTime fields). These are not part of the h264 format and thus cause parsing errors.
The second thing I needed to do was to not pass the sequence header to the source as a buffer. Instead, the sequence header bytes need to be set as the codec_data field on the appsrc's caps. This results in no parsing errors when passing the video data to h264parse, and even gives me a correctly sized window.
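For illustration, a rough sketch of what that looks like with the gstreamer-rs caps builder, assuming sequence_header holds the bytes of the sequence header packet (the AVCDecoderConfigurationRecord) with the AVCVIDEOPACKET header already stripped; the variable names are mine:

// Sketch: the FLV sequence header payload becomes the codec_data
// field of the caps rather than a normal buffer pushed into appsrc.
let codec_data = Buffer::from_slice(sequence_header);
video_source.set_caps(Some(
    &Caps::builder("video/x-h264")
        .field("alignment", "au")
        .field("stream-format", "avc")
        .field("codec_data", codec_data)
        .build(),
));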
The third thing I was missing was the correct dts and pts values. It turns out the RTMP timestamp I'm given is the dts, and pts = dts + AVCVIDEOPACKET.CompositionTime.
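So the buffer setup becomes something like this sketch, where composition_time is the signed 24-bit value parsed out of the AVCVIDEOPACKET header (names are mine):

// The RTMP timestamp is the decode timestamp; the presentation
// timestamp is offset from it by the composition time.
let dts_ms = timestamp.value as u64;
let pts_ms = (dts_ms as i64 + composition_time as i64) as u64;
buffer.set_dts(ClockTime::MSECOND * dts_ms);
buffer.set_pts(ClockTime::MSECOND * pts_ms);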

Related

Missing element: MPEG4-GENERIC audio RTP depayloader Gstreamer

When I try to record an RTSP stream with audio and video using GStreamer, I get the above error. When only video is recorded it works, but when the audio pipeline is added the file size becomes zero and the above error is displayed. Furthermore, the following is also displayed:
Missing element: MPEG4-GENERIC audio RTP depayloader
WARNING: from element /GstPlayBin:playbin0/GstURIDecodeBin:uridecodebin0: No decoder available for type 'application/x-rtp, media=(string)audio, payload=(int)96, clock-rate=(int)48000, encoding-name=(string)MPEG4-GENERIC, streamtype=(string)5, profile-level-id=(string)1, mode=(string)aac-hbr, sizelength=(string)13, indexlength=(string)3, indexdeltalength=(string)3, config=(string)1188, a-tool=(string)"LIVE555\ Streaming\ Media\ v2016.01.29", a-type=(string)broadcast, x-qt-text-nam=(string)"KMStreaming\ Server", x-qt-text-inf=(string)ch01, clock-base=(uint)3130203504, seqnum-base=(uint)34845, npt-start=(guint64)0, play-speed=(double)1, play-scale=(double)1, ssrc=(uint)3216157947'.
Additional debug info:
gsturidecodebin.c(921): unknown_type_cb (): /GstPlayBin:playbin0/GstURIDecodeBin:uridecodebin0
There are two different MPEG4 audio RTP formats in the wild: MP4A-LATM and MPEG4-GENERIC. See RFC 3016 and RFC 3640, respectively.
Looks like GStreamer only supports MP4A-LATM. So basically, yes, the format you are trying to receive is not supported.

Video Recording Hangs on IMFSinkWriter->Finalize

I've implemented a custom IMFMediaSink for use with the sink writer. It works OK and receives h264 video samples. I don't have any container; I'm consuming raw h264 video samples. I have not implemented a custom writer; I'm using the MFCreateSinkWriterFromMediaSink API to wrap my custom media sink into a framework-provided writer.
I'm unable to implement graceful shutdown: IMFSinkWriter::Finalize() never returns. When I implemented IMFSinkWriterCallback, IMFSinkWriter::Finalize() returned immediately, but my IMFSinkWriterCallback::OnFinalize was never called.
The problem reproduces in 100% of tests with both nvenc and the MS software encoder.
Writer attributes:
MF_LOW_LATENCY = TRUE
MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS = TRUE (1)
MF_READWRITE_DISABLE_CONVERTERS = FALSE (2)
MF_SINK_WRITER_DISABLE_THROTTLING = TRUE
MF_SINK_WRITER_D3D_MANAGER
MF_SINK_WRITER_ASYNC_CALLBACK
(1) Tried both, same result
(2) Need the converters because nvenc only supports YUV and I have RGB textures on input.
Output media type (it's fixed; I'm using the built-in handler created by the MFCreateSimpleTypeHandler API):
MF_MT_MAJOR_TYPE = MFMediaType_Video
MF_MT_SUBTYPE = MFVideoFormat_H264
MF_MT_INTERLACE_MODE = MFVideoInterlace_Progressive
MF_MT_AVG_BITRATE = 40*1000*1000
MF_MT_FRAME_SIZE = { 3840, 2160 }
MF_MT_FRAME_RATE = { 60, 1 }
MF_MT_PIXEL_ASPECT_RATIO = { 1, 1 }
Input media type:
MF_MT_MAJOR_TYPE = MFMediaType_Video
MF_MT_SUBTYPE = MFVideoFormat_RGB32
MF_MT_INTERLACE_MODE = MFVideoInterlace_Progressive
MF_MT_FRAME_SIZE = { 3840, 2160 }
MF_MT_FRAME_RATE = { 60, 1 }
MF_MT_PIXEL_ASPECT_RATIO = { 1, 1 }
When not using IMFSinkWriterCallback, here's the call stack at the time of the hang:
ntdll.dll!_NtWaitForSingleObject#12 ()
KernelBase.dll!WaitForSingleObjectEx()
mfreadwrite.dll!CMFSinkWriter::InternalFinalize(void)
mfreadwrite.dll!CMFSinkWriter::Finalize(void)
MFTrace doesn't have anything related to finalize, even with -k All:
13700,3C60 19:01:25.79566 CMFTransformDetours::ProcessOutput #02EA6E3C failed hr=0xC00D6D72 MF_E_TRANSFORM_NEED_MORE_INPUT
13700,2A98 19:01:25.80250 CMFTransformDetours::ProcessOutput #1A6CEF38 Stream ID 0, Sample #1C244F30, Time 1216ms, Duration 16ms, Buffers 1, Size 12441600B, MFSampleExtension_CleanPoint=1;MFSampleExtension_Interlaced=0
13700,2098 19:01:25.80254 CMFTransformDetours::ProcessInput #02EA6E3C Stream ID 0, Sample #1C244F30, Time 1216ms, Duration 16ms, Buffers 1, Size 12441600B, MFSampleExtension_CleanPoint=1;MFSampleExtension_Interlaced=0
13700,2A98 19:01:25.80256 CMFTransformDetours::ProcessOutput #1A6CEF38 failed hr=0xC00D6D72 MF_E_TRANSFORM_NEED_MORE_INPUT
13700,2A98 19:01:25.80266 CMFTransformDetours::ProcessMessage #1A6CEF38 Message type=0x00000001 MFT_MESSAGE_COMMAND_DRAIN, param=00000000
13700,2A98 19:01:25.80267 CMFTransformDetours::ProcessOutput #1A6CEF38 failed hr=0xC00D6D72 MF_E_TRANSFORM_NEED_MORE_INPUT
13700,2098 19:01:25.81669 CMFTransformDetours::ProcessOutput #02EA6E3C Stream ID 0, Sample #1FB68CF8, Time 1216ms, Duration 16ms, Buffers 1, Size 680B, {2B5D5457-5547-4F07-B8C8-B4A3A9A1DAAC}=1;{73A954D4-09E2-4861-BEFC-94BD97C08E6E}=12166667 (0,12166667);{9154733F-E1BD-41BF-81D3-FCD918F71332}=65535;{973704E6-CD14-483C-8F20-C9FC0928BAD5}=1;MFSampleExtension_CleanPoint=0;{B2EFE478-F979-4C66-B95E-EE2B82C82F36}=16 (0,16)
13700,82C 19:01:25.81674 CMFTransformDetours::ProcessOutput #02EA6E3C failed hr=0xC00D6D72 MF_E_TRANSFORM_NEED_MORE_INPUT
13700,82C 19:01:25.81674 CMFTransformDetours::ProcessMessage #02EA6E3C Message type=0x00000001 MFT_MESSAGE_COMMAND_DRAIN, param=00000000
13700,82C 19:01:25.81674 CMFTransformDetours::ProcessOutput #02EA6E3C failed hr=0xC00D6D72 MF_E_TRANSFORM_NEED_MORE_INPUT
13700,1F54 19:01:27.24237 CKernel32ExportDetours::OutputDebugStringA # D3D11 WARNING: Process is terminating. Using simple reporting. Please call ReportLiveObjects() at runtime for standard reporting. [ STATE_CREATION WARNING #0: UNKNOWN]
13700,1F54 19:01:27.24255 CKernel32ExportDetours::OutputDebugStringA # D3D11 WARNING: Live Producer at 0x0311D91C, Refcount: 13. [ STATE_CREATION WARNING #0: UNKNOWN]
Warnings about live D3D resources are expected as I terminated the process after the hang.
Any ideas what's going on? I think the writer is probably waiting for these SPS/PPS magic blobs to arrive, but that never happens. Is there a way to instruct the h264 encoder to output SPS/PPS somewhere?
You've implemented a custom IMFMediaSink, so I suppose you've also implemented IMFStreamSink.
Doing this in the usual way with Media Foundation, you have a circular COM reference between IMFMediaSink and IMFStreamSink. That's why the Shutdown method on the IMFMediaSink interface exists.
If a program that uses your custom media sink does not call Shutdown at the right place, there will be memory leaks.
From your IMFSinkWriterCallback problem, we don't have enough information to find where the problem is.
Also, it is not clear what you mean by "custom IMFMediaSink" and "IMFSinkWriter". Are you also implementing a IMFSinkWriter...
EDIT1
Just two things:
MFCreateSinkWriterFromMediaSink:
Call CoInitialize(Ex) and MFStartup before calling this function.
When you are done using the media sink, call the media sink's IMFMediaSink::Shutdown method. (The sink writer does not shut down the media sink.) Release the sink writer before calling Shutdown on the media sink.
Do you release the interfaces correctly?
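In other words, the teardown order should look roughly like this sketch (ComPtr-based; your wrapper types may differ):

#include <mfidl.h>
#include <mfreadwrite.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Sketch: teardown order described in the MFCreateSinkWriterFromMediaSink docs.
void ShutdownPipeline(ComPtr<IMFSinkWriter>& sinkWriter,
                      ComPtr<IMFMediaSink>& mediaSink)
{
    sinkWriter->Finalize();   // flush pending samples
    sinkWriter.Reset();       // release the writer first...
    mediaSink->Shutdown();    // ...then shut the sink down, breaking the
                              // circular IMFMediaSink/IMFStreamSink reference
    mediaSink.Reset();
}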
IMFSinkWriter::Finalize
Internally, this method calls IMFStreamSink::PlaceMarker to place end-of-segment markers for each stream on the media sink.
Do you handle this marker (MFSTREAMSINK_MARKER_ENDOFSEGMENT)?
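For reference, a minimal sketch of acknowledging that marker, assuming your stream sink owns an event queue created with MFCreateEventQueue (m_eventQueue is my name for it):

// Sketch: Finalize() blocks until each stream sink reports the marker
// back through an MEStreamSinkMarker event.
HRESULT MyStreamSink::PlaceMarker(MFSTREAMSINK_MARKER_TYPE eMarkerType,
                                  const PROPVARIANT* pvarMarkerValue,
                                  const PROPVARIANT* pvarContextValue)
{
    // A real sink must queue the marker behind any pending samples and
    // fire the event only after they are processed; this sketch
    // acknowledges immediately for brevity.
    return m_eventQueue->QueueEventParamVar(MEStreamSinkMarker, GUID_NULL,
                                            S_OK, pvarContextValue);
}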
We don't know how you handle CriticalSection/Event/CircularReference, so it's hard to find the problem.
EDIT2
Is there a way to instruct the h264 encoder to output SPS/PPS somewhere?
Normally, for the h264 video format, you need to get the MF_MT_MPEG_SEQUENCE_HEADER attribute (BLOB type) when SetCurrentMediaType is called on your IMFStreamSink (assuming you implement IMFMediaTypeHandler).
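Roughly like this sketch (the member names are mine; error handling omitted):

// Sketch: grab the SPS/PPS blob off the negotiated output media type.
HRESULT MyTypeHandler::SetCurrentMediaType(IMFMediaType* pMediaType)
{
    UINT32 blobSize = 0;
    if (SUCCEEDED(pMediaType->GetBlobSize(MF_MT_MPEG_SEQUENCE_HEADER, &blobSize))
        && blobSize > 0)
    {
        m_sequenceHeader.resize(blobSize); // std::vector<BYTE> member
        pMediaType->GetBlob(MF_MT_MPEG_SEQUENCE_HEADER,
                            m_sequenceHeader.data(), blobSize, &blobSize);
    }
    m_currentType = pMediaType; // remember the current type
    return S_OK;
}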
EDIT3
Could you provide the real code? (This is how I think the app should be.)
I don't remember if your custom sink creates an mp4 file. If it does, you have to generate the ftyp/moov atoms in IMFSinkWriter::Finalize.
EDIT4
You can also read this: Video Recording Hangs on IMFSinkWriter->Finalize();
With no source code, this is the only answer I can give.

ffmpeg c++ API encode mpegts with KLV data stream

I need to encode an mpegts video using the ffmpeg C++ API. The output video shall have two streams: the first one shall be of type AVMEDIA_TYPE_VIDEO; the second one shall be of type AVMEDIA_TYPE_DATA and shall contain a set of KLV data.
I have written my own KLV library to manage the KLV format.
However, I'm not able to create a new video "from scratch" by combining the two streams. Following the implementation in FFMPEG C api h.264 encoding / MPEG2 ts streaming problems I can successfully encode an mpegts video with a single video stream.
However, I'm not able to add a new AVMEDIA_TYPE_DATA stream to the output video: as soon as I add a data stream using methods like avformat_new_stream(...), neither the data stream nor the video stream is produced and the output file is empty.
Can anyone suggest a tutorial page or a sample on how to properly add a data stream to my output video in mpegts format?
Thanks a lot!
I was able to get a KLV stream added to a muxed output by starting with the "muxing.c" example that comes with the FFmpeg source, and modifying it as follows.
First, I created the AVStream as follows, where "oc" is the AVFormatContext (muxer) variable:
AVStream *klv_stream = avformat_new_stream(oc, NULL);
klv_stream->codec->codec_type = AVMEDIA_TYPE_DATA;
klv_stream->codec->codec_id = AV_CODEC_ID_TIMED_ID3;
klv_stream->time_base = AVRational{ 1, 30 };
klv_stream->id = oc->nb_streams - 1;
Then, during the encoding/muxing loop:
AVPacket pkt;
av_init_packet(&pkt);
pkt.data = (uint8_t*)GetKlv(pkt.size);
auto res = write_frame(oc, &video_st.st->time_base, klv_stream, &pkt);
free(pkt.data);
(The GetKlv() function returns a malloc()'ed array of binary data that would be replaced by whatever you're using to get your encoded KLV. It sets pkt.size to the length of the data.)
With this modification, and specifying a ".ts" target file, I get a three-stream file that plays just fine in VLC. The KLV stream has a stream_type of 0x15, indicating synchronous KLV.
Note the codec_id value of AV_CODEC_ID_TIMED_ID3. According to the libavformat source file "mpegtsenc.c", a value of AV_CODEC_ID_OPUS should result in stream_type 6, for asynchronous KLV (no accompanying PTS or DTS). This is actually important for my application, but I'm unable to get it to work -- the call to avformat_write_header() throws a division by zero error. If I get that figured out, I'll add an update here.

How to use FFMPEG to play H.264 stream from NAL units that are stored as video AVPackets

I am writing a client-server system that uses the FFMPEG library to parse an H.264 stream into NAL units on the server side, then uses channel coding to send them over the network to the client side, where my application must be able to play the video.
The question is how to play the received AVPackets (NAL units) in my application as a video stream.
I have found this tutorial helpful and used it as a base for both the server and client sides.
Some sample code or a resource related to playing video not from a file, but from data inside a program using the FFMPEG library, would be very helpful.
I am sure that the received information is sufficient to play the video, because I tried saving the received data as a .h264 or .mp4 file and it can be played by VLC player.
From what I understand of your question, you have the AVPackets and want to play a video. In reality this is two problems: 1. decoding your packets, and 2. playing the video.
For decoding your packets with FFmpeg, you should take a look at the documentation for AVPacket, AVCodecContext and avcodec_decode_video2 to get some ideas; the general idea is that you want to do something (just wrote this in the browser, take it with a grain of salt) along the lines of:
//the context, set this appropriately based on your video. See the above links for the documentation
AVCodecContext *decoder_context;
std::vector<AVPacket> packets; //assume this has your packets
...
AVFrame *decoded_frame = av_frame_alloc();
int ret = -1;
int got_frame = 0;
for (AVPacket packet : packets)
{
    avcodec_get_frame_defaults(decoded_frame);
    ret = avcodec_decode_video2(decoder_context, decoded_frame, &got_frame, &packet);
    if (ret <= 0) {
        //had an error decoding the current packet or couldn't decode the packet
        break;
    }
    if (got_frame)
    {
        //send to whatever video player queue you're using/do whatever with the frame
        ...
    }
    got_frame = 0;
    av_free_packet(&packet);
}
It's a pretty rough sketch, but that's the general idea for your problem of decoding the AVPackets. As for your problem of playing the video, you have many options, which will likely depend more on your clients. What you're asking about is a pretty large problem; I'd advise familiarizing yourself with the FFmpeg documentation and the examples provided on the FFmpeg site. Hope that makes sense.

How to find frame end when MPEG2 stream coming in MPEG-TS Container over RTP?

I am receiving an MPEG2-TS stream over RTP, but I am unable to find the end of a particular frame.
When a plain MPEG2 stream comes over RTP, the marker bit in the RTP header is set to 1 at the end of a frame, but in this case the marker bit is always 0.
Can anyone help me: how can I find the frame end in the case of MPEG2-TS?
According to RFC 2250 the M bit should indicate the end of a frame in the case of MPEG-TS (see 3.3, RTP Fixed Header for MPEG ES encapsulation), but many implementations may not set it in the header.
The only other way to find the start of a frame is to decode the header of each 188-byte MPEG-TS packet. MPEG-TS contains a "Payload Unit Start Indicator".
So your algorithm will look like this (a C sketch follows the list):
1. RTP data contains an integer number of MPEG-TS packets.
2. Each packet starts with 0x47.
3. Check the "payload unit start indicator" field for each packet.
4. If payload_unit_start_indicator == 1, check whether the payload is PES or PSI.
5. Ignore the packet if it is PSI and continue with step 1; otherwise go to the next step.
6. For a PES packet, check the "Stream id"; if it's video, you've hit a new frame.
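A sketch of that scan in C, under the assumptions above (packets are aligned and unscrambled; the function name is mine). PSI sections are recognized here by the absence of the PES start code prefix:

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

void scan_ts(const uint8_t *buf, size_t len)
{
    for (size_t off = 0; off + 188 <= len; off += 188) {
        const uint8_t *pkt = buf + off;
        if (pkt[0] != 0x47)                     /* sync byte */
            continue;
        if (!(pkt[1] & 0x40))                   /* payload_unit_start_indicator */
            continue;
        size_t payload = 4;                     /* skip the 4-byte TS header */
        if (pkt[3] & 0x20)                      /* adaptation field present */
            payload += 1 + pkt[4];
        /* A PES packet begins with 00 00 01 <stream_id>; PSI does not. */
        if (payload + 4 <= 188 &&
            pkt[payload] == 0x00 && pkt[payload + 1] == 0x00 &&
            pkt[payload + 2] == 0x01) {
            uint8_t stream_id = pkt[payload + 3];
            if (stream_id >= 0xE0 && stream_id <= 0xEF) /* video stream ids */
                printf("new video frame starts at TS offset %zu\n", off);
        }
    }
}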