Add RTP header before AAC ADTS header

I am writing a program that packs an RTP header in front of H.264 and AAC data, but I am confused by the RTP timestamp field.
If the video codec is H.264, the timestamp can be incremented by 90000/fps for each frame.
I have no idea what to use for AAC.
My AAC sample rate is 8000 Hz, config = 1588, and each frame is 250 ~ 520 bytes. I found some candidate solutions:
(1) 1024
(2) 8000/1024 = 7 => 8000/7 = 1142
With these the video and audio do not stay in sync; the video runs faster than the audio.
Can anyone help me?

Sending/transmitting (packing RTP packets):
H.264 @ 90000 Hz: rtp timestamp = frame timestamp * 90000 (timestamp of the frame when read from any source)
AAC @ 8000 Hz: rtp timestamp = buffer timestamp * 8000 (timestamp of the audio buffer when read from any source)
Receiving (unpacking RTP packets):
H.264: actual timestamp = rtp timestamp / 90000
AAC: actual timestamp = rtp timestamp / 8000
Based on the actual timestamps you do the audio/video synchronization.
Note: convert the time from milliseconds to seconds first.
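For illustration, here is a minimal C++ sketch of the arithmetic above (my own example; the 25 fps value is an assumption). One useful AAC detail: an AAC frame always holds 1024 samples, so at an 8000 Hz clock each ADTS frame advances the RTP timestamp by exactly 1024, which is why option (1) in the question is the usual increment.

#include <cstdint>
#include <cstdio>

// Sketch: derive RTP timestamps for H.264 (90 kHz clock) and AAC
// (clock rate = sample rate; one AAC frame = 1024 samples).
int main() {
    const double fps = 25.0;                    // assumed video frame rate
    const uint32_t kVideoClock = 90000;         // H.264 RTP clock
    const uint32_t kAacSamplesPerFrame = 1024;
    const uint32_t kAudioClock = 8000;          // AAC RTP clock = sample rate

    for (int n = 0; n < 3; ++n) {
        // Packing: seconds * clock rate; increments by 90000/fps per frame.
        uint32_t videoTs = static_cast<uint32_t>(n / fps * kVideoClock);
        // AAC: simply advances 1024 per frame at the sample-rate clock.
        uint32_t audioTs = n * kAacSamplesPerFrame;
        // Unpacking: divide by the clock rate to get seconds again.
        printf("frame %d: video ts=%u (%.3f s), audio ts=%u (%.3f s)\n",
               n, videoTs, videoTs / (double)kVideoClock,
               audioTs, audioTs / (double)kAudioClock);
    }
    return 0;
}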

Related

How do I properly unwrap FLV video into raw and valid h264 segments for gstreamer buffers?

I have written an RTMP server in Rust that successfully allows RTMP publishers to connect and push a video stream, and RTMP clients can connect and watch those streams successfully.
When a video RTMP packet comes in, I attempt to unwrap the video from the FLV container via:
// TODO: The FLV spec has the AVCPacketType and composition time as the first parts of the
// AVCPACKETTYPE. It's unclear if these two fields are part of h264 or FLV specific.
let flv_tag = data.split_to(1);
let is_sequence_header;
let codec = if flv_tag[0] & 0x07 == 0x07 {
    is_sequence_header = data[0] == 0x00;
    VideoCodec::H264
} else {
    is_sequence_header = false;
    VideoCodec::Unknown
};
let is_keyframe = flv_tag[0] & 0x10 == 0x10;
After this runs, data contains the AVCVIDEOPACKET with the FLV tag removed. When I send this video to other RTMP clients I just prepend the correct FLV tag to it and send it off.
Now I am trying to pass the video packets to GStreamer in order to do in-process transcoding. To do this I set up an appsrc ! avdec_h264 pipeline, and gave the appsrc component the following caps:
video_source.set_caps(Some(
    &Caps::builder("video/x-h264")
        .field("alignment", "nal")
        .field("stream-format", "byte-stream")
        .build(),
));
Now when an RTMP publisher sends a video packet, I take the (attempted) unwrapped video packet and pass it to my appsrc via:
pub fn push_video(&self, data: Bytes, timestamp: RtmpTimestamp) {
    let mut buffer = Buffer::with_size(data.len()).unwrap();
    {
        let buffer = buffer.get_mut().unwrap();
        buffer.set_pts(ClockTime::MSECOND * timestamp.value as u64);
        let mut samples = buffer.map_writable().unwrap();
        {
            let samples = samples.as_mut_slice();
            for index in 0..data.len() {
                samples[index] = data[index];
            }
        }
    }
    self.video_source.push_buffer(buffer).unwrap();
}
When this runs, the following GStreamer debug output appears:
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Pushing packet #0 (is_sequence_header:true, is_keyframe=true)
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Connection 63397d56-16fb-4b54-a622-d991b5ad2d8e sent audio data
0:00:05.531722000 7516 000001C0C04011C0 INFO GST_EVENT gstevent.c:973:gst_event_new_segment: creating segment event bytes segment start=0, offset=0, stop=-1, rate=1.000000, applied_rate=1.000000, flags=0x00, time=0, base=0, position 0, duration -1
0:00:05.533525000 7516 000001C0C04011C0 INFO basesrc gstbasesrc.c:3018:gst_base_src_loop:<video_source> marking pending DISCONT
0:00:05.535385000 7516 000001C0C04011C0 WARN videodecoder gstvideodecoder.c:2818:gst_video_decoder_chain:<video_decode> Received buffer without a new-segment. Assuming timestamps start from 0.
0:00:05.537381000 7516 000001C0C04011C0 INFO GST_EVENT gstevent.c:973:gst_event_new_segment: creating segment event time segment start=0:00:00.000000000, offset=0:00:00.000000000, stop=99:99:99.999999999, rate=1.000000, applied_rate=1.000000, flags=0x00, time=0:00:00.000000000, base=0:00:00.000000000, position 0:00:00.000000000, duration 99:99:99.999999999
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Pushing packet #1 (is_sequence_header:false, is_keyframe=true)
0:00:05.563445000 7516 000001C0C04011C0 INFO libav :0:: Invalid NAL unit 0, skipping.
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Pushing packet #2 (is_sequence_header:false, is_keyframe=false)
0:00:05.579274000 7516 000001C0C04011C0 ERROR libav :0:: No start code is found.
0:00:05.581338000 7516 000001C0C04011C0 ERROR libav :0:: Error splitting the input into NAL units.
0:00:05.583337000 7516 000001C0C04011C0 WARN libav gstavviddec.c:2068:gst_ffmpegviddec_handle_frame:<video_decode> Failed to send data for decoding
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Pushing packet #3 (is_sequence_header:false, is_keyframe=false)
0:00:05.595253000 7516 000001C0C04011C0 ERROR libav :0:: No start code is found.
0:00:05.597204000 7516 000001C0C04011C0 ERROR libav :0:: Error splitting the input into NAL units.
0:00:05.599262000 7516 000001C0C04011C0 WARN libav gstavviddec.c:2068:gst_ffmpegviddec_handle_frame:<video_decode> Failed to send data for decoding
Based on this I figured the problem might be caused by the non-data portions of the AVCVIDEOPACKET not being part of the H.264 stream, but an FLV-specific wrapper. So I tried ignoring the first 4 bytes (the AVCPacketType and CompositionTime fields) of each packet I wrote to the buffer:
pub fn push_video(&self, data: Bytes, timestamp: RtmpTimestamp) {
    let mut buffer = Buffer::with_size(data.len() - 4).unwrap();
    {
        let buffer = buffer.get_mut().unwrap();
        buffer.set_pts(ClockTime::MSECOND * timestamp.value as u64);
        let mut samples = buffer.map_writable().unwrap();
        {
            let samples = samples.as_mut_slice();
            for index in 4..data.len() {
                samples[index - 4] = data[index];
            }
        }
    }
    self.video_source.push_buffer(buffer).unwrap();
}
This essentially gave me the same logging output and errors. This is reproducible with the h264parse plugin as well.
What am I missing in the unwrapping process to pass raw h264 video to gstreamer?
Edit:
Realizing I had misread the pad template, I tried the following caps instead:
video_source.set_caps(Some(
    &Caps::builder("video/x-h264")
        .field("alignment", "au")
        .field("stream-format", "avc")
        .build(),
));
This also failed with pretty similar output.
I think I finally figured this out.
The first thing is that I do need to remove the AVCVIDEOPACKET header (the packet type and composition time fields). These are not part of the H.264 format and thus cause parsing errors.
The second thing I needed to do was to not pass the sequence header as a buffer to the source. Instead, the sequence header bytes need to be set as the codec_data field on the appsrc's caps. With that, there are no parsing errors when passing the video data to h264parse, and I even get a correctly sized window.
The third thing I was missing was the correct dts and pts values. It turns out the RTMP timestamp I'm given is the dts, and pts = AVCVIDEOPACKET.CompositionTime + dts.
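For reference, the AVCVIDEOPACKET header being stripped here is one byte of AVCPacketType followed by a signed 24-bit composition-time offset (per the FLV spec). A minimal C++ sketch of that parsing and of the pts/dts relationship (my own illustration, not the poster's code):

#include <cstdint>
#include <cstddef>

// Sketch: parse the 4-byte AVCVIDEOPACKET header that precedes the raw
// AVC payload in an FLV video tag, after the 1-byte video tag header.
struct AvcVideoPacket {
    uint8_t packet_type;      // 0 = sequence header (codec_data), 1 = NALUs
    int32_t composition_time; // signed 24-bit offset in milliseconds
    const uint8_t* payload;   // the actual AVC data
    size_t payload_len;
};

bool parse_avc_video_packet(const uint8_t* data, size_t len, AvcVideoPacket* out) {
    if (len < 4)
        return false;
    out->packet_type = data[0];
    int32_t cts = (data[1] << 16) | (data[2] << 8) | data[3]; // big-endian
    if (cts & 0x800000)
        cts |= ~0xFFFFFF; // sign-extend the 24-bit value
    out->composition_time = cts;
    out->payload = data + 4;
    out->payload_len = len - 4;
    return true;
}

// The RTMP timestamp is the dts; pts = dts + composition time.
int64_t pts_from(int64_t dts_ms, const AvcVideoPacket& pkt) {
    return dts_ms + pkt.composition_time;
}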

Remux mp4 file containing data stream

I’m developing an app that needs to clone an MP4 video file with all its streams using the FFmpeg C++ API, and I have successfully made it work based on the FFmpeg remuxing example.
This works great for video and audio streams, but when the video includes a data stream (actually a QuickTime time code, according to MediaInfo) I get this error:
Output #0, mp4, to 'C:\Users\user\Desktop\shortOut.mp4':
Stream #0:0: Video: hevc (Main 10) (hev1 / 0x31766568), yuv420p10le(tv,progressive), 3840x2160 [SAR 1:1 DAR 16:9], q=2-31, 1208 kb/s
Stream #0:1: Audio: mp3 (mp4a / 0x6134706D), 48000 Hz, stereo, s16p, 32s
Stream #0:2: Data: none (tmcd / 0x64636D74), 0 kb/s
[mp4 @ 0000000071edf600] Could not find tag for codec none in stream #2, codec not currently supported in container
I’ve found this happens in the call to avformat_write_header().
It makes sense that if FFmpeg doesn’t know the codec it can’t write a header entry for it, but I found that using the ffmpeg command line I can make it work perfectly by stream-copying, something like:
ffmpeg -i input.mp4 -c:v copy -c:a copy -c:d copy output.mp4
I have been analyzing the ffmpeg.c implementation to try to understand how they do a stream copy, but it has been very painful following the huge pipeline.
What would be a proper way to remux a data stream of this type with the FFmpeg C++ API? Any tips or pointers?
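One hedged suggestion (an assumption drawn from the remuxing example's code, not a confirmed fix): the example zeroes codec_tag on every output stream, which is what forces the muxer to look up a tag for codec none and fail. Preserving the original tag for the tmcd data stream should let avformat_write_header() through. A sketch of the modified stream-copy loop:

extern "C" {
#include <libavformat/avformat.h>
}

// Sketch of the per-stream copy loop from the FFmpeg remuxing example,
// changed to keep the codec tag for data streams such as tmcd.
static int copy_streams(AVFormatContext* ifmt_ctx, AVFormatContext* ofmt_ctx) {
    for (unsigned i = 0; i < ifmt_ctx->nb_streams; i++) {
        AVStream* in_stream = ifmt_ctx->streams[i];
        AVStream* out_stream = avformat_new_stream(ofmt_ctx, nullptr);
        if (!out_stream)
            return AVERROR(ENOMEM);
        int ret = avcodec_parameters_copy(out_stream->codecpar, in_stream->codecpar);
        if (ret < 0)
            return ret;
        // avcodec_parameters_copy() already carried the tag over; only
        // clear it where we want the muxer to pick its own tag.
        if (in_stream->codecpar->codec_type != AVMEDIA_TYPE_DATA)
            out_stream->codecpar->codec_tag = 0;
    }
    return 0;
}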

Improve NAudio Mp3 Audio Quality

I’m using NAudio and LAME to convert WAV to MP3; I’m a newbie to this audio conversion code. Thanks to Mark, I’m using his Audio File Inspector to get the details.
Here are the details:
Input - Wave Format details
Opening D:\Data\Test\NAudio\Wav\8777828760-e5749e4c563bf5411c954442085d1ce1#10.58.13.40.wav
DviAdpcm 8000Hz 2 channels 4 bits per sample
Extra Size: 2 Block Align: 512 Average Bytes Per Second: 8110
WaveFormat: DviAdpcm
Length: 788808 bytes: 00:01:37.2640000
Chunk: fact, length 420 D9 0B 00
Output Mp3
Opening D:\Data\Test\NAudio\Mp3\8777828760-e5749e4c563bf5411c954442085d1ce1#10.58.13.40.mp3
MP3 File WaveFormat: MpegLayer3 8000Hz 2 channels 0 bits per sample
Extra Size: 12 Block Align: 1 Average Bytes Per Second: 3000
ID: Mpeg Flags: PaddingIso Block Size: 216 Frames per Block: 1
Length: 3119616 bytes: 00:01:37.4880000
ID3v1 Tag: None
ID3v2 Tag: None
Version25,Layer3,8000Hz,JointStereo,24000bps, length 216
Version25,Layer3,8000Hz,JointStereo,24000bps, length 216
….
….
I’m converting WAV to MP3 (voice recording files).
Question: I’m seeing some compromise in MP3 quality. My converted MP3 is a smaller file than the WAV, but the audio quality is a little poorer. I wonder if I can increase the quality of the MP3 file?
Something like increasing the bitrate, etc.
Code for the WAV to MP3 conversion using NAudio / LAME:
string filePath = @"D:\Data\Test\NAudio\Wav\11mb.wav";
string outputPath = @"D:\Data\Test\NAudio\Mp3\11mb.mp3";
using (WaveFileReader wavReader = new WaveFileReader(filePath))
using (WaveStream pcm = WaveFormatConversionStream.CreatePcmStream(wavReader))
using (LameMP3FileWriter fileWriter = new LameMP3FileWriter(outputPath, pcm.WaveFormat, LAMEPreset.VBR_90))
{
    pcm.CopyTo(fileWriter);
}
This link has more details on my above question
http://mark-dot-net.blogspot.com/search/label/NAudio
MP3 is a heavily compressed codec; it will never get close to the original WAV quality.
However, look at the original WAV quality: you are starting from a very poor recording. When the sample rate and bit depth are that low, all sorts of artifacts get created, as the waveform is very poorly represented digitally to start with.
Anything under CD quality is going to have a LOT of problems being compressed, because so much is missing to begin with.
Perfect for making "Boards of Canada" music though. :)
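That said, if you do want to try a higher bitrate: NAudio.Lame wraps libmp3lame, and in terms of the underlying LAME C API a fixed 128 kbps setup looks roughly like this sketch (my own illustration, not NAudio code; the LAMEPreset argument in the snippet above plays the same role):

#include <lame/lame.h>

// Illustrative only: configure the LAME encoder (which NAudio.Lame
// wraps) for a fixed bitrate instead of a VBR preset.
lame_global_flags* make_cbr_encoder(int sample_rate, int channels, int kbps) {
    lame_global_flags* gf = lame_init();
    lame_set_in_samplerate(gf, sample_rate);
    lame_set_num_channels(gf, channels);
    lame_set_VBR(gf, vbr_off);  // constant bitrate
    lame_set_brate(gf, kbps);   // e.g. 128 instead of the ~24 kbps seen above
    lame_init_params(gf);
    return gf;                  // feed with lame_encode_buffer_interleaved()
}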

Live555 to stream live video and audio in one RTSP stream

I have been able to stream video using live555 on its own, as well as audio on its own.
But I want the video and audio playing in the same VLC session. My video is H.264 encoded and my audio is AAC encoded. What do I need to do to pass these packets into a FramedSource?
Which MediaSubsession/DeviceSource do I override, given that this is not a fixed file but live video/live audio?
Thanks in advance!
In order to stream video/H264 and audio/MPEG4-GENERIC in the same RTSP unicast session you should do something like:
#include "liveMedia.hh"
#include "BasicUsageEnvironment.hh"
int main()
{
TaskScheduler* scheduler = BasicTaskScheduler::createNew();
BasicUsageEnvironment* env = BasicUsageEnvironment::createNew(*scheduler);
RTSPServer* rtspServer = RTSPServer::createNew(*env);
ServerMediaSession* sms = ServerMediaSession::createNew(*env);
sms->addSubsession(H264VideoFileServerMediaSubsession::createNew(*env, "test.264",false));
sms->addSubsession(ADTSAudioFileServerMediaSubsession::createNew(*env, "test.aac",false));
rtspServer->addServerMediaSession(sms);
}

How to find frame end when MPEG2 stream coming in MPEG-TS Container over RTP?

I am receiving an MPEG2-TS stream over RTP, but I am unable to find the end of a particular frame.
When a raw MPEG2 elementary stream comes over RTP, the marker bit in the RTP header is set to 1 at the end of each frame, but in this case the marker bit is always 0.
Can anyone help me: how can I find the frame end in the case of MPEG2-TS?
According to RFC 2250 the M bit should indicate the end of a frame in the case of MPEG-TS (3.3, RTP Fixed Header for MPEG ES encapsulation), but many implementations may not set it in the header.
The only other way to find the start of a frame is to decode the header of each 188-byte MPEG-TS packet. MPEG-TS contains a "payload unit start indicator".
So your algorithm will look like this (a sketch follows the list):
1. RTP data contains an integer number of MPEG-TS packets.
2. Each packet starts with 0x47.
3. Check the "payload unit start indicator" field of each packet.
4. If payload unit start indicator == 1, check whether the payload is PES or PSI.
5. Ignore the packet if it is PSI and continue with step 1; otherwise go to the next step.
6. For a PES packet, check the stream id; if it is video, you have hit a new frame.
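Here is a minimal C++ sketch of that loop (my own illustration; it assumes the RTP payload is aligned to whole 188-byte TS packets per step 1, and uses the PES start-code prefix to tell PES from PSI):

#include <cstdint>
#include <cstddef>
#include <vector>

const size_t kTsPacketSize = 188;

bool is_video_stream_id(uint8_t stream_id) {
    return stream_id >= 0xE0 && stream_id <= 0xEF; // MPEG video stream ids
}

// Returns offsets of TS packets where a new video frame (PES packet) begins.
std::vector<size_t> find_frame_starts(const uint8_t* payload, size_t len) {
    std::vector<size_t> starts;
    for (size_t off = 0; off + kTsPacketSize <= len; off += kTsPacketSize) {
        const uint8_t* p = payload + off;
        if (p[0] != 0x47) break;               // sync byte (step 2)
        bool pusi = (p[1] & 0x40) != 0;        // payload unit start indicator (step 3)
        if (!pusi) continue;
        // Skip the adaptation field, if present, to reach the payload.
        size_t idx = 4;
        uint8_t afc = (p[3] >> 4) & 0x3;       // adaptation field control
        if (afc == 2) continue;                // adaptation field only, no payload
        if (afc == 3) idx += 1 + p[4];         // length byte plus field itself
        if (idx + 4 > kTsPacketSize) continue;
        // PES packets begin with the start-code prefix 00 00 01 (step 4);
        // PSI sections do not, so anything else is ignored (step 5).
        if (p[idx] == 0x00 && p[idx + 1] == 0x00 && p[idx + 2] == 0x01 &&
            is_video_stream_id(p[idx + 3]))    // stream id check (step 6)
            starts.push_back(off);
    }
    return starts;
}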