Question: What does the Libav/FFmpeg decoding pipeline need in order to produce valid presentation timestamps (PTS) in the decoded AVFrames?
I'm decoding an H264 stream received via RTSP. I use Live555 to parse the H264 and feed the stream to my LibAV decoder. Decoding and displaying are working fine, except that I'm not using the timestamp info and get some stuttering.
After getting a frame with avcodec_decode_video2, the presentation timestamp (PTS) is not set.
I need the PTS in order to find out how long each frame needs to be displayed and avoid any stuttering.
Notes on my pipeline
I get the SPS/PPS information via Live555 and copy these values into my AVCodecContext->extradata.
I also send the SPS and PPS to my decoder as NAL units, with the {0,0,0,1} start code prepended.
Live555 provides presentation timestamps for each packet; these are in most cases not monotonically increasing. The stream contains B-frames.
My AVCodecContext->time_base is not valid; its value is 0/2.
Unclear:
Where exactly should I set the NAL PTS coming from my H264 sink (Live555)? As the AVPacket->dts, pts, none, or both?
Why is my time_base value not valid? Where does this information come from?
According to the RTP payload spec, it seems that:
The RTP timestamp is set to the sampling timestamp of the content. A 90 kHz clock rate MUST be used.
Does this mean that I must always assume a 1/90000 time base for the decoder? What if some other value is specified in the SPS?
Copy the Live555 PTS into the AVPacket pts. Process the packet with avcodec_decode_video2, and then retrieve the PTS from avframe->pkt_pts; these will be monotonically increasing.
There is no need to set anything in the codec context, apart from setting the SPS and PPS in the AVCodecContext extradata.
You can find a good example in VLC's source on GitHub:
Setting AVPacket pts: https://github.com/videolan/vlc/blob/master/modules/codec/avcodec/video.c#L983
Decoding AVPacket into AVFrame: https://github.com/videolan/vlc/blob/master/modules/codec/avcodec/video.c#L1014
Retrieving the pts from the AVFrame: https://github.com/videolan/vlc/blob/master/modules/codec/avcodec/video.c#L1078
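To make that concrete, here is a minimal sketch of the same flow outside VLC. The pts_from_live555() helper and the names nal_buffer, nal_size, presentationTime and codec_ctx are hypothetical, and it assumes the decoder runs on the usual 90 kHz RTP video clock:

#include <sys/time.h>
#include <libavcodec/avcodec.h>

// Hypothetical helper: convert Live555's struct timeval presentation
// time into a PTS on the 90 kHz RTP video clock (an assumption).
static int64_t pts_from_live555(struct timeval t)
{
    return (int64_t)t.tv_sec * 90000 + (int64_t)t.tv_usec * 90000 / 1000000;
}

// Called for each NAL unit delivered by the Live555 sink.
static void decode_nal(AVCodecContext *codec_ctx, AVFrame *frame,
                       uint8_t *nal_buffer, int nal_size,
                       struct timeval presentationTime)
{
    AVPacket pkt;
    av_init_packet(&pkt);
    pkt.data = nal_buffer;  // NAL with the 00 00 00 01 start code prepended
    pkt.size = nal_size;
    pkt.pts  = pts_from_live555(presentationTime);

    int got_frame = 0;
    if (avcodec_decode_video2(codec_ctx, frame, &got_frame, &pkt) >= 0 && got_frame)
    {
        // pkt_pts carries the packet PTS reordered into presentation order
        int64_t pts = frame->pkt_pts;
        (void)pts;  // hand off to the renderer here
    }
}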
avcodec_decode_video2() reorders the frames so that they come out in presentation order.
Even if you somehow convince ffmpeg to give you a PTS on the decoded frame, it should be the same as the DTS.
//
// decode a video frame; `buffer` is the AVPacket read from the stream
//
int is_finished = 0;
avcodec_decode_video2(ctxt->video_st->codec,
                      frame,
                      &is_finished,
                      buffer);

double pts;
if (buffer->dts != AV_NOPTS_VALUE)
{
    //
    // you should end up here
    //
    pts = buffer->dts;
}
else
{
    pts = 0;
}

//
// adjust time base: convert from stream time_base units to seconds
//
pts *= av_q2d(ctxt->video_st->time_base);
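From there, the display duration of each frame is just the gap between consecutive presentation times. A minimal sketch of that step, along the lines of what ffplay does (last_pts and last_delay are assumed to persist across frames):

// display duration = difference between consecutive presentation times
double delay = pts - last_pts;   // in seconds
if (delay <= 0.0 || delay >= 1.0)
    delay = last_delay;          // implausible gap: reuse the previous delay
last_pts   = pts;
last_delay = delay;
// sleep for `delay` (minus rendering time) before showing the next frame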
Related
I am not able to get the frame width and height from the RTSP URL of an H265 camera. Can I get any guidelines for fetching the resolution?
You can process the SPS packets to find the height and width of your bit-stream. Follow the sequence parameter set RBSP semantics in the following link and walk the bytes to process the SPS data. LINK: ITU H265 DOC
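As a rough illustration of that bit-walking, here is a sketch that pulls pic_width_in_luma_samples / pic_height_in_luma_samples out of an H265 SPS. It assumes the 00 00 03 emulation-prevention bytes have already been stripped, the 2-byte NAL header has been skipped, and sps_max_sub_layers_minus1 == 0 (so profile_tier_level() is a fixed 12 bytes); a production parser must handle all of those:

#include <stdint.h>
#include <stddef.h>

// Minimal MSB-first bit reader over an RBSP buffer.
typedef struct { const uint8_t *buf; size_t pos; } BitReader;

static uint32_t get_bits(BitReader *br, int n)
{
    uint32_t v = 0;
    while (n--) {
        v = (v << 1) | ((br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1);
        br->pos++;
    }
    return v;
}

// Unsigned Exp-Golomb code, the ue(v) descriptor in the spec.
static uint32_t get_ue(BitReader *br)
{
    int zeros = 0;
    while (get_bits(br, 1) == 0)
        zeros++;
    return ((uint32_t)1 << zeros) - 1 + get_bits(br, zeros);
}

static int parse_hevc_sps_dimensions(const uint8_t *rbsp, int *w, int *h)
{
    BitReader br = { rbsp, 0 };
    get_bits(&br, 4);               // sps_video_parameter_set_id
    if (get_bits(&br, 3) != 0)      // sps_max_sub_layers_minus1
        return -1;                  // sub-layers not handled in this sketch
    get_bits(&br, 1);               // sps_temporal_id_nesting_flag
    br.pos += 12 * 8;               // profile_tier_level(), general part only
    get_ue(&br);                    // sps_seq_parameter_set_id
    if (get_ue(&br) == 3)           // chroma_format_idc
        get_bits(&br, 1);           // separate_colour_plane_flag
    *w = (int)get_ue(&br);          // pic_width_in_luma_samples
    *h = (int)get_ue(&br);          // pic_height_in_luma_samples
    return 0;
}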
My task is to build a decoder that generates exactly one raw audio frame for each raw video frame from an encoded MPEG-TS network stream, so that users can call getFrames() through the API and receive exactly these two frames.
Currently I read with av_read_frame in a thread and decode packets as they come, audio or video, collecting until a video packet is hit. The problem is that generally multiple audio packets are received before a video packet is seen.
av_read_frame blocks and returns once a certain amount of audio data has been collected (1152 samples for MP2); decoding that packet gives a raw AVFrame with duration T (which depends on the sample rate), whereas a video frame generally has a duration larger than T (which depends on the fps), so multiple audio frames arrive before it.
I'm guessing I have to find a way to merge the collected audio frames into one single frame just when a video frame is hit (see the sketch after this question). Resampling and setting the timestamp to align with the video are probably also needed. I don't know if this is even valid, though.
What is the smoothest way to sync video and audio in this manner?
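A minimal sketch of the merging step described above, using FFmpeg's AVAudioFifo; the S16/stereo format, the MP2-sized FIFO hint, and the surrounding decode loop are assumptions:

#include <libavutil/audio_fifo.h>
#include <libavutil/channel_layout.h>
#include <libavutil/frame.h>

// One-time setup (format/channels are assumptions):
// AVAudioFifo *fifo = av_audio_fifo_alloc(AV_SAMPLE_FMT_S16, 2, 1152);

// Buffer each decoded audio frame until the next video frame shows up.
static void push_audio(AVAudioFifo *fifo, AVFrame *af)
{
    av_audio_fifo_write(fifo, (void **)af->extended_data, af->nb_samples);
}

// When a video frame is decoded, drain everything collected so far
// into one merged audio frame.
static AVFrame *pop_merged_audio(AVAudioFifo *fifo)
{
    int nb = av_audio_fifo_size(fifo);
    AVFrame *merged = av_frame_alloc();
    merged->format         = AV_SAMPLE_FMT_S16;     // assumed sample format
    merged->channel_layout = AV_CH_LAYOUT_STEREO;   // assumed layout
    merged->nb_samples     = nb;
    av_frame_get_buffer(merged, 0);
    av_audio_fifo_read(fifo, (void **)merged->extended_data, nb);
    // merged->pts should be set to the PTS of the first buffered frame
    return merged;
}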
I am currently trying to encode some raw audio data together with some video inside an AVI container.
The video codec used is mpeg4, and I would like to use PCM_16LE for the audio codec, but I am facing a problem regarding the AVCodecContext->frame_size parameter for the audio samples.
After doing all the correct allocations, I try to allocate the audio frame, and for the AV_CODEC_ID_PCM_S16LE codec I don't have the codec frame_size needed to get the samples buffer size. As a result the computed sample buffer size is huge and I simply can't allocate that much memory.
Does someone know how to bypass this issue and how to manually compute the frame_size?
frame = av_frame_alloc();
if (!frame)
{
    return NULL;
}

// Problem is right here with the frame_size
frame->nb_samples     = m_pAudioCodecContext->frame_size;
frame->format         = m_pAudioStream->codec->sample_fmt;
frame->channel_layout = m_pAudioStream->codec->channel_layout;

// The codec gives us the frame size in samples, so we can calculate the
// size of the samples buffer in bytes.
// This returns a huge value due to a null frame_size.
m_audioSampleBufferSize = av_samples_get_buffer_size(NULL,
                                                     m_pAudioCodecContext->channels,
                                                     m_pAudioCodecContext->frame_size,
                                                     m_pAudioCodecContext->sample_fmt,
                                                     0);
Thank you for your help,
Robert
As you can see in the pcm_encode_init function in pcm.c,
all PCM encoders have frame_size = 0;. Why?
Because in all PCM formats there is de facto no such thing as a frame; there is no compression by the nature of PCM.
So you should decide on your own how many samples you want to store in the buffer.
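For instance, reusing the variable names from the question, one could pick an arbitrary frame size (1024 samples here is an assumption, not a requirement):

// PCM imposes no frame size, so choose one yourself.
#define PCM_SAMPLES_PER_FRAME 1024

frame->nb_samples     = PCM_SAMPLES_PER_FRAME;
frame->format         = m_pAudioCodecContext->sample_fmt;
frame->channel_layout = m_pAudioCodecContext->channel_layout;

m_audioSampleBufferSize = av_samples_get_buffer_size(NULL,
                                                     m_pAudioCodecContext->channels,
                                                     PCM_SAMPLES_PER_FRAME,
                                                     m_pAudioCodecContext->sample_fmt,
                                                     0);
// e.g. 1024 samples * 2 channels * 2 bytes (S16) = 4096 bytes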
I'm trying to play VC1-coded video in a Matroska container.
For that I'm using ffmpeg's av_read_frame function and a certain video driver, which requires the AVPacket's data to be prefixed by a PES header.
In the AVPacket only the dts field is valid; pts is AV_NOPTS_VALUE. I write the dts value into the PES header instead of the pts.
The video driver logs a constant framerate change from 23976 to 24000 and vice versa, and the video jerks. I put the framerate into the PES header (the value 23976 is what ffmpeg's probing gives), but apparently it changes according to the current packet's pts.
I tried looking at the AVCodecParserContext's pts_dts_delta and dts_ref_dts_delta, but they are AV_NOPTS_VALUE; its pts and dts are the same as the AVPacket's.
Please advise how to get proper pts values, or what to do to solve this.
Thanks.
EDIT:
I saw that in ffplay.c they use av_frame_get_best_effort_timestamp, but that's after decoding by ffmpeg's means, which I cannot afford.
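For reference, PES timestamps run on a 90 kHz clock, while Matroska packets typically use a 1/1000 time base, so whatever timestamp ends up in the PES header needs rescaling. A sketch of that conversion (fmt_ctx and pkt are assumed to come from the surrounding av_read_frame loop):

#include <libavformat/avformat.h>
#include <libavutil/mathematics.h>

// Rescale a packet timestamp from the stream's time base to the
// 90 kHz clock that PES headers expect.
static int64_t to_pes_clock(const AVFormatContext *fmt_ctx, const AVPacket *pkt)
{
    AVRational pes_tb = { 1, 90000 };
    return av_rescale_q(pkt->dts,
                        fmt_ctx->streams[pkt->stream_index]->time_base,
                        pes_tb);
}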
I use Live555 to receive RTP video frames (encoded in H264). I use Live555 to open my local .sdp file and receive the frame data. I can see DummySink::afterGettingFrame being called continuously. If fReceiveBuffer in DummySink is correct, why can't FFMPEG decode the frame? Is my code wrong?
Here is my Code Snippet:
http://paste.ubuntu.com/12529740/
The function avcodec_decode_video2 always returns failure; its return value is less than zero.
Does fReceiveBuffer contain exactly one video frame?
Oh, here is my FFMPEG init code needed to open the related video decoder:
http://paste.ubuntu.com/12529760/
I read the documentation related to H264 again and found out that I-frames (IDR) need the SPS/PPS, each separated by the 0x00000001 start code, inserted before them so that the decoder is able to decode the frame correctly (see the sketch after the links). Here are related solutions:
FFmpeg can't decode H264 stream/frame data
Decoding h264 frames from RTP stream
And now my app works fine: it can decode the frames and convert them to an OSD image for display on screen.
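A sketch of the packet layout this produces before it is handed to avcodec_decode_video2 (decode_buffer, sps, pps, nal_is_idr and the size variables are hypothetical names; the caller must ensure everything fits in decode_buffer):

#include <string.h>
#include <libavcodec/avcodec.h>

// Build an Annex-B packet: before an IDR frame, prepend SPS and PPS,
// each with its own start code, then the frame NAL itself.
static int build_annexb_packet(AVPacket *pkt, uint8_t *decode_buffer,
                               const uint8_t *sps, int sps_size,
                               const uint8_t *pps, int pps_size,
                               const uint8_t *fReceiveBuffer, int frame_size,
                               int nal_is_idr)
{
    static const uint8_t start_code[4] = { 0, 0, 0, 1 };
    uint8_t *p = decode_buffer;

    if (nal_is_idr) {
        memcpy(p, start_code, 4);  p += 4;
        memcpy(p, sps, sps_size);  p += sps_size;
        memcpy(p, start_code, 4);  p += 4;
        memcpy(p, pps, pps_size);  p += pps_size;
    }
    memcpy(p, start_code, 4);              p += 4;
    memcpy(p, fReceiveBuffer, frame_size); p += frame_size;

    av_init_packet(pkt);
    pkt->data = decode_buffer;
    pkt->size = (int)(p - decode_buffer);
    return pkt->size;  // ready for avcodec_decode_video2()
}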