I am trying to write a C++ program that reads key frames from a video file using ffmpeg.
So far I have managed to get all the frames using av_read_frame, which reads the file sequentially,
frame by frame.
But I am having some problems using av_seek_frame, which (if I am correct) is supposed to do the trick for keyframes.
int av_seek_frame(AVFormatContext *s, int stream_index, int64_t timestamp, int flags);
I have the AVFormatContext, but what are the correct values for the other arguments to sequentially get only the keyframes?
Is there another function that I can use instead?
Thanks
EDIT: With av_read_frame I get an AVPacket, which I can use to get the frame data, but how can I get a packet using av_seek_frame?
SOLUTION: OK, there is a simple flag in AVFrame->key_frame. It is set if the frame is a keyframe.
av_seek_frame has the ability to seek to a certain timestamp in a video file. It takes 4 parameters: a pointer to the AVFormatContext, a stream index, the timestamp to seek to and flags to select the direction and seeking mode.
The function will then seek to the first key frame before the given timestamp.
Check the documentation of that function for more information.
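Since the question asks for every keyframe in order, one option that avoids seeking altogether is to keep reading with av_read_frame and skip the packets that are not marked as key. A minimal sketch of that filter follows. PacketInfo and keyframe_timestamps are hypothetical names, and the struct is only a stand-in for AVPacket so that the snippet compiles without the FFmpeg headers; in real code the test would be pkt.flags & AV_PKT_FLAG_KEY on the packet returned by av_read_frame.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two AVPacket fields this sketch needs.
struct PacketInfo {
    int64_t pts;   // presentation timestamp of the packet
    int flags;     // bit field, like AVPacket::flags
};

// Same value as libavcodec's AV_PKT_FLAG_KEY, redefined locally so the
// sketch does not depend on the FFmpeg headers.
constexpr int kPktFlagKey = 0x0001;

// Returns the pts of every packet marked as a keyframe, in read order.
std::vector<int64_t> keyframe_timestamps(const std::vector<PacketInfo>& packets) {
    std::vector<int64_t> out;
    for (const PacketInfo& p : packets) {
        if (p.flags & kPktFlagKey) {
            out.push_back(p.pts);
        }
    }
    return out;
}
```

The same loop shape works when the packets come straight from the demuxer: only the packets that pass the flag test need to be handed to the decoder.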
How should I validate pts/dts after demuxing and then after decoding?
For me it is significant to have valid pts all the time for days and
possibly weeks of continuous streaming.
After demuxing I check:
dts <= pts
prev_packet_dts < next_packet_pts
I also discard packets with AV_NOPTS_VALUE and wait for packets with
a proper pts, because I don't know the video duration in this case.
The pts of successive packets may not be increasing because of I-P-B frames.
Is that all right?
What about decoded AVFrames?
Should 'pts' be increasing all the time?
Why could 'pts' lag behind 'dts' at some point?
Why is pict_type a field of AVFrame? Shouldn't it be in AVPacket, because
AVPacket is a compressed frame, not the other way around?
Ideally, yes, unless your format allows discontinuities or wraps timestamps around due to overflow, as MPEG-TS does.
Writing error.
It is an informational field, indicating the provenance of the frame. It can be used by filters or encoders, e.g. keyframe alignment during a re-encode.
The libav support channel advised me not to rely on decoder output. It is more robust to produce pts/dts for encoding/muxing manually, and I should look at the ffmpeg tools' sources for a proper implementation. I will look into this approach.
For now I discard AVFrames only with AV_NOPTS_VALUE, and the rest of encoding/muxing works fine.
Validation of AVPackets after Demuxing remains the same, as described above.
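The demuxing checks listed in the question (drop packets without timestamps, require dts <= pts, require the previous dts to be below the next pts) can be sketched as a small stateful filter. This is only an illustration of those three rules as stated, not a recommended validation policy. Timestamps and PacketValidator are hypothetical names, and kNoPts is a local copy of the value of AV_NOPTS_VALUE so the snippet builds without FFmpeg.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Local copy of the value of AV_NOPTS_VALUE (the minimum int64_t).
constexpr int64_t kNoPts = std::numeric_limits<int64_t>::min();

// Hypothetical stand-in for the timestamp fields of a demuxed packet.
struct Timestamps {
    int64_t pts;
    int64_t dts;
};

class PacketValidator {
public:
    // Returns true if the packet passes the checks from the question.
    // Rejected packets do not update the remembered dts.
    bool accept(const Timestamps& t) {
        if (t.pts == kNoPts || t.dts == kNoPts) return false;   // missing timestamp
        if (t.dts > t.pts) return false;                        // dts <= pts violated
        if (has_prev_ && !(prev_dts_ < t.pts)) return false;    // prev dts < next pts violated
        prev_dts_ = t.dts;
        has_prev_ = true;
        return true;
    }

private:
    bool has_prev_ = false;
    int64_t prev_dts_ = 0;
};
```

Note that the filter deliberately does not require pts to increase from packet to packet, since B-frames reorder presentation times.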
I'm trying to playback an audio CD by using cd_paranoia (from the cdio package) and to hand over the data read to the ALSA sound output. Buffered, of course. My issue is now the following: As stated in this example program, a call to paranoia_read () returns an int16_t* containing one sector (2,352 bytes) of audio data, which can be then cast into a char*.
The ALSA snd_pcm_writei () method, on the other hand, needs a chunk of audio data in a char*, whose length is to be determined by using the snd_pcm_hw_params_get_period_size () method, which basically returns the count of bytes sent to the sound device until it triggers an interrupt. See also this example source code.
The two methods will almost certainly return different values, because an ALSA frame has a different size than a CD sector. This would mean I'd have to split the data cd-paranoia delivers somehow, so that it fits into ALSA's frame structure. Or would it be sufficient just to stream the CD audio data into a big byte array (std::queue<char>) and then, step by step, read as many bytes from this array as I need to get a complete ALSA "frame"?
Any hints? Thank you.
snd_pcm_writei() handles any number of frames.
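Because the third argument of snd_pcm_writei() is a frame count rather than a byte count, a whole sector can be handed over once its size is converted to frames. A minimal sketch of that arithmetic, assuming the device is configured for interleaved 16-bit stereo (the CD audio layout); frames_in is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kSectorBytes = 2352;    // one audio sector, as stated in the question
constexpr std::size_t kChannels = 2;          // CD audio is stereo
constexpr std::size_t kBytesPerSample = 2;    // 16-bit samples
constexpr std::size_t kBytesPerFrame = kChannels * kBytesPerSample;

// Converts a byte count of interleaved 16-bit stereo audio to ALSA frames.
constexpr std::size_t frames_in(std::size_t bytes) {
    return bytes / kBytesPerFrame;
}

// A sector holds a whole number of frames, so nothing is left over.
static_assert(kSectorBytes % kBytesPerFrame == 0, "sector must be frame-aligned");
```

With these numbers one sector is 588 frames, so each buffer returned by paranoia_read() could be passed to snd_pcm_writei() with a frame count of frames_in(kSectorBytes), without regrouping it to the period size.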
I use the ISampleGrabber SampleCB callback to get audio samples. I can get the buffer and its length from the IMediaSample, and I use avcodec_fill_audio_frame(frame,ost->enc->channels,ost->enc->sample_fmt,(uint8_t *)buffer,length,0) to make an AVFrame, but this frame does not produce any audio in my muxed file. I think the length is much smaller than frame_size.
Can anyone help me, please, or give me an example if possible?
Thank you.
This is my SampleCB code:
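HRESULT AudioSampleGrabberCallBack::SampleCB(double Time, IMediaSample *pSample) {
    BYTE *pBuffer = NULL;
    // Bail out if the sample has no accessible buffer.
    HRESULT hr = pSample->GetPointer(&pBuffer);
    if (FAILED(hr)) return hr;
    long BufferLen = pSample->GetActualDataLength();
    muxer->PutAudioFrame(pBuffer, BufferLen);
    return S_OK;
}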
And this is the sample grabber pin media type:
AM_MEDIA_TYPE pmt2;
ZeroMemory(&pmt2, sizeof(AM_MEDIA_TYPE));
pmt2.majortype = MEDIATYPE_Audio;
pmt2.subtype = FOURCCMap(0x1602);
pmt2.formattype = FORMAT_WaveFormatEx;
hr = pSampleGrabber_audio->SetMediaType(&pmt2);
After that I use the ffmpeg muxing example to process the frames, and I think I only need to change the signal-generating part of the code:
AVFrame *Muxing::get_audio_frame(OutputStream *ost, BYTE *buffer, long length)
{
    AVFrame *frame = ost->tmp_frame;
    int j, i, v;
    uint16_t *q = (uint16_t*)frame->data[0];
    int buffer_size = av_samples_get_buffer_size(NULL, ost->enc->channels,
                                                 ost->enc->frame_size,
                                                 ost->enc->sample_fmt, 0);
    // uint8_t *sample = (uint8_t *) av_malloc(buffer_size);
    av_samples_alloc(&frame->data[0], frame->linesize, ost->enc->channels, ost->enc->frame_size, ost->enc->sample_fmt, 1);
    avcodec_fill_audio_frame(frame, ost->enc->channels, ost->enc->sample_fmt, frame->data[0], buffer_size, 1);
    frame->pts = ost->next_pts;
    ost->next_pts += frame->nb_samples;
    return frame;
}
The code snippets suggest you are getting AAC data using Sample Grabber and you are trying to write that into file using FFmpeg's libavformat. This can work out.
You initialize your sample grabber to get audio data in WAVE_FORMAT_AAC_LATM format. This format is not very widespread, so you should review your filter graph to make sure the upstream connection on the Sample Grabber is what you expect. There is a chance that some odd chain of filters pretends to produce AAC-LATM while in reality the data is invalid (or does not even reach the grabber callback). So you need to review the filter graph (see Loading a Graph From an External Process and Understanding Your DirectShow Filter Graph), then step through your callback with a debugger to make sure you get the data and that it makes sense.
Next, you need to initialize the AVFormatContext and AVStream to indicate that you will be writing data in AAC LATM format. The provided code does not show that you are doing this correctly. The sample you are referring to uses default codecs.
Related reading: Support LATM AAC in MP4 container
Then, you need to make sure that the incoming data and your FFmpeg output setup agree on whether the data has ADTS headers; the provided code does not shed any light on this.
Furthermore, I am afraid you might be preparing your audio data incorrectly. The sample in question generates raw audio data and applies an encoder to produce compressed content using avcodec_encode_audio2. Then a packet with compressed audio is sent for writing using av_interleaved_write_frame. The way you attached your code snippets to the question makes me think you are doing it wrong. For starters, you still don't show the relevant code, which makes me think you have trouble identifying which code is relevant. Then, in the get_audio_frame snippet you treat your AAC data as if it were raw PCM audio. Instead, review the FFmpeg sample code keeping in mind that you already have compressed AAC data, so you enter the sample at the point right after the avcodec_encode_audio2 call returns. This is where you are supposed to merge your code and the sample.
I'm using libav to read an MPEG stream.
I'm using the function av_read_frame() to read some frames into packets:
av_read_frame(pFormatCtx, &packet)
I then use the function avcodec_decode_video2 to decode the packet into a frame.
The documentation of avcodec_decode_video2 contains the following warning:
The input buffer must be FF_INPUT_BUFFER_PADDING_SIZE larger than the
actual read bytes because some optimized bitstream readers read 32 or
64 bits at once and could read over the end. The end of the input
buffer buf should be set to 0 to ensure that no overreading happens
for damaged MPEG streams.
I wanted to know whether av_read_frame already allocates the additional FF_INPUT_BUFFER_PADDING_SIZE.
Thank you.
Yes, av_read_frame() always adds FF_INPUT_BUFFER_PADDING_SIZE for you. You only need to care about that if you use your own demuxed data as input to avcodec_decode_video2(), e.g. if you write your own demuxers (like what VLC or mplayer do).
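For the other case, where the compressed data comes from your own demuxer, the padding has to be added by hand. A minimal sketch of that step follows, under stated assumptions: make_padded_buffer is a hypothetical helper, and the padding size is passed in as a parameter because the value of FF_INPUT_BUFFER_PADDING_SIZE depends on the library version and should be taken from the headers you build against.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies `size` bytes of compressed data into a buffer that is `padding`
// bytes larger, with the extra tail set to zero as the warning requires.
std::vector<uint8_t> make_padded_buffer(const uint8_t* data,
                                        std::size_t size,
                                        std::size_t padding) {
    std::vector<uint8_t> buf(size + padding, 0);   // whole buffer starts zeroed
    if (size > 0) {
        std::memcpy(buf.data(), data, size);       // payload goes at the front
    }
    return buf;
}
```

The decoder would then be given buf.data() as the packet data and the original size, not the padded size, as the packet length.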
Can anyone help decipher the correct implementation of the libspotify get_audio_buffer_stats callback? Specifically, we are supposed to populate an sp_audio_buffer_stats struct, consisting of samples and stutter.
According to the Docs:
int samples - Samples in buffer.
int stutter - Number of stutters (audio dropouts) since last query.
I'm wondering about "samples." What exactly is this referring to?
The music playback (audio_delivery) callback has a num_frames variable, but then you have the issue of audio format (channels and/or sample_rate).
Is it correct to set "samples" to the total number of "num_frames" currently in my buffer? Or do I need to do some math based on the total "num_samples", "channels", and "sample_rate"?
It should be the number of frames in your output buffer. I.e. int samples is slightly misnamed and should probably be called int frames instead.
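If your buffer already counts the num_frames values it received, that running total is the number to report. If it instead tracks individual interleaved sample values, the conversion is a division by the channel count; the sample rate does not enter into it. A minimal sketch of the second case, where frames_in_buffer and its parameters are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>

// Converts a count of interleaved sample values (one value per channel
// per frame) into a frame count. Returns 0 for an invalid channel count.
constexpr int frames_in_buffer(std::size_t buffered_values, int channels) {
    return channels > 0
        ? static_cast<int>(buffered_values / static_cast<std::size_t>(channels))
        : 0;
}
```

For example, 4096 buffered values of stereo audio correspond to 2048 frames, which is what would go into the samples field.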