h264 ffmpeg: How to initialize ffmpeg to decode NALs created with x264 - c++

I have encoded some frames using x264, using x264_encoder_encode and after that I have created AVPackets using a function like this:
bool PacketizeNals( uint8_t* a_pNalBuffer, int a_nNalBufferSize, AVPacket* a_pPacket )
{
if ( !a_pPacket )
return false;
a_pPacket->data = a_pNalBuffer;
a_pPacket->size = a_nNalBufferSize;
a_pPacket->stream_index = 0;
a_pPacket->flags = AV_PKT_FLAG_KEY;
a_pPacket->pts = int64_t(0x8000000000000000);
a_pPacket->dts = int64_t(0x8000000000000000);
}
I call this function like this:
x264_nal_t* nals;
int num_nals = encode_frame(pic, &nals);
for (int i = 0; i < num_nals; i++)
{
AVPacket* pPacket = ( AVPacket* )av_malloc( sizeof( AVPacket ) );
av_init_packet( pPacket );
if ( PacketizeNals( nals[i].p_payload, nals[i].i_payload, pPacket ) )
{
packets.push_back( pPacket );
}
}
Now what I want to do is to decode these AVPackets using avcodec_decode_video2. I think the problem is that I haven't initialized properly the decoder because to encode I used "ultrafast" profile and "zerolatency" tune ( x264 ) and to decode I don't know how to specify to ffmpeg these options.
In some examples I have read people initialize the decoder using the file where the video is stored, but in this case I have directly the AVPackets.
What I'm doing to try to decode is:
avcodec_init();
avcodec_register_all();
AVCodec* pCodec;
pCodec=avcodec_find_decoder(CODEC_ID_H264);
AVCodecContext* pCodecContext;
pCodecContext=avcodec_alloc_context();
avcodec_open(pCodecContext,pCodec);
pCodecContext->width = 320;
pCodecContext->height = 200;
pCodecContext->extradata = NULL;
unsigned int nNumPackets = packets.size();
int frameFinished = 0;
for ( auto it = packets.begin(); it != packets.end(); it++ )
{
AVFrame* pFrame;
pFrame = avcodec_alloc_frame();
AVPacket* pPacket = *it;
int iReturn = avcodec_decode_video2( pCodecContext, pFrame, &frameFinished, pPacket );
}
But in iReturn always is -1.
Can anyone help me? Sorry if my knowledge in this area es low, I'm new.
Thanks.

I have written a simple client/server application that streams raw RGB video using lib x264 for encoding and ffmpeg for decoding.
You can find the code here: https://github.com/filippobrizzi/raw_rgb_straming
It shows how to setup x264 and ffmpeg to encode/decode.

Right now you initialize the decoder like
pCodecContext->extradata = NULL;
this is not correct. You need to allocate a memory for this field and copy data from the encoder's AVCodecContext::extradata into the allocated buffer. AVCodecContext::extradata_size specifies size of this extradata buffer in bytes

Make sure that you are building correct packets. See how this is done in the ffmpeg: http://ffmpeg.org/doxygen/trunk/libx264_8c_source.html (static int encode_nals(AVCodecContext *ctx, AVPacket *pkt, x264_nal_t *nals, int nnal) and static int X264_frame(AVCodecContext *ctx, AVPacket *pkt, const AVFrame *frame, int *got_packet))

Related

FFMPEG: Decode video in h264rgb/libx264rgb error

I did a small program to encode raw images in h264rgb codec with ffmpeg.
I use this codec because I needed to encode lossless rgb images (not possible with the classic h264 codec).
But now, I have a problem. I'm not able to decode the video generated with ffmpeg. I did a second small program for that, but I get a segfault when I reach the avcodec_decode_video2() function.
I did all the initialisation correctly. I didn't forget the avcodec_register_all() and av_init_packet() functions.
Here is the code for initialisation:
_c = NULL;
_frame_nb = 0;
// Register all formats and codecs
#pragma omp critical
{
avcodec_register_all();
}
_pkt = new AVPacket;
av_init_packet(_pkt); // a defaut de pouvoir ecrire : pkt = av_packet_alloc();
if(!_pkt)
exit(1);
_codec = avcodec_find_encoder_by_name("libx264rgb");
if (!_codec) {
fprintf(stderr, "codec not found\n");
exit(1);
}
_c = avcodec_alloc_context3(_codec);
if (!_c) {
fprintf(stderr, "Could not allocate video codec context\n");
exit(1);
}
_c->debug = true;
_c->pix_fmt = (AVPixelFormat)AV_PIX_FMT_RGB24;
_c->width = this->_headerCam[this->_currentCam]->iNbCol;
_c->height = this->_headerCam[this->_currentCam]->iNbLine;
_picture = av_frame_alloc();
if (!_picture) {
fprintf(stderr, "Could not allocate video _picture\n");
exit(1);
}
_tmp_picture = av_frame_alloc();
if (!_tmp_picture) {
fprintf(stderr, "Could not allocate video _tmp_picture\n");
exit(1);
}
_tmp_picture->format = (AVPixelFormat)AV_PIX_FMT_RGB24;
_tmp_picture->width = this->_headerCam[this->_currentCam]->iNbCol;
_tmp_picture->height = this->_headerCam[this->_currentCam]->iNbLine;
_tmp_picture->linesize[0] = this->_headerCam[this->_currentCam]->iNbCol;
/* open it */
if (avcodec_open2(_c, _codec, NULL) < 0) {
fprintf(stderr, "could not open codec\n");
exit(1);
}
And the decode function:
_pkt->data = NULL; // packet data will be allocated by the encoder
_pkt->size = 0;
unsigned char * inbuf;
inbuf = (uint8_t*)av_malloc(w*h*3);
//! read img size
int size_img;
fread(&size_img, sizeof(int), 1, this->_pFile);
_pkt->size = fread(inbuf, 1, size_img, this->_pFile);
_pkt->data = (unsigned char*)inbuf;
if(_pkt->size)
{
len = avcodec_decode_video2(_c, _tmp_picture, &got_picture, _pkt);
...
}
Any idea?
+What the comment says, instead of inbuf = (uint8_t*)av_malloc(w*h*3); you should use this:
int buffer_size = av_image_get_buffer_size((AVPixelFormat)AV_PIX_FMT_RGB24, w, h, 1);
inbuf = (uint8_t*)av_malloc(buffer_size);
Because of portability for alignment and such issues.
And also fix this line _tmp_picture->linesize[0] = ...
with this:
_tmp_picture->linesize[0] = av_image_get_linesize((AVPixelFormat)AV_PIX_FMT_RGB24, w, 0);
Hope that helps.
Well, I finally found a beginning of answer:
I used avcodec_find_encoder_by_name() instead of the avcodec_find_decoder_by_name()... It's certainly better if I want to decode data...
But now, I have the problem that _codec = avcodec_find_decoder_by_name("libx264rgb"); doesn't find the codec. But it works for _codec = avcodec_find_encoder_by_name("libx264rgb"); when I encoded my video.
EDIT : With _codec = avcodec_find_decoder(AV_CODEC_ID_H264); instead of _codec = avcodec_find_decoder_by_name("libx264rgb");, it is able to decode my video. It's weird but ok. Now, I hope I get lossless images.
EDIT2: Now, everything works! I got lossless decoded images with the AV_CODEC_ID_H264 decoder. To summarize, if you want to decode a video encoded with libx264rgb, you can use the classic AV_CODEC_ID_H264 codec.

Using libavformat to mux H.264 frames into RTP

I have an encoder that produces a series of H.264 I-frames and P-frames. I'm trying to use libavformat to mux and transmit these frames over RTP, but I'm stuck.
My program sends RTP data, but the RTP timestamp increments by 1 each successive frame, instead of 90000/fps. It also doesn't look like it's doing the proper framing for H.264 NAL, since I can't decode the stream as H.264 in Wireshark.
I suspect that I'm not setting up the codec information properly, but it appears in many places in the output format context, so it's unclear what exactly needs to be setup. The examples seem to all copy codec context info from encoders, which isn't my use case.
This is what I'm trying:
int main() {
AVFormatContext context = avformat_alloc_context();
if (!context) {
printf("avformat_alloc_context failed\n");
return;
}
AVOutputFormat *format = av_guess_format("rtp", NULL, NULL);
if (!format) {
printf("av_guess_format failed\n");
return;
}
context->oformat = format;
snprintf(context->filename, sizeof(context->filename), "rtp://%s:%d", "192.168.2.16", 10000);
if (avio_open(&(context->pb), context->filename, AVIO_FLAG_READ_WRITE) < 0) {
printf("avio_open failed\n");
return;
}
stream = avformat_new_stream(context, NULL);
if (!stream) {
printf("avformat_new_stream failed\n");
return;
}
stream->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
stream->codecpar->codec_id = AV_CODEC_ID_H264;
stream->codecpar->width = 1920;
stream->codecpar->height = 1080;
avformat_write_header(context, NULL);
...
write packets
...
}
Example write packet:
int write_packet(uint8_t *data, int size) {
AVPacket p;
av_init_packet(&p);
p.data = buffer;
p.size = size;
p.stream_index = stream->index;
av_interleaved_write_frame(context, &p);
}
I've even went so far to build in libx264, find the encoder, and copy the codec context info from there into the stream codecpar, with the same result. My goal is to build without libx264, and any other libs that aren't required, but it isn't clear whether libx264 is required for defaults such as time base.
How can the libavformat RTP muxer be initialized to properly send H.264 frames over RTCP+RTP?

Decoding loop logic from matroska (mkv, webm) to audio (C++ via libvorbis)

(I'm not fluent in english i'll try to do my best)
I try to code (C++) a simple mkv player. I'm very new in this subject, so I discover all I need little by little. For the beginning, I use VP8 codec for video and Vorbis for audio.
The video side seem ok for now, but I'm in trouble with audio.
I can't figure out the loop logic to decode the audio frames I get from mkvparser with the libvorbis.
I looked up to this sample and this brief explanation but can't manage to make it work in my case. And I didn't find other simple examples.
Here is a chunk of my code:
const mkvparser::Block* const pBlock = m_pMkvContext->pBlockEntry->GetBlock();
const mkvparser::Track* const pTrack = m_pMkvContext->pTracks->GetTrackByNumber( (unsigned long)pBlock->GetTrackNumber() );
if ( pTrack != NULL )
{
const long long trackType = pTrack->GetType();
const int frameCount = pBlock->GetFrameCount();
if ( frameCount > 0 )
{
const mkvparser::Block::Frame& oFrame = pBlock->GetFrame( 0 );
unsigned char* pData = (unsigned char*)malloc( (size_t)oFrame.len );
oFrame.Read( &m_pMkvContext->oReader, pData );
if ( trackType == mkvparser::Track::kVideo )
{
// i'm ok here
}
else if ( trackType == mkvparser::Track::kAudio )
{
// what to do here with my audio frame data ?
}
free( pData );
}
}
And maybe the way I get frames is good for video and not for audio...
Do you guys know some good resources to share about it? Or some advices?
Thanks for help !
[EDIT] : I forgot to add one of my try:
bool MoviePlayer::DecodeAudioData( unsigned char* pData, uint32 iSize )
{
int ret;
char* pBuffer = NULL;
pBuffer = ogg_sync_buffer( &m_pOVContext->oOggSyncState, iSize );
memcpy( pBuffer, pData, iSize );
ogg_sync_wrote( &m_pOVContext->oOggSyncState, iSize );
ret = ogg_sync_pageout( &m_pOVContext->oOggSyncState, &m_pOVContext->oOggPage );
ret = ogg_stream_init( &m_pOVContext->oOggStreamState, ogg_page_serialno(&m_pOVContext->oOggPage) );
ret = ogg_stream_pagein( &m_pOVContext->oOggStreamState, &m_pOVContext->oOggPage );
int iPacketsCount = ogg_page_packets( &m_pOVContext->oOggPage );
for ( int i = 0; i < iPacketsCount; ++i )
{
ret = ogg_stream_packetout(&m_pOVContext->oOggStreamState, &m_pOVContext->oOggPacket);
// do something with the packet...
}
return true;
}
It crashes at ogg_sync_pageout, as my ogg_page was not correctly initialized.
But, not coming from a proper .ogg file as in the examples i found, i don't know how to correctly initialize the vorbis structures.
https://matroska.org/technical/specs/codecid/index.html
in A_VORBIS section
The private data contains the first three Vorbis packet in order....
and the codec private is here
https://matroska.org/technical/specs/index.html
"CodecPrivate 3 [63][A2]"

PTS and DTS calculation for video and audio frames

I am receiving video H264 encoded data and audio G.711 PCM encoded data from two different threads to mux / write into mov multimedia container.
The writer function signatures are like:
bool WriteAudio(const unsigned char *pEncodedData, size_t iLength);
bool WriteVideo(const unsigned char *pEncodedData, size_t iLength, bool const bIFrame);
And the function for adding audio and video streams looks like:
AVStream* AudioVideoRecorder::AddMediaStream(enum AVCodecID codecID) {
Log("Adding stream: %s.", avcodec_get_name(codecID));
AVCodecContext* pCodecCtx;
AVStream* pStream;
/* find the encoder */
AVCodec* codec = avcodec_find_encoder(codecID);
if (!codec) {
LogErr("Could not find encoder for %s", avcodec_get_name(codecID));
return NULL;
}
pStream = avformat_new_stream(m_pFormatCtx, codec);
if (!pStream) {
LogErr("Could not allocate stream.");
return NULL;
}
pStream->id = m_pFormatCtx->nb_streams - 1;
pStream->time_base = (AVRational){1, VIDEO_FRAME_RATE};
pCodecCtx = pStream->codec;
switch(codec->type) {
case AVMEDIA_TYPE_VIDEO:
pCodecCtx->codec_id = codecID;
pCodecCtx->bit_rate = VIDEO_BIT_RATE;
pCodecCtx->width = PICTURE_WIDTH;
pCodecCtx->height = PICTURE_HEIGHT;
pCodecCtx->gop_size = VIDEO_FRAME_RATE;
pCodecCtx->pix_fmt = PIX_FMT_YUV420P;
m_pVideoStream = pStream;
break;
case AVMEDIA_TYPE_AUDIO:
pCodecCtx->codec_id = codecID;
pCodecCtx->sample_fmt = AV_SAMPLE_FMT_S16;
pCodecCtx->bit_rate = 64000;
pCodecCtx->sample_rate = 8000;
pCodecCtx->channels = 1;
m_pAudioStream = pStream;
break;
default:
break;
}
/* Some formats want stream headers to be separate. */
if (m_pOutputFmt->flags & AVFMT_GLOBALHEADER)
m_pFormatCtx->flags |= CODEC_FLAG_GLOBAL_HEADER;
return pStream;
}
Inside WriteAudio(..) and WriteVideo(..) functions, I am creating AVPakcet using av_init_packet(...) and set pEncodedData and iLength as packet.data and packet.size. I printed packet.pts and packet.dts and its equivalent to AV_NOPTS_VALUE.
Now, how do I calculate the PTS, DTS, and packet duration (packet.dts, packet.pts and packet.duration) correctly for both audio and video data so that I can sync audio & video and play it properly? I saw many examples on the internet, but none of them are making sense to me. I am new with ffmpeg, and my conception may not be correct in some context. I want to do it in the appropriate way.
Thanks in advance!
EDIT: In my video streams, there is no B frame. So, I think PTS and DTS can be kept the same here.
PTS/DTS are timestamps, they should be set to the timestamps of the input data. I don't know where your date comes from, but any input has some form of timestamps associated with it. Typically, the timestamps of the input media file or a system clock-derived metric if you're recording from your soundcard+webcam, and so on. You should convert these numbers into the form expected, and then assign them to AVPacket.pts/dts.

FFMPEG I/O output buffer

I'm currently having issues trying to encapsulate raw H264 nal packets into a mp4 container. Instead of writing them to disk however, I want to have the result stored in memory. I followed this approach Raw H264 frames in mpegts container using libavcodec but haven't been successful so far.
First, is this the right way to write to memory? I have a small struct in my header
struct IOOutput {
uint8_t* outBuffer;
int bytesSet;
};
where I initialize the buffer and bytesset. I then initialize my AVIOContext variable
AVIOContext* pIOCtx = avio_alloc_context(pBuffer, iBufSize, 1, outptr, NULL, write_packet, NULL);
where outptr is a void pointer to IOOutput output, and write_packet looks like the following
int write_packet (void *opaque, uint8_t *buf, int buf_size) {
IOOutput* out = reinterpret_cast<IOOutput*>(opaque);
memcpy(out->outBuffer+out->bytesSet, buf, buf_size);
out->bytesSet+=buf_size;
return buf_size;
}
I then set
fc->pb = pIOCtx;
fc->flags = AVFMT_FLAG_CUSTOM_IO;
on my AVFormatContext *fc variable.
Then, whenever I encode the nal packets I have from a frame, I write them to the AVFormatContext via av_interleaved_write_frame and then get the mp4 contents via
void getBufferContent(char* buffer) {
memcpy(buffer, output.outBuffer, output.bytesSet);
output.bytesSet=0;
}
and thus reset the variable bytesSet, so during the next writing operation bytes will be inserted at the start of the buffer. Is there a better way to do this? Is this actually a valid way to do it? Does FFMPEG do any reading operation if I only do call av_interleaved_write_frame and avformat_write_header in order to add packets?
Thank you very much in advance!
EDIT
Here is the code regarding the muxing process - in my encode Function I have something like
int frame_size = x264_encoder_encode(obj->mEncoder, &obj->nals, &obj->i_nals, obj->pic_in, obj->pic_out);
int total_size=0;
for(int i=0; i<obj->i_nals;i++)
{
if ( !obj->fc ) {
obj->create( obj->nals[i].p_payload, obj->nals[i].i_payload );
}
if ( obj->fc ) {
obj->write_frame( obj->nals[i].p_payload, obj->nals[i].i_payload);
}
}
// Here I get the output values
int currentBufferSize = obj->output.bytesSet;
char* mem = new char[currentBufferSize];
obj->getBufferContent(mem);
And the create and write functions look like this
int create(void *p, int len) {
AVOutputFormat *of = av_guess_format( "mp4", 0, 0 );
fc = avformat_alloc_context();
// Add video stream
AVStream *pst = av_new_stream( fc, 0 );
vi = pst->index;
void* outptr = (void*) &output;
// Create Buffer
pIOCtx = avio_alloc_context(pBuffer, iBufSize, 1, outptr, NULL, write_packet, NULL);
fc->oformat = of;
fc->pb = pIOCtx;
fc->flags = AVFMT_FLAG_CUSTOM_IO;
pcc = pst->codec;
AVCodec c= {0};
c.type= AVMEDIA_TYPE_VIDEO;
avcodec_get_context_defaults3( pcc, &c );
pcc->codec_type = AVMEDIA_TYPE_VIDEO;
pcc->codec_id = codec_id;
pcc->bit_rate = br;
pcc->width = w;
pcc->height = h;
pcc->time_base.num = 1;
pcc->time_base.den = fps;
}
void write_frame( const void* p, int len ) {
AVStream *pst = fc->streams[ vi ];
// Init packet
AVPacket pkt;
av_init_packet( &pkt );
pkt.flags |= ( 0 >= getVopType( p, len ) ) ? AV_PKT_FLAG_KEY : 0;
pkt.stream_index = pst->index;
pkt.data = (uint8_t*)p;
pkt.size = len;
pkt.dts = AV_NOPTS_VALUE;
pkt.pts = AV_NOPTS_VALUE;
av_interleaved_write_frame( fc, &pkt );
}
See the AVFormatContext.pb documentation. You set it correctly, but you shouldn't touch AVFormatContext.flags. Also, make sure you set it before calling avformat_write_header().
When you say "it doesn't work", what exactly doesn't work? Is the callback not invoked? Is the data in it not of the expected type/format? Something else? If all you want to do is write raw nal packets, then you could just take encoded data directly from the encoder (in the AVPacket), that's the raw nal data. If you use libx264's api directly, it even gives you each nal individually so you don't need to parse it.