I'm trying to encode images into an H264 MP4 video. The issue I'm having is that some of the images are skipped, or are simply missing at the end of the video. I need the video to play every single image I encode, since it is an animation.
Any help setting the encoder properly would be greatly appreciated!
Encoder settings:
AVCodecContext *c;
...
c->codec_id = AV_CODEC_ID_H264;
c->bit_rate = mOutputWidth*mOutputHeight*4;//400000;
/* Resolution must be a multiple of two. */
c->width = mOutputWidth;
c->height = mOutputHeight;
/* timebase: This is the fundamental unit of time (in seconds) in terms
* of which frame timestamps are represented. For fixed-fps content,
* timebase should be 1/framerate and timestamp increments should be
* identical to 1. */
c->time_base.den = mFps;
c->time_base.num = 1;
c->gop_size = 12; /* emit one intra frame every twelve frames at most */
c->pix_fmt = AV_PIX_FMT_YUV420P;
...
av_dict_set(&pOptions, "preset", "medium", 0);
av_dict_set(&pOptions, "tune", "animation", 0);
/* open the codec */
ret = avcodec_open2(c, codec, &pOptions);
if (ret < 0) {
    LOGE("Could not open video codec: %s", av_err2str(ret));
    return -1;
}
Update 07/24/13:
I was able to achieve a better video: setting gop_size = FPS, and writing the last video frame FPS+1 times, seemed to resolve all the issues. It seems odd to me to have to do that, but maybe it is standard practice in the video-encoding world? Any tips or feedback about this?
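For reference, trailing frames going missing is consistent with encoder delay: libx264 buffers frames internally, and the buffered frames have to be drained at end of stream. With the same-era API as in the question, the drain loop is commonly written as below. This is only a sketch: `c` is assumed to be the opened AVCodecContext, and `write_packet()` is a hypothetical stand-in for your own muxing code.

```c
/* Drain delayed frames at end of stream: pass a NULL frame until the
 * encoder reports it has no more output. */
int got_output = 1;
while (got_output) {
    AVPacket pkt;
    av_init_packet(&pkt);
    pkt.data = NULL;    /* packet buffer will be allocated by the encoder */
    pkt.size = 0;

    if (avcodec_encode_video2(c, &pkt, NULL, &got_output) < 0)
        break;          /* encoding error; handle/log as appropriate */

    if (got_output) {
        write_packet(&pkt);   /* hypothetical helper that muxes the packet */
        av_free_packet(&pkt);
    }
}
```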
From what I understand, you have a set of images and want to make a video out of them. If that is the case and you don't care about the size of the video, you can try disabling inter prediction. Maybe the encoder decides that some of the images are not needed and skips them.
Inter frame prediction can be disabled by setting gop_size to 0.
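The suggestion above, sketched against the field names from the question's own setup. Treat this as a hedged fragment rather than a verified fix; whether gop_size = 0 actually yields intra-only output can depend on the encoder build.

```c
/* Intra-only sketch: make every frame a keyframe so the encoder cannot
 * "absorb" an image into a neighboring frame via inter prediction. */
c->gop_size = 0;      /* as suggested above */
c->max_b_frames = 0;  /* also avoid B-frame reordering */
```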
So I'm writing a C++ program that will take a WAV file, generate a visualization, and export the video alongside the audio using ffmpeg (via a pipe). I've been able to pipe output to ffmpeg just fine, and a video with the visualization and audio is created by ffmpeg.
The problem is that the video and audio are desyncing. The video is just too fast and ends before the song is completed (the video file is the correct length; the waveform just flatlines and ends, indicating that ffmpeg reached the end of the video stream and reused the last frame it received until the audio ended). So I'm not sending enough frames to ffmpeg.
Below is a truncated version of the source code:
// Example code
int main()
{
    // LoadAudio();
    uint32_t frame_max = audio.sample_rate / 24; // 24 frames per second
    uint32_t frame_counter = 0;
    // InitializePipe2FFMPEG();
    // (Left channel and right channel are always equal in size)
    for (uint32_t i = 0; i < audio.left_channel.size(); ++i)
    {
        // UpdateWaveform4Image();
        if (frame_counter % frame_max == 0)
        {
            // DrawImageAndSend2Pipe();
            frame_counter = 1;
        }
        else
        {
            ++frame_counter;
        }
    }
    // FlushAndClosePipe();
    return 0;
}
The commented-out functions are fake and irrelevant. I know they're irrelevant because UpdateWaveform4Image() updates the waveform used to generate the image on every sample. (I know that's inefficient, but I'll worry about optimization later.) The waveform is a std::vector in which each element stores the y-coordinate of one sample. It has no effect on when the program generates a new frame for the video.
Also, ffmpeg is set to output 24 frames per second--trust me, I thought that was the problem too, because by default ffmpeg outputs at 25 fps.
My line of thinking for the modulus check is that frame_counter is incremented every sample. frame_max equals 2000 because 48000 / 24 = 2000. I know the audio is clocked at 48kHz because I created the file myself. So it SHOULD generate a new image every 2000 samples.
Here is a link to the output video: [output]
Any advice would be helpful.
EDIT: Skip to 01:24 to see the waveform flatline.
I need to record frames in real time. To test this situation, I make pts non-linear (since frames may be lost), thus:
// AVFrame
video_frame->pts = prev_pts + 2;
I use libavformat to write to a file. Parameters AVCodecContext and AVStream:
#define STREAM_FRAME_RATE 25
#define CODEC_PIX_FMT AV_PIX_FMT_YUV420P
#define FRAME_WIDTH 1440
#define FRAME_HEIGHT 900
// AVCodecContext
cc->codec_id = video_codec->id;
cc->bit_rate = 400000;
cc->width = FRAME_WIDTH;
cc->height = FRAME_HEIGHT;
cc->gop_size = 12;
cc->pix_fmt = CODEC_PIX_FMT;
// AVStream
video_stream->time_base = AVRational{ 1, STREAM_FRAME_RATE };
cc->time_base = video_stream->time_base;
cc->framerate = AVRational{ STREAM_FRAME_RATE , 1 };
Write to file:
static int write_frame(AVFormatContext *fmt_ctx, const AVRational *time_base, AVStream *st, AVPacket *pkt)
{
    /* rescale output packet timestamp values from codec to stream timebase */
    //av_packet_rescale_ts(pkt, *time_base, st->time_base);
    pkt->pts = av_rescale_q(pkt->pts, *time_base, st->time_base);
    pkt->dts = av_rescale_q(pkt->dts, *time_base, st->time_base);
    pkt->stream_index = st->index;
    /* Write the compressed frame to the media file. */
    //log_packet(fmt_ctx, pkt);
    //return av_write_frame(fmt_ctx, pkt);
    return av_interleaved_write_frame(fmt_ctx, pkt);
}
If I use the AVI container, the frame-rate information in the file is reported correctly: 25 fps.
If I use the MP4 container, the frame-rate information in the file is reported incorrectly: 12.5 fps.
Please tell me, what other settings need to be added?
MP4s do not store a frame rate; AVIs do.
In MP4, only per-packet timing info is stored. Since your pts expression is video_frame->pts = prev_pts + 2 and the stream time base is 1/25, frames are spaced 80 ms apart, and hence ffmpeg (correctly) probes the frame rate as 12.5 fps.
AVIs do not have per-frame timing. Instead, they record the user-supplied frame rate. If a packet's pts is more than 1/fps later than the previous frame's pts, the muxer writes skip frames (empty packets) to maintain the frame rate.
Here is the code I used to decode a rtsp stream in a worker thread:
while (1)
{
    // Read a frame
    if (av_read_frame(pFormatCtx, &packet) < 0)
        break; // Frame read failed (e.g. end of stream)
    if (packet.stream_index == videoStream)
    {
        // Is this a packet from the video stream -> decode video frame
        int frameFinished;
        avcodec_decode_video2(pCodecCtx, pFrame, &frameFinished, &packet);
        // Did we get a video frame?
        if (frameFinished)
        {
            if (LastFrameOk == false)
            {
                LastFrameOk = true;
            }
            // Convert the image format (init the context the first time)
            int w = pCodecCtx->width;
            int h = pCodecCtx->height;
            img_convert_ctx = ffmpeg::sws_getCachedContext(img_convert_ctx, w, h, pCodecCtx->pix_fmt, w, h, ffmpeg::PIX_FMT_RGB24, SWS_BICUBIC, NULL, NULL, NULL);
            if (img_convert_ctx == NULL)
            {
                printf("Cannot initialize the conversion context!\n");
                return false;
            }
            ffmpeg::sws_scale(img_convert_ctx, pFrame->data, pFrame->linesize, 0, pCodecCtx->height, pFrameRGB->data, pFrameRGB->linesize);
            // Convert the frame to QImage
            LastFrame = QImage(w, h, QImage::Format_RGB888);
            for (int y = 0; y < h; y++)
                memcpy(LastFrame.scanLine(y), pFrameRGB->data[0] + y * pFrameRGB->linesize[0], w * 3);
            LastFrameOk = true;
        } // frameFinished
    } // stream_index == videoStream
    av_free_packet(&packet); // Free the packet that was allocated by av_read_frame
}
I followed the ffmpeg's tutorial and used a while loop to read the packet and decode the video.
But is there a more efficient way to do this, such as an event-triggered callback when a packet is received?
I haven't seen any event-driven approach for reading frames, but what is your purpose in reading the RTSP stream? I can, however, give some recommendations for improving performance. First of all, you may add a very short sleep in your loop (e.g. Sleep(1);). Then, in your program, if your purpose is to:
Display images to the user: Don't use conversion to RGB, after decoding, the resulting frame is in YUV420P format which can be directly displayed to the user using GPU without any CPU usage. Almost all graphics cards support YUV420P (or YV12) format. Conversion to RGB is a highly CPU-consuming operation, especially for large images.
Record (save) to disk: If you want to record the stream to play it back later, there is no need to decode the frames. You may use OpenRTSP to record directly to disk without any CPU usage.
Process realtime images: You may find alternative algorithms to process on YUV420P format instead of RGB. The Y plane in YUV420P is actually a grayscale version of the colored RGB images.
Marked question as outdated, since it uses the deprecated avcodec_decode_video2.
I'm currently experiencing artifacts when decoding video using ffmpegs api. On what I would assume to be intermediate frames, artifacts build slowly only from active movement in the frame. These artifacts build for 50-100 frames until I assume a keyframe resets them. Frames are then decoded correctly and the artifacts proceed to build again.
One thing that is bothering me: I have a few video samples at 30 fps (H264) that work correctly, but all of my 60 fps videos (H264) exhibit the problem.
I don't currently have enough reputation to post an image, so hopefully this link will work.
http://i.imgur.com/PPXXkJc.jpg
int numBytes;
int frameFinished;
AVFrame* decodedRawFrame;
AVFrame* rgbFrame;
//Enum class for decoding results, used to break decode loop when a frame is gathered
DecodeResult retResult = DecodeResult::Fail;
decodedRawFrame = av_frame_alloc();
rgbFrame = av_frame_alloc();
if (!decodedRawFrame) {
    fprintf(stderr, "Could not allocate video frame\n");
    return DecodeResult::Fail;
}
numBytes = avpicture_get_size(PIX_FMT_RGBA, mCodecCtx->width,mCodecCtx->height);
uint8_t* buffer = (uint8_t *)av_malloc(numBytes*sizeof(uint8_t));
avpicture_fill((AVPicture *) rgbFrame, buffer, PIX_FMT_RGBA, mCodecCtx->width, mCodecCtx->height);
AVPacket packet;
while (av_read_frame(mFormatCtx, &packet) >= 0 && retResult != DecodeResult::Success)
{
    // Is this a packet from the video stream?
    if (packet.stream_index == mVideoStreamIndex)
    {
        // Decode video frame
        int decodeValue = avcodec_decode_video2(mCodecCtx, decodedRawFrame, &frameFinished, &packet);
        // Did we get a video frame?
        if (frameFinished)// && rgbFrame->pict_type != AV_PICTURE_TYPE_NONE )
        {
            // Convert the image from its native format to RGB
            int SwsFlags = SWS_BILINEAR;
            // Accurate round clears up a problem where the start
            // of videos have green bars on them
            SwsFlags |= SWS_ACCURATE_RND;
            struct SwsContext *ctx = sws_getCachedContext(NULL, mCodecCtx->width, mCodecCtx->height, mCodecCtx->pix_fmt, mCodecCtx->width, mCodecCtx->height,
                                                          PIX_FMT_RGBA, SwsFlags, NULL, NULL, NULL);
            sws_scale(ctx, decodedRawFrame->data, decodedRawFrame->linesize, 0, mCodecCtx->height, rgbFrame->data, rgbFrame->linesize);
            //if (count % 5 == 0 && count < 105)
            //    DebugSavePPMImage(rgbFrame, mCodecCtx->width, mCodecCtx->height, count);
            ++count;
            // Viewable frame is a struct to hold buffer and frame together in a queue
            ViewableFrame frame;
            frame.buffer = buffer;
            frame.frame = rgbFrame;
            mFrameQueue.push(frame);
            retResult = DecodeResult::Success;
            sws_freeContext(ctx);
        }
    }
    // Free the packet that was allocated by av_read_frame
    av_free_packet(&packet);
}
// Check for end of file leftover frames
if (retResult != DecodeResult::Success)
{
    int result = av_read_frame(mFormatCtx, &packet);
    if (result < 0)
        isEoF = true;
    av_free_packet(&packet);
}
// Free the YUV frame
av_frame_free(&decodedRawFrame);
I'm attempting to build a queue of the decoded frames that I then use and free as needed. Is my separation of the frames causing the intermediate frames to be decoded incorrectly? I also break the decoding loop once I've successfully gathered a frame (DecodeResult::Success); most examples I've seen tend to loop through the whole video.
All codec context, video stream information, and format contexts are set up exactly as shown in the main function of https://github.com/chelyaev/ffmpeg-tutorial/blob/master/tutorial01.c
Any suggestions would be greatly appreciated.
For reference, if someone finds themselves in a similar position: apparently with some of the older versions of FFMPEG, there's an issue when using sws_scale to convert an image without changing the actual dimensions of the final frame. If instead you create the flags for the SwsContext using:
int SwsFlags = SWS_BILINEAR; //Whatever you want
SwsFlags |= SWS_ACCURATE_RND; // Under the hood forces ffmpeg to use the same logic as if scaled
SWS_ACCURATE_RND has a performance penalty but for regular video it's probably not that noticeable. This will remove the splash of green, or green bars along the edges of textures if present.
I wanted to thank Multimedia Mike and George Y; they were also right that the way I was decoding the frames wasn't preserving the packets correctly, and that was what caused the video artifacts building from previous frames.
I'm trying to encode video from a set of JPEG images to H264, using ffmpeg + x264. I initialize the AVCodecContext like this:
_outputCodec = avcodec_find_encoder(AV_CODEC_ID_H264);
_outputCodecContext = avcodec_alloc_context3(_outputCodec);
avcodec_get_context_defaults3(_outputCodecContext, _outputCodec);
_outputCodecContext->width = _currentWidth;
_outputCodecContext->height = _currentHeight;
_outputCodecContext->pix_fmt = AV_PIX_FMT_YUV420P;
_outputCodecContext->time_base.num = 1;
_outputCodecContext->time_base.den = 25;
_outputCodecContext->profile =FF_PROFILE_H264_BASELINE;
_outputCodecContext->level = 50;
avcodec_open returns no errors and everything seems OK, but when I call avcodec_encode_video2() I get messages like these (I think they're from x264):
using mv_range_thread = %d
%s
profile %s, level %s
And then the app crashes. Maybe there are more necessary settings for the codec context when using x264?
Without a full version of your code it is hard to see what the actual problem is.
Firstly here is a working example of the FFMPEG library encoding RGB frames to a H264 video:
http://www.imc-store.com.au/Articles.asp?ID=276
You could expand on this example by using CImage to load in your JPGs and pass the RGB data to the FFMPEG Class to encode to a H264 video.
A few thoughts on your example though:
Have you called the register functions, like below?
avcodec_register_all();
av_register_all();
Also I'd re-write your code to be something like below:
AVStream *st;
m_video_codec = avcodec_find_encoder(AV_CODEC_ID_H264);
st = avformat_new_stream(m_format_ctx, m_video_codec); // m_format_ctx: your AVFormatContext (avformat_new_stream takes the format context, not a codec)
_outputCodecContext = st->codec;
_outputCodecContext->codec_id = m_fmt->video_codec;
_outputCodecContext->bit_rate = m_AVIMOV_BPS; //Bits Per Second
_outputCodecContext->width = m_AVIMOV_WIDTH; //Note Resolution must be a multiple of 2!!
_outputCodecContext->height = m_AVIMOV_HEIGHT; //Note Resolution must be a multiple of 2!!
_outputCodecContext->time_base.den = m_AVIMOV_FPS; //Frames per second
_outputCodecContext->time_base.num = 1;
_outputCodecContext->gop_size = m_AVIMOV_GOB; // Intra frames per x P frames
_outputCodecContext->pix_fmt = AV_PIX_FMT_YUV420P;//Do not change this, H264 needs YUV format not RGB
And then you need to convert the RGB JPG picture to the YUV format using swscale as pogorskiy said.
Have a look at the linked example, I tested it on VC++2010 and it works perfectly and you can send it an RGB char array.
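The RGB-to-YUV step mentioned above might be sketched as follows. Names here are placeholders: `rgb_data` is assumed to hold one decoded width x height RGB24 image, and `frame` an already-allocated YUV420P AVFrame.

```c
/* Convert one packed RGB24 image to the encoder's YUV420P frame. */
struct SwsContext *sws = sws_getContext(width, height, AV_PIX_FMT_RGB24,
                                        width, height, AV_PIX_FMT_YUV420P,
                                        SWS_BILINEAR, NULL, NULL, NULL);
const uint8_t *src[1] = { rgb_data };   /* single packed input plane */
int src_linesize[1] = { 3 * width };    /* bytes per RGB24 row */
sws_scale(sws, src, src_linesize, 0, height,
          frame->data, frame->linesize);
sws_freeContext(sws);
```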