I'm trying to encode video from a set of JPEG images to H264, using ffmpeg + x264. I initialize the AVCodecContext this way:
_outputCodec = avcodec_find_encoder(AV_CODEC_ID_H264);
_outputCodecContext = avcodec_alloc_context3(_outputCodec);
avcodec_get_context_defaults3(_outputCodecContext, _outputCodec);
_outputCodecContext->width = _currentWidth;
_outputCodecContext->height = _currentHeight;
_outputCodecContext->pix_fmt = AV_PIX_FMT_YUV420P;
_outputCodecContext->time_base.num = 1;
_outputCodecContext->time_base.den = 25;
_outputCodecContext->profile = FF_PROFILE_H264_BASELINE;
_outputCodecContext->level = 50;
avcodec_open returns no errors and everything looks OK, but when I call avcodec_encode_video2() I get messages like these (I think they come from x264):
using mv_range_thread = %d
%s
profile %s, level %s
And then the app crashes. Maybe there are more necessary settings for the codec context when using x264?
Without a full version of your code it is hard to see what the actual problem is.
Firstly here is a working example of the FFMPEG library encoding RGB frames to a H264 video:
http://www.imc-store.com.au/Articles.asp?ID=276
You could expand on this example by using CImage to load in your JPGs and pass the RGB data to the FFMPEG Class to encode to a H264 video.
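For instance, here is a rough sketch (not taken from the linked article) of loading a JPG with ATL's CImage and copying its pixels into a contiguous RGB buffer that you could hand to the encoder; it assumes the image loads as 24-bit BGR, which is the usual case for JPEGs:
#include <atlimage.h>
#include <cstring>
#include <vector>

bool LoadJpegRgb(const wchar_t* path, std::vector<BYTE>& out, int& width, int& height)
{
    CImage img;
    if (FAILED(img.Load(path)) || img.GetBPP() != 24)
        return false;

    width  = img.GetWidth();
    height = img.GetHeight();
    out.resize(width * height * 3);

    // Copy row by row: GetPitch() can be negative (bottom-up DIB), so address
    // each scanline explicitly instead of doing one big memcpy.
    for (int y = 0; y < height; ++y)
    {
        const BYTE* row = static_cast<const BYTE*>(img.GetPixelAddress(0, y));
        std::memcpy(&out[y * width * 3], row, width * 3);
    }
    return true;
}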
A few thoughts on your example though:
Have you called the registration functions, as below?
avcodec_register_all();
av_register_all();
Also I'd re-write your code to be something like below:
AVStream *st;
m_video_codec = avcodec_find_encoder(AV_CODEC_ID_H264);
st = avformat_new_stream(m_oc, m_video_codec); // first argument must be your output AVFormatContext (here a hypothetical m_oc), not the AVCodec
_outputCodecContext = st->codec;
_outputCodecContext->codec_id = m_fmt->video_codec;
_outputCodecContext->bit_rate = m_AVIMOV_BPS; //Bits Per Second
_outputCodecContext->width = m_AVIMOV_WIDTH; //Note Resolution must be a multiple of 2!!
_outputCodecContext->height = m_AVIMOV_HEIGHT; //Note Resolution must be a multiple of 2!!
_outputCodecContext->time_base.den = m_AVIMOV_FPS; //Frames per second
_outputCodecContext->time_base.num = 1;
_outputCodecContext->gop_size = m_AVIMOV_GOB; // Intra frames per x P frames
_outputCodecContext->pix_fmt = AV_PIX_FMT_YUV420P;//Do not change this, H264 needs YUV format not RGB
And then you need to convert the JPG's RGB data to the YUV format using swscale, as pogorskiy said.
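For example, a minimal sketch of that swscale step (the helper name and the packed RGB24 input are my assumptions; use whatever source pixel format your JPEG loader actually produces):
extern "C" {
#include <libswscale/swscale.h>
#include <libavutil/frame.h>
}

// Converts one packed RGB24 image into a newly allocated YUV420P AVFrame.
AVFrame* RgbToYuv420(const uint8_t* rgbData, int rgbStride, int width, int height)
{
    AVFrame* yuv = av_frame_alloc();
    yuv->format = AV_PIX_FMT_YUV420P;
    yuv->width  = width;
    yuv->height = height;
    av_frame_get_buffer(yuv, 32);                     // allocate the Y/U/V planes

    SwsContext* sws = sws_getContext(width, height, AV_PIX_FMT_RGB24,
                                     width, height, AV_PIX_FMT_YUV420P,
                                     SWS_BILINEAR, NULL, NULL, NULL);
    const uint8_t* srcSlice[1] = { rgbData };
    int srcStride[1] = { rgbStride };                 // usually width * 3 for packed RGB24
    sws_scale(sws, srcSlice, srcStride, 0, height, yuv->data, yuv->linesize);
    sws_freeContext(sws);
    return yuv;                                       // feed this frame to avcodec_encode_video2()
}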
Have a look at the linked example, I tested it on VC++2010 and it works perfectly and you can send it an RGB char array.
I have 16 bit PCM data with stereo setup which I grab from a microphone.
Once I get the data I encode it using the following encoder settings
AVCodec* audio_codec = avcodec_find_encoder(AV_CODEC_ID_MP2);
AVCodecContext* audio_codec_ctx = avcodec_alloc_context3(audio_codec);
audio_codec_ctx->bit_rate = 64000;
audio_codec_ctx->channels = 2;
audio_codec_ctx->channel_layout = AV_CH_LAYOUT_STEREO;
audio_codec_ctx->sample_rate = 44100;
audio_codec_ctx->sample_fmt = AV_SAMPLE_FMT_S16;
When I pass the audio data into the encoder, I see that it takes 4608 bytes of data at a time and encodes them correctly to MP2. The PCM data grabbed from the microphone is 88320 bytes, and the encoder consumes it 4608 bytes at a time.
If I take each 4608-byte section that was encoded and pass it through a decoder with the same settings as above (but a decoder this time):
AVCodecID audio_codec_id = AV_CODEC_ID_MP2;
AVCodec * audio_decodec = avcodec_find_decoder(audio_codec_id);
audio_decodecContext = avcodec_alloc_context3(audio_decodec);
audio_decodecContext->bit_rate = 64000;
audio_decodecContext->channels = 2;
audio_decodecContext->channel_layout = AV_CH_LAYOUT_STEREO;
audio_decodecContext->sample_rate = 44100;
audio_decodecContext->sample_fmt = AV_SAMPLE_FMT_S16;
The decoding works and is successful, but when I look at the data size it is exactly half of what was encoded (2304 bytes instead of 4608). I don't understand why that would be. I would have imagined I would get 4608 back, considering the encoder and decoder use the same settings.
Can anyone shed some light on why this is happening? Is there anything I should be setting?
The requested decoder sample format should be set using audio_decodecContext->request_sample_fmt. sample_fmt is set by the decoder itself, and may be different, in which case you should use libswresample to convert between sample formats.
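For example, a minimal sketch of that approach using the question's variables (the swr_* part only matters if the decoder still reports a sample_fmt other than S16 after opening):
#include <libswresample/swresample.h>

audio_decodecContext->request_sample_fmt = AV_SAMPLE_FMT_S16;
avcodec_open2(audio_decodecContext, audio_decodec, NULL);

if (audio_decodecContext->sample_fmt != AV_SAMPLE_FMT_S16) {
    // The decoder insists on another format, so convert each decoded frame.
    SwrContext *swr = swr_alloc_set_opts(NULL,
        AV_CH_LAYOUT_STEREO, AV_SAMPLE_FMT_S16, audio_decodecContext->sample_rate,  // output
        audio_decodecContext->channel_layout,
        audio_decodecContext->sample_fmt,
        audio_decodecContext->sample_rate,                                          // input
        0, NULL);
    swr_init(swr);
    // After decoding a frame:
    // swr_convert(swr, out_planes, frame->nb_samples,
    //             (const uint8_t **)frame->data, frame->nb_samples);
}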
I know the formula to convert YUY2 to RGB, as described here:
Convert yuy2 to bitmap
My problem is that I don't know how to apply it in a DirectShow filter.
In DirectShow I have a buffer and a header, but how do I convert these to RGB?
The formula is:
int C = luma - 16;
int D = cb - 128;   // Cb (U)
int E = cr - 128;   // Cr (V)
r = (298*C + 409*E + 128)/256;
g = (298*C - 100*D - 208*E + 128)/256;
b = (298*C + 516*D + 128)/256;
How do i get these values and how do I write them into the output buffer?
This is how I copy the buffer at the moment:
long lSizeSample = sample->GetSize();
long lSizeOutSample = outsample->GetSize();
outsample->GetPointer(&newBuffer);
sample->GetPointer(&sampleBuffer);
memcpy((void *)newBuffer, (void *)sampleBuffer, lSizeSample);
So I just copy the buffer. But how do I modify it?
Instead of memcpy you are expected to convert pixel by pixel, taking strides and planar/packed formatting into consideration. In most cases this needs to be well optimized, e.g. with SIMD, for decent performance.
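For example, a rough, unoptimized per-pixel sketch of that conversion (function and variable names are mine; a real filter would take the strides from the negotiated media types):
static inline BYTE Clamp(int v) { return (BYTE)(v < 0 ? 0 : (v > 255 ? 255 : v)); }

// Converts packed YUY2 (Y0 U Y1 V per pixel pair) to RGB24, writing rows
// bottom-up as a standard VIDEOINFOHEADER RGB buffer expects.
void Yuy2ToRgb24(const BYTE* src, int srcStride, BYTE* dst, int dstStride, int width, int height)
{
    for (int y = 0; y < height; ++y)
    {
        const BYTE* s = src + y * srcStride;
        BYTE* d = dst + (height - 1 - y) * dstStride;
        for (int x = 0; x < width; x += 2)
        {
            int c0 = s[0] - 16;    // Y of first pixel
            int cb = s[1] - 128;   // U, shared by the pair
            int c1 = s[2] - 16;    // Y of second pixel
            int cr = s[3] - 128;   // V, shared by the pair
            d[0] = Clamp((298 * c0 + 516 * cb + 128) / 256);              // B
            d[1] = Clamp((298 * c0 - 100 * cb - 208 * cr + 128) / 256);   // G
            d[2] = Clamp((298 * c0 + 409 * cr + 128) / 256);              // R
            d[3] = Clamp((298 * c1 + 516 * cb + 128) / 256);
            d[4] = Clamp((298 * c1 - 100 * cb - 208 * cr + 128) / 256);
            d[5] = Clamp((298 * c1 + 409 * cr + 128) / 256);
            s += 4;
            d += 6;
        }
    }
}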
You can do the math yourself, of course, but you can also have the conversion done for you by Color Converter DSP, if Vista+ is OK for you.
The DSP is available as DMO, or you can use DMO Wrapper Filter and use it as a readily available DirectShow filter.
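For example, a minimal sketch of loading the Color Converter DSP into the DMO Wrapper Filter and adding it to a graph (the helper name is mine; error handling trimmed):
#include <dshow.h>
#include <dmodshow.h>     // CLSID_DMOWrapperFilter, IDMOWrapperFilter
#include <dmoreg.h>       // DMOCATEGORY_VIDEO_EFFECT
#include <wmcodecdsp.h>   // CLSID_CColorConvertDMO

HRESULT AddColorConverter(IGraphBuilder* pGraph, IBaseFilter** ppFilter)
{
    IBaseFilter* pFilter = NULL;
    HRESULT hr = CoCreateInstance(CLSID_DMOWrapperFilter, NULL, CLSCTX_INPROC_SERVER,
                                  IID_PPV_ARGS(&pFilter));
    if (FAILED(hr)) return hr;

    IDMOWrapperFilter* pWrapper = NULL;
    hr = pFilter->QueryInterface(IID_PPV_ARGS(&pWrapper));
    if (SUCCEEDED(hr))
    {
        // Point the wrapper at the Color Converter DSP.
        hr = pWrapper->Init(CLSID_CColorConvertDMO, DMOCATEGORY_VIDEO_EFFECT);
        pWrapper->Release();
    }
    if (SUCCEEDED(hr))
        hr = pGraph->AddFilter(pFilter, L"Color Converter DSP");

    if (SUCCEEDED(hr) && ppFilter)
        *ppFilter = pFilter;   // caller takes the reference
    else
        pFilter->Release();
    return hr;
}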
I'm using DirectShow to access a video stream, and then using the SampleGrabber filter and interface to get samples from each frame for further image processing. I'm using a callback, so it gets called after each new frame. I've basically just worked from the PlayCap sample application and added a Sample Grabber filter to the graph.
The problem I'm having is that I'm trying to display the grabbed samples on a different OpenCV window. However, when I try to cast the information in the buffer to an IplImage, I get a garbled mess of pixels. The code for the BufferCB call is below, sans any proper error handling:
STDMETHODIMP BufferCB(double Time, BYTE *pBuffer, long BufferLen)
{
    AM_MEDIA_TYPE type;
    g_pGrabber->GetConnectedMediaType(&type);
    VIDEOINFOHEADER *pVih = (VIDEOINFOHEADER *)type.pbFormat;
    BITMAPINFO *bmi = (BITMAPINFO *)&pVih->bmiHeader;
    BITMAPINFOHEADER *bmih = &(bmi->bmiHeader);
    int channels = bmih->biBitCount / 8;
    bmih->biPlanes = 1;
    bmih->biBitCount = 24;
    bmih->biCompression = BI_RGB;
    IplImage *Image = cvCreateImage(cvSize(bmih->biWidth, bmih->biHeight), IPL_DEPTH_8U, channels);
    Image->imageSize = BufferLen;
    CopyMemory(Image->imageData, pBuffer, BufferLen);
    cvFlip(Image);
    // OpenCV Mat creation
    Mat cvMat = Mat(Image, true);
    imshow("Display window", cvMat); // Show our image inside it.
    waitKey(2);
    return S_OK;
}
My question is, am I doing something wrong here that will make the image displayed look like this:
Am I missing header information or something?
The quoted code is only part of the solution. You create an image object with a certain width/height, 8-bit pixel data, and an unknown channel/component count, and then copy data into it from another buffer whose format is also unknown.
The only chance for this to work is if all the unknowns happen to match without any effort on your part. So you basically need to start by checking exactly what media type is on the Sample Grabber's input pin. Then, if it is not what you wanted, you have to update your code accordingly. It might also matter what the downstream connection of the Sample Grabber is, and in particular whether it is connected to a video renderer.
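For example, a minimal sketch of forcing RGB24 on the Sample Grabber before the graph is connected, and verifying it afterwards (variable names follow the question's code; error handling omitted):
// Before connecting the graph: ask the Sample Grabber for 24-bit RGB.
AM_MEDIA_TYPE mt;
ZeroMemory(&mt, sizeof(mt));
mt.majortype  = MEDIATYPE_Video;
mt.subtype    = MEDIASUBTYPE_RGB24;
mt.formattype = FORMAT_VideoInfo;
g_pGrabber->SetMediaType(&mt);

// After the graph is connected: verify what was actually negotiated.
AM_MEDIA_TYPE connected;
g_pGrabber->GetConnectedMediaType(&connected);
if (connected.subtype != MEDIASUBTYPE_RGB24)
{
    // The buffer is not the packed BGR data the IplImage code assumes
    // (it could be YUY2, RGB32, ...), so convert or renegotiate instead.
}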
Note: this question is marked as outdated, as it uses the deprecated avcodec_decode_video2.
I'm currently experiencing artifacts when decoding video using ffmpegs api. On what I would assume to be intermediate frames, artifacts build slowly only from active movement in the frame. These artifacts build for 50-100 frames until I assume a keyframe resets them. Frames are then decoded correctly and the artifacts proceed to build again.
One thing that is bothering me is that I have a few video samples at 30 fps (H264) that work correctly, but all of my 60 fps videos (H264) exhibit the problem.
I don't currently have enough reputation to post an image, so hopefully this link will work.
http://i.imgur.com/PPXXkJc.jpg
int numBytes;
int frameFinished;
AVFrame* decodedRawFrame;
AVFrame* rgbFrame;
//Enum class for decoding results, used to break decode loop when a frame is gathered
DecodeResult retResult = DecodeResult::Fail;
decodedRawFrame = av_frame_alloc();
rgbFrame = av_frame_alloc();
if (!decodedRawFrame) {
    fprintf(stderr, "Could not allocate video frame\n");
    return DecodeResult::Fail;
}
numBytes = avpicture_get_size(PIX_FMT_RGBA, mCodecCtx->width,mCodecCtx->height);
uint8_t* buffer = (uint8_t *)av_malloc(numBytes*sizeof(uint8_t));
avpicture_fill((AVPicture *) rgbFrame, buffer, PIX_FMT_RGBA, mCodecCtx->width, mCodecCtx->height);
AVPacket packet;
while (av_read_frame(mFormatCtx, &packet) >= 0 && retResult != DecodeResult::Success)
{
    // Is this a packet from the video stream?
    if (packet.stream_index == mVideoStreamIndex)
    {
        // Decode video frame
        int decodeValue = avcodec_decode_video2(mCodecCtx, decodedRawFrame, &frameFinished, &packet);
        // Did we get a video frame?
        if (frameFinished) // && rgbFrame->pict_type != AV_PICTURE_TYPE_NONE
        {
            // Convert the image from its native format to RGB
            int SwsFlags = SWS_BILINEAR;
            // Accurate round clears up a problem where the start
            // of videos have green bars on them
            SwsFlags |= SWS_ACCURATE_RND;
            struct SwsContext *ctx = sws_getCachedContext(NULL, mCodecCtx->width, mCodecCtx->height, mCodecCtx->pix_fmt,
                                                          mCodecCtx->width, mCodecCtx->height, PIX_FMT_RGBA,
                                                          SwsFlags, NULL, NULL, NULL);
            sws_scale(ctx, decodedRawFrame->data, decodedRawFrame->linesize, 0, mCodecCtx->height, rgbFrame->data, rgbFrame->linesize);

            //if(count%5 == 0 && count < 105)
            //    DebugSavePPMImage(rgbFrame, mCodecCtx->width, mCodecCtx->height, count);
            ++count;

            // ViewableFrame is a struct to hold buffer and frame together in a queue
            ViewableFrame frame;
            frame.buffer = buffer;
            frame.frame = rgbFrame;
            mFrameQueue.push(frame);
            retResult = DecodeResult::Success;

            sws_freeContext(ctx);
        }
    }

    // Free the packet that was allocated by av_read_frame
    av_free_packet(&packet);
}

// Check for end of file leftover frames
if (retResult != DecodeResult::Success)
{
    int result = av_read_frame(mFormatCtx, &packet);
    if (result < 0)
        isEoF = true;
    av_free_packet(&packet);
}
// Free the YUV frame
av_frame_free(&decodedRawFrame);
I'm attempting to build a queue of the decoded frames that I then use and free as needed. Is my separation of the frames causing the intermediate frames to be decoded incorrectly? I also break the decoding loop once I've successfully gathered a frame (DecodeResult::Success); most examples I've seen tend to loop through the whole video.
All codec contexts, video stream information, and format contexts are set up exactly as shown in the main function of https://github.com/chelyaev/ffmpeg-tutorial/blob/master/tutorial01.c
Any suggestions would be greatly appreciated.
For reference, if someone finds themselves in a similar position: apparently with some of the older versions of FFMPEG there's an issue when using sws_scale to convert an image without changing the actual dimensions of the final frame. If instead you create the flags for the SwsContext using:
int SwsFlags = SWS_BILINEAR; //Whatever you want
SwsFlags |= SWS_ACCURATE_RND; // Under the hood forces ffmpeg to use the same logic as if scaled
SWS_ACCURATE_RND has a performance penalty but for regular video it's probably not that noticeable. This will remove the splash of green, or green bars along the edges of textures if present.
I wanted to thank Multimedia Mike and George Y; they were also right that the way I was decoding the frames wasn't preserving the packets correctly, and that was what caused the video artifacts building from previous frames.
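For anyone hitting the same thing, a rough sketch of what that fix looks like (not my exact final code): allocate a fresh destination frame and buffer for every entry pushed into the queue instead of reusing the single rgbFrame/buffer created before the loop.
// Inside the frameFinished block, once per decoded frame:
AVFrame* rgbFrame = av_frame_alloc();
int numBytes = avpicture_get_size(PIX_FMT_RGBA, mCodecCtx->width, mCodecCtx->height);
uint8_t* buffer = (uint8_t*)av_malloc(numBytes);
avpicture_fill((AVPicture*)rgbFrame, buffer, PIX_FMT_RGBA, mCodecCtx->width, mCodecCtx->height);

sws_scale(ctx, decodedRawFrame->data, decodedRawFrame->linesize, 0,
          mCodecCtx->height, rgbFrame->data, rgbFrame->linesize);

ViewableFrame frame;
frame.buffer = buffer;   // av_free() this when the frame is consumed
frame.frame  = rgbFrame; // av_frame_free() this when the frame is consumed
mFrameQueue.push(frame);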
I'm trying to encode images into an H264 MP4 video. The issue I'm having is that some of the images are skipped, or are simply missing at the end of the video. I need the video to play every single image I encode, since it is an animation.
Any help setting the encoder properly would be greatly appreciated!
Encoder settings:
AVCodecContext *c;
...
c->codec_id = AV_CODEC_ID_H264;
c->bit_rate = mOutputWidth*mOutputHeight*4;//400000;
/* Resolution must be a multiple of two. */
c->width = mOutputWidth;
c->height = mOutputHeight;
/* timebase: This is the fundamental unit of time (in seconds) in terms
* of which frame timestamps are represented. For fixed-fps content,
* timebase should be 1/framerate and timestamp increments should be
* identical to 1. */
c->time_base.den = mFps;
c->time_base.num = 1;
c->gop_size = 12; /* emit one intra frame every twelve frames at most */
c->pix_fmt = AV_PIX_FMT_YUV420P;
...
av_dict_set(&pOptions, "preset", "medium", 0);
av_dict_set(&pOptions, "tune", "animation", 0);
/* open the codec */
ret = avcodec_open2(c, codec, &pOptions);
if (ret < 0) {
    LOGE("Could not open video codec: %s", av_err2str(ret));
    return -1;
}
Update 07/24/13:
I was able to achieve a better video by setting gop_size = FPS; writing the last video frame repeatedly FPS+1 times also seemed to resolve all issues. It seems odd to me to do that, but maybe it's standard in the video encoding world? Any tips or feedback about this?
From what I understand, you have a set of images and you want to make a video out of them. If this is the case and you don't care about the size of the video, you can try to disable inter prediction. Maybe the encoder finds that some of the images are not required and skips them.
Inter frame prediction can be disabled by setting gop_size to 0.
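For example, a minimal sketch using the same AVCodecContext variable as in the question's settings (the max_b_frames line is my addition, not part of the original suggestion):
c->gop_size     = 0;  /* no GOP structure: every frame is coded as an intra frame */
c->max_b_frames = 0;  /* also avoid B-frames, so no frame depends on a later one */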