I am creating a video conference application. I have discovered that webcams (at least the 3 I have) provide higher resolutions and frame rates in the mJPEG output format. So far I have been using YUY2, converted to I420 for compression with x264. To transcode mJPEG to I420, I need to decode it first. I am trying to decode images from the webcam with libavcodec. This is my code.
Initialization:
// mJPEG to I420 conversion
AVCodecContext * _transcoder = nullptr;
AVFrame * _outputFrame;
AVPacket _inputPacket;
avcodec_register_all();
_outputFrame = av_frame_alloc();
av_frame_unref(_outputFrame);
av_init_packet(&_inputPacket);
AVCodec * codecDecode = avcodec_find_decoder(AV_CODEC_ID_MJPEG);
_transcoder = avcodec_alloc_context3(codecDecode);
avcodec_get_context_defaults3(_transcoder, codecDecode);
_transcoder->flags2 |= CODEC_FLAG2_FAST;
_transcoder->pix_fmt = AVPixelFormat::AV_PIX_FMT_YUV420P;
_transcoder->width = width;
_transcoder->height = height;
avcodec_open2(_transcoder, codecDecode, nullptr);
Decoding:
_inputPacket.size = size;
_inputPacket.data = data;
int got_picture;
int decompressed_size = avcodec_decode_video2(_transcoder, _outputFrame, &got_picture, &_inputPacket);
But so far, all I get is a green screen. What am I doing wrong?
UPD:
I have enabled libavcodec logging, but there are no warnings or errors.
Also, I have discovered that _outputFrame has AV_PIX_FMT_YUVJ422P as its format, and a colorspace value that does not match any of the values in libavcodec's enums (the actual value is 156448160).
After suggestions from the comments, I came up with a working solution.
Initialization:
av_init_packet(&_inputPacket);
AVCodec * codecDecode = avcodec_find_decoder(AV_CODEC_ID_MJPEG);
_transcoder = avcodec_alloc_context3(codecDecode);
avcodec_get_context_defaults3(_transcoder, codecDecode);
avcodec_open2(_transcoder, codecDecode, nullptr);
// swscale context init
_mJPEGconvertCtx = sws_getContext(width, height, AV_PIX_FMT_YUVJ422P,
width, height, AV_PIX_FMT_YUV420P, SWS_FAST_BILINEAR, nullptr, nullptr, nullptr);
// x264 pic init
x264_picture_t _pic_in;
x264_picture_alloc(&_pic_in, X264_CSP_I420, width, height);
_pic_in.img.i_csp = X264_CSP_I420 ;
_pic_in.img.i_plane = 3;
_pic_in.img.i_stride[0] = width;
_pic_in.img.i_stride[1] = width / 2;
_pic_in.img.i_stride[2] = width / 2;
_pic_in.i_type = X264_TYPE_AUTO ;
_pic_in.i_qpplus1 = 0;
Transcoding:
_inputPacket.size = size;
_inputPacket.data = data;
int got_picture;
// decode
avcodec_decode_video2(_transcoder, _outputFrame, &got_picture, &_inputPacket);
// transform
sws_scale(_mJPEGconvertCtx, _outputFrame->data, _outputFrame->linesize, 0, height,
_pic_in.img.plane, _pic_in.img.i_stride);
Afterwards, _pic_in is used directly by x264. The image is fine, but the transcoding times are horrible at higher resolutions.
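For reference, here is a rough sketch of what the YUVJ422P to YUV420P step amounts to structurally: the luma plane is copied as-is and each chroma plane is halved vertically. The helpers halveChromaRows and copyPlane are hypothetical (they are not libavcodec or libswscale functions). The sketch assumes an even height and ignores the full-range (YUVJ) to limited-range conversion that a real converter may also apply. I have not benchmarked it against sws_scale.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of libavcodec or libswscale.
// Halves one chroma plane vertically by averaging each pair of rows.
// In planar 4:2:2 and 4:2:0 both chroma planes are width/2 wide;
// only the plane height differs (height vs. height/2).
void halveChromaRows(const std::uint8_t* src, int srcStride,
                     std::uint8_t* dst, int dstStride,
                     int chromaWidth, int lumaHeight)
{
    assert(chromaWidth > 0 && lumaHeight > 0 && lumaHeight % 2 == 0);
    for (int y = 0; y < lumaHeight / 2; ++y) {
        const std::uint8_t* top = src + (2 * y) * srcStride;
        const std::uint8_t* bottom = top + srcStride;
        std::uint8_t* out = dst + y * dstStride;
        for (int x = 0; x < chromaWidth; ++x) {
            // Rounded average of the two vertically adjacent samples.
            out[x] = static_cast<std::uint8_t>((top[x] + bottom[x] + 1) / 2);
        }
    }
}

// Copies a plane row by row, honoring both strides (used for luma).
void copyPlane(const std::uint8_t* src, int srcStride,
               std::uint8_t* dst, int dstStride, int width, int height)
{
    assert(width > 0 && height > 0);
    for (int y = 0; y < height; ++y) {
        std::memcpy(dst + y * dstStride, src + y * srcStride,
                    static_cast<std::size_t>(width));
    }
}
```

With the decoded frame, the source pointers and strides would come from _outputFrame->data and _outputFrame->linesize, and the destinations from _pic_in.img.plane and _pic_in.img.i_stride.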
I am trying to convert an RGB image into YUV.
I am loading the image using OpenCV.
I am calling the function as follows:
//I know IplImage is outdated
IplImage* im = cvLoadImage("1.jpg", 1);
//....
bgr2yuv(im->imageData, dst, im->width, im->height);
The function that converts the color image to a YUV image is given below.
I am using ffmpeg to do that.
void bgr2yuv(unsigned char *src, unsigned char *dest, int w, int h)
{
AVFrame *yuvIm = avcodec_alloc_frame();
AVFrame *rgbIm = avcodec_alloc_frame();
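avpicture_fill((AVPicture*)rgbIm, src, PIX_FMT_BGR24, w, h);
avpicture_fill((AVPicture*)yuvIm, dest, PIX_FMT_YUV420P, w, h);
av_register_all();
// sws_getContext() instead of sws_getCachedContext(): imgCtx was passed in uninitialized
struct SwsContext * imgCtx = sws_getContext(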
w, h,(::PixelFormat)PIX_FMT_BGR24,
w, h,(::PixelFormat)PIX_FMT_YUV420P,
SWS_BICUBIC, NULL, NULL, NULL);
sws_scale(imgCtx, rgbIm->data, rgbIm->linesize,0, h, yuvIm->data, yuvIm->linesize);
av_free(yuvIm);
av_free(rgbIm);
}
I am getting wrong output after the conversion.
I think this is due to the padding in the IplImage
(my input image width is not a multiple of 4).
Even after I updated the linesize variable, I am still not getting correct output.
It works fine when I use images whose width is a multiple of 4.
Can anybody tell me what the problem in the code is?
Check IplImage::align or IplImage::widthStep and use these to set AVFrame::linesize. For the RGB frame, for example, you would set:
frame->linesize[0] = img->widthStep;
The layout of the dst array can be whatever you want, it depends on how you're using it afterwards.
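To see why only widths that are a multiple of 4 work, here is a small sketch of how a padded row size is computed. paddedRowBytes is a hypothetical helper, and the 4-byte alignment is an assumption based on the question; in real code just read IplImage::widthStep rather than recomputing it.

```cpp
#include <cassert>

// Hypothetical helper: bytes per row when every row is padded up to a
// multiple of `align` bytes. The alignment value is an assumption; the
// authoritative number for an IplImage is its widthStep field.
int paddedRowBytes(int width, int bytesPerPixel, int align)
{
    assert(width > 0 && bytesPerPixel > 0 && align > 0);
    int packed = width * bytesPerPixel;          // row size with no padding
    return (packed + align - 1) / align * align; // round up to the alignment
}
```

For a 3-byte-per-pixel image the packed and padded sizes only agree when the width is a multiple of 4, which is why those images convert correctly even when linesize is left at the packed value.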
We need to do the following:
rgbIm->linesize[0] = im->widthStep;
But I think the output data from sws_scale() is not padded to a multiple of 4.
So when you copy this data (dest) back into an IplImage, this will
create problems when displaying, saving, etc.
So we need to set widthStep = width as follows:
IplImage* yuvImage = cvCreateImageHeader(cvGetSize(im), 8, 1);
yuvImage->widthStep = yuvImage->width;
yuvImage->imageData = (char*)dest;
I am trying to encode a YUV file and save it as a jpg file, but I don't understand the following:
1. Why is the packet size siz*3?
av_new_packet(&pkt,siz*3);
2. Why does fread use siz*3/2?
if(fread(buffer , 1, siz*3/2, ptrInputFile)<=0)
3. How is the data being filled in here?
frame->data[0] = buffer;
frame->data[1] = buffer + siz;
frame->data[2] = buffer + siz*5/4;
code:
AVFormatContext *avFrameContext;
AVOutputFormat *avOutputFormat;
AVStream *avStream;
AVCodecContext *avCodecContext;
AVCodec *avCodec;
AVFrame *frame;
AVPacket pkt;
const char *output = "temp.jpg";
FILE *ptrInputFile;
const char *input = "cuc_view_480x272.yuv";
ptrInputFile = fopen(input ,"rb");
if(!ptrInputFile)
return -1;
avFrameContext = avformat_alloc_context();
avOutputFormat = av_guess_format("mjpeg", NULL, NULL);
if(!avOutputFormat)
return -1;
avFrameContext->oformat = avOutputFormat;
if(avio_open(&avFrameContext->pb ,output ,AVIO_FLAG_READ_WRITE)<0)
return -1;
avStream = avformat_new_stream(avFrameContext,NULL);
if(!avStream)
return -1;
avCodecContext = avStream->codec;
avCodecContext->codec_id = avOutputFormat->video_codec;
avCodecContext->codec_type = AVMEDIA_TYPE_VIDEO;
avCodecContext->pix_fmt = PIX_FMT_YUVJ420P;
avCodecContext->width = 480;
avCodecContext->height = 272;
avCodecContext->time_base.num = 1;
avCodecContext->time_base.den = 25;
avCodec = avcodec_find_encoder(avCodecContext->codec_id);
if(!avCodec)
return -1;
if(avcodec_open2(avCodecContext ,avCodec,NULL)<0)
return -1;
frame = av_frame_alloc();
int size = avpicture_get_size(PIX_FMT_YUVJ420P ,avCodecContext->width, avCodecContext->height);
uint8_t *buffer = (uint8_t*)av_malloc(size*sizeof(uint8_t));
avpicture_fill((AVPicture*)frame, buffer, avCodecContext->pix_fmt ,avCodecContext->width, avCodecContext->height);
//write header
avformat_write_header(avFrameContext, NULL);
int siz = avCodecContext->width*avCodecContext->height;
av_new_packet(&pkt,siz*3);
if(fread(buffer , 1, siz*3/2, ptrInputFile)<=0)
return -1;
frame->data[0] = buffer;
frame->data[1] = buffer + siz;
frame->data[2] = buffer + siz*5/4;
If you look at the yuv420p format (wiki), the data is laid out in the file as follows.
For an image of siz pixels:
siz bytes of Y values
siz/4 bytes of U values
siz/4 bytes of V values
So for question 2: there are siz*3/2 bytes of data to read.
For question 3: Y starts at buffer+0, U starts at buffer+siz, and V starts at buffer+siz*5/4.
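The arithmetic above can be written out as a small sketch. Yuv420pLayout and yuv420pLayout are hypothetical names used for illustration, and the sketch assumes even width and height with tightly packed planes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct: byte offsets of the three planes inside one
// tightly packed YUV420P frame, plus the total frame size.
struct Yuv420pLayout {
    std::size_t yOffset;
    std::size_t uOffset;
    std::size_t vOffset;
    std::size_t totalBytes;
};

Yuv420pLayout yuv420pLayout(int width, int height)
{
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    std::size_t siz = static_cast<std::size_t>(width) * height; // Y samples
    Yuv420pLayout layout;
    layout.yOffset = 0;
    layout.uOffset = siz;            // U follows the full Y plane
    layout.vOffset = siz + siz / 4;  // V follows the quarter-size U plane
    layout.totalBytes = siz * 3 / 2; // Y + U + V
    return layout;
}
```

For the 480x272 file in the question this gives a U offset of 130560, a V offset of 163200, and 195840 bytes per frame.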
As for question 1: I am not sure whether the data is converted to RGB. If it is, then it would require 3 bytes for each pixel. Additional code is required to see that.
I don't know much about the code you provided above. But if you are trying to encode YUV video and save it as JPEG, you can directly use the following ffmpeg command:
ffmpeg -f rawvideo -vcodec rawvideo -s <resolution> -r 25 -pix_fmt yuv420p -i video.yuv -preset ultrafast -qp 0 %d.jpg
Replace <resolution> with the resolution of your video, e.g. 1920x1080.
I am trying to convert a decoded YUV420p frame (1018x700) to RGBA via sws_scale. I save the data to a raw video file and then play the raw video with ffplay to see the result.
Here is my code:
sws_ctx = sws_getContext(video_dec_ctx->width, video_dec_ctx->height,AV_PIX_FMT_YUV420P, video_dec_ctx->width, video_dec_ctx->height, AV_PIX_FMT_BGR32, SWS_LANCZOS | SWS_ACCURATE_RND, 0, 0, 0);
ret = avcodec_decode_video2(video_dec_ctx, yuvframe, got_frame, &pkt);
if (ret < 0) {
std::cout<<"Error in decoding"<<std::endl;
return ret;
}else{
//the source and destination heights and widths are the same
int sourceX = video_dec_ctx->width;
int sourceY = video_dec_ctx->height;
int destX = video_dec_ctx->width;
int destY = video_dec_ctx->height;
//declare destination frame
AVFrame avFrameRGB;
avFrameRGB.linesize[0] = destX * 4;
avFrameRGB.data[0] = (uint8_t*)malloc(avFrameRGB.linesize[0] * destY);
//scale the frame to avFrameRGB
sws_scale(sws_ctx, yuvframe->data, yuvframe->linesize, 0, yuvframe->height, avFrameRGB.data, avFrameRGB.linesize);
//write to file
fwrite(avFrameRGB.data[0], 1, video_dst_bufsize, video_dst_file);
}
Here is the result without scaling (i.e. in YUV420p Format)
Here is the result after scaling, played with ffplay (i.e. in RGBA format)
I run the ffplay using the following command ('video' is the raw video file)
ffplay -f rawvideo -pix_fmt bgr32 -video_size 1018x700 video
What should I fix to make the correct scaling happen to RGB32?
I found the solution. The problem was that I was not using the correct buffer size when writing to the file.
fwrite(avFrameRGB.data[0], 1, video_dst_bufsize, video_dst_file);
The variable video_dst_bufsize was taken from the return value of
video_dst_bufsize = av_image_alloc(yuvframe.data, yuvframe.linesize, destX, destY, AV_PIX_FMT_YUV420P, 1);
The solution is to take the return value of av_image_alloc() for the RGB frame and use it in the fwrite statement:
video_dst_bufsize_RGB = av_image_alloc(avFrameRGB.data, avFrameRGB.linesize, destX, destY, AV_PIX_FMT_BGR32, 1);
fwrite(avFrameRGB.data[0], 1, video_dst_bufsize_RGB, video_dst_file);
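To make the size mismatch concrete, here is a sketch that computes both buffer sizes by hand. The helpers yuv420pBytes and packed32Bytes are hypothetical; they assume tightly packed data (alignment 1) and chroma dimensions rounded up, which is what I would expect av_image_alloc() to report for these formats.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: bytes in a tightly packed YUV420P image.
// Chroma planes are half size in each direction, rounded up.
std::size_t yuv420pBytes(int width, int height)
{
    assert(width > 0 && height > 0);
    std::size_t luma = static_cast<std::size_t>(width) * height;
    std::size_t chromaW = (static_cast<std::size_t>(width) + 1) / 2;
    std::size_t chromaH = (static_cast<std::size_t>(height) + 1) / 2;
    return luma + 2 * chromaW * chromaH;
}

// Hypothetical helper: bytes in a tightly packed 32-bit-per-pixel image
// such as BGR32.
std::size_t packed32Bytes(int width, int height)
{
    assert(width > 0 && height > 0);
    return static_cast<std::size_t>(width) * height * 4;
}
```

For 1018x700 the YUV420P size is 1068900 bytes while the BGR32 frame is 2850400 bytes, so writing with the YUV size dumps well under half of each RGB frame and ffplay reads misaligned frames.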
I want to transfer OpenGL framebuffer data to AVCodec as fast as possible.
I've already converted RGB to YUV with a shader and read it back with glReadPixels.
I still need to fill the AVFrame data manually. Is there a better way?
AVFrame *frame;
// Y
frame->data[0][y*frame->linesize[0]+x] = data[i*3];
// U
frame->data[1][y*frame->linesize[1]+x] = data[i*3+1];
// V
frame->data[2][y*frame->linesize[2]+x] = data[i*3+2];
You can use sws_scale.
In fact, you don't need shaders for converting RGB->YUV. Believe me, it's not gonna have a very different performance.
swsContext = sws_getContext(WIDTH, HEIGHT, AV_PIX_FMT_RGBA, WIDTH, HEIGHT, codecContext->pix_fmt, SWS_BICUBIC, 0, 0, 0 );
sws_scale(swsContext, (const uint8_t * const *)sourcePictureRGB.data, sourcePictureRGB.linesize, 0, codecContext->height, destinyPictureYUV.data, destinyPictureYUV.linesize);
The data in destinyPictureYUV will be ready to go to the codec.
In this sample, destinyPictureYUV is the AVFrame you want to fill up. Try to set it up like this:
AVFrame * frame = av_frame_alloc();
AVPicture destinyPictureYUV;
avpicture_alloc(&destinyPictureYUV, codecContext->pix_fmt, codecContext->width, codecContext->height);
// THIS is what you want probably
*reinterpret_cast<AVPicture *>(frame) = destinyPictureYUV;
With this setup you CAN ALSO fill up with the data you already converted to YUV in the GPU if you desire... you can choose the way you want.
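If you do keep the GPU conversion, the manual fill from the question boils down to splitting packed pixels into planes. Below is a sketch of that loop. splitPackedYuv444 is a hypothetical helper, and it assumes the readback buffer holds 3 bytes (Y, U, V) per pixel with no row padding and that the destination planes are full resolution (4:4:4), as the indexing in the question suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper mirroring the per-pixel assignments in the question.
// `packed` holds 3 bytes (Y, U, V) per pixel, rows tightly packed.
// Each destination plane has its own row stride, like AVFrame::linesize.
void splitPackedYuv444(const std::uint8_t* packed, int width, int height,
                       std::uint8_t* const planes[3], const int strides[3])
{
    assert(width > 0 && height > 0);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* in =
            packed + static_cast<std::size_t>(y) * width * 3;
        for (int x = 0; x < width; ++x) {
            for (int c = 0; c < 3; ++c) {
                // Component c of pixel (x, y) goes to plane c.
                planes[c][y * strides[c] + x] = in[x * 3 + c];
            }
        }
    }
}
```

In real code the planes and strides arguments would be frame->data and frame->linesize.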
I have color JPEG images as cv::Mat objects and I create a video from them using avcodec. The video I get is upside-down, black & white, and each row of each frame is shifted, so I get a diagonal line. What could be the reason for such output?
Follow this link to watch the video I get using avcodec.
I'm using the avpicture_fill function to create avFrame from the cv::Mat frame!
P.S.
Each cv::Mat cvFrame has width=810, height=610, step=2432
I noticed that avFrame (which is filled by avpicture_fill) has linesize[0]=2430
I tried manually setting avFrame->linesize[0]=2432 instead of 2430, but it still didn't help.
======== CODE =========================================================
AVCodec *encoder = avcodec_find_encoder(AV_CODEC_ID_H264);
AVStream *outStream = avformat_new_stream(outContainer, encoder);
avcodec_get_context_defaults3(outStream->codec, encoder);
outStream->codec->pix_fmt = AV_PIX_FMT_YUV420P;
outStream->codec->width = 810;
outStream->codec->height = 610;
//...
SwsContext *swsCtx = sws_getContext(outStream->codec->width, outStream->codec->height, PIX_FMT_RGB24,
outStream->codec->width, outStream->codec->height, outStream->codec->pix_fmt, SWS_BICUBIC, NULL, NULL, NULL);
for (uint i=0; i < frameNums; i++)
{
// get frame at location I using OpenCV
cv::Mat cvFrame;
myReader.getFrame(cvFrame, i);
cv::Size frameSize = cvFrame.size();
//Each cv::Mat cvFrame has width=810, height=610, step=2432
1. // create AVPicture from cv::Mat frame
2. avpicture_fill((AVPicture*)avFrame, cvFrame.data, PIX_FMT_RGB24, outStream->codec->width, outStream->codec->height);
3. avFrame->width = frameSize.width;
4. avFrame->height = frameSize.height;
// rescale to outStream format
sws_scale(swsCtx, avFrame->data, avFrame->linesize, 0, outStream->codec->height, avFrameRescaledFrame->data, avFrameRescaledFrame ->linesize);
avFrameRescaledFrame->pts=i;
avFrameRescaledFrame->width = frameSize.width;
avFrameRescaledFrame->height = frameSize.height;
av_init_packet(&avEncodedPacket);
avEncodedPacket.data = NULL;
avEncodedPacket.size = 0;
// encode rescaled frame
if(avcodec_encode_video2(outStream->codec, &avEncodedPacket, avFrameRescaledFrame, &got_frame) < 0) exit(1);
if(got_frame)
{
if (avEncodedPacket.pts != AV_NOPTS_VALUE)
avEncodedPacket.pts = av_rescale_q(avEncodedPacket.pts, outStream->codec->time_base, outStream->time_base);
if (avEncodedPacket.dts != AV_NOPTS_VALUE)
avEncodedPacket.dts = av_rescale_q(avEncodedPacket.dts, outStream->codec->time_base, outStream->time_base);
// outContainer is "mp4"
av_write_frame(outContainer, & avEncodedPacket);
av_free_packet(&avEncodedPacket);
}
}
UPDATED
As @Alex suggested, I replaced lines 1-4 with the code below
int width = frameSize.width, height = frameSize.height;
avpicture_alloc((AVPicture*)avFrame, AV_PIX_FMT_RGB24, outStream->codec->width, outStream->codec->height);
for (int h = 0; h < height; h++)
{
memcpy(&(avFrame->data[0][h*avFrame->linesize[0]]), &(cvFrame.data[h*cvFrame.step]), width*3);
}
The video (here) I get now is almost perfect. It's NOT upside-down and NOT black & white, BUT it seems that one of the RGB components is missing. Every brown/red color became blue (in the original images it is the other way around).
What could be the problem? Could rescaling (sws_scale) to the AV_PIX_FMT_YUV420P format cause this?
The problem in a nutshell: avpicture_fill() expects no padding between rows, i.e. the stride (step) must equal width*sizeof(pixel), which is 810*3 = 2430. The actual stride of the cv::Mat data (its step) is, as you say, 2432, so just passing the data directly won't work. There is no way to tell avpicture_fill() to use a different stride for the input data; it is not part of the API (you might say it should be :)
There are two possible solutions:
Create an array in which the input data is contiguous, with no padding between rows. You'd have to memcpy each row from the cv::Mat into that array. Then pass it to avpicture_fill().
int width, height; // get from mat
uint8_t* buf = (uint8_t*)malloc(width * height * 3); // 3 bytes per pixel
for (int i = 0; i < height; i++)
{
memcpy( &( buf[ i*width*3 ] ), &( mat->data[ i*mat->step ] ), width*3 );
}
avpicture_fill(..., buf, ...)
Btw, to flip the video vertically, you can do this to copy the last row to the first and so forth:
...
memcpy( &( buf[ i*width*3 ] ), &( mat->data[ (height - i - 1)*mat->step ] ), width*3 );
...
Or, fill in the AVPicture yourself:
AVPicture* pic = (AVPicture*)malloc(sizeof(AVPicture));
avpicture_alloc(pic, PIX_FMT_BGR24, width, height);
for (int i = 0; i < height; i++)
{
memcpy( &( pic->data[0][ i*pic->linesize[0] ] ), &( mat->data[ i*mat->step ] ), width*3);
}
There is no need to allocate pic->data[0] or set pic->linesize[0], avpicture_alloc() should do that. There is also no need to fill in data[1] or data[2], those should be null.
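Both variants are the same stride-aware row copy, so here it is as a standalone sketch with the vertical flip as an option. copyRows is a hypothetical helper, not an FFmpeg or OpenCV function; rowBytes is the number of meaningful bytes per row (width*3 for a 24-bit format).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies `height` rows of `rowBytes` bytes each from
// a source with its own stride to a destination with its own stride.
// Any padding bytes at the end of a source row are skipped.
void copyRows(const std::uint8_t* src, std::size_t srcStride,
              std::uint8_t* dst, std::size_t dstStride,
              std::size_t rowBytes, int height, bool flipVertically)
{
    assert(rowBytes <= srcStride && rowBytes <= dstStride && height > 0);
    for (int i = 0; i < height; ++i) {
        // When flipping, the last source row becomes the first output row.
        int srcRow = flipVertically ? height - 1 - i : i;
        std::memcpy(dst + i * dstStride, src + srcRow * srcStride, rowBytes);
    }
}
```

With the numbers from the question, srcStride would be 2432 (the cv::Mat step) and rowBytes 2430 (810*3); dstStride is either 2430 for the contiguous buffer or pic->linesize[0] for the AVPicture.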
EDIT: Removed old code which showed copying R, G, B to separate planes. PIX_FMT_BGR24 is not a planar format.
I'm not familiar enough with OpenCV C++ API to figure out how to get the width and height (it's not mat->width, obviously) but I think you know what I mean.
P.S. Btw, your video is not actually black and white. It's just that each successive row is offset by two bytes, so the colors are rotated: red becomes green, green becomes blue, and so forth. The result is grayscale-ish, but if you look closely the individual rows are colored.
Have you considered using OpenCV's features to create the video for you? It's much easier, since your data is already stored in a cv::Mat.
If you would like to keep your approach, you could simply rotate the cv::Mat.
About the color problem in the UPDATE of the original post: could it be caused by the channel order?
An OpenCV Mat is BGR, while the FFmpeg AVFrame here is treated as RGB.
If so, try
cvtColor( cvFrame , cvFrame , CV_BGR2RGB ) ;
before line 1.
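If you'd rather see what that swap does, or do it without OpenCV, here is a sketch of an in-place red/blue swap on 24-bit pixel data. swapRedBlue is a hypothetical helper; it assumes 3 bytes per pixel and takes the row stride explicitly so padded rows are handled. Another option that should avoid the swap altogether is to keep the data as BGR and describe the source to swscale as BGR24 instead of RGB24.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical helper: swaps the first and third byte of every 3-byte
// pixel in place, turning BGR into RGB (or back). `stride` is the number
// of bytes between the starts of consecutive rows, padding included.
void swapRedBlue(std::uint8_t* data, int width, int height, std::size_t stride)
{
    assert(width > 0 && height > 0);
    assert(static_cast<std::size_t>(width) * 3 <= stride);
    for (int y = 0; y < height; ++y) {
        std::uint8_t* row = data + y * stride;
        for (int x = 0; x < width; ++x) {
            std::swap(row[x * 3], row[x * 3 + 2]);
        }
    }
}
```

For the cv::Mat in the question the arguments would be cvFrame.data, 810, 610 and cvFrame.step.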