How do you resize an AVFrame? - c++

How do you resize an AVFrame?
Here's what I'm currently doing:
AVFrame* frame = /*...*/;
int width = 600, height = 400;
AVFrame* resizedFrame = av_frame_alloc();
auto format = AVPixelFormat(frame->format);
auto buffer = av_malloc(avpicture_get_size(format, width, height) * sizeof(uint8_t));
avpicture_fill((AVPicture *)resizedFrame, (uint8_t*)buffer, format, width, height);
struct SwsContext* swsContext = sws_getContext(frame->width, frame->height, format,
width, height, format,
SWS_BILINEAR, nullptr, nullptr, nullptr);
sws_scale(swsContext, frame->data, frame->linesize, 0, frame->height, resizedFrame->data, resizedFrame->linesize);
But after this, resizedFrame->width and height are still 0, the contents of the AVFrame look like garbage, and I get a warning that the data is unaligned when I call sws_scale. Note: I don't want to change the pixel format, and I don't want to hard-code what it is.

So, there's a few things going on.
avpicture_fill() does not set frame->width/height/format. You have to set these values yourself.
avpicture_get_size() and avpicture_fill() do not guarantee alignment. The underlying functions called in these wrappers (e.g. av_image_get_buffer_size() or av_image_fill_arrays()) are called with align=1, so there's no buffer alignment between lines. If you want alignment (you do), you either have to call the underlying functions directly with a different align setting, or call avcodec_align_dimensions2() on the width/height and provide aligned width/height to the avpicture_*() functions. If you do that, you can also consider using avpicture_alloc() instead of avpicture_get_size() + av_malloc() + avpicture_fill().
I think if you follow these two suggestions, you'll find that the rescaling works as expected, gives no warnings and has correct output. The quality may not be great because you're trying to do bilinear scaling. Most people use bicubic scaling (SWS_BICUBIC).
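To make the align=1 point concrete, here is a minimal sketch of how a row size gets padded for a given alignment. It is plain C++ and does not call FFmpeg; the names aligned_linesize and plane_size are made up for this illustration, and the 3-bytes-per-pixel figure in the usage note is only an example format.

```cpp
#include <cassert>
#include <cstddef>

// Round a row size in bytes up to the next multiple of `align`.
// `align` must be a power of two (1, 16, 32, ...).
std::size_t aligned_linesize(std::size_t row_bytes, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (row_bytes + align - 1) & ~(align - 1);
}

// Bytes needed for one packed plane of `height` rows at that alignment.
std::size_t plane_size(std::size_t row_bytes, std::size_t height, std::size_t align) {
    return aligned_linesize(row_bytes, align) * height;
}
```

For a 600-pixel-wide row at 3 bytes per pixel (1800 bytes), align=1 leaves the linesize at 1800, while align=32 pads it to 1824. That padding is what the align argument of the underlying functions controls.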

Related

Loading RAW grayscale image with FreeImage

How can I load RAW 16-bit grayscale image with FreeImage?
I have unsigned char* buffer with raw data. I know its dimensions in pixels and I know it is 16bit grayscale.
I'm trying to load it with
FIBITMAP* bmp = FreeImage_ConvertFromRawBits(buffer, 1000, 1506, 2000, 16, 0, 0, 0);
and get a broken RGB888 image. It is unclear what color masks I should use for grayscale, as it has only one channel.
After many experiments I found a partially working solution with FreeImage_ConvertFromRawBitsEx:
FIBITMAP* bmp = FreeImage_ConvertFromRawBitsEx(true, buffer, FIT_UINT16, 1000, 1506, 2000, 16, 0xFFFF, 0xFFFF, 0xFFFF);
(thanks @1201ProgramAlarm for the hint about the masks).
This way, FreeImage loads the data, but in some semi-custom format. Most conversion and saving functions (tried: JPG, PNG, BMP, TIF) fail.
As I can't load the data in native 16-bit format, I preferred to convert it into 8-bit grayscale:
unsigned short* buffer = new unsigned short[1000 * 1506];
// load data
unsigned char* buffer2 = new unsigned char[1000 * 1506];
for (int i = 0; i < 1000 * 1506; i++)
    buffer2[i] = (unsigned char)(buffer[i] / 256.f);
FIBITMAP* bmp = FreeImage_ConvertFromRawBits(buffer2, 1000, 1506, 1000, 8, 0xFF, 0xFF, 0xFF, true);
This is really not the best solution, and I don't even want to mark it as the right answer (I will wait for something better). But after this, the format is convenient for FreeImage and it can save or convert the data to whatever is needed.
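The 16-bit to 8-bit reduction above can also be written with integer arithmetic only. The sketch below assumes the samples are already in host byte order in a std::vector; to_gray8 is a made-up helper name, not part of FreeImage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reduce 16-bit grayscale samples to 8 bits by keeping the high byte.
// This gives the same result as truncating value / 256, without going
// through floating point.
std::vector<std::uint8_t> to_gray8(const std::vector<std::uint16_t>& src,
                                   std::size_t width, std::size_t height) {
    assert(src.size() == width * height);
    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = static_cast<std::uint8_t>(src[i] >> 8);
    }
    return dst;
}
```

The resulting buffer has one byte per pixel and no row padding, so its pitch equals the width, as in the FreeImage_ConvertFromRawBits call above.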
Concerning your issue: I have read this from their PDF documentation FreeImage1370.pdf:
FreeImage_ConvertFromRawBits
Supported bit depths: 1 4 8 16 24 32
DLL_API FIBITMAP *DLL_CALLCONV FreeImage_ConvertFromRawBits(BYTE *bits, int width, int
height, int pitch, unsigned bpp, unsigned red_mask, unsigned green_mask, unsigned
blue_mask, BOOL topdown FI_DEFAULT(FALSE));
Converts a raw bitmap somewhere in memory to a FIBITMAP. The parameters in this
function are used to describe the raw bitmap. The first parameter is a pointer to the start of
the raw bits. The width and height parameter describe the size of the bitmap. The pitch
defines the total width of a scanline in the source bitmap, including padding bytes that may be
applied. The bpp parameter tells FreeImage what the bit depth of the bitmap is. The
red_mask, green_mask and blue_mask parameters tell FreeImage the bit-layout of the color
components in the bitmap. The last parameter, topdown, will store the bitmap top-left pixel
first when it is TRUE or bottom-left pixel first when it is FALSE.
When the source bitmap uses a 32-bit padding, you can calculate the pitch using the
following formula:
int pitch = ((((bpp * width) + 31) / 32) * 4);
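As a quick check of that formula against the numbers in the question, here is a small sketch; padded_pitch is just a wrapper name chosen for this example.

```cpp
#include <cassert>

// Pitch in bytes of a scanline padded to a 32-bit boundary,
// using the formula quoted from the FreeImage documentation.
int padded_pitch(int bpp, int width) {
    assert(bpp > 0 && width > 0);
    return (((bpp * width) + 31) / 32) * 4;
}
```

For bpp = 16 and width = 1000 this gives 2000, which matches the pitch passed in the question. A width of 1001 at 16 bpp would give 2004, because each row is padded with two extra bytes.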
In the code you are showing:
FIBITMAP* bmp = FreeImage_ConvertFromRawBits(buffer, 1000, 1506, 2000, 16, 0, 0, 0);
You have the appropriate FIBITMAP* return type, and you pass in your buffer of raw bits. The 2nd and 3rd parameters are the width and height (width = 1000, height = 1506), and the 4th parameter is the pitch (pitch = 2000; if the bitmap uses 32-bit padding, refer to the last note above). The 5th parameter is the bit depth, which you have as bpp = 16. The next 3 parameters are your RGB color masks, which you set to 0. The last parameter is a bool flag for the orientation of the image:
if (topdown == true) {
    // the top-left pixel is stored first
} else {
    // the bottom-left pixel is stored first
}
which you omit, so it takes its default value (FALSE).
Without more code showing how you read the file, parse the header information, and so on to prepare your buffer, it is hard to tell where else there may be an error. From what you provided, I think you need to check the color channel masks for grayscale images.
EDIT - I found another PDF for FreeImage from stanford.edu here that refers to an older version, 3.13.1. However, the function declaration doesn't look like it has changed, and they provide examples for both FreeImage_ConvertToRawBits & FreeImage_ConvertFromRawBits:
// this code assumes there is a bitmap loaded and
// present in a variable called ‘dib’
// convert a bitmap to a 32-bit raw buffer (top-left pixel first)
// --------------------------------------------------------------
FIBITMAP *src = FreeImage_ConvertTo32Bits(dib);
FreeImage_Unload(dib);
// Allocate a raw buffer
int width = FreeImage_GetWidth(src);
int height = FreeImage_GetHeight(src);
int scan_width = FreeImage_GetPitch(src);
BYTE *bits = (BYTE*)malloc(height * scan_width);
// convert the bitmap to raw bits (top-left pixel first)
FreeImage_ConvertToRawBits(bits, src, scan_width, 32,
FI_RGBA_RED_MASK, FI_RGBA_GREEN_MASK, FI_RGBA_BLUE_MASK,
TRUE);
FreeImage_Unload(src);
// convert a 32-bit raw buffer (top-left pixel first) to a FIBITMAP
// ----------------------------------------------------------------
FIBITMAP *dst = FreeImage_ConvertFromRawBits(bits, width, height, scan_width,
32, FI_RGBA_RED_MASK, FI_RGBA_GREEN_MASK, FI_RGBA_BLUE_MASK, FALSE);
I think this should help you with your question about the bit masks for the color channels in a grayscale image.
You already mentioned the FreeImage_ConvertFromRawBitsEx() function, which was added at some point between FreeImage v3.8 and v3.17, but are you calling it correctly? I was able to use this function with 16-bit grayscale data:
int nBytesPerRow = nWidth * 2;
int nBitsPerPixel = 16;
FIBITMAP* pFIB = FreeImage_ConvertFromRawBitsEx(TRUE, pImageData, FIT_UINT16, nWidth, nHeight, nBytesPerRow, nBitsPerPixel, 0, 0, 0, TRUE);
Note that nBytesPerRow and nBitsPerPixel have to be specified correctly for the 16-bit data. Also, I believe the color mask parameters are irrelevant for this data, since it is monochrome.
EDIT: I noticed that you said that saving the 16-bit data did not work correctly. That may be due to the file formats themselves. The only file format that I have found to be compatible with 16-bit grayscale data is TIFF. So, if you have 16-bit grayscale data, you can save a TIFF with FreeImage_Save() but you cannot save a BMP.

What is the best way to fill AVFrame.data

I want to transfer opengl framebuffer data to AVCodec as fast as possible.
I've already converted RGB to YUV with shader and read it with glReadPixels
I still need to fill AVFrame data manually. Is there any better way?
AVFrame *frame;
// Y
frame->data[0][y*frame->linesize[0]+x] = data[i*3];
// U
frame->data[1][y*frame->linesize[1]+x] = data[i*3+1];
// V
frame->data[2][y*frame->linesize[2]+x] = data[i*3+2];
You can use sws_scale.
In fact, you don't need shaders for converting RGB->YUV. Believe me, it's not gonna have a very different performance.
swsContext = sws_getContext(WIDTH, HEIGHT, AV_PIX_FMT_RGBA, WIDTH, HEIGHT, AV_PIX_FMT_YUV420P, SWS_BICUBIC, 0, 0, 0 );
sws_scale(swsContext, (const uint8_t * const *)sourcePictureRGB.data, sourcePictureRGB.linesize, 0, codecContext->height, destinyPictureYUV.data, destinyPictureYUV.linesize);
The data in destinyPictureYUV will be ready to go to the codec.
In this sample, destinyPictureYUV is the AVFrame you want to fill up. Try to setup like this:
AVFrame * frame = av_frame_alloc();
AVPicture destinyPictureYUV;
avpicture_alloc(&destinyPictureYUV, codecContext->pix_fmt, newCodecContext->width, newCodecContext->height);
// THIS is what you want probably
*reinterpret_cast<AVPicture *>(frame) = destinyPictureYUV;
With this setup you CAN ALSO fill up with the data you already converted to YUV in the GPU if you desire... you can choose the way you want.
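If you do fill the planes by hand, as in the question, it helps to walk row by row and honor each plane's linesize. The sketch below is plain C++ with no FFmpeg calls. It assumes the readback buffer holds tightly packed Y,U,V triplets and that all three planes are full resolution (4:4:4); for 4:2:0 the chroma planes are half the width and height, so the U and V writes would need subsampling. The name split_packed_yuv444 is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Split a tightly packed buffer of Y,U,V triplets (3 bytes per pixel,
// no row padding) into three full-resolution planes.
// planes[i] and strides[i] play the role of AVFrame::data and AVFrame::linesize.
void split_packed_yuv444(const std::uint8_t* packed, int width, int height,
                         std::uint8_t* const planes[3], const int strides[3]) {
    assert(packed != nullptr && width > 0 && height > 0);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* src = packed + static_cast<std::ptrdiff_t>(y) * width * 3;
        std::uint8_t* dst_y = planes[0] + static_cast<std::ptrdiff_t>(y) * strides[0];
        std::uint8_t* dst_u = planes[1] + static_cast<std::ptrdiff_t>(y) * strides[1];
        std::uint8_t* dst_v = planes[2] + static_cast<std::ptrdiff_t>(y) * strides[2];
        for (int x = 0; x < width; ++x) {
            dst_y[x] = src[3 * x];
            dst_u[x] = src[3 * x + 1];
            dst_v[x] = src[3 * x + 2];
        }
    }
}
```

Computing the row pointers once per row keeps the inner loop free of multiplications, which is most of what can be gained over the per-pixel indexing in the question.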

avcodec YUV to RGB

I'm trying to convert a YUV frame to RGB using libswscale.
Here is my code :
AVFrame *RGBFrame;
SwsContext *ConversionContext;
ConversionContext = sws_getCachedContext(NULL, FrameWidth, FrameHeight, AV_PIX_FMT_YUV420P, FrameWidth, FrameHeight, AV_PIX_FMT_RGB24, SWS_BILINEAR, 0, 0, 0);
RGBFrame = av_frame_alloc();
avpicture_fill((AVPicture *)RGBFrame, &FillVect[0], AV_PIX_FMT_RGB24, FrameWidth, FrameHeight);
sws_scale(ConversionContext, VideoFrame->data, VideoFrame->linesize, 0, VideoFrame->height, RGBFrame->data, RGBFrame->linesize);
My program segfaults in the sws_scale function.
VideoFrame is an AVFrame struct that holds my decoded frame.
I think this is because the YUV frame comes from avcodec_decode_video2, which returns an array like this:
VideoFrame->data[0] // Y array, linesize = frame width
VideoFrame->data[1] // U array, linesize = frame width/2
VideoFrame->data[2] // V array, linesize = frame width/2
While YUV420P theoretically has only one plane (according to Wikipedia, YUV420P is a planar format, so the Y, U, V data are grouped together).
So, I don't know how to convert my array, where the Y, U, V data are separated, into RGB24 using swscale.
Please help me, thanks :)
av_frame_alloc only allocates memory for the frame object itself; it does not allocate memory to store the image data. Have you done:
FillVect.resize( avpicture_get_size( AV_PIX_FMT_RGB24, FrameWidth, FrameHeight ) );
somewhere in your code before calling avpicture_fill? Or some other way to make sure FillVect allocates enough memory to hold the whole decoded picture?
Did you try running it under valgrind to see what exactly triggers the SEGFAULT?
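To get a feel for the sizes involved, here is a rough sketch of the arithmetic for tightly packed buffers (no row padding). It does not call FFmpeg; Yuv420pLayout, yuv420p_layout and rgb24_size are names invented for this illustration. It also shows why the decoder hands back three separate arrays for YUV420P: one luma plane plus two quarter-size chroma planes.

```cpp
#include <cassert>
#include <cstddef>

// Sizes in bytes of the three planes of an unpadded YUV420P image.
struct Yuv420pLayout {
    std::size_t y_size;  // luma: one byte per pixel
    std::size_t u_size;  // chroma: one byte per 2x2 block of pixels
    std::size_t v_size;
    std::size_t total() const { return y_size + u_size + v_size; }
};

Yuv420pLayout yuv420p_layout(std::size_t width, std::size_t height) {
    assert(width > 0 && height > 0);
    const std::size_t cw = (width + 1) / 2;   // chroma width, rounded up
    const std::size_t ch = (height + 1) / 2;  // chroma height, rounded up
    return {width * height, cw * ch, cw * ch};
}

// Size in bytes of an unpadded RGB24 image (3 bytes per pixel).
std::size_t rgb24_size(std::size_t width, std::size_t height) {
    return width * height * 3;
}
```

For a 640x480 frame the YUV420P planes take 307200 + 76800 + 76800 = 460800 bytes, while the RGB24 destination needs 921600 bytes. If FillVect is smaller than that, sws_scale writes past its end.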

Converting QImage to YUV420P pixel format

Has anybody solved this problem before? I need a simple and fast method to convert a QImage::bits() buffer from RGB32 to the YUV420P pixel format. Can you help me?
libswscale, part of the ffmpeg project, has optimized routines to perform colorspace conversions, scaling, and filtering. If you really want speed, I would suggest using it unless you cannot add the extra dependency. I haven't actually tested this code, but here is the general idea:
QImage img = ... //your image in RGB32
// Allocate the output buffer. Use av_malloc to get aligned memory. YUV420P
// needs 1.5 times the number of pixels (Cb and Cr only use 0.25
// bytes per pixel on average).
uint8_t* out_buffer = (uint8_t*)av_malloc((int)ceil(img.height() * img.width() * 1.5));
// Allocate the ffmpeg frame structures.
AVFrame* inpic = av_frame_alloc();
AVFrame* outpic = av_frame_alloc();
// avpicture_fill sets all of the data pointers in the AVFrame structures
// to the right places in the data buffers. It does not copy the data, so
// the QImage and out_buffer still need to live after calling these.
// QImage::Format_RGB32 stores 0xffRRGGBB in native byte order, which
// corresponds to AV_PIX_FMT_RGB32 (AV_PIX_FMT_ARGB only matches on
// big-endian machines).
avpicture_fill((AVPicture*)inpic,
               img.bits(),
               AV_PIX_FMT_RGB32,
               img.width(),
               img.height());
avpicture_fill((AVPicture*)outpic,
               out_buffer,
               AV_PIX_FMT_YUV420P,
               img.width(),
               img.height());
// Create the conversion context. You only need to do this once if
// you are going to do the same conversion multiple times.
SwsContext* ctx = sws_getContext(img.width(),
                                 img.height(),
                                 AV_PIX_FMT_RGB32,
                                 img.width(),
                                 img.height(),
                                 AV_PIX_FMT_YUV420P,
                                 SWS_BICUBIC,
                                 NULL, NULL, NULL);
// Perform the conversion.
sws_scale(ctx,
          inpic->data,
          inpic->linesize,
          0,
          img.height(),
          outpic->data,
          outpic->linesize);
// Free the conversion context and the frame structures.
sws_freeContext(ctx);
av_frame_free(&inpic);
av_frame_free(&outpic);
//...
// Free the output buffer when done with it.
av_free(out_buffer);
Like I said, I haven't tested this code so it may require some tweaks to get it working.
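If you cannot add the dependency, the per-pixel math is small enough to write by hand. The sketch below uses the commonly cited integer approximation of the BT.601 limited-range ("video range") RGB to YUV transform. It is not what libswscale runs internally, and the names Yuv and rgb_to_yuv_bt601 are made up for this example. For YUV420P you would store Y for every pixel and one averaged U and V per 2x2 block.

```cpp
#include <cassert>
#include <cstdint>

struct Yuv {
    std::uint8_t y;
    std::uint8_t u;
    std::uint8_t v;
};

// Convert one 8-bit RGB pixel to limited-range BT.601 YUV using fixed-point
// coefficients scaled by 256. The +128 term rounds before the shift.
// Relies on arithmetic right shift for negative intermediates, which
// mainstream compilers provide (and C++20 guarantees).
Yuv rgb_to_yuv_bt601(int r, int g, int b) {
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255 && b >= 0 && b <= 255);
    const int y = ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16;
    const int u = ((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128;
    const int v = ((112 * r - 94 * g - 18 * b + 128) >> 8) + 128;
    return {static_cast<std::uint8_t>(y),
            static_cast<std::uint8_t>(u),
            static_cast<std::uint8_t>(v)};
}
```

With these coefficients, white maps to (235, 128, 128) and black to (16, 128, 128), which are the limits of the video range.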

Faster encoding of realtime 3d graphics with opengl and x264

I am working on a system that sends a compressed video to a client from 3d graphics that are done in the server as soon as they are rendered.
I already have the code working, but I feel it could be much faster (and it is already a bottleneck in the system)
Here is what I am doing:
First I grab the framebuffer
glReadBuffer( GL_FRONT );
glReadPixels( 0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, buffer );
Then I flip the framebuffer, because there is a weird bug with swsScale (which I am using for colorspace conversion) that flips the image vertically when I convert. I am flipping in advance, nothing fancy.
void VerticalFlip(int width, int height, byte* pixelData, int bitsPerPixel)
{
    byte* temp = new byte[width*bitsPerPixel];
    height--; //remember height array ends at height-1
    for (int y = 0; y < (height+1)/2; y++)
    {
        memcpy(temp,&pixelData[y*width*bitsPerPixel],width*bitsPerPixel);
        memcpy(&pixelData[y*width*bitsPerPixel],&pixelData[(height-y)*width*bitsPerPixel],width*bitsPerPixel);
        memcpy(&pixelData[(height-y)*width*bitsPerPixel],temp,width*bitsPerPixel);
    }
    delete[] temp;
}
Then I convert it to YUV420p
convertCtx = sws_getContext(width, height, PIX_FMT_RGB24, width, height, PIX_FMT_YUV420P, SWS_FAST_BILINEAR, NULL, NULL, NULL);
uint8_t *src[3]= {buffer, NULL, NULL};
sws_scale(convertCtx, src, &srcstride, 0, height, pic_in.img.plane, pic_in.img.i_stride);
Then I pretty much just call the x264 encoder. I am already using the zerolatency preset.
int frame_size = x264_encoder_encode(_encoder, &nals, &i_nals, _inputPicture, &pic_out);
My guess is that there should be a faster way to do this. Capturing the frame and converting it to YUV420p. It would be nice to convert it to YUV420p in the GPU and only after that copying it to system memory, and hopefully there is a way to do color conversion without the need to flip.
If there is no better way, at least this question may help someone trying to do this, to do it the same way I did.
First, use asynchronous texture reads with PBOs. Here is an example. It speeds up the read by using 2 PBOs that work asynchronously, without stalling the pipeline the way glReadPixels does when used directly. In my app I got an 80% performance boost when I switched to PBOs.
Additionally, on some GPUs glGetTexImage() works faster than glReadPixels(), so try it out.
But if you really want to take the video encoding to the next level, you can do it via CUDA using the Nvidia Codec Library. I recently asked the same question, so this can be helpful.
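On the flipping step: glReadPixels returns rows bottom-up, and a separate in-place flip can be avoided by reading the source with a negative stride, starting at the last row. The sketch below shows the idea with a plain row copy; copy_rows and copy_flipped are names made up for this example. A commonly used variant passes the last-row pointer and a negated stride straight to sws_scale so the flip and the conversion happen in one pass, but treat that as something to verify against your libswscale version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy `height` rows of `row_bytes` bytes each. Strides are signed so that
// a caller can walk the source from the bottom row upwards.
void copy_rows(const std::uint8_t* src, std::ptrdiff_t src_stride,
               std::uint8_t* dst, std::ptrdiff_t dst_stride,
               std::size_t row_bytes, int height) {
    assert(src != nullptr && dst != nullptr && height > 0);
    for (int y = 0; y < height; ++y) {
        std::memcpy(dst + y * dst_stride, src + y * src_stride, row_bytes);
    }
}

// Copy a bottom-up image (the order glReadPixels produces) into top-down
// order by starting at the last source row and stepping backwards.
void copy_flipped(const std::uint8_t* src, std::uint8_t* dst,
                  std::size_t row_bytes, int height) {
    const std::ptrdiff_t stride = static_cast<std::ptrdiff_t>(row_bytes);
    copy_rows(src + (height - 1) * stride, -stride, dst, stride, row_bytes, height);
}
```

Compared with VerticalFlip in the question, this touches each row once and needs no temporary row buffer, at the cost of writing into a second buffer.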