I'm trying to convert a YUV frame to RGB using libswscale.
Here is my code:
AVFrame *RGBFrame;
SwsContext *ConversionContext;
ConversionContext = sws_getCachedContext(NULL, FrameWidth, FrameHeight, AV_PIX_FMT_YUV420P, FrameWidth, FrameHeight, AV_PIX_FMT_RGB24, SWS_BILINEAR, 0, 0, 0);
RGBFrame = av_frame_alloc();
avpicture_fill((AVPicture *)RGBFrame, &FillVect[0], AV_PIX_FMT_RGB24, FrameWidth, FrameHeight);
sws_scale(ConversionContext, VideoFrame->data, VideoFrame->linesize, 0, VideoFrame->height, RGBFrame->data, RGBFrame->linesize);
My program segfaults inside the sws_scale call.
VideoFrame is an AVFrame struct that holds my decoded frame.
I think this is because the YUV frame comes from avcodec_decode_video2, which returns an array like this:
VideoFrame->data[0] // Y array, linesize = frame width
VideoFrame->data[1] // U array, linesize = frame width/2
VideoFrame->data[2] // V array, linesize = frame width/2
However, YUV420P should theoretically have only one plane (according to Wikipedia, YUV420P is a planar format, so the Y, U, and V data are grouped together).
So I don't know how to convert my array, where the Y, U, and V data are separated, into RGB24 using swscale.
Please help me, thanks :)
av_frame_alloc only allocates memory for the frame object itself; it does not allocate memory to store the image data. Have you done:
FillVect.resize( avpicture_get_size( AV_PIX_FMT_RGB24, FrameWidth, FrameHeight ) );
somewhere in your code before calling avpicture_fill? Or is there some other way you make sure FillVect allocates enough memory to hold the whole converted picture?
Did you try running it under valgrind to see what exactly triggers the segfault?
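For reference, the sizes involved can be worked out by hand. The sketch below is plain C++ with no FFmpeg dependency; `rgb24_size`, `Yuv420pLayout`, and `yuv420p_layout` are hypothetical names, and it assumes tightly packed rows (the align=1 layout that, as far as I know, avpicture_get_size and avpicture_fill use). It also shows that three separate planes with a chroma linesize of about width/2 is the normal YUV420P layout, so the decoder output described in the question should be fine as sws_scale input.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for a tightly packed RGB24 image (3 bytes per pixel).
std::size_t rgb24_size(int width, int height) {
    assert(width > 0 && height > 0);
    return static_cast<std::size_t>(width) * height * 3;
}

// Where each plane starts inside one contiguous, tightly packed YUV420P buffer.
struct Yuv420pLayout {
    std::size_t y_offset, u_offset, v_offset; // byte offsets of the three planes
    int y_stride, c_stride;                   // bytes per row (luma, chroma)
    std::size_t total;                        // total bytes for the whole picture
};

Yuv420pLayout yuv420p_layout(int width, int height) {
    assert(width > 0 && height > 0);
    // Chroma planes are subsampled by 2 in both directions, rounded up.
    const int cw = (width + 1) / 2;
    const int ch = (height + 1) / 2;
    const std::size_t y_bytes = static_cast<std::size_t>(width) * height;
    const std::size_t c_bytes = static_cast<std::size_t>(cw) * ch;

    Yuv420pLayout l;
    l.y_offset = 0;
    l.u_offset = y_bytes;
    l.v_offset = y_bytes + c_bytes;
    l.y_stride = width;
    l.c_stride = cw;
    l.total = y_bytes + 2 * c_bytes;
    return l;
}
```

If FillVect is smaller than rgb24_size(FrameWidth, FrameHeight), sws_scale writes past the end of it, which would explain the crash.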
I am playing around with the NVDEC H.264 decoder from the NVIDIA CUDA samples. One thing I've found is that once a frame is decoded, it's converted from NV12 to a BGRA buffer allocated on the CUDA side, and then this buffer is copied to a D3D BGRA texture.
I find this inefficient in terms of memory usage, and I want to convert the NV12 frame directly into the D3D texture with this kernel:
void Nv12ToBgra32(uint8_t *dpNv12, int nNv12Pitch, uint8_t *dpBgra, int nBgraPitch, int nWidth, int nHeight, int iMatrix)
So I create a D3D texture (BGRA, D3D11_USAGE_DEFAULT, D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_UNORDERED_ACCESS, D3D11_CPU_ACCESS_WRITE, 1 mipmap),
then register it and write to it on the CUDA side:
//Register
ck(cuGraphicsD3D11RegisterResource(&cuTexResource, textureResource, CU_GRAPHICS_REGISTER_FLAGS_NONE));
...
//Write output:
CUarray retArray;
ck(cuGraphicsMapResources(1, &cuTexResource, 0));
ck(cuGraphicsSubResourceGetMappedArray(&retArray, cuTexResource, 0, 0));
/*
yuvFramePtr (NV12) is uint8_t* from decoded frame,
it's stored within CUDA memory I believe
*/
Nv12ToBgra32(yuvFramePtr, w, (uint8_t*)retArray, 4 * w, w, h);
ck(cuGraphicsUnmapResources(1, &cuTexResource, 0));
As soon as the kernel is called, I get a crash, possibly because I'm misusing the CUarray. Can anybody please clarify how to use the output of cuGraphicsSubResourceGetMappedArray to write texture memory from a CUDA kernel? (Since I only need to write raw memory, there is no need to handle clamping, filtering, or value scaling.)
OK, for anyone struggling with the question "How do I write to a D3D11 texture from a CUDA kernel?", here is how:
Create the D3D texture with D3D11_BIND_UNORDERED_ACCESS.
Then register the resource:
//ID3D11Texture2D *textureResource from D3D texture
CUgraphicsResource cuTexResource;
ck(cuGraphicsD3D11RegisterResource(&cuTexResource, textureResource, CU_GRAPHICS_REGISTER_FLAGS_NONE));
//You can also add write-discard if texture will be fully written by kernel
ck(cuGraphicsResourceSetMapFlags(cuTexResource, CU_GRAPHICS_MAP_RESOURCE_FLAGS_WRITE_DISCARD));
Once the texture is created and registered, we can use it as a write surface.
ck(cuGraphicsMapResources(1, &cuTexResource, 0));
//Get array for first mip-map
CUarray retArray;
ck(cuGraphicsSubResourceGetMappedArray(&retArray, cuTexResource, 0, 0));
//Create surface from texture
CUsurfObject surf;
CUDA_RESOURCE_DESC surfDesc{};
surfDesc.res.array.hArray = retArray;
surfDesc.resType = CU_RESOURCE_TYPE_ARRAY;
ck(cuSurfObjectCreate(&surf, &surfDesc));
/*
Kernel declaration is:
void Nv12ToBgra32Surf(uint8_t* dpNv12, int nNv12Pitch, cudaSurfaceObject_t surf, int nBgraPitch, int nWidth, int nHeight, int iMatrix)
Surface write:
surf2Dwrite<uint>(VALUE, surf, x * sizeof(uint), y);
For BGRA surface we are writing uint, X offset is in bytes,
so multiply it with byte-size of type.
Run kernel:
*/
Nv12ToBgra32Surf(yuvFramePtr, w, /*out*/surf, 4 * w, w, h);
ck(cuGraphicsUnmapResources(1, &cuTexResource, 0));
ck(cuSurfObjectDestroy(surf));
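To make the "X offset is in bytes" point concrete, here is a small host-side sketch of the two pieces of arithmetic the kernel does per texel. `pack_bgra` and `surface_x_bytes` are hypothetical helper names, not part of CUDA, and the packing assumes a little-endian target so that B ends up in the lowest-addressed byte of a BGRA texel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pack four 8-bit channels into the 32-bit value written for one BGRA texel.
// Assumption: little-endian layout, so B occupies the lowest-addressed byte.
std::uint32_t pack_bgra(std::uint8_t b, std::uint8_t g, std::uint8_t r, std::uint8_t a) {
    return  static_cast<std::uint32_t>(b)
         | (static_cast<std::uint32_t>(g) << 8)
         | (static_cast<std::uint32_t>(r) << 16)
         | (static_cast<std::uint32_t>(a) << 24);
}

// Byte offset along X for texel column x: the column index times the texel size.
std::size_t surface_x_bytes(int x) {
    assert(x >= 0);
    return static_cast<std::size_t>(x) * sizeof(std::uint32_t);
}
```

Inside the kernel the same two values would be passed to surf2Dwrite as the value and the X coordinate, while Y stays a plain row index.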
I am trying to convert an RGB image to YUV.
I am loading the image using OpenCV.
I am calling the function as follows:
//I know IplImage is outdated
IplImage* im = cvLoadImage("1.jpg", 1);
//....
bgr2yuv(im->imageData, dst, im->width, im->height);
The function that converts the color image to a YUV image is given below.
I am using FFmpeg to do that.
void bgr2yuv(unsigned char *src, unsigned char *dest, int w, int h)
{
    AVFrame *yuvIm = avcodec_alloc_frame();
    AVFrame *rgbIm = avcodec_alloc_frame();
    // avpicture_fill assumes tightly packed rows (linesize = w * 3 for BGR24)
    avpicture_fill((AVPicture *)rgbIm, src, PIX_FMT_BGR24, w, h);
    avpicture_fill((AVPicture *)yuvIm, dest, PIX_FMT_YUV420P, w, h);
    av_register_all();
    // pass NULL rather than an uninitialized pointer to get a fresh context
    struct SwsContext *imgCtx = sws_getCachedContext(NULL,
        w, h, (::PixelFormat)PIX_FMT_BGR24,
        w, h, (::PixelFormat)PIX_FMT_YUV420P,
        SWS_BICUBIC, NULL, NULL, NULL);
    sws_scale(imgCtx, rgbIm->data, rgbIm->linesize, 0, h, yuvIm->data, yuvIm->linesize);
    sws_freeContext(imgCtx);
    av_free(yuvIm);
    av_free(rgbIm);
}
I am getting wrong output after the conversion.
I think this is due to the row padding in IplImage
(my input image width is not a multiple of 4).
I updated the linesize variable, but even after that I am not getting correct output.
It works fine when I use images whose width is a multiple of 4.
Can anybody tell me what the problem in the code is?
Check IplImage::align or IplImage::widthStep and use these to set AVFrame::linesize. For the RGB frame, for example, you would set:
frame->linesize[0] = img->widthStep;
The layout of the dst array can be whatever you want; it depends on how you use it afterwards.
We need to set the input linesize as follows:
rgbIm->linesize[0] = im->widthStep;
But I think the output data from sws_scale() is not padded to make each row a multiple of 4 bytes.
So copying this data (dest) back into an IplImage will cause problems when displaying, saving, etc.
So we need to set widthStep = width as follows:
IplImage* yuvImage = cvCreateImageHeader(cvGetSize(im), 8, 1);
yuvImage->widthStep = yuvImage->width;
yuvImage->imageData = (char *)dest;
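The padding issue can be reproduced without OpenCV or FFmpeg. In the sketch below, `align_up` and `pack_rows` are hypothetical helpers, and the 4-byte row alignment is an assumption based on IplImage's default. A 5-pixel-wide BGR image has 15 payload bytes per row but a widthStep of 16, so any code that assumes width * 3 as the stride drifts by one byte per row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Round n up to the next multiple of align (align must be a power of two).
int align_up(int n, int align) {
    assert(align > 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

// Copy `height` rows of `row_bytes` payload out of a padded source buffer
// (rows `src_stride` bytes apart) into a tightly packed vector.
std::vector<unsigned char> pack_rows(const unsigned char *src, int src_stride,
                                     int row_bytes, int height) {
    assert(src != nullptr && row_bytes > 0 && height > 0);
    assert(src_stride >= row_bytes);
    std::vector<unsigned char> out(static_cast<std::size_t>(row_bytes) * height);
    for (int y = 0; y < height; ++y) {
        std::memcpy(out.data() + static_cast<std::size_t>(y) * row_bytes,
                    src + static_cast<std::size_t>(y) * src_stride,
                    static_cast<std::size_t>(row_bytes));
    }
    return out;
}
```

Telling sws_scale the real stride (linesize[0] = widthStep) avoids the copy entirely; pack_rows is only useful when the consumer insists on tightly packed rows.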
I want to transfer OpenGL framebuffer data to AVCodec as fast as possible.
I've already converted RGB to YUV with a shader and read it back with glReadPixels.
I still need to fill the AVFrame data manually. Is there a better way?
AVFrame *frame;
// Y
frame->data[0][y*frame->linesize[0]+x] = data[i*3];
// U
frame->data[1][y*frame->linesize[1]+x] = data[i*3+1];
// V
frame->data[2][y*frame->linesize[2]+x] = data[i*3+2];
You can use sws_scale.
In fact, you don't need shaders to convert RGB to YUV. Believe me, the performance won't be very different.
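// AV_PIX_FMT_YUV420P is assumed here; use codecContext->pix_fmt if your encoder expects another format
swsContext = sws_getContext(WIDTH, HEIGHT, AV_PIX_FMT_RGBA, WIDTH, HEIGHT, AV_PIX_FMT_YUV420P, SWS_BICUBIC, 0, 0, 0 );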
sws_scale(swsContext, (const uint8_t * const *)sourcePictureRGB.data, sourcePictureRGB.linesize, 0, codecContext->height, destinyPictureYUV.data, destinyPictureYUV.linesize);
The data in destinyPictureYUV will be ready to go to the codec.
In this sample, destinyPictureYUV holds the planes of the AVFrame you want to fill. Try setting it up like this:
AVFrame *frame = av_frame_alloc();
AVPicture destinyPictureYUV;
avpicture_alloc(&destinyPictureYUV, codecContext->pix_fmt, codecContext->width, codecContext->height);
// THIS is probably what you want: copy the data pointers and linesizes into the frame
*reinterpret_cast<AVPicture *>(frame) = destinyPictureYUV;
With this setup you CAN ALSO fill up with the data you already converted to YUV in the GPU if you desire... you can choose the way you want.
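If you do keep the GPU conversion and fill the planes yourself, note that for a 4:2:0 format the U and V planes are half the width and height of the Y plane, so they cannot be indexed with the full-resolution x and y as in the question. Below is a minimal sketch, assuming the encoder expects AV_PIX_FMT_YUV420P and that glReadPixels returned tightly packed Y,U,V triplets; `Planes` and `split_to_yuv420p` are hypothetical names, and chroma is simply taken from the top-left pixel of each 2x2 block rather than averaged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Planes {
    std::vector<std::uint8_t> y, u, v;
    int y_stride; // bytes per luma row
    int c_stride; // bytes per chroma row
};

// Split packed Y,U,V triplets (3 bytes per pixel, tightly packed rows)
// into three 4:2:0 planes.
Planes split_to_yuv420p(const std::uint8_t *packed, int width, int height) {
    assert(packed != nullptr && width > 0 && height > 0);
    Planes p;
    p.y_stride = width;
    p.c_stride = (width + 1) / 2;
    const int c_height = (height + 1) / 2;
    p.y.resize(static_cast<std::size_t>(p.y_stride) * height);
    p.u.resize(static_cast<std::size_t>(p.c_stride) * c_height);
    p.v.resize(static_cast<std::size_t>(p.c_stride) * c_height);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::uint8_t *px =
                packed + (static_cast<std::size_t>(y) * width + x) * 3;
            p.y[static_cast<std::size_t>(y) * p.y_stride + x] = px[0];
            // One chroma sample per 2x2 block: keep the top-left pixel's U and V.
            if (y % 2 == 0 && x % 2 == 0) {
                const std::size_t ci =
                    static_cast<std::size_t>(y / 2) * p.c_stride + x / 2;
                p.u[ci] = px[1];
                p.v[ci] = px[2];
            }
        }
    }
    return p;
}
```

When writing into a real AVFrame, the same loops apply with frame->linesize[0..2] in place of y_stride and c_stride, since FFmpeg may pad each row.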
How do you resize an AVFrame?
Here's what I'm currently doing:
AVFrame* frame = /*...*/;
int width = 600, height = 400;
AVFrame* resizedFrame = av_frame_alloc();
auto format = AVPixelFormat(frame->format);
auto buffer = av_malloc(avpicture_get_size(format, width, height) * sizeof(uint8_t));
avpicture_fill((AVPicture *)resizedFrame, (uint8_t*)buffer, format, width, height);
struct SwsContext* swsContext = sws_getContext(frame->width, frame->height, format,
width, height, format,
SWS_BILINEAR, nullptr, nullptr, nullptr);
sws_scale(swsContext, frame->data, frame->linesize, 0, frame->height, resizedFrame->data, resizedFrame->linesize);
But after this, resizedFrame->width and resizedFrame->height are still 0, the contents of the AVFrame look like garbage, and I get a warning that the data is unaligned when I call sws_scale. Note: I don't want to change the pixel format, and I don't want to hard-code what it is.
So, there are a few things going on.
avpicture_fill() does not set frame->width/height/format. You have to set these values yourself.
avpicture_get_size() and avpicture_fill() do not guarantee alignment. The underlying functions called in these wrappers (e.g. av_image_get_buffer_size() or av_image_fill_arrays()) are called with align=1, so there's no buffer alignment between lines. If you want alignment (you do), you either have to call the underlying functions directly with a different align setting, or call avcodec_align_dimensions2() on the width/height and provide aligned width/height to the avpicture_*() functions. If you do that, you can also consider using avpicture_alloc() instead of avpicture_get_size() + av_malloc() + avpicture_fill().
I think if you follow these two suggestions, you'll find that the rescaling works as expected, gives no warnings and has correct output. The quality may not be great because you're trying to do bilinear scaling. Most people use bicubic scaling (SWS_BICUBIC).
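To see what the align setting changes, here is a sketch of the arithmetic for a YUV420P picture, in plain C++ without FFmpeg. `round_up`, `Yuv420pAligned`, and `yuv420p_aligned` are hypothetical names, and the rule used (round each plane's tight linesize up to a multiple of align) is my understanding of what av_image_fill_arrays() does, not something copied from its source.

```cpp
#include <cassert>
#include <cstddef>

// Round n up to a multiple of align (align must be a power of two).
int round_up(int n, int align) {
    assert(align > 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

struct Yuv420pAligned {
    int linesize[3];   // bytes per row for the Y, U and V planes
    std::size_t total; // bytes needed for all three planes
};

Yuv420pAligned yuv420p_aligned(int width, int height, int align) {
    assert(width > 0 && height > 0);
    const int chroma_w = (width + 1) / 2;
    const int chroma_h = (height + 1) / 2;

    Yuv420pAligned out;
    out.linesize[0] = round_up(width, align);
    out.linesize[1] = round_up(chroma_w, align);
    out.linesize[2] = out.linesize[1];
    out.total = static_cast<std::size_t>(out.linesize[0]) * height
              + 2 * static_cast<std::size_t>(out.linesize[1]) * chroma_h;
    return out;
}
```

For the 600x400 target in the question, align=1 gives linesizes of 600 and 300, while align=32 gives 608 and 320, which is why a buffer sized for align=1 is too small once aligned linesizes are used.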
Has anybody solved this problem before? I need a simple and fast method to convert the QImage::bits() buffer from RGB32 to the YUV420P pixel format. Can you help me?
libswscale, part of the FFmpeg project, has optimized routines for colorspace conversion, scaling, and filtering. If you really want speed, I would suggest using it unless you cannot add the extra dependency. I haven't actually tested this code, but here is the general idea:
QImage img = ... //your image in RGB32
//allocate output buffer. use av_malloc to align memory. YUV420P
//needs about 1.5 bytes per pixel (Cb and Cr each use 0.25 bytes per
//pixel on average); avpicture_get_size also covers odd dimensions.
uint8_t* out_buffer = (uint8_t*)av_malloc(avpicture_get_size(AV_PIX_FMT_YUV420P, img.width(), img.height()));
//allocate ffmpeg frame structures
AVFrame* inpic = avcodec_alloc_frame();
AVFrame* outpic = avcodec_alloc_frame();
//avpicture_fill sets all of the data pointers in the AVFrame structures
//to the right places in the data buffers. It does not copy the data so
//the QImage and out_buffer still need to live after calling these.
//QImage::Format_RGB32 stores each pixel as a native-endian 0xffRRGGBB
//value, which corresponds to AV_PIX_FMT_RGB32 (BGRA byte order on
//little-endian machines), not AV_PIX_FMT_ARGB.
avpicture_fill((AVPicture*)inpic,
               img.bits(),
               AV_PIX_FMT_RGB32,
               img.width(),
               img.height());
avpicture_fill((AVPicture*)outpic,
               out_buffer,
               AV_PIX_FMT_YUV420P,
               img.width(),
               img.height());
//create the conversion context. you only need to do this once if
//you are going to do the same conversion multiple times.
SwsContext* ctx = sws_getContext(img.width(),
                                 img.height(),
                                 AV_PIX_FMT_RGB32,
                                 img.width(),
                                 img.height(),
                                 AV_PIX_FMT_YUV420P,
                                 SWS_BICUBIC,
                                 NULL, NULL, NULL);
//perform the conversion
sws_scale(ctx,
          inpic->data,
          inpic->linesize,
          0,
          img.height(),
          outpic->data,
          outpic->linesize);
//free the conversion context and the frame structures
sws_freeContext(ctx);
av_free(inpic);
av_free(outpic);
//...
//free output buffer when done with it
av_free(out_buffer);
Like I said, I haven't tested this code so it may require some tweaks to get it working.
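For spot-checking a few output pixels, a scalar reference can help. The sketch below uses a common integer approximation of the BT.601 limited-range matrix; `Yuv` and `rgb32_to_yuv_bt601` are hypothetical names, and both the matrix and the range are assumptions, so libswscale's results may differ by a step or two depending on its settings. The input is a 0xAARRGGBB value, which is how QImage::Format_RGB32 pixels read when accessed as 32-bit integers.

```cpp
#include <cassert>
#include <cstdint>

struct Yuv {
    std::uint8_t y, u, v;
};

// Convert one 0xAARRGGBB pixel to Y'CbCr using an integer approximation of
// BT.601 limited range (Y in 16..235, Cb/Cr in 16..240). Alpha is ignored.
Yuv rgb32_to_yuv_bt601(std::uint32_t argb) {
    const int r = static_cast<int>((argb >> 16) & 0xFF);
    const int g = static_cast<int>((argb >> 8) & 0xFF);
    const int b = static_cast<int>(argb & 0xFF);

    // +128 rounds to nearest before the shift. For U and V the +128 chroma
    // offset is folded in as 32768 (128 << 8) so the shifted value is never
    // negative.
    const int y = ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16;
    const int u = (-38 * r - 74 * g + 112 * b + 128 + 32768) >> 8;
    const int v = (112 * r - 94 * g - 18 * b + 128 + 32768) >> 8;

    return Yuv{static_cast<std::uint8_t>(y),
               static_cast<std::uint8_t>(u),
               static_cast<std::uint8_t>(v)};
}
```

Comparing a handful of pixels from outpic->data[0] against the y field is a quick way to catch a swapped channel order such as ARGB versus BGRA.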