sws_scale, YUV to RGB conversion - c++

I need to convert YUV to RGB. I also need the RGB values to be in the limited range (16-235).
I am trying to use the sws_scale function for this task.
My code is below. After conversion, a black pixel comes out as (0, 0, 0) instead of (16, 16, 16).
Is there an option that tells sws_scale to produce limited-range output?
AVFrame* frameRGB = avFrameConvertPixelFormat(_decodedBuffer[i].pAVFrame, AV_PIX_FMT_RGB24);
AVFrame* Decoder::avFrameConvertPixelFormat(const AVFrame* src, AVPixelFormat dstFormat) {
int width = src->width;
int height = src->height;
AVFrame* dst = allocPicture(dstFormat, width, height);
SwsContext* conversion = sws_getContext(width,
height,
(AVPixelFormat)src->format,
width,
height,
dstFormat,
SWS_FAST_BILINEAR,
NULL,
NULL,
NULL);
sws_scale(conversion, src->data, src->linesize, 0, height, dst->data, dst->linesize);
sws_freeContext(conversion);
dst->format = dstFormat;
dst->width = src->width;
dst->height = src->height;
return dst;
}
I also tried converting a YUV pixel to RGB manually with the formula below and got the correct result: YUV (16, 128, 128) gives RGB (16, 16, 16).
cmpR = y + 1.402 * (v - 128);
cmpG = y - 0.3441 * (u - 128) - 0.7141 * (v - 128);
cmpB = y + 1.772 * (u - 128);
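For reference, the manual formula above can be written as a small self-contained function. This is only a sketch: the names clampToByte and yuvToRgbFullRange are made up for illustration, and the clamping and rounding are assumptions, since the post does not say how out-of-range results are handled.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>

// Clamp a floating-point value to [0, 255] and round to the nearest integer.
// (Assumed behaviour; the post does not show how overflow is handled.)
static std::uint8_t clampToByte(double v) {
    return static_cast<std::uint8_t>(std::lround(std::min(255.0, std::max(0.0, v))));
}

// Full-range YUV -> RGB using the coefficients from the post.
// No range expansion is applied, so Y = 16 with neutral chroma stays 16.
std::array<std::uint8_t, 3> yuvToRgbFullRange(std::uint8_t y, std::uint8_t u, std::uint8_t v) {
    const double cb = u - 128.0;
    const double cr = v - 128.0;
    return {
        clampToByte(y + 1.402 * cr),
        clampToByte(y - 0.3441 * cb - 0.7141 * cr),
        clampToByte(y + 1.772 * cb),
    };
}
```

Because the formula never rescales Y, limited-range black (16, 128, 128) maps to (16, 16, 16) and limited-range white (235, 128, 128) maps to (235, 235, 235).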

You may set the source format to "full scale" YUVJ.
As far as I know, sws_scale has no option for selecting Studio RGB as output format.
Changing the input format is the best solution I can think of.
The color conversion formula of "JPEG: YUV -> RGB" is the same as the formula in your post.
Examples for setting the source format:
If src->format is AV_PIX_FMT_YUV420P, set the format to AV_PIX_FMT_YUVJ420P.
If src->format is AV_PIX_FMT_YUV422P, set the format to AV_PIX_FMT_YUVJ422P.
If src->format is AV_PIX_FMT_YUV444P, set the format to AV_PIX_FMT_YUVJ444P.
If src->format is AV_PIX_FMT_YUV440P, set the format to AV_PIX_FMT_YUVJ440P.
I know this solution does not cover all the possibilities, and some output pixels might fall outside the range [16, 235], so it's not the most general solution...
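If values outside [16, 235] are a concern, one alternative (not part of the answer above, and only a sketch) is to let sws_scale produce ordinary full-range RGB from the unmodified source format and then compress the result linearly into the studio range. The helper names toStudioRange and compressToStudioRange are hypothetical, and packed 8-bit RGB24 is assumed. The extra rescale costs a little precision.

```cpp
#include <cstddef>
#include <cstdint>

// Map a full-range component (0..255) linearly onto studio range (16..235),
// rounding to the nearest integer.
constexpr std::uint8_t toStudioRange(std::uint8_t full) {
    return static_cast<std::uint8_t>(16 + (full * 219 + 127) / 255);
}

// Compress a packed RGB24 image in place. `linesize` is the number of bytes
// per row; it may be larger than width * 3 when rows are padded, and the
// padding bytes are left untouched.
void compressToStudioRange(std::uint8_t* data, int linesize, int width, int height) {
    for (int y = 0; y < height; ++y) {
        std::uint8_t* row = data + static_cast<std::ptrdiff_t>(y) * linesize;
        for (int x = 0; x < width * 3; ++x) {
            row[x] = toStudioRange(row[x]);
        }
    }
}
```

With the frame from the question, the call would look like compressToStudioRange(frameRGB->data[0], frameRGB->linesize[0], frameRGB->width, frameRGB->height). Full-range black (0, 0, 0) then becomes (16, 16, 16) and white becomes (235, 235, 235).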

YUV to RGB conversion using FFmpeg: a lot of information has already been given above. For completeness, I am re-sharing the code together with the missing allocPicture() function and the headers and libraries it needs. It works like a charm for me. Thanks to @Валентин Никин and @Rotem for most of the information and code.
Headers:
#include <stdio.h>
#include <libavutil/frame.h>
#include <libavutil/imgutils.h>
#include <libswscale/swscale.h>
Link FFmpeg libraries:
libswscale
libavutil
static AVFrame* allocPicture(enum AVPixelFormat pix_fmt, int width, int height)
{
    // Allocate the frame structure (no pixel buffers yet)
    AVFrame* frame = av_frame_alloc();
    if (frame == NULL)
    {
        fprintf(stderr, "av_frame_alloc failed\n");
        return NULL;
    }
    // Allocate the pixel buffers and fill in data[] and linesize[]
    if (av_image_alloc(frame->data, frame->linesize, width, height, pix_fmt, 1) < 0)
    {
        fprintf(stderr, "av_image_alloc failed\n");
        av_frame_free(&frame);
        return NULL;
    }
    frame->width = width;
    frame->height = height;
    frame->format = pix_fmt;
    return frame;
}
static AVFrame* avFrameConvertPixelFormat(const AVFrame* src, enum AVPixelFormat dstFormat)
{
int width = src->width;
int height = src->height;
AVFrame* dst = allocPicture(dstFormat, width, height);
struct SwsContext* conversion = sws_getContext(width,
height,
(enum AVPixelFormat)src->format,
width,
height,
dstFormat,
SWS_FAST_BILINEAR | SWS_FULL_CHR_H_INT | SWS_ACCURATE_RND,
NULL,
NULL,
NULL);
sws_scale(conversion, src->data, src->linesize, 0, height, dst->data, dst->linesize);
sws_freeContext(conversion);
dst->format = dstFormat;
dst->width = src->width;
dst->height = src->height;
return dst;
}
// convert yuv420p10le to rgb24 (or any other RGB format)
AVFrame* rgbFrame = avFrameConvertPixelFormat(frame, AV_PIX_FMT_RGB24);

Related

C/C++ ffmpeg output is low quality and blurry

I've made a program that takes a video file as input, edits it using OpenGL/GLFW, then encodes the edited video. The program works and I get the desired output, but the video quality is really low and I don't know how to adjust it. The editing seems fine, since the display in the GLFW window is high resolution. I don't think it's about scaling, since the program just reads the pixels from the GLFW window and passes them to the encoder, and the GLFW window is high-res.
Here is what the glfw window looks like when the program is running:
I'm encoding in YUV420P formatting, but the information I'm getting from the glfw window is in RGBA format. I'm getting the data using:
glReadPixels(0, 0,
gl_width, gl_height,
GL_RGBA, GL_UNSIGNED_BYTE,
(GLvoid*) state.glBuffer
);
I simply got the muxing.c example from ffmpeg's docs and edited it slightly so it looks something like this:
AVFrame* video_encoder::get_video_frame(OutputStream *ost)
{
AVCodecContext *c = ost->enc;
/* check if we want to generate more frames */
if (av_compare_ts(ost->next_pts, c->time_base,
(float) STREAM_DURATION / 1000, (AVRational){ 1, 1 }) > 0)
return NULL;
/* when we pass a frame to the encoder, it may keep a reference to it
* internally; make sure we do not overwrite it here */
if (av_frame_make_writable(ost->frame) < 0)
exit(1);
if (c->pix_fmt != AV_PIX_FMT_YUV420P) {
/* as we only generate a YUV420P picture, we must convert it
* to the codec pixel format if needed */
if (!ost->sws_ctx) {
ost->sws_ctx = sws_getContext(c->width, c->height,
AV_PIX_FMT_YUV420P,
c->width, c->height,
c->pix_fmt,
SCALE_FLAGS, NULL, NULL, NULL);
if (!ost->sws_ctx) {
fprintf(stderr,
"Could not initialize the conversion context\n");
exit(1);
}
}
#if __AUDIO_ONLY
image_for_audio_only(ost->tmp_frame, ost->next_pts, c->width, c->height);
#endif
sws_scale(ost->sws_ctx, (const uint8_t * const *) ost->tmp_frame->data,
ost->tmp_frame->linesize, 0, c->height, ost->frame->data,
ost->frame->linesize);
} else {
//This is where I set the information I got from the glfw window.
set_frame_yuv_from_rgb(ost->frame, ost->sws_ctx);
}
ost->frame->pts = ost->next_pts++;
return ost->frame;
}
void video_encoder::set_frame_yuv_from_rgb(AVFrame *frame, struct SwsContext *sws_context) {
const int in_linesize[1] = { 4 * width };
//uint8_t* dest[4] = { rgb_data, NULL, NULL, NULL };
sws_context = sws_getContext(
width, height, AV_PIX_FMT_RGBA,
width, height, AV_PIX_FMT_YUV420P,
SWS_BICUBIC, 0, 0, 0);
sws_scale(sws_context, (const uint8_t * const *)&rgb_data, in_linesize, 0,
height, frame->data, frame->linesize);
}
rgb_data is the buffer I got from the glfw window. It's simply a uint8_t*.
And at the end of all this, here is what the encoded output looks like when run through mplayer:
It's much lower quality compared to the glfw window. How can I improve the quality of the video?
Here are YouTube's recommended encoding settings for better quality:
https://support.google.com/youtube/answer/1722171
Make sure to use a high enough bitrate and GOP size, e.g. 5 Mbps and 60 respectively.

Failing to properly initialize AVFrame for sws_scale conversion

I'm decoding video using FFMpeg, and want to edit the decoded frames using OpenGL, but in order to do that I need to convert the data in AVFrame from YUV to RGB.
In order to do that I create a new AVFrame:
AVFrame *inputFrame = av_frame_alloc();
AVFrame *outputFrame = av_frame_alloc();
av_image_alloc(outputFrame->data, outputFrame->linesize, width, height, AV_PIX_FMT_RGB24, 1);
av_image_fill_arrays(outputFrame->data, outputFrame->linesize, NULL, AV_PIX_FMT_RGB24, width, height, 1);
Create a conversion context:
struct SwsContext *img_convert_ctx = sws_getContext(width, height, AV_PIX_FMT_YUV420P,
width, height, AV_PIX_FMT_RGB24,
0, NULL, NULL, NULL);
And then try to convert it to RGB:
sws_scale(img_convert_ctx, (const uint8_t *const *)&inputFrame->data, inputFrame->linesize, 0, inputFrame->height, outputFrame->data, outputFrame->linesize);
But this causes a "[swscaler @ 0x123f15000] bad dst image pointers" error at run time. When I went through FFmpeg's source I found that the reason is that outputFrame's data wasn't initialized, but I don't understand how it should be initialized.
All existing answers or tutorials that I found (see example) seem to use deprecated APIs, and it's unclear how to use the new APIs. I'd appreciate any help.
Here's how I call sws_scale:
image buf2((buf.w + 15)/16*16, buf.h, 3);
sws_scale(sws_ctx, (const uint8_t * const *)frame->data, frame->linesize, 0, c->height, (uint8_t * const *)buf2.c, &buf2.ys);
There are two differences here:
You pass &inputFrame->data, but it should be inputFrame->data without the address-of operator.
You don't have to allocate a second frame structure. The sws_scale doesn't care about it. It just needs a chunk of memory of the proper size (and maybe alignment).
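The "chunk of memory of the proper size" can be sketched without any FFmpeg types. The following is an illustration under assumptions: RgbBuffer and alignUp are hypothetical names, the output is packed RGB24 (3 bytes per pixel), and the 16-byte row alignment is a conservative guess rather than a documented requirement. Note that it pads the row size in bytes, whereas the snippet above rounds the width in pixels up to a multiple of 16.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Round `value` up to the next multiple of `alignment` (alignment must be > 0).
constexpr int alignUp(int value, int alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Plain destination buffer for packed RGB24 output.
struct RgbBuffer {
    int width;
    int height;
    int stride;                        // bytes per row, including padding
    std::vector<std::uint8_t> pixels;  // stride * height bytes, zero-initialized

    RgbBuffer(int w, int h, int alignment = 16)
        : width(w), height(h), stride(alignUp(w * 3, alignment)),
          pixels(static_cast<std::size_t>(stride) * h) {}
};
```

To use such a buffer as the destination of sws_scale, put pixels.data() in the first slot of the destination pointer array and stride in the first slot of the stride array. Using four-entry arrays with the remaining slots set to NULL and 0 is the safe choice.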
In my case the av_image_alloc / av_image_fill_arrays did not create the frame->data pointers.
Here is how I did it, not sure if everything is correct, but it works:
d->m_FrameCopy = av_frame_alloc();
uint8_t* buffer = NULL;
int numBytes;
// Determine required buffer size and allocate buffer
numBytes = avpicture_get_size(
AV_PIX_FMT_RGB24, d->m_Frame->width, d->m_Frame->height);
buffer = (uint8_t*)av_malloc(numBytes * sizeof(uint8_t));
avpicture_fill(
(AVPicture*)d->m_FrameCopy,
buffer,
AV_PIX_FMT_RGB24,
d->m_Frame->width,
d->m_Frame->height);
d->m_FrameCopy->format = AV_PIX_FMT_RGB24;
d->m_FrameCopy->width = d->m_Frame->width;
d->m_FrameCopy->height = d->m_Frame->height;
d->m_FrameCopy->channels = d->m_Frame->channels;
d->m_FrameCopy->channel_layout = d->m_Frame->channel_layout;
d->m_FrameCopy->nb_samples = d->m_Frame->nb_samples;

image created on Cimg display different on a pdf when saved using GDI+ with pdf created with jagPDF

What I need to do is very simple: plot a vector using CImg, save the graph as a JPG, and add the JPG to a PDF document using JagPDF. To save a CImg image as JPG, the program uses an external program called ImageMagick.
I wanted to avoid that program and use GDI+ instead, by first saving the CImg image as a BMP (which it does natively) and then converting the BMP to a JPG.
The MCVE program looks like this:
#include "CImg.h"
#include <jagpdf/api.h>
#include <vector>
using namespace jag;
using namespace cimg_library;
int main(int argc, char** const argv)
{
const float x0 = 0;
const float x1 = 9;
const int resolution = 5000;
// Create plot data.
CImg<double> values(1, resolution, 1, 1, 0);
const unsigned int r = resolution - 1;
for (int i1 = 0; i1 < resolution; ++i1)
{
double xtime = x0 + i1*(x1 - x0) / r;
values(0, i1) = 2 * sin(xtime);
}
CImg<unsigned char> graph;
graph.assign(750, 240, 1, 3, 255);
static const unsigned char black[] = { 0, 0, 0 }, white[] = { 255, 255, 255 };
static const unsigned char red[] = { 255, 200, 200 }, bred[] = { 255, 0, 0 };
graph.draw_grid(6, 6, 0, 0, false, true, red, 10.0f, 0xFFFFFFFF, 0xFFFFFFFF);
graph.draw_grid(30, 30, 0, 0, false, true, bred, 10.0f, 0xFFFFFFFF, 0xFFFFFFFF);
graph.draw_graph(values, black, 1, 1, 1, 2, -2, 0xFFFFFFFF);;
//////////////Method 1: Using Image Magick////////////////
graph.save_jpeg("plot2.jpg");
pdf::Document doc(pdf::create_file("report.pdf"));
doc.page_start(848.68, 597.6);
pdf::Image imag2 = doc.image_load_file("plot2.jpg");
doc.page().canvas().image(imag2, 50, 50);
doc.page_end();
doc.finalize();
//////////////Method 2: Using GDI+////////////////
graph.save("plot.bmp");
SaveFile();
pdf::Document doc2(pdf::create_file("report2.pdf"));
doc2.page_start(848.68, 597.6);
pdf::Image imag = doc2.image_load_file("plot.jpg");
doc2.page().canvas().image(imag, 50, 50);
doc2.page_end();
doc2.finalize();
return 0;
}
With SaveFile() being the following function using GDI+ to convert from plot.bmp to plot.jpg
#include <windows.h>
#include <objidl.h>
#include <gdiplus.h>
#include "GdiplusHelperFunctions.h"
#pragma comment (lib,"Gdiplus.lib")
VOID SaveFile()
{
// Initialize GDI+.
Gdiplus::GdiplusStartupInput gdiplusStartupInput;
ULONG_PTR gdiplusToken;
GdiplusStartup(&gdiplusToken, &gdiplusStartupInput, NULL);
CLSID encoderClsid;
Status stat;
Image* image = new Gdiplus::Image(L"plot.bmp");
// Get the CLSID of the JPEG encoder.
GetEncoderClsid(L"image/jpeg", &encoderClsid);
stat = image->Save(L"plot.jpg", &encoderClsid, NULL);
if (stat == Ok)
printf("plot.jpg was saved successfully\n");
else
printf("Failure: stat = %d\n", stat);
delete image;
GdiplusShutdown(gdiplusToken);
}
Both methods save JPGs that, according to their properties, have the same size, but the first puts the image correctly in the PDF while the second puts a huge image in the PDF, even though they are supposed to be the same size. How can I fix this?
Attached are screenshots of report1 and report2.
SOLUTION
With your suggestions, I was able to modify the SaveFile function to control the DPI. I am posting the new code in case someone needs it.
VOID SaveFile()
{
// Initialize GDI+.
Gdiplus::GdiplusStartupInput gdiplusStartupInput;
ULONG_PTR gdiplusToken;
GdiplusStartup(&gdiplusToken, &gdiplusStartupInput, NULL);
CLSID encoderClsid;
Status stat;
EncoderParameters encoderParameters;
ULONG quality;
Gdiplus::Bitmap* bitmap = new Gdiplus::Bitmap(L"plot.bmp");
Gdiplus::REAL dpi = 96;
bitmap->SetResolution(dpi,dpi);
// Get the CLSID of the JPEG encoder.
GetEncoderClsid(L"image/jpeg", &encoderClsid);
encoderParameters.Count = 1;
encoderParameters.Parameter[0].Guid = EncoderQuality;
encoderParameters.Parameter[0].Type = EncoderParameterValueTypeLong;
encoderParameters.Parameter[0].NumberOfValues = 1;
quality = 100;
encoderParameters.Parameter[0].Value = &quality;
stat = bitmap->Save(L"plot.jpg", &encoderClsid, &encoderParameters);
if (stat == Ok)
printf("plot.jpg was saved successfully\n");
else
printf("Failure: stat = %d\n", stat);
delete bitmap;
GdiplusShutdown(gdiplusToken);
return;
}
I would guess ImageMagick includes some extras that filter the image to fit the canvas. The smartass.
I'd try resizing the image before exporting to JPEG. You might give this guide a go. It basically shows how to resize the bitmap (the example also preserves the width/height ratio). The goal is to give the image exactly the size you need on the canvas.
Gdiplus::Bitmap* GDIPlusImageProcessor::ResizeClone(Bitmap *bmp, INT width, INT height)
{
UINT o_height = bmp->GetHeight();
UINT o_width = bmp->GetWidth();
INT n_width = width;
INT n_height = height;
double ratio = ((double)o_width) / ((double)o_height);
if (o_width > o_height) {
// Resize down by width
n_height = static_cast<UINT>(((double)n_width) / ratio);
} else {
n_width = static_cast<UINT>(n_height * ratio);
}
Gdiplus::Bitmap* newBitmap = new Gdiplus::Bitmap(n_width, n_height, bmp->GetPixelFormat());
Gdiplus::Graphics graphics(newBitmap);
graphics.DrawImage(bmp, 0, 0, n_width, n_height);
return newBitmap;
}
Then save it using the encoder. Also, you may want to check whether you need to set the quality of the resulting JPEG using EncoderParameters, as shown in the official documentation.
// Get the CLSID of the JPEG encoder.
GetEncoderClsid(L"image/jpeg", &encoderClsid);
// Before we call Image::Save, we must initialize an
// EncoderParameters object. The EncoderParameters object
// has an array of EncoderParameter objects. In this
// case, there is only one EncoderParameter object in the array.
// The one EncoderParameter object has an array of values.
// In this case, there is only one value (of type ULONG)
// in the array. We will let this value vary from 0 to 100.
encoderParameters.Count = 1;
encoderParameters.Parameter[0].Guid = EncoderQuality;
encoderParameters.Parameter[0].Type = EncoderParameterValueTypeLong;
encoderParameters.Parameter[0].NumberOfValues = 1;
// Save the image as a JPEG with quality level 0.
quality = 0;
encoderParameters.Parameter[0].Value = &quality;
stat = image->Save(L"Shapes001.jpg", &encoderClsid, &encoderParameters);
if(stat == Ok)
wprintf(L"%s saved successfully.\n", L"Shapes001.jpg");
else
wprintf(L"%d Attempt to save %s failed.\n", stat, L"Shapes001.jpg");
// Save the image as a JPEG with quality level 50.
quality = 50;
encoderParameters.Parameter[0].Value = &quality;
stat = image->Save(L"Shapes050.jpg", &encoderClsid, &encoderParameters);
if(stat == Ok)
wprintf(L"%s saved successfully.\n", L"Shapes050.jpg");
else
wprintf(L"%d Attempt to save %s failed.\n", stat, L"Shapes050.jpg");
EDIT: JagPDF also says image DPI is taken into account when painting, so we are probably on the right path.
Let's say we would like to tile a region of the page with our image.
To do so we need to know the image dimensions. Because width() and
height() return size in pixels we need to recalculate these to user
space units.
Image DPI is taken into account when the image is painted onto a
canvas. An image usually specifies its DPI. If it is not so a value of
images.default_dpi is used.
img_width = img.width() / img.dpi_x() * 72
img_height = img.height() / img.dpi_y() * 72
for x in range(7):
    for y in range(15):
        canvas.image(img, 90 + x * img_width, 100 + y * img_height)
You might try changing DPI using this SO answer.
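The pixel-to-user-space conversion quoted above explains why the DPI stored in the file changes the painted size. A minimal sketch of the arithmetic follows; pixelsToPoints is a hypothetical helper and not part of JagPDF.

```cpp
// Convert a pixel dimension to PDF user-space units (1/72 inch),
// mirroring the quoted formula: pixels / dpi * 72.
constexpr double pixelsToPoints(int pixels, double dpi) {
    return pixels / dpi * 72.0;
}
```

For the 750 x 240 pixel plot from the question, a file tagged with 96 DPI is painted at 562.5 x 180 units, which fits on the 848.68 x 597.6 page. The same pixels tagged with 72 DPI would be painted at 750 x 240 units, and any lower DPI value makes the image larger still.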
If I understand your question correctly, your aim is to remove the dependency on ImageMagick.
You can do that more simply by telling CImg to use its built-in support for JPEG. All you need to do is
define cimg_use_jpeg
link with libjpeg
So your compilation command becomes:
g++ -Dcimg_use_jpeg ... -ljpeg

Process AVFrame using opencv mat causing encoding error

I'm trying to decode a video file using FFmpeg, grab the AVFrame object, convert it to an OpenCV Mat object, do some processing, then convert it back to an AVFrame object and encode it back to a video file.
The program runs, but it produces bad results.
I keep getting errors like "top block unavailable for requested intra mode at 7 19", "error while decoding MB 7 19, bytestream 358", "concealing 294 DC, 294AC, 294 MV errors in P frame", etc.
And the resulting video has glitches all over it, like this:
I'm guessing it's because of my AVFrame-to-Mat and Mat-to-AVFrame methods. Here they are:
//unspecified function
temp_rgb_frame = avcodec_alloc_frame();
int numBytes = avpicture_get_size(PIX_FMT_RGB24, width, height);
uint8_t * frame2_buffer = (uint8_t *)av_malloc(numBytes * sizeof(uint8_t));
avpicture_fill((AVPicture*)temp_rgb_frame, frame2_buffer, PIX_FMT_RGB24, width, height);
void CoreProcessor::Mat2AVFrame(cv::Mat **input, AVFrame *output)
{
//create a AVPicture frame from the opencv Mat input image
avpicture_fill((AVPicture *)temp_rgb_frame,
(uint8_t *)(*input)->data,
AV_PIX_FMT_RGB24,
(*input)->cols,
(*input)->rows);
//convert the frame to the color space and pixel format specified in the sws context
sws_scale(
rgb_to_yuv_context,
temp_rgb_frame->data,
temp_rgb_frame->linesize,
0, height,
((AVPicture *)output)->data,
((AVPicture *)output)->linesize);
(*input)->release();
}
void CoreProcessor::AVFrame2Mat(AVFrame *pFrame, cv::Mat **mat)
{
sws_scale(
yuv_to_rgb_context,
((AVPicture*)pFrame)->data,
((AVPicture*)pFrame)->linesize,
0, height,
((AVPicture *)temp_rgb_frame)->data,
((AVPicture *)temp_rgb_frame)->linesize);
*mat = new cv::Mat(pFrame->height, pFrame->width, CV_8UC3, temp_rgb_frame->data[0]);
}
void CoreProcessor::process_frame(AVFrame *pFrame)
{
cv::Mat *mat = NULL;
AVFrame2Mat(pFrame, &mat);
Mat2AVFrame(&mat, pFrame);
}
Am I doing something wrong with the memory? Because if I remove the processing part, just decode and then encode the frame, the result is correct.
Well, it turns out I made a mistake in the initialization of temp_rgb_frame. It should be like this:
temp_rgb_frame = avcodec_alloc_frame();
int numBytes = avpicture_get_size(PIX_FMT_RGB24, width, height);
uint8_t * frame2_buffer = (uint8_t *)av_malloc(numBytes * sizeof(uint8_t));
avpicture_fill((AVPicture*)temp_rgb_frame, frame2_buffer, PIX_FMT_RGB24, width, height);

sws_scale YUV --> RGB distorted image

I want to convert YUV420P image (received from H.264 stream) to RGB, while also resizing it, using sws_scale.
The size of the original image is 480 × 800. Just converting with same dimensions works fine.
But when I try to change the dimensions, I get a distorted image, with the following pattern:
changing to 481 × 800 will yield a distorted B&W image which looks like it's cut in the middle
482 × 800 will be even more distorted
483 × 800 is distorted but in color
484 × 800 is ok (scaled correctly).
The pattern continues: scaling only works correctly when the difference between the source and destination widths is divisible by 4.
Here's a sample code of the way that I decode and convert the image. All methods show "success".
int srcX = 480;
int srcY = 800;
int dstX = 481; // or 482, 483 etc
int dstY = 800;
AVFrame* avFrameYUV = avcodec_alloc_frame();
avpicture_fill((AVPicture *)avFrameYUV, decoded_yuv_frame, PIX_FMT_YUV420P, srcX , srcY);
AVFrame *avFrameRGB = avcodec_alloc_frame();
AVPacket avPacket;
av_init_packet(&avPacket);
avPacket.size = read; // size of raw data
avPacket.data = raw_data; // raw data before decoding to YUV
int frame_decoded = 0;
int decoded_length = avcodec_decode_video2(g_avCodecContext, avFrameYUV, &frame_decoded, &avPacket);
int size = dstX * dstY * 3;
struct SwsContext *img_convert_ctx = sws_getContext(srcX, srcY, SOURCE_FORMAT, dstX, dstY, PIX_FMT_BGR24, SWS_BICUBIC, NULL, NULL, NULL);
avpicture_fill((AVPicture *)avFrameRGB, rgb_frame, PIX_FMT_RGB24, dstX, dstY);
sws_scale(img_convert_ctx, avFrameYUV->data, avFrameYUV->linesize, 0, srcY, avFrameRGB->data, avFrameRGB->linesize);
// draws the resulting frame with windows BitBlt
DrawBitmap(hdc, dstX, dstY, rgb_frame, size);
sws_freeContext(img_convert_ctx);
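When you create a Windows bitmap, every row must occupy a multiple of 4 bytes. For a 24-bit image that holds automatically when the width is a multiple of 4.
So either use a width such as 480, 484, 488, 492 ..., or pad each row.
Here is a macro that rounds the row size in bytes up to a multiple of 4: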
#define WIDTHBYTES(bits) (((bits) + 31) / 32 * 4)
int main()
{
    BITMAPFILEHEADER bmFileHeader;
    BITMAPINFOHEADER bmInfoHeader;
    // load image
    // ...
    // Row size in bytes, padded to a multiple of 4.
    // width is the image width in pixels; biBitCount is the bits per pixel.
    int tempWidth = WIDTHBYTES(width * bmInfoHeader.biBitCount);
    return 0;
}
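The stride arithmetic can be checked against the widths from the question. The sketch below uses hypothetical helper names (dibStride, padRowsForDib), assumes 24-bit packed pixels, and shows one possible way to copy tightly packed rows into padded rows before handing them to a bitmap API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Bytes per bitmap row: the row size in bits rounded up to a whole number
// of 32-bit words (same arithmetic as the WIDTHBYTES macro).
constexpr int dibStride(int width, int bitsPerPixel) {
    return (width * bitsPerPixel + 31) / 32 * 4;
}

// Copy tightly packed 24-bit rows into a buffer whose rows are padded to
// the bitmap stride. Padding bytes are left as zero.
std::vector<std::uint8_t> padRowsForDib(const std::uint8_t* packed, int width, int height) {
    const int srcStride = width * 3;
    const int dstStride = dibStride(width, 24);
    std::vector<std::uint8_t> padded(static_cast<std::size_t>(dstStride) * height, 0);
    for (int y = 0; y < height; ++y) {
        std::memcpy(padded.data() + static_cast<std::size_t>(y) * dstStride,
                    packed + static_cast<std::size_t>(y) * srcStride,
                    static_cast<std::size_t>(srcStride));
    }
    return padded;
}
```

For a width of 481 the packed row is 1443 bytes but the bitmap expects 1444, so every row is shifted by one more byte than the previous one, which matches the skewed picture described in the question. For 480 and 484 the packed and padded sizes coincide (1440 and 1452). Instead of copying, it should also be possible to pass the padded stride as the destination linesize so that sws_scale writes padded rows directly.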
I hope you solve the problem.