I am trying to convert an RGB image to YUV.
I am loading the image using OpenCV.
I am calling the function as follows:
//I know IplImage is outdated
IplImage* im = cvLoadImage("1.jpg", 1);
//....
bgr2yuv(im->imageData, dst, im->width, im->height);
The function that converts the color image to a YUV image is given below.
I am using FFmpeg to do that.
void bgr2yuv(unsigned char *src, unsigned char *dest, int w, int h)
{
AVFrame *yuvIm = avcodec_alloc_frame();
AVFrame *rgbIm = avcodec_alloc_frame();
avpicture_fill(rgbIm, src, PIX_FMT_BGR24, w, h);
avpicture_fill(yuvIm, dest, PIX_FMT_YUV420P, w, h);
av_register_all();
struct SwsContext * imgCtx = sws_getCachedContext(imgCtx,
w, h,(::PixelFormat)PIX_FMT_BGR24,
w, h,(::PixelFormat)PIX_FMT_YUV420P,
SWS_BICUBIC, NULL, NULL, NULL);
sws_scale(imgCtx, rgbIm->data, rgbIm->linesize,0, h, yuvIm->data, yuvIm->linesize);
av_free(yuvIm);
av_free(rgbIm);
}
I am getting wrong output after the conversion.
I think this is due to the row padding in the IplImage
(my input image width is not a multiple of 4).
I updated the linesize variable, but even after that I am not getting correct output.
It works fine when I use images whose width is a multiple of 4.
Can anybody tell me what the problem in the code is?
Check IplImage::align or IplImage::widthStep and use these to set AVFrame::linesize. For the RGB frame, for example, you would set:
frame->linesize[0] = img->widthStep;
The layout of the dst array can be whatever you want, it depends on how you're using it afterwards.
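To illustrate what widthStep (or linesize) means, here is a minimal sketch that walks a padded buffer row by row and copies only the useful bytes of each row. The function name packRows and its parameters are made up for this illustration; for an 8-bit, 3-channel IplImage, rowBytes would be width * 3 and srcStride would be widthStep.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy `height` rows of `rowBytes` useful bytes each from a padded buffer
// (a new row starts every `srcStride` bytes) into a tightly packed vector.
// packRows is a hypothetical helper, not part of OpenCV or FFmpeg.
std::vector<std::uint8_t> packRows(const std::uint8_t* src, std::size_t rowBytes,
                                   std::size_t srcStride, std::size_t height)
{
    assert(src != nullptr && rowBytes > 0 && srcStride >= rowBytes);
    std::vector<std::uint8_t> packed(rowBytes * height);
    for (std::size_t y = 0; y < height; ++y) {
        // Skip the padding at the end of each source row.
        std::memcpy(packed.data() + y * rowBytes, src + y * srcStride, rowBytes);
    }
    return packed;
}
```

Setting linesize[0] to widthStep avoids this extra copy, because the scaler is then told where each source row really starts.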
We need to do the following:
rgbIm->linesize[0] = im->widthStep;
But I think the output data from sws_scale() is not padded to a multiple of 4.
So when you copy this data (dest) back into an IplImage, this will
cause problems when displaying, saving, etc.
So we need to set widthStep = width as follows:
IplImage* yuvImage = cvCreateImageHeader(cvGetSize(im), 8, 1);
yuvImage->widthStep = yuvImage->width;
yuvImage->imageData = dest;
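For sizing the dest buffer, the following sketch computes the plane sizes of an unpadded YUV420P image. It assumes tightly packed planes (alignment 1) and chroma dimensions rounded up for odd sizes; the struct and function names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Layout of a tightly packed YUV420P image (hypothetical helper type).
struct Yuv420pLayout {
    int yStride;        // bytes per luma row
    int cStride;        // bytes per chroma row (U and V each)
    std::size_t ySize;  // bytes in the Y plane
    std::size_t cSize;  // bytes in each of the U and V planes
    std::size_t total;  // bytes in the whole image
};

Yuv420pLayout yuv420pLayout(int width, int height)
{
    assert(width > 0 && height > 0);
    Yuv420pLayout l;
    l.yStride = width;
    l.cStride = (width + 1) / 2;           // chroma is subsampled 2x horizontally
    const int cHeight = (height + 1) / 2;  // and 2x vertically
    l.ySize = static_cast<std::size_t>(l.yStride) * height;
    l.cSize = static_cast<std::size_t>(l.cStride) * cHeight;
    l.total = l.ySize + 2 * l.cSize;
    return l;
}
```

Under these assumptions the Y plane occupies the first width * height bytes of dest, which is why a single-channel header with widthStep equal to width can view it.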
How can I load RAW 16-bit grayscale image with FreeImage?
I have an unsigned char* buffer with raw data. I know its dimensions in pixels and I know it is 16-bit grayscale.
I'm trying to load it with
FIBITMAP* bmp = FreeImage_ConvertFromRawBits(buffer, 1000, 1506, 2000, 16, 0, 0, 0);
and get a broken RGB888 image. It is unclear what color masks I should use for grayscale, as it has only one channel.
After many experiments I found a partially working solution with FreeImage_ConvertFromRawBitsEx:
FIBITMAP* bmp = FreeImage_ConvertFromRawBitsEx(true, buffer, FIT_UINT16, 1000, 1506, 2000, 16, 0xFFFF, 0xFFFF, 0xFFFF);
(thanks @1201ProgramAlarm for the hint about masks).
This way FreeImage loads the data, but in some semi-custom format. Most of the conversion and saving functions (tried: JPG, PNG, BMP, TIF) fail.
As I can't load the data in its native 16-bit format, I preferred to convert it to 8-bit grayscale:
unsigned short* buffer = new unsigned short[1000 * 1506];
// load data
unsigned char* buffer2 = new unsigned char[1000 * 1506];
for (int i = 0; i < 1000 * 1506; i++)
buffer2[i] = (unsigned char)(buffer[i] / 256.f);
FIBITMAP* bmp = FreeImage_ConvertFromRawBits(buffer2, 1000, 1506, 1000, 8, 0xFF, 0xFF, 0xFF, true);
This is really not the best solution, and I don't even want to mark it as the right answer (I will wait for something better). But after this, the data is in a format that FreeImage handles well, and it can save or convert it to anything.
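The down-conversion can also be done with an integer shift instead of a float division, which gives the same result for unsigned 16-bit input. This is only a sketch; toGray8 is a made-up name and the input is assumed to be in native byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reduce 16-bit grayscale samples to 8 bits by keeping the high byte.
// toGray8 is a hypothetical helper, not a FreeImage function.
std::vector<std::uint8_t> toGray8(const std::uint16_t* src, std::size_t count)
{
    assert(src != nullptr || count == 0);
    std::vector<std::uint8_t> dst(count);
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = static_cast<std::uint8_t>(src[i] >> 8);  // same as integer division by 256
    }
    return dst;
}
```

The resulting vector can be passed to FreeImage_ConvertFromRawBits with a pitch equal to the width and bpp = 8, as in the call above.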
Concerning your issue: I have read this from their PDF documentation FreeImage1370.pdf:
FreeImage_ConvertFromRawBits
1 4 8 16 24 32
DLL_API FIBITMAP *DLL_CALLCONV FreeImage_ConvertFromRawBits(BYTE *bits, int width, int
height, int pitch, unsigned bpp, unsigned red_mask, unsigned green_mask, unsigned
blue_mask, BOOL topdown FI_DEFAULT(FALSE));
Converts a raw bitmap somewhere in memory to a FIBITMAP. The parameters in this
function are used to describe the raw bitmap. The first parameter is a pointer to the start of
the raw bits. The width and height parameter describe the size of the bitmap. The pitch
defines the total width of a scanline in the source bitmap, including padding bytes that may be
applied. The bpp parameter tells FreeImage what the bit depth of the bitmap is. The
red_mask, green_mask and blue_mask parameters tell FreeImage the bit-layout of the color
components in the bitmap. The last parameter, topdown, will store the bitmap top-left pixel
first when it is TRUE or bottom-left pixel first when it is FALSE.
When the source bitmap uses a 32-bit padding, you can calculate the pitch using the
following formula:
int pitch = ((((bpp * width) + 31) / 32) * 4);
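As a quick check of that formula against the numbers in the question, here it is wrapped in a small function (pitch32 is a name chosen for this example only):

```cpp
#include <cassert>

// Bytes per scanline when every row is padded to a 32-bit boundary,
// using the formula quoted from the FreeImage documentation.
int pitch32(int width, int bpp)
{
    assert(width > 0 && bpp > 0);
    return ((bpp * width + 31) / 32) * 4;
}
```

For a 1000-pixel-wide, 16-bit image this gives 2000, which matches the pitch passed in the question; a width of 1001 would need 2004.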
In the code you are showing:
FIBITMAP* bmp = FreeImage_ConvertFromRawBits(buffer, 1000, 1506, 2000, 16, 0, 0, 0);
You have the appropriate FIBITMAP* return type and you pass in your buffer of raw bits. The 2nd and 3rd parameters are the width and height (width = 1000, height = 1506), and the 4th parameter is the pitch (pitch = 2000; if the bitmap uses 32-bit padding, refer to the note above). The 5th parameter is the bit depth, which you have as bpp = 16. The next 3 parameters are your RGB color masks, which you set to 0. The last parameter is a bool flag for the orientation of the image:
if (topdown) {
    // the top-left pixel is stored first
} else {
    // the bottom-left pixel is stored first
}
You omit this value, so it takes its default, FALSE.
Without more code showing how you read the file, parse the header information, etc. to prepare your buffer, it is hard to tell where else there may be an error. But from what you provided, I think you need to check the color channel masks for grayscale images.
EDIT - I found another PDF for FreeImage from stanford.edu that refers to an older version, 3.13.1. However, the function declaration doesn't look like it has changed, and they provide examples for both FreeImage_ConvertToRawBits and FreeImage_ConvertFromRawBits:
// this code assumes there is a bitmap loaded and
// present in a variable called ‘dib’
// convert a bitmap to a 32-bit raw buffer (top-left pixel first)
// --------------------------------------------------------------
FIBITMAP *src = FreeImage_ConvertTo32Bits(dib);
FreeImage_Unload(dib);
// Allocate a raw buffer
int width = FreeImage_GetWidth(src);
int height = FreeImage_GetHeight(src);
int scan_width = FreeImage_GetPitch(src);
BYTE *bits = (BYTE*)malloc(height * scan_width);
// convert the bitmap to raw bits (top-left pixel first)
FreeImage_ConvertToRawBits(bits, src, scan_width, 32,
FI_RGBA_RED_MASK, FI_RGBA_GREEN_MASK, FI_RGBA_BLUE_MASK,
TRUE);
FreeImage_Unload(src);
// convert a 32-bit raw buffer (top-left pixel first) to a FIBITMAP
// ----------------------------------------------------------------
FIBITMAP *dst = FreeImage_ConvertFromRawBits(bits, width, height, scan_width,
32, FI_RGBA_RED_MASK, FI_RGBA_GREEN_MASK, FI_RGBA_BLUE_MASK, FALSE);
I think this should help you with your question about the bit masks for the color channels in a grayscale image.
You already mentioned the FreeImage_ConvertFromRawBitsEx() function, which was added at some point between FreeImage v3.8 and v3.17, but are you calling it correctly? I was able to use this function with 16-bit grayscale data:
int nBytesPerRow = nWidth * 2;
int nBitsPerPixel = 16;
FIBITMAP* pFIB = FreeImage_ConvertFromRawBitsEx(TRUE, pImageData, FIT_UINT16, nWidth, nHeight, nBytesPerRow, nBitsPerPixel, 0, 0, 0, TRUE);
Note that nBytesPerRow and nBitsPerPixel have to be specified correctly for the 16-bit data. Also, I believe the color mask parameters are irrelevant for this data, since it is monochrome.
EDIT: I noticed that you said that saving the 16-bit data did not work correctly. That may be due to the file formats themselves. The only file format that I have found to be compatible with 16-bit grayscale data is TIFF. So, if you have 16-bit grayscale data, you can save a TIFF with FreeImage_Save() but you cannot save a BMP.
Is there a way to convert from RGB to YUYV (YUV 4:2:2) format? I noticed that OpenCV has the reverse operation, but not RGB to YUYV for some reason. Maybe someone can point to code that does that (even outside of the OpenCV library)?
UPDATE
I found the libyuv library, which may work for this purpose by doing a BGR to ARGB conversion and then ARGB to YUY2 (hopefully this is the same as YUYV 4:2:2). But it doesn't seem to work. Do you happen to know what the yuyv buffer dimensions/type should look like? What is its stride?
To clarify, YUYV and YUY2 are the same format, if it helps.
UPDATE 2
Here is my code of using libyuv library:
Mat frame;
// Convert original image im from BGR to BGRA for further use in libyuv
cvtColor(im, frame, CV_BGR2BGRA);
// Actually libyuv requires ARGB (i.e. reverse of BGRA), so I swap channels here
int from_to[] = { 0,3, 1,2, 2,1, 3,0 };
mixChannels(&frame, 1, &frame, 1, from_to, 4);
// This is the most confusing part. Not sure what argb_stride suppose to be - length of a row in bytes or size of single value in the array?
const uint8_t* argb_data = frame.data;
int argb_stride = 8;
// Also it is not clear what size of yuyv frame should be since we duplicate one Y
Mat yuyv(frame.rows, frame.cols, CV_8UC2);
uint8_t* yuyv_data = yuyv.data;
int yuyv_stride = 16;
// Do actual conversion
libyuv::ARGBToYUY2(argb_data, argb_stride, yuyv_data, yuyv_stride,
frame.cols, frame.rows);
// Then I feed yuyv_data to video stream buffer and see green or purple image instead of video stream.
UPDATE 3
Mat frame;
cvtColor(im, frame, CV_BGR2BGRA);
// ARGB
int from_to[] = { 0,3, 1,2, 2,1, 3,0 };
Mat rgba(frame.size(), frame.type());
mixChannels(&frame, 1, &rgba, 1, from_to, 4);
const uint8_t* argb_data = rgba.data;
int argb_stride = rgba.cols*4;
Mat yuyv(rgba.rows, rgba.cols, CV_8UC2);
uint8_t* yuyv_data = yuyv.data;
int yuyv_stride = rgba.cols * 2; // 2 bytes per pixel in YUYV
int res = libyuv::ARGBToYUY2(argb_data, argb_stride, yuyv_data, yuyv_stride, rgba.cols, rgba.rows);
It appears that although the method is called ARGBToYUY2, it requires the BGRA channel order in memory (not the reverse).
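To make the buffer dimensions and strides concrete, here is a library-free sketch of a BGR to YUYV conversion. It assumes BT.601 limited-range integer coefficients (libyuv's exact constants and rounding may differ), an even width, and chroma averaged over each pixel pair; bgrToYuyv and its helpers are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// BT.601 limited-range integer approximations (an assumption of this sketch).
// The +16/+128 offsets are added before the shift so the sum is never negative.
static int lumaOf(int r, int g, int b) { return (66 * r + 129 * g + 25 * b + 128 + (16 << 8)) >> 8; }
static int cbOf(int r, int g, int b)   { return (-38 * r - 74 * g + 112 * b + 128 + (128 << 8)) >> 8; }
static int crOf(int r, int g, int b)   { return (112 * r - 94 * g - 18 * b + 128 + (128 << 8)) >> 8; }

// Convert packed 8-bit BGR rows (bgrStride bytes apart) to packed YUYV.
// The output stride is width * 2 bytes: every pixel pair becomes Y0 U Y1 V.
std::vector<std::uint8_t> bgrToYuyv(const std::uint8_t* bgr, int width, int height,
                                    std::size_t bgrStride)
{
    assert(bgr != nullptr && width > 0 && height > 0 && width % 2 == 0);
    assert(bgrStride >= static_cast<std::size_t>(width) * 3);
    const std::size_t yuyvStride = static_cast<std::size_t>(width) * 2;
    std::vector<std::uint8_t> out(yuyvStride * height);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* s = bgr + y * bgrStride;
        std::uint8_t* d = out.data() + y * yuyvStride;
        for (int x = 0; x < width; x += 2, s += 6, d += 4) {
            const int b0 = s[0], g0 = s[1], r0 = s[2];
            const int b1 = s[3], g1 = s[4], r1 = s[5];
            d[0] = static_cast<std::uint8_t>(lumaOf(r0, g0, b0));
            d[1] = static_cast<std::uint8_t>((cbOf(r0, g0, b0) + cbOf(r1, g1, b1) + 1) / 2);
            d[2] = static_cast<std::uint8_t>(lumaOf(r1, g1, b1));
            d[3] = static_cast<std::uint8_t>((crOf(r0, g0, b0) + crOf(r1, g1, b1) + 1) / 2);
        }
    }
    return out;
}
```

The key point for the stride question is that strides are row lengths in bytes: width * 4 for a 4-channel source, width * 3 for BGR, and width * 2 for YUYV.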
I want to transfer OpenGL framebuffer data to AVCodec as fast as possible.
I've already converted RGB to YUV with a shader and read it back with glReadPixels.
I still need to fill the AVFrame data manually. Is there a better way?
AVFrame *frame;
// Y
frame->data[0][y*frame->linesize[0]+x] = data[i*3];
// U
frame->data[1][y*frame->linesize[1]+x] = data[i*3+1];
// V
frame->data[2][y*frame->linesize[2]+x] = data[i*3+2];
You can use sws_scale.
In fact, you don't need shaders for converting RGB->YUV. Believe me, it's not gonna have a very different performance.
swsContext = sws_getContext(WIDTH, HEIGHT, AV_PIX_FMT_RGBA, WIDTH, HEIGHT, AV_PIX_FMT_YUV420P, SWS_BICUBIC, 0, 0, 0 ); // use the pixel format your codec expects; YUV420P is the usual choice
sws_scale(swsContext, (const uint8_t * const *)sourcePictureRGB.data, sourcePictureRGB.linesize, 0, codecContext->height, destinyPictureYUV.data, destinyPictureYUV.linesize);
The data in destinyPictureYUV will be ready to go to the codec.
In this sample, destinyPictureYUV is the AVFrame you want to fill up. Try to setup like this:
AVFrame * frame;
AVPicture destinyPictureYUV;
avpicture_alloc(&destinyPictureYUV, codecContext->pix_fmt, codecContext->width, codecContext->height);
// THIS is what you want probably
*reinterpret_cast<AVPicture *>(frame) = destinyPictureYUV;
With this setup you CAN ALSO fill up with the data you already converted to YUV in the GPU if you desire... you can choose the way you want.
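If the frame is filled by hand from data already converted on the GPU, the copy should honour each plane's linesize, which may be larger than the width. The sketch below assumes the readback is interleaved YUV with 3 bytes per pixel and that the target frame is planar YUV444 (full-resolution chroma), which is what the indexing in the question implies; splitPackedYuv444 is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Split interleaved Y,U,V bytes (3 per pixel, tightly packed rows) into three
// planes whose rows are linesize[i] bytes apart. planes/linesize play the role
// of AVFrame::data and AVFrame::linesize; this helper is hypothetical.
void splitPackedYuv444(const std::uint8_t* packed, int width, int height,
                       std::uint8_t* const planes[3], const int linesize[3])
{
    assert(packed != nullptr && width > 0 && height > 0);
    for (int i = 0; i < 3; ++i) {
        assert(planes[i] != nullptr && linesize[i] >= width);
    }
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* px = packed + static_cast<std::size_t>(y) * width * 3;
        for (int x = 0; x < width; ++x, px += 3) {
            planes[0][static_cast<std::size_t>(y) * linesize[0] + x] = px[0];  // Y
            planes[1][static_cast<std::size_t>(y) * linesize[1] + x] = px[1];  // U
            planes[2][static_cast<std::size_t>(y) * linesize[2] + x] = px[2];  // V
        }
    }
}
```

For a subsampled format such as YUV420P, the U and V planes would have to be written at half resolution instead.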
I am quite surprised that I'm not able to find any method that loads an image from raw data. Is there an elegant way to do it? I just need to create a QImage or similar from raw bitmap binary data (no header).
You can create a QImage object from raw data with the ctor that takes an array of uchars.
You need to specify the format of the data given to the QImage (RGB, RGBA, Indexed, etc.)
QImage ( uchar * data, int width, int height, Format format )
QImage ( const uchar * data, int width, int height, Format format )
QImage ( uchar * data, int width, int height, int bytesPerLine, Format format )
QImage ( const uchar * data, int width, int height, int bytesPerLine,
Format format )
http://doc.qt.digia.com/qt/qimage.html
E.g.:
uchar* data = getDataFromSomewhere();
QImage img(data, width, height, QImage::Format_ARGB32);
Hope that helps.
Your question is not clear. Use QPixmap and QByteArray; it's very easy.
QPixmap pic;
pic.loadFromData(array); // array is a QByteArray holding the image bytes
label->setPixmap(pic); // do whatever you want with the image; here I set it on a label
I'm using OpenCV to extract a subimage of a scanned document and would like to use tesseract to perform OCR over this subimage.
I found out that I can use two methods for text recognition in tesseract, but so far I wasn't able to find a working solution.
A.) How can I convert a cv::Mat into a PIX*?
(PIX* is a datatype of leptonica)
Based on vasile's code below, this is essentially my current code:
cv::Mat image = cv::imread("c:/image.png");
cv::Mat subImage = image(cv::Rect(50, 200, 300, 100));
int depth;
if(subImage.depth() == CV_8U)
depth = 8;
//other cases not considered yet
PIX* pix = pixCreateHeader(subImage.size().width, subImage.size().height, depth);
pix->data = (l_uint32*) subImage.data;
tesseract::TessBaseAPI tess;
STRING text;
if(tess.ProcessPage(pix, 0, 0, &text))
{
std::cout << text.string();
}
While it doesn't crash or anything, the OCR result still is wrong. It should recognize one word of my sample image, but instead it returns some non-readable characters.
The method PIX_HEADER doesn't exist, so I used pixCreateHeader, but it doesn't take the number of channels as an argument. So how can I set the number of channels?
B.) How can I use cv::Mat for TesseractRect() ?
Tesseract offers another method for text recognition with this signature:
char * TessBaseAPI::TesseractRect (
const UINT8 * imagedata,
int bytes_per_pixel,
int bytes_per_line,
int left,
int top,
int width,
int height
)
Currently I am using the following code, but it also returns non-readable characters (although different ones than the code above):
char* cr = tess.TesseractRect(
subImage.data,
subImage.channels(),
subImage.channels() * subImage.size().width,
0,
0,
subImage.size().width,
subImage.size().height);
tesseract::TessBaseAPI tess;
cv::Mat sub = image(cv::Rect(50, 200, 300, 100));
tess.SetImage((uchar*)sub.data, sub.size().width, sub.size().height, sub.channels(), sub.step1());
tess.Recognize(0);
const char* out = tess.GetUTF8Text();
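One likely reason the TesseractRect call in the question misbehaves is the bytes-per-line argument: a sub-image that is only a view into a larger image keeps the parent's row stride, so channels * width is too small. The sketch below models this with a plain struct; ImageView and subView are invented for this illustration and are not OpenCV types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// A non-owning view of 8-bit image data (hypothetical type for illustration).
struct ImageView {
    const std::uint8_t* data;  // first byte of the top-left pixel
    int width;                 // pixels per row
    int height;                // rows
    int channels;              // bytes per pixel
    std::size_t stride;        // bytes between the starts of consecutive rows
};

// Return a view of the rectangle (x, y, w, h) without copying any pixels.
// The stride stays that of the parent, so it is larger than w * channels.
ImageView subView(const ImageView& parent, int x, int y, int w, int h)
{
    assert(x >= 0 && y >= 0 && w > 0 && h > 0);
    assert(x + w <= parent.width && y + h <= parent.height);
    const std::uint8_t* start = parent.data
        + static_cast<std::size_t>(y) * parent.stride
        + static_cast<std::size_t>(x) * parent.channels;
    return ImageView{start, w, h, parent.channels, parent.stride};
}
```

This is why the snippet above passes the Mat's own step as bytes per line rather than recomputing it from the width.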
For anybody using the JavaCPP presets of OpenCV/Tesseract, here is what works:
Mat img = imread("file.jpg");
Mat gray = new Mat();
cvtColor(img, gray, CV_BGR2GRAY);
// api is a Tesseract client which is initialised
api.SetImage(gray.data().asBuffer(),gray.size().width(),gray.size().height(),gray.channels(),gray.size1());
cv::Mat image = cv::imread(argv[1]);
cv::Mat gray;
cv::cvtColor(image, gray, CV_BGR2GRAY);
PIX *pixS = pixCreate(gray.size().width, gray.size().height, 8);
for(int i=0; i<gray.rows; i++)
for(int j=0; j<gray.cols; j++)
pixSetPixel(pixS, j,i, (l_uint32) gray.at<uchar>(i,j));
First, make a deep copy of your subImage, so that it will be stored in a continuous memory block:
cv::Mat subImage = image(cv::Rect(50, 200, 300, 100)).clone();
Then, init a PIX header (I don't know how) with the correct parameters.
// ???? Put your own constructor here.
PIX* pix = new PIX_HEADER(width, height, channels, depth);
OR, create it manually:
PIX pix;
pix.w = subImage.cols;
...
Then set the pix data pointer to the subImage data pointer
pix.data = subImage.data;
Finally, make sure your subImage object does not go out of scope before you finish your work with pix.
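One caveat with sharing the data pointer: as far as I know, Leptonica keeps pixels packed in 32-bit words, ordered from the most significant byte to the least, with each row padded to a whole word. On a little-endian machine a plain byte array therefore does not have the same layout. The sketch below packs 8-bit gray rows into that word layout by hand; packGray8ToWords is a made-up helper and the layout description is my understanding rather than something taken from this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack 8-bit gray pixels into 32-bit words, first pixel in the most
// significant byte, each row padded to a whole number of words.
// Writes the number of words per row to *wordsPerLine.
std::vector<std::uint32_t> packGray8ToWords(const std::uint8_t* src, int width, int height,
                                            std::size_t srcStride, int* wordsPerLine)
{
    assert(src != nullptr && wordsPerLine != nullptr);
    assert(width > 0 && height > 0 && srcStride >= static_cast<std::size_t>(width));
    const int wpl = (width * 8 + 31) / 32;
    std::vector<std::uint32_t> words(static_cast<std::size_t>(wpl) * height, 0u);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::uint32_t value = src[static_cast<std::size_t>(y) * srcStride + x];
            // Pixel x goes into word x / 4, at byte position 3 - (x % 4) from the LSB.
            words[static_cast<std::size_t>(y) * wpl + x / 4] |= value << (8 * (3 - x % 4));
        }
    }
    *wordsPerLine = wpl;
    return words;
}
```

Copying pixel by pixel with pixSetPixel, or passing the Mat directly to SetImage, sidesteps this layout issue entirely.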