Below is the RGB output of a supposed YUV420SP buffer. No conversion, I'm just displaying the YUV420SP as if it were RGB, just to see some patterns.
The image is in a single unsigned char* buffer of size width*height*3. So if this is indeed YUV420SP, then I should have the Y as a black and white image, and then UV interleaved. I think I should see the Y as a black and white image, but why does it repeat 3 times in my image? And should I see anything in the UV part?
Of course I tried to convert this buffer to RGB. I used https://github.com/andrechen/yuv2rgb/blob/master/yuv2rgb.h#L70 but I only get a completely black image.
The format looks like I420 (also called IYUV; YV12 is the same layout with the U and V planes swapped).
I420 is a fully planar YUV 4:2:0 format.
In YUV420, the Y color channel is the Luma (brightness) of each pixel.
U and V are the Chroma (color) channels.
The resolution of U and V is half that of Y along both axes (each is downsampled by a factor of 2 per axis).
I420 illustration:
Assume unsigned char* src is a pointer to the frame buffer, and the resolution is 640x480:
src -> YYYYYY
YYYYYY
YYYYYY
YYYYYY
src + 640*480 -> UUU
UUU
src + (320*240)*5 -> VVV
VVV
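The plane offsets in the illustration can be computed rather than hard-coded. This is only a minimal sketch: the helper name i420_layout and the struct are made up for illustration, and it assumes even dimensions and a tightly packed buffer with no row padding.

```cpp
#include <cassert>
#include <cstddef>

// Byte offsets of the three planes in a tightly packed I420 frame.
struct I420Layout {
    std::size_t y_offset;  // start of the Y plane (always 0)
    std::size_t u_offset;  // start of the U plane
    std::size_t v_offset;  // start of the V plane
    std::size_t total;     // total frame size in bytes
};

// Hypothetical helper: assumes even width/height and no row padding.
inline I420Layout i420_layout(std::size_t width, std::size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    const std::size_t y_size = width * height;
    const std::size_t chroma_size = (width / 2) * (height / 2);
    return I420Layout{0, y_size, y_size + chroma_size, y_size + 2 * chroma_size};
}
```

For 640x480 this gives a U plane at src + 640*480 and a V plane at src + (320*240)*5, matching the illustration.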
I used MATLAB code for restoring the RGB image from the image you have posted.
Here is the result:
MATLAB code (just for reference):
I = imread('Test.png');
R = I(:,:,1);G = I(:,:,2);B = I(:,:,3);
T = zeros(size(R,1), size(R,2)*3, 'uint8');
T(:, 1:3:end) = R;T(:, 2:3:end) = G;T(:, 3:3:end) = B;
T = T';T = T(:);
Y = T(1:640*480);
U = T(640*480+1:640*480+640*480/4);
V = T(640*480+640*480/4+1:640*480+(640*480/4)*2);
Y = (reshape(Y, [640, 480]))';
U = (reshape(U, [320, 240]))';
V = (reshape(V, [320, 240]))';
U = imresize(U, 2);
V = imresize(V, 2);
YUV = cat(3, Y, U, V);
RGB = ycbcr2rgb(YUV);
I've done a few YUV renderers before.
A YUV 420 buffer should contain width*height bytes for Y, followed by (width*height)/4 bytes for U and another (width*height)/4 bytes for V. Hence, your YUV byte buffer should be (width*height*3)/2 bytes in size.
To see the grey scale pattern as you describe it, you'd need to convert the "Y" bytes into 24-bit RGB, something like this:
// YUV_BYTES: some buffer of size (width*height*3)/2 with the YUV bytes copied in
// RGB_BYTES: some buffer of size width*height*3 that receives the grey image
// (the function name is only illustrative)
void y_to_grey_rgb(const unsigned char* YUV_BYTES, unsigned char* RGB_BYTES,
                   unsigned int width, unsigned int height)
{
    unsigned char* dst = RGB_BYTES;
    for (unsigned int r = 0; r < height; r++)
    {
        unsigned int row_offset = r*width;
        for (unsigned int c = 0; c < width; c++)
        {
            dst[0] = YUV_BYTES[row_offset + c]; // R
            dst[1] = YUV_BYTES[row_offset + c]; // G
            dst[2] = YUV_BYTES[row_offset + c]; // B
            dst += 3;
        }
    }
}
I think there's also an implicit assumption about the width and height of YUV images always being divisible by 4. Your renderer might draw this image upside down depending on your graphics library and platform.
I am accessing the image like so:
pDoc = GetDocument();
int iBitPerPixel = pDoc->_bmp->bitsperpixel; // used to see if grayscale(8 bits) or RGB (24 bits)
int iWidth = pDoc->_bmp->width;
int iHeight = pDoc->_bmp->height;
BYTE *pImg = pDoc->_bmp->point; // pointer used to point at pixels in the image
int Wp = iWidth;
const int area = iWidth * iHeight;
int r; // red pixel value
int g; // green pixel value
int b; // blue pixel value
int gray; // gray pixel value
BYTE *pImgGS = pImg; // grayscale image pixel array
and attempting to change the rgb image to gray like so:
// convert RGB values to grayscale at each pixel, then put in grayscale array
for (int i = 0; i<iHeight; i++)
for (int j = 0; j<iWidth; j++)
{
r = pImg[i*iWidth * 3 + j * 3 + 2];
g = pImg[i*iWidth * 3 + j * 3 + 1];
b = pImg[i*Wp + j * 3];
r * 0.299;
g * 0.587;
b * 0.144;
gray = std::round(r + g + b);
pImgGS[i*Wp + j] = gray;
}
finally, this is how I try to draw the image:
//draw the picture as grayscale
for (int i = 0; i < iHeight; i++) {
for (int j = 0; j < iWidth; j++) {
// this should set every corresponding grayscale picture to the current picture as grayscale
pImg[i*Wp + j] = pImgGS[i*Wp + j];
}
}
}
original image:
and the resulting image that I get is this:
First check if image type is 24 bits per pixels.
Second, allocate memory to pImgGS;
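BYTE* pImgGS = (BYTE*)malloc(sizeof(BYTE)*iWidth *iHeight);
Please refer to a description of the BMP file format to see how bmp data is saved. bmp images are saved upside down (bottom row first). Also, the first 54 bytes of the file are headers (the 14-byte BITMAPFILEHEADER followed by the 40-byte BITMAPINFOHEADER).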
Hence you should access values in following way,
double r,g,b;
unsigned char gray;
for (int i = 0; i<iHeight; i++)
{
for (int j = 0; j<iWidth; j++)
{
r = (double)pImg[(i*iWidth + j)*3 + 2];
g = (double)pImg[(i*iWidth + j)*3 + 1];
b = (double)pImg[(i*iWidth + j)*3 + 0];
r= r * 0.299;
g= g * 0.587;
b= b * 0.114; // the BT.601 blue weight is 0.114, not 0.144
gray = floor((r + g + b + 0.5));
pImgGS[(iHeight-i-1)*iWidth + j] = gray;
}
}
If there is padding present, then first determine the pitch (the number of bytes per row, including padding) and step from row to row by the pitch instead of by iWidth*3:
double r,g,b;
unsigned char gray;
long index=0;
for (int i = 0; i<iHeight; i++)
{
for (int j = 0; j<iWidth; j++)
{
r = (double)pImg[index+ (j)*3 + 2];
g = (double)pImg[index+ (j)*3 + 1];
b = (double)pImg[index+ (j)*3 + 0];
r= r * 0.299;
g= g * 0.587;
b= b * 0.114; // the BT.601 blue weight is 0.114, not 0.144
gray = floor((r + g + b + 0.5));
pImgGS[(iHeight-i-1)*iWidth + j] = gray;
}
index =index +pitch;
}
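For reference, the pitch of an uncompressed Windows bitmap can be derived from the width and bit depth, because each row is padded up to a multiple of 4 bytes. A small sketch (the function name bmp_pitch is my own, not part of any API):

```cpp
#include <cassert>
#include <cstddef>

// Bytes per row of an uncompressed Windows bitmap: each row is padded
// up to the next multiple of 4 bytes.
inline std::size_t bmp_pitch(std::size_t width, std::size_t bits_per_pixel)
{
    assert(bits_per_pixel > 0);
    return ((width * bits_per_pixel + 31) / 32) * 4;
}
```

So a 24bpp image that is 5 pixels wide has 15 bytes of pixel data per row but a pitch of 16.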
While drawing the image, note that pImg is 24bpp, so you need to copy each gray value three times, once to each of the R, G, and B channels. If you ultimately want to save the grayscale image in bmp format, then again you have to write the bmp data upside down, or you can simply skip the flip when converting to gray here:
pImgGS[(iHeight-i-1)*iWidth + j] = gray;
tl; dr:
Make one common path. Convert everything to 32-bits in a well-defined manner, and do not use image dimensions or coordinates. Refactor the YCbCr conversion ( = grey value calculation) into a separate function, this is easier to read and runs at exactly the same speed.
The lengthy stuff
First, you seem to have been confused by strides and offsets. The artefact that you see is because you accidentally wrote out one value (and in total only one third of the data) when you should have written three values.
One can get confused by this easily, but here it happened because you do useless work that you need not have done in the first place. You are iterating over coordinates left to right, top to bottom, and painstakingly calculating the correct byte offset in the data for each location.
However, you're doing a full-screen effect, so what you really want is iterate over the complete image. Who cares about the width and height? You know the beginning of the data, and you know the length. One loop over the complete blob will do the same, only faster, with less obscure code, and fewer opportunities of getting something wrong.
Next, 24-bit bitmaps are common as files, but they are rather unusual for in-memory representation because the format is nasty to access and unsuitable for hardware. Drawing such a bitmap will require a lot of work from the driver or the graphics hardware (it will work, but it will not work well). Therefore, 32-bit depth is usually a much better, faster, and more comfortable choice. It is much more "natural" to access program-wise.
You can rather trivially convert 24-bit to 32-bit. Iterate over the complete bitmap data and write out a complete 32-bit word for each 3 byte-tuple read. Windows bitmaps ignore the A channel (the highest-order byte), so just leave it zero, or whatever.
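A minimal sketch of that 24-to-32 conversion, assuming tightly packed B,G,R input with no row padding (a real bitmap may have padded rows, which this ignores). The function name is made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands tightly packed 24-bit B,G,R triples into 32-bit 0x00RRGGBB words.
// Assumes no row padding in the input; the top (A) byte is left at zero.
inline std::vector<std::uint32_t> bgr24_to_xrgb32(const std::uint8_t* in,
                                                  std::size_t input_size)
{
    assert(input_size % 3 == 0);
    std::vector<std::uint32_t> out;
    out.reserve(input_size / 3);
    for (std::size_t i = 0; i < input_size; i += 3) {
        const std::uint32_t b = in[i], g = in[i + 1], r = in[i + 2];
        out.push_back((r << 16) | (g << 8) | b);
    }
    return out;
}
```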
Also, there is no such thing as an 8-bit greyscale bitmap. This simply doesn't exist. Although there exist bitmaps that look like greyscale bitmaps, they are in reality paletted 8-bit bitmaps where (incidentally) the bmiColors member contains all greyscale values.
Therefore, unless you can guarantee that you will only ever process images that you have created yourself, you cannot just rely that e.g. the values 5 and 73 correspond to 5/255 and 73/255 greyscale intensity, respectively. That may be the case, but it is in general a wrong assumption.
In order to be on the safe side as far as correctness goes, you must convert your 8-bit greyscale bitmaps to real colors by looking up the indices (the bitmap's grey values are really indices) in the palette. Otherwise, you could be loading a greyscale image where the palette is the other way around (so 5 would mean 250 and 250 would mean 5), or a bitmap which isn't greyscale at all.
So... you want to convert 24-bit and you want to convert 8-bit bitmaps, both to 32-bit depth. That means you do all the annoying what-if stuff once at the beginning, and the rest is one identical common path. That's a good thing.
What you will be showing on-screen is always a 32-bit bitmap where the topmost byte is ignored, and the lower three are all the same value, resulting in what looks like a shade of grey. That's simple, and simple is good.
Note that if you do a BT.601 style YCbCr conversion (as indicated by your use of the constants 0.299, 0.587, and 0.144; the blue weight should really be 0.114 so that the three sum to 1), and if your 8-bit greyscale images are perceptive (this is something you must know, there is no way of telling from the file!), then for 100% correctness, you need to do the inverse transformation when converting from paletted 8-bit to RGB. Otherwise, your final result will look almost right, but not quite. If your 8-bit greyscales are linear, i.e. were created without using the above constants (again, you must know, you cannot tell from the image), you need to copy everything as-is (here, doing the conversion would make it look almost-but-not-quite right).
About the RGB-to-greyscale conversion, you do not need an extra greyscale bitmap just to hold values that you never need again afterwards. You can read the three color values from the loaded bitmap, calculate Y, and directly build the 32-bit ARGB word, which you then write out to the final bitmap. This saves one entirely useless round-trip to memory.
Something like this:
uint32_t* out = (uint32_t*) output_bitmap_data;
// "in" points at the loaded 24-bit data and must advance by one pixel per iteration
for(int i = 0; i < inputSize; i += 3, in += 3)
{
    uint8_t Y = calc_greyscale(in[0], in[1], in[2]);
    *out++ = (Y<<16) | (Y<<8) | Y;
}
Alternatively, you can also do the from-whatever-to-32 conversion, and then do the to-greyscale conversion in-place there. This, in turn, introduces an extra round-trip to memory, but the code becomes much, much easier overall.
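The calc_greyscale function used in the snippet is not shown anywhere, so here is one way it might look. This is an assumption on my part: BT.601 weights in integer arithmetic, with the parameters in blue, green, red order to match the byte order of a Windows bitmap.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical calc_greyscale: BT.601 luma, rounded to nearest.
// Parameter order is blue, green, red (assumed to match in[0], in[1], in[2]).
inline std::uint8_t calc_greyscale(std::uint8_t b, std::uint8_t g, std::uint8_t r)
{
    const unsigned y = (114u * b + 587u * g + 299u * r + 500u) / 1000u;
    assert(y <= 255u);  // the weights sum to 1000, so this always holds
    return static_cast<std::uint8_t>(y);
}
```

Integer weights avoid the float-to-byte conversion entirely; whether that matters for speed depends on your compiler and target.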
I am working with depth images retrieved from a Kinect, which are 16 bits. I ran into some difficulties making my own filters, due to the index or the size of the images.
I am working with textures because they allow working with images of any bit size.
So, I am trying to compute a simple gradient to understand what is wrong or why it doesn't work as I expected.
You can see that there is something wrong when I use the y direction.
For x:
For y:
That's my code:
typedef concurrency::graphics::texture<unsigned int, 2> TextureData;
typedef concurrency::graphics::texture_view<unsigned int, 2> Texture;
cv::Mat image = cv::imread("Depth247.tiff", CV_LOAD_IMAGE_ANYDEPTH);
//just a copy from another image
cv::Mat image2(image.clone() );
concurrency::extent<2> imageSize(640, 480);
int bits = 16;
const unsigned int nBytes = imageSize.size() * 2; // 614400
{
uchar* data = image.data;
// Result data
TextureData texDataD(imageSize, bits);
Texture texR(texDataD);
parallel_for_each(
imageSize,
[=](concurrency::index<2> idx) restrict(amp)
{
int x = idx[0];
int y = idx[1];
// 65535 is the maximum value that a 16-bit pixel can take (2^16 - 1)
int valX = (x / (float)imageSize[0]) * 65535;
int valY = (y / (float)imageSize[1]) * 65535;
texR.set(idx, valX);
});
//concurrency::graphics::copy(texR, image2.data, imageSize.size() *(bits / 8u));
concurrency::graphics::copy_async(texR, image2.data, imageSize.size() *(bits) );
cv::imshow("result", image2);
cv::waitKey(50);
}
Any help will be very appreciated.
Your indexes are swapped in two places.
int x = idx[0];
int y = idx[1];
Remember that C++AMP uses row-major indices for arrays. Thus idx[0] refers to row, y axis. This is why the picture you have for "For x" looks like what I would expect for texR.set(idx, valY).
Similarly the extent of image is also using swapped values.
int valX = (x / (float)imageSize[0]) * 65535;
int valY = (y / (float)imageSize[1]) * 65535;
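Here imageSize[0] is the extent along the first (row) index, so it is the number of rows, not the number of columns. For a 640x480 image the extent should therefore be declared as extent<2>(480, 640), with x divided by imageSize[1] and y divided by imageSize[0].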
I'm not familiar with OpenCV but I'm assuming that it also uses a row major format for cv::Mat. It might invert the y axis with 0, 0 top-left not bottom-left. The Kinect data may do similar things but again, it's row major.
There may be other places in your code that have the same issue but I think if you double check how you are using index and extent you should be able to fix this.
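To make the row-major convention concrete, here is the same horizontal gradient written as plain CPU code rather than C++ AMP. It is only a sketch with a made-up function name, but the indexing rule is the same: the first index is the row (y), the second is the column (x).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fills a rows x cols 16-bit image with a horizontal ramp (0 at the left edge).
// Row-major: the first index is the row (y), the second the column (x).
inline std::vector<std::uint16_t> horizontal_ramp(std::size_t rows, std::size_t cols)
{
    assert(rows > 0 && cols > 0);
    std::vector<std::uint16_t> img(rows * cols);
    for (std::size_t y = 0; y < rows; ++y) {
        for (std::size_t x = 0; x < cols; ++x) {
            // The value depends only on the column, normalised by the column count.
            img[y * cols + x] = static_cast<std::uint16_t>(x * 65535u / cols);
        }
    }
    return img;
}
```

For a 640x480 image you would call horizontal_ramp(480, 640): rows first, then columns.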
I am converting a RGB image to EXR format, using openexr, as follows:
int w = 1024;
int h = 768;
Array2D<Rgba> px (h, w);
QString fileName = "Penguins.jpg";
QImage Image = QImage(fileName);
QRgb c;
for (int y = 0; y < h; ++y)
{
for (int x = 0; x < w; ++x)
{
c = Image.pixel(x,y);
Rgba &p = px[y][x];
p.r = qRed(c)/255.0;
p.g = qGreen(c)/255.0;
p.b = qBlue(c)/255.0;
p.a = 1;
}
}
However, the converted image has different colors compared to the result from graphics editing software such as Adobe Photoshop. Below, you can see the given image, and the converted one (opened in Adobe Photoshop):
The RGB values contained in most common image formats such as JPEG are gamma corrected. The RGB values in OpenEXR are linear. You need to do a conversion on each pixel to make it linear.
The proper transformation to linear would be the sRGB formula. However for a quick test you can approximate it by taking the power of 2.2:
p.r = pow(qRed(c)/255.0, 2.2);
p.g = pow(qGreen(c)/255.0, 2.2);
p.b = pow(qBlue(c)/255.0, 2.2);
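If the 2.2 approximation is not close enough, the piecewise sRGB decoding formula can be used instead. A sketch (the function name is mine; the input is a channel value already scaled to [0, 1], e.g. qRed(c)/255.0):

```cpp
#include <cassert>
#include <cmath>

// Converts one sRGB-encoded channel value in [0, 1] to linear light
// using the piecewise sRGB transfer function.
inline double srgb_to_linear(double c)
{
    assert(c >= 0.0 && c <= 1.0);
    return (c <= 0.04045) ? c / 12.92
                          : std::pow((c + 0.055) / 1.055, 2.4);
}
```

Usage would be along the lines of p.r = srgb_to_linear(qRed(c)/255.0), and likewise for green and blue.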
In my application, once I load an image into an SDL_Surface object, I need to go through each RGB value in the image and replace it with another RGB value from a lookup function.
(rNew, gNew, bNew) = lookup(rCur, gCur, bCur);
It seems surface->pixels gets me the pixels. I would appreciate it if someone can explain to me how to obtain R, G, and B values from the pixel and replace it with the new RGB value.
Use built-in functions SDL_GetRGB and SDL_MapRGB
#include <stdint.h>
/*
...
*/
short int x = 200 ;
short int y = 350 ;
uint32_t pixel = *( ( uint32_t * )screen->pixels + y * screen->w + x ) ;
uint8_t r ;
uint8_t g ;
uint8_t b ;
SDL_GetRGB( pixel, screen->format , &r, &g, &b );
screen->format deals with the format so you don't have to.
You can also use SDL_Color instead of writing r,g,b variables separately.
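To give an idea of what such a format-aware unpack and repack does, here is a library-free sketch for 32-bit pixels with 8 bits per channel. The PixelLayout struct and both function names are invented for illustration and are not SDL types; in real code you would let SDL_GetRGB and SDL_MapRGB do this with surface->format.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of where each 8-bit channel sits in a 32-bit pixel.
struct PixelLayout {
    unsigned r_shift, g_shift, b_shift;
};

struct Rgb { std::uint8_t r, g, b; };

// Extracts the three channels from a packed pixel.
inline Rgb unpack(std::uint32_t pixel, const PixelLayout& f)
{
    assert(f.r_shift < 32 && f.g_shift < 32 && f.b_shift < 32);
    return Rgb{static_cast<std::uint8_t>((pixel >> f.r_shift) & 0xFFu),
               static_cast<std::uint8_t>((pixel >> f.g_shift) & 0xFFu),
               static_cast<std::uint8_t>((pixel >> f.b_shift) & 0xFFu)};
}

// Packs three channels back into a pixel (any alpha bits are left at zero).
inline std::uint32_t pack(Rgb c, const PixelLayout& f)
{
    return (static_cast<std::uint32_t>(c.r) << f.r_shift) |
           (static_cast<std::uint32_t>(c.g) << f.g_shift) |
           (static_cast<std::uint32_t>(c.b) << f.b_shift);
}
```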
Depending on the format of the surface, the pixels are arranged as an array in the buffer.
For typical 32 bit surfaces, it is R G B A R G B A
where each component is 8 bit, and every 4 are a pixel
First of all you need to lock the surface to safely access the data for modification. Then, to manipulate the array, you need to know the number of bits per pixel and the order of the channels (A, R, G, B). As Photon said, if it is 32 bits per pixel the array can be RGBARGBA...; if it is 24 the array can be RGBRGB... (it can also be BGRBGR..., blue first).
// I assume the signature of lookup to be
int lookup(Uint8 r, Uint8 g, Uint8 b, Uint8 *rnew, Uint8* gnew, Uint8* bnew);
SDL_LockSurface( surface );
/* Surface is locked */
/* Direct pixel access on surface here */
Uint8 byteincrement = surface->format->BytesPerPixel;
for (int y = 0; y < surface->h; y++)
{
    /* rows are surface->pitch bytes apart, which may be more than w * byteincrement */
    Uint8* row = (Uint8*)surface->pixels + y * surface->pitch;
    for (int x = 0; x < surface->w; x++)
    {
        Uint8* curpixeldata = row + x * byteincrement;
        /* assuming RGB byte order; you need to know the position of the channels, for instance it can be BGR */
        Uint8* rdata = curpixeldata + 0;
        Uint8* gdata = curpixeldata + 1;
        Uint8* bdata = curpixeldata + 2;
        /* those pointers point to r, g, b, use them as you want */
        lookup(*rdata, *gdata, *bdata, rdata, gdata, bdata);
    }
}
SDL_UnlockSurface( surface );
I have a TV capture card that has a feed coming in as a YUV format. I've seen other posts here similar to this question and tried every method suggested, but none of them produced a clear image. At the moment the best results were with the OpenCV cvCvtColor(scr, dst, CV_YUV2BGR) function call.
I am not familiar with the YUV format, and to be honest it confuses me a little, as it looks like it stores 4 channels but has only 3? I have included an image from the capture card in the hope that someone can understand what is going on and help me fill in the blanks.
The feed is coming in through a DeckLink Intensity Pro card and being accessed in a C++ application using OpenCV in a Windows 7 environment.
Update
I have looked at a wikipedia article regarding this information and attempted to use the formula in my application. Below is the code block with the output received from it. Any advice is greatly appreciated.
BYTE* pData;
videoFrame->GetBytes((void**)&pData);
m_nFrames++;
printf("Num Frames executed: %d\n", m_nFrames);
for(int i = 0; i < 1280 * 720 * 3; i=i+3)
{
m_RGB->imageData[i] = pData[i] + pData[i+2]*((1 - 0.299)/0.615);
m_RGB->imageData[i+1] = pData[i] - pData[i+1]*((0.114*(1-0.114))/(0.436*0.587)) - pData[i+2]*((0.299*(1 - 0.299))/(0.615*0.587));
m_RGB->imageData[i+2] = pData[i] + pData[i+1]*((1 - 0.114)/0.436);
}
In newer versions of OpenCV there is a built-in function that can be used to do the YUV to RGB conversion:
cvtColor(src,dst,CV_YUV2BGR_YUY2);
Specify the YUV format after the underscore, like this: CV_YUV2BGR_xxxx
It looks to me like you're decoding a YUV422 stream as YUV444. Try this modification to the code you provided:
for(int i = 0, j=0; i < 1280 * 720 * 3; i+=6, j+=4)
{
m_RGB->imageData[i] = pData[j] + pData[j+3]*((1 - 0.299)/0.615);
m_RGB->imageData[i+1] = pData[j] - pData[j+1]*((0.114*(1-0.114))/(0.436*0.587)) - pData[j+3]*((0.299*(1 - 0.299))/(0.615*0.587));
m_RGB->imageData[i+2] = pData[j] + pData[j+1]*((1 - 0.114)/0.436);
m_RGB->imageData[i+3] = pData[j+2] + pData[j+3]*((1 - 0.299)/0.615);
m_RGB->imageData[i+4] = pData[j+2] - pData[j+1]*((0.114*(1-0.114))/(0.436*0.587)) - pData[j+3]*((0.299*(1 - 0.299))/(0.615*0.587));
m_RGB->imageData[i+5] = pData[j+2] + pData[j+1]*((1 - 0.114)/0.436);
}
I'm not sure you've got your constants correct, but at worst your colors will be off - the image should be recognizable.
I use the following C++ code using OpenCV to convert YUV data (YUV_NV21) to an RGB image (BGR in OpenCV):
#include <fstream>
#include <opencv2/opencv.hpp>
int main()
{
    const int width = 1280;
    const int height = 800;
    std::ifstream file_in;
    file_in.open("../image_yuv_nv21_1280_800_01.raw", std::ios::binary);
    // read the whole raw file into memory
    std::filebuf *p_filebuf = file_in.rdbuf();
    size_t size = p_filebuf->pubseekoff(0, std::ios::end, std::ios::in);
    p_filebuf->pubseekpos(0, std::ios::in);
    char *buf_src = new char[size];
    p_filebuf->sgetn(buf_src, size);
    // NV21: height rows of Y followed by height/2 rows of interleaved VU
    cv::Mat mat_src = cv::Mat(height * 3 / 2, width, CV_8UC1, buf_src);
    cv::Mat mat_dst = cv::Mat(height, width, CV_8UC3);
    cv::cvtColor(mat_src, mat_dst, cv::COLOR_YUV2BGR_NV21);
    cv::imwrite("yuv.png", mat_dst);
    file_in.close();
    delete []buf_src;
    return 0;
}
The converted result looks like the image yuv.png.
You can find the test raw image and the whole project in my GitHub project.
It may be the wrong path, but many people (I mean, engineers) do mix YUV with YCbCr.
Try to
cvCvtColor(src, dsc, CV_YCbCr2RGB)
or CV_YCrCb2RGB or maybe a more exotic type.
The BlackMagic Intensity software returns YUVY' format in bmdFormat8BitYUV, so 2 source pixels are compressed into 4 bytes - I don't think OpenCV's cvtColor can handle this.
You can either do it yourself, or just call the Intensity software's ConvertFrame() function.
edit: Y U V is normally stored as Y1 U Y2 V, four bytes for every two pixels.
There is a Y (brightness) for each pixel but only a U and V (colour) for every alternate pixel in the row.
So if data is an unsigned char pointer to the start of the memory laid out as above:
pixel 1, Y = data[0] U = data[+1] V = data[+3]
pixel 2, Y = data[+2] U = data[+1] V = data[+3]
Then use the YUV->RGB coefficients you used in your sample code.
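Put together, a decode loop for that layout might look like the sketch below. The assumptions are mine: the byte order is Y1 U Y2 V as described, the chroma values are centred on 128, and the coefficients are the full-range BT.601 ones, which may not match what the capture card actually emits. The output order is R,G,B; swap the first and third writes for OpenCV's BGR.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Clamps to 0..255 and rounds to the nearest integer.
inline std::uint8_t clamp_to_byte(double v)
{
    return static_cast<std::uint8_t>(std::lround(std::min(255.0, std::max(0.0, v))));
}

// Converts packed Y1 U Y2 V data (2 bytes per pixel) to interleaved R,G,B bytes.
// Assumes full-range BT.601 coefficients with chroma centred on 128.
inline std::vector<std::uint8_t> yuyv_to_rgb(const std::uint8_t* data,
                                             std::size_t width, std::size_t height)
{
    assert(width % 2 == 0);
    std::vector<std::uint8_t> rgb(width * height * 3);
    std::uint8_t* dst = rgb.data();
    for (std::size_t i = 0; i < width * height * 2; i += 4) {
        const double u = data[i + 1] - 128.0;  // shared by both pixels
        const double v = data[i + 3] - 128.0;  // shared by both pixels
        for (std::size_t k = 0; k < 2; ++k) {
            const double y = data[i + 2 * k];  // data[i] then data[i + 2]
            *dst++ = clamp_to_byte(y + 1.402 * v);
            *dst++ = clamp_to_byte(y - 0.344136 * u - 0.714136 * v);
            *dst++ = clamp_to_byte(y + 1.772 * u);
        }
    }
    return rgb;
}
```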
Maybe someone is confused by the color models YCbCr and YUV.
OpenCV does not handle YCbCr. Instead it has YCrCb, and it is implemented the same way as YUV in OpenCV.
From the opencv sources https://github.com/Itseez/opencv/blob/2.4/modules/imgproc/src/color.cpp#L3830:
case CV_BGR2YCrCb: case CV_RGB2YCrCb:
case CV_BGR2YUV: case CV_RGB2YUV:
// ...
// index of the blue channel: 0 if it is BGR, 2 if it is RGB
bidx = code == CV_BGR2YCrCb || code == CV_BGR2YUV ? 0 : 2;
//... converting to YUV with the only difference that brings
// order of Blue and Red channels (variable bidx)
But there is one more thing to say.
There is currently a bug in conversion CV_BGR2YUV and CV_RGB2YUV in OpenCV branch 2.4.* .
At present, this formula is used in implementation:
Y = 0.299B + 0.587G + 0.114R
U = 0.492(R-Y)
V = 0.877(B-Y)
What it should be (according to wikipedia):
Y = 0.299R + 0.587G + 0.114B
U = 0.492(B-Y)
V = 0.877(R-Y)
The channels Red and Blue are misplaced in the implemented formula.
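For comparison, the corrected formulas written out directly. This is just a sketch of the arithmetic with names of my own choosing, not OpenCV's implementation, and it leaves U and V as signed values without the offset a real 8-bit encoding would add.

```cpp
#include <cassert>

struct Yuv { double y, u, v; };

// Applies Y = 0.299R + 0.587G + 0.114B, U = 0.492(B-Y), V = 0.877(R-Y)
// to R, G, B values on a common scale (for example 0..255).
inline Yuv rgb_to_yuv(double r, double g, double b)
{
    assert(r >= 0.0 && g >= 0.0 && b >= 0.0);
    const double y = 0.299 * r + 0.587 * g + 0.114 * b;
    return Yuv{y, 0.492 * (b - y), 0.877 * (r - y)};
}
```

A neutral grey or white input gives U and V of zero, and a pure red input gives a positive V and a negative U, which is a quick way to check that the red and blue channels are not swapped.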
Possible workaround to convert BGR->YUV while the bug is not fixed :
cv::Mat source = cv::imread(filename, CV_LOAD_IMAGE_COLOR);
cv::Mat yuvSource;
cvtColor(source, yuvSource, cv::COLOR_BGR2RGB); // rearranges B and R in the appropriate order
cvtColor(yuvSource, yuvSource, cv::COLOR_BGR2YUV);
// yuvSource will contain here correct image in YUV color space