Image format and unsigned char arrays - C++

I'm developing imaging functions (yes, I REALLY want to reinvent the wheel, for various reasons). I'm copying bitmaps into unsigned char arrays, but I'm having some problems with byte size versus image pixel format.
For example, a lot of images come as 24 bits per pixel for RGB representation. That's rather easy: every pixel has 3 unsigned chars (bytes) and everyone is happy.
However, sometimes the RGB comes in wider types, like 48 bits per pixel, i.e. 16 bits per color channel. Copying the whole image into the byte array works fine, but it's when I want to retrieve the data that things get blurry.
Right now I have the following code to get a single pixel from grayscale images:
unsigned char NImage::get_pixel(int i, int j)
{
    return this->data[j * pitch + i];
}
NImage::data is an unsigned char array.
This returns a single byte. How can I access my data array with different pixel formats?

You should do it like this:
unsigned short NImage::get_pixel(int i, int j)
{
    int offset = 2 * (j * pitch + i);
    // image pixels are usually stored in big-endian format
    return data[offset] * 256 + data[offset + 1];
}

At 48 bits per pixel, with 16 bits per color, you can't return an 8-bit value; you must return a 16-bit short or unsigned short, otherwise the data gets truncated.
You might try developing overloaded functions to handle this. Note, though, that C++ cannot overload on the return type alone, so in practice that means distinctly named accessors or a template parameter, as sketched below.
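A minimal sketch of that idea, assuming the same big-endian sample order as above (the accessor names get_pixel8/get_pixel16 are illustrative, not from the original code):
unsigned char NImage::get_pixel8(int i, int j)
{
    return data[j * pitch + i];
}

unsigned short NImage::get_pixel16(int i, int j)
{
    int offset = 2 * (j * pitch + i);
    return (unsigned short)((data[offset] << 8) | data[offset + 1]);
}
The caller picks the accessor that matches the image's pixel format, which the image object must track.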

You have to know how big your pixels are.
If it's 24-bit RGB, then your 100x100 pixel image (say) will have 30,000 unsigned chars (100 * 100 * 3 bytes).
unsigned char NImage::get_red_component(int i, int j)
{
    return this->data[3 * (j * pitch + i)];
}

unsigned char NImage::get_green_component(int i, int j)
{
    return this->data[3 * (j * pitch + i) + 1];
}

unsigned char NImage::get_blue_component(int i, int j)
{
    return this->data[3 * (j * pitch + i) + 2];
}
Or for 48-bit RGB:
unsigned char NImage::get_red_MSB(int i, int j)
{
    return this->data[6 * (j * pitch + i)];
}

unsigned char NImage::get_red_LSB(int i, int j)
{
    return this->data[6 * (j * pitch + i) + 1];
}
... etc etc ...
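If you'd rather fetch the whole 16-bit sample in one call, the two bytes can be combined (a sketch, assuming the MSB-first byte order used above; get_red16 is an illustrative name):
unsigned short NImage::get_red16(int i, int j)
{
    int offset = 6 * (j * pitch + i);
    return (unsigned short)((this->data[offset] << 8) | this->data[offset + 1]);
}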

What's the problem with 48 bits per pixel? Simply read your data as uint16_t or unsigned short and you get the 16 bits extracted properly.
It gets worse for more complicated bit patterns, e.g. RGB565, where you'll need to extract the data using bitmasks.
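For instance, unpacking an RGB565 pixel might look like this (a sketch; the 5-6-5 layout shown is the common one, but verify it against your source):
#include <cstdint>

// RGB565: rrrrrggg gggbbbbb packed into one 16-bit value
void unpack_rgb565(uint16_t pixel, uint8_t& r, uint8_t& g, uint8_t& b)
{
    r = (pixel >> 11) & 0x1F; // 5 bits
    g = (pixel >> 5) & 0x3F;  // 6 bits
    b = pixel & 0x1F;         // 5 bits
    // expand to the full 8-bit range by replicating the high bits
    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);
}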

Related

How to assign a value of type T to data[i] of type uchar without any loss?

Scenario
I want to create my own SetChannel function that sets a specific channel of an image. For example, given an input image of type CV_16UC3 (a BGR image of type ushort), I want to change the green channel (index 1, zero-based) to the ushort value 32768. For this, I invoke SetChannel(input, 1, 32768).
template<typename T>
void SetChannel(Mat mat, uint channel, T value)
{
    const uint channels = mat.channels();
    if (channel + 1 > channels)
        return;

    T* data = (T*)mat.data;
    // MBPR : number of Memory Blocks Per Row
    // Mat.step : number of bytes per row
    // Mat.elemSize1() : number of bytes per channel
    const unsigned int MBPR = mat.step / mat.elemSize1();
    // N : total number of memory blocks
    const unsigned int N = mat.rows * MBPR;
    for (uint i = channel; i < N; i += channels)
        data[i] = value;
}
I prefer working with a single loop rather than nested loops, so I define the number of iterations N as given above.
The code above works as expected, but some other people said the part
T * data = (T*)mat.data;
is a code smell and regarded it as badly designed.
Now I want to rewrite it with another approach, as follows.
It is not working as expected, because I don't know how to assign a value of type T to data[i] of type uchar.
template<typename T>
void SetChannel(Mat mat, uint channel, T value)
{
    const uint channels = mat.channels();
    if (channel + 1 > channels)
        return;

    uchar* data = mat.data;
    const unsigned int N = mat.rows * mat.step; // bytes per image
    const unsigned int bpc = mat.elemSize1();   // bytes per channel
    const unsigned int bpp = mat.elemSize();    // bytes per pixel
    for (uint i = channel * bpc; i < N; i += bpp)
        //data[i] = value;
}
Question
How to assign a value of type T to data[i] of type uchar without any loss?
For those who don't know what Mat is, the following might be useful.
About the OpenCV Mat class
OpenCV provides a bunch of image types. For example,
CV_8UC1 represents the grayscale image type, in which each pixel has one channel of type uchar.
CV_8UC3 represents the BGR (not RGB) image type, in which each pixel has three channels, each of type uchar.
CV_16UC3 represents the BGR (not RGB) image type, in which each pixel has three channels, each of type ushort.
etc.
Mat is the class that encapsulates an image. It has several attributes and functions. Let me list the ones I use in this question, so you can understand my scenario better.
Mat.data: pointer of type uchar pointing to a block of image pixels.
Mat.rows: number of rows.
Mat.channels(): number of channels per pixel.
Mat.elemSize1() (ends with 1): number of bytes per channel.
Mat.elemSize(): number of bytes per pixel.
Mat.elemSize() = Mat.channels() * Mat.elemSize1().
Mat.step: number of bytes per row.
Here Mat.step can be thought of as the product of
- the "effective" number of pixels per row (let me name it EPPR),
- the number of channels per pixel, or Mat.channels(), and
- the number of bytes per channel, or Mat.elemSize1().
Mathematically,
Mat.step = EPPR * Mat.elemSize()
Mat.step = EPPR * Mat.channels() * Mat.elemSize1()
Let me define EPPR * Mat.channels() as memory blocks per row (MBPR). If you know the correct term for MBPR, let me know.
As a result, MBPR = Mat.step / Mat.elemSize1().
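A worked example, assuming a 640x480 CV_16UC3 image with no row padding: Mat.elemSize1() = 2 and Mat.channels() = 3, so Mat.step = 640 * 3 * 2 = 3840 bytes and MBPR = 3840 / 2 = 1920 memory blocks per row.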
I got it from someone offline. Hopefully it is useful for others as well.
template<typename T>
void SetChannel(Mat mat, uint channel, T value)
{
    const uint channels = mat.channels();
    if (channel + 1 > channels)
        return;

    uchar* data = mat.data;
    const unsigned int N = mat.rows * mat.step;        // bytes per image
    const unsigned int bpc = mat.elemSize1();          // bytes per channel
    const unsigned int bpp = mat.elemSize();           // bytes per pixel
    const unsigned int bpu = CHAR_BIT * sizeof(uchar); // bits per uchar
    for (uint i = channel * bpc; i < N; i += bpp)
        for (uint j = 0; j < bpc; j++)
            // write the value byte by byte, least-significant byte first
            // (this matches the in-memory layout on little-endian machines)
            data[i + j] = value >> bpu * j;
}
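A minimal usage sketch matching the scenario above (assuming OpenCV headers and namespaces are in place; the image size is illustrative):
cv::Mat input(480, 640, CV_16UC3, cv::Scalar::all(0));
SetChannel<ushort>(input, 1, 32768); // set the green channel (index 1) to 32768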

Get RGB of a pixel in stb_image

I created and loaded this image:
int x, y, comps;
unsigned char* data = stbi_load(".//textures//heightMapTexture.png", &x, &y, &comps, 1);
Now, how do I get the RGB values of a certain pixel of this image?
You are using the 8-bits-per-channel interface, and you are requesting only one channel (the last argument given to stbi_load). You won't obtain RGB data with only one channel requested.
If you work with RGB images, you will probably get 3 or 4 back in comps, and you want to request at least 3 in the last argument.
The buffer returned by stbi_load then contains x * y * channelsRequested bytes, i.e. one 8-bit value per channel.
You can access the (i, j) pixel info like this:
unsigned bytePerPixel = channelCount; // channels actually present in the buffer
unsigned char* pixelOffset = data + (i + x * j) * bytePerPixel;
unsigned char r = pixelOffset[0];
unsigned char g = pixelOffset[1];
unsigned char b = pixelOffset[2];
unsigned char a = channelCount >= 4 ? pixelOffset[3] : 0xff;
That way you can have your RGB(A) per-pixel data.
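Putting it together, a minimal sketch (the file path is taken from the question; illustrative pixel coordinates, minimal error handling):
#include "stb_image.h"

int x, y, comps;
// request 3 channels so the buffer is tightly packed RGB regardless of the file
unsigned char* data = stbi_load(".//textures//heightMapTexture.png", &x, &y, &comps, 3);
if (data) {
    int i = 10, j = 20; // some pixel of interest (illustrative values)
    const unsigned char* p = data + (i + x * j) * 3;
    unsigned char r = p[0];
    unsigned char g = p[1];
    unsigned char b = p[2];
    stbi_image_free(data);
}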

(C++)(Visual Studio) Change RGB to Grayscale

I am accessing the image like so:
pDoc = GetDocument();
int iBitPerPixel = pDoc->_bmp->bitsperpixel; // used to see if grayscale(8 bits) or RGB (24 bits)
int iWidth = pDoc->_bmp->width;
int iHeight = pDoc->_bmp->height;
BYTE *pImg = pDoc->_bmp->point; // pointer used to point at pixels in the image
int Wp = iWidth;
const int area = iWidth * iHeight;
int r; // red pixel value
int g; // green pixel value
int b; // blue pixel value
int gray; // gray pixel value
BYTE *pImgGS = pImg; // grayscale image pixel array
and attempting to change the RGB image to gray like so:
// convert RGB values to grayscale at each pixel, then put in grayscale array
for (int i = 0; i < iHeight; i++)
    for (int j = 0; j < iWidth; j++)
    {
        r = pImg[i*iWidth * 3 + j * 3 + 2];
        g = pImg[i*iWidth * 3 + j * 3 + 1];
        b = pImg[i*Wp + j * 3];
        r * 0.299;
        g * 0.587;
        b * 0.144;
        gray = std::round(r + g + b);
        pImgGS[i*Wp + j] = gray;
    }
Finally, this is how I try to draw the image:
// draw the picture as grayscale
for (int i = 0; i < iHeight; i++) {
    for (int j = 0; j < iWidth; j++) {
        // this should set every corresponding grayscale picture to the current picture as grayscale
        pImg[i*Wp + j] = pImgGS[i*Wp + j];
    }
}
original image:
and the resulting image that I get is this:
First, check whether the image type is 24 bits per pixel.
Second, allocate memory for pImgGS:
BYTE* pImgGS = (BYTE*)malloc(sizeof(BYTE) * iWidth * iHeight);
Please refer to this article to see how bmp data is saved. bmp images are saved upside down. Also, the first 54 bytes are header information (BITMAPFILEHEADER plus BITMAPINFOHEADER).
Hence you should access the values in the following way:
double r, g, b;
unsigned char gray;
for (int i = 0; i < iHeight; i++)
{
    for (int j = 0; j < iWidth; j++)
    {
        r = (double)pImg[(i*iWidth + j)*3 + 2];
        g = (double)pImg[(i*iWidth + j)*3 + 1];
        b = (double)pImg[(i*iWidth + j)*3 + 0];
        // BT.601 luma coefficients (0.114 for blue; the 0.144 in the question is a typo)
        r = r * 0.299;
        g = g * 0.587;
        b = b * 0.114;
        gray = (unsigned char)floor(r + g + b + 0.5);
        // bmp rows are stored bottom-up, hence the flipped row index
        pImgGS[(iHeight - i - 1)*iWidth + j] = gray;
    }
}
If there is padding present, first determine the pitch (the number of bytes per row, including padding) and access the pixels through it. Refer to this to understand pitch and padding.
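For a standard 24 bpp bmp, each row is padded up to a 4-byte boundary, so the pitch can be derived from the width (a sketch; verify against your actual file):
// bmp rows start on 4-byte boundaries; pitch = bytes per row including padding
const long pitch = ((iWidth * 3 + 3) / 4) * 4;
With the pitch known, the loop becomes: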
double r, g, b;
unsigned char gray;
long index = 0; // byte offset of the current row, advanced by pitch after each row
for (int i = 0; i < iHeight; i++)
{
    for (int j = 0; j < iWidth; j++)
    {
        r = (double)pImg[index + j*3 + 2];
        g = (double)pImg[index + j*3 + 1];
        b = (double)pImg[index + j*3 + 0];
        r = r * 0.299;
        g = g * 0.587;
        b = b * 0.114;
        gray = (unsigned char)floor(r + g + b + 0.5);
        pImgGS[(iHeight - i - 1)*iWidth + j] = gray;
    }
    index = index + pitch;
}
While drawing the image: as pImg is 24 bpp, you need to copy each gray value three times, once to each of the R, G, and B channels. If you ultimately want to save the grayscale image in bmp format, then you again have to write the bmp data upside down, or you can simply skip the flip here in the conversion to gray:
pImgGS[(iHeight-i-1)*iWidth + j] = gray;
tl;dr:
Make one common path. Convert everything to 32 bits in a well-defined manner, and do not use image dimensions or coordinates. Refactor the YCbCr conversion (= grey value calculation) into a separate function; this is easier to read and runs at exactly the same speed.
The lengthy stuff
First, you seem to have been confused by strides and offsets. The artefact that you see appears because you accidentally wrote out one value (and in total only one third of the data) when you should have written three values.
One can get confused by this easily, but here it happened because you are doing needless work in the first place. You are iterating over coordinates left to right, top to bottom, and painstakingly calculating the correct byte offset in the data for each location.
However, you're applying a full-image effect, so what you really want is to iterate over the complete image. Who cares about the width and height? You know the beginning of the data, and you know its length. One loop over the complete blob will do the same thing, only faster, with less obscure code and fewer opportunities for getting something wrong.
Next, 24-bit bitmaps are common as files, but they are rather unusual as an in-memory representation because the format is nasty to access and unsuitable for hardware. Drawing such a bitmap requires a lot of work from the driver or the graphics hardware (it will work, but it will not work well). Therefore, 32-bit depth is usually a much better, faster, and more comfortable choice; it is much more "natural" to access program-wise.
You can rather trivially convert 24-bit to 32-bit: iterate over the complete bitmap data and write out a complete 32-bit word for each 3-byte tuple read. Windows bitmaps ignore the A channel (the highest-order byte), so just leave it zero, or whatever.
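A sketch of that conversion (assuming the BGR byte order used by Windows bitmaps and a tightly packed source, i.e. ignoring row padding for brevity):
#include <cstddef>
#include <cstdint>

// Convert tightly packed 24-bit BGR to 32-bit XRGB; the A byte stays zero.
void convert24to32(const uint8_t* src, uint32_t* dst, size_t pixelCount)
{
    for (size_t p = 0; p < pixelCount; ++p, src += 3)
        dst[p] = (uint32_t)src[2] << 16 | (uint32_t)src[1] << 8 | src[0];
}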
Also, there is no such thing as an 8-bit greyscale bitmap. It simply doesn't exist. Although there are bitmaps that look like greyscale bitmaps, they are in reality paletted 8-bit bitmaps in which (incidentally) the bmiColors member contains all greyscale values.
Therefore, unless you can guarantee that you will only ever process images that you have created yourself, you cannot simply assume that e.g. the values 5 and 73 correspond to 5/255 and 73/255 greyscale intensity, respectively. That may be the case, but in general it is a wrong assumption.
To be on the safe side as far as correctness goes, you must convert your 8-bit greyscale bitmaps to real colors by looking up the indices (the bitmap's grey values are really palette indices) in the palette. Otherwise, you could be loading a greyscale image whose palette is the other way around (so 5 would mean 250 and 250 would mean 5), or a bitmap which isn't greyscale at all.
So... you want to convert 24-bit bitmaps, and you want to convert 8-bit bitmaps, both to 32-bit depth. That means you do all the annoying what-if stuff once at the beginning, and the rest is one identical common path. That's a good thing.
What you will be showing on screen is always a 32-bit bitmap where the topmost byte is ignored and the lower three all hold the same value, resulting in what looks like a shade of grey. That's simple, and simple is good.
Note that if you do a BT.601-style YCbCr conversion (as indicated by your use of the constants 0.299, 0.587, and 0.144, where 0.144 is presumably a typo for 0.114), and if your 8-bit greyscale images are perceptual (this is something you must know; there is no way of telling from the file!), then for 100% correctness you need to do the inverse transformation when converting from paletted 8-bit to RGB. Otherwise, your final result will look almost right, but not quite. If your 8-bit greyscales are linear, i.e. were created without using the above constants (again, you must know; you cannot tell from the image), you need to copy everything as-is (here, doing the conversion would make it look almost-but-not-quite right).
About the RGB-to-greyscale conversion: you do not need an extra greyscale bitmap just to hold values that you never need again afterwards. You can read the three color values from the loaded bitmap, calculate Y, and directly build the 32-bit ARGB word, which you then write out to the final bitmap. This saves one entirely unnecessary round-trip to memory.
Something like this:
const uint8_t* in = (const uint8_t*) input_bitmap_data;
uint32_t* out = (uint32_t*) output_bitmap_data;
for (int i = 0; i < inputSize; i += 3, in += 3) // advance the input pointer, too
{
    uint8_t Y = calc_greyscale(in[0], in[1], in[2]);
    *out++ = (Y << 16) | (Y << 8) | Y;
}
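A sketch of the refactored grey value calculation mentioned above (assuming BGR byte order and the BT.601 coefficients; fixed-point arithmetic avoids per-pixel floating point):
// BT.601 luma: Y = 0.299 R + 0.587 G + 0.114 B
static inline uint8_t calc_greyscale(uint8_t b, uint8_t g, uint8_t r)
{
    return (uint8_t)((299 * r + 587 * g + 114 * b + 500) / 1000);
}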
Alternatively, you can first do the from-whatever-to-32 conversion and then do the to-greyscale conversion in place. This introduces an extra round-trip to memory, but the code becomes much, much easier overall.

Rotate a pixel array by any given angle (e.g. 45 degrees) in C++

I am trying to rotate an RGB/RGBA image by 45 degrees in C++.
I have the pixels of the image stored in an unsigned char *pBuffer.
I have found this code for 90-degree rotation:
void rotate90(unsigned char *buffer, const unsigned int width, const unsigned int height)
{
    const unsigned int sizeBuffer = width * height * 3;
    unsigned char *tempBuffer = new unsigned char[sizeBuffer];

    // note: this indexes one byte per pixel, so as written it only rotates
    // single-channel data, even though sizeBuffer assumes 3 bytes per pixel
    for (unsigned int y = 0, destinationColumn = height - 1; y < height; ++y, --destinationColumn)
    {
        unsigned int offset = y * width;
        for (unsigned int x = 0; x < width; x++)
        {
            tempBuffer[(x * height) + destinationColumn] = buffer[offset + x];
        }
    }

    // Copy rotated pixels
    memcpy(buffer, tempBuffer, sizeBuffer);
    delete[] tempBuffer;
}
But I want to rotate my image by any given angle in C++.
Short answer: you need to implement some kind of interpolation.
Long answer: first of all, keep in mind that the resulting bitmap will have a different size. For example, if you have a 100x100 pixel image, the 45-degree rotated image will have a size of 141x141 pixels (100 * sqrt(2), rounded up); only the central part will hold original pixels, and the corners will be empty (white? transparent? that's up to you).
Regarding how to compute the central pixels, I suggest the following: for each pixel in the destination picture, apply the inverse rotation that maps it back into the original one. This implies floating-point computation, and you will end up with a result like (32.4, 12.7). Now you have a few choices. The simplest is to just round the result to the closest integer coordinates and take that pixel. This gives a grainy result but may be acceptable in some cases. Or you can take the 4 closest pixels and interpolate them. Linear interpolation is an option but gives a somewhat blurry result; other schemes are possible (e.g. bicubic). A sketch of the nearest-neighbour variant follows.
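A minimal sketch of the inverse-mapping approach with nearest-neighbour sampling (assumptions: tightly packed RGB, rotation about the image centre, and an output buffer of the same size for brevity, so the enlarged-bounding-box point above still needs handling; out-of-range pixels are left black):
#include <cmath>
#include <cstring>

// Rotate a tightly packed RGB image by 'angle' radians about its centre.
void rotateAny(const unsigned char* src, unsigned char* dst,
               int width, int height, double angle)
{
    std::memset(dst, 0, (size_t)width * height * 3); // empty pixels stay black
    const double cx = width / 2.0, cy = height / 2.0;
    const double c = std::cos(angle), s = std::sin(angle);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
        {
            // map the destination pixel back into the source image
            const double sx =  c * (x - cx) + s * (y - cy) + cx;
            const double sy = -s * (x - cx) + c * (y - cy) + cy;
            const int ix = (int)std::lround(sx);
            const int iy = (int)std::lround(sy);
            if (ix >= 0 && ix < width && iy >= 0 && iy < height)
                std::memcpy(dst + (y * width + x) * 3,
                            src + (iy * width + ix) * 3, 3);
        }
}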

C++ AMP: computing a gradient using a texture on a 16-bit image

I am working with depth images retrieved from a Kinect, which are 16 bits. I ran into some difficulties making my own filters, due to the indexing or the size of the images.
I am working with textures, because they allow working with images of any bit depth.
So I am trying to compute a simple gradient to understand what is wrong, or why it doesn't work as I expected.
You can see that there is something wrong when I use the y direction.
For x:
For y:
That's my code:
typedef concurrency::graphics::texture<unsigned int, 2> TextureData;
typedef concurrency::graphics::texture_view<unsigned int, 2> Texture;

cv::Mat image = cv::imread("Depth247.tiff", CV_LOAD_IMAGE_ANYDEPTH);
// just a copy of another image
cv::Mat image2(image.clone());

concurrency::extent<2> imageSize(640, 480);
int bits = 16;
const unsigned int nBytes = imageSize.size() * 2; // 614400

{
    uchar* data = image.data;
    // Result data
    TextureData texDataD(imageSize, bits);
    Texture texR(texDataD);

    parallel_for_each(
        imageSize,
        [=](concurrency::index<2> idx) restrict(amp)
    {
        int x = idx[0];
        int y = idx[1];
        // 65535 is the maximum value a pixel can take with 16 bits (2^16 - 1)
        int valX = (x / (float)imageSize[0]) * 65535;
        int valY = (y / (float)imageSize[1]) * 65535;
        texR.set(idx, valX);
    });

    //concurrency::graphics::copy(texR, image2.data, imageSize.size() * (bits / 8u));
    concurrency::graphics::copy_async(texR, image2.data, imageSize.size() * (bits));

    cv::imshow("result", image2);
    cv::waitKey(50);
}
Any help would be much appreciated.
Your indices are swapped in two places.
int x = idx[0];
int y = idx[1];
Remember that C++ AMP uses row-major indexing for arrays, so idx[0] refers to the row, i.e. the y axis. This is why the picture you have for "For x" looks like what I would expect for texR.set(idx, valY).
Similarly, the extent of the image also has its values swapped.
int valX = (x / (float)imageSize[0]) * 65535;
int valY = (y / (float)imageSize[1]) * 65535;
Here imageSize[0] refers to the number of columns (the x value), not the number of rows.
I'm not familiar with OpenCV, but I'm assuming that it also uses a row-major format for cv::Mat. It might put (0, 0) at the top-left rather than the bottom-left. The Kinect data may do similar things but, again, it's row major.
There may be other places in your code with the same issue, but I think if you double-check how you are using index and extent, you should be able to fix this.
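Concretely, the corrected lines might look like this (a sketch based on the points above; variable names are taken from the question):
// extent is (rows, columns) in C++ AMP's row-major convention
concurrency::extent<2> imageSize(480, 640);

// inside the kernel:
int y = idx[0]; // row index comes first
int x = idx[1]; // column index second
// imageSize[1] = columns, imageSize[0] = rows
int valX = (x / (float)imageSize[1]) * 65535;
int valY = (y / (float)imageSize[0]) * 65535;
texR.set(idx, valX);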