Should I consider rgb values of a pixel as one value? - c++

I was reading this paper for a project using ImageMagick and C++.
We train on 1.6 million 32*32 color images that have been preprocessed
by subtracting from each pixel its mean value over all images and then
dividing by the standard deviation of all pixels over all images.
I'm having trouble distinguishing between "from each pixel its mean value over all images" and "standard deviation of all pixels over all images".
Since I'm dealing with color images, can I just take the rgb values of each pixel as one value, or should I calculate the mean and SD for each color separately?
For example, if I have r=255, g=255, b=255, can I take the pixel value as (r<<16)+(g<<8)+b ?

Color channel values should be used independently. If you used a packed 32-bit representation of the pixels, you would get large value differences between very similar colors that differ only in the red or green channel.
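A minimal sketch of the per-channel statistics in C++ (my own illustration, assuming the images are already loaded as interleaved 8-bit RGB buffers; the function name and buffer layout are placeholders, not from the paper):

#include <cmath>
#include <cstddef>
#include <vector>

// Per-channel mean and standard deviation over a set of images,
// assuming interleaved 8-bit RGB pixels (R, G, B, R, G, B, ...).
void channelStats(const std::vector<std::vector<unsigned char>>& images,
                  double mean[3], double stddev[3])
{
    double sum[3] = {0, 0, 0}, sumSq[3] = {0, 0, 0};
    std::size_t count = 0;                     // pixels counted per channel
    for (const auto& img : images) {
        for (std::size_t i = 0; i + 2 < img.size(); i += 3) {
            for (int c = 0; c < 3; ++c) {
                double v = img[i + c];
                sum[c]   += v;
                sumSq[c] += v * v;
            }
        }
        count += img.size() / 3;
    }
    for (int c = 0; c < 3; ++c) {
        mean[c]   = sum[c] / count;
        stddev[c] = std::sqrt(sumSq[c] / count - mean[c] * mean[c]);
    }
}

Preprocessing would then map each channel value v to (v - mean[c]) / stddev[c].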

Related

C++ OpenCV boundRect[].tl() unit of output

I was wondering what the unit is of my boundRect[].tl() output.
topleft = boundRect[largest_contour_index].tl();
My assumption is that it is in pixels.
If so, do I need to look at the pixels of my camera and the format it outputs to calculate the position of my object?
Or do the pixels that the function outputs change because OpenCV converts the image to an 8-bit image? I can imagine that the number of pixels the image consists of becomes smaller when the image is converted to 8-bit.
Please correct me if I'm wrong.
Thank you!
First of all, boundingRect returns x and y coordinates plus width and height. You can refer to its documentation: docs.opencv.org/2.4/modules/core/doc/basic_structures.html#rect
Second, the 8-bit conversion is about the color value of each pixel and has no direct relation to the pixel count. So converting a 100x100 image to an 8-bit image will still give a 100x100 px image.
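A small sketch (assuming the contour came from cv::findContours; the function wrapper is my own):

#include <opencv2/opencv.hpp>
#include <vector>

// tl() returns a cv::Point in pixel coordinates, measured from the
// top-left corner of the image; converting to 8-bit changes pixel
// values, not the coordinate system.
cv::Point topLeftOf(const std::vector<cv::Point>& contour)
{
    cv::Rect box = cv::boundingRect(contour);
    return box.tl();                           // (x, y) in pixels
}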

OpenCV convertTo()

I came across this code:
image.convertTo(temp_image,CV_16SC3);
I saw the description of the convertTo() function from here, but what confuses me is image. How can we read the above code? What would be the relation between image and temp_image?
Thanks.
The other answers here are correct, but lack some details. Let me try.
image.convertTo(temp_image,CV_16SC3);
You have a source image image, and a destination image temp_image. You didn't specify the type of image, but it is probably CV_8UC3 or CV_32FC3, i.e. a 3-channel image (since convertTo doesn't change the number of channels), where each channel has depth 8 bit (unsigned char, CV_8UC3) or 32 bit (float, CV_32FC3).
This line of code will change the depth of each channel, so that temp_image has each channel of depth 16 bit (short). Specifically it's a signed short, since the type specifier has the S: CV_16SC3.
Note that if you are narrowing down the depth, as in the case from float to signed short, then saturate_cast will make sure that all the values in temp_image will be in the correct range, i.e. in [-32768, 32767] for signed short.
Why would you need to change the depth of an image?
Some OpenCV functions require input images with a specific depth.
You need a matrix to contain a different range of values. E.g. if you need to sum (or subtract) some CV_8UC3 images (typically BGR images), you'd better store the result in a CV_16SC3, or you'll probably get wrong results due to saturation, since the range for CV_8U images is [0,255].
You read with imread, or want to store with imwrite, images with 16-bit depth. These are usually used (AFAIK) in medical or graphics applications to allow a wider range of colors. However, most monitors do not support 16-bit image visualization.
There may be other cases; let me know if I missed the one important to you.
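A minimal usage sketch (the file name is a placeholder):

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat image = cv::imread("input.png");   // loads as CV_8UC3 by default
    cv::Mat temp_image;
    image.convertTo(temp_image, CV_16SC3);     // widen each channel: 8U -> 16S

    // convertTo also takes an optional scale and offset:
    // dst = saturate_cast<short>(src * alpha + beta)
    cv::Mat scaled;
    image.convertTo(scaled, CV_16SC3, 2.0, -10.0);
    return 0;
}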
An image is a matrix of pixel information (i.e. a 1080p image will be a 1,920 × 1,080 matrix where each entry contains the rgb values for that pixel). All you are doing is reformatting that matrix (each pixel entry, iteratively) into a new type (CV_16SC3) so it can be read by different programs.
The temp_image is a new matrix of pixel information based on image, formatted as CV_16SC3.
The first one is the source, the second one the destination. So, it takes image, converts it into type CV_16SC3, and stores the result in temp_image.

How to get grayscale value of pixels from grayscale image in Xcode

I was wondering how to determine the equivalent of RGB values for a grayscale image. The original image is grayscale, and everything I have found online is about converting an RGB image's pixel values to grayscale pixel values. I can already read in the image. Ideally, this would be for Xcode.
I was wondering if there was a class which would do this for me. If so, and you could point me to it, that would be great. I will read on it.
Any help is greatly appreciated.
NOTE: I am a beginner in C++ and do not have time to learn everything formally; I have to learn all of my programming on the fly.
You need more information to transform from simple grayscale to RGB. When you do the reverse operation (RGB to grayscale), the color information is "lost", as the three channels are reduced to a single value (depending on the algorithm, each channel will have a different weight in the final computation).
Digital cameras usually store more information per pixel: 12 bits per channel in 35mm and 14 bits per channel in medium format (those bit counts are averages; some products offer less or even more quality).
Thanks to those additional bits per channel, the camera can compute the "real" color, or what it thinks is the real color based on some parameters.
TL;DR: You can't without more data from your source, in this case the image.
You can convert a gray value to RGB by setting each component of the RGB value to the gray value:
ColorRGB myColorRGB = ColorRGBMake(myGrayValue, myGrayValue, myGrayValue); // illustrative; use your framework's color constructor
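If you can use a library, here is a minimal sketch with OpenCV (an assumption on my part, not something the question requires; the file name and coordinates are placeholders) for reading gray values directly from a grayscale image:

#include <opencv2/opencv.hpp>

int main()
{
    // Load the file as a single-channel grayscale image; each pixel
    // is then a single unsigned char in [0, 255].
    cv::Mat gray = cv::imread("photo.png", cv::IMREAD_GRAYSCALE);
    int row = 0, col = 0;                      // placeholder coordinates
    unsigned char value = gray.at<unsigned char>(row, col);
    (void)value;                               // use the gray value here
    return 0;
}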

1bpp Monochromatic BMP

I ran a demo bmp file format helper program "DDDemo.exe" to help me visualize the format of a 32x1 pixel bmp file (monochromatic). I'm okay with the two header sections but don't seem to understand the color table and pixel bits portions. I made two 32x1 pixel bmp files to help me compare (please see attached).
Can someone assist me in understanding how the "pixel bits" relate to the color map?
UPDATE: After some trial and error I finally was able to write a 32x1 pixel monochromatic BMP. Although it has different pixel bits than the attached images, this tool helped with the header and color-mapping concepts. Thank you for everyone's input.
An unset bit in the PIXEL BITS refers to the first color table entry (0,0,0), black, and a set bit refers to the second color table entry (ff,ff,ff), white.
"The 1-bit per pixel (1bpp) format supports 2 distinct colors, (for example: black and white, or yellow and pink). The pixel values are stored in each bit, with the first (left-most) pixel in the most-significant bit of the first byte. Each bit is an index into a table of 2 colors. This Color Table is in 32bpp 8.8.8.0.8 RGBAX format. An unset bit will refer to the first color table entry, and a set bit will refer to the last (second) color table entry." - BMP file format
The color table for these images is simply indicating that there are two colors in the image:
Color 0 is (00, 00, 00) -- pure black
Color 1 is (FF, FF, FF) -- pure white
The image compression method shown (BI_RGB -- uncompressed) doesn't make sense with the given pixel data and images, though.
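To make the bit-to-color-table mapping concrete, here is a tiny sketch (rowData is a hypothetical pointer to the start of a 1bpp pixel row; not part of the original question):

// The left-most pixel sits in the most-significant bit of the first
// byte; each bit is an index into the two-entry color table
// (0 = black, 1 = white in these images).
int colorIndexAt(const unsigned char* rowData, int x)
{
    return (rowData[x / 8] >> (7 - (x % 8))) & 1;
}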

Color detection algorithm - How should I do this?

I'm a bit stuck on designing a color detection system - I can't quite figure out a way to do it easily.
Basically, I have a library of images, that I want to sort by color. So if the user specifies 'sort by blue', then the most blue images will appear at the top of the results, with the least blue appearing at the bottom.
The problem is that the images aren't all one color, so the sort has to do two things at the same time:
1 - finding the bluest part of the image
2 - ranking this blue color (based on color hue and amount of this color).
I've tried about 3 or 4 different approaches, with varying results - none work well though, and 2 of these were quite mathematical algorithms (which all work much better on paper than in practice haha).
What different ways could I go about the whole process? I'm probably missing some really obvious ways it could work - any help or ideas would be much appreciated :)
EDIT: Thanks for all the responses - here's what I've tried so far:
getting the average rgb value for the whole image and comparing it to blue. Comparison was done using normalised rgb 3-space vectors and finding the distances between them. This worked the least well; an image with no blue could easily appear above an image with a small area of very strong blue.
finding the dominant color and comparing it to blue (again using 3-space vector distances). This didn't work, as there might have been a large blue section of the image that wasn't among the most dominant (or the top few dominant) color sections.
finding pixels that are close to blue, averaging all of these and comparing the answer to actual blue.
finding all the pixels that are close to blue, incrementing a count and finding a percentage based on count/total pixels.
Two thoughts come to mind:
Cheap version: convert images to HSV color space, and for each pixel compute cos(H - target_hue) or a reasonable approximation (for blue, target_hue would be 240 degrees), multiply by saturation, and average that quantity over all of the pixels in the image. High values are best. Note that colors that are closer to yellow than to blue have "negative blueness", and that black, white, and pure gray have equally "zero blueness". Note that you really want HSV, not HSL, in this situation, because the "S" in HSL doesn't map well to perceptual saturation. For example, the color #f8f8ff (RGB 248, 248, 255) has a saturation of 100% in HSL (i.e. a pure blue), but it looks nearly white. The same color in HSV has an "S" coordinate of only 3%, which is reasonable.
Less cheap version: convert images to CIELAB color space, discard L, and compute the distance in a*b* space between each pixel and the target color, then average or RMS over each pixel. Low values are best.
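A sketch of the cheap version using OpenCV (my own translation of the idea, not the answerer's code; note that 8-bit OpenCV HSV stores hue as H/2, so it has to be doubled back to degrees):

#include <cmath>
#include <opencv2/opencv.hpp>

// Average cos(H - 240 degrees) * saturation over all pixels; higher
// scores are bluer, and yellowish images come out negative.
double bluenessScore(const cv::Mat& bgr)
{
    cv::Mat hsv;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV); // H in [0,180), S in [0,255]
    double total = 0.0;
    for (int y = 0; y < hsv.rows; ++y) {
        for (int x = 0; x < hsv.cols; ++x) {
            cv::Vec3b p = hsv.at<cv::Vec3b>(y, x);
            double hueDeg = p[0] * 2.0;        // undo OpenCV's H/2 packing
            double sat    = p[1] / 255.0;
            total += std::cos((hueDeg - 240.0) * CV_PI / 180.0) * sat;
        }
    }
    return total / (hsv.rows * hsv.cols);
}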
I think to measure "blueness" you'll need to take all three components into account, not just the blue. Just for example, [255,255,255] is pure white, not blue -- but [0, 0, 30] is pure blue, even though its blue component is much lower in value.
Alternatively, you could convert to something like HSL or HSV, in which case the "blueness" should be a bit simpler to measure (hue and saturation only).
I'd google for an algorithm for creating 256-colour palettes from 24-bit images (see http://en.wikipedia.org/wiki/Color_quantization for more info), then see which colours in this palette dominate when the image is mapped to it, i.e. run a tally for each of the 256 palette entries of how many pixels get mapped into it.
Notes:
you of course don't need the whole 256; I just said 256 to help explain my thinking.
also, directly studying the algorithm for this palette generation might give you an answer.
Do you really need to find the bluest part of the image? Why not just rank the "blueness" of an image as the average blue-component value for all pixels?
Another possibility would be to find the density of pixels that pass a threshold, or minimum blue value necessary to qualify as a blue pixel.
If you have one pixel, I'd say its blueness in terms of RGB is the value of B / (R + G + B), so 1 is totally blue, 0 is not blue at all, and white is 1/3 blue. (Watch out for black, which is a special case.) The blueness of an image is then the average blueness of its pixels. And if that's too costly, just take the average of a fixed number of randomly chosen pixels.
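In C++ that might look like this (my own sketch of the formula, with black handled explicitly):

// B / (R + G + B): 1 is totally blue, 0 is not blue, white is 1/3.
// Black would divide by zero, so treat it as "not blue at all".
double blueness(unsigned char r, unsigned char g, unsigned char b)
{
    int total = r + g + b;
    return total == 0 ? 0.0 : static_cast<double>(b) / total;
}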
I would take the average of the RGB values over the whole picture. The C++ sketch below (Pixel, image and pixelcount stand for whatever your image library provides) should give you the "average blue" of the picture:
long long sumR = 0, sumG = 0, sumB = 0;
for (const Pixel& pixel : image) {             // iterate over every pixel
    sumR += pixel.r;
    sumG += pixel.g;
    sumB += pixel.b;
}
double avgR = double(sumR) / pixelcount;
double avgG = double(sumG) / pixelcount;
double avgB = double(sumB) / pixelcount;
If this doesn't work out, then I think you would need to weight each "blue" pixel higher or lower based on its R and G values relative to B. Then add up your weighted values and compare those:
long long weight = 0;
for (const Pixel& pixel : image) {
    // How strongly does blue dominate red and green in this pixel?
    long long tweight = pixel.b - pixel.r - pixel.g;
    if (tweight < 0) tweight = 0;              // ignore non-blue-dominant pixels
    weight += tweight;
}
// Compare the weights of all images.