The 2nd problem in IOI 2013 states:
You have an Art History exam approaching, but you have been paying
more attention to informatics at school than to your art classes! You
will need to write a program to take the exam for you.
The exam will consist of several paintings. Each painting is an example of one of
four distinctive styles, numbered 1, 2, 3 and 4. Style 1 contains
neoplastic modern art. Style 2 contains impressionist landscapes.
Style 3 contains expressionist action paintings. Style 4 contains
colour field paintings.
Your task is, given a digital image of a painting, to determine which style the painting belongs to.
The image will be given as an H×W grid of pixels. The rows of
the image are numbered 0, …, (H − 1) from top to bottom, and the
columns are numbered 0, …, (W − 1) from left to right. The pixels are
described using two-dimensional arrays R, G and B, which give the
amount of red, green and blue respectively in each pixel of the image.
These amounts range from 0 (no red, green or blue) to 255 (the maximum
amount of red, green or blue).
Implementation
You should submit a file
that implements the function style(), as follows:
int style(int H, int W, int R[500][500], int G[500][500], int B[500][500]);
This function should determine the style of the image. Parameters are:
H: The number of rows of pixels in the image.
W: The number of columns of pixels in the image.
R: A two-dimensional array of size H×W, giving the amount of red in each pixel of the image.
G: A two-dimensional array of size H×W, giving the amount of green in each pixel of the image.
B: A two-dimensional array of size H×W, giving the amount of blue in each pixel of the image.
Example pictures are in the problem PDF
I do not want a ready-made program. A hint or two to get me started would be nice, as I am clueless about how this might be solved.
Since you are provided the image data in RGB format, first prepare a copy of the same image data in YUV. This is essential, as some of the image features are easily identified as patterns in the Luma (Y) and Chroma (U, V) maps; a conversion sketch follows.
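For that conversion, here is a minimal per-pixel sketch assuming the common BT.601 full-range constants (the exact coefficients are my choice; any standard RGB-to-YCbCr variant works for feature extraction):

// RGB -> Y'CbCr (the digital YUV variant), BT.601 full-range constants;
// Y is luma, Cb/Cr are chroma differences shifted so everything fits in 0..255.
inline void rgb2yuv(int r, int g, int b, double &y, double &cb, double &cr)
{
    y  = 0.299*r + 0.587*g + 0.114*b;  // luma
    cb = 0.564*(b - y) + 128.0;        // blue-difference chroma (the "U" map)
    cr = 0.713*(r - y) + 128.0;        // red-difference chroma (the "V" map)
}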
Based on the samples provided, here are some of the salient features of each "style" of art:
Style 1 - Neoplastic modern art
Zero graininess - check for large areas with uniform Luma (Y).
Black pixels at the edges of the areas (transitions between different chroma).
Style 2 - Impressionist landscapes
High graininess - check for high-entropy (salt-and-pepper-noise-like) patterns in Luma (Y).
Predominantly green - high values in the green channel:
avg(Green) >> avg(Red)
avg(Green) >> avg(Blue)
Style 3 - Expressionist action paintings
High graininess - check for high-entropy (salt-and-pepper-noise-like) patterns in Luma (Y).
NOT green.
Style 4 - Colour field paintings
Zero graininess - check for large areas with uniform Luma (Y).
NO black (or near-black) pixels at the transitions between different chroma.
As long as the input image belongs to one of these classes, you should have no trouble classifying it by running the image data through functions implemented to identify the above features (one possible graininess measure is sketched below).
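As a concrete (hypothetical) graininess measure: the mean absolute luma difference between horizontal neighbours. Flat colour fields score near zero, grainy textures score high; the threshold separating the two groups is something to tune on the sample images:

// Mean absolute luma difference between horizontally adjacent pixels:
// near zero for flat areas (styles 1/4), high for grainy ones (styles 2/3).
double graininess(int H, int W, int R[500][500], int G[500][500], int B[500][500])
{
    double sum = 0; long long n = 0;
    for (int i = 0; i < H; i++)
        for (int j = 0; j + 1 < W; j++)
        {
            double y0 = 0.299*R[i][j]   + 0.587*G[i][j]   + 0.114*B[i][j];
            double y1 = 0.299*R[i][j+1] + 0.587*G[i][j+1] + 0.114*B[i][j+1];
            sum += (y0 > y1) ? (y0 - y1) : (y1 - y0);
            n++;
        }
    return (n > 0) ? sum / n : 0.0;
}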
Basically it boils down to the following code flow (a sketch of style() built on it follows the list):
Image has uniform luma?
(If Yes) Image has black pixels at chroma transitions?
(If Yes) Style 1
(If No) Style 4
(If No) Image is green-ish?
(If Yes) Style 2
(If No) Style 3
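In code the flow could look like this sketch; isGrainy(), hasBlackChromaEdges() and isGreenish() are hypothetical helpers you would build from the features above (isGrainy() could, for instance, threshold the graininess() measure sketched earlier):

int style(int H, int W, int R[500][500], int G[500][500], int B[500][500])
{
    if (!isGrainy(H, W, R, G, B))                       // uniform luma
        return hasBlackChromaEdges(H, W, R, G, B) ? 1 : 4;
    return isGreenish(H, W, R, G, B) ? 2 : 3;           // grainy: green landscape or not
}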
Maybe you can do a first approach using colours and shapes... In neoplastic modern art it is likely that there will be only a small number of colours, occupying geometrical areas, as in the colour field paintings.
This might give you a way to differentiate styles 1 and 4 from styles 2 and 3.
In styles 1 and 4 you have large areas with the same colour, but in style 4 the colour is rarely a solid colour; rather, it is brush strokes in shades of the colour.
Anyway, you should look into the specifics of each style (the usual colours and methods) and then try to make your function "see" them.
Related
I have a Kinect and I'm using OpenCV and the Point Cloud Library. I would like to project the IR image onto a 2D plane for forklift pallet detection. How would I do that?
I'm trying to detect the pallet in the forklift; here is an image:
Where are the RGB data? You can use them to help with the detection. You do not need to project the image onto any plane to detect a pallet. There are basically two approaches used for detection:
non-deterministic, based on neural networks, fuzzy logic, machine learning, etc.
This approach needs a training dataset to recognize the object. Much experience is needed for proper training-set and classifier architecture/topology selection. But other than that you do not need to program it... usually some readily available lib/tool is used; just configure it and pass the data.
deterministic, based on distance or correlation coefficients
I would start with detecting specific features like:
the pallet has a specific size
the pallet has sharp edges and a specific geometric shape in the depth data
the pallet has a specific range of colors (yellowish wood, +/- lighting and dirt)
wood has specific texture patterns
So compute a coefficient for each feature describing how close the object is to a real pallet, and then just threshold the combination of all coefficients (possibly weighted, as some features are more robust); see the sketch below.
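A minimal sketch of that combination step, assuming each feature test has already been reduced to a closeness score in 0..1 (the weights and the 0.6 threshold are made-up values to tune on real data):

// Weighted sum of per-feature scores; more robust features weigh more.
bool isPallet(double sizeScore, double shapeScore, double colorScore, double textureScore)
{
    const double w[4] = { 0.35, 0.30, 0.20, 0.15 };
    double score = w[0]*sizeScore + w[1]*shapeScore
                 + w[2]*colorScore + w[3]*textureScore;
    return score > 0.6;   // accept as pallet above the tuned threshold
}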
I do not use approach #1, so I would go for #2. Combine the RGB and depth data (they have to be matched exactly), then segment the image (based on depth and color). After that, classify each found object as pallet or not ...
[Edit1]
Your colored image does not correspond to the depth data. The aligned gray-scale has poor quality and the depth data image is also very poor. Is the depth data processed somehow (losing precision)? If you look at your data from different sides:
You can see how poor it is so I doubt you can use depth data for detection at all...
PS. I used my Align already captured rgb and depth images answer for the visualization.
The only thing left is the colored image, so detect areas with matching color only, then detect the features and classify. The color of your pallet in the image is almost white. Here are the colors HSV-reduced to a basic 16 (I was too lazy to segment):
You should obtain the range of pallet colors possible with your setup to ease the detection. Then check those objects for features like size, shape, area, circumference...
[Edit2]
So I would start with Image preprocessing:
convert to HSV
threshold only pixels close to the pallet color
I chose (H=40, S=18, V>100) as the pallet color. My HSV ranges are <0,255> per channel, so the hue angle difference can only be <-180deg,+180deg> at most, which corresponds to <-128,+128> in my ranges.
remove too thin areas
Just scan all horizontal and vertical lines, count consecutive set pixels, and if the run is too short, recolor it to black...
This is the result:
On the left is the original image (downsized so it fits this page), in the middle is the color threshold result, and last is the result of filtering out small areas. You can play with the thresholds and pallet color to change the behavior to suit your needs.
Here C++ code:
int tr_d=10; // min size of pallet [pixels]
int h,s,v,x,y,xx;
color c;
pic1=pic0;
pic1.pf=_pf_rgba;
pic2.resize(pic1.xs*3,pic1.ys); xx=0;
pic2.bmp->Canvas->Draw(xx,0,pic0.bmp); xx+=pic1.xs;
// [color selection]
for (y=0;y<pic1.ys;y++)
for (x=0;x<pic1.xs;x++)
{
// get color from image
c=pic0.p[y][x];
rgb2hsv(c);
// distance to white-yellowish color in HSV (H=40,S=18,V>100)
h=c.db[picture::_h]-40;
s=c.db[picture::_s]-18;
v=c.db[picture::_v];
// hue is cyclic angular so use only shorter angle
if (h<-128) h+=256;
if (h>+128) h-=256;
// abs value
if (h< 0) h=-h;
if (s< 0) s=-s;
// treshold close colors
c.dd=0;
if (h<25)
if (s<25)
if (v>100)
c.dd=0x00FFFFFF;
pic1.p[y][x]=c;
}
pic2.bmp->Canvas->Draw(xx,0,pic1.bmp); xx+=pic1.xs;
// [remove too thin areas]
for (y=0;y<pic1.ys;y++)
for (x=0;x<pic1.xs;)
{
for ( ;x<pic1.xs;x++) if ( pic1.p[y][x].dd) break; // find set pixel
for (h=x;x<pic1.xs;x++) if (!pic1.p[y][x].dd) break; // find unset pixel
if (x-h<tr_d) for (;h<x;h++) pic1.p[y][h].dd=0; // if too small size recolor to zero
}
for (x=0;x<pic1.xs;x++)
for (y=0;y<pic1.ys;)
{
for ( ;y<pic1.ys;y++) if ( pic1.p[y][x].dd) break; // find set pixel
for (h=y;y<pic1.ys;y++) if (!pic1.p[y][x].dd) break; // find unset pixel
if (y-h<tr_d) for (;h<y;h++) pic1.p[h][x].dd=0; // if too small size recolor to zero
}
pic2.bmp->Canvas->Draw(xx,0,pic1.bmp); xx+=pic1.xs;
See how to extract the borders of an image (OCT/retinal scan image) for the description of picture and color. Or look at any of my DIP/CV tagged answers. Now the code is well commented and straightforward, but I should just add:
You can ignore the pic2 stuff; it just produces the image posted above so I do not need to manually print-screen and merge the subresults in Paint... To improve robustness you should add dynamic-range enhancement (so the thresholds see the same conditions for any input image). Also, you should compare to more than just a single color (if more wood types of pallets are present).
Now you should segment or label the areas:
loop through entire image
find first pixel set with the pallet color
flood fill the area with some distinct ID color different from set pallet color
I use black 0x00000000 for background space and white 0x00FFFFFF as the pallet pixel color, so use ID = {1,2,3,4,5,...}. Also remember the number of filled pixels (that is your area) so you do not need to compute it again. You can also compute the bounding box directly while filling (a small flood-fill sketch follows these steps).
compute and compare features
You need to experiment with more than one image to find out which properties are good for detection. I would go for the circumference-length-to-area ratio and/or the bounding-box size... The circumference can be extracted by simply selecting all pixels of the proper ID color that neighbor a black pixel.
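A small sketch of the labeling pass under the conventions above (white 0x00FFFFFF pallet pixels, IDs 1,2,3,... as fill colours); it assumes the same picture interface as the code above, plus #include <vector> and <utility>, and uses an explicit stack instead of recursion:

// Label white areas in pic1 with IDs 1,2,3,... via iterative flood fill,
// counting the filled pixels (= area) along the way.
int id = 0;
for (int y = 0; y < pic1.ys; y++)
 for (int x = 0; x < pic1.xs; x++)
  if (pic1.p[y][x].dd == 0x00FFFFFF)              // unlabeled pallet pixel found
  {
   id++; int area = 0;
   std::vector<std::pair<int,int>> stk;           // explicit stack, avoids deep recursion
   stk.push_back(std::make_pair(x, y));
   pic1.p[y][x].dd = id;
   while (!stk.empty())
   {
    int cx = stk.back().first, cy = stk.back().second; stk.pop_back();
    area++;
    const int dx[4] = {1,-1,0,0}, dy[4] = {0,0,1,-1};
    for (int k = 0; k < 4; k++)
    {
     int nx = cx + dx[k], ny = cy + dy[k];
     if ((nx < 0) || (ny < 0) || (nx >= pic1.xs) || (ny >= pic1.ys)) continue;
     if (pic1.p[ny][nx].dd != 0x00FFFFFF) continue;
     pic1.p[ny][nx].dd = id;                      // label with the area ID
     stk.push_back(std::make_pair(nx, ny));
    }
   }
   // area now holds the pixel count of region 'id'; a bounding box could be tracked here too
  }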
See also the similar Fracture detection in hand using image processing
Good luck and have fun ...
I am beginning a project about detection.
My idea is to rank every pixel of an image (Mat).
Then, I will be able to extract which colour is dominant.
The difficulty is that a colour is not unique. For example, green is rgb(0, 255, 0), but rgb(10, 240, 20) is almost green too.
The goal of my ranking is to extract the pixels which are almost the same colour. Then, with a percentage, I think I can locate my object.
So, my question: is there a way to rank pixels by colour?
Thanks a lot in advance for your answers.
There isn't a straightforward method of ranking pixels by colour as you describe.
However, you can find an approximation to the most dominant colour.
There are several ways in which you can do it:
You can calculate a histogram for each colour channel: split the image into its R, G, B channels and compute a histogram for each. Then you can see where the peaks of the resulting graphs are.
You can k-means cluster the pixels of the image; in other words, represent each pixel as a 3D point with coordinates (R, G, B). Then you can segment the pixels into the k most-occurring colours.
You can resize the image to a 1x1-pixel image, which gives the average of all pixel values. If there is a dominant colour, with the majority of the pixels in close proximity to it, this gives a good approximation.
These, however, are all approximations. Your best choice would be to use k-means and find the cluster that either has the most elements or is the densest; a short sketch follows.
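If OpenCV is available, a minimal C++ sketch along those lines (k = 4 and the termination criteria are arbitrary choices here):

#include <opencv2/opencv.hpp>
#include <vector>
#include <algorithm>

// k-means over the pixels as 3D (B,G,R) points; returns the centre of the
// cluster with the most elements, i.e. an estimate of the dominant colour.
cv::Vec3f dominantColor(const cv::Mat &bgr, int k = 4)
{
    cv::Mat samples;
    bgr.convertTo(samples, CV_32F);
    samples = samples.reshape(1, bgr.rows * bgr.cols);   // one row per pixel, 3 columns

    cv::Mat labels, centers;
    cv::kmeans(samples, k, labels,
               cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::MAX_ITER, 10, 1.0),
               3, cv::KMEANS_PP_CENTERS, centers);

    std::vector<int> count(k, 0);                        // cluster sizes
    for (int i = 0; i < labels.rows; i++) count[labels.at<int>(i)]++;
    int best = int(std::max_element(count.begin(), count.end()) - count.begin());
    return cv::Vec3f(centers.at<float>(best, 0),         // dominant (B,G,R) centre
                     centers.at<float>(best, 1),
                     centers.at<float>(best, 2));
}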
In case you are looking for a way to locate an object with a specific colour, you can use maximum-likelihood estimation. Something like this, which was used to classify different objects such as grass, cars, buildings and pavement from satellite images. You can use it with a single colour and get a heat map of the likelihood (the probability) of each pixel belonging to your object.
In an ordinary image there is always a large number of colors involved. Averaging pixels that carry almost the same color is best done by color quantization, i.e. reducing the number of colors in an image using techniques like k-means clustering. This is best explained, with Python code, here:
https://www.pyimagesearch.com/2014/07/07/color-quantization-opencv-using-k-means-clustering/
After successful quantization, you can just try the following code to rank the colors based on their frequencies in the image.
import cv2

# _processed_image is the colour-quantized BGR image from the step above
top_n_colors = []
n = 3
colors_count = {}
(channel_b, channel_g, channel_r) = cv2.split(_processed_image)
# flatten the 2D single-channel arrays so they are easier to iterate over
channel_b = channel_b.flatten()
channel_g = channel_g.flatten()
channel_r = channel_r.flatten()
for i in range(len(channel_b)):
    RGB = str(channel_r[i]) + " " + str(channel_g[i]) + " " + str(channel_b[i])
    if RGB in colors_count:
        colors_count[RGB] += 1
    else:
        colors_count[RGB] = 1
# take the top n colors from the dictionary
_top_colors = sorted(colors_count.items(), key=lambda x: x[1], reverse=True)[0:n]
for _color in _top_colors:
    _rgb = tuple([int(value) for value in _color[0].split()])
    top_n_colors.append(_rgb)
print(top_n_colors)
I am working on a project to losslessly compress a specific style of BMP images that look like this
I have thought about doing pattern recognition to find repetitive blocks of N×N pixels, but I feel like the execution time won't be fast enough.
Any suggestions?
EDIT: I have access to the dataset that created these images too; I just use the image to visualize my data.
Optical illusions make it hard to tell for sure, but are the colors only black/blue/red/green? If so, the most straightforward compression would be to simply make more efficient use of pixels: pixels use a fixed amount of space regardless of what color they are, and a pixel can represent a lot more colors than just those four. Chances are you are using 12x as many pixels as you really need.
A simple way to do that would be to label the pixels with the following base-4 digits:
Black = 0
Red = 1
Green = 2
Blue = 3
Example:
The first four colors of the image seem to be Blue-Red-Blue-Blue. This is 3133 in base 4, which is simply DF in base 16 or 223 in base 10. That value is enough to define the red component of the new pixel. The next 4 colors define the green component, and the final 4 define the blue component, thus turning 12 pixels into a single pixel (a packing sketch follows).
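A sketch of that packing, assuming each source pixel has already been mapped to its base-4 digit with the table above (pack12 and its digit-array input are hypothetical names):

#include <cstdint>

// Pack 12 base-4 digits (one per source pixel) into one 24-bit RGB value:
// digits 0..3 form the red byte, 4..7 the green byte, 8..11 the blue byte.
uint32_t pack12(const uint8_t d[12])
{
    uint32_t rgb = 0;
    for (int i = 0; i < 12; i++)
        rgb = (rgb << 2) | (d[i] & 3u);   // two bits per digit, 24 bits total
    return rgb;                            // 0x00RRGGBB
}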
Beyond that you'll probably want to look into more conventional compression software.
I'm a bit stuck on designing a color detection system - I can't quite figure out a way to do it easily.
Basically, I have a library of images, that I want to sort by color. So if the user specifies 'sort by blue', then the most blue images will appear at the top of the results, with the least blue appearing at the bottom.
The problem is that the images aren't all one color, so it is doing two things at the same time:
1 - finding the bluest part of the image
2 - ranking this blue color (based on color hue and amount of this color).
I've tried about 3 or 4 different approaches, with varying results - none work well though, and 2 of these were quite mathematical algorithms (which all work much better on paper than in practice haha).
What different ways could I go about the whole process? I'm probably missing some really obvious ways it could work - any help or ideas would be much appreciated :)
EDIT: Thanks for all the responses - here's what I've tried so far:
Getting the average RGB value for the whole image and comparing it to blue. Comparison was done using normalised RGB 3-space vectors and finding the distances between them. This worked the least well; an image with no blue could easily appear above an image with a partial but very strong blue.
Finding the dominant color and comparing it to blue (again using 3-space vector distances). This didn't work, as a large blue section of the image might not have been among the top few dominant color sections.
Finding pixels that are close to blue, averaging all of these, and comparing the result to actual blue.
Finding all the pixels that are close to blue, incrementing a count, and computing a percentage as count/total pixels.
Two thoughts come to mind:
Cheap version: convert images to HSV color space, and for each pixel compute cos(H - target_hue) or a reasonable approximation (for blue, target_hue would be 240 degrees), multiply by saturation, and average that quantity over all of the pixels in the image. High values are best. Note that colors that are closer to yellow than to blue have "negative blueness", and that black, white, and pure gray have equally "zero blueness". Note that you really want HSV, not HSL, in this situation, because the "S" in HSL doesn't map well to perceptual saturation. For example, the color #f8f8ff (RGB 248, 248, 255) has a saturation of 100% in HSL (i.e. a pure blue), but it looks nearly white. The same color in HSV has an "S" coordinate of only 3%, which is reasonable.
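A sketch of the cheap version with OpenCV, assuming a BGR input; note that OpenCV stores 8-bit hue as 0..179 (degrees halved), which the code has to undo:

#include <opencv2/opencv.hpp>
#include <cmath>

// Average "blueness": cos(H - 240 degrees) weighted by saturation, averaged
// over all pixels. Returns a value in [-1, 1]; higher means bluer.
double blueness(const cv::Mat &bgr)
{
    cv::Mat hsv;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);   // H in 0..179, S and V in 0..255
    double sum = 0;
    for (int y = 0; y < hsv.rows; y++)
        for (int x = 0; x < hsv.cols; x++)
        {
            cv::Vec3b p = hsv.at<cv::Vec3b>(y, x);
            double h = 2.0 * p[0];               // back to degrees
            double s = p[1] / 255.0;             // saturation weight
            sum += std::cos((h - 240.0) * CV_PI / 180.0) * s;
        }
    return sum / (double(hsv.rows) * hsv.cols);
}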
Less cheap version: convert images to CIELAB color space, discard L, and compute the distance in a*b* space between each pixel and the target color, then average or RMS over each pixel. Low values are best.
I think to measure "blueness" you'll need to take all three components into account, not just the blue. Just for example, [255,255,255] is pure white, not blue -- but [0, 0, 30] is pure blue, even though its blue component is much lower in value.
Alternatively, you could convert to something like HSL or HSV, in which case the "blueness" should be a bit simpler to measure (hue and saturation only).
I'd google for an algorithm for creating 256-colour palettes from 24-bit images (see http://en.wikipedia.org/wiki/Color_quantization for more info), then see which colours in this palette dominate when the image is mapped to it, i.e. run a tally of how many pixels get mapped into each of the 256 palette entries.
Notes:
You of course don't need the whole 256; I'm just saying 256 to help explain my thinking.
Also, directly studying the algorithm for this palette generation might give you an answer.
Do you really need to find the bluest part of the image? Why not just rank the "blueness" of an image as the average blue-component value for all pixels?
Another possibility would be to find the density of pixels that pass a threshold, or minimum blue value necessary to qualify as a blue pixel.
If you have one pixel, I'd say its blueness in terms of RGB is the value of B / (R + G + B), so 1 is totally blue, 0 is not blue at all, and white is 1/3 blue. (Watch out for black, which is a special case.) The blueness of an image is then the average blueness of its pixels. And if that's too costly, just take the average of a fixed number of randomly chosen pixels.
I would take the average of the RGB values over the whole picture. The pseudocode below should give you the "average blue" of the picture:

SUMr = 0; SUMg = 0; SUMb = 0
for pixel <- image
    SUMr += pixel.r
    SUMg += pixel.g
    SUMb += pixel.b
AVGr = SUMr / pixelcount
AVGg = SUMg / pixelcount
AVGb = SUMb / pixelcount
If this doesn't work out, then I would think you would need to weight a "blue" pixel higher or lower based on the red/green values. Then add up your weighted values and compare those. (The original version assigned the weight before subtracting; fixed so the weight actually reflects b - r - g.)

weight = 0
for pixel <- image
    tweight = pixel.b
    tweight -= pixel.r
    tweight -= pixel.g
    tweight = 0 if tweight < 0
    weight += tweight
compare weights of all images.
I have an image which is representative of an Array2D:
template<class T = uint8_t>
Array2D<T> mPixData[4]; ///< 3 component channels + alpha channel.
The comment is from the library itself; I have no clue what it means.
Would someone:
explain what the 3 component channels + alpha channel are about
show how I could resize this image based on the mPixData
Without knowing what library this is, here is a stab in the dark:
The type definition implies that it is creating a 2D array of unsigned chars (allowing you to store values up to 255):
template<class T = uint8_t> Array2D<T>
Then, mPixData itself is an array, which implies that at each coordinate you have four values (bytes) to contend with: 3 for the colours (let's say RGB, but they could be something else) and 1 for alpha.
The "image" is basically this three dimensional array. Presumably when loading stuff into it, it resizes to the input - what you need to do is to find some form of resizing algorithm (not an image processing expert myself, but am sure google will reveal something), which will then allow you to take this data and do what you need...
1) The 3 component channels are the Red, Green and Blue channels; the alpha channel describes the image's transparency.
2) There are many algorithms you can use to resize the image. The simplest would be to discard extra pixels. Another simple option is interpolation.
The 3 component channels represent the Red Green Blue (aka RGB) channels. The 4th channel, ALPHA, is the transparency channel.
A pixel is defined by mPixData[4]
mPixData[0] -> R
mPixData[1] -> G
mPixData[2] -> B
mPixData[3] -> A
Therefore, an image can be represented as a vector or array of mPixData[4]. As you already stated, in this case it is Array2D<T> mPixData[4];
Resizing/rescaling/resampling an image is not a trivial process. There is a lot of material available on the web about it, and I think you should consider using a library to do this. Check CxImage (Windows/Linux).
There is some code here, but I haven't tested it. Check the resample() function.
Hi, the 3 channels plus alpha are the red, green and blue channels and the alpha channel. There are several methods of downscaling. You could, for example, take every 4th pixel, but the result would look quite bad (a nearest-neighbour sketch of this idea follows); take a look at different interpolation methods, e.g. http://en.wikipedia.org/wiki/Bilinear_interpolation.
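For the every-Nth-pixel idea in its general form, a nearest-neighbour resize of one channel could look like the sketch below (Array2D's real interface is unknown to me, so a flat vector stands in; run it once per plane: R, G, B and A):

#include <cstdint>
#include <vector>

// Nearest-neighbour resize of a single channel: each destination pixel just
// copies the closest source pixel. Simple but blocky; bilinear looks better.
std::vector<uint8_t> resizeChannel(const std::vector<uint8_t> &src,
                                   int srcW, int srcH, int dstW, int dstH)
{
    std::vector<uint8_t> dst((size_t)dstW * dstH);
    for (int y = 0; y < dstH; y++)
        for (int x = 0; x < dstW; x++)
        {
            int sx = x * srcW / dstW;   // nearest source column
            int sy = y * srcH / dstH;   // nearest source row
            dst[(size_t)y * dstW + x] = src[(size_t)sy * srcW + sx];
        }
    return dst;
}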
Or if you want to use a library use: http://www.imagemagick.org/Magick++/
or as mentioned by karlphillip:
http://www.xdp.it/cximage.htm