Compressing BMP methods - compression

I am working on a project to losslessly compress a specific style of BMP images that look like this
I have thought about doing pattern recognition to find repetitive blocks of N x N pixels, but I feel like the execution time won't be fast enough.
Any suggestions?
EDIT: I also have access to the dataset that created these images; I just use the image to visualize my data.

Optical illusions make it hard to tell for sure, but are the colors only black/blue/red/green? If so, the most straightforward compression would be to simply make more efficient use of each pixel. Pixels use a fixed amount of space regardless of what color they are, and a pixel can represent far more colors than just those four, so chances are you are using 12x as many pixels as you really need.
A simple way to do that would be to label the pixels with the following base-4 digits:
Black = 0
Red = 1
Green = 2
Blue = 3
Example:
The first four colors of the image seem to be Blue-Red-Blue-Blue. This is 3133 in base 4, which is simply DF in base 16 or 223 in base 10. That is enough to define what the red component of the new pixel should be. The next 4 pixels would define the green component and the final 4 the blue component, thus turning 12 pixels into a single pixel.
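A minimal sketch of that packing step, assuming the image has already been reduced to one base-4 code per pixel (the function name and vector layout are just illustrative):

#include <cstdint>
#include <vector>

// Pack groups of four 2-bit colour codes (0=black, 1=red, 2=green, 3=blue)
// into single bytes, so 12 source pixels become the R, G and B bytes of one
// output pixel. `codes` holds one base-4 digit per source pixel.
std::vector<uint8_t> packCodes(const std::vector<uint8_t>& codes)
{
    std::vector<uint8_t> packed;
    packed.reserve((codes.size() + 3) / 4);
    for (size_t i = 0; i < codes.size(); i += 4)
    {
        uint8_t byte = 0;
        for (size_t j = 0; j < 4 && i + j < codes.size(); ++j)
            byte = static_cast<uint8_t>(byte * 4 + codes[i + j]); // base-4, most significant digit first
        packed.push_back(byte);
    }
    return packed;
}

Decompression is just the reverse: split each byte back into four base-4 digits and map them to colors.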
Beyond that you'll probably want to look into more conventional compression software.

Related

How to find the 16 most common colors in an img with the extension BMP

for(int xx=0; xx<width/2; xx++)
{
    for(int yy=0; yy<height/2; yy++)
    {
        SDL_Color kolor = getPixel(xx,yy); // we are getting each pixel in the img
        setPixel(xx+width/2,yy+height/2,kolor.r,kolor.g,kolor.b); // copy it into the bottom-right quadrant
        //setPixel(xx,yy+height/2,kolor.r,kolor.g,kolor.b);
        //setPixel(xx+width/2,yy,kolor.r,kolor.g,kolor.b);
    }
}
I am trying, by using a loop, to find the 16 most common colors in an img and get their RGB values.
I've been using a map and trying to do something with a structure, but everything was to no avail.
If you have some ideas about how to find these colors, I'll be really grateful. Thanks
If you had a 4x4 img it would be simple.
Simplify likewise: histogram each RGB level (0-255, for each of the three channels), expecting each "popular" color to show up there. Sort the result into the top 16 most used values for each channel, again expecting those to be correct. Unless you have wild color gyrations, nothing else is needed.
Check a second time to see whether the popular colors actually exist in the image.
Last, you might want to group colors into "close enough" categories; JPEGs are lossy to a fault, so anything within 4 RGB steps gets grouped together, reducing the 256 values per channel to 64 buckets, unless colors swing widely. If you've used a paint program's magic wand, you know tolerance=0 makes uber mistakes; same idea.
A median filter will also reduce the color count by merging pixels into an average color; sample a 3x3, 4x4, or circular weighted area.
If all else fails, steal the 256 color safe web palette, and work from there.
SWAG-Scientific Wild Ass Guess method, good luck.
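For what it's worth, a rough sketch of the histogram-plus-grouping idea (SDL_Color, getPixel(), width and height are borrowed from the question's code; packing the quantised channels into one map key is my own shortcut):

#include <algorithm>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>
// SDL_Color and getPixel() are assumed to be available, as in the question's code.

// Count how often each (quantised) colour occurs, then keep the 16 most frequent.
std::vector<uint32_t> mostCommonColors(int width, int height, int tolerance = 4)
{
    std::map<uint32_t, int> counts;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
        {
            SDL_Color c = getPixel(x, y);
            // Group "close enough" colours by dropping the low bits of each channel.
            uint32_t key = ((c.r / tolerance) << 16) | ((c.g / tolerance) << 8) | (c.b / tolerance);
            ++counts[key];
        }

    std::vector<std::pair<int, uint32_t>> byCount;   // (count, quantised colour key)
    for (const auto& kv : counts)
        byCount.push_back({kv.second, kv.first});
    std::sort(byCount.rbegin(), byCount.rend());     // most frequent first

    std::vector<uint32_t> top;
    for (size_t i = 0; i < byCount.size() && i < 16; ++i)
        top.push_back(byCount[i].second);            // quantised colour keys of the top 16
    return top;
}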

Position detection of a defined mark in a picture

I am still a beginner in coding. I am currently working on a program in C/C++ that determines the pixel position of a defined mark (a black circle with white surroundings) in a photo.
I made a mask from the mark and a vector which contains every pixel value of the mask as its elements (using Magick++ I summed the values for red, green and blue). The vector contains approx. 10,000 values since the mask is 100x100 px. I also used threshold functions to simplify the image.
Then I made a grid that does the same for the picture in which I want to find the coordinates of the mark. It is basically a loop that goes through the image, and when the program knows the pixel values in the grid it immediately compares them with the mask. The main idea is to find the lowest difference between the mask and one of the grid positions.
The problem is that evaluating every grid position takes a huge amount of time (the image is 1920x1080 px, so more than 2 million vectors containing 10,000 values each). I decided to step the grid not by every pixel but, for example, by every 10th column and row, and then for the best correlation from this pass I selected an area where I used an every-pixel loop. But this still takes a lot of time.
I would like to ask you if there is some way of improving this method for better (faster) results, or whether this whole idea is not time-efficient and I should use a different approach.
Thanks for every advice!
Edit: The program will be used for processing multiple images, and on all of them the mark size will be the same. This is the picture after thresholding; the mark is the big black dot.
Image
The idea that I find interesting is a pyramidal scheme, or progressive refinement: you find the spot in a reduced-size image, then search only a small rectangle in the larger image.
If you reduce your image by 2 in each dimension you reduce the search time by a factor of 4, plus some small search effort in the larger image.
This has some problems: the reduction will affect accuracy, I expect, and you might miss the spot.
You also have to shrink the sample (template) by the same factor, so in this case you create a half-size template. As you halve it again and again the template gets blurred into the surrounding objects, so at some point it is no longer a valid template; for a single halving I guess the dot still has a couple of pixels around it.
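A rough sketch of that coarse-to-fine search, with a throwaway Image struct and a sum-of-absolute-differences score standing in for whatever comparison you use (all names here are illustrative, not from Magick++):

#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

// Minimal grayscale image type, just for this sketch.
struct Image {
    int w = 0, h = 0;
    std::vector<uint8_t> px;   // row-major, w*h pixels
    uint8_t at(int x, int y) const { return px[static_cast<size_t>(y) * w + x]; }
};

// Halve an image by averaging 2x2 blocks.
Image downscale2(const Image& in)
{
    Image out;
    out.w = in.w / 2;
    out.h = in.h / 2;
    out.px.resize(static_cast<size_t>(out.w) * out.h);
    for (int y = 0; y < out.h; ++y)
        for (int x = 0; x < out.w; ++x)
            out.px[static_cast<size_t>(y) * out.w + x] = static_cast<uint8_t>(
                (in.at(2 * x, 2 * y) + in.at(2 * x + 1, 2 * y) +
                 in.at(2 * x, 2 * y + 1) + in.at(2 * x + 1, 2 * y + 1)) / 4);
    return out;
}

// Sum of absolute differences between the template and the image window at (ox, oy).
long sad(const Image& img, const Image& tmpl, int ox, int oy)
{
    long d = 0;
    for (int y = 0; y < tmpl.h; ++y)
        for (int x = 0; x < tmpl.w; ++x)
            d += std::abs(static_cast<int>(img.at(ox + x, oy + y)) - static_cast<int>(tmpl.at(x, y)));
    return d;
}

// Search the whole half-size image first, then refine only a small
// neighbourhood around the best coarse hit at full resolution.
void findMark(const Image& img, const Image& tmpl, int& bestX, int& bestY)
{
    Image smallImg = downscale2(img);
    Image smallTmpl = downscale2(tmpl);

    long best = std::numeric_limits<long>::max();
    int cx = 0, cy = 0;
    for (int y = 0; y + smallTmpl.h <= smallImg.h; ++y)
        for (int x = 0; x + smallTmpl.w <= smallImg.w; ++x)
        {
            long d = sad(smallImg, smallTmpl, x, y);
            if (d < best) { best = d; cx = x; cy = y; }
        }

    best = std::numeric_limits<long>::max();
    bestX = bestY = 0;
    for (int y = std::max(0, 2 * cy - 4); y <= std::min(img.h - tmpl.h, 2 * cy + 4); ++y)
        for (int x = std::max(0, 2 * cx - 4); x <= std::min(img.w - tmpl.w, 2 * cx + 4); ++x)
        {
            long d = sad(img, tmpl, x, y);
            if (d < best) { best = d; bestX = x; bestY = y; }
        }
}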
As you haven't specified a tool or OS, I will choose ImageMagick which is installed on most Linux distros and is available for OSX and Windows. I am just using it at the command-line here but there are C, C++, Python, Perl, PHP, Ruby, Java and .Net bindings available.
I would use a "Connect Components Analysis" or "Blob Analysis" like this:
convert image.png -negate \
-define connected-components:area-threshold=1200 \
-define connected-components:verbose=true \
-connected-components 8 -auto-level result.png
I have inverted your image with -negate because in morphological operations, the foreground is usually white rather than black. I have excluded blobs smaller than 1200 pixels because your circles seem to have a radius of 22 pixels which makes for an area of 1520 pixels (Pi * 22^2).
That gives this output, which means 7 blobs - one per line - with the bounding box and area of each:
Objects (id: bounding-box centroid area mean-color):
0: 1358x1032+0+0 640.8,517.0 1296947 gray(0)
3: 341x350+1017+287 1206.5,468.9 90143 gray(255)
106: 64x424+848+608 892.2,829.3 6854 gray(255)
95: 38x101+44+565 61.5,619.1 2619 gray(255)
49: 17x145+1341+379 1350.3,446.7 2063 gray(0)
64: 43x43+843+443 864.2,464.1 1451 gray(255)
86: 225x11+358+546 484.7,551.9 1379 gray(255)
Note that, as your circle is 42x42 pixels, you will be looking for a blob that is square-ish and close to that size - so I am looking at the second-to-last line. I can draw that in red on your original image like this:
convert image.png -fill none -stroke red -draw "rectangle 843,443 886,486" result.png
Also, note that as you are looking for a circle, you would expect the area to be pi * r^2 or around 1500 pixels and you can check that in the penultimate column of the output.
That runs in 0.4 seconds on a reasonably specced iMac. Note that you could divide the image into 4 strips and run each in parallel to speed things up. So, if you do something like this:
#!/bin/bash
# Split image into 4 (maybe should allow 23 pixels overlap)
convert image.png -crop 1x4@ tile-%02d.mpc
# Do Blob Analysis on 4 strips in parallel
for f in tile-*mpc; do
convert $f -negate \
-define connected-components:area-threshold=1200 \
-define connected-components:verbose=true \
-connected-components 8 info: &
done
# Wait for all 4 to finish
wait
That runs in around 0.14 seconds.

Pointers on solving this Image Processing challenge?

The 2nd problem in IOI 2013 states:
You have an Art History exam approaching, but you have been paying
more attention to informatics at school than to your art classes! You
will need to write a program to take the exam for you.
The exam will consist of several paintings. Each painting is an example of one of
four distinctive styles, numbered 1, 2, 3 and 4. Style 1 contains
neoplastic modern art. Style 2 contains impressionist landscapes.
Style 3 contains expressionist action paintings. Style 4 contains
colour field paintings.
Your task is, given a digital image of a painting, to determine which style the painting belongs to.
The image will be given as an H×W grid of pixels. The rows of
the image are numbered 0, …, (H − 1) from top to bottom, and the
columns are numbered 0, …, (W − 1) from left to right. The pixels are
described using two-dimensional arrays R, G and B, which give the
amount of red, green and blue respectively in each pixel of the image.
These amounts range from 0 (no red, green or blue) to 255 (the maximum
amount of red, green or blue).
Implementation
You should submit a file
that implements the function style(), as follows:
int style(int H, int W, int R[500][500], int G[500][500], int B[500][500]);
This function should determine the style of the image. Parameters are:
H: The number of rows of pixels in the image.
W: The number of columns of pixels in the image.
R: A two-dimensional array of size H×W, giving the amount of red in each pixel of the image.
G: A two-dimensional array of size H×W, giving the amount of green in each pixel of the image.
B: A two-dimensional array of size H×W, giving the amount of blue in each pixel of the image.
Example pictures are in the problem PDF
I do not want a readymade program. A hint or two to get me started would be nice, as I am clueless about how this might be solved.
Since you are provided the image data in RGB format, first prepare a copy of the same image data in YUV. This is essential, as some of the image features are easily identified as patterns in the luma (Y) and chroma (U, V) maps.
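For the luma map, a minimal sketch using the common BT.601 weights (the function name and the Y output array are just illustrative; the parameter shapes follow the problem statement):

#include <cstdint>

// Build a luma (Y) map from the R, G, B planes using the usual BT.601 weights.
// H, W and the 500x500 bounds come from the problem statement.
void computeLuma(int H, int W, int R[500][500], int G[500][500], int B[500][500],
                 int Y[500][500])
{
    for (int i = 0; i < H; ++i)
        for (int j = 0; j < W; ++j)
            Y[i][j] = static_cast<int>(0.299 * R[i][j] + 0.587 * G[i][j] + 0.114 * B[i][j]);
}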
Based on the samples provided, here are some of the salient features of each "style" of art :
Style1 - Neoplastic modern art
Zero graininess - Check for large areas with uniform Luma(Y)
Black pixels at edges of the areas(transition between different chroma).
Style2 - Impressionist landscapes
High graininess - Check for high entropy (salt-n-pepper-noise like) patterns in Luma(Y).
Pre-dominantly green - High values in green channel.
Greenavg >> Redavg
Greenavg >> Blueavg
Style3 - Expressionist action paintings
High graininess - Check for high entropy (salt-n-pepper-noise like) patterns in Luma(Y).
NOT green.
Style4 - Color field paintings
Zero graininess - Check for large areas with uniform Luma(Y)
NO black(or near black) pixels at the transition between different chroma.
As long as the input image belongs to one of these classes you should have no trouble in classification by running the image data through functions that are implemented to identify the above features.
Basically it boils down to the following code-flow :
Image has uniform luma?
    (If Yes) Image has black pixels at chroma transitions?
        (If Yes) Style1
        (If No)  Style4
    (If No) Image is green-ish?
        (If Yes) Style2
        (If No)  Style3
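A skeleton of that flow might look like the sketch below; the first two helpers are placeholders for the feature checks listed above, and only the green-dominance check is filled in as an example (the 1.2 threshold is a guess):

// Placeholders for the feature checks described above; their implementations
// (luma-uniformity and black-transition detection) are still up to you.
bool hasUniformLuma(int H, int W, int R[500][500], int G[500][500], int B[500][500]);
bool hasBlackChromaTransitions(int H, int W, int R[500][500], int G[500][500], int B[500][500]);

// Example of one feature check: average green clearly dominates red and blue.
bool isGreenish(int H, int W, int R[500][500], int G[500][500], int B[500][500])
{
    long long r = 0, g = 0, b = 0;
    for (int i = 0; i < H; ++i)
        for (int j = 0; j < W; ++j) { r += R[i][j]; g += G[i][j]; b += B[i][j]; }
    return g > 1.2 * r && g > 1.2 * b;   // "much greater" threshold is a guess
}

int style(int H, int W, int R[500][500], int G[500][500], int B[500][500])
{
    if (hasUniformLuma(H, W, R, G, B))                           // low graininess: style 1 or 4
        return hasBlackChromaTransitions(H, W, R, G, B) ? 1 : 4;
    return isGreenish(H, W, R, G, B) ? 2 : 3;                    // high graininess: green => 2, else 3
}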
Maybe you can do a first approach using colors and shapes... In neoplastic modern art it is likely that there will be only a small number of colors, occupying geometrical areas, as in the colour field paintings.
This might give you a way to differentiate styles 1 and 4 from styles 2 and 3.
In styles 1 and 4 you have large areas with the same color, but in style 4 the color is rarely solid; it is made of brush strokes in shades of the color.
Anyway, you should look into the specialities of each style, that is, the usual colors and techniques, and then try to make your function "see" them.

C++: How to interpret a byte array representation of an image?

I'm trying to work with this camera SDK. Let's say the camera has a function called CameraGetImageData(BYTE* data), which I assume takes in a byte array, fills it with the image data, and then returns a status code based on success/failure. The SDK provides no documentation whatsoever (not even code comments), so I'm just guesstimating here. Here's a code snippet of what I think works:
BYTE* data = new BYTE[10000000]; // an array of an arbitrarily large size; I'm
                                 // not sure what the exact size needs to be,
                                 // so I made it large
CameraGetImageData(data);
// Do stuff here to process/output image data
I've run the code with breakpoints in Visual Studio and can confirm that the CameraGetImageData function does indeed modify the array. Now my question is: is there a standard way for cameras to output data? How should I start using this data, and what does each byte represent? The camera captures in 8-bit color.
Take pictures of pure red, pure green and pure blue. See what comes out.
Also, I'd make the array 100 million bytes, not 10 million, if you've got the memory, at least initially. A 10-megapixel camera using 24 bits per pixel is going to use 30 million bytes, bigger than your array. If it does something crazy like store 16 bits per colour, it could take up to 60 or 80 million bytes.
You could fill this big array with data before passing it. For example, fill it with '01234567' repeated. Then it's really obvious which bytes have been written and which haven't, so you can work out the real size of what's returned.
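A rough sketch of that trick (BYTE comes from <windows.h> and CameraGetImageData() from the SDK header, as in the question; the buffer size and pattern are arbitrary):

#include <cstddef>
#include <iostream>
#include <windows.h>   // for BYTE; CameraGetImageData() comes from the SDK header

void measureImageSize()
{
    const size_t BUF_SIZE = 100000000;   // 100 MB, deliberately oversized
    BYTE* data = new BYTE[BUF_SIZE];
    for (size_t i = 0; i < BUF_SIZE; ++i)
        data[i] = static_cast<BYTE>('0' + (i % 8));   // '01234567' repeated

    CameraGetImageData(data);

    // Scan backwards for the last byte that no longer matches the pattern:
    // everything up to that point was presumably written by the camera.
    size_t written = 0;
    for (size_t i = BUF_SIZE; i > 0; --i)
        if (data[i - 1] != static_cast<BYTE>('0' + ((i - 1) % 8))) { written = i; break; }

    std::cout << "Camera appears to have written about " << written << " bytes\n";
    delete[] data;
}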
I don't think there is a standard, but you can try to identify which values are what by putting some solid-color images in front of the camera, so all pixels would be approximately the same color. Having an idea of what color should be stored in each pixel, you may understand how the color is represented in your array. I would go with black, white, red, green and blue images.
But also consider finding a better SDK that has documentation, because just allocating a big array is really bad design.
You should check the documentation on your camera SDK, since there's no "standard" or "common" way for data output. It can be raw data, it can be RGB data, it can even be already compressed. If the camera vendor doesn't provide any information, you could try to find some libraries that handle most common formats, and try to pass the data you have to see what happens.
Without even knowing the type of the camera, this question is nearly impossible to answer.
If it is a scientific camera, chances are good that it adheres to the IEEE 1394 (aka IIDC or DCAM) standard. I have personally worked with such a camera made by Hamamatsu, using this library to interface with it.
In my case the camera output was just raw data. The camera itself was monochrome and each pixel had a depth resolution of 12 bits. Therefore, each pixel intensity was stored as a 16-bit unsigned value in the result array. The size of the array was simply width * height * 2 bytes, where width and height are the image dimensions in pixels and the factor 2 is for the 16 bits per pixel. The width and height were known a priori from the chosen camera mode.
If you have the dimensions of the result image, try to dump your byte array into a file, load the result in Python or Matlab, and just try to visualize the content. Another possibility is to load this raw file into an image editor such as ImageJ and hope to get something useful out of it.
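If the data does turn out to be raw 16-bit monochrome like mine was, a quick way to dump it for ImageJ is a binary PGM file; this is just a sketch, and the 16-bit/monochrome assumption has to match your camera:

#include <cstdint>
#include <fstream>
#include <vector>

// Write a 16-bit grayscale buffer as a binary PGM (P5) file so it can be
// opened in ImageJ or most viewers. PGM stores 16-bit samples big-endian,
// so the two bytes of each value are written most significant first.
void dumpPgm16(const char* path, const std::vector<uint16_t>& pixels, int width, int height)
{
    std::ofstream out(path, std::ios::binary);
    out << "P5\n" << width << " " << height << "\n65535\n";
    for (uint16_t v : pixels)
    {
        unsigned char bytes[2] = { static_cast<unsigned char>(v >> 8),
                                   static_cast<unsigned char>(v & 0xFF) };
        out.write(reinterpret_cast<const char*>(bytes), 2);
    }
}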
Good luck!
I hope this question's solution helps you: https://stackoverflow.com/a/3340944/291372
Actually you've got an array of pixels (assume 1 byte per pixel if your camera captures in 8-bit). What you need to do is just determine the width and height. After that you can try to restore a bitmap image from your byte array.

How to resize an image in C++?

I have an image which is represented as an Array2D:
template<class T = uint8_t>
Array2D<T> mPixData[4]; ///< 3 component channels + alpha channel.
The comment comes from the library. I have no clue what it means.
Would someone:
explain what the 3 component channels + alpha channel are about
show how I could resize this image based on mPixData
Without knowing what library this is, here is a stab in the dark:
The type definition implies that it is creating a 2D array of unsigned chars (allowing you to store values up to 255).
template<class T = uint8_t> Array2D<T>
Then, mPixData itself is an array, which implies that at each coordinate you have four values (bytes) to contend with: 3 for the colours (let's say RGB, but it could be something else) and 1 for alpha.
The "image" is basically this three dimensional array. Presumably when loading stuff into it, it resizes to the input - what you need to do is to find some form of resizing algorithm (not an image processing expert myself, but am sure google will reveal something), which will then allow you to take this data and do what you need...
1) The 3 component channels are the Red, Green and Blue channels. The alpha channel describes the image's transparency.
2) There are many algorithms you can use to resize the image. The simplest would be to discard extra pixels. Another simple approach is interpolation.
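As an illustration of the "discard extra pixels" idea, a nearest-neighbour downscale of one channel might look roughly like this (a plain row-major std::vector is used here instead of your Array2D, whose exact interface I don't know):

#include <cstdint>
#include <vector>

// Nearest-neighbour resize of a single 8-bit channel stored row-major.
// Each output pixel simply picks the closest source pixel; repeat for all
// four channels of the image.
std::vector<uint8_t> resizeNearest(const std::vector<uint8_t>& src, int srcW, int srcH,
                                   int dstW, int dstH)
{
    std::vector<uint8_t> dst(static_cast<size_t>(dstW) * dstH);
    for (int y = 0; y < dstH; ++y)
        for (int x = 0; x < dstW; ++x)
        {
            int sx = x * srcW / dstW;   // nearest source column
            int sy = y * srcH / dstH;   // nearest source row
            dst[static_cast<size_t>(y) * dstW + x] = src[static_cast<size_t>(sy) * srcW + sx];
        }
    return dst;
}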
The 3 component channels represent the Red Green Blue (aka RGB) channels. The 4th channel, ALPHA, is the transparency channel.
A pixel is defined by mPixData[4]
mPixData[0] -> R
mPixData[1] -> G
mPixData[2] -> B
mPixData[3] -> A
Therefore, an image can be represented as a vector or array of mPixData[4]. As you already stated, in this case it is Array2D<T> mPixData[4];
Resizing/rescaling/resampling an image is not a trivial process. There is a lot of material available on the web about it, and I think you should consider using a library to do this. Check CxImage (Windows/Linux).
There is some code here, but I haven't tested it. Check the resample() function.
Hi, the channels are RGB plus alpha: the red, green and blue channels and the alpha channel. There are several methods for downscaling. You could, for example, take every 4th pixel, but the result would look quite bad; take a look at different interpolation methods, e.g.: http://en.wikipedia.org/wiki/Bilinear_interpolation.
Or if you want to use a library use: http://www.imagemagick.org/Magick++/
or as mentioned by karlphillip:
http://www.xdp.it/cximage.htm