Compare rendered images using imagemagick - unit-testing

I have rendered multiple images from an application. Here are sample images that illustrate two images that look almost the same to the eye.
I tried to compare them with the following ImageMagick command:
compare -metric AE img1.png img2.png diff.png
6384
This means 6384 pixels differ, even though the images are similar.
Minor changes produce large errors: if a pattern is moved one pixel to the right, this reports a large number of differing pixels. Is there a good way of doing this kind of diff with ImageMagick? I have experimented with the -fuzz parameter, but it really does not help me. Is ImageMagick's compare only suited to comparing photographic images? Are there better switches to ImageMagick that can recognize text that has moved a few pixels and report it as equal? Should I use another tool?
Edit:
Adding an example of an image that looks clearly different to a human and illustrates the kind of difference I am trying to detect. In this image not many pixels are changed, but the visible pattern is clearly changed.

It's hard to give any detailed answer as I don't know what you are looking for or expecting. I guess you may need some sort of Perceptual Hash if you are looking for images that people would perceive as similar or dissimilar, or maybe a Scale/Rotation/Translation Invariant technique that identifies similar images independently of resizes, shifts and rotations.
You could look at the Perceptual Hash and Image Moments with ImageMagick like this:
identify -verbose -features 1 -moments 1.png
Image: 1.png
Format: PNG (Portable Network Graphics)
Mime type: image/png
Class: PseudoClass
Geometry: 103x115+0+0
Resolution: 37.79x37.79
Print size: 2.72559x3.04313
Units: PixelsPerCentimeter
Type: Grayscale
Base type: Grayscale
Endianess: Undefined
Colorspace: Gray
Depth: 8-bit
Channel depth:
gray: 8-bit
Channel statistics:
Pixels: 11845
Gray:
min: 62 (0.243137)
max: 255 (1)
mean: 202.99 (0.79604)
standard deviation: 85.6322 (0.335812)
kurtosis: -0.920271
skewness: -1.0391
entropy: 0.840719
Channel moments:
Gray:
Centroid: 51.6405,57.1281
Ellipse Semi-Major/Minor axis: 66.5375,60.336
Ellipse angle: 0.117192
Ellipse eccentricity: 0.305293
Ellipse intensity: 190.641 (0.747614)
I1: 0.000838838 (0.213904)
I2: 6.69266e-09 (0.00043519)
I3: 3.34956e-15 (5.55403e-08)
I4: 5.38335e-15 (8.92633e-08)
I5: 2.27572e-29 (6.25692e-15)
I6: -4.33202e-19 (-1.83169e-09)
I7: -2.16323e-30 (-5.94763e-16)
I8: 3.96612e-20 (1.67698e-10)
Channel perceptual hash:
Red, Hue:
PH1: 0.669868, 11
PH2: 3.35965, 11
PH3: 7.27735, 11
PH4: 7.05343, 11
PH5: 11, 11
PH6: 8.746, 11
PH7: 11, 11
Green, Chroma:
PH1: 0.669868, 11
PH2: 3.35965, 11
PH3: 7.27735, 11
PH4: 7.05343, 11
PH5: 11, 11
PH6: 8.746, 11
PH7: 11, 11
Blue, Luma:
PH1: 0.669868, 0.669868
PH2: 3.35965, 3.35965
PH3: 7.27735, 7.27735
PH4: 7.05343, 7.05343
PH5: 11, 11
PH6: 8.746, 8.746
PH7: 11, 11
Channel features (horizontal, vertical, left and right diagonals, average):
Gray:
Angular Second Moment:
0.364846, 0.615673, 0.372224, 0.372224, 0.431242
Contrast:
0.544246, 0.0023846, 0.546612, 0.546612, 0.409963
Correlation:
-0.406263, 0.993832, -0.439964, -0.439964, -0.07309
Sum of Squares Variance:
1.19418, 1.1939, 1.19101, 1.19101, 1.19253
Inverse Difference Moment:
0.737681, 1.00758, 0.745356, 0.745356, 0.808993
Sum Average:
1.63274, 0.546074, 1.63983, 1.63983, 1.36462
Sum Variance:
4.43991, 0.938019, 4.46048, 4.46048, 3.57472
Sum Entropy:
0.143792, 0.159713, 0.143388, 0.143388, 0.14757
Entropy:
0.462204, 0.258129, 0.461828, 0.461828, 0.410997
Difference Variance:
0.0645055, 0.189604, 0.0655494, 0.0655494, 0.0963021
Difference Entropy:
0.29837, 0.003471, 0.297282, 0.297282, 0.224101
Information Measure of Correlation 1:
-0.160631, -0.971422, -0.146024, -0.146024, -0.356026
Information Measure of Correlation 2:
0.294281, 0.625514, 0.29546, 0.29546, 0.377679
You could also go to Fred Weinhaus's excellent website and download his script called moments, which will calculate the Hu and Maitra moments, and see if those tell you what you want. Basically, you could run the script on each of your images like this:
./moments image1.png > 1.txt
./moments image2.png > 2.txt
and then use your favourite diff tool to see what has changed between the two images you wish to compare.
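Depending on your ImageMagick version, compare can also report a perceptual hash distance directly - the PHASH metric appeared around version 6.8.8, if I recall correctly, so treat that as an assumption. Small distances mean perceptually similar images, and a pattern shifted by a pixel costs far less than it does with AE:
compare -metric PHASH img1.png img2.png null:
The distance is printed to stderr, and null: simply discards the difference image.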

Related

How to find the correct HSV threshold values of an object for OpenCV?

I am trying to find the correct lower/upper bound threshold values of a ball, so that I can use them in the OpenCV inRange function.
I've read Choosing correct HSV values for OpenCV thresholding with InRangeS, but I still don't understand how to do this in my case:
inRange function:
inRange(frmHsv, Scalar(lowerH, lowerS, lowerV), Scalar(upperH, upperS, upperV), rangeRes);
OpenCV HSV ranges:
H: 0 - 180
S: 0 - 255
V: 0 - 255
How can I find lowerH, lowerS, lowerV, upperH, upperS, upperV?
Take a look at this tutorial, it is really good!
It creates trackbars to move the upper and lower values interactively and check what gives the best response for your object. I used it for colour-coded cylinder detection and it works really well.
http://opencv-srf.blogspot.com.au/2010/09/object-detection-using-color-seperation.html
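A minimal sketch of the same idea in C++ (the window name, camera index and loop details are my assumptions, not the tutorial's exact code):
#include <opencv2/opencv.hpp>
using namespace cv;

int lowerH = 0, lowerS = 0, lowerV = 0;
int upperH = 180, upperS = 255, upperV = 255;

int main() {
    namedWindow("Control");
    // Each trackbar writes straight into its bound variable on every change
    createTrackbar("LowH",  "Control", &lowerH, 180);
    createTrackbar("HighH", "Control", &upperH, 180);
    createTrackbar("LowS",  "Control", &lowerS, 255);
    createTrackbar("HighS", "Control", &upperS, 255);
    createTrackbar("LowV",  "Control", &lowerV, 255);
    createTrackbar("HighV", "Control", &upperV, 255);

    VideoCapture cap(0);                      // default camera
    Mat frame, frmHsv, rangeRes;
    while (cap.read(frame)) {
        cvtColor(frame, frmHsv, COLOR_BGR2HSV);
        inRange(frmHsv, Scalar(lowerH, lowerS, lowerV),
                Scalar(upperH, upperS, upperV), rangeRes);
        imshow("Control", rangeRes);          // white = pixels inside the range
        if (waitKey(30) == 27) break;         // Esc quits
    }
    return 0;
}
Slide the six values around until only the ball stays white; those are your bounds.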

Position detection of a defined mark in a picture

I am still a beginner in coding. I am currently working on a program in C/C++ that determines the pixel position of a defined mark (a black circle with white surroundings) in a photo.
I made a mask from the mark and a vector which contains every pixel value of the mask as its elements (using Magick++ I summed the values for Red, Green and Blue). The vector contains approx. 10 000 values, since the mask is 100x100 px. I also used threshold functions to simplify the image.
Then I made a grid that does the same for the picture where I want to find the coordinates of the mark. It is basically a loop that goes through the image, and when the program knows the pixel values in the grid it immediately compares them with the mask. The main idea is to find the lowest difference between the mask and one of the grid positions.
The problem, however, is that evaluating all grid positions takes a huge amount of time (e.g. the image is 1920x1080 px, so more than 2 million vectors containing 10 000 values each). I decided to step the grid not every pixel but, for example, every 10th column and row, and then, for the best correlation from that pass, I selected an area where I used an every-pixel loop. But this still takes a lot of time.
I would like to ask you if there is some way of improving this method for better (faster) results, or whether the whole idea is not time-efficient and I should use a different approach.
Thanks for any advice!
Edit: The program will be used for processing multiple images, and the size will be the same on all of them. This is the picture after thresholding; the mark is the big black dot.
The idea that I find interesting is a pyramidal scheme - or progressive refinement: you find the spot in a lower-resolution image, then search only a small rectangle of the larger image.
If you reduce your image by 2 in each dimension, you reduce the time by 4, plus some search effort in the larger image.
This has some problems: I expect the reduction will affect accuracy, so you might miss the spot.
You also have to scale the sample (template) down by the same factor, creating a half-size template in this case. As you keep halving, the template gets blurred into the surrounding objects, so at some point it will no longer be a valid template; for one halving I guess the dot still has a couple of pixels around it.
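A rough sketch of that coarse-to-fine search, written with OpenCV's matchTemplate rather than the Magick++ loop in the question (OpenCV, the filenames and the 10-pixel search margin are all assumptions):
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <iostream>
using namespace cv;

int main() {
    Mat img  = imread("image.png", IMREAD_GRAYSCALE);
    Mat tmpl = imread("mask.png",  IMREAD_GRAYSCALE);

    // Coarse pass: half-size image searched with a half-size template
    Mat imgSmall, tmplSmall, score;
    pyrDown(img,  imgSmall);
    pyrDown(tmpl, tmplSmall);
    matchTemplate(imgSmall, tmplSmall, score, TM_SQDIFF);
    Point coarse;
    minMaxLoc(score, 0, 0, &coarse, 0);   // SQDIFF: best match = minimum

    // Fine pass: re-search a small window around the coarse hit (x2 maps
    // the half-size coordinates back to the full-size image)
    Rect roi(std::max(coarse.x * 2 - 10, 0), std::max(coarse.y * 2 - 10, 0),
             tmpl.cols + 20, tmpl.rows + 20);
    roi &= Rect(0, 0, img.cols, img.rows);   // clip to the image bounds
    matchTemplate(img(roi), tmpl, score, TM_SQDIFF);
    Point fine;
    minMaxLoc(score, 0, 0, &fine, 0);
    std::cout << "Mark top-left at " << Point(roi.x + fine.x, roi.y + fine.y)
              << std::endl;
    return 0;
}
matchTemplate does essentially the same sum-of-differences scan as your grid, but in optimized code, so even the full-resolution pass is usually fast.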
As you haven't specified a tool or OS, I will choose ImageMagick which is installed on most Linux distros and is available for OSX and Windows. I am just using it at the command-line here but there are C, C++, Python, Perl, PHP, Ruby, Java and .Net bindings available.
I would use a "Connected Components Analysis" or "Blob Analysis" like this:
convert image.png -negate \
-define connected-components:area-threshold=1200 \
-define connected-components:verbose=true \
-connected-components 8 -auto-level result.png
I have inverted your image with -negate because in morphological operations, the foreground is usually white rather than black. I have excluded blobs smaller than 1200 pixels because your circles seem to have a radius of 22 pixels which makes for an area of 1520 pixels (Pi * 22^2).
That gives this output, which means 7 blobs - one per line - with the bounding box and area of each:
Objects (id: bounding-box centroid area mean-color):
0: 1358x1032+0+0 640.8,517.0 1296947 gray(0)
3: 341x350+1017+287 1206.5,468.9 90143 gray(255)
106: 64x424+848+608 892.2,829.3 6854 gray(255)
95: 38x101+44+565 61.5,619.1 2619 gray(255)
49: 17x145+1341+379 1350.3,446.7 2063 gray(0)
64: 43x43+843+443 864.2,464.1 1451 gray(255)
86: 225x11+358+546 484.7,551.9 1379 gray(255)
Note that, as your circle is 42x42 pixels, you will be looking for a blob that is square-ish and close to that size - so I am looking at the second-to-last line. I can draw that in red on your original image like this:
convert image.png -fill none -stroke red -draw "rectangle 843,443 886,486" result.png
Also, note that as you are looking for a circle, you would expect the area to be pi * r^2 or around 1500 pixels and you can check that in the penultimate column of the output.
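If you want to pick out such blobs automatically, the area is the fourth whitespace-separated field of each verbose line, so a quick awk filter works - assuming your build writes the listing to stdout, and with 1300-1700 as an assumed tolerance around the expected 1500 (you would still sanity-check the bounding box):
convert image.png -negate \
-define connected-components:area-threshold=1200 \
-define connected-components:verbose=true \
-connected-components 8 null: | awk '$4 > 1300 && $4 < 1700'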
The analysis above runs in 0.4 seconds on a reasonably specced iMac. Note that you could divide the image into 4 and run each quarter in parallel to speed things up. So, if you do something like this:
#!/bin/bash
# Split image into 4 (maybe should allow 23 pixels overlap)
convert image.png -crop 1x4@ tile-%02d.mpc
# Do Blob Analysis on 4 strips in parallel
for f in tile-*mpc; do
convert $f -negate \
-define connected-components:area-threshold=1200 \
-define connected-components:verbose=true \
-connected-components 8 info: &
done
# Wait for all 4 to finish
wait
That runs in around 0.14 seconds.

OpenCV: how to create .vec file to use with opencv_traincascade

As I explained in my previous post here, I am trying to generate some cascade.xml files to recognize euro coins to be used in my iOS app. Anyway, I am finding it very difficult to understand how to generate a .vec file to give as input to opencv_traincascade, because I have heard many conflicting views: someone told me that the vector file must include only positive images containing only the object to recognize; someone else (and also my tutorials) said that the vector file must include "sample" images, in other words random backgrounds onto which the object to recognize has been added by opencv_createsamples. In other words, with:
opencv_createsamples -img positives/1.png -bg negatives.txt -info 1.txt -num 210 -maxxangle 0.0 -maxyangle 0.0 -maxzangle 0.9 -bgcolor 255 -bgthresh 8 -w 48 -h 48
which generated 12000 images.
Finally, I have created the .vec file with:
cat *.txt > positives.txt
opencv_createsamples -info positives.txt -bg negatives.txt -vec 2.vec -num 12600 -w 48 -h 48
So, I would like to ask you which of the following two are the correct images to be contained in the vector file:
Moreover, what is the final command to launch the training? This is the one I have used up to now:
opencv_traincascade -data final -vec 2.vec -bg negatives.txt -numPos 12000 -numNeg 3000 -numStages 20 -featureType HAAR -precalcValBufSize 2048 -precalcIdxBufSize 2048 -minHitRate 0.999 -maxFalseAlarmRate 0.5 -w 48 -h 48 -mode ALL
where the .vec files contains 12000 samples images (background + coin to recognize).
If the .vec file should contain only positive images (only coins), how do I tell opencv_traincascade to train using the sample images?
I really need to know how to do things correctly, because I have launched many trainings which led to no correct result, and since they take many hours or days to execute, I can't waste any more time.
Thanks to all for your attention.
UPDATE
I managed to create a cascade.xml file with LBP. See what happens if I give one of the images used as training samples to a simple OpenCV program:
while with an image like the following:
it does not work at all. I really don't know where I am making the mistake.
UPDATE
Maybe converting the positive images to grayscale first could help?
I've used the negative samples database of the INRIA training http://pascal.inrialpes.fr/data/human/
and this input (png with alpha transparency around the coin):
using it with this command:
opencv_createsamples -img pos_color.png -num 10 -bg neg.txt -info test.dat -maxxangle 0.6 -maxyangle 0 -maxzangle 0.3 -maxidev 100 -bgcolor 0 -bgthresh 0
produces output like this:
so background color obviously didn't work. Converting to grayscale in the beginning however gives me this input:
and same command produces output like this:
I know this is no answer to all of your questions, but maybe it still helps.
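For reference, that initial grayscale conversion is a one-liner with ImageMagick (the filenames here are just placeholders):
convert pos_color.png -colorspace Gray pos_gray.png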
OpenCV cascades (HAAR, LBP) can excellently detect objects which have permanent features. For example, all faces have a nose, eyes and mouth in the same places. OpenCV cascades are trained to search for the common features of the required class of object and to ignore features which change from object to object. The problem is that the cascade uses a rectangular search window, while a coin has a round shape, so an image of the coin will always contain part of the background.
So the training images of the coin must include all possible backgrounds, so that the classifiers can learn to ignore them (otherwise it will detect the coin only against that specific background).
So all training samples must have the same aspect ratio, size and position of the coin (square images with the coin in the center, coin diameter 0.8-0.9 of the image width), and different backgrounds!

ImageMagick "invalid colormap index" on greyscale image

I'm attempting to use ImageMagick 6.9.3-1 to convert a medical X-ray format into a more common one, be it a png, bmp or whatever. I have zero documentation on this file format. I have zero access to the software that loads the image. I'm hoping this wonderful community can help me out here.
So far I have determined it is a greyscale JFIF with 12 bits per pixel.
I have compiled ImageMagick to enable decompression of 12 bit jfifs, which is where the problem may lie, but more on that later.
ImageMagick identify outputs no errors
C:\libjpeg\ImageMagick-6.9.3-1\VisualMagick\bin>identify.exe C:\Users\alexander\Desktop\708_885_5856.xdr
C:\Users\alexander\Desktop\708_885_5856.xdr JPEG 1896x1368 1896x1368+0+0 12-bit Gray 4096c 659KB 0.000u 0:00.000
However, ImageMagick identify -verbose reports the error "invalid colormap index"
C:\libjpeg\ImageMagick-6.9.3-1\VisualMagick\bin>identify.exe -verbose C:\Users\alexander\Desktop\708_885_5856.xdr
Image: C:\Users\alexander\Desktop\708_885_5856.xdr
Format: JPEG (Joint Photographic Experts Group JFIF format)
Mime type: image/jpeg
Class: PseudoClass
Geometry: 1896x1368+0+0
Units: Undefined
Type: Grayscale
Endianess: Undefined
Colorspace: Gray
Depth: 12/16-bit
Channel depth:
gray: 16-bit
Channel statistics:
Pixels: 2593728
Gray:
min: 0 (0)
max: 319.989 (0.0781415)
mean: 0.0593298 (1.44883e-05)
standard deviation: 2.68324 (0.000655247)
kurtosis: 3867.56
skewness: 57.5931
entropy: 0.00257194
Colormap entries: 4096
Colormap:
Rendering intent: Undefined
Gamma: 0.454545
Background color: gray(255)
Border color: gray(223)
Matte color: gray(189)
Transparent color: gray(0)
Interlace: None
Intensity: Undefined
Compose: Over
Page geometry: 1896x1368+0+0
Dispose: Undefined
Iterations: 0
Compression: JPEG
Quality: 84
Orientation: Undefined
Properties:
comment: ♦
date:create: 2016-01-12T12:53:56-05:00
date:modify: 2016-01-19T16:41:03-05:00
jpeg:colorspace: 1
jpeg:sampling-factor: 1x1
signature: f913f28ee8f54a718a3224d8b097f5cf3086509c6e4229974f4ce8570fd97f6e
Artifacts:
filename: C:\Users\alexander\Desktop\708_885_5856.xdr
verbose: true
Tainted: False
Filesize: 659KB
Number pixels: 2.594M
Pixels per second: 41.2KB
User time: 62.672u
Elapsed time: 1:04.010
Version: ImageMagick 6.9.3-1 Q16 x64 2016-01-19 http://www.imagemagick.org
identify.exe: Invalid colormap index `C:\Users\alexander\Desktop\708_885_5856.xdr' @ error/colormap-private.h/ConstrainColormapIndex/34.
This seems strange to me because identify was able to calculate the statistics for the gray channel, and "Colormap entries" registers as 4096.
When I attempt to use ImageMagick to convert one of these images, I end up with an appropriately sized greyscale image that is 100% black pixels, and I get the "invalid colormap index" message.
C:\libjpeg\ImageMagick-6.9.3-1\VisualMagick\bin>convert.exe -verbose C:\Users\alexander\Desktop\708_885_5856.xdr C:\Users\alexander\Desktop\output.png
C:\Users\alexander\Desktop\708_885_5856.xdr JPEG 1896x1368 1896x1368+0+0 12-bit Gray 4096c 659KB 61.844u 1:02.400
C:\Users\alexander\Desktop\708_885_5856.xdr=>C:\Users\alexander\Desktop\output.png JPEG 1896x1368 1896x1368+0+0 16-bit Gray 22c 9.73KB 0.391u 0:00.187
convert.exe: Invalid colormap index `C:\Users\alexander\Desktop\708_885_5856.xdr' @ error/colormap-private.h/ConstrainColormapIndex/34.
The block of code referenced in the "invalid colormap index" error is
static inline IndexPacket ConstrainColormapIndex(Image *image,
  const size_t index)
{
  if ((index < image->colors) && ((ssize_t) index >= 0))
    return((IndexPacket) index);
  (void) ThrowMagickException(&image->exception,GetMagickModule(),
    CorruptImageError,"InvalidColormapIndex","`%s'",image->filename);
  return((IndexPacket) 0);
}
This seems to make sense: index is coming in > 4095 or < 0.
I'm not very familiar with Visual Studio or c++, so please bear with me.
When I was setting up libjpeg to read 12-bit images, I came across a few lines of code in jdct.h that caught my attention: the word "UINT32" on the middle line was underlined in red.
typedef INT32 DCTELEM; /* must have 32 bits */
typedef UINT32 UDCTELEM;
typedef unsigned long long UDCTELEM2;
Visual Studio does not recognize UINT32 as a valid data type.
In the process of getting the project loaded into Visual Studio, I had to convert it from Visual Studio 2010 to Visual Studio 2015. I fear that something got messed up in that conversion, or it may be a red herring.
In conclusion, I guess my questions really are:
How can I print the value of index? (like Java's System.out.println(index);)
Isn't there something I can include in C & C++ for standard data types like cstd.h or something?
Why do I even need a color map index? If identify can calculate an average for the pixel values, it can obviously read each one, and if it can read each one, why does it fail?
Would obtaining Visual Studio 2010 most likely solve my problem?
The maximum value in your image is only 319 on a 12-bit range, i.e. 319 out of a possible 4095, or just 7% brightness, so your image is dark and will look black.
Try setting the contrast higher like this:
convert "C:\Users\alexander\Desktop\708_885_5856.xdr" -auto-level result.png

OpenCV - HSV range of values for tracking red color

Could you please tell me what the ranges of the Hue, Saturation and Value channels are for intense red?
I am trying to use these values for color tracking and I couldn't find a specific answer via Google.
You can map any color to OpenCV HSV. OpenCV actually uses a 180° hue cylinder while ideally it would be 360°; MS Paint, on the other hand, uses a 240° cylinder.
So to get an OpenCV HSV value, simply open MS Paint, open the color mixer, and read off the HSV value; to map it into OpenCV HSV, multiply the hue by 180/240.
Note that this scaling applies to the hue only (0°-180° in OpenCV); saturation and value range from 0 to 255.
You are the only one who can answer this question, since we don't know your criteria for "intense red". Collect as many samples as you can, some of which you consider intense red and some which are close but just miss the cut. Convert them all to HSL. Study the pattern.
You might put together a small app that has sliders for the H, S, and L parameters and displays a block of color corresponding to the settings. That will tell you your limits very quickly.
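As a concrete starting point (the exact cut-offs below are assumptions to refine with such sliders): in OpenCV, red wraps around the ends of the 0-180 hue axis, so it is usually captured with two ranges that are OR-ed together:
#include <opencv2/opencv.hpp>
using namespace cv;

int main() {
    Mat bgr = imread("frame.png"), hsv, lowRed, highRed, redMask;
    cvtColor(bgr, hsv, COLOR_BGR2HSV);
    // Red sits at both ends of OpenCV's 0-180 hue axis, so test two bands;
    // the high S and V floors keep only saturated, bright ("intense") red
    inRange(hsv, Scalar(0, 150, 100),   Scalar(10, 255, 255),  lowRed);
    inRange(hsv, Scalar(170, 150, 100), Scalar(180, 255, 255), highRed);
    bitwise_or(lowRed, highRed, redMask);
    imwrite("red_mask.png", redMask);
    return 0;
}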