The TensorFlow guide at https://www.tensorflow.org/programmers_guide/tensorboard_histograms#a_basic_example says: "For example, in the following image we can see that the histogram at timestep 176 has a bin centered at 2.25 with 177 elements in that bin."
[sample image 1]
However, what I saw there is that some bins contain "half elements". By that explanation, the histogram at timestep 35 has a bin centered at 0.221 with 6.88 elements in that bin.
[sample image 2]
So does the height of a histogram in TensorBoard really mean the number of elements in the bin?
Sorry, right now I cannot embed images in my post, so I have added links to the sample images instead.
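For context, here is a minimal sketch of how such a histogram summary can be written, so the plots in the guide can be reproduced. This uses the current TF2 summary API rather than the older API from the linked guide, and the log directory name is my own placeholder:

import tensorflow as tf

# Write one histogram summary per step (sketch only).
writer = tf.summary.create_file_writer("logs/hist_demo")  # placeholder dir
with writer.as_default():
    for step in range(400):
        # A normal distribution whose mean shifts over time.
        values = tf.random.normal([1000], mean=step / 100.0)
        tf.summary.histogram("demo_histogram", values, step=step)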
I read the papers explaining YOLACT and YOLACT++. I'm confused about the mask size and the prototype masks. There is an illustration of the protonet, and the output from the protonet is of size 138 * 138 * 32. Is this the size of the proto-mask? The paper says the algorithm produces an image-sized mask, so please clarify the size of the mask produced.
Take for example an input with the following size:
(H,W,C) = (512,512,3)
The protonet will give you the following output size (a.k.a. proto-masks): (128,128,32), where 32 is the number of protos. Each spatial dimension is 1/4 of the input size.
The protos are used to obtain the mask through a linear combination of them, with the corresponding coefficients predicted by the prediction module.
Therefore you will have a mask of size (128,128). A crop is then applied to this mask according to the bbox prediction (after NMS).
The bbox values can be treated as relative to the image size, so (0.5, 0.5, 1.0, 1.0), which corresponds to (256, 256, 512, 512) in the input image, becomes (64, 64, 128, 128) in the mask created from the protos. A sketch of this assembly is shown below.
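A minimal numpy sketch of that assembly, assuming the shapes from the example above (random values stand in for real network outputs; this is an illustration, not the reference implementation):

import numpy as np

# Sketch of YOLACT-style mask assembly; shapes follow the example above.
protos = np.random.rand(128, 128, 32)    # protonet output (H/4, W/4, k)
coeffs = np.random.randn(32)             # mask coefficients for one detection

# Linear combination of the k protos, then a sigmoid for a soft mask.
mask = 1.0 / (1.0 + np.exp(-(protos @ coeffs)))   # shape (128, 128)

# Crop with the bbox at proto resolution: relative (0.5, 0.5, 1.0, 1.0)
# maps to (64, 64, 128, 128) on the 128x128 mask.
x1, y1, x2, y2 = 64, 64, 128, 128
cropped = mask[y1:y2, x1:x2]
print(cropped.shape)                     # (64, 64)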
Scene
I used a Haar cascade to detect the face, and that face ROI was passed on to estimate its HoG features. While calculating the HoG features, I used 8x8 patches of the image (ensuring the whole image was a square matrix).
Problem
I create a 9-bin histogram for each patch. For example, for an image ROI of 177x177 pixels I obtain 484 8x8 patches, and each patch yields a 9-bin histogram (a 9-dimensional vector). So for the whole image I obtain a concatenated histogram of 4356 dimensions. I do not want to apply PCA or any other kind of dimensionality reduction at this point. The problem is that a single image already generates such a high-dimensional vector; what happens if I feed 5 images to the network? The dimension also changes with the picture size, so it is not even fixed. It feels like it will blow up!
So what should I do? (The dimension arithmetic is sketched below.)
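As a sanity check on those numbers, here is a small sketch of the descriptor-length arithmetic (a hypothetical helper of my own, not code from the post):

# Descriptor length for non-overlapping 8x8 patches with 9 bins each.
def hog_length(roi_side, patch=8, bins=9):
    patches_per_side = roi_side // patch
    return patches_per_side ** 2 * bins

print(hog_length(177))   # 22*22 patches * 9 bins = 4356
print(hog_length(256))   # 32*32 patches * 9 bins = 9216 -> varies with ROI size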
I have rendered multiple images from an application. Here are sample images illustrating two images that look almost the same to the eye.
I try to compare them with the following ImageMagick command:
compare -metric AE img1.png img2.png diff.png
6384
This means 6384 pixels differ, even though the images look similar. Minor changes, such as a pattern moving 1 pixel to the right, give me a large count of differing pixels. Is there a good way of doing this kind of diff with ImageMagick? I have experimented with the fuzz parameter, but it really does not help me. Is ImageMagick compare only suited to comparing photographic images? Are there better switches to ImageMagick that can recognize text that has moved a few pixels and report it as equal? Should I use another tool?
Edit:
Adding an example of an image that looks clearly different to a human and illustrates the kind of difference I am trying to detect. In this image not many pixels have changed, but the visible pattern clearly has.
It's hard to give any detailed answer as I don't know what you are looking for or expecting. I guess you may need some sort of Perceptual Hash if you are looking for images that people would perceive as similar or dissimilar, or maybe a Scale/Rotation/Translation Invariant technique that identifies similar images independently of resizes, shifts and rotations.
You could look at the Perceptual Hash and Image Moments with ImageMagick like this:
identify -verbose -features 1 -moments 1.png
Image: 1.png
  Format: PNG (Portable Network Graphics)
  Mime type: image/png
  Class: PseudoClass
  Geometry: 103x115+0+0
  Resolution: 37.79x37.79
  Print size: 2.72559x3.04313
  Units: PixelsPerCentimeter
  Type: Grayscale
  Base type: Grayscale
  Endianess: Undefined
  Colorspace: Gray
  Depth: 8-bit
  Channel depth:
    gray: 8-bit
  Channel statistics:
    Pixels: 11845
    Gray:
      min: 62 (0.243137)
      max: 255 (1)
      mean: 202.99 (0.79604)
      standard deviation: 85.6322 (0.335812)
      kurtosis: -0.920271
      skewness: -1.0391
      entropy: 0.840719
  Channel moments:
    Gray:
      Centroid: 51.6405,57.1281
      Ellipse Semi-Major/Minor axis: 66.5375,60.336
      Ellipse angle: 0.117192
      Ellipse eccentricity: 0.305293
      Ellipse intensity: 190.641 (0.747614)
      I1: 0.000838838 (0.213904)
      I2: 6.69266e-09 (0.00043519)
      I3: 3.34956e-15 (5.55403e-08)
      I4: 5.38335e-15 (8.92633e-08)
      I5: 2.27572e-29 (6.25692e-15)
      I6: -4.33202e-19 (-1.83169e-09)
      I7: -2.16323e-30 (-5.94763e-16)
      I8: 3.96612e-20 (1.67698e-10)
  Channel perceptual hash:
    Red, Hue:
      PH1: 0.669868, 11
      PH2: 3.35965, 11
      PH3: 7.27735, 11
      PH4: 7.05343, 11
      PH5: 11, 11
      PH6: 8.746, 11
      PH7: 11, 11
    Green, Chroma:
      PH1: 0.669868, 11
      PH2: 3.35965, 11
      PH3: 7.27735, 11
      PH4: 7.05343, 11
      PH5: 11, 11
      PH6: 8.746, 11
      PH7: 11, 11
    Blue, Luma:
      PH1: 0.669868, 0.669868
      PH2: 3.35965, 3.35965
      PH3: 7.27735, 7.27735
      PH4: 7.05343, 7.05343
      PH5: 11, 11
      PH6: 8.746, 8.746
      PH7: 11, 11
  Channel features (horizontal, vertical, left and right diagonals, average):
    Gray:
      Angular Second Moment:
        0.364846, 0.615673, 0.372224, 0.372224, 0.431242
      Contrast:
        0.544246, 0.0023846, 0.546612, 0.546612, 0.409963
      Correlation:
        -0.406263, 0.993832, -0.439964, -0.439964, -0.07309
      Sum of Squares Variance:
        1.19418, 1.1939, 1.19101, 1.19101, 1.19253
      Inverse Difference Moment:
        0.737681, 1.00758, 0.745356, 0.745356, 0.808993
      Sum Average:
        1.63274, 0.546074, 1.63983, 1.63983, 1.36462
      Sum Variance:
        4.43991, 0.938019, 4.46048, 4.46048, 3.57472
      Sum Entropy:
        0.143792, 0.159713, 0.143388, 0.143388, 0.14757
      Entropy:
        0.462204, 0.258129, 0.461828, 0.461828, 0.410997
      Difference Variance:
        0.0645055, 0.189604, 0.0655494, 0.0655494, 0.0963021
      Difference Entropy:
        0.29837, 0.003471, 0.297282, 0.297282, 0.224101
      Information Measure of Correlation 1:
        -0.160631, -0.971422, -0.146024, -0.146024, -0.356026
      Information Measure of Correlation 2:
        0.294281, 0.625514, 0.29546, 0.29546, 0.377679
You could also go on Fred Weinhaus's excellent website (here) and download his script called moments which will calculate the Hu and Maitra moments and see if those will tell you what you want. Basically, you could run the script on each of your images like this:
./moments image1.png > 1.txt
./moments image2.png > 2.txt
and then use your favourite diff tool to see what has changed between the two images you wish to compare.
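If you would rather compute a perceptual hash in code than parse identify output, here is a sketch using the third-party Python imagehash library (an assumption on my part that it suits your images; install with pip install imagehash):

from PIL import Image
import imagehash  # third-party: pip install imagehash

# Compare two images by perceptual hash; the Hamming distance between
# the hashes is small for visually similar images.
h1 = imagehash.phash(Image.open("1.png"))
h2 = imagehash.phash(Image.open("2.png"))
print(h1 - h2)   # 0 = identical hashes; larger = more different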
I am still a beginner in coding. I am currently working on a program in C/C++ that determines the pixel position of a defined mark (a black circle with white surroundings) in a photo.
I made a mask from the mark and a vector that contains every pixel value of the mask as its elements (using Magick++ I summed the values for Red, Green and Blue). The vector contains approx. 10,000 values, since the mask is 100x100 px. I also used threshold functions to simplify the image.
Then I made a grid that does the same for the picture in which I want to find the coordinates of the mark. It is basically a loop going through the image; once the program knows the pixel values in a grid position, it immediately compares them with the mask. The main idea is to find the lowest difference between the mask and one of the grid positions.
The problem, however, is that evaluating every grid position takes a huge amount of time (e.g. the image is 1920x1080 px, so there are more than 2 million vectors containing 10,000 values each). I decided to step the grid not by every pixel but, for example, by every 10th column and row, and then, around the best correlation from this pass, I selected an area where I used the every-pixel loop. But this still takes a lot of time.
I would like to ask whether there is some way of improving this method for better (faster) results, or whether the whole idea is not time-efficient and I should use a different approach.
Thanks for every piece of advice!
Edit: The program will be used to process multiple images, and the size will be the same for all of them. This is the picture after thresholding; the mark is the big black dot.
[image]
The idea that I find interesting is a pyramidal scheme, or progressive refinement: you find the spot in a reduced-size image, then search only a small rectangle in the full-size image.
If you reduce your image by 2 in each dimension, you reduce the time by a factor of 4, plus some small search effort in the larger image.
This has some problems: I expect the reduction will affect accuracy, and you might miss the spot.
You also have to scale the sample (template) down by the same factor, so in this case you create a half-size template. As you keep halving, the template gets blurred into the surrounding objects, so at some point it is no longer a valid template; after halving once, I guess the dot still has a couple of pixels around it. A rough sketch of this coarse-to-fine search is given below.
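Here is a rough Python/OpenCV sketch of that coarse-to-fine scheme (the question uses C/C++, but the same calls exist in OpenCV's C++ API; the file names and the 16-pixel search margin are my own placeholders):

import cv2

# Coarse pass: template matching at half resolution.
img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)
tpl = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)
small_img = cv2.resize(img, None, fx=0.5, fy=0.5)
small_tpl = cv2.resize(tpl, None, fx=0.5, fy=0.5)
res = cv2.matchTemplate(small_img, small_tpl, cv2.TM_SQDIFF_NORMED)
_, _, min_loc, _ = cv2.minMaxLoc(res)    # TM_SQDIFF: best match is the minimum

# Fine pass: search a small window around the coarse hit at full resolution.
x, y = min_loc[0] * 2, min_loc[1] * 2    # scale coordinates back up
pad = 16
x0, y0 = max(0, x - pad), max(0, y - pad)
roi = img[y0:y + tpl.shape[0] + pad, x0:x + tpl.shape[1] + pad]
res = cv2.matchTemplate(roi, tpl, cv2.TM_SQDIFF_NORMED)
_, _, best, _ = cv2.minMaxLoc(res)
print("mark at", (x0 + best[0], y0 + best[1]))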
As you haven't specified a tool or OS, I will choose ImageMagick which is installed on most Linux distros and is available for OSX and Windows. I am just using it at the command-line here but there are C, C++, Python, Perl, PHP, Ruby, Java and .Net bindings available.
I would use a "Connected Components Analysis", or "Blob Analysis", like this:
convert image.png -negate \
-define connected-components:area-threshold=1200 \
-define connected-components:verbose=true \
-connected-components 8 -auto-level result.png
I have inverted your image with -negate because in morphological operations, the foreground is usually white rather than black. I have excluded blobs smaller than 1200 pixels because your circles seem to have a radius of 22 pixels which makes for an area of 1520 pixels (Pi * 22^2).
That gives this output, which means 7 blobs - one per line - with the bounding box and area of each:
Objects (id: bounding-box centroid area mean-color):
0: 1358x1032+0+0 640.8,517.0 1296947 gray(0)
3: 341x350+1017+287 1206.5,468.9 90143 gray(255)
106: 64x424+848+608 892.2,829.3 6854 gray(255)
95: 38x101+44+565 61.5,619.1 2619 gray(255)
49: 17x145+1341+379 1350.3,446.7 2063 gray(0)
64: 43x43+843+443 864.2,464.1 1451 gray(255)
86: 225x11+358+546 484.7,551.9 1379 gray(255)
Note that, as your circle is 42x42 pixels, you will be looking for a blob that is square-ish and close to that size, so I am looking at the second-to-last line. I can draw that in red on your original image like this:
convert image.png -fill none -stroke red -draw "rectangle 843,443 886,486" result.png
Also, note that as you are looking for a circle, you would expect the area to be pi * r^2 or around 1500 pixels and you can check that in the penultimate column of the output.
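For comparison, the same blob analysis can be sketched in Python with OpenCV (my own illustration, not part of the ImageMagick approach; the thresholds are taken from the numbers above):

import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)
# Invert while thresholding so the black mark becomes white foreground.
_, binary = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY_INV)

n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary, connectivity=8)
for i in range(1, n):                    # label 0 is the background
    x, y, w, h, area = stats[i]
    # Keep square-ish blobs with an area near pi * r^2 (~1500 px here).
    if area > 1200 and abs(w - h) <= 3:
        print("candidate mark:", (x, y, w, h), "area", area)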
That runs in 0.4 seconds on a reasonable spec iMac. Note that you could divide the image into 4 and run each quarter in parallel to speed things up. So, if you do something like this:
#!/bin/bash
# Split image into 4 (maybe should allow 23 pixels overlap)
convert image.png -crop 1x4@ tile-%02d.mpc
# Do Blob Analysis on 4 strips in parallel
for f in tile-*mpc; do
convert "$f" -negate \
-define connected-components:area-threshold=1200 \
-define connected-components:verbose=true \
-connected-components 8 info: &
done
# Wait for all 4 to finish
wait
That runs in around 0.14 seconds.
I need to calculate the contrast of a color image. The steps that were given to me are:
1. Compute the histogram for each RGB channel separately and combine them as Histogram = histOfRedC + histOfBlueC + histOfgreenC.
2. Normalize it to unit length, since each image is of a different size.
3. The contrast quality is equal to the width of the middle 98% mass of the histogram.
I have done the first two steps but am unable to understand what to compute in the third step. Can somebody please explain what it means?
Let the total mass of the histogram be M.
Accumulate the mass in the bins, starting from index zero, until you pass 0.01 M. You get an index Q01.
Decumulate the mass in the bins, starting from the maximum index, until you pass 0.99 M. You get an index Q99.
These indices are the so-called 1st and 99th percentiles. The contrast is estimated as Q99 - Q01. A minimal sketch of this computation follows.
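A minimal numpy sketch of that computation, assuming hist is the combined histogram from step 1 (np.searchsorted finds the first index at which the cumulative mass passes each threshold):

import numpy as np

def contrast_width(hist):
    hist = np.asarray(hist, dtype=float)
    m = hist.sum()                           # total mass M
    csum = np.cumsum(hist)
    q01 = np.searchsorted(csum, 0.01 * m)    # first index passing 0.01 M
    q99 = np.searchsorted(csum, 0.99 * m)    # first index passing 0.99 M
    return q99 - q01

# Example: a flat 256-bin histogram -> width close to 0.98 * 256.
print(contrast_width(np.ones(256) / 256.0))  # 251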