Algorithm for 'pixelated circle' image recognition - c++

Here are three sample images. In these images I want to find:
Coordinates of those small pixelated partial circles.
Rotation of these circles. These circles have a 'pointy' side, and I want to find its direction.
For example, the coordinates and the angle with the positive x axis of the small partial circle in the
first image are (51 px, 63 px) and 240 degrees, respectively;
second image are (50 px, 52 px) and 300 degrees, respectively;
third image are (80 px, 29 px) and 225 degrees, respectively.
I don't care about scale invariance.
Methods I have tried:
ORB feature detection
SIFT feature detection
Feature detection doesn't seem to work here.
Above is an example of the ORB feature detector finding similar features in the 1st and 2nd images.
It is finding one correct match; the rest are wrong.
Probably because these images are too low-resolution to find any meaningful corners or blobs. The corners and blobs it does find are not much different from the other pixelated objects present.
I have seen people use erosion and dilation to remove noise, but my objects are too small for that to work.
Perhaps some other feature detector can help?
I am also thinking about the Generalized Hough transform; however, I can't find a complete tutorial on implementing it with OpenCV (C++). I also want something that is fast, hopefully real-time.
Any help is appreciated.

If the small circles have constant size, then you might try a convolution.
This is a quick and dirty test I ran with ImageMagick for speed, with coefficients basically pulled out of thin air:
convert test1.png -define convolve:scale='!' -morphology Convolve \
"12x12: \
-9,-9,-9,-9,-9,-9,-9,-9,-9,-9,-9,-9 \
-9,-7,-2,-1,0,0,0,0,-1,-2,-7,-9 \
-9,-2,-1,0,9,9,9,9,0,-1,-2,-9 \
-9,-1,0,9,7,7,7,7,9,0,-1,-9 \
-9,0,9,7,-9,-9,-9,-9,7,9,0,-9 \
-9,0,9,7,-9,-9,-9,-9,7,9,0,-9 \
-9,0,9,7,-9,-9,-9,-9,7,9,0,-9 \
-9,0,9,7,-9,-9,-9,-9,7,9,0,-9 \
-9,-1,0,9,7,7,7,7,9,0,-1,-9 \
-9,-2,0,0,9,9,9,9,0,0,-2,-9 \
-9,-7,-2,-1,0,0,0,0,-1,-2,-7,-9 \
-9,-9,-9,-9,-9,-9,-9,-9,-9,-9,-9,-9" \
test2.png
I then ran a simple level stretch plus contrast to bring out what already were visibly more luminous pixels, and a sharpen/reduction to shrink pixel groups to their barycenters (these last operations could be done by multiplying the matrix by the proper kernel), and got this.
The source image on the left is converted to the output on the right, the pixels above a certain threshold mean "circle detected".
Once this is done, I imagine the "pointy" end can be refined with a modified quincunx - use a 3x3 square grid centered on the center pixel, count the total luminosity in each of the eight peripheral squares, and that ought to give you a good idea of where the "point" is. You might want to apply thresholding to offset a possible blurring of the border (the centermost circle in the example below, the one inside the large circle, could give you a false reading).
For example, if we know the coordinates of the center in the grayscale matrix M, and we imagine the circle having diameter of 7 pixels (this is more or less what the convolution above says), we would do
unsigned int quic[3][3] = { { 0, 0, 0 }, { 0, 0, 0 }, { 0, 0, 0 } };
for (int y = -3; y <= 3; y++) {
    for (int x = -3; x <= 3; x++) {
        if (matrix[cy+y][cx+x] > threshold) {
            // Map the 7x7 neighbourhood (-3..3) onto the 3x3 grid of squares.
            quic[(y+4)/3][(x+4)/3] += matrix[cy+y][cx+x];
        }
    }
}
// Now, we find which quadrant in quic holds the maximum:
// if it is, say, quic[2][0], the point is southeast.
// 0 1 2 x
// 0 NE N NW
// 1 E X W
// 2 SE S SW
// y
// Value X (1,1) is totally unlikely - the convolution would
// not have found the circle in the first place if it was so
For an accurate result you would have to use "sub-pixel" addressing, which is slightly more complicated. With the method above, one of the circles results in these quincunx values, which give a point to the southeast:
Needless to say, with this kind of resolution the use of a finer grid is pointless, you'd get an error of the same order of magnitude.
I've tried with some random doodles and the convolution matrix has a good rejection of non-signal shapes, but of course this is due to information about the target's size and shape - if that assumption fails, this approach will be a dead end.
It would help to know the image source: there are several tricks used in astronomy and medicine to detect specific shapes or features.
Python opencv2
The above can be implemented with Python:
#!/usr/bin/python3
import cv2
import numpy as np
# Scaling factor
d = 240
kernel1 = np.array([
[ -9,-9,-9,-9,-9,-9,-9,-9,-9,-9,-9,-9 ],
[ -9,-7,-2,-1,0,0,0,0,-1,-2,-7,-9 ],
[ -9,-2,-1,0,9,9,9,9,0,-1,-2,-9 ],
[ -9,-1,0,9,7,7,7,7,9,0,-1,-9 ],
[ -9,0,9,7,-9,-9,-9,-9,7,9,0,-9 ],
[ -9,0,9,7,-9,-9,-9,-9,7,9,0,-9 ],
[ -9,0,9,7,-9,-9,-9,-9,7,9,0,-9 ],
[ -9,0,9,7,-9,-9,-9,-9,7,9,0,-9 ],
[ -9,-1,0,9,7,7,7,7,9,0,-1,-9 ],
[ -9,-2,0,0,9,9,9,9,0,0,-2,-9 ],
[ -9,-7,-2,-1,0,0,0,0,-1,-2,-7,-9 ],
[ -9,-9,-9,-9,-9,-9,-9,-9,-9,-9,-9,-9 ]
], dtype = np.single)
sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])
image = cv2.imread('EuDpD.png')
# Scale kernel
for i in range(0, 12):
    for j in range(0, 12):
        kernel1[i][j] = kernel1[i][j]/d
identify = cv2.filter2D(src=image, ddepth=-1, kernel=kernel1)
# Sharpen image
identify = cv2.filter2D(src=identify, ddepth=-1, kernel=sharpen)
# Cut at ~90% of maximum
ret,thresh = cv2.threshold(identify, 220, 255, cv2.THRESH_BINARY)
cv2.imwrite('identify.png', thresh)
The above, run on the grayscaled image (left), gives the following result (right). A better sharpening or adaptive thresholding could narrow it down to a single pixel.
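To tie this back to the quincunx idea from the first part of the answer, a rough continuation (my own sketch, not from the original; border handling omitted) could take the brightest response as the centre and compare 3x3 blocks of the grayscale input around it:
# Continues from the variables above (`image`, `thresh`).
resp = cv2.cvtColor(thresh, cv2.COLOR_BGR2GRAY)
_, _, _, (cx, cy) = cv2.minMaxLoc(resp)            # brightest response = circle centre

gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY).astype(np.float32)
best, angle = -1.0, None
directions = [(1, 0), (1, -1), (0, -1), (-1, -1),
              (-1, 0), (-1, 1), (0, 1), (1, 1)]    # E, NE, N, NW, W, SW, S, SE (y up)
for k, (dx, dy) in enumerate(directions):
    x0, y0 = cx + 3 * dx, cy + 3 * dy              # 3x3 block centred 3 px away
    s = gray[y0 - 1:y0 + 2, x0 - 1:x0 + 2].sum()   # total luminosity in that square
    if s > best:
        best, angle = s, k * 45                    # degrees from the +x axis
print((cx, cy), angle)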

Related

How to compute Pairwise L1 Distance matrix on very large images in neighborhood only?

I am working on a deep learning approach for my project, and I need to calculate a distance matrix on a 4D tensor of size N x 128 x 64 x 64 (Batch Size x Channels x Height x Width). The distance matrix for this type of tensor will be of size N x 128 x 4096 x 4096, and it will be impossible to fit such a tensor on a GPU; even on a CPU it will require a lot of memory. So, I would like to calculate the distance matrix only over some neighborhood of pixels (say, within a radius of 5) and use this rectangular matrix for further processing in the neural network. With this approach my distance matrix will be of size N x 128 x 4096 x 61, which takes much less memory than the full distance matrix.
Precisely, I am trying to implement the Convolution Random Walk Networks for Semantic Segmentation. This network needs to calculate the Pairwise L1 Distance for features.
Architecture
Just to add: this type of distance matrix is usually calculated for image segmentation via spectral clustering.
For Example
X = [[a,b],[c,d]]
L1_dist = [ [0, |a-b|, |a-c|, 0],
[|a-b|, 0, 0, |b-d|],
[|a-c|, 0, 0, |c-d| ],
[0, |b-d|, |c-d|, 0 ]
]
Final_L1_dist = [ [0, |a-b|, |a-c|], // "a" is near to b and c. Including self element i.e. a
[|a-b|, 0, |b-d|], // "b" is near to a and d.
[|a-c|, 0, |c-d| ], // "c" is near to a and d.
[|b-d|, |c-d|, 0 ] // "d" is near to b and c.
]
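For concreteness, the neighborhood pattern above can be checked with a small numpy snippet (the values a=1, b=2, c=4, d=7 are made up; only the 4-neighbours of the 2x2 grid are kept):
import numpy as np

X = np.array([[1.0, 2.0],
              [4.0, 7.0]])            # a, b, c, d
H, W = X.shape
L1 = np.zeros((H * W, H * W))
for i in range(H):
    for j in range(W):
        for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:   # 4-neighbours
            ni, nj = i + di, j + dj
            if 0 <= ni < H and 0 <= nj < W:
                L1[i * W + j, ni * W + nj] = abs(X[i, j] - X[ni, nj])
print(L1)
# [[0. 1. 3. 0.]
#  [1. 0. 0. 5.]
#  [3. 0. 0. 3.]
#  [0. 5. 3. 0.]]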
I would appreciate it if someone could help me find an efficient way to compute such a matrix.
Thanks
As far as I understand, the goal is to apply a minus operation between each pixel and its surrounding neighbors. This sounds like convolution to me.
Consider the following convolution process (assume padding='SAME'):
The 3x3 kernel calculates, for each pixel, the difference between the center pixel and its left one. For other neighbors, consider the following kernels:
Thus the goal can be achieved via the following:
Repeat each kernel num_channels times using tf.tile;
Apply each kernel channel-wise using tf.nn.depthwise_conv2d;
Take tf.abs to get the distance;
Reshape each distance tensor to NxCx(HW)x1 and stack them properly.
For an efficient loop over the kernels, you may consider using tf.map_fn. A rough sketch of the kernel-per-neighbor idea is below.
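Roughly, the per-neighbour kernels could be built and applied like this (a TF 2.x sketch assuming NHWC layout and a known channel count; the helper name neighbor_l1 is mine). The result can then be reshaped/transposed to the NxCx(HW)x8 layout you describe:
import numpy as np
import tensorflow as tf

def neighbor_l1(features):
    """features: [N, H, W, C] float tensor (NHWC assumed for simplicity)."""
    c = features.shape[-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               ( 0, -1),          ( 0, 1),
               ( 1, -1), ( 1, 0), ( 1, 1)]
    dists = []
    for dy, dx in offsets:
        k = np.zeros((3, 3, 1, 1), np.float32)
        k[1, 1, 0, 0] = 1.0             # center pixel
        k[1 + dy, 1 + dx, 0, 0] -= 1.0  # minus this neighbour
        k = tf.tile(tf.constant(k), [1, 1, c, 1])   # one copy per channel
        d = tf.nn.depthwise_conv2d(features, k,
                                   strides=[1, 1, 1, 1], padding='SAME')
        dists.append(tf.abs(d))         # |center - neighbour| per channel
    return tf.stack(dists, axis=-1)     # [N, H, W, C, 8]

out = neighbor_l1(tf.random.normal([1, 64, 64, 128]))   # -> [1, 64, 64, 128, 8]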

combined Scharr derivatives in opencv

I have few questions regarding Scharr derivatives and its OpenCV implementation.
I am interested in second order image derivatives with (3X3) kernels.
I started with the Sobel second derivative, which failed to find some thin lines in the images. After reading the Sobel and Scharr comparison at the bottom of this page, I decided to try Scharr instead by changing this line:
Sobel(gray, grad, ddepth, 2, 2, 3, scale, delta, BORDER_DEFAULT);
to this line:
Scharr(img, gray, ddepth, 2, 2, scale, delta, BORDER_DEFAULT );
My problem is that it seems cv::Scharr only allows computing a first-order derivative of one variable at a time, so I get the following error:
error: (-215) dx >= 0 && dy >= 0 && dx+dy == 1 in function getScharrKernels
(see assertion line here)
Following this restriction, I have a few questions regarding Scharr derivatives:
Is it considered bad-practice to use high order Scharr derivatives? Why did OpenCV choose to assert dx+dy == 1?
If I am to call Scharr twice, once for each axis, what is the correct way to combine the results?
I am currently using:
addWeighted( abs_grad_x, 0.5, abs_grad_y, 0.5, 0, grad );
but I am not sure that this is how the Sobel function combines the two axes, or in what order it should be done for all 4 derivatives.
If I am to compute the (dx=2,dy=2) derivative by using 4 different kernels, I would like to reduce processing time by unifying all 4 kernels into 1 before applying it to the image (I assume that this is what cv::Sobel does). Is there a reasonable way to create such a combined Scharr kernel and convolve it with my image?
Thanks!
I've never read the original Scharr paper (the dissertation is in German) so I don't know the answer to why the Scharr() function doesn't allow higher order derivatives. Maybe because of the first point I make in #3 below?
The Scharr function is supposed to be a derivative. And the total derivative of a multivariable function f(x) = f(x0, ..., xN) is
df = (df/dx0)*dx0 + ... + (df/dxN)*dxN
That is, the sum of the partials, each multiplied by the change. In the case of images of course, the change dx in the input is a single pixel, so it's equivalent to 1. In other words, just sum the partials; don't weight them by half. You can use addWeighted() with 1s as the weights, or you can just sum them, but to make sure you won't saturate your image you'll need to convert to a float or 16-bit image first. However, it's also pretty common to compute the Euclidean magnitude of the derivatives if you're trying to get the gradient instead of the derivative.
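For the first-order case, a minimal sketch of that combination could look like this ('input.png' is just a placeholder file name, not from the question):
import cv2
import numpy as np

# Convert to float first so the sum of partials cannot saturate.
img = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
gx = cv2.Scharr(img, cv2.CV_32F, 1, 0)
gy = cv2.Scharr(img, cv2.CV_32F, 0, 1)
total = cv2.addWeighted(gx, 1.0, gy, 1.0, 0)   # sum of the partials
grad_mag = cv2.magnitude(gx, gy)               # Euclidean gradient magnitude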
However, that's just for the first-order derivative. For higher orders, you need to apply some chain ruling. See here for the details of combining a second order.
Note that an optimized kernel for first-order derivatives is not necessarily the optimal kernel for second-order derivatives by applying it twice. Scharr himself has a paper on optimizing second-order derivative kernels, you can read it here.
With that said, filters are split into x and y directions to make linearly separable filters, which basically turn your 2D convolution problem into two 1D convolutions with smaller kernels. Think of the Sobel and Scharr kernels: for the x direction, they both just have a single column on either side with the same values (except one is negative). When you slide the kernel across the image, at the first location you're multiplying the first column and the third column by the values in your kernel. And then two steps later, you're multiplying the third and the fifth. But the third was already computed, so that's wasteful. Instead, since both sides are the same, just multiply each column by the vector, since you know you need those values, and then you can just look up the results for columns 1 and 3 and subtract them.
In short, I don't think you can combine them with built-in separable filter functions, because certain values are positive sometimes, and negative otherwise; and the only way to know when applying a filter linearly is to do them separately. However, we can examine the result of applying both filters and see how they affect a single pixel, construct the 2D kernel, and then convolve with OpenCV.
Suppose we have a 3x3 image:
image
=====
a b c
d e f
g h i
And we have the Scharr kernels:
kernel_x
========
-3 0 3
-10 0 10
-3 0 3
kernel_y
========
-3 -10 -3
0 0 0
3 10 3
The result of applying each kernel to this image gives us:
image * kernel_x
================
-3a +0b +3c
-10d +0e +10f
-3g +0h +3i
image * kernel_y
================
-3a -10b -3c
+0d +0e +0f
+3g +10h +3i
In each case, these values are summed and placed into pixel e. Since the sum of both of these is the total derivative, we sum all these values into pixel e at the end of the day:
image * kernel_x + image * kernel_y
===================================
-3a +3c -10d +10f -3g +3i
-3a -10b -3c +3g +10h +3i
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
-6a -10b +0c -10d +0e +10f +0g +10h +6i
And this is the same result we'd have gotten if we multiplied by the kernel
kernel_xy
=============
-6 -10 0
-10 0 10
0 10 6
So there's a 2D kernel that does a single-order derivative. Notice anything interesting? It's just the addition of the two kernels. Is that surprising? Not really, as x(a+b) = ax + bx. Now we can pass that into filter2D()
to compute the addition of the derivatives. Does that actually give the same result?
import cv2
import numpy as np
img = cv2.imread('cameraman.png', 0).astype(np.float32)
kernel = np.array([[ -6, -10,   0],
                   [-10,   0,  10],
                   [  0,  10,   6]], dtype=np.float32)
total_first_derivative = cv2.filter2D(img, -1, kernel)
scharr_x = cv2.Scharr(img, -1, 1, 0)
scharr_y = cv2.Scharr(img, -1, 0, 1)
print((total_first_derivative == (scharr_x + scharr_y)).all())
True
Yep. Now I guess you can just do it twice.
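As a literal rendering of the "do it twice" remark (just a sketch, reusing img and kernel from above; see the earlier caveat that a dedicated second-order kernel is generally preferable):
# Apply the combined first-order kernel twice to get a second application of the operator.
second_pass = cv2.filter2D(cv2.filter2D(img, -1, kernel), -1, kernel)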

Negative focal length in Camera Calibration Matrix

While trying to work with the ICL_NUIM dataset I have run into some issues. The camera calibration provided on the site has the following values:
481.20, 0, 319.50
0, -480.00, 239.50
0, 0, 1
where:
fx = 481.20
fy = -480.00
cu = 319.50
cv = 239.50
I am struggling to intuitively understand how the fy can have a negative value.
The mathematical effect is simply that the image is vertically inverted.
This is equivalent to the image as it appears in the classical pinhole model, on the back plane of the camera box. In that case both fx and fy would be negative, but you get the idea.
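A tiny numpy illustration of that vertical flip (the 3D points here are made up, purely for illustration):
import numpy as np

K = np.array([[481.20,    0.00, 319.50],
              [  0.00, -480.00, 239.50],
              [  0.00,    0.00,   1.00]])

def project(K, X):
    # Pinhole projection of a 3D point X = (x, y, z) in camera coordinates.
    p = K @ X
    return p[:2] / p[2]

print(project(K, np.array([0.0, 0.1, 1.0])))   # [319.5 191.5]
print(project(K, np.array([0.0, 0.2, 1.0])))   # [319.5 143.5] -> v decreases as y grows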

How to find an Equivalent point in a Scaled down image?

I would like to calculate the corner points or contours of the star in a larger image. For that I'm scaling the image down to a smaller size, and I'm able to get these points clearly. Now how do I map these points back to the original image? I'm using OpenCV C++.
Consider a trivial example: the image size is reduced exactly by half.
So, the Cartesian coordinate (x, y) in the original image becomes coordinate (x/2, y/2) in the reduced image, and coordinate (x', y') in the reduced image corresponds to coordinate (x'*2, y'*2) in the original image.
Of course, fractional coordinates get typically rounded off, in a reduced scale image, so the exact mapping is only possible for even-numbered coordinates in this example's original image.
Generalizing this, if the image is scaled by a factor of w horizontally and h vertically, coordinate (x, y) becomes coordinate (x*w, y*h), rounded off. In the example I gave, both w and h are 1/2, or 0.5.
You should be able to figure out the values of w and h yourself, and be able to map the coordinates trivially. Of course, due to rounding off, you will not be able to compute the exact coordinates in the original image.
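For example (a small sketch with made-up sizes; in practice you compute w and h from your own image dimensions):
orig_w, orig_h = 1280, 960        # original image
small_w, small_h = 320, 240       # reduced image
w = small_w / orig_w              # 0.25
h = small_h / orig_h              # 0.25

xs, ys = 87, 29                   # a point found in the reduced image

x_orig = round(xs / w)            # 348, exact only up to rounding
y_orig = round(ys / h)            # 116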
I realize this is an old question. I just wanted to add to Sam's answer above, to deal with "rounding off", in case other readers are wondering the same thing I faced.
This rounding off becomes obvious for even # of pixels across a coordinate axis. For instance, along a 1-D axis, a point demarcating the 2nd quartile gets mapped to an inaccurate value:
axis_prev = [0, 1, 2, 3]
axis_new = [0, 1, 2, 3, 4, 5, 6, 7]
w_prev = len(axis_prev) # This is an axis of length 4
w_new = len(axis_new) # This is an axis of length 8
x_prev = 2
x_new = x_prev * w_new / w_prev
print(x_new)
>>> 4
### x_new should be 5
In Python, one strategy would be to linearly interpolate values from one axis resolution to another axis resolution. Say for the above, we wish to map a point from the smaller image to its corresponding point of the star in the larger image:
import numpy as np
from scipy.interpolate import interp1d

# Both arrays must contain the same number of samples; here the 641 sample
# positions of the smaller axis are mapped onto the span of the larger axis.
x_old = np.linspace(0, 640, 641)
x_new = np.linspace(0, 768, 641)
f = interp1d(x_old, x_new)

x = 35
x_prime = f(x)   # ~42.0

How to detect image gradient or normal using OpenCV

I wanted to detect ellipse in an image. Since I was learning Mathematica at that time, I asked a question here and got a satisfactory result from the answer below, which used the RANSAC algorithm to detect ellipse.
However, recently I needed to port it to OpenCV, but there are some functions that only exist in Mathematica. One of the key functions is the "GradientOrientationFilter" function.
Since there are five parameters for a general ellipse, I need to sample five points to determine one. However, more sampling points means a lower chance of a good guess, which leads to a lower success rate in ellipse detection. Therefore, the answer from Mathematica adds another condition: the gradient of the image must be parallel to the gradient of the ellipse equation. Anyway, we only need three points to determine one ellipse using least squares with the Mathematica approach. The result is quite good.
However, when I try to find the image gradient using Sobel or Scharr operator in OpenCV, it is not good enough, which always leads to the bad result.
How to calculate the gradient or the tangent of an image accurately? Thanks!
Result with gradient, three points
Result without gradient, five points
----------updated----------
I did some edge detection and median blurring beforehand and drew the result on the edge image. My original test image is like this:
In general, my final goal is to detect the ellipse in a scene or on an object. Something like this:
That's why I choose to use RANSAC to fit the ellipse from edge points.
As for your final goal, you may try
findContours and fitEllipse in OpenCV.
The pseudo code will be
1) some image processing
2) find all contours
3) fit each contour with fitEllipse
Here is part of the code I used before:
// [... image processing ... you get a binary image bwimage]
vector<vector<Point> > contours;
findContours(bwimage, contours, CV_RETR_LIST, CV_CHAIN_APPROX_NONE);
for (size_t i = 0; i < contours.size(); i++)
{
    size_t count = contours[i].size();
    if (count < 6)   // fitEllipse needs at least 5 points
        continue;
    Mat pointsf;
    Mat(contours[i]).convertTo(pointsf, CV_32F);
    RotatedRect box = fitEllipse(pointsf);
    /* You can put some limitation on size and aspect ratio here */
    if (box.size.width  > 20 &&
        box.size.height > 20 &&
        box.size.width  < 80 &&
        box.size.height < 80)
    {
        if (MAX(box.size.width, box.size.height) >
            MIN(box.size.width, box.size.height) * 30)
            continue;
        //drawContours(SrcImage, contours, (int)i, Scalar::all(255), 1, 8);
        ellipse(SrcImage, box, Scalar(0, 0, 255), 1, CV_AA);
        ellipse(SrcImage, box.center, box.size * 0.5f, box.angle, 0, 360,
                Scalar(200, 255, 255), 1, CV_AA);
    }
}
imshow("result", SrcImage);
If you focus on ellipses (no other shapes), you can treat the values of the ellipse's pixels as the mass of the points.
Then you can calculate the moments of inertia Ixx, Iyy, Ixy to find the angle, theta, which rotates a general ellipse back to the canonical form (X-Xc)^2/a^2 + (Y-Yc)^2/b^2 = 1.
Then you can find Xc and Yc from the center of mass.
Then you can find a and b from min X and min Y.
--------------- update -----------
This method applies to a filled ellipse too.
More than one ellipse in a single image will fail unless you segment them first.
Let me explain more,
I will use C to represent cos(theta) and S to represent sin(theta)
After rotation to canonical form, the new X is [eq0] X = xC - yS and Y is Y = xS + yC, where x and y are the original positions.
The rotation will give you min IYY.
[eq1]
IYY = Sum(m*Y*Y) = Sum{m*(xS+yC)*(xS+yC)} = Sum{m*(x*x*S*S + y*y*C*C + 2*x*y*S*C)} = Ixx*S^2 + Iyy*C^2 + 2*Ixy*S*C
For min IYY, d(IYY)/d(theta) = 0, that is
2*Ixx*S*C - 2*Iyy*S*C + 2*Ixy*(C*C - S*S) = 0
(Ixx - Iyy)/Ixy = (S*S - C*C)/(S*C) = S/C - C/S = Z - 1/Z, with Z = S/C = tan(theta)
While programming, the LHS is just a number; let's call it N:
Z^2 - N*Z - 1 = 0
So there are two roots Z1 and Z2 (hence two thetas); one minimizes IYY and the other maximizes it.
----------- pseudo code --------
Compute Ixx, Iyy, Ixy for a hollow or filled ellipse.
Compute theta1 = atan(Z1) and theta2 = atan(Z2).
Put these two thetas into eq1 and find which gives the smaller IYY. That is your theta.
Go back to those non-zero pixels and transform them to the new X and Y using the theta you found.
Find the center of mass Xc, Yc, and min X and min Y by sort().
-------------- by hand -----------
If you need the original equation of the ellipse
Just put [eq0] into the canonical form
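A minimal numpy sketch of the pseudo code above (my own; it assumes a single, isolated ellipse in a 2D array mask whose pixel values act as mass, and does not handle the degenerate case Ixy = 0):
import numpy as np

def ellipse_centre_and_angle(mask):
    ys, xs = np.nonzero(mask)
    m = mask[ys, xs].astype(np.float64)          # pixel values as mass
    xc = (m * xs).sum() / m.sum()                # centre of mass
    yc = (m * ys).sum() / m.sum()
    x, y = xs - xc, ys - yc
    Ixx = (m * x * x).sum()
    Iyy = (m * y * y).sum()
    Ixy = (m * x * y).sum()
    N = (Ixx - Iyy) / Ixy                        # assumes Ixy != 0
    Z1 = (N + np.sqrt(N * N + 4)) / 2            # roots of Z^2 - N*Z - 1 = 0
    Z2 = (N - np.sqrt(N * N + 4)) / 2
    best = None
    for Z in (Z1, Z2):
        t = np.arctan(Z)
        S, C = np.sin(t), np.cos(t)
        iyy = Ixx * S * S + Iyy * C * C + 2 * Ixy * S * C   # eq1
        if best is None or iyy < best[1]:
            best = (t, iyy)                      # keep the theta that minimizes IYY
    return (xc, yc), best[0]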
You're using terms in an unusual way.
Normally for images, the term "gradient" is interpreted as if the image were a mathematical function f(x,y). This gives us a (df/dx, df/dy) vector at each point.
Yet you're looking at the image as if it were a function y = f(x), and the gradient would be df/dx.
Now, if you look at your image, you'll see that the two interpretations are definitely related. Your ellipse is drawn as a set of contrasting pixels, and as a result there are two sharp gradients in the image - the inner and outer. These of course correspond to the two normal vectors, and therefore are in opposite directions.
Also note that your image has pixels, so the gradient is also pixelated. The way your ellipse is drawn, with a single-pixel width, means that your local gradient only takes on values that are a multiple of 45 degrees:
▄▄ ▄▀ ▌ ▀▄
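For reference, the standard f(x, y) interpretation of the gradient can be computed in OpenCV along these lines (a sketch; 'edges.png' is a placeholder file name):
import cv2
import numpy as np

img = cv2.imread('edges.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
gx = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=3)   # df/dx
gy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=3)   # df/dy
mag = cv2.magnitude(gx, gy)                      # gradient magnitude
ang = cv2.phase(gx, gy, angleInDegrees=True)     # gradient direction per pixel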