Algorithm behind score calculation in FAST corner detector - C++

How is the score for a detected corner calculated in the FAST corner detector? I read the original paper, "Machine Learning for High Speed Corner Detection", but in the score-calculation portion it is never made explicit which N contiguous pixels are being referred to. Is it the N contiguous pixels that satisfy the corner criterion for that point? I also found the link below:
https://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/AV1011/AV1FeaturefromAcceleratedSegmentTest.pdf which discusses the FAST corner score computation. However, I cannot find any correspondence between the score function described in that paper and the score calculation done by OpenCV for the Bresenham circle of radius 3:
https://github.com/opencv/opencv/blob/master/modules/features2d/src/fast_score.cpp
The score is calculated in the cornerScore<16> function in the file linked above. Apart from these sources, no other article explicitly discusses the score calculation in the FAST feature detector. Can anyone give me some insight into this?
N.B. - I have also looked at the second paper, "Faster and Better: A Machine Learning Approach to Corner Detection", but it makes no explicit mention of the score calculation either.

The docs online confused me too:
The score function is defined as:
“The sum of the absolute difference between the pixels in the contiguous arc and the centre
pixel”
I'm pretty sure OpenCV doesn't calculate the score like that. If you patiently read the source code you mentioned, you will find that the function cornerScore<16> does the following:
1) Get the 16 pixel values on the circle centered at the target pixel.
2) Take a set of 9 contiguous pixels from the 16, calculate the absolute difference between each of the 9 pixels and the center one, and take the minimum of those 9 absolute differences (call it a threshold).
3) Repeat step 2 with each of the 16 pixels as the starting one, giving 16 thresholds.
4) Return the maximum of the 16 thresholds as the corner score.
From this pipeline you can see that the score OpenCV computes is the maximum threshold at which the target pixel still qualifies as a FAST corner.
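Here is a minimal sketch of that pipeline in Python - a direct transcription of the four steps above, not OpenCV's actual implementation, which handles brighter and darker arcs separately and is heavily optimized:

def fast_score(circle_pixels, center_value):
    # circle_pixels: the 16 intensities on the Bresenham circle, in order
    assert len(circle_pixels) == 16
    thresholds = []
    for start in range(16):                          # step 3: every start position
        # step 2: 9 contiguous pixels, minimum abs-diff to the center
        arc = [circle_pixels[(start + k) % 16] for k in range(9)]
        thresholds.append(min(abs(p - center_value) for p in arc))
    return max(thresholds)                           # step 4: the best arc wins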

Related

Find the Peaks of contour in Python-OpenCV

I have a binary image/contour containing four human beings, and I want to detect/count all of them. Since there are occlusions, I think it is best to find the heads, i.e. the maxima in the contour of all the humans; then the humans can be counted.
I am able to get the global maximum (the topmost point), but I want to get all the local maxima.
The code for finding the topmost point is as suggested by Adrian in his blogpost i.e.:
topmost = tuple(biggest_contour[biggest_contour[:,:,1].argmin()][0])  # point with the smallest y, i.e. the topmost pixel
Can anyone please suggest how to get all the local maxima, instead of just the topmost location?
Here is a sample of my image:
The definition of "local maximum" can be tricky to pin down, but if you start with a simple method you'll develop an intuition to look further. Even if there are methods available on the web to do this work for you, it's worth implementing a few basic techniques yourself before you go googling.
One simple method I've used in the past goes something like this:
Find the contours as arrays/lists/containers of (x,y) coordinates.
At each element N (a pixel) in the list, get the pixels at N - D and N + D; that is, the pixel D ahead of the current pixel and the pixel D behind it
Calculate the point-to-point distance
Calculate the distance along the contour from N-D to N+D
Calculate (distanceAlongContour)/(point-to-point distance)
...
There are numerous other ways to do this, but this is quick to implement from scratch, and I think a reasonable starting point: Compare the "geodesic" distance and the Euclidean distance.
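A minimal sketch of this idea, assuming the contour is a closed, ordered (N, 2) array of (x, y) points (the window size D is a value to tune):

import numpy as np

def bend_score(contour, D=10):
    # Ratio of along-contour distance to straight-line distance between the
    # points D behind and D ahead; values well above 1 mean a sharp bend.
    pts = np.asarray(contour, dtype=float)
    n = len(pts)
    seg = np.linalg.norm(np.roll(pts, -1, axis=0) - pts, axis=1)  # edge lengths
    scores = np.empty(n)
    for i in range(n):
        edges = [(i - D + k) % n for k in range(2 * D)]   # edges from i-D to i+D
        geodesic = seg[edges].sum()
        chord = np.linalg.norm(pts[(i + D) % n] - pts[(i - D) % n])
        scores[i] = geodesic / max(chord, 1e-9)
    return scores

Local maxima of this score along the contour are the peak candidates - the head tops, in this image.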
A few other possibilities:
Do a bunch of curve fits to chunks of pixels from the contour. (Lots of details to investigate here.)
Use Ramer-Douglas-Peucker to reduce the outlines to polygons, then choose parameters to ensure those polygons are appropriately simplified, and check for vertices with angles that deviate much from 180 degrees (see the sketch after this list). (Second time I've mentioned R-D-P today; it's handy.)
Try a corner detector. Crude, but easy to implement.
Implement an edge follower that moves from one pixel to the next in the contour list, and calculate some kind of "inertia" as the pixel shifts direction. This wouldn't be useful on a pixel-by-pixel basis, but you could compare, say, pixels N-1,N,N+1 to pixels N+1,N+2,N+3. Or just calculate the angle between them.
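For the Ramer-Douglas-Peucker option above, a rough sketch using OpenCV's implementation (cv2.approxPolyDP); epsilon and the angle threshold are illustrative values to tune:

import cv2
import numpy as np

def sharp_vertices(contour, epsilon=5.0, max_angle_deg=135.0):
    # Simplify the contour, then keep vertices whose interior angle
    # deviates strongly from a straight 180-degree run.
    poly = cv2.approxPolyDP(contour, epsilon, True).reshape(-1, 2).astype(float)
    keep = []
    n = len(poly)
    for i in range(n):
        a, b, c = poly[i - 1], poly[i], poly[(i + 1) % n]
        v1, v2 = a - b, c - b
        cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
        angle = np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))
        if angle < max_angle_deg:
            keep.append(tuple(b))
    return keep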

largest inscribed rectangle in arbitrary polygon

I've worked with OpenCV stitching for a while. Now I want to do the last step of stitching: cropping the image. This comes down to finding the largest inscribed axis-parallel rectangle in a general polygon.
I've already googled it and found some answers (How do I crop to largest interior bounding box in OpenCV?). The quality of the output image is good, but the program runs slowly (it takes 15 s to crop the image, whereas it takes only 47 s to stitch 36 1600x1200 pictures into one panorama) because the algorithm used has bad time complexity (for each point in the contour, it scans all points in the same row/column).
Is there any way to improve this? Thanks.
P/S: I also found this book:
Finding the Largest Area Axis-Parallel Rectangle in a Polygon
Karen Daniels, Victor Milenkovic, Dan Roth. Harvard University,
Division of Applied Sciences,
Center for Research in Computing Technology,
Cambridge, MA 02138.
June 1995
but I have no idea how to implement the theory in code :v
You probably don't want to implement that algorithm; it would take quite a while, and I suspect that you would be disappointed with the performance in spite of the big-O bound.
It sounds as though you're working with a raster anyway, so you could use a linear-time algorithm for finding the largest rectangle of zeros in a binary matrix.
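A hedged sketch of that classic scan: treat each row as the base of a histogram of consecutive zeros above it, and solve the largest-rectangle-in-histogram problem per row with a monotonic stack, for O(rows x cols) overall. This returns only the area; tracking the corner coordinates is a small extension.

def largest_rect_in_histogram(heights):
    stack, best = [], 0                        # stack of indices, heights increasing
    for i, h in enumerate(heights + [0]):      # trailing 0 flushes the stack
        while stack and heights[stack[-1]] >= h:
            top = stack.pop()
            width = i - stack[-1] - 1 if stack else i
            best = max(best, heights[top] * width)
        stack.append(i)
    return best

def max_rect_of_zeros(matrix):
    heights = [0] * len(matrix[0])
    best = 0
    for row in matrix:
        for j, v in enumerate(row):            # grow histograms over runs of zeros
            heights[j] = heights[j] + 1 if v == 0 else 0
        best = max(best, largest_rect_in_histogram(heights))
    return best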
Maybe have a look at this largest interior rectangle implementation. It uses the algorithm described in this paper.
At the moment I'm implementing the cropping functionality in the stitching package.

What does the eigenvalue of the structure tensor matrix denote?

It is known that a good feature point across two images can be determined properly if the two eigenvalues of the above matrix are greater than 0. Can someone explain what it means to have both eigenvalues greater than 0, and why the feature point is not good if either of them is approximately equal to 0?
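(For reference, the matrix in question is presumably the standard structure tensor, summed over a window W with weights w:

M = \sum_{(x,y) \in W} w(x,y) \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}

where I_x and I_y are the image derivatives at each pixel.)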
Note that this matrix always has nonnegative eigenvalues. Basically this rule says that one should favor rapid change in all directions, that is corners are better features than edges or flat surfaces.
The biggest eigenvalue corresponds to the eigenvector pointing towards the direction of the most significant change in the image at the point u.
If the two eigenvalues are small the image at point u does not change much.
If one of the eigenvalues is large and the other is small, this point might lie on an edge in the image, but it will be difficult to figure out where exactly on that edge.
If both are large, the point is like a corner.
There is a nice presentation with examples in the panoramic stitching slide deck from a course taught by Rajesh Rao at the University of Washington.
Here E(u,v) denotes the Euclidean distance between the two areas in the vicinities of pixels shifted by the vector (u,v) from each other. This distance tells how easy it is to distinguish the two pixels from one another.
Edit: The matrix of image derivatives is denoted H in this illustration, probably because of its relation to the Harris corner detection algorithm.
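A minimal sketch of computing these two eigenvalues per pixel with OpenCV and NumPy (a box filter stands in for whatever window weighting is used; the map of smaller eigenvalues is essentially what cv2.cornerMinEigenVal produces):

import cv2
import numpy as np

def structure_tensor_eigs(gray, window=5):
    Ix = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)   # image derivatives
    Iy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    # window-summed tensor entries
    Sxx = cv2.boxFilter(Ix * Ix, -1, (window, window))
    Syy = cv2.boxFilter(Iy * Iy, -1, (window, window))
    Sxy = cv2.boxFilter(Ix * Iy, -1, (window, window))
    # closed-form eigenvalues of the symmetric 2x2 matrix [[Sxx, Sxy], [Sxy, Syy]]
    tr, det = Sxx + Syy, Sxx * Syy - Sxy * Sxy
    disc = np.sqrt(np.maximum(tr * tr / 4 - det, 0))
    return tr / 2 + disc, tr / 2 - disc               # lambda1 >= lambda2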
This is related to the concept of texturedness in the Shi-Tomasi paper "Good Features to Track".
The idea of texturedness is to provide a rating of texture that makes features (within a window) identifiable and unique. For instance, lines are not good features since they are not unique (see Figure 3.9a).
To solve an optical flow equation, it must be possible to invert J (the Hessian matrix). In practice the following conditions must be satisfied:
Eigenvalues of J cannot differ by several orders of magnitude.
The eigenvalues of the Hessian must overcome the image noise level λ_noise: this implies that both eigenvalues of J must be large.
For the first condition we know that the greatest eigenvalue cannot be arbitrarily large because intensity variations in a window are bounded by the maximum allowable pixel value.
Regarding the second condition, with λ1 and λ2 being the two eigenvalues of J, the following situations may arise (see Figure 3.10):
• Two small eigenvalues λ1 and λ2: a roughly constant intensity profile within a window (pink region). Problem of figure 3.9-b.
• One large and one small eigenvalue: a unidirectional texture pattern (violet or gray region). Problem of figure 3.9-a.
• λ1 and λ2 both large: can represent a corner, salt-and-pepper texture, or any other pattern that can be tracked reliably (green region).
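This keep-both-eigenvalues-large criterion is what OpenCV exposes as goodFeaturesToTrack; a minimal usage sketch (the file name and parameter values are arbitrary examples):

import cv2

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input image
# qualityLevel rejects corners whose min-eigenvalue score falls below this
# fraction of the best corner's score
corners = cv2.goodFeaturesToTrack(img, maxCorners=200, qualityLevel=0.01,
                                  minDistance=10, useHarrisDetector=False)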
Some references:
1 - Ortiz Cayón, R. J. (2013). Online video stabilization for UAV: Motion estimation and compensation for unmanned aerial vehicles.
2 - Shi, J., & Tomasi, C. (1994). Good features to track. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '94), pp. 593-600. IEEE.
3 - Szeliski, R. (2006). Image alignment and stitching: A tutorial. Foundations and Trends in Computer Graphics and Vision, 2(1), 1-104.

Determine appropriate size of calibration board for camera calibration

How does one determine the appropriate physical size of the calibration board for camera calibration? In the past, I have used ones like these printed on A4-size paper. Reasonable criteria seem to be precision (i.e. how accurate the result is) and completeness (i.e. whether all corners are detected) with a particular board size. But, short of printing multiple sizes and trying them all, is there any other approach?
On a related note, if we know that the "depth of field" of interest is 2 to 2.5 metres from the camera, would it be better to place the calibration pattern within this range, or to ignore such preferences?
It is the wrong question to ask. Apart from the resolution of the board, which is probably going to be fine for a wide range of sizes, the important question is how many board views you have to show to your camera in order to find the intrinsic and extrinsic parameters accurately.
Unlike PnP, POSIT, or other extrinsic-only problems, calibration has to find both extrinsics and intrinsics. You will have to cover the camera's whole field of view with images of your board, plus (or rather times) different board orientations and distances. This is because converging to the right intrinsics and extrinsics requires seeing vanishing points, while eliminating bias in the pixel error metric requires uniform sampling of the field of view.
First, see this answer for general notes on manufacturing a calibration target.
Doing a complete sensitivity analysis is quite complicated. A rule of thumb I have used is to assume that my corner detector is accurate to within 1/2 a pixel in the worst case. Then, given an approximate field of view (which you can estimate from the lens's nominal focal length in mm and the width of the sensor chip) and a specification for the minimum and maximum distances used in the application, I worked out the sensitivity of a plane's estimated pose in various orientations and positions in the calibration volume, given a worst-case error in its four corners. I then played with its size until the worst-case error became acceptable - and within my budget and the other application constraints (weight, depth of field, resolution, etc.).
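A back-of-the-envelope sketch of the first step only (not the full pose sensitivity analysis; all numbers are made-up example values): estimate how many pixels one checker square spans at the working distance, so you can judge what a 1/2-pixel corner error means relative to the square size.

def square_size_in_pixels(square_mm, distance_mm, focal_mm, sensor_w_mm, image_w_px):
    focal_px = focal_mm * image_w_px / sensor_w_mm    # pinhole camera model
    return square_mm * focal_px / distance_mm

px = square_size_in_pixels(square_mm=30, distance_mm=2250, focal_mm=8,
                           sensor_w_mm=6.4, image_w_px=1920)
print(f"~{px:.1f} px per square; a 0.5 px corner error is {50 / px:.2f}% of a square")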

Finding curvature from a noisy set of data points using 2D/3D splines? (C++)

I am trying to extract the curvature of a pulse along its profile (see the picture below). The pulse is calculated on a grid of length and height 150 x 100 cells using finite differences, implemented in C++.
I extracted all the points with the same value (contour/level set) and marked them as the red continuous line in the picture below. The other colors are negligible.
Then I tried to find the curvature from this already noisy (due to grid discretization) contour line by the following means:
(moving average already applied)
1) Curvature via Tangents
The curvature of the line at point P is defined by
κ = lim_{N→P} Δφ / Δs,
the limit of the tangent-angle change Δφ over the arc length Δs between P and N. Since my points have a certain distance between them, I could not approximate this limit well enough, so the curvature was not calculated correctly. I tested it with a circle, which naturally has constant curvature, but I could not reproduce this (only 1 significant digit was correct).
2) Second derivative of the line parametrized by arclength
I calculated the first derivative of the line with respect to arc length, smoothed it with a moving average, and then took the derivative again (2nd derivative). But here, too, only 1 significant digit was correct.
Unfortunately, taking a derivative amplifies the noise that is already inherent in the data.
3) Approximating the line locally with a circle
Since the reciprocal of the circle radius is the curvature, I used the following approach:
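(The figure describing the exact approach is missing; as an assumed illustration, a common variant fits the circumscribed circle through three contour points spaced some distance apart and takes 1/R as the curvature estimate:)

import numpy as np

def curvature_three_points(p_prev, p, p_next):
    # kappa = 1/R with R = abc / (4 * triangle area); points as np.array([x, y])
    a = np.linalg.norm(p - p_prev)
    b = np.linalg.norm(p_next - p)
    c = np.linalg.norm(p_next - p_prev)
    area2 = abs((p[0] - p_prev[0]) * (p_next[1] - p_prev[1]) -
                (p[1] - p_prev[1]) * (p_next[0] - p_prev[0]))  # 2 * area
    if area2 == 0:
        return 0.0                     # collinear points: zero curvature
    return 2.0 * area2 / (a * b * c)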
This worked best so far (2 correct significant digits), but I need to refine even further. So my new idea is the following:
Instead of using the values at the discrete points to determine the curvature, I want to approximate the pulse profile with a 3-dimensional spline surface. Then I extract the level set of a certain value from it to obtain a smooth line of points, from which I can compute a nice curvature.
So far I have not found a C++ library which can generate such a Bezier spline surface. Could you maybe point me to one?
Also, do you think this approach is worth a shot, or will I lose too much accuracy in my curvature?
Do you know of any other approach?
With very kind regards,
Jan
edit: It seems I cannot post pictures as a new user, so I removed all of them from my question, even though I find them important for explaining my issue. Is there any way I can still show them?
edit2: ok, done :)
There is ALGLIB, which supports various flavours of interpolation:
Polynomial interpolation
Rational interpolation
Spline interpolation
Least squares fitting (linear/nonlinear)
Bilinear and bicubic spline interpolation
Fast RBF interpolation/fitting
I don't know whether it meets all of your requirements. I personally have not worked with this library yet, but I believe cubic spline interpolation could be what you are looking for (it is twice differentiable).
To prevent overfitting to your noisy input points, you should apply some sort of smoothing, e.g. see whether Moving Window Average, Gaussian, or FIR filters are applicable. Also have a look at (cubic) smoothing splines.
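As a sketch of that suggestion (in Python with SciPy rather than C++/ALGLIB, but the idea carries over): fit a smoothing parametric spline to the noisy contour points and evaluate the curvature analytically from the spline's first two derivatives, instead of differencing the raw points. The smoothing factor s must be tuned to the noise level.

import numpy as np
from scipy.interpolate import splprep, splev

def spline_curvature(x, y, smooth=1.0):
    tck, u = splprep([x, y], s=smooth)          # cubic smoothing spline
    dx, dy = splev(u, tck, der=1)
    ddx, ddy = splev(u, tck, der=2)
    return np.abs(dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5

# sanity check on a half circle of radius 1: curvature should be close to 1
t = np.linspace(0, np.pi, 100)
kappa = spline_curvature(np.cos(t), np.sin(t), smooth=1e-4)
print(kappa.min(), kappa.max())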