Curvature Scale Space corner detection algorithm. Arc Length Parameter? - c++

I'm studying the CSS algorithm and I can't get the hang of the concept of the 'arc length parameter'.
According to the literature, a planar curve is written as Gamma(u) = (x(u), y(u)), where u is said to be the arc length parameter, and apparently the Gaussian kernel g is also parameterized by this u.
Stop me if I got something wrong, but aren't x and y the pixel locations? How can they be represented by another parameter?
I had no idea when I first saw it in the literature, so I looked up the code, and that puzzled me even more.
Here is the relevant portion of the code:
void getGaussianDerivs(double sigma, int M, vector<double>& gaussian,
                       vector<double>& dg, vector<double>& d2g) {
    int L = (M - 1) / 2;
    double sigma_sq = sigma * sigma;
    double sigma_quad = sigma_sq * sigma_sq;
    dg.resize(M); d2g.resize(M); gaussian.resize(M);
    Mat_<double> g = getGaussianKernel(M, sigma, CV_64F);
    for (double i = -L; i < L + 1.0; i += 1.0) {
        int idx = (int)(i + L);
        gaussian[idx] = g(idx);
        // from http://www.cedar.buffalo.edu/~srihari/CSE555/Normal2.pdf
        dg[idx]  = (-i / sigma_sq) * g(idx);                    // 1st derivative of the Gaussian
        d2g[idx] = ((-sigma_sq + i * i) / sigma_quad) * g(idx); // 2nd derivative of the Gaussian
    }
}
So it seems the code builds a simple 1D Gaussian kernel of aperture size M and computes its 1st and 2nd derivatives. As far as I know, a 1D Gaussian kernel has a parameter x, which is a horizontal coordinate, and sigma, which is the scale. It looks like the 'arc length parameter u' plays the role of that variable x. That doesn't make sense to me, because later the code convolves these kernels directly with the sets of x and y coordinates on the contour.
What is this u?
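To make my confusion concrete, here is roughly how I picture the contour being treated as two 1-D signals indexed by u. This is only my own illustrative sketch (the function names are mine, not the project's actual code):

#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;
using namespace std;

// Circular 1-D convolution of a coordinate sequence with a kernel of odd length M,
// since the contour is a closed curve.
static vector<double> convolveClosed(const vector<double>& f, const vector<double>& kernel) {
    int N = (int)f.size(), M = (int)kernel.size(), L = (M - 1) / 2;
    vector<double> out(N, 0.0);
    for (int u = 0; u < N; ++u)               // u = sample index (arc position) along the contour
        for (int k = -L; k <= L; ++k)
            out[u] += f[((u - k) % N + N) % N] * kernel[k + L];
    return out;
}

// The contour's x and y coordinates become two 1-D signals x(u), y(u);
// convolving them with dg approximates dX/du and dY/du at scale sigma.
void curveDerivatives(const vector<Point>& contour, const vector<double>& dg,
                      vector<double>& dXdu, vector<double>& dYdu) {
    vector<double> x, y;
    for (const Point& p : contour) { x.push_back(p.x); y.push_back(p.y); }
    dXdu = convolveClosed(x, dg);
    dYdu = convolveClosed(y, dg);
}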
PS. Since I replied to the fellow who tried to answer my question, I think I should clarify my question, so here we go.
What confuses me is how this parameter u is implemented in code. I think I understood the full code above (of course, I inserted only a portion of it), but the problem is that I have no idea what it would look like for the 'improved' version of the algorithm. That version uses an 'affine length parameter' instead of the 'arc length parameter', and I'm not sure how to implement that concept in code.
According to the literature, the main difference between the arc length parameter and the affine length parameter is the sampling interval: the arc length parameter uses 1 for the vertical and horizontal directions and √2 for the diagonal direction. That makes sense, since the portion of code above uses a for loop with a fixed step of 1 to compute the 1st and 2nd derivatives of the 1D Gaussian. But how would it work with a different, variable interval? Is it possible that I can't use a plain for loop for it?


Is there a way to generate the corners of a regular N-gon without division and without trig, given N as input

Edit: I found a page on rasterizing a trapezoid, https://cse.taylor.edu/~btoll/s99/424/res/ucdavis/GraphicsNotes/Rasterizing-Polygons/Rasterizing-Polygons.html, but I'm still trying to figure out whether I can do just the edges.
I am trying to generate the corner points of an arbitrary N-gon, where N is a non-zero positive integer, and to do so efficiently without division or trig. I suspect there is some sort of Bresenham-type algorithm for this, but I cannot seem to find anything.
The only thing I can find on Stack Overflow is this, but there the internal angle increments are found using 2*π/N:
How to draw a n sided regular polygon in cartesian coordinates?
Here is the algorithm from that page in C:
float angle_increment = 2 * PI / n_sides;
for (int i = 0; i < n_sides; ++i)
{
    float x = x_centre + radius * cos(i * angle_increment + start_angle);
    float y = y_centre + radius * sin(i * angle_increment + start_angle);
}
So is it possible to do so without any division?
There's no solution that avoids calling sin and cos, aside from precomputing them for all useful values of N. However, you only need to do that computation once, regardless of N. (Provided you know where the centre of the polygon is; you might need a second computation to figure out the coordinates of the centre.) Every vertex can be computed from the previous vertex using the same rotation matrix, as sketched below.
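A minimal sketch of that idea (my own illustration, not from the original answer): evaluate sin and cos of 2π/N once, then repeatedly apply the 2×2 rotation matrix to the offset of the previous vertex from the centre.

#include <cmath>
#include <vector>

struct Pt { float x, y; };

// Generate the N corners of a regular N-gon by rotating the previous vertex
// offset with a fixed rotation matrix; sin/cos are evaluated only once.
std::vector<Pt> regularNgon(int n, float cx, float cy, float radius, float startAngle) {
    const float PI = 3.14159265358979f;
    const float c = std::cos(2.0f * PI / n);      // the one-off trig evaluations
    const float s = std::sin(2.0f * PI / n);
    float vx = radius * std::cos(startAngle);     // offset of the first vertex from the centre
    float vy = radius * std::sin(startAngle);
    std::vector<Pt> corners;
    for (int i = 0; i < n; ++i) {
        corners.push_back({cx + vx, cy + vy});
        const float nx = c * vx - s * vy;         // rotate (vx, vy) by 2*pi/n
        vy = s * vx + c * vy;
        vx = nx;
    }
    return corners;
}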
The lookup table of precomputed values is not an unreasonable solution, since at some sufficiently large value of N the polygon becomes indistinguishable from a circle. So you probably only need a lookup table of a few hundred values.

Confusion about formula for Linear regression with gradient descent, (Pseudocode)

I made a program that calculates the line of best fit for a set of data points using gradient descent. I generate 1000 random points and then it calculates the line of best fit by training on those 1000 points. My confusion lies in the theory behind my code.
In the training function, using the current m and b values for y = mx + b, the function makes a guess of the y value as it goes through the training points' x values. This is supervised learning, so I know the actual y value; the function calculates the error and uses that error to adjust the m and b values. <-- That is what happens in the program when adjusting the line of best fit.
I get everything above. What I'm confused about is the part of the code that calculates how to adjust these m and b values. Here it is:
guess = m * x + b;                  // prediction with the current m and b
error = y - guess;                  // how far off the prediction is
m = m + (error * x) * learningrate; // adjust the slope
b = b + error * learningrate;       // adjust the intercept
I'm confused about why we add instead of subtract that delta-m part (the (error * x) * learningrate). Ignoring the learning rate, the error * x part is the partial derivative of the error with respect to m. But if we take the partial derivative of something with respect to something, doesn't that give us the direction of steepest ascent? Shouldn't we go in the opposite direction (subtract the delta m) to get the proper m value? Isn't our goal to reduce the error?
Surprisingly to me, the above code works: if you add the delta m, it adjusts the m and b values in the right direction. So basically my question is: why aren't we subtracting the delta-m part (error * x), given that it points in the direction of steepest ascent and we want to go the opposite way?
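Writing the loss out explicitly makes the sign clear (this assumes the usual squared-error loss, which the snippet itself never states):

$$
L = (y - (mx + b))^2, \qquad
\frac{\partial L}{\partial m} = -2\,(y - (mx + b))\,x = -2\,\mathrm{error}\cdot x .
$$

So the gradient-descent update is $m \leftarrow m - \eta\,\frac{\partial L}{\partial m} = m + 2\eta\,\mathrm{error}\cdot x$: the minus sign from descending cancels the minus sign inside the derivative, which is why the code adds (the factor of 2 is absorbed into the learning rate).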
Thanks!

Is there a more detailed reference on Cascade Classifiers in OpenCV?

I'm trying to modify OpenCV's Haar cascade classifier for a specific purpose and have looked through its source code from top to bottom. But I'm stuck on understanding the very final part, the calculation of the feature value, in line 361 of cascadedetect.hpp:
return optfeaturesPtr[featureIdx].calc(pwin) * varianceNormFactor;
and its sub-parts: (a) calculating each rectangle's weighted sum in lines 398-407 of cascadedetect.hpp:
inline float HaarEvaluator::OptFeature::calc( const int* ptr ) const
{
    float ret = weight[0] * CALC_SUM_OFS(ofs[0], ptr) +
                weight[1] * CALC_SUM_OFS(ofs[1], ptr);
    if( weight[2] != 0.0f )
        ret += weight[2] * CALC_SUM_OFS(ofs[2], ptr);
    return ret;
}
and (b) calculating varianceNormFactor in lines 691-697 and 701 of cascadedetect.cpp:
pwin = &sbuf.at<int>(pt) + s.layer_ofs;
const int* pq = (const int*)(pwin + sqofs);
int valsum = CALC_SUM_OFS(nofs, pwin);
unsigned valsqsum = (unsigned)(CALC_SUM_OFS(nofs, pq));

double area = normrect.area();
double nf = area * valsqsum - (double)valsum * valsum;
// line 701:
varianceNormFactor = (float)(1./nf);
From my limited knowledge I guess 'pwin' represents the actual window currently being processed, but (Q1) what does pq mean? I couldn't find any line that assigns a value to the sbuf variable (which is somehow connected to everything above).
According to Rainer Lienhart's paper, which was mentioned by the OpenCV team, they used this equation for lighting correction. But in the paper they say we can calculate σ just by looking at 4 values of the integral image of each pixel's square. (Q2) But aren't we supposed to subtract the average from each pixel BEFORE taking the sum of squares in order to calculate the standard deviation, or does σ in this paper represent something different?
And from the source I think the equation they actually use must be similar to this instead. (Q3) So, if possible, where can I find the maths behind this code or detailed reference material for this part? I've read OpenCV's reference manual but didn't find anything about it.
(Q1) What does pq mean?
I think it may stand for "pointer to square":
OpenCV computes the square of every pixel and appends it to a buffer B.
So when doing the normalization, it can quickly refer to it via:
the offset of the current window in B, which is pwin;
the offset of the squared values for the current window in B, relative to pwin, which is sqofs.
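To illustrate the offset idea, here is my own sketch (not OpenCV's actual macro) of how a rectangle sum is read from an integral image with precomputed corner offsets, which is essentially what ofs[]/nofs encode for CALC_SUM_OFS:

// Integral image stored row-major with a fixed row stride. The sum of the
// pixels inside a rectangle takes 4 lookups; if the rectangle's geometry is
// fixed, the 4 corner offsets relative to the window's top-left pointer can
// be precomputed once and reused for every window position.
struct RectOfs { int tl, tr, bl, br; };

inline RectOfs makeRectOfs(int x, int y, int w, int h, int stride) {
    return { y * stride + x,        y * stride + (x + w),
             (y + h) * stride + x,  (y + h) * stride + (x + w) };
}

inline int rectSum(const int* winTopLeft, const RectOfs& o) {
    // same four-corner pattern as CALC_SUM_OFS
    return winTopLeft[o.tl] - winTopLeft[o.tr] - winTopLeft[o.bl] + winTopLeft[o.br];
}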
(Q2) But aren't we supposed to subtract the average from each pixel BEFORE taking the sum of squares in order to calculate the standard deviation, or does σ in this paper represent something different?
I have checked it in OpenCV 3.4.4 and, yes, it does NOT subtract the mean.
I guess that, since the Haar patterns are symmetric (e.g. for "horizontal 2", one region is subtracted from a region of the same area), the effect of the mean sums to zero and can be ignored.
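For what it's worth, the nf line above can be related to the standard variance formula just by multiplying through by N² (my own algebra, not from the OpenCV sources):

$$
\sigma^2 = \frac{1}{N}\sum_i x_i^2 - \Big(\frac{1}{N}\sum_i x_i\Big)^2
\;\Longrightarrow\;
N^2\sigma^2 = N\sum_i x_i^2 - \Big(\sum_i x_i\Big)^2 ,
$$

which matches nf = area * valsqsum - valsum * valsum with N = area. In other words, the mean never has to be subtracted pixel by pixel; it is accounted for by the (Σx)² term.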

Given camera matrices, how to find point correspondences using OpenCV?

I'm following this tutorial, which uses Features2D + Homography. If I have a known camera matrix for each image, how can I optimize the result? I tried some images, but it didn't work well.
//Edit
After reading some material, I think I should rectify the two images first. But the rectification is not perfect, so a vertical line in image 1 generally corresponds to a vertical band in image 2. Are there any good algorithms for this?
I'm not sure I understand your problem. Do you want to find corresponding points between the images, or do you want to improve the correctness of your matches by using the camera intrinsics?
In principle, in order to use camera geometry for finding matches, you would need the fundamental or essential matrix, depending on whether you know the camera intrinsics (i.e. have a calibrated camera). That means you would need an estimate of the relative rotation and translation of the cameras. Then, by computing the epipolar lines corresponding to the features found in one image, you would search along those lines in the second image for the best match. However, I think it would be better to simply rely on automatic feature matching. Given the fundamental/essential matrix, you could try your luck with correctMatches, which will move the correspondences such that the reprojection error is minimised.
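A rough usage sketch of correctMatches (my own example, not from the tutorial; as far as I remember it wants the points as 1xN two-channel arrays, hence the reshape):

#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;
using namespace std;

// F: fundamental matrix; pts1/pts2: matched points in the left/right image.
void refineCorrespondences(const Mat& F,
                           const vector<Point2f>& pts1, const vector<Point2f>& pts2,
                           vector<Point2f>& refined1, vector<Point2f>& refined2)
{
    Mat p1 = Mat(pts1).reshape(2, 1);   // N x 1 two-channel  ->  1 x N two-channel
    Mat p2 = Mat(pts2).reshape(2, 1);
    Mat np1, np2;
    // moves each pair onto its epipolar lines while minimising the geometric error
    correctMatches(F, p1, p2, np1, np2);
    refined1.clear(); refined2.clear();
    for (int i = 0; i < np1.cols; ++i) {
        refined1.push_back(np1.at<Point2f>(0, i));
        refined2.push_back(np2.at<Point2f>(0, i));
    }
}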
Tips for better matches
To increase the stability and saliency of automatic matches, it usually pays to
Adjust the parameters of the feature detector
Try different detection algorithms
Perform a ratio test to filter out those keypoints which have a very similar second-best match and are therefore unstable. This is done like this:
Mat descriptors_1, descriptors_2; // obtained from feature detector
BFMatcher matcher;
vector<DMatch> matches;
matcher = BFMatcher(NORM_L2, false); // norm depends on feature detector
vector<vector<DMatch>> match_candidates;
const float ratio = 0.8; // or something
matcher.knnMatch(descriptors_1, descriptors_2, match_candidates, 2);
for (size_t i = 0; i < match_candidates.size(); i++)
{
    if (match_candidates[i][0].distance < ratio * match_candidates[i][1].distance)
        matches.push_back(match_candidates[i][0]);
}
A more involved way of filtering would be to compute the reprojection error for each keypoint in the first frame. This means computing the corresponding epipolar line in the second image and then checking how far its supposed matching point is from that line. Throwing away the points whose distance exceeds some threshold removes the matches which are incompatible with the epipolar geometry (which I assume would be known). Computing the error can be done like this (I honestly do not remember where I took this code from, and I may have modified it a bit):
double computeReprojectionError(vector<Point2f>& imgpts1, vector<Point2f>& imgpts2,
                                Mat& inlier_mask, const Mat& F)
{
    double err = 0;
    vector<Vec3f> lines[2];
    int npt = sum(inlier_mask)[0];

    // strip outliers so validation is constrained to the correspondences
    // which were used to estimate F
    vector<Point2f> imgpts1_copy(npt),
                    imgpts2_copy(npt);
    int c = 0;
    for (int k = 0; k < inlier_mask.size().height; k++)
    {
        if (inlier_mask.at<uchar>(k) == 1)
        {
            imgpts1_copy[c] = imgpts1[k];
            imgpts2_copy[c] = imgpts2[k];
            c++;
        }
    }

    Mat imgpt[2] = { Mat(imgpts1_copy), Mat(imgpts2_copy) };
    computeCorrespondEpilines(imgpt[0], 1, F, lines[0]);
    computeCorrespondEpilines(imgpt[1], 2, F, lines[1]);

    for (int j = 0; j < npt; j++)
    {
        // error is computed as the distance between a point u_l = (x,y) and the epipolar line
        // of its corresponding point u_r in the second image, plus the reverse, so
        // err_ij = d(u_l, F^T * u_r) + d(u_r, F * u_l)
        Point2f u_l = imgpts1_copy[j], // for the purpose of this function, imgpts1 is the "left"
                u_r = imgpts2_copy[j]; // image and imgpts2 the "right" one; it makes no difference
        float a2 = lines[1][j][0],     // epipolar line in the left image (computed from u_r)
              b2 = lines[1][j][1],
              c2 = lines[1][j][2];
        float norm_factor2 = sqrt(pow(a2, 2) + pow(b2, 2));
        float a1 = lines[0][j][0],     // epipolar line in the right image (computed from u_l)
              b1 = lines[0][j][1],
              c1 = lines[0][j][2];
        float norm_factor1 = sqrt(pow(a1, 2) + pow(b1, 2));

        double errij =
            fabs(u_l.x * a2 + u_l.y * b2 + c2) / norm_factor2 +
            fabs(u_r.x * a1 + u_r.y * b1 + c1) / norm_factor1; // distance of (x,y) to line (a,b,c) = |ax + by + c| / sqrt(a^2 + b^2)
        err += errij; // at this point, apply a threshold and mark bad matches
    }
    return err / npt;
}
The point is: grab the fundamental matrix, use it to compute the epipolar lines for all the points, and then compute the distance (the lines are given in a parametric form, so you need to do some algebra to get the distance). This is somewhat similar in outcome to what findFundamentalMat with the RANSAC method does. It returns a mask in which, for each match, there is either a 1, meaning that it was used to estimate the matrix, or a 0 if it was thrown out. But estimating the fundamental matrix like this will probably be less accurate than using chessboards.
EDIT: Looks like oarfish beat me to it, but I'll leave this here.
The fundamental matrix (F) defines a mapping from a point in the left image to a line in the right image on which the corresponding point must lie, assuming perfect calibration. This is the epipolar line, i.e. the line through the point in the left image and the two epipoles of the stereo camera pair. For references, see these lecture notes and this chapter of the HZ book.
Given a set of point correspondences (p_L, p_R) in the left and right images, from SURF (or any other feature matcher), and given F, the constraint from the epipolar geometry of the stereo pair says that p_R should lie on the epipolar line projected by p_L onto the right image, i.e. p_R^T · F · p_L = 0.
In practice, calibration errors from noise as well as erroneous feature matches lead to a non-zero value.
However, using this idea, you can then perform outlier removal by rejecting those feature matches for which this equation is greater than a certain threshold value, i.e. reject (p_L, p_R) if and only if | p_R^T · F · p_L | > threshold.
When selecting this threshold, keep in mind that it is the distance in image space of a point from an epipolar line that you are willing to tolerate, which in some sense is your epipolar error tolerance.
Degenerate case: To visually imagine what this means, let us assume that the stereo pair differ only in a pure X-translation. Then the epipolar lines are horizontal. This means that you can connect the feature matched point pairs by a line and reject those pairs whose line slope is not close to zero. The equation above is a generalization of this idea to arbitrary stereo rotation and translation, which is accounted for by the matrix F.
Your specific images: It looks like your feature matches are sparse. I suggest instead to use a dense feature matching approach so that after outlier removal, you are still left with a sufficient number of good-quality matches. I'm not sure which dense feature matcher is already implemented in OpenCV, but I suggest starting here.
Given your pictures, you are trying to do stereo matching.
This page will be helpful. The rectification you want can be done using stereoCalibrate and then stereoRectify.
The result (an example rectified pair) is shown in the stereoRectify documentation.
In order to find the fundamental matrix, you need correct correspondences, but in order to get good correspondences, you need a good estimate of the fundamental matrix. This might sound like an impossible chicken-and-egg problem, but there is a well-established method for it: RANSAC.
It randomly selects a small set of correspondences, uses those to calculate a fundamental matrix (using the 7- or 8-point algorithm) and then tests how many of the other correspondences comply with this matrix (using the method described by scribbleink for measuring the distance between a point and an epipolar line). It keeps testing new combinations of correspondences for a certain number of iterations and selects the one with the most inliers.
This is already implemented in OpenCV as cv::findFundamentalMat (http://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#findfundamentalmat). Select the method CV_FM_RANSAC to use RANSAC to remove bad correspondences. It will output a mask marking all the inlier correspondences, as in the sketch below.
The requirement for this is that the points do not all lie on the same plane.
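A minimal usage sketch (my own; the helper name and the parameter values are just examples):

#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;
using namespace std;

// pts1/pts2: matched keypoint locations in the left/right image (same length),
// allMatches: the corresponding DMatch entries. Returns F and fills `inliers`
// with the matches that RANSAC kept.
Mat estimateFWithRansac(const vector<Point2f>& pts1, const vector<Point2f>& pts2,
                        const vector<DMatch>& allMatches, vector<DMatch>& inliers)
{
    vector<uchar> mask; // one entry per correspondence: 1 = inlier, 0 = outlier
    Mat F = findFundamentalMat(pts1, pts2, FM_RANSAC, // FM_RANSAC == CV_FM_RANSAC in the old naming
                               3.0,   // max allowed distance (px) from the epipolar line
                               0.99,  // RANSAC confidence
                               mask);
    for (size_t i = 0; i < mask.size(); ++i)
        if (mask[i])
            inliers.push_back(allMatches[i]);
    return F;
}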

Fastest way to compute angle with x-axis

What is the fastest way to calculate the angle between a line and the x-axis?
I need to define a function that is injective on the [π, 2π] interval (I take the angle between the uppermost point and any point below it).
PointType * top = UPPERMOST_POINT;
PointType * targ = TARGET_POINT;
double targetVectorX = targ->x - top->x;
double targetVectorY = targ->y - top->y;
first try
//#1
double magnitudeTarVec = sqrt(targetVectorX*targetVectorX + targetVectorY*targetVectorY);
angle = targetVectorX / magnitudeTarVec;
second try
//#2 slower
angle = atan2(targetVectorY, targetVectorX);
I do not need the angle itself (in radians or degrees); any value is fine, as long as by comparing the values computed for two points I can tell which angle is bigger. (For example, the value in the first try is between -1 and 1; it is the cosine of the angle.)
Check for y being zero, as atan2 does; then the quotient x/y will be plenty fine (assuming I understand you correctly).
I just wrote Fastest way to sort vectors by angle without actually computing that angle about the general question of finding functions monotonic in the angle, without any code or connection to C++ or the like. Based on the currently accepted answer there, I'd now suggest:
double angle = copysign( // magnitude of first argument with sign of second
    1. - targetVectorX / (fabs(targetVectorX) + fabs(targetVectorY)),
    targetVectorY);
The great benefit compared to the currently accepted answer here is the fact that you won't have to worry about infinite values, since all non-zero vectors (i.e. targetVectorX and targetVectorY are not both equal to zero at the same time) will result in finite pseudoangle values. The resulting pseudoangles will be in the range [−2 … 2] for real angles [−π … π], so the signs and the discontinuity are just like you'd get them from atan2.
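For instance, a pseudoangle like this can be dropped straight into a comparison function for ordering points around the topmost point. This is just my own sketch (the struct and function names are made up):

#include <algorithm>
#include <cmath>
#include <vector>

struct Pt { double x, y; };

// Monotonic pseudoangle in [-2, 2]; same ordering as atan2(y, x), but without trig.
// Undefined for the zero vector (both components zero).
static double pseudoangle(double x, double y) {
    return std::copysign(1.0 - x / (std::fabs(x) + std::fabs(y)), y);
}

// Sort the targets by the angle of the vector from `top` to each target.
void sortByAngleAroundTop(const Pt& top, std::vector<Pt>& targets) {
    std::sort(targets.begin(), targets.end(), [&](const Pt& a, const Pt& b) {
        return pseudoangle(a.x - top.x, a.y - top.y) <
               pseudoangle(b.x - top.x, b.y - top.y);
    });
}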