How to detect image gradient or normal using OpenCV - c++

I wanted to detect an ellipse in an image. Since I was learning Mathematica at the time, I asked a question here and got a satisfactory result from the answer below, which used the RANSAC algorithm to detect the ellipse.
However, I recently needed to port it to OpenCV, and some of the functions it relies on exist only in Mathematica. One of the key functions is "GradientOrientationFilter".
A general ellipse has five parameters, so I need to sample five points to determine one. However, the more points I have to sample, the lower the chance that all of them are good, which lowers the success rate of the ellipse detection. The Mathematica answer therefore adds another condition: the image gradient must be parallel to the gradient of the ellipse equation. With that condition, only three points are needed to determine an ellipse by least squares. The result is quite good.
However, when I compute the image gradient with the Sobel or Scharr operator in OpenCV, it is not accurate enough, and the fit always comes out badly.
How can I calculate the gradient or the tangent of an image accurately? Thanks!
Result with gradient, three points
Result without gradient, five points
----------updated----------
I did some edge detection and median blurring beforehand and drew the result on the edge image. My original test image looks like this:
In general, my final goal is to detect an ellipse in a scene or on an object. Something like this:
That's why I chose to use RANSAC to fit the ellipse from edge points.

As for your final goal, you may try
findContours and fitEllipse in OpenCV.
The pseudo code would be:
1) some image processing
2) find all contours
3) fit each contour with fitEllipse
Here is part of the code I used before:
[... image processing ... you get a binary image, bwimage ]
vector<vector<Point> > contours;
findContours(bwimage, contours, CV_RETR_LIST, CV_CHAIN_APPROX_NONE);
for (size_t i = 0; i < contours.size(); i++)
{
    size_t count = contours[i].size();
    if (count < 5) // fitEllipse needs at least 5 points
        continue;
    Mat pointsf;
    Mat(contours[i]).convertTo(pointsf, CV_32F);
    RotatedRect box = fitEllipse(pointsf);
    /* You can put some limits on size and aspect ratio here */
    if (box.size.width > 20 &&
        box.size.height > 20 &&
        box.size.width < 80 &&
        box.size.height < 80)
    {
        // skip extremely elongated fits
        if (MAX(box.size.width, box.size.height) > MIN(box.size.width, box.size.height) * 30)
            continue;
        //drawContours(SrcImage, contours, (int)i, Scalar::all(255), 1, 8);
        ellipse(SrcImage, box, Scalar(0,0,255), 1, CV_AA);
        ellipse(SrcImage, box.center, box.size*0.5f, box.angle, 0, 360, Scalar(200,255,255), 1, CV_AA);
    }
}
imshow("result", SrcImage);

If you focus on ellipses only (no other shapes), you can treat the pixel values of the ellipse as point masses.
Then you can calculate the moments of inertia Ixx, Iyy, Ixy to find the angle theta that rotates a general ellipse back to the canonical form (X-Xc)^2/a^2 + (Y-Yc)^2/b^2 = 1.
Then you can find Xc and Yc from the center of mass.
Then you can find a and b from min X and min Y (a = Xc - min X, b = Yc - min Y).
--------------- update -----------
This method also applies to filled ellipses.
With more than one ellipse in a single image it will fail unless you segment them first.
Let me explain more,
I will use C to represent cos(theta) and S to represent sin(theta)
After rotation to canonical form, the new coordinates are [eq0] X=xC-yS and Y=xS+yC, where x and y are the original positions (measured from the center of mass).
The correct rotation gives the minimum IYY.
[eq1]
IYY = Sum(m*Y*Y) = Sum{m*(xS+yC)(xS+yC)} = Sum{m*(xxSS+yyCC+2xySC)} = Ixx*S^2 + Iyy*C^2 + 2*Ixy*S*C
where Ixx = Sum(m*x*x), Iyy = Sum(m*y*y) and Ixy = Sum(m*x*y).
For min IYY, d(IYY)/d(theta) = 0, that is
2IxxSC - 2IyySC + 2Ixy(CC-SS) = 0
(Ixx-Iyy)/Ixy = (SS-CC)/SC = S/C - C/S = Z - 1/Z, where Z = tan(theta)
While programming, the LHS is just a number, let's call it N:
Z^2 - NZ - 1 = 0
So there are two roots of Z, hence two values of theta, say Z1 and Z2; one minimizes IYY and the other maximizes it. Since Z1*Z2 = -1, the two directions are perpendicular.
----------- pseudo code --------
Compute Ixx, Iyy, Ixy for a hollow or filled ellipse.
Compute theta1=atan(Z1) and theta2=atan(Z2)
Put these two thetas into eq1 and see which one gives the smaller IYY. That is your theta.
Go back to the non-zero pixels and transform them to the new X and Y using the theta you found.
Find the center of mass Xc, Yc, and min X and min Y by sort().
-------------- by hand -----------
If you need the original equation of the ellipse
Just put [eq0] into the canonical form
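The angle step of the pseudo code above can be sketched in plain C++. This is only an illustration under some assumptions: MassPoint and canonicalRotation are made-up names, the moments are taken about the center of mass, and instead of solving the quadratic in Z and testing both roots it uses the equivalent closed form with atan2, which picks the minimizing root directly and also works when Ixy is zero.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical point type: position plus "mass" (e.g. the pixel value).
struct MassPoint { double x, y, m; };

// Returns the angle theta (radians) for which X = x*C - y*S, Y = x*S + y*C
// minimizes IYY = Sum(m*Y*Y), i.e. the rotation back to canonical form.
double canonicalRotation(const std::vector<MassPoint>& pts) {
    assert(!pts.empty());
    double M = 0.0, cx = 0.0, cy = 0.0;
    for (const MassPoint& p : pts) {
        M += p.m;
        cx += p.m * p.x;
        cy += p.m * p.y;
    }
    assert(M > 0.0);
    cx /= M;
    cy /= M;

    // Second moments about the center of mass.
    double Ixx = 0.0, Iyy = 0.0, Ixy = 0.0;
    for (const MassPoint& p : pts) {
        const double dx = p.x - cx, dy = p.y - cy;
        Ixx += p.m * dx * dx;
        Iyy += p.m * dy * dy;
        Ixy += p.m * dx * dy;
    }

    // IYY(theta) = Ixx*S^2 + Iyy*C^2 + 2*Ixy*S*C is smallest at this angle.
    return -0.5 * std::atan2(2.0 * Ixy, Ixx - Iyy);
}
```

For an ellipse whose long axis is tilted by +30 degrees, the function returns about -30 degrees, the rotation that brings the long axis back onto the X axis.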

You're using terms in an unusual way.
Normally for images, the term "gradient" is interpreted as if the image is a mathematical function f(x,y). This gives us a (df/dx, df/dy) vector in each point.
Yet you're looking at the image as if it's a function y = f(x), so that the gradient would be df/dx.
Now, if you look at your image, you'll see that the two interpretations are definitely related. Your ellipse is drawn as a set of contrasting pixels, and as a result there are two sharp gradients in the image - the inner and outer. These of course correspond to the two normal vectors, and therefore are in opposite directions.
Also note that your image has pixels. The gradient is also pixelated. The way your ellipse is drawn, with a single pixel width, means that your local gradient takes on only values that are a multiple of 45 degrees:
▄▄ ▄▀ ▌ ▀▄
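To make the (df/dx, df/dy) interpretation concrete, here is a small sketch of a 3x3 Sobel gradient and its orientation at one pixel, written without OpenCV so it stands alone. Image, Gradient, sobelAt and orientation are hypothetical names used only for this illustration. In OpenCV the two derivatives would come from cv::Sobel or cv::Scharr, and cv::phase turns them into an angle. Whether that is accurate enough for the three-point fit probably depends on smoothing a grayscale image first rather than differentiating a one-pixel-wide edge map.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for a grayscale image, indexed as img[y][x].
using Image = std::vector<std::vector<float>>;

struct Gradient { float gx, gy; };

// 3x3 Sobel response at the interior pixel (x, y).
Gradient sobelAt(const Image& img, int x, int y) {
    assert(y >= 1 && y + 1 < static_cast<int>(img.size()));
    assert(x >= 1 && x + 1 < static_cast<int>(img[y].size()));
    const float gx = (img[y - 1][x + 1] + 2.0f * img[y][x + 1] + img[y + 1][x + 1])
                   - (img[y - 1][x - 1] + 2.0f * img[y][x - 1] + img[y + 1][x - 1]);
    const float gy = (img[y + 1][x - 1] + 2.0f * img[y + 1][x] + img[y + 1][x + 1])
                   - (img[y - 1][x - 1] + 2.0f * img[y - 1][x] + img[y - 1][x + 1]);
    return {gx, gy};
}

// Gradient direction in radians; it points across the edge, so the
// edge tangent is this angle plus 90 degrees.
float orientation(const Gradient& g) {
    return std::atan2(g.gy, g.gx);
}
```

On a smooth intensity ramp the angle varies continuously. On a one-pixel-wide binary curve the 3x3 neighbourhood only ever sees a handful of patterns, which is the quantization described above.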

Related

opencv C++ negative X,Y coordinates from detections (yolo detections)

I am using the YOLO detection algorithm to predict bounding boxes, but the detection returns some values as negative integers. The cv::rectangle function still draws the correct rectangles on the images, so it is puzzling why there are negative or even very large values in the detection coordinates. Here the get_rect function returns x, y, width, height after applying NMS:
std::vector<uint> rr = get_rect(img, res[j].bbox);
Later I convert it into a format OpenCV understands:
cv::Rect r = cv::Rect(rr[0],rr[1],rr[2],rr[3]);
When I print the following values of Rect r, some of them make no sense:
r.x = 398
r.y = 1431655936
r.width = 22
r.height = -1431655867
As can be seen, the values of the y coordinate and the height are completely outside the canvas. Is there any reason for this?
I also made sure that the input image dimensions, the inference dimensions, and the output rendering dimensions are all the same. The cv::rectangle function also draws all the rectangles correctly at their respective object locations.

Given camera matrices, how to find point correspondences using OpenCV?

I'm following this tutorial, which uses Features2D + Homography. If I have known camera matrix for each image, how can I optimize the result? I tried some images, but it didn't work well.
//Edit
After reading some material, I think I should rectify the two images first. But the rectification is not perfect, so a vertical line in image 1 generally corresponds to a vertical band in image 2. Are there any good algorithms?
I'm not sure if I understand your problem. Do you want to find corresponding points between the images, or do you want to improve the correctness of your matches by using the camera intrinsics?
In principle, in order to use camera geometry for finding matches, you would need the fundamental or essential matrix, depending on whether you know the camera intrinsics (i.e. calibrated camera). That means you would need an estimate for the relative rotation and translation of the camera. Then, by computing the epipolar lines corresponding to the features found in one image, you would need to search along those lines in the second image to find the best match. However, I think it would be better to simply rely on automatic feature matching. Given the fundamental/essential matrix, you could try your luck with correctMatches, which will move the correspondences such that the reprojection error is minimised.
Tips for better matches
To increase the stability and saliency of automatic matches, it usually pays to
Adjust the parameters of the feature detector
Try different detection algorithms
Perform a ratio test to filter out those keypoints which have a very similar second-best match and are therefore unstable. This is done like this:
Mat descriptors_1, descriptors_2; // obtained from feature detector
BFMatcher matcher(NORM_L2, false); // norm depends on feature detector
vector<DMatch> matches;
vector<vector<DMatch>> match_candidates;
const float ratio = 0.8f; // or something
matcher.knnMatch(descriptors_1, descriptors_2, match_candidates, 2);
for (size_t i = 0; i < match_candidates.size(); i++)
{
    // knnMatch can return fewer than 2 candidates for a descriptor
    if (match_candidates[i].size() < 2)
        continue;
    if (match_candidates[i][0].distance < ratio * match_candidates[i][1].distance)
        matches.push_back(match_candidates[i][0]);
}
A more involved way of filtering is to compute the reprojection error for each keypoint in the first frame. This means computing the corresponding epipolar line in the second image and then checking how far the supposed matching point is from that line. Throwing away the points whose distance exceeds some threshold removes the matches that are incompatible with the epipolar geometry (which I assume is known). Computing the error can be done like this (I honestly do not remember where I took this code from and I may have modified it a bit; also, the SO editor is buggy when code is inside lists, sorry for the bad formatting):
double computeReprojectionError(vector<Point2f>& imgpts1, vector<Point2f>& imgpts2, Mat& inlier_mask, const Mat& F)
{
    double err = 0;
    vector<Vec3f> lines[2];
    // assumes an 8-bit mask such as the one returned by findFundamentalMat
    int npt = countNonZero(inlier_mask);
    if (npt == 0)
        return 0; // no inliers, nothing to measure
    // strip outliers so validation is constrained to the correspondences
    // which were used to estimate F
    vector<Point2f> imgpts1_copy(npt),
                    imgpts2_copy(npt);
    int c = 0;
    for (int k = 0; k < (int)inlier_mask.total(); k++)
    {
        if (inlier_mask.at<uchar>(k) != 0)
        {
            imgpts1_copy[c] = imgpts1[k];
            imgpts2_copy[c] = imgpts2[k];
            c++;
        }
    }
    Mat imgpt[2] = { Mat(imgpts1_copy), Mat(imgpts2_copy) };
    computeCorrespondEpilines(imgpt[0], 1, F, lines[0]);
    computeCorrespondEpilines(imgpt[1], 2, F, lines[1]);
    for (int j = 0; j < npt; j++)
    {
        // error is computed as the distance between a point u_l = (x,y) and the epipolar line of its
        // corresponding point u_r in the second image plus the reverse, so errij = d(u_l, F^T * u_r) + d(u_r, F*u_l)
        // for the purpose of this function, we imagine imgpts1 to be the "left" image and imgpts2 the "right" one. Doesn't make a difference
        Point2f u_l = imgpts1_copy[j],
                u_r = imgpts2_copy[j];
        float a2 = lines[1][j][0], // epipolar line
              b2 = lines[1][j][1],
              c2 = lines[1][j][2];
        float norm_factor2 = sqrt(pow(a2, 2) + pow(b2, 2));
        float a1 = lines[0][j][0],
              b1 = lines[0][j][1],
              c1 = lines[0][j][2];
        float norm_factor1 = sqrt(pow(a1, 2) + pow(b1, 2));
        // distance of (x,y) to line (a,b,c) = |ax + by + c| / sqrt(a^2 + b^2)
        double errij =
            fabs(u_l.x * a2 + u_l.y * b2 + c2) / norm_factor2 +
            fabs(u_r.x * a1 + u_r.y * b1 + c1) / norm_factor1;
        err += errij; // at this point, apply threshold and mark bad matches
    }
    return err / npt;
}
The point is: grab the fundamental matrix, use it to compute epilines for all the points, and then compute the distance (each line is given as the coefficients (a, b, c) of ax + by + c = 0, so you need a little algebra to get the distance). This is somewhat similar in outcome to what findFundamentalMat with the RANSAC method does. It returns a mask in which each match has either a 1, meaning that it was used to estimate the matrix, or a 0 if it was thrown out. But estimating the fundamental matrix like this will probably be less accurate than using chessboards.
EDIT: Looks like oarfish beat me to it, but I'll leave this here.
The fundamental matrix (F) defines a mapping from a point in the left image to a line in the right image on which the corresponding point must lie, assuming perfect calibration. This is the epipolar line, i.e. the line through the point in the left image and the two epipoles of the stereo camera pair. For references, see these lecture notes and this chapter of the HZ book.
Given a set of point correspondences in the left and right images, (p_L, p_R), from SURF (or any other feature matcher), and given F, the constraint from the epipolar geometry of the stereo pair says that p_R should lie on the epipolar line projected by p_L onto the right image, i.e. p_R^T * F * p_L = 0 (with the points in homogeneous coordinates).
In practice, calibration errors from noise as well as erroneous feature matches lead to a non-zero value.
However, using this idea, you can then perform outlier removal by rejecting those feature matches for which this expression is greater than a certain threshold value, i.e. reject (p_L, p_R) if and only if |p_R^T * F * p_L| > threshold.
When selecting this threshold, keep in mind that it is the distance in image space of a point from an epipolar line that you are willing to tolerate, which in some sense is your epipolar error tolerance.
Degenerate case: To visually imagine what this means, let us assume that the stereo pair differ only in a pure X-translation. Then the epipolar lines are horizontal. This means that you can connect the feature matched point pairs by a line and reject those pairs whose line slope is not close to zero. The equation above is a generalization of this idea to arbitrary stereo rotation and translation, which is accounted for by the matrix F.
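As a rough sketch of this rejection test, the following plain C++ computes the pixel distance from p_R to the epipolar line F * p_L and compares it with a tolerance. Mat3, Pt, epipolarDistance and isInlier are hypothetical names for this illustration, and F is assumed to map left-image points to right-image lines. Dividing by the norm of the line's (a, b) part is what turns the algebraic residual into a distance in image space.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical stand-ins for a 3x3 fundamental matrix and an image point.
using Mat3 = std::array<std::array<double, 3>, 3>;
struct Pt { double x, y; };

// Distance in pixels from pR to the epipolar line l = F * (pL.x, pL.y, 1).
double epipolarDistance(const Mat3& F, const Pt& pL, const Pt& pR) {
    const double a = F[0][0] * pL.x + F[0][1] * pL.y + F[0][2];
    const double b = F[1][0] * pL.x + F[1][1] * pL.y + F[1][2];
    const double c = F[2][0] * pL.x + F[2][1] * pL.y + F[2][2];
    const double n = std::hypot(a, b);
    assert(n > 0.0); // degenerate line, e.g. pL is the epipole
    return std::fabs(a * pR.x + b * pR.y + c) / n;
}

// Keep the match only if pR lies within tol pixels of the epipolar line.
bool isInlier(const Mat3& F, const Pt& pL, const Pt& pR, double tol) {
    return epipolarDistance(F, pL, pR) <= tol;
}
```

With the pure X-translation matrix F = [0 0 0; 0 0 -1; 0 1 0], the distance reduces to the difference in the y coordinates of the two points, which is the horizontal-line check described above.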
Your specific images: It looks like your feature matches are sparse. I suggest instead to use a dense feature matching approach so that after outlier removal, you are still left with a sufficient number of good-quality matches. I'm not sure which dense feature matcher is already implemented in OpenCV, but I suggest starting here.
Given your pictures, you are trying to do stereo matching.
This page will be helpful. The rectification you want can be done using stereoCalibrate and then stereoRectify.
The result (from the doc):
In order to find the fundamental matrix, you need correct correspondences, but in order to get good correspondences, you need a good estimate of the fundamental matrix. This might sound like an impossible chicken-and-egg problem, but there are well-established methods to do this, such as RANSAC.
It randomly selects a small set of correspondences, uses those to calculate a fundamental matrix (using the 7 or 8 point algorithm) and then tests how many of the other correspondences comply with this matrix (using the method described by scribbleink for measuring the distance between point and epipolar line). It keeps testing new combinations of correspondences for a certain number of iterations and selects the one with the most inliers.
This is already implemented in OpenCV as cv::findFundamentalMat (http://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#findfundamentalmat). Select the method CV_FM_RANSAC to use RANSAC to remove bad correspondences. It will output a mask marking the inlier correspondences.
The requirement for this is that the points do not all lie on the same plane.

Compare intensity pixel value Vec3b in OpenCV

I have a 3 channel Mat image, type is CV_8UC3.
I want to compare, in a loop, the intensity value of a pixel with its neighbours and then set 0 or 1 if the neighbour is greater or not.
I can get the intensity calling Img.at<Vec3b>(x,y).
But my question is: how can I compare two Vec3b?
Should I compare the pixel values for every channel (BGR or Vec3b[0], Vec3b[1] and Vec3b[2]), and then merge the three channel results into a single Mat object?
Me again :)
If you want to compare (greater or less) two RGB values you need to project the 3-dimensional RGB space onto a plane or axis.
Of course, there are many possibilities to do this, but an easy way would be to use the HSV color space. The hue (H), however, is not appropriate as a linear order function because it is circular (i.e. the value 1.0 is identical with 0.0, so you cannot decide if 0.5 > 0.0 or 0.5 < 0.0). However, the saturation (S) or the value (V) are appropriate projection functions for your purpose:
If you want to have colored pixels "larger" than monochrome pixels, you will prefer S.
If you want to have lighter pixels larger than darker pixels, you will probably prefer V.
Also any combination of S and V would be a valid projection function, e.g. S+V.
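A minimal sketch of such projection functions, in plain C++ so it stands alone: Bgr is a hypothetical stand-in for cv::Vec3b, and the function names are made up for this example. V is taken as the largest channel and S as (max - min) / max, which is the usual HSV definition.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for cv::Vec3b (blue, green, red).
using Bgr = std::array<std::uint8_t, 3>;

// HSV value: the largest channel, in [0, 255].
int hsvValue(const Bgr& p) {
    return std::max({p[0], p[1], p[2]});
}

// HSV saturation in [0, 1]; defined as 0 for black.
double hsvSaturation(const Bgr& p) {
    const int mx = std::max({p[0], p[1], p[2]});
    const int mn = std::min({p[0], p[1], p[2]});
    return mx == 0 ? 0.0 : static_cast<double>(mx - mn) / mx;
}

// Weighted combination of S and V (V scaled to [0, 1]), e.g. S+V for wS = wV = 1.
double combined(const Bgr& p, double wS, double wV) {
    assert(wS >= 0.0 && wV >= 0.0);
    return wS * hsvSaturation(p) + wV * hsvValue(p) / 255.0;
}

// 1 if the neighbour is "greater" than the centre pixel by V, otherwise 0.
int neighbourIsGreater(const Bgr& centre, const Bgr& neighbour) {
    return hsvValue(neighbour) > hsvValue(centre) ? 1 : 0;
}
```

Because each pixel is reduced to one number, the comparison result is a single 0 or 1 per neighbour and there is nothing to merge across channels afterwards.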
As far as I understand, you want a measure to calculate the distance/similarity between two Vec3b pixels. This reduces to the general problem of finding the distance between two vectors in an n-dimensional space.
One of the famous measures (and I think this is what you're asking for), is the Euclidean distance.
If you are using Opencv then you can simply use:
cv::Vec3b a(1, 1, 1);
cv::Vec3b b(5, 5, 5);
double dist = cv::norm(a, b, cv::NORM_L2);
You can refer to this for reading about cv::norm and its options.
Edit: If you are doing this to measure color similarity, it's recommended to use the LAB color space as it's proved that Euclidean distance in LAB space is a good approximation for human perception of colors.
Edit 2: I see what you mean, for this you can get the magnitude of each vector and then compare them, something like this:
double a_magnitude = cv::norm(a, cv::NORM_L2);
double b_magnitude = cv::norm(b, cv::NORM_L2);
if (a_magnitude > b_magnitude)
{
    // do something
}
else
{
    // do something else.
}

Pass vector<Point2f> to getAffineTransform

I'm trying to calculate affine transformation between two consecutive frames from a video. So I have found the features and got the matched points in the two frames.
FastFeatureDetector detector;
vector<KeyPoint> frame1_features;
vector<KeyPoint> frame2_features;
detector.detect(frame1, frame1_features, Mat());
detector.detect(frame2, frame2_features, Mat());
vector<Point2f> features1; //matched points in 1st image
vector<Point2f> features2; //matched points in 2nd image
for (size_t i = 0; i < frame2_features.size() && i < frame1_features.size(); ++i)
{
    double diff;
    diff = pow((frame1.at<uchar>(frame1_features[i].pt) - frame2.at<uchar>(frame2_features[i].pt)), 2);
    if (diff < SSD) //SSD is sum of squared differences between two image regions
    {
        features1.push_back(frame1_features[i].pt);
        features2.push_back(frame2_features[i].pt);
    }
}
Mat affine = getAffineTransform(features1, features2);
The last line gives the following error :
OpenCV Error: Assertion failed (src.checkVector(2, CV_32F) == 3 && dst.checkVector(2, CV_32F) == 3) in getAffineTransform
Can someone please tell me how to calculate the affine transformation with a set of matched points between the two frames?
Your problem is that you need exactly 3 point correspondences between the images.
If you have more than 3 correspondences, you should optimize the transformation to fit all the correspondences (except outliers).
Therefore, I recommend taking a look at the findHomography() function (http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#findhomography).
It calculates a perspective transformation between the correspondences and needs at least 4 point correspondences.
Because you have more than 3 correspondences and affine transformations are a subset of perspective transformations, this should be appropriate for you.
Another advantage of the function is that it is able to detect outliers (correspondences that do not fit to the transformation and the other points) and these are not considered for transformation calculation.
To sum up, use findHomography(features1 , features2, CV_RANSAC) instead of getAffineTransform(features1 , features2).
I hope I could help you.
As I read from your code and assertion, there is something wrong with your vectors.
int checkVector(int elemChannels, int depth)
This function returns N if the matrix is 1-channel (N x ptdim) or ptdim-channel (1 x N) or (N x 1), and a negative number otherwise.
According to the documentation (http://docs.opencv.org/modules/imgproc/doc/geometric_transformations.html#getaffinetransform), getAffineTransform "calculates an affine transform from three pairs of the corresponding points".
You seem to have more or less than three points in one or both of your vectors.
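To illustrate what fitting an affine transform to more than three correspondences can look like, here is a plain least-squares sketch in standalone C++. It is not what getAffineTransform or findHomography do internally and it does no outlier rejection, so a RANSAC-based function is still the safer choice for real matches. Pt, Affine, Mat3, det3 and fitAffine are hypothetical names for this example.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };
// Row-major 2x3 matrix [a b tx; c d ty].
using Affine = std::array<double, 6>;
using Mat3 = std::array<std::array<double, 3>, 3>;

static double det3(const Mat3& m) {
    return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

// Least-squares affine fit mapping src[i] to dst[i].
// Needs at least 3 correspondences that are not all collinear.
Affine fitAffine(const std::vector<Pt>& src, const std::vector<Pt>& dst) {
    assert(src.size() == dst.size() && src.size() >= 3);
    // Normal equations: M * [a b tx]^T = rx and M * [c d ty]^T = ry.
    Mat3 M{};
    std::array<double, 3> rx{}, ry{};
    for (std::size_t i = 0; i < src.size(); ++i) {
        const double v[3] = {src[i].x, src[i].y, 1.0};
        for (int r = 0; r < 3; ++r) {
            for (int c = 0; c < 3; ++c) M[r][c] += v[r] * v[c];
            rx[r] += v[r] * dst[i].x;
            ry[r] += v[r] * dst[i].y;
        }
    }
    const double d = det3(M);
    assert(std::fabs(d) > 1e-12); // collinear source points

    Affine A{};
    for (int col = 0; col < 3; ++col) {
        // Cramer's rule: replace one column of M by the right-hand side.
        Mat3 Mx = M, My = M;
        for (int r = 0; r < 3; ++r) {
            Mx[r][col] = rx[r];
            My[r][col] = ry[r];
        }
        A[col] = det3(Mx) / d;
        A[3 + col] = det3(My) / d;
    }
    return A;
}
```

With exactly three non-collinear pairs the fit passes through all of them. With more pairs it minimizes the sum of squared distances between the transformed source points and the destination points.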

Number of Sides Required to draw a circle in OpenGL

Does anyone know some algorithm to calculate the number of sides required to approximate a circle using polygon, if radius, r of the circle and maximum departure of the polygon from circularity, D is given? I really need to find the number of sides as I need to draw the approximated circle in OpenGL.
Also, we have the resolution of the screen in NDC coordinates per pixel given by P and solving D = P/2, we could guarantee that our circle is within half-pixel of accuracy.
What you're describing here is effectively a quality factor, which often goes hand-in-hand with error estimates.
A common way we handle this is to calculate the error for a small portion of the circumference of the circle. The most trivial is to determine the difference in arc length of a slice of the circle, compared to a line segment joining the same two points on the circumference. You could use more effective measures, like difference in area, radius, etc, but this method should be adequate.
Think of an octagon inscribed in a perfect circle. In this case, the error is the difference between the length of the line between two adjacent points on the octagon and the arc length of the circle joining those two points.
The arc length is easy enough to calculate: r * theta, where r is your radius, and theta is the angle, in radians, between the two points, assuming you draw lines from each of these points to the center of the circle/polygon. For a closed polygon with n sides, the angle is just (2*PI/n) radians. Let the arc length corresponding to this value of n be equal to A, ie A=2*PI*r/n.
The line length between the two points is easily calculated. Just divide your circle into n isosceles triangles, and each of those into two right-triangles. You know the angle in each right triangle is theta/2 = (2*PI/n)/2 = (PI/n), and the hypotenuse is r. So, you get your equation of sin(PI/n)=x/r, where x is half the length of the line segment joining two adjacent points on your inscribed polygon. Let the full segment length be B (ie: B=2x, so B=2*r*sin(PI/n)).
Now, just calculate the relative error, E = |A-B| / A (ie: |TrueValue-ApproxValue|/|TrueValue|), and you get a nice little fraction, in decimal, describing your error. You can use the above equations to set a constraint on E (ie: it cannot be greater than some value, say, 0.05), in order for it to "look good".
So, you could write a function that calculates A, B, and E from the above equations, and loop through values of n, and have it stop looping when the calculated value of E is less than your threshold.
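The loop described above might look like the following sketch. The function names are made up for this example. Note that r cancels out of E, so this criterion alone gives the same n for every radius. Because the question is phrased in terms of a maximum departure D, a second function is included under the assumption that D means the largest gap between the circle and an inscribed polygon's edge, which is r * (1 - cos(PI/n)) at the middle of each edge.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

const double kPi = std::acos(-1.0);

// Smallest n for which the relative chord-vs-arc error
// E = |A - B| / A = 1 - sin(PI/n) / (PI/n) drops below maxRelError.
int sidesForRelativeError(double maxRelError) {
    assert(maxRelError > 0.0);
    for (int n = 3;; ++n) {
        const double halfAngle = kPi / n;
        const double e = 1.0 - std::sin(halfAngle) / halfAngle; // r cancels out
        if (e < maxRelError) return n;
    }
}

// Smallest n for which an inscribed n-gon stays within D of a circle of
// radius r, assuming D is the gap at the middle of an edge:
// r * (1 - cos(PI/n)) <= D.
int sidesForMaxDeviation(double r, double D) {
    assert(r > 0.0 && D > 0.0);
    if (D >= 0.5 * r) return 3; // a triangle already deviates by only r/2
    const double n = kPi / std::acos(1.0 - D / r);
    return std::max(3, static_cast<int>(std::ceil(n)));
}
```

With D = P/2 from the question, sidesForMaxDeviation(r, P / 2) would give the side count for half-pixel accuracy, provided r and P are expressed in the same units.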
I would say that you need to set the number of sides depending on two variables: the radius and the zoom (if you allow zooming).
A circle of radius 20 pixels can look OK with 32 to 56 sides, but if you use the same number of sides for a radius of 200 pixels, it will not be enough.
numberOfSides = radius * 3
If you allow zooming in and out, you will need to do something like this:
numberOfSides = radiusOfPaintedCircle * 3
When you zoom in, radiusOfPaintedCircle will be bigger than the radius "property" of the circle being drawn.
I've got an algorithm to draw a circle using fixed function opengl, maybe it'll help?
It's hard to know what you mean when you say you want to "approximate a circle using polygon"
You'll notice in my algorithm below that I don't calculate the number of lines needed to draw the circle, I just iterate between 0 .. 2Pi, stepping the angle by 0.1 each time, drawing a line with glVertex2f to that point on the circle, from the previous point.
void Circle::Render()
{
    glLoadIdentity();
    glPushMatrix();
    glBegin(GL_LINES);
    glColor3f(_vColour._x, _vColour._y, _vColour._z);
    glVertex3f(_State._position._x, _State._position._y, 0);
    glVertex3f(
        (_State._position._x + (sinf(_State._angle)*_rRadius)),
        (_State._position._y + (cosf(_State._angle)*_rRadius)),
        0
    );
    glEnd();
    glTranslatef(_State._position._x, _State._position._y, 0);
    glBegin(GL_LINE_LOOP);
    glColor3f(_vColour._x, _vColour._y, _vColour._z);
    for (float angle = 0.0f; angle < g_k2Pi; angle += 0.1f)
        glVertex2f(sinf(angle)*_rRadius, cosf(angle)*_rRadius);
    glEnd();
    glPopMatrix();
}