This question is on the OpenCV functions findHomography, getPerspectiveTransform & getAffineTransform
What is the difference between findHomography and getPerspectiveTransform? My understanding from the documentation is that getPerspectiveTransform computes the transform using exactly 4 correspondences (the minimum required to compute a homography/perspective transform), whereas findHomography computes the transform even if you provide more than 4 correspondences (presumably using something like a least-squares method?).
Is this correct?
(In which case the only reason OpenCV still continues to support getPerspectiveTransform should be legacy? )
My next concern is that I want to know if there is an equivalent to findHomography for computing an affine transformation, i.e. a function which uses a least-squares or an equivalent robust method to compute an affine transformation.
According to the documentation getAffineTransform takes in only 3 correspondences (which is the min required to compute an affine transform).
Q #1: Right, the findHomography tries to find the best transform between two sets of points. It uses something smarter than least squares, called RANSAC, which has the ability to reject outliers - if at least 50% + 1 of your data points are OK, RANSAC will do its best to find them, and build a reliable transform.
The getPerspectiveTransform has a lot of useful reasons to stay - it is the base for findHomography, and it is useful in many situations where you only have 4 points, and you know they are the correct ones. The findHomography is usually used with sets of points detected automatically - you can find many of them, but with low confidence. getPerspectiveTransform is good when you know for sure 4 corners - like manual marking, or automatic detection of a rectangle.
Q #2 There is no equivalent for affine transforms. You can use findHomography, because affine transforms are a subset of homographies.
I concur with everything @vasile has written. I just want to add some observations:
getPerspectiveTransform() and getAffineTransform() are meant to work on 4 or 3 points (respectively) that are known to be correct correspondences. On real-life images taken with a real camera, you can never get correspondences that accurate, neither with automatic nor with manual marking of the corresponding points.
There are always outliers. Just look at the simple case of wanting to fit a curve through points (e.g. take a generative equation with noise, y1 = f(x) = 3.12x + gauss_noise or y2 = g(x) = 0.1x^2 + 3.1x + gauss_noise): it will be much easier to find a good quadratic function to estimate the points in both cases than a good linear one. A quadratic might be overkill, but in most cases it will not be (after removing outliers), and if you want to fit a straight line there you had better be mightily sure that it is the right model, otherwise you are going to get unusable results.
That said, if you are mightily sure that affine transform is the right one, here's a suggestion:
use findHomography, that has RANSAC incorporated in to the functionality, to get rid of the outliers and get an initial estimate of the image transformation
select 3 correct matches/correspondences (that fit the homography found), or reproject 3 points from the 1st image to the 2nd (using the homography)
use those 3 matches (that are as close to correct as you can get) in getAffineTransform()
wrap all of that in your own findAffine() if you want - and voila!
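For illustration, a minimal sketch of that findAffine() recipe (the helper name and the choice of the three points are mine, and the three chosen points must not be collinear):

#include <opencv2/opencv.hpp>

// 1) findHomography + RANSAC rejects outliers, 2) three points are reprojected
// through H, 3) getAffineTransform builds the affine from those three pairs.
cv::Mat findAffineSketch(const std::vector<cv::Point2f>& obj,
                         const std::vector<cv::Point2f>& scene)
{
    cv::Mat H = cv::findHomography(obj, scene, CV_RANSAC);

    // Reproject three well-spread points from image 1 into image 2 using H
    // (ideally pick inliers that are far apart and not collinear).
    std::vector<cv::Point2f> src(3), dst;
    src[0] = obj.front();
    src[1] = obj[obj.size() / 2];
    src[2] = obj.back();
    cv::perspectiveTransform(src, dst, H);

    // Build the affine transform from the three (near-)correct correspondences.
    return cv::getAffineTransform(src.data(), dst.data());
}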
Re Q#2, estimateRigidTransform is the over-determined (more-than-minimum-points) equivalent of getAffineTransform. I don't know if it was in OCV when this was first posted, but it's available in 2.4.
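For reference, a minimal call might look like this (OpenCV 2.4-era API; with fullAffine = true it fits a full 6-DOF affine to all correspondences at once):

#include <opencv2/video/tracking.hpp>

std::vector<cv::Point2f> src_pts, dst_pts;  // filled from your matched keypoints
// fullAffine = true  -> full 6-DOF affine (translation, rotation, scale, shear)
// fullAffine = false -> similarity transform (translation, rotation, uniform scale)
cv::Mat A = cv::estimateRigidTransform(src_pts, dst_pts, true);
// A is a 2x3 matrix; it can be empty if the estimation fails, so check A.empty().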
There is an easy solution for finding the affine transform of an over-determined system of equations.
Note that in general an Affine transform finds a solution to the over-determined system of linear equations Ax=B by using a pseudo-inverse or a similar technique, so
x = (Aᵀ A)⁻¹ Aᵀ B
Moreover, this is handled in the core openCV functionality by a simple call to solve(A, B, X).
Familiarize yourself with the code of Affine transform in opencv/modules/imgproc/src/imgwarp.cpp: it really does just two things:
a. rearranges inputs to create a system Ax=B;
b. then calls solve(A, B, X);
NOTE: ignore the function comments in the openCV code - they are confusing and don’t reflect the actual ordering of the elements in the matrices. If you are solving [u, v]’= Affine * [x, y, 1] the rearrangement is:
    | x1  y1  1   0   0   0 |
    | 0   0   0   x1  y1  1 |
A = | x2  y2  1   0   0   0 |
    | 0   0   0   x2  y2  1 |
    | x3  y3  1   0   0   0 |
    | 0   0   0   x3  y3  1 |

X = [Affine11, Affine12, Affine13, Affine21, Affine22, Affine23]’

B = [u1, v1, u2, v2, u3, v3]’
All you need to do is to add more points. To make solve(A, B, X) work on an over-determined system, add the DECOMP_SVD parameter. To see the powerpoint slides on the topic, use this link. If you’d like to learn more about the pseudo-inverse in the context of computer vision, the best source is Computer Vision, chapter 15 and appendix C.
If you are still unsure how to add more points see my code below:
// extension of getAffineTransform for n >= 3 points (least-squares via SVD)
cv::Mat getAffineTransformOverdetermined( const Point2f src[], const Point2f dst[], int n )
{
    Mat M(2, 3, CV_64F), X(6, 1, CV_64F, M.data); // output: X shares memory with M
    double* a = (double*)malloc(12*n*sizeof(double));
    double* b = (double*)malloc(2*n*sizeof(double));
    Mat A(2*n, 6, CV_64F, a), B(2*n, 1, CV_64F, b); // input
    for( int i = 0; i < n; i++ )
    {
        int j = i*12;   // 2 equations (in x, y) with 6 members: skip 12 elements
        int k = i*12+6; // second equation: skip extra 6 elements
        a[j] = a[k+3] = src[i].x;
        a[j+1] = a[k+4] = src[i].y;
        a[j+2] = a[k+5] = 1;
        a[j+3] = a[j+4] = a[j+5] = 0;
        a[k] = a[k+1] = a[k+2] = 0;
        b[i*2] = dst[i].x;
        b[i*2+1] = dst[i].y;
    }
    solve( A, B, X, DECOMP_SVD );
    free(a);   // malloc'ed memory must be released with free, not delete
    free(b);
    return M;
}
// call original transform
vector<Point2f> src(3);
vector<Point2f> dst(3);
src[0] = Point2f(0.0, 0.0);src[1] = Point2f(1.0, 0.0);src[2] = Point2f(0.0, 1.0);
dst[0] = Point2f(0.0, 0.0);dst[1] = Point2f(1.0, 0.0);dst[2] = Point2f(0.0, 1.0);
Mat M = getAffineTransform(Mat(src), Mat(dst));
cout<<M<<endl;
// call new transform
src.resize(4); src[3] = Point2f(22, 2);
dst.resize(4); dst[3] = Point2f(22, 2);
Mat M2 = getAffineTransformOverdetermined(src.data(), dst.data(), src.size());
cout<<M2<<endl;
getAffineTransform: an affine transform is a combination of translation, scale, shear, and rotation
https://www.mathworks.com/discovery/affine-transformation.html
https://www.tutorialspoint.com/computer_graphics/2d_transformation.htm
getPerspectiveTransform: a perspective transform is a projective mapping
Related
I'm following this tutorial, which uses Features2D + Homography. If I have a known camera matrix for each image, how can I optimize the result? I tried some images, but it didn't work well.
//Edit
After reading some materials, I think I should rectify the two images first. But the rectification is not perfect, so a vertical line on image 1 generally corresponds to a vertical band on image 2. Are there any good algorithms?
I'm not sure if I understand your problem. Do you want to find corresponding points between the images, or do you want to improve the correctness of your matches by using the camera intrinsics?
In principle, in order to use camera geometry for finding matches, you would need the fundamental or essential matrix, depending on whether you know the camera intrinsics (i.e. calibrated camera). That means you would need an estimate for the relative rotation and translation of the camera. Then, by computing the epipolar lines corresponding to the features found in one image, you would need to search along those lines in the second image to find the best match. However, I think it would be better to simply rely on automatic feature matching. Given the fundamental/essential matrix, you could try your luck with correctMatches, which will move the correspondences such that the reprojection error is minimised.
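For completeness, a correctMatches call looks roughly like this (a sketch; depending on your OpenCV version the points may need to be passed as 1xN two-channel double Mats, as shown):

// Refine matched points so that they satisfy the epipolar constraint exactly,
// given a fundamental matrix F (minimises the reprojection error per pair).
std::vector<cv::Point2f> pts1, pts2;           // your matched points
cv::Mat p1 = cv::Mat(pts1).reshape(2, 1);      // 1xN, 2 channels
cv::Mat p2 = cv::Mat(pts2).reshape(2, 1);
p1.convertTo(p1, CV_64FC2);
p2.convertTo(p2, CV_64FC2);
cv::Mat new_p1, new_p2;
cv::correctMatches(F, p1, p2, new_p1, new_p2); // F: 3x3 fundamental matrix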
Tips for better matches
To increase the stability and saliency of automatic matches, it usually pays to
Adjust the parameters of the feature detector
Try different detection algorithms
Perform a ratio test to filter out those keypoints which have a very similar second-best match and are therefore unstable. This is done like this:
Mat descriptors_1, descriptors_2; // obtained from feature detector
BFMatcher matcher;
vector<DMatch> matches;
matcher = BFMatcher(NORM_L2, false); // norm depends on feature detector
vector<vector<DMatch>> match_candidates;
const float ratio = 0.8f; // or something
matcher.knnMatch(descriptors_1, descriptors_2, match_candidates, 2);
for (size_t i = 0; i < match_candidates.size(); i++)
{
    if (match_candidates[i][0].distance < ratio * match_candidates[i][1].distance)
        matches.push_back(match_candidates[i][0]);
}
A more involved way of filtering would be to compute the reprojection error for each keypoint in the first frame. This means computing the corresponding epipolar line in the second image and then checking how far its supposed matching point is away from that line. Throwing away those points whose distance exceeds some threshold would remove the matches which are incompatible with the epipolar geometry (which I assume would be known). Computing the error can be done like this (I honestly do not remember where I took this code from and I may have modified it a bit, also the SO editor is buggy when code is inside lists, sorry for the bad formatting):
double computeReprojectionError(vector<Point2f>& imgpts1, vector<Point2f>& imgpts2, Mat& inlier_mask, const Mat& F)
{
    double err = 0;
    vector<Vec3f> lines[2];
    int npt = sum(inlier_mask)[0];

    // strip outliers so validation is constrained to the correspondences
    // which were used to estimate F
    vector<Point2f> imgpts1_copy(npt),
                    imgpts2_copy(npt);
    int c = 0;
    for (int k = 0; k < inlier_mask.size().height; k++)
    {
        if (inlier_mask.at<uchar>(0,k) == 1)
        {
            imgpts1_copy[c] = imgpts1[k];
            imgpts2_copy[c] = imgpts2[k];
            c++;
        }
    }

    Mat imgpt[2] = { Mat(imgpts1_copy), Mat(imgpts2_copy) };
    computeCorrespondEpilines(imgpt[0], 1, F, lines[0]);
    computeCorrespondEpilines(imgpt[1], 2, F, lines[1]);
    for (int j = 0; j < npt; j++)
    {
        // error is computed as the distance between a point u_l = (x,y) and the
        // epipolar line of its corresponding point u_r in the second image plus
        // the reverse, so err_ij = d(u_l, F^T * u_r) + d(u_r, F * u_l)
        Point2f u_l = imgpts1_copy[j], // for the purpose of this function, we imagine imgpts1 to be the "left" image and imgpts2 the "right" one. Doesn't make a difference
                u_r = imgpts2_copy[j];
        float a2 = lines[1][j][0], // epipolar line corresponding to u_r (in the left image)
              b2 = lines[1][j][1],
              c2 = lines[1][j][2];
        float norm_factor2 = sqrt(pow(a2, 2) + pow(b2, 2));
        float a1 = lines[0][j][0], // epipolar line corresponding to u_l (in the right image)
              b1 = lines[0][j][1],
              c1 = lines[0][j][2];
        float norm_factor1 = sqrt(pow(a1, 2) + pow(b1, 2));

        double errij =
            fabs(u_l.x * a2 + u_l.y * b2 + c2) / norm_factor2 +
            fabs(u_r.x * a1 + u_r.y * b1 + c1) / norm_factor1; // distance of (x,y) to line (a,b,c) = |ax + by + c| / sqrt(a^2 + b^2)
        err += errij; // at this point, apply threshold and mark bad matches
    }
    return err / npt;
}
The point is, grab the fundamental matrix, use it to compute epilines for all the points and then compute the distance (the lines are given in a parametric form so you need to do some algebra to get the distance). This is somewhat similar in outcome to what findFundamentalMat with the RANSAC method does. It returns a mask wherein for each match there is either a 1, meaning that it was used to estimate the matrix, or a 0 if it was thrown out. But estimating the fundamental Matrix like this will probably be less accurate than using chessboards.
EDIT: Looks like oarfish beat me to it, but I'll leave this here.
The fundamental matrix (F) defines a mapping from a point in the left image to a line in the right image on which the corresponding point must lie, assuming perfect calibration. This is the epipolar line, i.e. the line though the point in the left image and the two epipoles of the stereo camera pair. For references, see these lecture notes and this chapter of the HZ book.
Given a set of point correspondences in the left and right images, (p_L, p_R), from SURF (or any other feature matcher), and given F, the constraint from the epipolar geometry of the stereo pair says that p_R should lie on the epipolar line projected by p_L onto the right image, i.e.

p_Rᵀ F p_L = 0

In practice, calibration errors from noise as well as erroneous feature matches lead to a non-zero value.

However, using this idea, you can then perform outlier removal by rejecting those feature matches for which this equation is greater than a certain threshold value, i.e. reject (p_L, p_R) if and only if:

| p_Rᵀ F p_L | > threshold
When selecting this threshold, keep in mind that it is the distance in image space of a point from an epipolar line that you are willing to tolerate, which in some sense is your epipolar error tolerance.
Degenerate case: To visually imagine what this means, let us assume that the stereo pair differ only in a pure X-translation. Then the epipolar lines are horizontal. This means that you can connect the feature matched point pairs by a line and reject those pairs whose line slope is not close to zero. The equation above is a generalization of this idea to arbitrary stereo rotation and translation, which is accounted for by the matrix F.
Your specific images: It looks like your feature matches are sparse. I suggest instead to use a dense feature matching approach so that after outlier removal, you are still left with a sufficient number of good-quality matches. I'm not sure which dense feature matcher is already implemented in OpenCV, but I suggest starting here.
Given your pictures, you are trying to do stereo matching.
This page will be helpful. The rectification you want can be done using stereoCalibrate then stereoRectify.
In order to find the fundamental matrix, you need correct correspondences, but in order to get good correspondences, you need a good estimate of the fundamental matrix. This might sound like an impossible chicken-and-egg problem, but there are well-established methods to do this: RANSAC.
It randomly selects a small set of correspondences, uses those to calculate a fundamental matrix (using the 7- or 8-point algorithm) and then tests how many of the other correspondences comply with this matrix (using the method described by scribbleink for measuring the distance between a point and the epipolar line). It keeps testing new combinations of correspondences for a certain number of iterations and selects the one with the most inliers.
This is already implemented in OpenCV as cv::findFundamentalMat (http://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#findfundamentalmat). Select the method CV_FM_RANSAC to use RANSAC to remove bad correspondences. It will output a list of all the inlier correspondences.
The requirement for this is that the points do not all lie on the same plane.
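For reference, a minimal RANSAC call looks roughly like this (a sketch; the threshold and confidence values are just common defaults):

std::vector<cv::Point2f> pts1, pts2;    // matched points from both images
std::vector<uchar> inlier_mask;
cv::Mat F = cv::findFundamentalMat(pts1, pts2, CV_FM_RANSAC,
                                   3.0,   // max distance (px) from the epipolar line
                                   0.99,  // confidence
                                   inlier_mask);

// keep only the inlier correspondences
std::vector<cv::Point2f> in1, in2;
for (size_t i = 0; i < inlier_mask.size(); ++i)
    if (inlier_mask[i]) { in1.push_back(pts1[i]); in2.push_back(pts2[i]); }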
I want to fit a plane to a 3D point cloud. I use a RANSAC approach, where I sample several points from the point cloud, calculate the plane, and store the plane with the smallest error. The error is the distance between the points and the plane. I want to do this in C++, using Eigen.
So far, I sample points from the point cloud and center the data. Now, I need to fit the plane to the sampled points. I know I need to solve Mx = 0, but how do I do this? So far I have M (my samples), I want to know x (the plane) and this fit needs to be as close to 0 as possible.
I have no idea where to continue from here. All I have are my sampled points and I need more data.
From you question I assume that you are familiar with the Ransac algorithm, so I will spare you of lengthy talks.
In a first step, you sample three random points. You can use the Random class for that, but picking them not completely at random usually gives better results. To those points, you can simply fit a plane using Hyperplane::Through.
In the second step, you repetitively cross out some points with large Hyperplane::absDistance and perform a least-squares fit on the remaining ones. It may look like this:
Vector3f mu = mean(points);
Matrix3f covar = covariance(points, mu);
// the plane normal is the eigenvector of the covariance matrix with the
// smallest eigenvalue, i.e. the last column of U in the SVD of covar
JacobiSVD<Matrix3f> svd(covar, ComputeFullU);
Vector3f normal = svd.matrixU().col(2);
Hyperplane<float, 3> result(normal, mu);
Unfortunately, the functions mean and covariance are not built-in, but they are rather straightforward to code.
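In case it saves someone the typing, they could look roughly like this (a sketch assuming the points are stored in a std::vector<Eigen::Vector3f>):

#include <Eigen/Dense>
#include <vector>

// Centroid of the point set.
Eigen::Vector3f mean(const std::vector<Eigen::Vector3f>& pts)
{
    Eigen::Vector3f m = Eigen::Vector3f::Zero();
    for (const auto& p : pts) m += p;
    return m / static_cast<float>(pts.size());
}

// 3x3 covariance matrix of the point set about the given mean.
Eigen::Matrix3f covariance(const std::vector<Eigen::Vector3f>& pts,
                           const Eigen::Vector3f& mu)
{
    Eigen::Matrix3f c = Eigen::Matrix3f::Zero();
    for (const auto& p : pts)
    {
        Eigen::Vector3f d = p - mu;
        c += d * d.transpose();
    }
    return c / static_cast<float>(pts.size());
}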
Recall that the equation for a plane passing through origin is Ax + By + Cz = 0, where (x, y, z) can be any point on the plane and (A, B, C) is the normal vector perpendicular to this plane.
The equation for a general plane (that may or may not pass through origin) is Ax + By + Cz + D = 0, where the additional coefficient D represents how far the plane is away from the origin, along the direction of the normal vector of the plane. [Note that in this equation (A, B, C) forms a unit normal vector.]
Now, we can apply a trick here and fit the plane using only provided point coordinates. Divide both sides by D and rearrange this term to the right-hand side. This leads to A/D x + B/D y + C/D z = -1. [Note that in this equation (A/D, B/D, C/D) forms a normal vector with length 1/D.]
We can set up a system of linear equations accordingly, and then solve it by an Eigen solver as follows.
// Example for 5 points
Eigen::Matrix<double, 5, 3> matA; // rows: 5 points; columns: xyz coordinates (fill each row with one point before solving)
Eigen::Matrix<double, 5, 1> matB = -1 * Eigen::Matrix<double, 5, 1>::Ones();

// Find the plane normal
Eigen::Vector3d normal = matA.colPivHouseholderQr().solve(matB);

// Check if the fitting is healthy
double D = 1 / normal.norm();
normal.normalize(); // normal is a unit vector from now on
bool planeValid = true;
for (int i = 0; i < 5; ++i) { // compare Ax + By + Cz + D with 0.2 (ideally Ax + By + Cz + D = 0)
    if ( fabs( normal(0)*matA(i, 0) + normal(1)*matA(i, 1) + normal(2)*matA(i, 2) + D) > 0.2) {
        planeValid = false; // 0.2 is an experimental threshold; can be tuned
        break;
    }
}
This method is equivalent to the typical SVD-based method, but much faster. It is suitable when the points are known to be roughly planar. However, the SVD-based method is more numerically stable (when the plane is very far from the origin) and robust to outliers.
I am trying to calculate scale, rotation and translation between two consecutive frames of a video. So basically I matched keypoints and then used the OpenCV function findHomography() to calculate the homography matrix.
homography = findHomography(feature1 , feature2 , CV_RANSAC); //feature1 and feature2 are matched keypoints
My question is: How can I use this matrix to calculate scale, rotation and translation?
Can anyone provide me the code or explanation as to how to do it?
If you can use OpenCV 3.0, this decomposition method is available:
http://docs.opencv.org/3.0-beta/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#decomposehomographymat
The right answer is to use the homography as it is defined, dst = H ⋅ src, and explore what it does to small segments around a particular point.
Translation
Given a single point, for translation do
T = dst - (H ⋅ src)
Rotation
Given two points p1 and p2
p1' = H ⋅ p1
p2' = H ⋅ p2
Now just calculate the angle between vectors p1 p2 and p1' p2'.
Scale
You can use the same trick but now just compare the lengths: |p1 p2| and |p1' p2'|.
To be fair, use another segment orthogonal to the first and average the result. You will see that there is no constant scale factor or translation; they depend on the src location.
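A small sketch of that idea with OpenCV types (the probe segment of length 1 and the helper names are arbitrary choices of mine):

#include <opencv2/opencv.hpp>
#include <cmath>

// Map one point through H (perspectiveTransform divides by the third homogeneous coordinate).
static cv::Point2f applyH(const cv::Mat& H, const cv::Point2f& p)
{
    std::vector<cv::Point2f> in(1, p), out;
    cv::perspectiveTransform(in, out, H);
    return out[0];
}

// Local translation, rotation and scale of H around point p,
// probed with a short horizontal segment.
void localDecomposition(const cv::Mat& H, const cv::Point2f& p,
                        cv::Point2f& t, double& angle, double& scale)
{
    cv::Point2f p1 = p, p2 = p + cv::Point2f(1.f, 0.f);
    cv::Point2f q1 = applyH(H, p1), q2 = applyH(H, p2);

    cv::Point2f v = p2 - p1, w = q2 - q1;
    t     = q1 - p1;                                       // local translation
    angle = std::atan2(w.y, w.x) - std::atan2(v.y, v.x);   // local rotation
    scale = std::hypot(w.x, w.y) / std::hypot(v.x, v.y);   // local scale
}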
Given Homography matrix H:
    | H_00  H_01  H_02 |
H = | H_10  H_11  H_12 |
    | H_20  H_21  H_22 |
Assumptions:
H_20 = H_21 = 0 and normalized to H_22 = 1 to obtain 8 DOF.
The translations along the x and y axes are calculated directly from H:
tx = H_02
ty = H_12
The 2x2 sub-matrix in the top-left corner is decomposed to calculate shear, scaling and rotation. An easy and quick decomposition method is explained here.
Note: this method assumes invertible matrix.
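For illustration, one common way to carry out such a 2x2 decomposition is a QR-style split A = R(theta) * [sx, m; 0, sy]; the helper below is my own sketch and assumes H_20 = H_21 = 0 and H_22 = 1:

#include <cmath>

// Split the top-left 2x2 block of H into rotation, scales and an
// off-diagonal (shear-like) term m, so that A = R(theta) * [sx, m; 0, sy].
void decompose2x2(double a, double b, double c, double d,   // a=H_00, b=H_01, c=H_10, d=H_11
                  double& theta, double& sx, double& sy, double& m)
{
    sx    = std::sqrt(a * a + c * c);   // scale along x
    theta = std::atan2(c, a);           // rotation angle
    m     = (a * b + c * d) / sx;       // off-diagonal term left after removing the rotation
    sy    = (a * d - b * c) / sx;       // determinant / sx = scale along y
}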
Since I had to struggle for a couple of days to create my homography transformation function, I'm going to put it here for the benefit of everyone.
Here you can see the main loop where every input position is multiplied by the homography matrix h. Then the result is used to copy the pixel from the original position to the destination position.
for (tempIn[0] = 0; tempIn[0] < stride; tempIn[0]++)
{
    for (tempIn[1] = 0; tempIn[1] < rows; tempIn[1]++)
    {
        double w = h[6] * tempIn[0] + h[7] * tempIn[1] + 1; // very important!
        //H_20 = H_21 = 0 and normalized to H_22 = 1 to obtain 8 DOF. <-- this is wrong
        tempOut[0] = ((h[0] * tempIn[0]) + (h[1] * tempIn[1]) + h[2]) / w;
        tempOut[1] = ((h[3] * tempIn[0]) + (h[4] * tempIn[1]) + h[5]) / w;
        if (tempOut[1] < destSize && tempOut[0] < destSize && tempOut[0] >= 0 && tempOut[1] >= 0)
            dest_[destStride * tempOut[1] + tempOut[0]] = src_[stride * tempIn[1] + tempIn[0]];
    }
}
After such a process, an image with some kind of grid will be produced (forward mapping leaves holes). Some kind of filter is needed to remove the grid. In my code I have used a simple linear filter.
Note: Only the central part of the original image is really required for producing a correct image. Some rows and columns can be safely discarded.
For estimating a three-dimensional translation and rotation induced by a homography, there exist multiple approaches. One of them provides closed formulas for decomposing the homography, but they are very complex. Also, the solutions are never unique.
Luckily, OpenCV 3 already implements this decomposition (decomposeHomographyMat). Given a homography and a correctly scaled intrinsics matrix, the function provides a set of four possible rotations and translations.
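A minimal call looks like this (OpenCV 3.x; H is the homography and K the camera intrinsics matrix):

// Decompose a homography into candidate rotations/translations/plane normals.
std::vector<cv::Mat> rotations, translations, normals;
int n = cv::decomposeHomographyMat(H, K, rotations, translations, normals);
// Up to 4 solutions are returned; extra constraints (e.g. the reconstructed
// points must lie in front of both cameras) are needed to pick the valid one.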
The question seems to be about 2D parameters. Homography matrix captures perspective distortion. If the application does not create much perspective distortion, one can approximate a real world transformation using affine transformation matrix (that uses only scale, rotation, translation and no shearing/flipping). The following link will give an idea about decomposing an affine transformation into different parameters.
https://math.stackexchange.com/questions/612006/decomposing-an-affine-transformation
I am following this paper here using de Casteljau's Algorithm http://www.cgafaq.info/wiki/B%C3%A9zier_curve_evaluation and I have tried using the topic Drawing Bezier curves using De Casteljau Algorithm in C++ , OpenGL to help. No success.
My bezier curves look like this when evaluated
As you can see, even though it doesn't work the way I wanted it to, all the points are indeed on the curve. I do not think that this algorithm is inaccurate for this reason.
Here are my points on the top curve in that image:
(0,0)
(2,0)
(2,2)
(4,2)

The second curve uses the same set of points, except the third point is (0,2), that is, two units above the first point, forming a steeper curve.
Something is wrong. I should put in 0.25 for t and it should spit out 1.0 for the X value, and .75 should always return 3. Assume t is time. It should progress at a constant rate, yeah? Exactly 25% of the way in, the X value should be 1.0 and then the Y should be associated with that value.
Are there any adequate ways to evaluate a bezier curve? Does anyone know what is going on here?
Thanks for any help! :)
EDIT------
I found this book in a google search http://www.tsplines.com/resources/class_notes/Bezier_curves.pdf which has a section on explicit / non-parametric Bezier curves. They are polynomials represented as Bezier curves, which is what I am going for here.
Anyone know how to convert a bezier curve to a parametric curve? I may open a different thread now...
EDIT AGAIN AS OF 1 NOVEMBER 2011-------
I've realized that I was only asking the question about half as clearly as I should have. What I'm trying to build is like Maya's animation graph editor, such as this http://www.youtube.com/watch?v=tckN35eYJtg&t=240 where the bezier control points that are used to modify the curve are more like tangent modifiers of equal length. I didn't remember them as being equal length, to be honest. By forcing a system like this, you can ensure 100% that the result is a function and contains no overlapping segments.
I found this, which may have my answer http://create.msdn.com/en-US/education/catalog/utility/curve_editor
Here you can see the algorithm implemented in Mathematica following the nomenclature in your link, and your two plots:
(*Function Definitions*)
lerp[a_, b_, t_] := (1 - t) a + t b;
pts1[t_] := {
lerp[pts[[1]], pts[[2]], t],
lerp[pts[[2]], pts[[3]], t],
lerp[pts[[3]], pts[[4]], t]};
pts2[t_] := {
lerp[pts1[t][[1]], pts1[t][[2]], t],
lerp[pts1[t][[2]], pts1[t][[3]], t]};
pts3[t_] := {
lerp[pts2[t][[1]], pts2[t][[2]], t]};
(*Usages*)
pts = {{0, 0}, {2, 0}, {2, 2}, {4, 2}};
Framed@Show[ParametricPlot[pts3[t], {t, 0, 1}, Axes -> True],
  Graphics[{Red, PointSize[Large], Point@pts}]]

pts = {{0, 0}, {2, 0}, {0, 2}, {4, 2}};
Framed@Show[ParametricPlot[pts3[t], {t, 0, 1}, Axes -> True],
  Graphics[{Red, PointSize[Large], Point@pts}]]
BTW, the curves are defined by the following parametric equations, which are the functions pts3[t] in the code above:
c1[t_] := {2 t (3 + t (-3 + 2 t)), (* <- X component *)
2 (3 - 2 t) t^2} (* <- Y component *)
and
c2[t_] := {2 t (3 + t (-6 + 5 t)), (* <- X component *)
           2 (3 - 2 t) t^2}        (* <- Y component *)
Try plotting them!
Taking any of these curve equations, and by solving a cubic polynomial you can in these cases get an expression for y[x], which is certainly not always possible. Just for you to get a flavor of it, from the first curve you get (C syntax):
y[x]= 3 - x - 3/Power(-2 + x + Sqrt(5 + (-4 + x)*x),1/3) +
3*Power(-2 + x + Sqrt(5 + (-4 + x)*x),1/3)
Try plotting it!
Edit
Just an amusement:
Mathematica is a quite powerful functional language, and in fact the whole algorithm can be expressed as a one liner:
f = Nest[(1 - t) #[[1]] + t #[[2]] & /@ Partition[#, 2, 1] &, #, Length@# - 1] &
Such as
f@{{0, 0}, {2, 0}, {0, 2}, {4, 2}}
gives the above results, but supports any number of points.
Let's try with six random points:
p = RandomReal[1, {6, 2}];
Framed@Show[
  Graphics[{Red, PointSize[Large], Point@p}],
  ParametricPlot[f@p, {t, 0, 1}, Axes -> True]]
Moreover, the same function works in 3D:
p = RandomReal[1, {4, 3}];
Framed@Show[
  Graphics3D[{Red, PointSize[Large], Point@p}],
  ParametricPlot3D[f[p], {t, 0, 1}, Axes -> True]]
A Bezier curve can be evaluated by computing the following parametric equations for the x, y, and z coordinates (if it's just 2D, do only x and y):
Px = (1-t)^3(P1x) + 3t(1-t)^2(P2x) + 3t^2(1-t)(P3x) + t^3(P4x)
Py = (1-t)^3(P1y) + 3t(1-t)^2(P2y) + 3t^2(1-t)(P3y) + t^3(P4y)
Pz = (1-t)^3(P1z) + 3t(1-t)^2(P2z) + 3t^2(1-t)(P3z) + t^3(P4z)
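If it helps, here is the same evaluation written out in C++ (a small self-contained helper; the struct and names are just for illustration):

struct Pt { double x, y; };

// Evaluate a cubic Bezier curve at parameter t in [0, 1].
Pt bezier(const Pt& P1, const Pt& P2, const Pt& P3, const Pt& P4, double t)
{
    double u  = 1.0 - t;
    double b1 = u * u * u;           // (1-t)^3
    double b2 = 3.0 * t * u * u;     // 3t(1-t)^2
    double b3 = 3.0 * t * t * u;     // 3t^2(1-t)
    double b4 = t * t * t;           // t^3
    return { b1 * P1.x + b2 * P2.x + b3 * P3.x + b4 * P4.x,
             b1 * P1.y + b2 * P2.y + b3 * P3.y + b4 * P4.y };
}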
You can also solve this by multiplying the matrix equation ABC = X where:
matrix A is a 1x4 matrix and represents the values of the powers of t
matrix B contains the coefficients of the powers of t, and is a lower-triangular 4x4 matrix
matrix C is a 4x3 matrix that represents each of the four bezier points in 3D-space (it would be a 4x2 matrix in 2D-space)
This would look like the following:
(Update - the bottom left 1 ought to be a -1)
An important note in both forms of the equation (the parametric and the matrix forms) is that t is in the range [0, 1].
Rather than attempting to solve the values for t that will give you integral values of x and y, which is going to be time-consuming given that you're basically solving for the real root of a 3rd-degree polynomial, it's much better to simply create a small enough differential in your t value such that the difference between any two points on the curve is smaller than a pixel-value increment. In other words the distance between the two points P(t1) and P(t2) is such that it is less than a pixel value. Alternatively, you can use a larger differential in t, and simply linearly interpolate between P(t1) and P(t2), keeping in mind that the curve may not be "smooth" if the differential between P(t1) and P(t2) is not small enough for the given range of t from [0, 1].
A good way to find the necessary differential in t to create a fairly "smooth" curve from a visual standpoint is to actually measure the distance between the four points that define the bezier curve. Measure the distance from P1 to P2, P2 to P3, and P3 to P4. Then take the longest distance, and use the inverse of that value as the differential for t. You may still need to do some linear interpolation between points, but the number of pixels in each "linear" sub-curve should be fairly small, and therefore the curve itself will appear fairly smooth. You can always decrease the differential value on t from this initial value to make it "smoother".
Finally, to answer your question:
Assume t is time. It should progress at a constant rate, yeah? Exactly 25% of the way in, the X value should be 1.0 and then the Y should be associated with that value.
No, that is not correct, and the reason is that the vectors (P2 - P1) and (P3 - P4) are not only tangent to the bezier curve at P1 and P4, but their lengths define the velocity along the curve at those points as well. Thus if the vector (P2 - P1) is a short distance, then that means for a given amount of time t, you will not travel very far from the point P1 ... this translates into the x,y values along the curve being packed together very closely for a given fixed differential of t. You are effectively "slowing down" in velocity as you move towards P1. The same effect takes place at P4 on the curve depending on the length of the vector (P3 - P4). The only way that the velocity along the curve would be "constant", and therefore the distance between any points for a common differential of t would be the same, would be if the lengths of all three segments (P2 - P1), (P3 - P2), and (P4 - P3) were the same. That would then indicate that there was no change in velocity along the curve.
It sounds like you actually just want a 1D cubic Bezier curve instead of the 2D that you have. Specifically, what you actually want is just a cubic polynomial segment that starts at 0 and goes up to 2 when evaluated over the domain of 0 to 4. So you could use some basic math and just find the polynomial:
f(x) = a + b*x + c*x^2 + d*x^3
f(0) = 0
f(4) = 2
That leaves two degrees of freedom.
Take the derivative of the function:
f'(x) = b + 2*c*x + 3*d*x^2
If you want it to be steep at the beginning and then level off at the end you might say something like:
f'(0) = 10
f'(4) = 0
Then we can plug in values. a and b come for free because we're evaluating at zero.
a = 0
b = 10
So then we have:
f(4) = 2 = 40 + c*16 + d*64
f'(4) = 0 = 10 + c*8 + d*48
That's a pretty easy linear system to solve. For completeness, we get:
16c + 64d = -38
8c + 48d = -10
So-
|c|   =   1/(16*48 - 8*64) * | 48  -64 | * |-38|   =   |-37/8|
|d|                          | -8   16 |   |-10|       | 9/16|
f(x) = 10*x - (37/8)*x^2 + (9/16)*x^3
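Just as a quick sanity check of that arithmetic (a throwaway snippet evaluating the polynomial and its derivative at the endpoints):

#include <cstdio>

double f (double x) { return 10*x - (37.0/8.0)*x*x + (9.0/16.0)*x*x*x; }
double fp(double x) { return 10 - 2*(37.0/8.0)*x + 3*(9.0/16.0)*x*x; }

int main()
{
    // expected: f(0)=0, f(4)=2, f'(0)=10, f'(4)=0
    std::printf("%g %g %g %g\n", f(0), f(4), fp(0), fp(4));
    return 0;
}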
If, instead, you decide that you want to use Bezier control points, just pick your 4 y-value control points and recognize that in order to get t in [0,1], you just have to say t=x/4 (remembering that if you also need derivatives, you'll have to do a change there too).
Added:
If you happen to know the points and derivatives you want to begin and end with, but you want to use Bezier control points P1, P2, P3, and P4, the mapping is just this (assuming a curve parametrized over 0 to 1):
P1 = f(0)
P2 = f'(0)/3 + f(0)
P3 = f(1) - f'(1)/3
P4 = f(1)
If, for some reason, you wanted to stick with your 2D Bezier control points and wanted to ensure that the x dimension advanced linearly from 0 to 2 as t advanced from 0 to 1, then you could do that with control points (0,y1) (2/3,y2) (4/3,y3) (2,y4). You can see that I just made the x dimension start at 0, end at 2, and have a constant slope (derivative) of 2 (with respect to t). Then you just make the y-coordinate be whatever you need it to be. The different dimensions are essentially independent of each other.
After all this time, I was looking for hermite curves. Hermites are good because in one dimension they're guaranteed to produce a functional curve that can be evaluated to an XY point. I was confusing Hermites with Bezier.
"Assume t is time."
This is the problem - t is not the time. The curve has its own rate of change of t, depending on the magnitude of the tangents. Like Jason said, the distance between consecutive points must be the same in order for t to be the same as the time. This is exactly what the non-weighted mode (which is used by default) in Maya's curve editor is. So this was a perfectly good answer for how to fix this issue. To make this work for arbitrary tangents, you must convert the time to t. You can find t by calculating the bezier equation in the x (or time) direction.
Px = (1-t)^3(P1x) + 3t(1-t)^2(P2x) + 3t^2(1-t)(P3x) + t^3(P4x)
Px is your time, so you know everything here but t. You must solve a cubic equation to find the roots. There is a tricky part to finding the exact root you need, though. Then you solve the other equation to find Py (the actual value you are looking for), knowing now t:
Py = (1-t)^3(P1y) + 3t(1-t)^2(P2y) + 3t^2(1-t)(P3y) + t^3(P4y)
This is what the weighted curves in Maya are.
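To make this concrete, here is one simple way to do that time-to-t conversion numerically instead of with a closed-form cubic solver (my own sketch; it assumes the x/time component is monotonically increasing over the segment, which is how such curve editors constrain the tangents):

// Cubic Bezier in one dimension (here: the x / time component).
double bez1d(double p1, double p2, double p3, double p4, double t)
{
    double u = 1.0 - t;
    return u*u*u*p1 + 3*t*u*u*p2 + 3*t*t*u*p3 + t*t*t*p4;
}

// Find t such that Px(t) == time. Bisection is enough when Px is monotonic;
// a cubic root solver also works but needs care to pick the right root.
double tForTime(double p1x, double p2x, double p3x, double p4x, double time)
{
    double lo = 0.0, hi = 1.0;
    for (int i = 0; i < 60; ++i)
    {
        double mid = 0.5 * (lo + hi);
        if (bez1d(p1x, p2x, p3x, p4x, mid) < time) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}
// Then evaluate Py with the same cubic formula and the t found here.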
I know the question is old, but I lost a whole day researching this simple thing, and nobody explains exactly what happens. Otherwise, the calculation process itself is described in many places, the Maya API manual for example. The Maya devkit also has source code to do this.
I calculated the histogram (a simple 1D array) for a 3D grayscale image.
Now I would like to calculate the gradient for this histogram at each point. So this would actually mean I have to calculate the gradient of a 1D function at certain points. However, I do not have a function. So how can I calculate it with concrete x and y values?
For the sake of simplicity, could you perhaps explain this to me with an example histogram - for example with the following values (x is the intensity, and y the frequency of this intensity):
x1 = 1; y1 = 3
x2 = 2; y2 = 6
x3 = 3; y3 = 8
x4 = 4; y4 = 5
x5 = 5; y5 = 9
x6 = 6; y6 = 12
x7 = 7; y7 = 5
x8 = 8; y8 = 3
x9 = 9; y9 = 5
x10 = 10; y10 = 2
I know that this is also a math problem, but since I need to solve it in C++ I thought you could help me here.
Thank you for your advice
Marc
I think you can calculate your gradient using the same approach used in image edge detection (which is a gradient computation). If your histogram is in a vector you can calculate an approximation of the gradient as*:
for each point in the histogram compute
gradient[x] = (hist[x+1] - hist[x])
This is a very simple way to do it, but I'm not sure if it is the most accurate.
* An approximation, because you are working with discrete data instead of a continuous function.
Edited:
Other operators may emphasize small differences (small gradients become more pronounced). The Roberts algorithm derives from the definition of the derivative:
f'(x) = lim (delta -> 0) of ( f(x + delta) - f(x) ) / delta
delta tends to 0 (in order to avoid division by zero) but is never zero. As this is impossible in a computer's memory, the smallest delta we can get is 1 (because 1 is the smallest distance between two points in an image (or histogram)).
Substituting
lim delta -> 0 with delta = 1
we get
( f(x + 1) - f(x) ) / 1 = f(x + 1) - f(x)  =>  hist[x+1] - hist[x]
Two general approaches here:
a discrete approximation to the derivative
take the real derivative of a fitted function
In the first case try:
g = (y_(i+1) - y_(i-1)) / (2*dx)
at all the points except the ends, or one of
g_left-end = (y_(i+1) - y_i)/dx
g_right-end = (y_i - y_(i-1))/dx
where dx is the spacing between x points. (Unlike the equally correct definition Andres suggested, this one is symmetric. Whether it matters or not depends on your use case.)
In the second case, fit a spline to your data[*], and ask the spline library the derivative at the point you want.
[*] Use a library! Do not implement this yourself unless this is a learning project. I'd use ROOT because I already have it on my machine, but it is a pretty heavy package just to get a spline...
Finally, if your data is noisy, you may want to smooth it before doing slope detection. That way you avoid chasing the noise, and only look at large-scale slopes.
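For the discrete case (option 1), a small helper along these lines (my own sketch: central differences in the interior, one-sided differences at the two ends; dx is the bin spacing, 1 for a plain histogram):

#include <vector>

// Approximate gradient of equally spaced samples.
std::vector<double> gradient(const std::vector<double>& y, double dx = 1.0)
{
    const size_t n = y.size();
    std::vector<double> g(n, 0.0);
    if (n < 2) return g;
    g[0]     = (y[1] - y[0]) / dx;               // one-sided at the left end
    g[n - 1] = (y[n - 1] - y[n - 2]) / dx;       // one-sided at the right end
    for (size_t i = 1; i + 1 < n; ++i)
        g[i] = (y[i + 1] - y[i - 1]) / (2.0 * dx);  // symmetric in the interior
    return g;
}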
Take some squared paper and draw on it your histogram. Draw also vertical and horizontal axes through the 0,0 point of your histogram.
Take a straight edge and, at each point you are interested in, rotate the straight edge until it accords with your idea of what the gradient at that point is. It is most important that you do this; your definition of the gradient is the one you want.
Once the straight edge is at the angle you desire draw a line at that angle.
Drop perpendiculars from any 2 points on the line you just drew. It will be easier to take the following step if the horizontal distance between the 2 points you choose is about 25% or more of the width of your histogram. From the same 2 points draw horizontal lines to intersect the vertical axis of your histogram.
Your lines now define an x-distance and a y-distance, i.e. the length along the horizontal/vertical (respectively) axes marked out by their intersections with the perpendiculars/horizontal lines. The gradient you want is the y-distance divided by the x-distance.
Now, to translate this into code is very straightforward, apart from step 2. You have to define what the criteria are for determining what the gradient at any point on the histogram is. Simple choices include:
a) at each point, set down your straight edge to pass through the point and the next one to its right;
b) at each point, set down your straight edge to pass through the point and the next one to its left;
c) at each point, set down your straight edge to pass through the point to the left and the point to the right.
You may want to investigate more complex choices such as fitting a curve (such as a quadratic or higher-order polynomial) through a number of points on your histogram and using the derivative of that to represent the gradient.
Until you understand the question on paper avoid coding in C++ or anything else. Once you do understand it, coding should be trivial.