I am working on a project and I am stuck with one thing. I have two sets of points extracted from contours of the same object but rotated in 2D and I need to find the best rotation transformation (or just angle of rotation) between those points.
So far I have rescaled one of the contours so that both are the same size, and translated them so that they share the same center of mass. I then pick 3 random points from the first set and 3 from the second (a kind of RANSAC, in which I draw random points N times), and I need to find the rotation about the center of mass that maps one set onto the other. I tried to use the Kabsch algorithm, but I'm not sure I'm implementing it correctly because it does not work properly.
Here is my Kabsch code:
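// P, Q - sets of points of contours
cv::Mat Pt;
cv::transpose(P, Pt);
cv::Mat H = Pt * Q;
cv::Mat Ht;
cv::transpose(H, Ht); // H transposed
cv::Mat invH = H.inv();
cv::Mat HtH = Ht * H;
cv::Mat sqrtH(HtH.size(), CV_32F);
// Note: this is an element-wise square root, which is not the same
// as the matrix square root that Kabsch calls for
for (int i = 0; i < 2; i++)
    for (int j = 0; j < 2; j++)
        sqrtH.at<float>(j, i) = sqrt(HtH.at<float>(j, i));
// Final transform
cv::Mat R = sqrtH * invH;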
I would like to get at least the angle of rotation between the two sets of points. With my code I get strange results, and the resulting transformation scrambles my point sets.
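For reference, in 2D the least-squares rotation has a closed form, so no matrix square root or inverse is needed. The following is a minimal sketch in plain C++ (no OpenCV), assuming both sets are already centred on the origin and that P[i] and Q[i] are corresponding points; randomly drawn triples only satisfy this if they happen to match. The names Pt and rotationAngle are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };  // hypothetical stand-in for cv::Point2d

// Least-squares rotation angle (radians) that maps P onto Q.
// Assumes P[i] corresponds to Q[i] and both sets are centred on the origin.
double rotationAngle(const std::vector<Pt>& P, const std::vector<Pt>& Q)
{
    assert(P.size() == Q.size() && !P.empty());
    double dot = 0.0, cross = 0.0;
    for (std::size_t i = 0; i < P.size(); ++i) {
        dot   += P[i].x * Q[i].x + P[i].y * Q[i].y;  // sum of dot products
        cross += P[i].x * Q[i].y - P[i].y * Q[i].x;  // sum of 2D cross products
    }
    return std::atan2(cross, dot);
}
```

The rotation matrix, if needed, is then [cos a, -sin a; sin a, cos a] for the returned angle a.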
I have been struggling trying to implement the outlining algorithm described here and here.
The general idea of the paper is determining the Hausdorff distance of binary images and using it to find the template image from a test image.
For template matching, it is recommended to construct image pyramids along with sliding windows which you'll use to slide over your test image for detection. I was able to do both of these as well.
I am stuck on how to move forward from here. Do I slide my template over the test image at the different pyramid layers, or the test image over the template? And is the sliding window meant to be an ROI of the test image or of the template image?
In a nutshell, I have the pieces of the puzzle but no idea which direction to take to solve it.
int distance(vector<Point> const& image, vector<Point> const& tempImage)
{
    // Note: this adds up the nearest-neighbour squared distances;
    // the classical directed Hausdorff distance takes their maximum instead
    int maxDistance = 0;
    for (Point imagePoint : image)
    {
        int minDistance = numeric_limits<int>::max();
        for (Point tempPoint : tempImage)
        {
            Point diff = imagePoint - tempPoint;
            int length = (diff.x * diff.x) + (diff.y * diff.y);
            if (length < minDistance) minDistance = length;
            if (length == 0) break;
        }
        maxDistance += minDistance;
    }
    return maxDistance;
}

double hausdorffDistance(vector<Point> const& image, vector<Point> const& tempImage)
{
    double maxDistImage = distance(image, tempImage);
    double maxDistTemp = distance(tempImage, image);
    return sqrt(max(maxDistImage, maxDistTemp));
}
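vector<Mat> buildPyramids(Mat& frame)
{
    vector<Mat> pyramids;
    int count = 6;
    Mat prevFrame = frame, nextFrame;
    while (count > 0)
    {
        resize(prevFrame, nextFrame, Size(), .85, .85);
        pyramids.push_back(nextFrame.clone()); // store this layer
        prevFrame = nextFrame;
        --count; // otherwise the loop never terminates
    }
    return pyramids;
}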
vector<Rect> slidingWindows(Mat& image, int stepSize, int width, int height)
{
    vector<Rect> windows;
    // int counters avoid signed/unsigned comparisons with image.rows and image.cols
    for (int row = 0; row < image.rows; row += stepSize)
    {
        if ((row + height) > image.rows) break;
        for (int col = 0; col < image.cols; col += stepSize)
        {
            if ((col + width) > image.cols) break;
            windows.push_back(Rect(col, row, width, height));
        }
    }
    return windows;
}
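For comparison, here is a minimal sketch of the classical Hausdorff distance in plain C++ (no OpenCV). The distance() function in the question adds up the nearest-neighbour distances, whereas the classical definition takes the largest one in each direction. Pt, directedSq and hausdorff are names made up for this example, and the brute-force search is only meant for small point sets.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

struct Pt { int x, y; };  // hypothetical stand-in for cv::Point

// Directed distance: the largest nearest-neighbour squared distance from a to b.
long long directedSq(const std::vector<Pt>& a, const std::vector<Pt>& b)
{
    assert(!b.empty());
    long long worst = 0;
    for (const Pt& p : a) {
        long long best = std::numeric_limits<long long>::max();
        for (const Pt& q : b) {
            const long long dx = p.x - q.x, dy = p.y - q.y;
            best = std::min(best, dx * dx + dy * dy);
            if (best == 0) break;  // cannot get any closer
        }
        worst = std::max(worst, best);
    }
    return worst;
}

// Symmetric Hausdorff distance: the larger of the two directed distances.
double hausdorff(const std::vector<Pt>& a, const std::vector<Pt>& b)
{
    return std::sqrt(static_cast<double>(std::max(directedSq(a, b), directedSq(b, a))));
}
```

Because it takes a maximum, this version is sensitive to a single outlier point, which may or may not be what you want for noisy contours.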
Edit I: More analysis on my solution can be found here
This is a bi-directional task.
Forward Direction
1. Translation
For each contour, calculate its moment (in practice its centroid, i.e. the center of mass derived from the contour's moments). Then translate every point in that contour by it, i.e. contour.point[i] = contour.point[i] - contour.moment. This centers each contour on the origin.
PS: You need to keep track of each contour's produced moment because it will be used in the next section
2. Rotation
With the newly translated points, calculate their rotated rect. This will give you the angle of rotation. Depending on this angle, you would want to calculate the new angle which you want to rotate this contour by; this answer would be helpful.
After attaining the new angle, calculate the rotation matrix. Remember that your center here will be the origin i.e. (0, 0). I did not take scaling into account (that's where the pyramids come into play) when calculating the rotation matrix hence I passed 1.
PS: You need to keep track of each contour's produced matrix because it will be used in the next section
Using this matrix, you can go ahead and rotate each point in the contour by it as shown here*.
Once all of this is done, you can go ahead and calculate the Hausdorff distance and find contours which pass your set threshold.
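The forward direction can be sketched in plain C++ as follows. This is only an illustration under a few assumptions: the centroid is approximated by the mean of the contour points rather than taken from image moments, the angle is given in radians, and the rotation uses the usual mathematical sign convention (which may differ from the sign convention of your rotation matrix). Pt, centroid and forwardTransform are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Pt { double x, y; };  // hypothetical stand-in for cv::Point2d

// Mean of the contour points (a simple stand-in for the moment-based centroid).
Pt centroid(const std::vector<Pt>& c)
{
    assert(!c.empty());
    Pt m{0.0, 0.0};
    for (const Pt& p : c) { m.x += p.x; m.y += p.y; }
    m.x /= static_cast<double>(c.size());
    m.y /= static_cast<double>(c.size());
    return m;
}

// Translate the contour so that m sits on the origin, then rotate by angleRad about (0, 0).
std::vector<Pt> forwardTransform(const std::vector<Pt>& c, const Pt& m, double angleRad)
{
    const double cs = std::cos(angleRad), sn = std::sin(angleRad);
    std::vector<Pt> out;
    out.reserve(c.size());
    for (const Pt& p : c) {
        const double x = p.x - m.x, y = p.y - m.y;
        out.push_back({x * cs - y * sn, x * sn + y * cs});
    }
    return out;
}
```

Keep m and angleRad for every contour, since both are needed again to map the valid contours back.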
Back Direction
Everything done in the first section has to be undone so that we can draw the valid contours onto our camera feed.
1. Rotation
Recall that each detected contour produced a rotation matrix. You want to undo the rotation of the valid contours. Just perform the same rotation but using the inverse matrix.
For each valid contour and corresponding matrix
inverse_matrix = matrix[i].inv(cv2.DECOMP_SVD)
Use * to rotate the points but with inverse_matrix as parameter
PS: A plain inverse fails when the matrix is not square. With cv2.DECOMP_SVD, the pseudo-inverse is computed instead, so it also works for a non-square matrix.
2. Translation
With the valid contours' points rotated back, you just have to undo the previously performed translation. Instead of subtracting, just add the moment to each point.
You can now go ahead and draw these contours to your camera feed.
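As a rough illustration of the back direction in plain C++: for a pure rotation about the origin, the inverse is simply the transposed rotation, so no general matrix inversion is needed. The sketch assumes the contour was rotated by angleRad (radians, mathematical sign convention) after subtracting the centroid m; Pt and backwardTransform are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Pt { double x, y; };  // hypothetical stand-in for cv::Point2d

// Undo a rotation by angleRad about the origin, then add the centroid m back.
std::vector<Pt> backwardTransform(const std::vector<Pt>& c, const Pt& m, double angleRad)
{
    assert(std::isfinite(angleRad));
    const double cs = std::cos(angleRad), sn = std::sin(angleRad);
    std::vector<Pt> out;
    out.reserve(c.size());
    for (const Pt& p : c) {
        // Transposed rotation matrix = rotation by -angleRad
        const double x =  p.x * cs + p.y * sn;
        const double y = -p.x * sn + p.y * cs;
        out.push_back({x + m.x, y + m.y});
    }
    return out;
}
```

The order matters: rotate back first, translate second, i.e. the reverse of the forward direction.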
This is where image pyramids come into play.
All you have to do is resize your template image by a fixed ratio, up to your desired number of times (called layers). The tutorial found here does a good job of explaining how to do this in OpenCV.
The ratio and the number of layers you choose play a huge role in how robust your program will be.
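To get a feel for how the ratio and the layer count interact, it can help to list the layer sizes before resizing anything. Below is a small sketch in plain C++; pyramidSizes and its minSize cut-off are assumptions made for this example, not part of OpenCV.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Sizes (width, height) of each pyramid layer, layer 0 being the original size.
// ratio is the per-layer scale factor; the list stops early once a side
// would drop below minSize pixels.
std::vector<std::pair<int, int>> pyramidSizes(int width, int height, double ratio,
                                              int layers, int minSize)
{
    assert(ratio > 0.0 && ratio < 1.0 && layers > 0);
    std::vector<std::pair<int, int>> sizes;
    double w = width, h = height;
    for (int i = 0; i < layers; ++i) {
        const int wi = static_cast<int>(std::lround(w));
        const int hi = static_cast<int>(std::lround(h));
        if (wi < minSize || hi < minSize) break;  // too small to be useful
        sizes.emplace_back(wi, hi);
        w *= ratio;
        h *= ratio;
    }
    return sizes;
}
```

A ratio close to 1 gives finer scale steps but needs more layers to cover the same range of object sizes.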
Put it all together
Template Image Operations
Create a pyramid consisting of n layers
For each layer in n
    Find contours
    Translate the contour points
    Rotate the contour points
This operation should only be performed once and only store the results of the rotated points.
Camera Feed Operations
Let the rotated contours of the template image at each level be stored in templ_contours. So if I say templ_contours[0], this is going to give me the rotated contours at pyramid level 0.
Let the image's translated, rotated contours and moments be stored in transCont, rotCont and moment respectively.
image_contours = Find Contours
for each contour detected in image
    moment = calculate moment
for each point in image_contours
    transCont.thisPoint = forward_translate(image_contours.thisPoint)
    rotCont.thisPoint = forward_rotate(transCont.thisPoint)
for each contour_layer in templ_contours
    for each contour in rotCont
        calculate Hausdorff Distance
        valid_contours = contours_passing_distance_threshold
for each point in valid_contours
    valid_point = backward_rotate(valid_point)
for each point in valid_contours
    valid_point = backward_translate(valid_point)
drawContours(valid_contours, image)
I have a point at position (x, y) and two angles measured from that point. In the example below I drew two lines to show how it should look.
What I want is to change the lightness of all pixels outside these lines.
Here is the original image.
And here is an example of what I want.
What is the easiest way to change those pixels with OpenCV (C++), given the input image, the point and the two angles? I know there are many possible solutions, but I want the simplest way to decide which pixels need to change and which do not.
One way would be to:
Make a binary mask of the size of the original image, based on your points and angle (i.e draw filled polygon).
Make a clone of the original image. Apply brightness changes to the whole of cloned image.
Copy cloned image back to original image based on the mask.
I wrote the code below following Zindarod's steps. Hope it helps someone.
Angles are in degrees.
void view(cv::Mat& frame, double angle_left, double angle_right, cv::Point center)
{
    // End points of the two rays, far enough away to leave the image
    int length = 1500;
    cv::Point left_view;
    left_view.x = (int)round(center.x + length * cos(angle_left * (CV_PI / 180)));
    left_view.y = (int)round(center.y + length * sin(angle_left * (CV_PI / 180)));
    cv::Point right_view;
    right_view.x = (int)round(center.x + length * cos(angle_right * (CV_PI / 180)));
    right_view.y = (int)round(center.y + length * sin(angle_right * (CV_PI / 180)));
    // Triangle spanned by the center and the two ray end points
    cv::Point pts[4] = { center, left_view, right_view, center };
    // Per-channel factors in HSV: outside the triangle V is scaled to 30%
    cv::Mat mask = cv::Mat(frame.size(), CV_32FC3, cv::Scalar(1.0, 1.0, 0.3));
    cv::fillConvexPoly(mask, pts, 3, cv::Scalar(1.0, 1.0, 1.0));
    cv::cvtColor(frame, frame, CV_BGR2HSV);
    frame.convertTo(frame, CV_32FC3);
    cv::multiply(frame, mask, frame);
    frame.convertTo(frame, CV_8UC3);
    cv::cvtColor(frame, frame, CV_HSV2BGR);
}
Given an origin point and two angles, you can calculate two unit vectors for your two lines; let these be unitA and unitB.
For each pixel of the image do these steps:
1. get a vector (called vec) from the origin to the pixel.
2. find the angle (ang) between vec and a reference vector (refVec).
3. if ang is greater than the angle between refVec and unitA, but smaller than the angle between the refVec and unitB recolor the pixel.
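A minimal sketch of this per-pixel test in plain C++ follows. It takes the positive x axis as the reference vector and measures angles with atan2, so it follows the same convention as cos/sin on image coordinates (y pointing down). All angles are wrapped into one full turn so that a sector crossing 0 degrees (say from 350 to 10) still works. insideSector is a hypothetical helper, not an OpenCV function.

```cpp
#include <cassert>
#include <cmath>

// True when pixel (px, py) lies inside the sector with apex (ox, oy) that starts
// at startDeg and sweeps in the direction of increasing angle up to endDeg.
bool insideSector(double px, double py, double ox, double oy,
                  double startDeg, double endDeg)
{
    assert(std::isfinite(startDeg) && std::isfinite(endDeg));
    const double twoPi = 2.0 * std::acos(-1.0);
    const double toRad = twoPi / 360.0;
    // Wrap an angle into [0, 2*pi)
    auto wrap = [twoPi](double a) {
        a = std::fmod(a, twoPi);
        return a < 0.0 ? a + twoPi : a;
    };
    const double ang   = wrap(std::atan2(py - oy, px - ox));  // angle of the pixel
    const double start = wrap(startDeg * toRad);
    const double sweep = wrap(endDeg * toRad - start);        // opening of the sector
    return wrap(ang - start) <= sweep;
}
```

Pixels for which this returns false are the ones to darken. Looping over every pixel this way is simple but slower than drawing a filled polygon mask.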
I have a little problem with some projection and geometry. I have an image in which I detect a square. After detecting the square, I crop it from the image. In the ROI I detect the point P(x,y) (see the image below).
My problem is that I know the coordinates of point P in the ROI, the coordinates of A, B, C, D, and the rotation of the ROI (RotatedRect::angle), but I want to get the coordinates of P in the original image. Any advice would help.
For ROI crop I have this code
vector<RotatedRect> rect(squares.size());
for (int i = 0; i < squares.size(); i++)
{
    rect[i] = minAreaRect(Mat(squares[i]));
    Mat M, rotated, cropped;
    float angle = rect[i].angle;
    Size rect_size = rect[i].size;
    if (rect[i].angle < -45)
        angle += 90;
    M = getRotationMatrix2D(rect[i].center, angle, 1.0);
    SatelliteClass[i].m_vecRect = rect[i];
}
It's basically a question of vector addition. Take the inverse of M and apply it to P (so you're rotating P back into the original frame), then add the rotated P to the left corner of the rectangle.
There might be a way to do this within the API you're using instead of reinventing the wheel.
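A minimal sketch of that idea in plain C++, written as the equivalent "shift first, then invert" variant: add the crop's top-left corner to P to get its position in the rotated image, then apply the inverse of M. It assumes the crop was taken from the rotated image at a known top-left corner (the cropping code is not shown in the question). The matrix layout follows the formula documented for getRotationMatrix2D; Pt, Affine and the function names are made up for this example.

```cpp
#include <cassert>
#include <cmath>

struct Pt { double x, y; };
struct Affine { double a, b, tx, c, d, ty; };  // rows [a b tx] and [c d ty]

// Same 2x3 layout that getRotationMatrix2D documents (angle in degrees).
Affine rotationMatrix2D(Pt center, double angleDeg, double scale)
{
    const double rad = angleDeg * std::acos(-1.0) / 180.0;
    const double alpha = scale * std::cos(rad), beta = scale * std::sin(rad);
    return { alpha, beta, (1.0 - alpha) * center.x - beta * center.y,
            -beta, alpha, beta * center.x + (1.0 - alpha) * center.y };
}

Pt applyAffine(const Affine& m, Pt p)
{
    return { m.a * p.x + m.b * p.y + m.tx, m.c * p.x + m.d * p.y + m.ty };
}

// Inverse of a 2x3 affine transform: invert the 2x2 part, then map the translation.
Affine invertAffine(const Affine& m)
{
    const double det = m.a * m.d - m.b * m.c;
    assert(std::fabs(det) > 1e-12);
    const double ia = m.d / det, ib = -m.b / det, ic = -m.c / det, id = m.a / det;
    return { ia, ib, -(ia * m.tx + ib * m.ty), ic, id, -(ic * m.tx + id * m.ty) };
}

// pRoi: point inside the crop; roiTopLeft: the crop's top-left corner in the rotated image.
Pt roiToOriginal(const Affine& m, Pt roiTopLeft, Pt pRoi)
{
    return applyAffine(invertAffine(m), { pRoi.x + roiTopLeft.x, pRoi.y + roiTopLeft.y });
}
```

If you stay within OpenCV, cv::invertAffineTransform followed by cv::transform should do the same job on the 2x3 matrix.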
I'm trying to build a panorama image of the ground covered by a downward-facing camera (at a fixed height, around 1 metre above the ground). This could potentially run to thousands of frames, so the Stitcher class' built-in panorama method isn't really suitable: it's far too slow and memory hungry.
Instead I'm assuming the floor and motion is planar (not unreasonable here) and trying to build up a cumulative homography as I see each frame. That is, for each frame, I calculate the homography from the previous one to the new one. I then get the cumulative homography by multiplying that with the product of all previous homographies.
Let's say I get H01 between frames 0 and 1, then H12 between frames 1 and 2. To get the transformation to place frame 2 onto the mosaic, I need to get H01*H12. This continues as the frame count increases, such that I get H01*H12*H23*H34*H45*....
In code, this is something akin to:
cv::Mat previous, current;
// Init cumulative homography (3x3 identity, double precision)
cv::Mat cumulative_homography = cv::Mat::eye(3, 3, CV_64F);
video_stream >> previous;
for (;;) {
    video_stream >> current;
    // Here I do some checking of the frame, etc
    // Get the homography using my DenseMosaic class (using Farneback to get OF)
    cv::Mat tmp_H = DenseMosaic::get_homography(previous, current);
    // Now normalise the homography by its bottom right corner
    tmp_H /= tmp_H.at<double>(2, 2);
    cumulative_homography *= tmp_H;
    previous = current.clone();
}
It works pretty well, except that as the camera moves "up" in the viewpoint, the homography scale decreases. As it moves down, the scale increases again. This gives my panoramas a perspective type effect that I really don't want.
For example, this is taken on a few seconds of video moving forward then backward. The first frame looks ok:
The problem comes as we move forward a few frames:
Then when we come back again, you can see the frame gets bigger again:
I'm at a loss as to where this is coming from.
I'm using Farneback dense optical flow to calculate pixel-pixel correspondences as below (sparse feature matching doesn't work well on this data) and I've checked my flow vectors - they're generally very good, so it's not a tracking problem. I also tried switching the order of the inputs to find homography (in case I'd mixed up the frame numbers), still no better.
cv::calcOpticalFlowFarneback(grey_1, grey_2, flow_mat, 0.5, 6,50, 5, 7, 1.5, flags);
// Using the flow_mat optical flow map, populate grid point correspondences between images
std::vector<cv::Point2f> points_1, points_2;
median_motion = DenseMosaic::dense_flow_to_corresp(flow_mat, points_1, points_2);
cv::Mat H = cv::findHomography(cv::Mat(points_2), cv::Mat(points_1), CV_RANSAC, 1);
Another thing I thought it could be was the translation I include in the transformation to ensure my panorama is centred within the scene:
cv::warpPerspective(init.clone(), warped, translation*homography, init.size());
But having checked the values in the homography before the translation is applied, the scaling issue I mention is still present.
Any hints are gratefully received. There's a lot of code I could put in but it seems irrelevant; please do let me know if there's something missing.
I've tried switching out the *= operator for the full multiplication and tried reversing the order the homographies are multiplied in, but no luck. Below is my code for calculating the homography:
/**
 * \brief Calculates the homography between the current and previous frames
 */
cv::Mat DenseMosaic::get_homography()
{
    cv::Mat grey_1, grey_2; // Grayscale versions of frames
    cv::cvtColor(prev, grey_1, CV_BGR2GRAY);
    cv::cvtColor(cur, grey_2, CV_BGR2GRAY);
    // Calculate the dense flow
    if (frame_number > 2) {
        flags = flags | cv::OPTFLOW_USE_INITIAL_FLOW;
    }
    cv::calcOpticalFlowFarneback(grey_1, grey_2, flow_mat, 0.5, 6, 50, 5, 7, 1.5, flags);
    // Convert the flow map to point correspondences
    std::vector<cv::Point2f> points_1, points_2;
    median_motion = DenseMosaic::dense_flow_to_corresp(flow_mat, points_1, points_2);
    // Use the correspondences to get the homography
    cv::Mat H = cv::findHomography(cv::Mat(points_2), cv::Mat(points_1), CV_RANSAC, 1);
    return H;
}
And this is the function I use to find the correspondences from the flow map:
/**
 * \brief Calculate pixel->pixel correspondences given a map of the optical flow across the image
 * \param[in] flow_mat Map of the optical flow across the image
 * \param[out] points_1 The set of points from #cur
 * \param[out] points_2 The set of points from #prev
 * \param[in] step_size The size of spaces between the grid lines
 * \return The median motion as a point
 *
 * Uses a dense flow map (such as that created by cv::calcOpticalFlowFarneback) to obtain a set of point correspondences across a grid.
 */
cv::Point2f DenseMosaic::dense_flow_to_corresp(const cv::Mat &flow_mat, std::vector<cv::Point2f> &points_1, std::vector<cv::Point2f> &points_2, int step_size)
{
    std::vector<double> tx, ty;
    for (int y = 0; y < flow_mat.rows; y += step_size) {
        for (int x = 0; x < flow_mat.cols; x += step_size) {
            /* Flow is basically the delta between left and right points */
            cv::Point2f flow = flow_mat.at<cv::Point2f>(y, x);
            /* There's no need to calculate for every single point,
               if there's not much change, just ignore it */
            if (fabs(flow.x) < 0.1 && fabs(flow.y) < 0.1)
                continue;
            points_1.push_back(cv::Point2f(x, y));
            points_2.push_back(cv::Point2f(x + flow.x, y + flow.y));
            // Collect the per-point motion for the average below
            tx.push_back(flow.x);
            ty.push_back(flow.y);
        }
    }
    // I know this should be median, not mean, but it's only used for plotting the
    // general motion direction so it's unimportant.
    cv::Point2f t_median;
    cv::Scalar mtx = cv::mean(tx);
    t_median.x = mtx[0];
    cv::Scalar mty = cv::mean(ty);
    t_median.y = mty[0];
    return t_median;
}
It turns out this was because my viewpoint was close to the features, meaning that the non-planarity of the tracked features was causing skew to the homography. I managed to prevent this (it's more of a hack than a method...) by using estimateRigidTransform instead of findHomography, as this does not estimate for perspective variations.
In this particular case, it makes sense to do so, as the view does only ever undergo rigid transformations.
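To illustrate why the restricted model cannot produce the perspective effect, here is a minimal least-squares fit of a 4-parameter transform (rotation, uniform scale, translation) in plain C++. This is not how estimateRigidTransform is implemented, and unlike findHomography with RANSAC it has no outlier rejection; it only shows that such a model has no perspective terms to absorb non-planar motion. Pt, Similarity and fitSimilarity are names made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };
// x' = a*x - b*y + tx,  y' = b*x + a*y + ty   (a = s*cos, b = s*sin)
struct Similarity { double a, b, tx, ty; };

// Closed-form least-squares fit of src[i] -> dst[i].
Similarity fitSimilarity(const std::vector<Pt>& src, const std::vector<Pt>& dst)
{
    assert(src.size() == dst.size() && src.size() >= 2);
    const double n = static_cast<double>(src.size());
    Pt ms{0.0, 0.0}, md{0.0, 0.0};  // centroids of both sets
    for (std::size_t i = 0; i < src.size(); ++i) {
        ms.x += src[i].x; ms.y += src[i].y;
        md.x += dst[i].x; md.y += dst[i].y;
    }
    ms.x /= n; ms.y /= n; md.x /= n; md.y /= n;

    double numA = 0.0, numB = 0.0, den = 0.0;
    for (std::size_t i = 0; i < src.size(); ++i) {
        const double sx = src[i].x - ms.x, sy = src[i].y - ms.y;
        const double dx = dst[i].x - md.x, dy = dst[i].y - md.y;
        numA += sx * dx + sy * dy;  // dot products
        numB += sx * dy - sy * dx;  // cross products
        den  += sx * sx + sy * sy;
    }
    assert(den > 0.0);  // all source points coincide otherwise
    const double a = numA / den, b = numB / den;
    return { a, b, md.x - (a * ms.x - b * ms.y), md.y - (b * ms.x + a * ms.y) };
}
```

The recovered scale is hypot(a, b) and the rotation angle is atan2(b, a), which makes it easy to log both per frame and check whether the scale still drifts.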
I want to write a program in C++ with OpenCV that can grade an answer sheet.
But when I use cvFindContours(), the border points of the contours are not found completely.
I mean, I don't get closed objects. (I use cvDilate and cvErode without really knowing what they do; dilating drops some contours and eroding adds extra, unwanted ones.)
The bigger problem is that I want to find a center dot in each contour (to compare the location of the answers with predefined left and bottom sidebars), but some contours are not symmetric, so the center dot is not exactly in the middle.
Look at the black picture: on the left, in the second contour, some of the bottom points are detected but the top ones are not.
cvCvtColor(pic, blackpic, CV_BGR2GRAY);
cvResize(blackpic, src0);
cvSmooth(src0, src0, CV_GAUSSIAN, 3, 3);
cvThreshold(src0, src, 140, 255, CV_THRESH_BINARY);
//Find Contour
CvMemStorage* st = cvCreateMemStorage();
CvSeq* first_contour = NULL;
cvFindContours(src, st, &first_contour, sizeof(CvContour), CV_RETR_LIST);
vector <vector <CvPoint> > cont;
vector <CvPoint> dot;
for (CvSeq* s = first_contour; s != NULL; s = s->h_next)
{
    if (s->total > C_MIN_SIZE && cvContourArea(s) > C_MIN_AREA)
    {
        cont.push_back(vector <CvPoint>()); //convert seq to vector
        CvPoint c = cvPoint(0, 0);
        for (int i = 0; i < s->total; i++)
        {
            CvPoint* p = CV_GET_SEQ_ELEM(CvPoint, s, i);
            CV_IMAGE_ELEM(test, uchar, p->y, p->x) = 255; //drawing each contour's point
            c.x += p->x; //find the center point by average
            c.y += p->y;
        }
        c.x = floor(c.x / s->total);
        c.y = floor(c.y / s->total);
    }
}
Although I am using C++, I use IplImage* and CvPoint (the C structures of OpenCV) instead of cv::Mat and cv::Point (the C++ structures). If possible, please don't use Mat and the C++ API.
I don't understand why the contours are drawn completely when I use cvDrawContours(), yet when I iterate over the contours' points myself and draw them point by point, most of them seem not to be detected!
For your first question, there are thousands of pages on the internet that explain erode, dilate and the other basic image processing building blocks. Take your time reading the documentation and don't skip steps.
Second question:
I am not sure what you expect from the contour center. Do you think you will get a point exactly at the center of those ellipses? No, that will never happen, and if it does, congratulations on a big leap in image processing history!
What I suggest is a simple change that resolves your issue:
find the contours (exactly as you are doing now)
calculate the center of each contour
when you compare the answers with the center locations, don't compare using a == b, because that will never hold! Instead, compare the distance against a threshold to make your software robust
Example :
bool correct = false;
CvPoint answer, center;
double dx = answer.x - center.x;
double dy = answer.y - center.y;
double distance = sqrt(dx * dx + dy * dy); // Euclidean distance (note: ^ is XOR in C++, not a power)
// judge using this distance
if (distance <= 5) // pick whatever threshold suits you; 5 is just an example
    correct = true; // correct answer :)
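On the "calculate the center" step: averaging the contour points is biased towards the side where more points were detected. One possible alternative, sketched below in plain C++, is the area-weighted centroid of the closed polygon (shoelace formula), which does not depend on how densely each side is sampled. Pt, polygonCentroid and withinTolerance are hypothetical names for this example, and the contour is assumed to be a simple closed polygon with non-zero area.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };  // hypothetical stand-in for CvPoint2D64f

// Area-weighted centroid of a closed polygon (shoelace formula).
Pt polygonCentroid(const std::vector<Pt>& poly)
{
    assert(poly.size() >= 3);
    double a2 = 0.0, cx = 0.0, cy = 0.0;  // a2 is twice the signed area
    for (std::size_t i = 0, j = poly.size() - 1; i < poly.size(); j = i++) {
        const double cross = poly[j].x * poly[i].y - poly[i].x * poly[j].y;
        a2 += cross;
        cx += (poly[j].x + poly[i].x) * cross;
        cy += (poly[j].y + poly[i].y) * cross;
    }
    assert(std::fabs(a2) > 1e-12);  // degenerate polygon
    return { cx / (3.0 * a2), cy / (3.0 * a2) };
}

// Distance-threshold comparison, as suggested above.
bool withinTolerance(Pt a, Pt b, double tol)
{
    return std::hypot(a.x - b.x, a.y - b.y) <= tol;
}
```

For a rectangle whose bottom edge carries extra points, the plain average is pulled towards the bottom, while this centroid stays in the middle.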
Good luck