How to improve the accuracy of estimateAffine2D (or estimateRigidTransform) in OpenCV? (C++)

I have two sets of points, one from time t-1 and one from the current time t. The first set was generated using goodFeaturesToTrack, and the second by tracking those points with calcOpticalFlowPyrLK(). Using these two sets of points, I then estimate a transformation matrix via estimateAffinePartial2D() in order to keep track of scale and rotation. A code snippet is listed below:
// Precompute image pyramids
maxLvl = cv::buildOpticalFlowPyramid(_imgPrev, imPyr1, _winSize, maxLvl, true);
maxLvl = cv::buildOpticalFlowPyramid(tmpImg, imPyr2, _winSize, maxLvl, true);
// Optical flow call for tracking pixels
cv::calcOpticalFlowPyrLK(imPyr1, imPyr2, _currentPoints, nextPts, status, err, _winSize, maxLvl, _terminationCriteria, 0, 0.000001);
// Get transformation matrix between the two data sets
cv::Mat H = cv::estimateAffinePartial2D(_currentPoints, nextPts, inlier_mask, cv::RANSAC, 10.0, 2000, 0.99);
Using H, I then map my masking points using perspectiveTransform(). The result looks accurate for the first few dozen frames, but then I notice some drift (in terms of rotation) as the object I am tracking continues to rotate (usually once the rotation exceeds M_PI). I'm honestly stumped about the culprit; my main suspicion is that my window size for optical flow might be too small, or too big. However, tweaking the window size did not seem to help: the position of my object is still accurate, but the estimated rotation (and scale) got worse. Can anyone shed some light on this?
Warm regards and thanks.
EDIT: Images attached to show drift issue
Starting Frame
First few frames -- Rotation OK
Z-rotation drift occurs -- note how the anchor line has drifted towards the red rectangle.

The Lucas-Kanade tracker needs more features. My guess is that the tracking template you provided is not good enough.
(1) Try with other feature-rich real images, e.g. the OpenCV feature tracking template image.
(2) Fix the scale. Since you are doing a simulation, you can try to anchor the size first (see the sketch below).
calcOpticalFlowPyrLK is widely used in visual-inertial state estimation work, such as semi-direct visual odometry (SVO) or VINS-Mono. You can look at the code inside those projects to see how other people play with the features and parameters.
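If you want to experiment with anchoring the scale, one possible (untested) sketch is to decompose the 2x3 similarity matrix returned by estimateAffinePartial2D into angle, scale and translation, and rebuild it with the scale forced to 1; the helper name here is made up:
#include <opencv2/core.hpp>
#include <cmath>
// Hypothetical helper: H is the 2x3 CV_64F matrix returned by
// cv::estimateAffinePartial2D (uniform scale + rotation + translation).
// Returns a copy with the scale component removed, keeping only
// rotation and translation.
cv::Mat removeScale(const cv::Mat& H)
{
    double a = H.at<double>(0, 0);          // s * cos(theta)
    double b = H.at<double>(1, 0);          // s * sin(theta)
    double angle = std::atan2(b, a);        // the scale would be std::hypot(a, b)
    cv::Mat out = H.clone();
    out.at<double>(0, 0) =  std::cos(angle);
    out.at<double>(0, 1) = -std::sin(angle);
    out.at<double>(1, 0) =  std::sin(angle);
    out.at<double>(1, 1) =  std::cos(angle);
    // translation entries H(0,2) and H(1,2) are left unchanged
    return out;
}
Logging the recovered angle and scale per frame is also a cheap way to see exactly when the estimate starts to drift.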

Related

Detection of small, fast objects using Simd::Motion::Detector

I am trying to use a motion detector to detect a shooting star in a video, following the example code UseMotionDetector.cpp.
If I use the motion detector with default options, nothing works.
I think that may be connected with the small object size, its fast speed, or the strong noise.
The motion detector has a large number of parameters (and also this) but I have no experience using any motion detector.
So I have some questions:
Can this task be solved using motion-detection algorithms?
Is this motion detector suited for the job?
How do I tune its parameters?
Thanks in advance!
I have analysed your video, and there are some issues that prevent Simd::Motion::Detector from working properly with default settings.
You have already listed most of them above:
The object (shooting star) is small.
It moves very fast.
Its lifetime is short.
There is strong noise in the video.
To solve these issues I changed the following parameters of the motion detector:
To detect small objects, I decreased the minimal object size in the model:
Model model;
model.size = FSize(0.01, 0.01); // By default it is equal to FSize(0.1, 0.1).
detector.SetModel(model);
To reduce the influence of fast motion:
Options options;
options.TrackingAdditionalLinking = 5; // Boosts binding of trajectory.
To handle the short lifetime of the object:
options.ClassificationShiftMin = 0.01; // Decreases minimal shift of object to be detected.
options.ClassificationTimeMin = 0.01; // Decreases minimal life time of object to be detected.
To reduce the strong noise:
options.DifferenceDxFeatureWeight = 0; // Turns off gradient along X axis feature.
options.DifferenceDyFeatureWeight = 0; // Turns off gradient along Y axis feature.
detector.SetOptions(options);
And it works! I hope this helps.
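For convenience, here is the same configuration gathered into one (untested) sketch; the header and namespace are the ones used by the UseMotionDetector.cpp example, and everything else is exactly the settings listed above:
#include "Simd/SimdMotion.hpp"
using namespace Simd::Motion;
void ConfigureDetector(Detector& detector)
{
    Model model;
    model.size = FSize(0.01, 0.01);          // detect much smaller objects (default FSize(0.1, 0.1))
    detector.SetModel(model);
    Options options;
    options.TrackingAdditionalLinking = 5;   // boost trajectory binding for fast motion
    options.ClassificationShiftMin = 0.01;   // accept objects that move only a short distance
    options.ClassificationTimeMin = 0.01;    // accept objects with a short lifetime
    options.DifferenceDxFeatureWeight = 0;   // turn off the X-gradient feature (noise)
    options.DifferenceDyFeatureWeight = 0;   // turn off the Y-gradient feature (noise)
    detector.SetOptions(options);
    // Frame-by-frame processing then follows UseMotionDetector.cpp unchanged.
}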

How to use buildOpticalFlowPyramid?

I'm using OpenCV 3.3.1. I want to do a semi-dense optical flow operation using cv::calcOpticalFlowPyrLK, but I've been getting a really noticeable slowdown whenever my ROI is pretty big (partly because I let the user decide what winSize should be, ranging from 10 to 100). Anyway, it seems like cv::buildOpticalFlowPyramid can mitigate the slowdown by building image pyramids? I'm somewhat familiar with what image pyramids are, but in the context of this function I'm confused about which parameters to pass in and how they affect my call to cv::calcOpticalFlowPyrLK. With that in mind, I now have this set of questions:
According to the documentation, the output is an OutputArrayOfArrays, which I take it can be a vector of cv::Mat objects. If so, what do I pass to cv::calcOpticalFlowPyrLK for prevImg and nextImg (assuming that I need to make image pyramids for both)?
According to the docs for cv::buildOpticalFlowPyramid, you need to pass in a winSize parameter in order to calculate required padding for pyramid levels. If so, do you pass in the same winSize value when you eventually call cv::calcOpticalFlowPyrLK?
What exactly do the pyrBorder and derivBorder arguments do?
Lastly, and apologies if it sounds newbish, but what is the purpose of this function? I always assumed that cv::calcOpticalFlowPyrLK internally builds the image pyramids. Is it just to speed up the optical flow operation?
I hope my questions were clear, I'm still very new to OpenCV, and computer vision, but this topic is very interesting.
Thank you for your time.
EDIT:
I used the function to see if my guess was correct, and so far it has worked, but I've seen no noticeable speed-up. Below is how I used it:
// Building pyramids
int maxLvl = 3;
maxLvl = cv::buildOpticalFlowPyramid(imgPrev, imPyr1, cv::Size(searchSize, searchSize), maxLvl, true);
maxLvl = cv::buildOpticalFlowPyramid(tmpImg, imPyr2, cv::Size(searchSize, searchSize), maxLvl, true);
// LK optical flow call
cv::calcOpticalFlowPyrLK(imPyr1, imPyr2, currentPoints, nextPts, status, err,
cv::Size(searchSize, searchSize), maxLvl, termCrit, 0, 0.00001);
So now I'm wondering what's the purpose of preparing the image pyramids if calcOpticalFlowPyrLK does it internally?
So the point of your question is that you are trying to improve the speed of optical flow tracking by tuning your input parameters.
If you want the quick and dirty answer, here it is:
KLT (OpenCV's calcOpticalFlowPyrLK) defines a residual function, which is the sum of gradients of the points inside the search window.
Its main purpose is to find the displacement that minimizes this residual function.
So if you increase the search window size (winSize), it becomes harder to find that set of points.
If you really want to dig into the details, please read the official paper, section 2.4:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.185.585&rep=rep1&type=pdf
I took it from the official documentation:
https://docs.opencv.org/2.4/modules/video/doc/motion_analysis_and_object_tracking.html#bouguet00
Hope that helps.
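Regarding the edit: the main practical benefit of building the pyramids yourself is reuse. calcOpticalFlowPyrLK builds pyramids for both images internally on every call, so if you build each frame's pyramid once and keep it around, it can serve as the "previous" pyramid on the next iteration and only one new pyramid needs to be built per frame. It also answers question 2: you must pass the same winSize (and maxLevel) to both functions. A rough, untested sketch with made-up function and variable names:
#include <opencv2/imgproc.hpp>
#include <opencv2/video/tracking.hpp>
#include <opencv2/videoio.hpp>
#include <vector>
void trackLoop(cv::VideoCapture& cap, std::vector<cv::Point2f>& points,
               cv::Size winSize, int maxLvl, cv::TermCriteria termCrit)
{
    cv::Mat frame;
    std::vector<cv::Mat> prevPyr, nextPyr;
    std::vector<cv::Point2f> nextPts;
    std::vector<uchar> status;
    std::vector<float> err;
    // Build the pyramid for the first frame once.
    cap >> frame;
    cv::Mat firstGray;
    cv::cvtColor(frame, firstGray, cv::COLOR_BGR2GRAY);
    maxLvl = cv::buildOpticalFlowPyramid(firstGray, prevPyr, winSize, maxLvl, true);
    while (cap.read(frame))
    {
        cv::Mat gray;
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::buildOpticalFlowPyramid(gray, nextPyr, winSize, maxLvl, true);
        // Same winSize and maxLvl as used when building the pyramids.
        cv::calcOpticalFlowPyrLK(prevPyr, nextPyr, points, nextPts,
                                 status, err, winSize, maxLvl, termCrit);
        std::swap(prevPyr, nextPyr);   // reuse this frame's pyramid on the next iteration
        points = nextPts;
    }
}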

What accuracy should I expect from basic opencv ortho-rectification algorithms?

So, I'm taking over the work on an ortho-rectification algorithm that is intended to produce "accurate" results. I'm running into trouble trying to increase the accuracy and could use a little help.
Here is the basic approach.
Extract a calibration pattern from an image that was taken with a mobile phone.
Rectify the image based on the calibration pattern in the image.
Scale the image to get the real-world size of the scene around the pattern.
The calibration pattern is held against a flat surface, like a wall, counter, table, or floor, and the user takes a picture. With that picture, we want to measure artifacts on the same surface as the calibration pattern. We have tried this with calibration patterns ranging from the size of a credit card to a sheet of paper (8.5" x 11").
Here is an example input picture
With this resulting output image
Right now our measurements are usually within 1-2% of what we expect. This is sufficient for small areas (less than 25 cm away from the calibration pattern). However, we'd like the algorithm to scale up so that we can accurately measure a 2x2 meter area. At that size, the current error is too large (2-4 cm).
Here is the algorithm we are following.
// convert original image to grayscale and perform morphological dilation to reduce false matches when finding circle grid
Mat imgGray;
cvtColor(imgOriginal, imgGray, CV_BGR2GRAY);
// find calibration pattern in original image
Size patternSize(4, 11);
vector <Point2f> circleCenters_OriginalImage;
if (!findCirclesGrid(imgGray, patternSize, circleCenters_OriginalImage, CALIB_CB_ASYMMETRIC_GRID))
{
return false;
}
Point2f inputQuad[4];
inputQuad[0] = Point2f(circleCenters_OriginalImage[0].x, circleCenters_OriginalImage[0].y);
inputQuad[1] = Point2f(circleCenters_OriginalImage[3].x, circleCenters_OriginalImage[3].y);
inputQuad[2] = Point2f(circleCenters_OriginalImage[43].x, circleCenters_OriginalImage[43].y);
inputQuad[3] = Point2f(circleCenters_OriginalImage[40].x, circleCenters_OriginalImage[40].y);
// create model points for calibration pattern
vector <Point2f> circleCenters_ObjectSpace = GeneratePatternPointsInObjectSpace(circleCenters_OriginalImage[0], Distance(circleCenters_OriginalImage[0], circleCenters_OriginalImage[1]) / 2.0f, ioData.marker_up);
Point2f outputQuad[4];
outputQuad[0] = Point2f(circleCenters_ObjectSpace[0].x, circleCenters_ObjectSpace[0].y);
outputQuad[1] = Point2f(circleCenters_ObjectSpace[3].x, circleCenters_ObjectSpace[3].y);
outputQuad[2] = Point2f(circleCenters_ObjectSpace[43].x, circleCenters_ObjectSpace[43].y);
outputQuad[3] = Point2f(circleCenters_ObjectSpace[40].x, circleCenters_ObjectSpace[40].y);
Mat lambda(2,4,CV_32FC1);
lambda = Mat::zeros(imgOriginal.rows, imgOriginal.cols, imgOriginal.type());
lambda = getPerspectiveTransform(inputQuad, outputQuad);
warpPerspective(imgOriginal, imgOrthorectified, lambda, imgOrthorectified.size());
...
My Questions:
Is it reasonable to shoot for error < 0.25%? Is there a different algorithm that would yield more accurate results? What are the most valuable sources of error to identify and resolve?
As I've worked on this, I've also looked at removing pincushion/barrel distortion, and at using homographies to find the perspective transform. The best approaches I have found so far still sit at 1-2% error.
Any suggestions of where to go next would be really helpful
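For illustration, a minimal sketch of what the distortion-removal step mentioned above could look like, assuming cameraMatrix and distCoeffs were obtained beforehand from a separate cv::calibrateCamera run with the same phone camera (the names here are placeholders, not part of the code above):
#include <opencv2/calib3d.hpp>
#include <opencv2/imgproc.hpp>
cv::Mat undistortInput(const cv::Mat& imgOriginal,
                       const cv::Mat& cameraMatrix,
                       const cv::Mat& distCoeffs)
{
    cv::Mat undistorted;
    // Remove barrel/pincushion distortion before running findCirclesGrid
    // and getPerspectiveTransform on the corrected image.
    cv::undistort(imgOriginal, undistorted, cameraMatrix, distCoeffs);
    return undistorted;
}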

How can I detect the position and the radius of the ball using opencv?

I need to detect this ball: and find its position and radius using OpenCV. I have downloaded many code samples, but none of them works. Any help is highly appreciated.
I see you have quite a setup installed. As mentioned in the comments, please make sure that you have appropriate lighting to capture the ball, and make the ball distinguishable from its surroundings by painting it a different colour.
Once your setup is optimized for detection, you may proceed via different ways to track your ball (stationary or not). A few ways may be:
Feature detection: via Hough circles, detect 2D circles (and their radii) that lie within a certain colour range, as explained below.
There are many more ways to detect objects via feature detection, as this clever blog post points out.
Object detection: via SURF, SIFT and many other methods, you may detect your ball, calculate its radius and even predict its motion.
This code uses Hough circles to compute the ball position, display it in real time and calculate its radius in real time. I am using Qt 5.4 with OpenCV version 2.4.12.
void Dialog::TrackMe() {
webcam.read(cim); /*call the read method of the webcam class to take in the live webcam feed and store each frame in an OpenCV matrix 'cim'*/
if(cim.empty()==false) /*if there is something stored in cim, i.e. the webcam is running and there is some form of input*/ {
cv::inRange(cim,cv::Scalar(0,0,175),cv::Scalar(100,100,256),cproc);
/* keep the pixels of cim whose BGR values lie between (0,0,175) and (100,100,256), i.e. mostly-red pixels, and store the resulting binary mask in cproc */
cv::HoughCircles(cproc,veccircles,CV_HOUGH_GRADIENT,2,cproc.rows/4,100,50,10,100);
/* detect circles in cproc and store them in veccircles, using the CV_HOUGH_GRADIENT method */
for(itcircles=veccircles.begin(); itcircles!=veccircles.end(); itcircles++)
{
cv::circle(cim,cv::Point((int)(*itcircles)[0],(int)(*itcircles)[1]), 3, cv::Scalar(0,255,0), CV_FILLED); //create center point
cv::circle(cim,cv::Point((int)(*itcircles)[0],(int)(*itcircles)[1]), (int)(*itcircles)[2], cv::Scalar(0,0,255),3); //create circle
}
QImage qimgprocess((uchar*)cproc.data,cproc.cols,cproc.rows,cproc.step,QImage::Format_Indexed8); //convert cv::Mat to QImage
ui->output->setPixmap(QPixmap::fromImage(qimgprocess));
/*render QImage to screen*/
}
else
return; /*no input, return to calling function*/
}
How does the processing take place?
Once you start taking in live input of your ball, the captured frame should be able to show where the ball is. To do so, the captured frame is divided into buckets, which are further divided into grids. Within each grid, an edge is detected (if it exists) and thus a circle is detected. However, only those circles that pass through grids lying within the colour range mentioned above (in cv::Scalar) are considered. For every circle that passes through such a grid, a counter corresponding to that grid is incremented. This is known as voting.
Each grid then stores its votes in an accumulator grid. Here, 2 is the accumulator ratio, meaning the accumulator matrix has half the resolution of the input image cproc. After voting, we can find the local maxima in the accumulator matrix; their positions correspond to the circle centres in the original space.
cproc.rows/4 is the minimum distance between the centres of the detected circles.
100 and 50 are, respectively, the higher and lower thresholds passed to the Canny edge detector, which detects edges only between those thresholds.
10 and 100 are the minimum and maximum radii to be detected. Anything above or below these values will not be detected.
Now, the for loop processes each circle detected and stored in veccircles, drawing a centre point and the circle itself in the frame.
For the above, you may visit this link
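If you prefer to try this without the Qt plumbing, here is a stripped-down, OpenCV-only sketch of the same idea (untested; written against the OpenCV 3/4 constant names, with the thresholds and radii copied from the answer above as starting points, and a Gaussian blur added since it is commonly recommended before HoughCircles):
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/videoio.hpp>
#include <vector>
int main()
{
    cv::VideoCapture cap(0);                 // default webcam
    cv::Mat frame, mask;
    std::vector<cv::Vec3f> circles;
    while (cap.read(frame))
    {
        // keep only pixels in the chosen BGR range (mostly-red ball)
        cv::inRange(frame, cv::Scalar(0, 0, 175), cv::Scalar(100, 100, 255), mask);
        cv::GaussianBlur(mask, mask, cv::Size(9, 9), 2);
        cv::HoughCircles(mask, circles, cv::HOUGH_GRADIENT, 2, mask.rows / 4,
                         100, 50, 10, 100);
        for (const cv::Vec3f& c : circles)
        {
            cv::circle(frame, cv::Point(cvRound(c[0]), cvRound(c[1])), 3,
                       cv::Scalar(0, 255, 0), cv::FILLED);        // centre
            cv::circle(frame, cv::Point(cvRound(c[0]), cvRound(c[1])),
                       cvRound(c[2]), cv::Scalar(0, 0, 255), 3);  // outline
        }
        cv::imshow("ball", frame);
        if (cv::waitKey(30) == 27) break;    // press Esc to quit
    }
    return 0;
}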

Match object between different video frames

I am trying to use OpenCV to detect the shift between consecutive video frames when the camera is unstable and moving in real time, as shown in the picture. To compensate for the effect of shaking or a changing angle, I want to match some objects in the image (for example the clock) and, from the centre of the same object in consecutive frames, detect the shift value and compensate for its effect. I don't know how to do this in real time, or how many accurate ways there are to do it.
Thank you in advance and I hope my question is clear.
This is a fairly standard operation, as it's actively used in MPEG-4 compression. It's called "motion estimation", and you don't do it on objects (too hard, requires image segmentation). In OpenCV, it's covered under Video Stabilization.
If you want to try writing the code yourself, one method is to first crop the frame to produce a sub-image slightly smaller than your actual image along each dimension. This will give you some room to move.
Next you want to be able to find and track shapes in OpenCV - an example of code is here - http://opencv-srf.blogspot.co.uk/2011/09/object-detection-tracking-using-contours.html - play around until you get a few geometric primitive shapes coming up on each frame.
Next you want to build some vectors between the centres of each shape - these are what will determine the movement of the camera - if in the next frame most of the vectors are displaced but parallel, that is a good indicator that the camera has moved.
The last step is to calculate the displacement, which is a matter of measuring the distance between the detected parallel vectors. If this is smaller than your sub-image crop margin, then you can crop the original image to negate the displacement.
The pseudo code for each iteration would be something like -
//Variables
image wholeFrame1, wholeFrame2, subImage, shapesFrame1, shapesFrame2
vectorArray vectorsFrame1, vectorsFrame2; parallelVectorList
vector cameraDisplacement = [0,0]
//Display image
subImage = cropImage(wholeFrame1, cameraDisplacement)
display(subImage);
//Find shapes to track
shapesFrame1 = findShapes(wholeFrame1)
shapesFrame2 = findShapes(wholeFrame2)
//Store a list of parallel vectors
parallelVectorList = detectParallelVectors(shapesFrame1, shapesFrame2)
//Find the mean displacement of each pair of parallel vectors
cameraDisplacement = meanDisplacement(parallelVectorList)
//Crop the next image accounting for camera displacement
subImage = cropImage(wholeFrame1, cameraDisplacement)
There are better ways of doing it, but this would be easy enough for someone making their first attempt at image stabilisation with some experience of OpenCV.
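If all you need is a single global translation per frame pair (rather than per-object tracking), phase correlation is an even simpler starting point than shape matching. A minimal, untested sketch, not the method described above:
#include <opencv2/imgproc.hpp>
// prevGray and currGray are two consecutive frames, already converted to grayscale.
cv::Point2d estimateGlobalShift(const cv::Mat& prevGray, const cv::Mat& currGray)
{
    cv::Mat prevF, currF;
    prevGray.convertTo(prevF, CV_32F);    // phaseCorrelate expects floating-point images
    currGray.convertTo(currF, CV_32F);
    cv::Mat window;
    cv::createHanningWindow(window, prevF.size(), CV_32F);   // reduces edge effects
    // Sub-pixel translation of currF relative to prevF; use it to offset the crop window.
    return cv::phaseCorrelate(prevF, currF, window);
}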