Match object between different video frames - c++

I am trying to use OpenCV to detect the shift between consecutive video frames in real time when the camera is unstable and moving, as shown in the picture. To compensate for shaking or changes in angle, I want to match an object in the image (for example, the clock) across consecutive frames, measure the shift of its centre, and compensate for it. I don't know how to do this in real time, or which methods are available and accurate.
Thank you in advance and I hope my question is clear.

This is a fairly standard operation; it's actively used in MPEG-4 compression. It's called "motion estimation", and you don't do it on objects (too hard, requires image segmentation). In OpenCV, it's covered under Video Stabilization.

If you want to try writing code yourself then one method is to first of all crop the frame to produce a sub image of your actual image slightly smaller than your actual image along each dimension. This will give you some room to move.
Next you want to be able to find and track shapes in OpenCV - an example of code is here - http://opencv-srf.blogspot.co.uk/2011/09/object-detection-tracking-using-contours.html - Play around until you get a few geometric primitive shapes coming up on each frame.
Next you want to build some vectors between the centres of each shape - these are what will determine the movement of the camera - if in the next frame most of the vectors are displaced but parallel that is a good indicator that the camera has moved.
The last step is to calculate the displacement, which is a matter of measuring the distance between the detected parallel vectors. If this is smaller than your sub-image cropping margin, you can shift the crop window on the original image to negate the displacement.
The pseudo code for each iteration would be something like -
//Variables
image wholeFrame1, wholeFrame2, subImage, shapesFrame1, shapesFrame2
vectorArray vectorsFrame1, vectorsFrame2; parallelVectorList
vector cameraDisplacement = [0,0]
//Display image
subImage = cropImage(wholeFrame1, cameraDisplacement)
display(subImage);
//Find shapes to track
shapesFrame1 = findShapes(wholeFrame1)
shapesFrame2 = findShapes(wholeFrame2)
//Store a list of parallel vectors
parallelVectorList = detectParallelVectors(shapesFrame1, shapesFrame2)
//Find the mean displacement of each pair of parallel vectors
cameraDisplacement = meanDisplacement(parallelVectorList)
//Crop the next image accounting for camera displacement
subImage = cropImage(wholeFrame2, cameraDisplacement)
There are better ways of doing it but this would be easy enough for someone doing their first attempt at image stabilisation with experience of OpenCV.
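As a rough illustration of the displacement step above, here is a minimal C++ sketch that uses only the standard library. It assumes the shape centres have already been found and matched between the two frames; Point2, estimateCameraDisplacement and cropOrigin are hypothetical names, not OpenCV functions. It takes the median shift rather than the mean used in the pseudocode, so that a shape which really moved in the scene does not skew the estimate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Point2 { double x, y; };

// Median of a list of values (taken by value so the caller's data is untouched).
static double median(std::vector<double> v) {
    assert(!v.empty());
    std::sort(v.begin(), v.end());
    const std::size_t n = v.size();
    return (n % 2 == 1) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

// Estimate the image shift between two frames from matched shape centres.
// Assumption: prev[i] and curr[i] are the same shape seen in both frames.
Point2 estimateCameraDisplacement(const std::vector<Point2>& prev,
                                  const std::vector<Point2>& curr) {
    assert(prev.size() == curr.size() && !prev.empty());
    std::vector<double> dx, dy;
    for (std::size_t i = 0; i < prev.size(); ++i) {
        dx.push_back(curr[i].x - prev[i].x);
        dy.push_back(curr[i].y - prev[i].y);
    }
    return { median(dx), median(dy) };
}

// Top-left corner of the crop window inside the full frame. The window
// nominally sits `margin` pixels in from every edge and follows the scene;
// the shift is clamped so the window never leaves the frame.
Point2 cropOrigin(Point2 displacement, double margin) {
    const double sx = std::max(-margin, std::min(margin, displacement.x));
    const double sy = std::max(-margin, std::min(margin, displacement.y));
    return { margin + sx, margin + sy };
}
```

With a 16-pixel margin and a measured shift of (3, -2), cropOrigin places the window at (19, 14). In a real stabiliser the shift would be accumulated relative to a reference frame rather than applied once per frame pair.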

Related

How to improve accuracy of estimateAffine2D (or estimateRigidTransform) in OpenCV?

I have two sets of points, one from time t-1 and one from the current time t. The first set was generated using goodFeaturesToTrack(), and the second using calcOpticalFlowPyrLK(). Using these two sets of points, I then estimate a transformation matrix via estimateAffinePartial2D() in order to keep track of the object's scale and rotation. The code snippet is listed below:
// Precompute image pyramids
maxLvl = cv::buildOpticalFlowPyramid(_imgPrev, imPyr1, _winSize, maxLvl, true);
maxLvl = cv::buildOpticalFlowPyramid(tmpImg, imPyr2, _winSize, maxLvl, true);
// Optical flow call for tracking pixels
cv::calcOpticalFlowPyrLK(imPyr1, imPyr2, _currentPoints, nextPts, status, err, _winSize, maxLvl, _terminationCriteria, 0, 0.000001);
// Get transformation matrix between the two data sets
cv::Mat H = cv::estimateAffinePartial2D(_currentPoints, nextPts, inlier_mask, cv::RANSAC, 10.0, 2000, 0.99);
Using H, I then map my masking points using perspectiveTransform(). The result seems accurate for the first few dozen frames, until I notice some drift (in terms of rotation) occurring as the object I am tracking continues to rotate (usually when the rotation becomes > M_PI). I'm honestly stumped about where the culprit is, but my main suspicion is that my window size for optical flow might be too small or too big. However, tweaking the window size did not seem to help: the position of my object is still accurate, but the estimated rotation (and scale) got worse. Can anyone shed some light on this?
Warm regards and thanks.
EDIT: Images attached to show drift issue
Starting Frame
First few frames -- Rotation OK
Z-Rotation Drift occurs -- see anchor line has drifted towards the red rectangle.
A Lucas-Kanade tracker needs more features. I'd guess the tracking template you provided is not feature-rich enough.
(1) Try other, feature-rich real images, e.g. the OpenCV feature tracking template image.
(2) Fix the scale. Since you are running a simulation, you can try anchoring the size first.
calcOpticalFlowPyrLK is widely used in visual-inertial state estimation work, such as semi-direct visual odometry (SVO) or VINS-Mono. You can look through the code of those projects to see how other people choose features and parameters.
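For reference, estimateAffinePartial2D returns a 2x3 matrix of the form [s·cos θ, -s·sin θ, tx; s·sin θ, s·cos θ, ty], so scale and rotation can be read straight out of it. The C++ sketch below uses a plain std::array instead of cv::Mat to stay self-contained; Affine2x3, decompose and unwrap are hypothetical names. Since atan2 only returns angles in (-π, π], a rotation tracked past π has to be unwrapped. That is one thing worth ruling out, although it may well not be the cause of the drift described in the question.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Row-major 2x3 similarity transform: [a, -b, tx; b, a, ty],
// with a = s*cos(theta) and b = s*sin(theta).
using Affine2x3 = std::array<double, 6>;

struct ScaleRotation { double scale; double angle; };

// Recover the uniform scale and the rotation angle (radians, in (-pi, pi]).
ScaleRotation decompose(const Affine2x3& H) {
    // Check that the matrix really has the partial-affine layout.
    assert(std::abs(H[0] - H[4]) < 1e-6 && std::abs(H[1] + H[3]) < 1e-6);
    const double a = H[0];
    const double b = H[3];
    return { std::hypot(a, b), std::atan2(b, a) };
}

// Return the angle equivalent to `wrapped` (modulo 2*pi) that lies closest
// to `previous`, so a continuously tracked rotation does not jump at +/-pi.
double unwrap(double previous, double wrapped) {
    const double twoPi = 2.0 * 3.14159265358979323846;
    double delta = wrapped - previous;
    delta -= twoPi * std::round(delta / twoPi);
    return previous + delta;
}
```

A tracker would call decompose on each new estimate and pass the result through unwrap with the previous frame's angle. Note also that chaining frame-to-frame estimates accumulates their individual errors, so some drift is expected unless the transform is periodically re-estimated against a fixed reference frame.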

Depth/Disparity Map from a moving camera in OpenCV

Is it possible to get a depth/disparity map from a moving camera? Let's say I capture an image at location x, move about 5 cm, capture another image, and compute the depth map from the two.
I have tried block matching in OpenCV, but the result is not good. The first and second images are as follows:
first image,second image,
disparity map (colour),disparity map
My code is as follows:
GpuMat leftGPU;
GpuMat rightGPU;
leftGPU.upload(left);rightGPU.upload(right);
GpuMat disparityGPU;
GpuMat disparityGPU2;
Mat disparity;Mat disparity1,disparity2;
Ptr<cuda::StereoBM> stereo = createStereoBM(256,3);
stereo->setMinDisparity(-39);
stereo->setPreFilterCap(61);
stereo->setPreFilterSize(3);
stereo->setSpeckleRange(1);
stereo->setUniquenessRatio(0);
stereo->compute(leftGPU,rightGPU,disparityGPU);
drawColorDisp(disparityGPU, disparityGPU2,256);
disparityGPU.download(disparity);
disparityGPU2.download(disparity2);
imshow("display img",disparityGPU);
How can I improve on this? In the colour disparity map there are quite a lot of errors (e.g. the tall circle is red, the same colour as parts of the table). Also, the disparity map contains small noise (all the black dots in the picture). How can I fill those black dots with nearby disparities?
It is possible if the object is static.
To do stereo matching properly, you first need to rectify your images! If you don't have calibrated cameras, you can do this from detected feature points. Also note that for cuda::StereoBM the default minimum disparity is 0. (I have never used cuda, but I don't think your setMinDisparity is doing anything, see this answer.)
Now, in your example images corresponding points are only about 1 row apart, therefore your disparity map actually doesn't look too bad. Maybe having a larger blockSize would already do in this special case.
Finally, your objects have very low texture, therefore the block matching algorithm can't detect much.
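On the question of filling the black dots with nearby disparities, one simple option is to replace each invalid pixel with the median of its valid 3x3 neighbours. The following is a minimal C++ sketch on a plain row-major buffer, not an OpenCV API; fillHoles is a hypothetical name, and it assumes that invalid disparities are stored as values <= 0, which you should check against your own output.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Fill invalid disparities with the median of the valid 3x3 neighbours.
// Assumption: the map is row-major and values <= 0 mean "no match found".
std::vector<float> fillHoles(const std::vector<float>& disp, int width, int height) {
    assert(width > 0 && height > 0);
    assert(disp.size() == static_cast<std::size_t>(width) * height);
    std::vector<float> out = disp;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t idx = static_cast<std::size_t>(y) * width + x;
            if (disp[idx] > 0.0f) continue;  // already valid, keep as is
            std::vector<float> valid;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int nx = x + dx, ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    const float v = disp[static_cast<std::size_t>(ny) * width + nx];
                    if (v > 0.0f) valid.push_back(v);
                }
            }
            if (valid.empty()) continue;  // no valid neighbour, leave the hole
            // For an even count this picks the upper of the two middle values.
            auto mid = valid.begin() + static_cast<std::ptrdiff_t>(valid.size() / 2);
            std::nth_element(valid.begin(), mid, valid.end());
            out[idx] = *mid;
        }
    }
    return out;
}
```

Only the original values are read while filling, so one pass closes isolated dots; larger holes need repeated passes or a bigger window. This hides the noise but does not fix its cause, which is still the missing rectification and the low texture mentioned above.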

What accuracy should I expect from basic opencv ortho-rectification algorithms?

So, I'm taking over the work on an ortho-rectification algorithm that is intended to produce "accurate" results. I'm running into trouble trying to increase the accuracy and could use a little help.
Here is the basic approach.
Extract a calibration pattern from an image that was taken from a mobile phone.
Rectify the image based on a calibration pattern in the image
Scale the image to get the real world size of the scene around the pattern.
The calibration pattern is held against a flat surface, like a wall, counter, table, floor and the user takes a picture. With that picture, we want to measure artifacts on the same surface as the calibration pattern. We have tried this with calibration patterns ranging from the size of a credit card to a sheet of paper (8.5" x 11")
Here is an example input picture
With this resulting output image
Right now our measurements are usually within 1-2% of what we expect. This is sufficient for small areas (less than 25 cm away from the calibration pattern). However, we'd like the algorithm to scale so that we can accurately measure a 2x2 meter area, and at that size the current error is too large (2-4 cm).
Here is the algorithm we are following.
// convert original image to grayscale and perform morphological dilation to reduce false matches when finding circle grid
Mat imgGray;
cvtColor(imgOriginal, imgGray, CV_BGR2GRAY);
// find calibration pattern in original image
Size patternSize(4, 11);
vector <Point2f> circleCenters_OriginalImage;
if (!findCirclesGrid(imgGray, patternSize, circleCenters_OriginalImage, CALIB_CB_ASYMMETRIC_GRID))
{
return false;
}
Point2f inputQuad[4];
inputQuad[0] = Point2f(circleCenters_OriginalImage[0].x, circleCenters_OriginalImage[0].y);
inputQuad[1] = Point2f(circleCenters_OriginalImage[3].x, circleCenters_OriginalImage[3].y);
inputQuad[2] = Point2f(circleCenters_OriginalImage[43].x, circleCenters_OriginalImage[43].y);
inputQuad[3] = Point2f(circleCenters_OriginalImage[40].x, circleCenters_OriginalImage[40].y);
// create model points for calibration pattern
vector <Point2f> circleCenters_ObjectSpace = GeneratePatternPointsInObjectSpace(circleCenters_OriginalImage[0], Distance(circleCenters_OriginalImage[0], circleCenters_OriginalImage[1]) / 2.0f, ioData.marker_up);
Point2f outputQuad[4];
outputQuad[0] = Point2f(circleCenters_ObjectSpace[0].x, circleCenters_ObjectSpace[0].y);
outputQuad[1] = Point2f(circleCenters_ObjectSpace[3].x, circleCenters_ObjectSpace[3].y);
outputQuad[2] = Point2f(circleCenters_ObjectSpace[43].x, circleCenters_ObjectSpace[43].y);
outputQuad[3] = Point2f(circleCenters_ObjectSpace[40].x, circleCenters_ObjectSpace[40].y);
Mat lambda(2,4,CV_32FC1);
lambda = Mat::zeros(imgOriginal.rows, imgOriginal.cols, imgOriginal.type());
lambda = getPerspectiveTransform(inputQuad, outputQuad);
warpPerspective(imgOriginal, imgOrthorectified, lambda, imgOrthorectified.size());
...
My Questions:
Is it reasonable to shoot for error < 0.25%? Is there a different algorithm that would yield more accurate results? What are the most valuable sources of error to identify and resolve?
As I've worked on this, I've also looked at removing pincushion / barrel distortions, and trying homographies to find the perspective transform. The best approaches I have found so far remain in the 1-2% error.
Any suggestions of where to go next would be really helpful

How can I detect the position and the radius of the ball using opencv?

I need to detect this ball and find its position and radius using OpenCV. I have downloaded many code samples, but none of them work. Any help is highly appreciated.
I see you have quite a setup installed. As mentioned in the comments, please make sure that you have appropriate lighting to capture the ball, and make the ball distinguishable from its surroundings by painting it a different colour.
Once your setup is optimized for detection, you may proceed via different ways to track your ball (stationary or not). A few ways may be:
Feature detection : Via Hough Circles, detect 2D circles (and their radius) that lie within a certain range of color, as explained below
There are many more ways to detect objects via feature detection, such as this clever blog points out.
Object Detection: Via SURF, SIFT and many other methods, you may detect your ball, calculate its radius and even predict its motion.
This code uses Hough Circles to compute the ball position and radius and display them in real time. I am using Qt 5.4 with OpenCV version 2.4.12
void Dialog::TrackMe() {
webcam.read(cim); /* read the next frame of the live webcam feed into the cv::Mat 'cim' */
if(cim.empty()==false) /* a frame was captured, i.e. the webcam is running and there is some form of input */ {
cv::inRange(cim,cv::Scalar(0,0,175),cv::Scalar(100,100,256),cproc);
/* keep only the pixels of cim whose colour lies between (0,0,175) and (100,100,256) in BGR order (OpenCV's default); the resulting binary mask is stored in the cv::Mat cproc */
cv::HoughCircles(cproc,veccircles,CV_HOUGH_GRADIENT,2,cproc.rows/4,100,50,10,100);
/* detect circles in cproc with the CV_HOUGH_GRADIENT method; each result (x, y, radius) is stored in the vector veccircles */
for(itcircles=veccircles.begin(); itcircles!=veccircles.end(); itcircles++)
{
cv::circle(cim,cv::Point((int)(*itcircles)[0],(int)(*itcircles)[1]), 3, cv::Scalar(0,255,0), CV_FILLED); //create center point
cv::circle(cim,cv::Point((int)(*itcircles)[0],(int)(*itcircles)[1]), (int)(*itcircles)[2], cv::Scalar(0,0,255),3); //create circle
}
QImage qimgprocess((uchar*)cproc.data,cproc.cols,cproc.rows,cproc.step,QImage::Format_Indexed8); //convert cv::Mat to Qimage
ui->output->setPixmap(QPixmap::fromImage(qimgprocess));
/*render QImage to screen*/
}
else
return; /*no input, return to calling function*/
}
How does the processing take place?
Once you start taking in live input of your ball, the frame captured should be able to show where the ball is. To do so, the frame captured is divided into buckets which are further divided into grids. Within each grid, an edge is detected (if it exists) and thus, a circle is detected. However, only those circles that pass through the grids that lie within the range mentioned above (in cv::Scalar) are considered. Thus, for every circle that passes through a grid that lies in the specified range, a number corresponding to that grid is incremented. This is known as voting.
Each grid then stores its votes in an accumulator grid. Here, 2 is the accumulator ratio (dp), the inverse ratio of the accumulator resolution to the image resolution. With a value of 2, the accumulator has half the width and half the height of the input image cproc. After voting, we can find local maxima in the accumulator matrix. The positions of the local maxima correspond to the circle centers in the original space.
cproc.rows/4 is the minimum distance between centers of the detected circles.
100 is the higher of the two thresholds passed to the Canny edge detector (the lower one is half of it), and 50 is the accumulator threshold for circle centers: the smaller it is, the more false circles may be detected.
10 and 100 are the minimum and maximum radius to be detected. Anything above or below these values will not be detected.
Now, the for loop goes through each circle stored in veccircles and draws its center point and its outline on the frame.
For the above, you may visit this link
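To make the voting idea concrete, here is a minimal standard-library C++ sketch of classic Hough voting for circles of one known radius. It is not what cv::HoughCircles does internally (the CV_HOUGH_GRADIENT method also uses the gradient direction and estimates the radius in a later step); Pixel, Centre and houghCircleCentre are hypothetical names. Every edge pixel votes for all accumulator cells that could hold the centre of a circle passing through it, and the cell with the most votes wins.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pixel { int x, y; };
struct Centre { int x, y, votes; };

// Fixed-radius Hough voting. `dp` shrinks the accumulator: with dp = 2 it
// has half the width and height of the image, so each cell covers 2x2 pixels.
Centre houghCircleCentre(const std::vector<Pixel>& edges,
                         int width, int height, int radius, int dp) {
    assert(width > 0 && height > 0 && radius > 0 && dp >= 1);
    const int accW = (width + dp - 1) / dp;
    const int accH = (height + dp - 1) / dp;
    std::vector<int> votes(static_cast<std::size_t>(accW) * accH, 0);
    std::vector<int> lastVoter(votes.size(), -1);  // one vote per edge pixel per cell
    const double pi = 3.14159265358979323846;
    for (std::size_t i = 0; i < edges.size(); ++i) {
        for (int k = 0; k < 360; ++k) {  // candidate centres, one per degree
            const double t = pi * k / 180.0;
            const int cx = static_cast<int>(std::lround(edges[i].x + radius * std::cos(t)));
            const int cy = static_cast<int>(std::lround(edges[i].y + radius * std::sin(t)));
            if (cx < 0 || cy < 0 || cx >= width || cy >= height) continue;
            const std::size_t cell = static_cast<std::size_t>(cy / dp) * accW + cx / dp;
            if (lastVoter[cell] == static_cast<int>(i)) continue;
            lastVoter[cell] = static_cast<int>(i);
            ++votes[cell];
        }
    }
    // The accumulator maximum is the most likely centre (top-left of its cell).
    Centre best{0, 0, 0};
    for (std::size_t cell = 0; cell < votes.size(); ++cell) {
        if (votes[cell] > best.votes) {
            best = { static_cast<int>(cell % accW) * dp,
                     static_cast<int>(cell / accW) * dp, votes[cell] };
        }
    }
    return best;
}
```

The minimum-distance and accumulator-threshold parameters of cv::HoughCircles correspond to keeping several local maxima instead of only the best one and discarding those with too few votes.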

Remove moving objects to get the background model from multiple images

I want to find the background in multiple images captured with a fixed camera. The camera detects moving objects (animals) and captures sequential images. So I need to build a simple background model image by processing 5 to 10 captured images that share the same background.
Can someone help me, please?
Is your eventual goal to find foreground? Can you show some images?
If animals move fast enough they will create a lot of intensity changes while background pixels will remain closely correlated among most of the frames. I won’t write you real code but will give you a pseudo-code in openCV. The main idea is to average only correlated pixels:
Mat Iseq[10];// your sequence
Mat result, Iacc=0, Icnt=0; // Iacc and Icnt are float types
loop through your sequence, i=0; i<N-1; i++
matchTemplate(Iseq[i], Iseq[i+1], result, CV_TM_CCOEFF_NORMED);
mask = 1 & (result>0.9); // get correlated part, which is probably background
Iacc += (Iseq[i] & mask) + (Iseq[i+1] & mask); // accumulate the inferred background
Icnt += 2*mask; // keep count
end of loop;
Mat Ibackground = Iacc.mul(1.0/Icnt); // average background (moving parts fade away)
To improve the result you may reduce the image resolution or apply blur to enhance correlation. You can also clean small connected components out of every mask, by erosion for example.
If
each pixel location appears as background in more than half the frames, and
the colour of a pixel does not vary much across the subset of frames in which it is background,
then there's a very simple algorithm: for each pixel location, just take the median intensity over all frames.
How come? Suppose the image is greyscale (this makes it easier to explain, but the process will work for colour images too -- just treat each colour component separately). If a particular pixel appears as background in more than half the frames, then when you take the intensities of that pixel across all frames and sort them, a background-coloured pixel must appear at the half-way (median) position. (In the worst case, all background-coloured pixels get pushed to the very front or the very back in this order, but even then there are enough of them to cover the half-way point.)
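The per-pixel median described above can be sketched in a few lines of standard-library C++. This is a minimal illustration for greyscale frames stored as flat byte buffers; Frame and medianBackground are hypothetical names, and with OpenCV you would first copy the pixel data out of each cv::Mat or index it directly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One greyscale frame as a flat, row-major buffer; all frames share one size.
using Frame = std::vector<std::uint8_t>;

// Background model: for each pixel location, the median intensity over all frames.
Frame medianBackground(const std::vector<Frame>& frames) {
    assert(!frames.empty());
    const std::size_t pixels = frames.front().size();
    Frame background(pixels);
    std::vector<std::uint8_t> samples(frames.size());
    for (std::size_t p = 0; p < pixels; ++p) {
        for (std::size_t f = 0; f < frames.size(); ++f) {
            assert(frames[f].size() == pixels);
            samples[f] = frames[f][p];
        }
        // nth_element places the median at `mid` without a full sort.
        auto mid = samples.begin() + static_cast<std::ptrdiff_t>(samples.size() / 2);
        std::nth_element(samples.begin(), mid, samples.end());
        background[p] = *mid;
    }
    return background;
}
```

An odd number of frames gives a true median; with an even number this sketch takes the upper of the two middle values. For colour images the same function can be run on each channel separately.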
If you only have 5 images it's going to be hard to identify background and most sophisticated techniques probably won't work. For general background identification methods, see Link