I am trying to stitch 2 aerial images together with very little overlap, probably <500 px of overlap. These images have 3600x2100 resolution. I am using the OpenCV library to complete this task.
Here is my approach:
1. Find feature points and match points between the two images.
2. Find homography between two images
3. Warp one of the images using the homgraphy
4. Stitch the two images
Right now I am trying to get this to work with two images. I am having trouble with step 3 and possibly step 2. I used findHomography() from the OpenCV library to grab my homography between the two images. Then I called warpPerspective() on one of my images using the homgraphy.
The problem with the approach is that the transformed image is all distorted. Also it seems to only transform a certain part of the image. I have no idea why it is not transforming the whole image.
Can someone give me some advice on how I should approach this problem? Thanks
In the results that you have posted, I can see that you have at least one keypoint mismatch. If you use findHomography(src, dst, 0), it will mess up your homography. You should use findHomography(src, dst, CV_RANSAC) instead.
You can also try to use warpAffine instead of warpPerspective.
Edit: In the results that you posted in the comments to your question, I had the impression that the matching worked quite stable. That means that you should be able to get good results with the example as well. Since you mostly seem to have to deal with translation you could try to filter out the outliers with the following sketched algorithm:
calculate the average (or median) motion vector x_avg
calculate the normalized dot product <x_avg, x_match>
discard x_match if the dot product is smaller than a threshold
To make it work for images with smaller overlap, you would have to look at the detector, descriptors and matches. You do not specify which descriptors you work with, but I would suggest using SIFT or SURF descriptors and the corresponding detectors. You should also set the detector parameters to make a dense sampling (i.e., try to detect more features).
You can refer to this answer which is slightly related: OpenCV - Image Stitching
To stitch images using Homography, the most important thing that should be taken care of is finding of correspondence points in both the images. Lesser the outliers in the correspondence points, the better is the generated homography.
Using robust techniques such as RANSAC along with FindHomography() function of OpenCV(Use CV_RANSAC as option) will still generate reasonable homography provided percentage of inliers is more than percentage of outliers. Also make sure that there are at-least 4 inliers in the correspondence points that passed to the FindHomography function.
Related
I have an application where I have to detect the presence of some items in a scene. The items can be rotated and a little scaled (bigger or smaller). I've tried using keypoint detectors but they're not fast and accurate enough. So I've decided to first detect edges in the template and the search area, using Canny ( or a faster edge detection algo ), and then match the edges to find the position, orientation, and size of the match found.
All this needs to be done in less than a second.
I've tried using matchTemplate(), and matchShape() but the former is NOT scale and rotation invariant, and the latter doesn't work well with the actual images. Rotating the template image in order to match is also time consuming.
So far I have been able to detect the edges of the template but I don't know how to match them with the scene.
I've already gone through the following but wasn't able to get them to work (they're either using old version of OpenCV, or just not working with other images apart from those in the demo):
https://www.codeproject.com/Articles/99457/Edge-Based-Template-Matching
Angle and Scale Invariant template matching using OpenCV
https://answers.opencv.org/question/69738/object-detection-kinect-depth-images/
Can someone please suggest me an approach for this? Or a code snipped for the same if possible ?
This is my sample input image ( the parts to detect are marked in red )
These are some software that are doing this and also how I want it should be:
This topic is what I am actually dealing for a year on a project. So I will try to explain what my approach is and how I am doing that. I assume that you already did the preprocess steps(filters,brightness,exposure,calibration etc). And be sure you clean the noises on image.
Note: In my approach, I am collecting data from contours on a reference image which is my desired object. Then I am comparing these data with the other contours on the big image.
Use canny edge detection and find the contours on reference
image. You need to be sure here about that it shouldn't miss some parts of
contours. If it misses, probably preprocess part should have some
problems. The other important point is that you need to find an
appropriate mode of findContours because every modes have
different properties so you need to find an appropriate one for your
case. At the end you need to eliminate the contours which are okey
for you.
After getting contours from reference, you can find the length of
every contours using outputArray of findContours(). You can compare
these values on your big image and eliminate the contours which are
so different.
minAreaRect precisely draws a fitted, enclosing rectangle for
each contour. In my case, this function is very good to use. I am
getting 2 parameters using this function:
a) Calculate the short and long edge of fitted rectangle and compare the
values with the other contours on the big image.
b) Calculate the percentage of blackness or whiteness(if your image is
grayscale, get a percentage how many pixel close to white or black) and
compare at the end.
matchShape can be applied at the end to the rest of contours or you can also apply to all contours(I suggest first approach). Each contour is just an array so you can hold the reference contours in an array and compare them with the others at the end. After doing 3 steps and then applying matchShape is very good on my side.
I think matchTemplate is not good to use directly. I am drawing every contour to a different mat zero image(blank black surface) as a template image and then I compare with the others. Using a reference template image directly doesnt give good results.
OpenCV have some good algorithms about finding circles,convexity etc. If your situations are related with them, you can also use them as a step.
At the end, you just get the all data,values, and you can make a table in your mind. The rest is kind of statistical analysis.
Note: I think the most important part is preprocess part. So be sure about that you have a clean almost noiseless image and reference.
Note: Training can be a good solution for your case if you just want to know the objects exist or not. But if you are trying to do something for an industrial application, this is totally wrong way. I tried YOLO and haarcascade training algorithms several times and also trained some objects with them. The experiences which I get is that: they can find objects almost correctly but the center coordinates, rotation results etc. will not be totally correct even if your calibration is correct. On the other hand, training time and collecting data is painful.
You have rather bad image quality very bad light conditions, so you have only two ways:
1. To use filters -> binary threshold -> find_contours -> matchShape. But this very unstable algorithm for your object type and image quality. You will get a lot of wrong contours and its hard to filter them.
2. Haarcascades -> cut bounding box -> check the shape inside
All "special points/edge matching " algorithms will not work in such bad conditions.
I have 7 images from gopro (5 cameras in rig and one for top and one for bottom, They all are gopro camera). I want to stitch all these images together to create a 3d panorama. I have been able to stitch 5 images in Rig by using opencv stitching_detailed.cpp. Link to file:
https://raw.githubusercontent.com/opencv/opencv/master/samples/cpp/stitching_detailed.cpp
But I'm not sure how to stitch top and bottom (For me right now bottom is not that important but i have to do something about top). Any idea how this can be done? Please let me know if i can use same stitching_detailed.cpp to stitch top also.
Following Link contains images which I'm using. It also contains results i got from stitching images in rig.
https://drive.google.com/folderview?id=0B_Bl8s2ePunQcnBaM3A4WDlDcXM&usp=sharing
So first you need to understand how stitching_detailed.cpp works.
1. features keypoint is detected in each image using SURF/ORB/SIFT or so. Then for each image pair, best feature matches are found and homography matrix is calculated and number of inliers for each pair is obtained
(*finder)(img, features[i]);
BestOf2NearestMatcher matcher(try_cuda, match_conf);
matcher(features, pairwise_matches);
All these pairs are pased into leaveBiggestComponent to obtain largest set of images, which belong to a panorama.
camera parameters or each image is calculated from above obtained set and warping and blending is done.
step 1 will find homography for each pair and generate number of inliers. Step 2 will remove all those image pair for which confidence factor(number of inliers) is less than threshold. Since cam7 img has so less features and almost no overlapping region with any other image, it will get rejected in leavebiggestcomponent step.
You can see features and mtaching in this link (I have used orbfeatures)
https://drive.google.com/open?id=0B2wDitsftUG9QnhCWFIybENkbDA
Also I have not changed the image size, but i guess reducing the image size a bit(by half maybe) will yield more feature points
What you can do is reduce the time interval in which you take frame for stitching. To obtain good results, there must be atleast 40% overlapping region between images.
I'm new to CV, and trying to stitch together a video of two cameras which are stationary one relative to the other. The details:
The cameras are one beside the other and I can adjust the rotation angle between them. The cameras will be moving with respect to the world, so the scene will be changing.
The amount of frames to be stitched is roughly 300 (each frame is composed of two pictures, one from each camera).
I don't need to do the stitching in real time, but I want to do it as fast as possible using the fact that I know the relative positions of the cameras. Resolution of each picture is relatively high, around 900x600.
Right now I'm at the stage where I have code to stitch 2 single pictures, courtesy of http://ramsrigoutham.com/2012/11/22/panorama-image-stitching-in-opencv/
The main stages are:
Using SURF detector to find SURF descriptor in both images
matching the SURF descriptor using FLANN Matcher
Postprocessing matches to find good matches
Using RANSAC to estimate the Homography matrix using the matched SURF descriptors
Warping the images based on the homography matrix
My question is: How can I optimize the process based on the fact that I already know the camera positions?
Ideally I would like to do some initial calculation once to find the transform between the camera perspectives, and then reuse it. But not sure with my rudimentary CV knowledge if this is indeed possible, and what transform I could use if so.
I understand that calculating the homography matrix once and reusing it won't work, since the scene is changing.
Two other possibilities:
I found a similar case (but stationary scene) where the transform is computed once and reused. Which transform is this, and could it work in my case?
The other possibility I found is to use the initial knowledge to find the overlapping region between two pictures, and ignore the rest of the pictures to save time. Relevant thread
Any help would be greatly appreciated!
Ron
I am doing object detection using feature extraction (sift,orb).
I want to extract ORB feature from different point of view of the object (train images) and then matching all of them with a query image.
The problem I am facing is: how can I create a good homography from keypoint coming from different point of view of the image that have of course different sizes?
Edit
I was thinking to create an homography for each train images that got say 3-4 matches and then calculate some "mean" homography...
The probleam arise when you have for example say just 1-2 matches from each train image, at that point you cannot create not even 1 homography
Code for create homography
//> For each train images with at least some good matches ??
H = findHomography( train, scene, CV_RANSAC );
perspectiveTransform( trainCorners, sceneCorners, H);
I think there is no point on doing that as a pair of images A and B has nothing to do with a pair of images B and C when you talk about homography. You will get different sets of good matches and different homographies, but homographies will be unrelated and no error minimization would have a point.
All minimization has to be within matches, keypoints and descriptors considering just the pair of images.
There is an idea similar to what you ask in FREAK descriptor. You can train the selected pairs with a set of images. That means that FREAK will decide the best pattern for extracting descriptors basing on a set of images. After this training you are supposed to find more robust mathces that will give you a better homography.
To find a good homography you need accurate matches of your keypoints. You need 4 matches.
The most common methos is DLT combined with RANSAC. DLT is a linear transform that finds the homography 3x3 matrix that proyects your keypoints into the scene. RANSAC finds the best set of inliers/outliers that satisfies the mathematicl model, so it will find the best 4 points as input of DLT.
EDIT
You need to find robust keypoints. SIFT is supossed to do that, scale and perspective invariant. I don't think you need to train with different images. Finding a mean homography has no point. You need to find an only homography for an object detected, and that homography will be the the transformation between the marker and the object detected. Homography is precise, there is no point on finding a mean.
Have you tried the approach of getting keypoints from views of the object: train_kps_1, train_kps_2... then match those arrays with the scene, then select the best matches from those several arrays resulting in a single array of good matches. And finally use that result in find homography as the 'train'.
The key here is how to select the best matches, which is a different question, which you can find a nice anwser here:
http://answers.opencv.org/question/15/how-to-get-good-matches-from-the-orb-feature/
And maybe here:
http://answers.opencv.org/question/2493/best-matches/
I am new to OpenCV. I would like to know if we can compare two images (one of the images made by photoshop i.e source image and the otherone will be taken from the camera) and find if they are same or not.
I tried to compare the images using template matching. It does not work. Can you tell me what are the other procedures which we can use for this kind of comparison?
Comparison of images can be done in different ways depending on which purpose you have in mind:
if you just want to compare whether two images are approximately equal (with a few
luminance differences), but with the same perspective and camera view, you can simply
compute a pixel-to-pixel squared difference, per color band. If the sum of squares over
the two images is smaller than a threshold the images match, otherwise not.
If one image is a black-white variant of the other, conversion of the color images is
needed (see e.g. http://www.johndcook.com/blog/2009/08/24/algorithms-convert-color-grayscale). Afterwarts simply perform the step above.
If one image is a subimage of the other, you need to perform registration of the two
images. This means determining the scale, possible rotation and XY-translation that is
necessary to lay the subimage on the larger image (for methods to register images, see:
Pluim, J.P.W., Maintz, J.B.A., Viergever, M.A. , Mutual-information-based registration of
medical images: a survey, IEEE Transactions on Medical Imaging, 2003, Volume 22, Issue 8,
pp. 986 – 1004)
If you have perspective differences, you need an algorithm for deskewing one image to
match the other as well as possible. For ways of doing deskewing look for example in
http://javaanpr.sourceforge.net/anpr.pdf from page 15 and onwards.
Good luck!
You should try SIFT. You apply SIFT to your marker (image saved in memory) and you get some descriptors (points robust to be recognized). Then you can use FAST algorithm with the camera frames in order to find the coprrespondent keypoints of the marker in the camera image.
You have many threads about this topic:
How to get a rectangle around the target object using the features extracted by SIFT in OpenCV
How to search the image for an object with SIFT and OpenCV?
OpenCV - Object matching using SURF descriptors and BruteForceMatcher
Good luck