OpenCV hamming distance between FLANN matches - c++

Is there a way to get the hamming distance between two matched descriptors when using the flann matcher without calculating it manually (i.e. looping through the matched descriptors, XORing each element, and counting the set bits)?
I am matching descriptors computed by ORB like so:
FlannBasedMatcher flannMatcher;
flannMatcher.match(des1, des2, matches);
If I check the distance:
cout << matches.at(0).distance;
I get the NORM_L2 distance, but my application requires the hamming distance.
Context:
The reason I want to do this is that I am generating train descriptors from a training image set using ORB, finding matches using the brute force matcher, then filtering poor matches based on hamming distance.
I then want to use the flann matcher to match descriptors from a webcam stream against these train descriptors (and show which of the training images most closely matches the current frame), but since the flann matcher doesn't seem to give the hamming distance I'm stuck when it comes to filtering out poor matches, and I get a lot of errors when choosing the train image that matches best.

OpenCV's tutorial describes how to create a FLANN-based matcher for SIFT and for ORB (here is the ORB part).
When using ORB, you can pass the following index parameters. The commented values are the ones recommended in the docs, but they did not give the required results in some cases; other values worked fine:
FLANN_INDEX_LSH = 6
index_params = dict(algorithm = FLANN_INDEX_LSH,
                    table_number = 6,      # 12
                    key_size = 12,         # 20
                    multi_probe_level = 1) # 2
https://docs.opencv.org/3.4/dc/dc3/tutorial_py_matcher.html
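For the C++ FlannBasedMatcher, those tutorial parameters can be passed through flann::LshIndexParams; a minimal sketch (the SearchParams value is an arbitrary choice, not from the tutorial):
#include <opencv2/features2d.hpp>
#include <opencv2/flann.hpp>

// LSH index -> binary descriptors are compared with the Hamming distance.
// (6, 12, 1) mirrors table_number, key_size, multi_probe_level above.
cv::FlannBasedMatcher matcher(cv::makePtr<cv::flann::LshIndexParams>(6, 12, 1),
                              cv::makePtr<cv::flann::SearchParams>(50));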

Try using flann::LshIndexParams as the index parameters. This does Locality Sensitive Hashing, which works on binary descriptors with the Hamming distance (the search itself is approximate).
FlannBasedMatcher matcher2(new flann::LshIndexParams(20,10,2));
See also the discussion here
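As a minimal sketch of how this fits the original ORB setup (the distance threshold of 50 is an arbitrary illustrative value):
#include <opencv2/core.hpp>
#include <opencv2/features2d.hpp>
#include <opencv2/flann.hpp>
#include <vector>

// Match CV_8U ORB descriptors via a FLANN LSH index; DMatch::distance is then
// a Hamming distance and can be used to drop poor matches.
std::vector<cv::DMatch> matchAndFilter(const cv::Mat& des1, const cv::Mat& des2)
{
    cv::FlannBasedMatcher flannMatcher(cv::makePtr<cv::flann::LshIndexParams>(20, 10, 2));

    std::vector<cv::DMatch> matches;
    flannMatcher.match(des1, des2, matches);

    std::vector<cv::DMatch> good;
    for (const cv::DMatch& m : matches)
        if (m.distance < 50.0f)          // illustrative Hamming threshold
            good.push_back(m);
    return good;
}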

Related

Fast matching of binary descriptors using flann

I want to match a set of binary descriptors (query data) against a larger set of binary descriptors (train data).
The matching should be done fast. I decided to try the FlannBasedMatcher of OpenCV. The matcher supports a variety of algorithms; for binary descriptors, Multi-Probe LSH is implemented.
FlannBasedMatcher matcher (new flann::LshIndexParams(12,10,2));
std::vector<DMatch> matches;
matcher.knnMatch(query, train, matches, 2);
(I tried different LshIndexParams settings)
The problem is that FLANN matching with LSH is very slow compared to brute-force matching, or compared to FLANN matching with KDTreeIndexParams on float descriptors.
It also looks like some others experienced the same problem (see 1, 2).
In one of the answers it is suggested to try Hierarchical Clustering.
That should be faster than LSH.
I would like to use the same FlannBasedMatcher Interface as with LSH.
FlannBasedMatcher matcher (new flann::HierarchicalClusteringIndexParams());
std::vector<DMatch> matches;
matcher.knnMatch(query, train, matches, 2);
However this doesn't work with binary descriptors (see the error message):
OpenCV Error: Unsupported format or combination of formats
But the "raw" Hierarchical Clustering interface supports multiple distance types and it is possible to choose the hamming distance.
cv::flann::Index tree(train, cv::flann::HierarchicalClusteringIndexParams(), FLANN_DIST_HAMMING);
cv::Mat indices, dists;
tree.knnSearch(query, indices, dists, 2, cv::flann::SearchParams());
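For reference, a minimal sketch of folding that raw knnSearch output back into knnMatch-style DMatch lists (assuming, as in OpenCV's miniflann, that Hamming distances come back as 32-bit integers):
#include <opencv2/core.hpp>
#include <opencv2/flann.hpp>
#include <vector>

// Wrap cv::flann::Index::knnSearch results into the knnMatch-style structure
// the rest of a FlannBasedMatcher pipeline expects.
std::vector<std::vector<cv::DMatch>> hammingKnn(const cv::Mat& train,
                                                const cv::Mat& query, int k)
{
    cv::flann::Index tree(train, cv::flann::HierarchicalClusteringIndexParams(),
                          cvflann::FLANN_DIST_HAMMING);

    cv::Mat indices, dists;
    tree.knnSearch(query, indices, dists, k, cv::flann::SearchParams());

    std::vector<std::vector<cv::DMatch>> matches(query.rows);
    for (int i = 0; i < query.rows; ++i)
        for (int j = 0; j < k; ++j)                       // dists is CV_32S for Hamming
            matches[i].emplace_back(i, indices.at<int>(i, j),
                                    (float)dists.at<int>(i, j));
    return matches;
}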
My questions are:
Is it possible to set a distance type for the FlannBasedMatcher to work with binary descriptors and with Hierarchical Clustering? Or can I define a custom flann::HierarchicalClusteringIndexParams() with Hamming distance? I would like to use the FlannBasedMatcher interface.
Is there an alternative faster method for matching binary descriptors? Better/faster than using FLANN with LSH or Hierarchical Clustering?

OpenCV transformation from two images

There is a MATLAB example that matches two images and outputs the rotation and scale:
https://de.mathworks.com/help/vision/examples/find-image-rotation-and-scale-using-automated-feature-matching.html?requestedDomain=www.mathworks.com
My goal is to recreate this example using C++. I am using the same method of keypoint detection (Harris) and the keypoints seem to be mostly identical to the ones Matlab finds. So far so good.
cv::goodFeaturesToTrack(image_grayscale, corners, number_of_keypoints, 0.01, 5, mask, 3, true, 0.04);
for (int i = 0; i < corners.size(); i++) {
    keypoints.push_back(cv::KeyPoint(corners[i], 5));
}
BRISK is used to extract features from the keypoints.
int Threshl = 120;
int Octaves = 8;
float PatternScales = 1.0f;
cv::Ptr<cv::Feature2D> extractor = cv::BRISK::create(Threshl, Octaves, PatternScales);
extractor->compute(image, mykeypoints, descriptors);
These descriptors are then matched using flannbasedmatcher.
cv::FlannBasedMatcher matcher;
matcher.match(descriptors32A, descriptors32B, matches);
Now the problem is that about 80% of my matches are wrong and unusable. For the identical set of images MATLAB returns only a couple of matches, of which only ~20% are wrong. I have tried sorting the matches in C++ based on their distance value, with no success. The values range between 300 and 700, and even the matches with the lowest distance are almost entirely incorrect.
Now 20% of good matches are enough to calculate the offset but a lot of processing power is wasted on checking wrong matches. What would be a better way to sort the correct matches or is there something obvious I am doing wrong?
EDIT:
I have switched from Harris/BRISK to AKAZE, which seems to deliver much better features and matches that can easily be sorted by their distance value. The only downside is the much higher computation time. With two 1000px wide images AKAZE needs half a minute to find the keypoints (on a PC). I reduced this by scaling down the images, which makes for an acceptable ~3-5 seconds.
The method you are using finds, for each point, a nearest neighbour no matter how close it is. Two strategies are common:
1. Match set A to set B and set B to A and keep only matches which exist in both matchings.
2. Use knnMatch with k = 2 and perform a ratio check, i.e. keep only the matches where the 1st NN is a lot closer than the 2nd NN, e.g. d1 < 0.8 * d2 (as sketched below).
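A minimal sketch of that ratio check, here using a brute-force Hamming matcher on the raw BRISK descriptors rather than the float/FLANN route from the question (the 0.8 factor is the one above, everything else is illustrative):
#include <opencv2/core.hpp>
#include <opencv2/features2d.hpp>
#include <vector>

// Keep a match only if its best neighbour is clearly closer than the
// second-best one (Lowe's ratio test).
std::vector<cv::DMatch> ratioTest(const cv::Mat& descA, const cv::Mat& descB)
{
    cv::BFMatcher matcher(cv::NORM_HAMMING);      // BRISK descriptors are binary

    std::vector<std::vector<cv::DMatch>> knn;
    matcher.knnMatch(descA, descB, knn, 2);       // two nearest neighbours per query

    std::vector<cv::DMatch> good;
    for (const auto& m : knn)
        if (m.size() == 2 && m[0].distance < 0.8f * m[1].distance)
            good.push_back(m[0]);
    return good;
}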
The MATLAB code uses SURF. OpenCV also provides SURF, SIFT and AKAZE; try one of these. SURF in particular would be interesting for a comparison.

Estimating R/T from Homography

I've been trying to calculate the features in 2 images and then pass those features back to CameraParams.R without luck. The features are calculated and matched successfully, however, the problem is passing them back to R & t.
I understand that you must decompose the Homography in order for this to be possible, which I've done using something like this: https://github.com/syilma/homography-decomp, but am I really doing it right?
Right now I'm simply using:
Matching:
vector< vector<DMatch> > matches;
Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create(algorithmName);
matcher->knnMatch( descriptors_1, descriptors_2, matches, 50 );
vector< DMatch > good_matches; // Storing good matches here
I've noticed that the good_matches isn't used anywhere. So I guess my question is, how can I pass back good_matches to cameras.R/t?
Extracting Homography:
Mat K;
cameras[img_idx].K().convertTo(K, CV_32F);
findHomography -> decomposeHomography(H, K, outputR,outputT,noarray()).
Then by utilizing the library above, I pass in the values from R & t but the response is that the homography isn't found in the 4 possible outcomes.
Am I on the right path here? It seems like decomposeHomography is a 3D solution, but findHomography is 2D?
Absolute Goal:
Refine CameraParam.R/t depending on the features found in the images.
Why? Because I'm currently passing in the .R from the device's rotation matrix, but the rotation is slightly inaccurate. See more info about it in my previous question: Refining Camera parameters and calculating errors - OpenCV
If you are using the calculated R for image stitching, then there is no need to decompose the homography. The whole stitching pipeline assumes zero translation, so it gives a perfect output for the rotation-only case, and a slight error is introduced once translation enters the camera pose. If you look into OpenCV's calculation of R from the homography, it assumes zero translation.
Mat R = K_from.inv() * pairwise_matches[pair_idx].H.inv() * K_to;
cameras[edge.to].R = cameras[edge.from].R * R;
You can find the source code in motion_estimators.cpp, in the calcRotation function.
Coming to your question of using good_matches for calculating R: the good matches are actually used to calculate the homography matrix, using the findHomography function.
So the whole process will be like:
Find matches (as you mentioned)
Find the homography matrix from these matches using findHomography (see the sketch below)
Use the calcRotation function to find R
Find the focal length using focalsFromHomography and create the intrinsic matrix
Use the warper, seam finder and blender for the final stitching output
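A minimal sketch of steps 2 and 3, mirroring the calcRotation line quoted above (the point extraction and the RANSAC threshold are illustrative assumptions, and K_from/K_to are assumed to be CV_32F as in the question):
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>
#include <opencv2/features2d.hpp>
#include <vector>

// Estimate the homography from the good matches and derive the relative
// rotation under the zero-translation assumption: R = K_from^-1 * H^-1 * K_to.
cv::Mat rotationFromMatches(const std::vector<cv::KeyPoint>& kpts1,
                            const std::vector<cv::KeyPoint>& kpts2,
                            const std::vector<cv::DMatch>& good_matches,
                            const cv::Mat& K_from, const cv::Mat& K_to)
{
    std::vector<cv::Point2f> pts1, pts2;
    for (const cv::DMatch& m : good_matches) {
        pts1.push_back(kpts1[m.queryIdx].pt);
        pts2.push_back(kpts2[m.trainIdx].pt);
    }

    cv::Mat H = cv::findHomography(pts1, pts2, cv::RANSAC, 3.0);  // step 2
    H.convertTo(H, CV_32F);                                       // match K's type

    return K_from.inv() * H.inv() * K_to;                         // step 3
}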

checking duplicate images with ORB

Currently I am working on checking for duplicate images, so I am using ORB for that. The first part is almost complete: I have the descriptor vectors of both images. Now, as the second part, I want to know how to calculate the scores using the hamming distance, and what the threshold should be for saying that these are duplicates.
img1 = gray_image15
img2 = gray_image25
# Initiate ORB detector
orb = cv2.ORB_create()
# find the keypoints with ORB
kp1 = orb.detect(img1,None)
kp2 = orb.detect(img2,None)
# compute the descriptors with ORB
kp1, des1 = orb.compute(img1, kp1)
kp2, des2 = orb.compute(img2, kp2)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(des1, des2)
# Sort them in the order of their distance.
matches = sorted(matches, key = lambda x:x.distance)
I just want to know the next step in this process so that ultimately I can print yes or no for duplicates. I am using OpenCV 3.0.0 with Python 2.7.
Once you obtain the descriptors, you can use a bag-of-words model to cluster the descriptors of the reference image, that is, build a vocabulary (visual words).
Then project the descriptors of the other image on to this vocabulary.
Then you can obtain a histogram showing the distribution of each of the visual words in the two images.
Compare these two histograms using a histogram comparison technique and use a threshold to detect the duplicates. For example, if you use Bhattacharyya distance, a low value means a good match.
I don't have a python implementation of this, but you can find something similar in c++ here.
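A rough C++ sketch of that pipeline (ORB descriptors are converted to CV_32F so that k-means and an L2 matcher can be used; the cluster count and the Bhattacharyya threshold are arbitrary illustrative values):
#include <opencv2/core.hpp>
#include <opencv2/features2d.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

// Build a vocabulary from the reference image's descriptors, project both
// images onto it, and compare the resulting visual-word histograms.
bool looksLikeDuplicate(const cv::Mat& des1, const cv::Mat& des2,
                        int numWords = 50, double maxBhattacharyya = 0.3)
{
    cv::Mat f1, f2;
    des1.convertTo(f1, CV_32F);                  // k-means needs float data
    des2.convertTo(f2, CV_32F);

    cv::BOWKMeansTrainer trainer(numWords);      // vocabulary from the reference image
    cv::Mat vocabulary = trainer.cluster(f1);

    cv::BFMatcher matcher(cv::NORM_L2);
    auto histogram = [&](const cv::Mat& desc) {  // distribution of visual words
        std::vector<cv::DMatch> matches;
        matcher.match(desc, vocabulary, matches);
        cv::Mat hist = cv::Mat::zeros(1, numWords, CV_32F);
        for (const cv::DMatch& m : matches)
            hist.at<float>(0, m.trainIdx) += 1.0f;
        cv::normalize(hist, hist, 1.0, 0.0, cv::NORM_L1);
        return hist;
    };

    // Low Bhattacharyya distance means a good match.
    double d = cv::compareHist(histogram(f1), histogram(f2), cv::HISTCMP_BHATTACHARYYA);
    return d < maxBhattacharyya;
}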

BoW in OpenCV using precomputed features

I need to do BoW (bag of words), but I only have the descriptors already computed for the images' keypoints.
For the moment, I have obtained the vocabulary using:
cv::BOWKMeansTrainer bowtrainerCN(numCenters); //num clusters
bowtrainerCN.add(allDescriptors);
cv::Mat vocabularyCN = bowtrainerCN.cluster();
So now I need to do the assignment, but I can't use the compute function because it calculates the descriptors of the images and I already have them. Is there any function to do the assignment, or do I have to compute it manually?
Once you have built the vocabulary (codebook) using the cv::BOWKMeansTrainer::cluster() method, you can match a descriptor (of suitable size and type) against the codebook. You first have to choose the type of matcher you need and the norm to use (see the OpenCV docs).
For example, with cv::BFMatcher and the L2 norm:
// init the matcher with your pre-trained codebook
cv::Ptr<cv::DescriptorMatcher> matcher = new cv::BFMatcher(cv::NORM_L2);
matcher->add(std::vector<cv::Mat>(1, vocabulary));
// match the new descriptors against the codebook
std::vector<cv::DMatch> matches;
matcher->match(new_descriptors, matches);
Then the index of the closest codeword in your codebook for the new_descriptors[i] will be
matches[i].trainIdx;
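To go from those matches to an actual BoW histogram (the assignment step the question asks about), a minimal sketch could look like this (the L1 normalization is an assumption):
#include <opencv2/core.hpp>
#include <opencv2/features2d.hpp>
#include <vector>

// Turn nearest-codeword matches into a normalized bag-of-words histogram.
cv::Mat bowHistogram(const cv::Mat& new_descriptors, const cv::Mat& vocabulary,
                     const cv::Ptr<cv::DescriptorMatcher>& matcher)
{
    std::vector<cv::DMatch> matches;
    matcher->match(new_descriptors, matches);    // matcher already holds the vocabulary

    cv::Mat hist = cv::Mat::zeros(1, vocabulary.rows, CV_32F);
    for (const cv::DMatch& m : matches)
        hist.at<float>(0, m.trainIdx) += 1.0f;   // one vote for the closest codeword

    cv::normalize(hist, hist, 1.0, 0.0, cv::NORM_L1);
    return hist;
}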