checking duplicate images with ORB - python-2.7

I am currently working on detecting duplicate images using ORB. The first part is almost complete: I have the descriptor vectors of both images. For the second part, I want to know how to calculate a score from the Hamming distances, and what threshold to use for deciding that two images are duplicates.
img1 = gray_image15
img2 = gray_image25
# Initiate ORB detector
orb = cv2.ORB_create()
# find the keypoints with ORB
kp1 = orb.detect(img1,None)
kp2 = orb.detect(img2,None)
# compute the descriptors with ORB
kp1, des1 = orb.compute(img1, kp1)
kp2, des2 = orb.compute(img2, kp2)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(des1, des2)
# Sort them in the order of their distance.
matches = sorted(matches, key = lambda x:x.distance)
I just want to know the next step in this process so that I can ultimately print yes or no for duplicates. I am using OpenCV 3.0.0 with Python 2.7.

Once you obtain the descriptors, you can use a bag-of-words model to cluster the descriptors of the reference image, that is, build a vocabulary (visual words).
Then project the descriptors of the other image on to this vocabulary.
Then you can obtain a histogram showing the distribution of each of the visual words in the two images.
Compare these two histograms using a histogram comparison technique and use a threshold to detect the duplicates. For example, if you use Bhattacharyya distance, a low value means a good match.
I don't have a python implementation of this, but you can find something similar in c++ here.
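The steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the vocabulary is taken as already built (for example by k-means), descriptors are plain lists of numbers, the function names are hypothetical, and the threshold of 0.2 is an arbitrary placeholder that has to be tuned on image pairs you have labelled yourself. The distance used is the Hellinger form of the Bhattacharyya distance, where 0 means identical histograms.

```python
import math


def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary entry closest to the descriptor."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(vocabulary)),
               key=lambda k: sq_dist(descriptor, vocabulary[k]))


def bow_histogram(descriptors, vocabulary):
    """Count how often each visual word is the nearest one, then normalize."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = float(sum(counts))
    return [c / total for c in counts] if total else [0.0] * len(counts)


def hellinger_distance(p, q):
    """Bhattacharyya-based distance of two normalized histograms (0 = identical)."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - bc))


def is_duplicate(des1, des2, vocabulary, threshold=0.2):
    """Declare a duplicate when the histogram distance is below the threshold."""
    # threshold is a placeholder assumption; tune it on labelled pairs
    h1 = bow_histogram(des1, vocabulary)
    h2 = bow_histogram(des2, vocabulary)
    return hellinger_distance(h1, h2) < threshold
```

With real ORB output, des1 and des2 would be the descriptor arrays from the question; since those are binary descriptors, a Hamming-based nearest-word search may be a better fit than the squared Euclidean distance used in this sketch.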


Measure vertical distance of binarized image (Open CV) C++

This should be straightforward, but I am not very familiar with OpenCV.
Can someone suggest a method to measure the distance in pixels (red line) as shown in the image below? Preferably it would have options such as the width of the measurement (as shown at the beginning and end of the red line). This kind of measurement is very common in software like ImageJ, so I imagine it should be fairly trivial to do in OpenCV.
I would also like to take several samples across the image width.
I am using OpenCV and still learning about it.
Your task is quite simple:
Optional smoothing (Gaussian filter); you have to experiment with your data to see if it helps.
Edge detection (transforms the image into lines representing edges), for example cv::Canny.
Hough transform to detect lines (available in OpenCV).
Find the two maximum values (longest lines) in the Hough transform.
You will then have the equations of two straight lines, which you can use to calculate the distance between them.
Note that with this approach the image doesn't have to be straight. You will have the line equations, which you have to manipulate carefully. If the two lines are parallel, there is a simple formula for the distance between them. If they are not perfectly parallel, you have to take this into account and use information about the image area to get an average distance.
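For the parallel case, the "simple formula" can be sketched as below. This assumes the lines are given in the (rho, theta) normal form that a standard Hough transform produces, where a line satisfies x*cos(theta) + y*sin(theta) = rho. The function name and the 2-degree tolerance are illustrative assumptions, not part of any library.

```python
import math


def parallel_line_distance(line_a, line_b, max_angle_diff=math.radians(2.0)):
    """Distance between two (rho, theta) lines, or None if not near-parallel."""
    rho_a, theta_a = line_a
    rho_b, theta_b = line_b
    # (rho, theta) and (-rho, theta +/- pi) describe the same line, so bring
    # the second line into the same orientation as the first one
    if theta_b - theta_a > math.pi / 2:
        rho_b, theta_b = -rho_b, theta_b - math.pi
    elif theta_a - theta_b > math.pi / 2:
        rho_b, theta_b = -rho_b, theta_b + math.pi
    if abs(theta_a - theta_b) > max_angle_diff:
        return None  # not parallel enough for the simple formula
    return abs(rho_a - rho_b)
```

For two horizontal edges at y = 10 and y = 50 (theta = pi/2), this returns 40 pixels. When the function returns None, the lines are not parallel enough and an averaged distance over the image area is needed instead, as described above.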
A simple way to find the width of the channel would be the following:
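distance = []
h = img.shape[0]
for j in range(img.shape[1]):
    line_top = 0
    line_bottom = img.shape[0]
    found_top = False
    found_bottom = False
    for i in range(h):
        # scan column j from the top and from the bottom at the same time
        if img[i,j,0] > 0 and not found_top:
            line_top = i
            found_top = True
        if img[h-i-1,j,0] > 0 and not found_bottom:
            line_bottom = h-i
            found_bottom = True
        if found_top and found_bottom:
            # presumably: record this column's distance and stop scanning
            distance.append(line_bottom - line_top)
            break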
But this would cause the distance to take into account the very small white speckles.
To solve this there are several options:
Preprocess the image using opencv morphological transformation.
Preprocess the image using opencv gaussian filter or similar.
Update the code to use a larger window.
Another solution would be to apply opencv's findContours.
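The "larger window" option could look like the following sketch, which only accepts an edge once a run of several consecutive white pixels is found, so isolated speckles are skipped. It works on a single column given as a sequence of pixel values; the function names and the default run length of 3 are assumptions for illustration.

```python
def first_run_start(column, min_run=3):
    """Index where the first run of at least min_run non-zero pixels starts."""
    run = 0
    for i, value in enumerate(column):
        run = run + 1 if value > 0 else 0
        if run == min_run:
            return i - min_run + 1
    return None  # no run of the required length in this column


def column_width(column, min_run=3):
    """Span between the first and the last solid run in one image column."""
    top = first_run_start(column, min_run)
    bottom = first_run_start(column[::-1], min_run)
    if top is None or bottom is None:
        return None
    return (len(column) - bottom) - top
```

Applied to every column (for example img[:, j, 0] in the loop above), this gives one width sample per column while ignoring speckles shorter than min_run pixels.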

extracting features from all bounding boxes of Fast R-CNN

I just installed Fast RCNN and have run the demo,
and I came to wonder if it's possible to extract features from all bounding boxes in the image (and do this for the entire dataset).
For example, if Fast RCNN detects cat, dog, and a car from an image,
I'd like to extract separate CNN features for each of cat, dog, and car.
And do this for tens of thousands of images.
The feature extraction example on Fast RCNN's GitHub seems to replicate Caffe's feature extraction for the entire image, not for each bounding box.
Could anyone help me on this?
Apparently, feature extraction for each bounding box is done in the following part of the code from
# When mapping from image ROIs to feature map ROIs, there's some aliasing
# (some distinct image ROIs get mapped to the same feature ROI).
# Here, we identify duplicate feature ROIs, so we only compute features
# on the unique subset.
if cfg.DEDUP_BOXES > 0:
    v = np.array([1, 1e3, 1e6, 1e9, 1e12])
    hashes = np.round(blobs['rois'] * cfg.DEDUP_BOXES).dot(v)
    _, index, inv_index = np.unique(hashes, return_index=True,
                                    return_inverse=True)
    blobs['rois'] = blobs['rois'][index, :]
    boxes = boxes[index, :]

# reshape network inputs
net.blobs['data'].reshape(*(blobs['data'].shape))
net.blobs['rois'].reshape(*(blobs['rois'].shape))
blobs_out = net.forward(data=blobs['data'].astype(np.float32, copy=False),
                        rois=blobs['rois'].astype(np.float32, copy=False))
if cfg.TEST.SVM:
    # use the raw scores before softmax under the assumption they
    # were trained as linear SVMs
    scores = net.blobs['cls_score'].data
else:
    # use softmax estimated probabilities
    scores = blobs_out['cls_prob']

if cfg.TEST.BBOX_REG:
    # Apply bounding-box regression deltas
    box_deltas = blobs_out['bbox_pred']
    pred_boxes = _bbox_pred(boxes, box_deltas)
    pred_boxes = _clip_boxes(pred_boxes, im.shape)
else:
    # Simply repeat the boxes, once for each class
    pred_boxes = np.tile(boxes, (1, scores.shape[1]))

if cfg.DEDUP_BOXES > 0:
    # Map scores and predictions back to the original set of boxes
    scores = scores[inv_index, :]
    pred_boxes = pred_boxes[inv_index, :]

return scores, pred_boxes
I'm trying to figure out how to tweak this to save the features, as we do with Caffe for the features of entire images, which are saved to an mdb file.
While determining the right bounding boxes, Fast-RCNN extracts CNN features from a large number (~800-2000) of image regions, called object proposals. These regions are obtained through different algorithms, typically selective search. It then uses those features to recognize the "right" proposals and to find the "right" bounding boxes; the latter step is called bounding box regression.
Of course Fast-RCNN optimizes this process, but it still has to extract CNN features from many more regions than the ones related to the objects of interest.
In short, if you were to save the variable blobs_out in the code snippet you pasted, you would save the results for all the object proposals, including the "wrong" ones. Note that blobs_out holds only the network's output blobs (cls_prob and bbox_pred in the snippet); the activations of an intermediate layer can be read from net.blobs[...].data in the same way the snippet reads cls_score. You can save all of that and then try to prune it and retrieve only the desired features. To save the features, just use pickle.dump().
Look at the end of the test_net function, here. The nms_dets variable seems to store the final boxes. There may be a way to take the blobs_out you stored and discard the undesired features, but it doesn't seem straightforward.
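As a rough sketch of the "save everything, then prune" idea: the example below keeps only the feature rows of the proposals that survived filtering and stores them with pickle. The toy data, the helper names, and the assumption that you know the indices of the kept proposals are all hypothetical; recovering those indices from Fast-RCNN's own variables is the part that is not straightforward.

```python
import os
import pickle
import tempfile


def keep_rows(features, keep_indices):
    """Select the feature rows that belong to the boxes that survived filtering."""
    return [features[i] for i in keep_indices]


def save_features(path, features):
    """Write the features to disk with pickle."""
    with open(path, 'wb') as f:
        pickle.dump(features, f, protocol=2)  # protocol 2 also loads on Python 2.7


def load_features(path):
    """Read back features written by save_features."""
    with open(path, 'rb') as f:
        return pickle.load(f)


# toy data: one feature row per proposal, plus the indices of the kept proposals
all_features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
kept = keep_rows(all_features, [0, 2])
path = os.path.join(tempfile.mkdtemp(), 'features.pkl')
save_features(path, kept)
restored = load_features(path)
```

For tens of thousands of images, writing one file per image (or per batch) keeps each pickle small enough to load on its own.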
The simplest solution I'm able to think about is as follows.
Let Fast-RCNN compute the final bounding boxes. Then extract the corresponding image patches with something like the following (I'm assuming Python):
import cv2
img = cv2.imread('/path/to/image')
for bbox in bboxes_list:
    # slicing needs integer pixel coordinates
    x0, y0, x1, y1 = map(int, bbox)
    cut = img[y0:y1, x0:x1]
The feature extraction is then identical to the entire-image case:
import caffe
net = caffe.Net('deploy.prototxt', 'caffemodel', caffe.TEST)
# preprocess input (net_input is the preprocessed patch)
net.blobs['data'].data[...] = net_input
# run the forward pass so that the layer activations are computed
net.forward()
feats = net.blobs['my_layer'].data.copy()
Of course this method is computationally expensive, since you basically compute the CNN features twice. Whether that is acceptable depends on your speed requirements and the size of the CNN model.

OpenCV hamming distance between FLANN matches

Is there a way to get the hamming distance between two matched descriptors when using the flann matcher without manually calculating it? (i.e. looping through the descriptors matched, XORing each element, then counting).
I am matching descriptors computed by ORB like so:
FlannBasedMatcher flannMatcher;
flannMatcher.match(des1, des2, matches);
If I check the distance:
cout << matches[i].distance;
I get the NORM_L2 distance, however my application requires the hamming distance.
The reason I want to do this is that I am generating train descriptors from a training image set using ORB, finding matches using the brute force matcher, then filtering poor matches based on hamming distance.
I then want to use the flann matcher to match descriptors from a webcam stream against these train descriptors (and show which of the training images most closely matches the current frame). But since the flann matcher doesn't seem to give the hamming distance, I'm stuck when it comes to filtering out poor matches, and I get many errors when choosing the train image that matches best.
OpenCV's tutorial describes how to create a FLANN-based matcher for SIFT and ORB (here is the ORB one).
When using ORB, you can pass the following. The commented values are the ones recommended by the docs, but they did not give the required results in some cases; the other values worked fine:
FLANN_INDEX_LSH = 6
index_params = dict(algorithm = FLANN_INDEX_LSH,
                    table_number = 6,      # 12
                    key_size = 12,         # 20
                    multi_probe_level = 1) # 2
Try using flann::LshIndexParams as the index parameters. This does Locality Sensitive Hashing (which is close to the Hamming distance):
FlannBasedMatcher matcher2(new flann::LshIndexParams(20,10,2));
See also the discussion here
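If the matcher still does not report the distance you need, the manual fallback mentioned in the question (XOR, then count bits) is short to write. The sketch below assumes each descriptor is a sequence of integers in the range 0-255 (a list, a bytearray, or one row of a uint8 array) and that matches are given as (query index, train index) pairs; both helper names are hypothetical.

```python
def hamming_distance(desc_a, desc_b):
    """Number of differing bits between two equal-length byte descriptors."""
    if len(desc_a) != len(desc_b):
        raise ValueError('descriptors must have the same length')
    return sum(bin(int(a) ^ int(b)).count('1') for a, b in zip(desc_a, desc_b))


def filter_matches(matches, query_desc, train_desc, max_distance):
    """Keep the matches whose recomputed Hamming distance is small enough."""
    kept = []
    for query_idx, train_idx in matches:
        d = hamming_distance(query_desc[query_idx], train_desc[train_idx])
        if d <= max_distance:
            kept.append((query_idx, train_idx, d))
    return kept
```

With OpenCV matches, the index pairs would come from each match's queryIdx and trainIdx fields. The cutoff max_distance is application-specific and has to be chosen by experiment.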

Hu moments and SVM does not work

I have run into a problem when trying to train an SVM.
I extract various regions (sets of connected pixels) from face images. The regions from eyes are very similar to one another, so I want to use Hu moments for shape description and an SVM for classification.
But the SVM does not work properly: svm.predict afterwards classifies everything as non-eye. Even the regions that were labeled as eye and used in the training phase are classified as non-eye.
The feature data consists only of the 7 Hu moments. I will post some samples of the source code in a moment, thanks in advance :)
Additional info:
input image:
Setting up basic svm for 1 image:
int image_regions = 10;
Mat training_mat(image_regions ,7,CV_32FC1); // 7 hu moments
Mat labels(image_regions ,1,CV_32FC1); // for labels 1 (eye) and -1 (non eye)
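// computing hu moments
Moments moments2 = moments(croppedImage, false);
double hu[7];
HuMoments(moments2, hu); // presumably: fill hu from the moments
// putting them into svm training mat
for (int k = 0; k < huCounter; k++)
    training_mat.at<float>(counter, k) = hu[k]; // counter is current number of region
if (isEye(...))
    labels.at<float>(counter, 0) = 1;  // presumably: label the region as eye
else
    labels.at<float>(counter, 0) = -1; // presumably: label the region as non-eye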
//I use the following:
CvSVM svm;
CvSVMParams params;
params.svm_type = CvSVM::C_SVC;
params.kernel_type = CvSVM::LINEAR;
params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER, 1000, 1e-6);
// ... do the above mentioned phase, and then:
svm.train(training_mat, labels, Mat(), Mat(), params);
I hope the following suggestions help.
The simplest approach is to use a clustering algorithm and try to cluster the data into two classes. If an algorithm like k-means can do the job, why make things complex by using SVMs and neural nets? I suggest this technique because both your feature vector dimension (7 Hu moments) and your number of samples are very small.
Perform feature normalization (zero mean and unit variance, as described among the SVM points below) to make sure the values fall in a limited range.
Check whether your data is really separable. As your data set is small, take a few samples from positive images and a few from negative images and plot the feature vectors. If you can visually see the difference, a learning algorithm can surely do the job for you. As I said earlier, simple tricks can do better than complex math.
Only if you then decide to use an SVM, you should know the following:
• As I can see from your code, you are using a linear SVM; maybe your data is not separable by a linear kernel. Try a polynomial kernel or other kernels. OpenCV also offers CvSVM::train_auto, which is worth a look.
• Check whether the feature vector values you are getting are proper values (make sure that they are not garbage values).
• You can also perform feature normalization (zero mean and unit variance) before you use the features for training.
• Most importantly, increase the number of training images, both positively and negatively labeled.
• Last but not least, an SVM is not magic; at the end of the day it just draws a line between two sets of points. So don't expect it to classify anything you give it as input.
If nothing works, just improve your feature extraction technique.
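The zero-mean, unit-variance normalization mentioned above can be sketched as follows. The statistics are computed on the training rows only and then reused for any row passed to prediction; the function names are illustrative and each row is assumed to be a list of the 7 Hu moments (or any fixed-length feature vector).

```python
import math


def fit_standardizer(rows):
    """Compute per-feature mean and standard deviation from the training rows."""
    n = float(len(rows))
    dims = len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(dims)]
    stds = []
    for k in range(dims):
        var = sum((r[k] - means[k]) ** 2 for r in rows) / n
        stds.append(math.sqrt(var) or 1.0)  # constant features are left unscaled
    return means, stds


def standardize(rows, means, stds):
    """Apply the training statistics to any set of rows (training or test)."""
    return [[(r[k] - means[k]) / stds[k] for k in range(len(means))]
            for r in rows]
```

Hu moments typically differ by several orders of magnitude, so without a step like this the largest moment tends to dominate a linear kernel.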

BoW in OpenCV using precomputed features

I need to do BOW (bag of words), but I only have the keypoint descriptors of the images, not the images themselves.
For the moment, I have obtained the vocabulary using:
cv::BOWKMeansTrainer bowtrainerCN(numCenters); //num clusters
cv::Mat vocabularyCN = bowtrainerCN.cluster();
So now I need to do the assignment, but I can't use the compute function because it calculates the descriptors of the images and I already have those. Is there a function that does the assignment, or do I have to compute it manually?
Once you have built the vocabulary (codebook) using the cv::BOWKMeansTrainer::cluster() method, you can match a descriptor (of suitable size and type) against the codebook. You first have to choose the type of matcher you need and the norm to use (see the OpenCV doc).
For example, with cv::BFMatcher and L2 norm
// init the matcher with your pre-trained codebook
cv::Ptr<cv::DescriptorMatcher > matcher = new cv::BFMatcher(cv::NORM_L2);
matcher->add(std::vector<cv::Mat>(1, vocabulary));
// matches
std::vector<cv::DMatch> matches;
Then the index of the closest codeword in your codebook for the new_descriptors[i] will be