Match corresponding peaks/valleys of similar histograms in OpenCV - C++

I'm currently working on a project where I have to compare similar histograms of image intensity. These histograms are obtained from photos taken under different illumination conditions.
I know that OpenCV offers the compareHist function. However, this function only returns a similarity metric, and I'm looking for a method that matches corresponding peaks/valleys between similar histograms.
For instance, if we have two photos of the same subject, one underexposed and one with the "ideal" exposure, their histograms of intensity might look something like the image in the following URL:
http://i.stack.imgur.com/tLIGR.png
As shown by the arrows, the peaks in one histogram also exist in the other. Does anyone have a suggestion on how to match corresponding peaks?
Thank you!

You can use an implementation of DTW (https://en.wikipedia.org/wiki/Dynamic_time_warping) to compare the histograms.
Using dynamic programming, you can fill the DTW cost matrix. Then, you can trace back through the matrix to find the correspondences between different parts of the histograms.
After that, it's simply a matter of extracting only the peaks.
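For illustration, here is a minimal, self-contained DTW sketch (plain C++, no OpenCV needed), assuming each histogram is stored as a std::vector<float> with one entry per intensity bin; it fills the cost matrix by dynamic programming and then backtracks to recover which bin of one histogram corresponds to which bin of the other:
#include <algorithm>
#include <cmath>
#include <limits>
#include <utility>
#include <vector>

// Returns pairs (bin in h1, bin in h2) along the optimal DTW alignment path.
std::vector<std::pair<int, int> > dtwAlign(const std::vector<float>& h1,
                                           const std::vector<float>& h2)
{
    const size_t n = h1.size(), m = h2.size();
    const float INF = std::numeric_limits<float>::infinity();
    std::vector<std::vector<float> > D(n + 1, std::vector<float>(m + 1, INF));
    D[0][0] = 0.f;

    // Dynamic programming: D[i][j] = cost of the best alignment of h1[0..i) and h2[0..j)
    for (size_t i = 1; i <= n; ++i)
        for (size_t j = 1; j <= m; ++j) {
            float cost = std::fabs(h1[i - 1] - h2[j - 1]);
            D[i][j] = cost + std::min(D[i - 1][j - 1], std::min(D[i - 1][j], D[i][j - 1]));
        }

    // Trace back from (n, m) to recover which bins correspond to each other
    std::vector<std::pair<int, int> > path;
    size_t i = n, j = m;
    while (i > 0 && j > 0) {
        path.push_back(std::make_pair(int(i - 1), int(j - 1)));
        float diag = D[i - 1][j - 1], up = D[i - 1][j], left = D[i][j - 1];
        if (diag <= up && diag <= left) { --i; --j; }
        else if (up <= left)            { --i; }
        else                            { --j; }
    }
    return path;
}
The peaks you detect in one histogram can then be looked up in this bin-to-bin mapping to find their counterparts in the other histogram (note that the path is returned from the last bins back to the first).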

Related

C++ - Using Bag of Words for matching pictures together?

I would like to compare a picture (with its descriptors) against thousands of pictures inside a database in order to find a match (i.e. two pictures showing the same thing, but possibly rotated, a bit blurred, at a different scale, etc.).
For example: (an image of the Android logo was posted here.)
I saw on StackOverflow that computing descriptors for each picture and comparing them one to one is a very long process.
I did some research and saw that I could use an algorithm based on Bag of Words.
I don't know exactly how it works yet, but it seems good. However (I may be mistaken), isn't it only meant to detect what kind of object something is?
I would like to know whether, in your opinion, it can be a good solution for comparing a picture against thousands of pictures using descriptors like SIFT or SURF.
If yes, do you have any suggestions about how I could do that?
Thanks,
Yes, it is possible. The only thing you have to pay attention to is the computational requirement, which can be a little overwhelming. If you can narrow the search, that usually helps.
To support my answer I will extract some examples from a recent work of ours. We aimed at recognizing a painting on a museum's wall using SIFT + RANSAC matching. We have a database of all the paintings in the museum and a SIFT descriptor for each one of them. We aim at recognizing the painting in a video which can be recorded from a different perspective (all the templates are frontal) or under different lighting conditions. This image should give you an idea: on the left you can see the template and the current frame. The second image is the SIFT matching and the third shows the results after RANSAC.
Once you have the matching between your image and each SIFT descriptor in your database, you can compute a matching score, namely the ratio between matched points (after RANSAC) and the total number of keypoints. This can be repeated for each image, and the image with the best (highest) ratio can be declared the match.
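As a rough sketch of that scoring step (OpenCV 2.x C++ API assumed; matches, queryKeypoints and dbKeypoints are assumed to come from your own SIFT matching for one database image):
std::vector<cv::Point2f> queryPts, dbPts;
for (size_t i = 0; i < matches.size(); ++i) {
    queryPts.push_back(queryKeypoints[matches[i].queryIdx].pt);
    dbPts.push_back(dbKeypoints[matches[i].trainIdx].pt);
}
std::vector<uchar> inlierMask;
// RANSAC marks inliers in the mask; needs at least 4 matches
cv::findHomography(queryPts, dbPts, CV_RANSAC, 3.0, inlierMask);
int inliers = cv::countNonZero(inlierMask);
// Score for this database image: fraction of keypoints that survive RANSAC
double score = static_cast<double>(inliers) / queryKeypoints.size();
Repeating this for every image in the database and keeping the one with the best score gives the match described above.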
We used this for paintings, but I think it can be generalized for every kind of image (the Android logo you posted in the question is a fair example, I think).
Hope this helps!

Measure of accuracy in pattern recognition using SURF in OpenCV

I’m currently working on pattern recognition using SURF in OpenCV. What I have so far: I’ve written a program in C# where I can select a source image and a template which I want to find. After that I transfer both pictures into a C++ DLL where I’ve implemented a program using the OpenCV SURF detector, which returns all the keypoints and matches back to my C# program, where I try to draw a rectangle around my matches.
Now my question: Is there a common measure of accuracy in pattern recognition? Like for example number of matches in proportion to the number of keypoints in the template? Or maybe the size-difference between my match-rectangle and the original size of the template-image? What are common parameters that are used to say if a match is a “real” and “good” match?
Edit: To make my question clearer. I have a bunch of matchpoints, that are already thresholded by minHessian and distance value. After that I draw something like a rectangle around my matchpoints as you can see in my picture. This is my MATCH. How can I tell now how good this match is? I'm already calculating angle, size and color differences between my now found match and my template. But I think that is much too vague.
I am not 100% sure about what you are really asking, because what you call a "match" is vague. But since you said you already matched your SURF points and mentioned pattern recognition and the use of a template, I am assuming that, ultimately, you want to localize the template in your image and you are asking about a localization score to decide whether you found the template in the image or not.
This is a challenging problem and I am not aware that a good and always-appropriate solution has been found yet.
However, given your approach, what you could do is analyze the density of matched points in your image: consider local or global maxima as possible locations for your template (global if you know your template appears only once in the image, local if it can appear multiple times) and use a threshold on the density to decide whether or not the template appears. A sketch of the algorithm could be something like this (a rough code sketch follows the list):
Allocate a floating-point density map the size of your image
Compute the density map by increasing it by a fixed amount in the neighborhood of each matched point (for instance, for each matched point, add a fixed value epsilon in the rectangle you are displaying in your question)
Find the global or local maxima of the density map (the global maximum can be found with the OpenCV function minMaxLoc, and local maxima can be found using morphological operations, e.g. How can I find local maxima in an image in MATLAB?)
For each maximum obtained, compare the corresponding density value to a threshold tau, to decide whether your template is there or not
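A rough translation of those steps into OpenCV code might look like this (only a sketch, not the answerer's actual code; imageSize, matchedPoints, rectSize and tau are assumed names from your own pipeline):
cv::Mat density = cv::Mat::zeros(imageSize, CV_32FC1);
const float epsilon = 1.0f;  // fixed amount added per matched point
for (size_t i = 0; i < matchedPoints.size(); ++i) {
    // Neighborhood of the matched point: the rectangle shown in the question
    cv::Rect roi(matchedPoints[i].x - rectSize.width / 2,
                 matchedPoints[i].y - rectSize.height / 2,
                 rectSize.width, rectSize.height);
    roi &= cv::Rect(0, 0, density.cols, density.rows);  // clip to the image borders
    cv::Mat patch = density(roi);
    patch += cv::Scalar(epsilon);                        // accumulate the density
}
// Global maximum of the density map; compare its value to the threshold tau
double minVal, maxVal;
cv::Point minLoc, maxLoc;
cv::minMaxLoc(density, &minVal, &maxVal, &minLoc, &maxLoc);
bool templateFound = (maxVal > tau);
For the local-maxima variant you would replace the minMaxLoc call with a morphological dilation followed by a comparison against the original map, as in the linked MATLAB question.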
If you are into research articles, you can check the following ones for improvements of this basic algorithm:
"ADABOOST WITH KEYPOINT PRESENCE FEATURES FOR REAL-TIME VEHICLE VISUAL DETECTION", by T.Bdiri, F.Moutarde, N.Bourdis and B.Steux, 2009.
"Interleaving Object Categorization and Segmentation", by B.Leibe and B.Schiele, 2006.
EDIT: another way to address your problem is to try to remove accidentally matched points in order to keep only those truly corresponding to your template image. This can be done by enforcing a constraint of consistency between close matched points. The following research article presents an approach like this: "Context-dependent logo matching and retrieval", by H.Sahbi, L.Ballan, G.Serra, A.Del Bimbo, 2010 (however, this may require some background knowledge...).
Hope this helps.
Well, when you compare points you use some metric.
So the result of the comparison has some resulting distance.
And the smaller this distance is, the better.
Example code:
// Brute-force matcher with L2 norm and cross-checking enabled
BFMatcher matcher(NORM_L2, true);
std::vector<DMatch> matches;
matcher.match(descriptors1, descriptors2, matches);
// Drop matches whose descriptor distance is too large
matches.erase(std::remove_if(matches.begin(), matches.end(), bad_dist), matches.end());
where bad_dist is defined as
bool bad_dist(const DMatch &m) {
    return m.distance > 150;  // threshold on the descriptor distance
}
In this code I get rid of 'bad' matches.
There are many ways to match two patterns in the same image; it's actually a very open topic in computer vision, because there isn't a single globally best solution.
For instance, if you know your object can appear rotated (I'm not familiar with SURF, but I guess the descriptors are rotation invariant like SIFT descriptors), you can estimate the rotation between the pattern you have in the training set and the pattern you just matched with. A match with the minimum error will be a better match.
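As a hedged illustration of that idea: SIFT/SURF keypoints carry an orientation, so the relative rotation can be estimated directly from the matched pairs (kpTrain, kpQuery and matches are assumed names from your own pipeline):
#include <algorithm>
#include <vector>

std::vector<float> rotations;
for (size_t i = 0; i < matches.size(); ++i) {
    // Difference of keypoint orientations (in degrees) for this matched pair
    float d = kpQuery[matches[i].queryIdx].angle - kpTrain[matches[i].trainIdx].angle;
    if (d < 0.f) d += 360.f;  // keep the difference in [0, 360)
    rotations.push_back(d);
}
// A consistent match should produce a tight cluster of rotation values;
// the spread around the median can serve as the "error" of the match.
std::nth_element(rotations.begin(),
                 rotations.begin() + rotations.size() / 2, rotations.end());
float medianRotation = rotations[rotations.size() / 2];
The smaller the spread of the individual rotation estimates around medianRotation, the more geometrically consistent (and thus more trustworthy) the match.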
I recommend you consult Computer Vision: Algorithms and Applications. There's no code in it, but it covers lots of useful techniques typically used in computer vision (most of them already implemented in OpenCV).

Estimate color distribution with Gaussian mixture model

I am trying to use a two-component Gaussian mixture, fitted with the EM algorithm, to estimate the color distribution of a video frame. For that, I want to use two separate peaks in the color distribution as the two Gaussian means to facilitate the EM calculation. I have several difficulties with the implementation of this in OpenCV.
My first question is: how can I determine the two peaks? I've searched for peak estimation in OpenCV, but still couldn't find any separate function. So I am going to determine two regions and then find their maximum values as peaks. Is this approach correct?
My second question is: how do I fit a Gaussian mixture model with EM in OpenCV? As far as I know, the "cv::EM::predict" function could give me the index of the most probable mixture component. But I have difficulties with training EM. I've searched and found some other code, but finding the correct parameters is too difficult for me. Could someone provide me with any example code for this? Thank you in advance.
#ederman, try {OpenCV library location}\opencv\samples\cpp\em.cpp instead of the web link. I think the sample code in the link is out of date now. I have successfully compiled the sample code in OpenCV 2.3.1. It shouldn't be a problem for 2.4.2.
Good luck:)
My first question is: how can I determine the two peaks?
I would iterate through the range of possible sample values and test where EM.predict(sample)[0] peaks.
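A small sketch of what the training and scanning steps could look like with the OpenCV 2.4 cv::EM API (the parameter values are only examples, and frame is assumed to be a grayscale video frame):
#include <opencv2/ml/ml.hpp>  // cv::EM is in the ml module in OpenCV 2.4

cv::Mat samples;
frame.convertTo(samples, CV_64FC1);
samples = samples.reshape(1, frame.rows * frame.cols);  // one 1-D sample per row

cv::EM em(2);          // two mixture components, as in the question
em.train(samples);     // fits means, covariances and weights with EM

// Scan the intensity range and look at the predicted log-likelihood:
// local maxima of the likelihood correspond to the peaks of the mixture.
for (int v = 0; v < 256; ++v) {
    cv::Mat sample = (cv::Mat_<double>(1, 1) << v);
    cv::Vec2d result = em.predict(sample);
    // result[0] is the log-likelihood, result[1] the most probable component index
}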

Analyzing gaze tracking data

I have an image which was shown to groups of people with different domain knowledge of its content. I then recorded gaze fixation data of them watching the image.
I now want to compare the results of the two groups, so what I need to know is whether there is a correlation between the positions of the sampling data of the two groups or not.
I have the original image as well as the fixation coords. Do you have any good idea how to start analyzing the data?
It's more about the idea or the plan so you don't have to be too technical on that one.
Thanks
Simple idea: render all the coordinates on the original image in a 'heat map' like way, one image for each group. You can then visually compare the images for correlation, and you have some nice graphics for your paper.
There is something like the two-dimensional correlation coefficient. With software like R or Matlab you can do the number crunching for the correlation.
Matlab has a function for this, corr2, which computes the two-dimensional correlation coefficient between two matrices; the matrices must be of the same size:
r = corr2(A, B)
In gaze tracking, the most interesting data lies in two areas.
Where people look: for that you can use the heat map Daan suggests. Make a heat map for all people, and heat maps for the separate groups of people.
When people look there: for that I would recommend you start by making heat maps as above, but for short time intervals starting from the time the picture was first shown. Again, for all people, and for the separate groups you have.
The resulting set of heat-maps, perhaps animated for the ones from the second point, should give you some pointers for further analysis.
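A rough OpenCV sketch of the heat-map rendering suggested above (fixations is an assumed std::vector<cv::Point> holding the fixation coordinates of one group, and image is assumed to be the original 8-bit color image):
cv::Mat heat = cv::Mat::zeros(image.size(), CV_32FC1);
for (size_t i = 0; i < fixations.size(); ++i)
    heat.at<float>(fixations[i]) += 1.0f;

// Spread each fixation into a smooth blob and colorize the result
cv::GaussianBlur(heat, heat, cv::Size(0, 0), 25.0);  // sigma controls the spread
cv::normalize(heat, heat, 0, 255, cv::NORM_MINMAX);
cv::Mat heat8u, colored;
heat.convertTo(heat8u, CV_8U);
cv::applyColorMap(heat8u, colored, cv::COLORMAP_JET);
// Blend with the original image for the figure in the paper
cv::Mat overlay;
cv::addWeighted(image, 0.5, colored, 0.5, 0.0, overlay);
One such map per group; the maps can then be compared visually, or numerically with the two-dimensional correlation coefficient mentioned in the first answer.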

stitching aerial images

I am trying to stitch 2 aerial images together with very little overlap, probably <500 px of overlap. These images have 3600x2100 resolution. I am using the OpenCV library to complete this task.
Here is my approach:
1. Find feature points and match points between the two images.
2. Find homography between two images
3. Warp one of the images using the homography
4. Stitch the two images
Right now I am trying to get this to work with two images. I am having trouble with step 3 and possibly step 2. I used findHomography() from the OpenCV library to grab my homography between the two images. Then I called warpPerspective() on one of my images using the homography.
The problem with the approach is that the transformed image is all distorted. Also it seems to only transform a certain part of the image. I have no idea why it is not transforming the whole image.
Can someone give me some advice on how I should approach this problem? Thanks
In the results that you have posted, I can see that you have at least one keypoint mismatch. If you use findHomography(src, dst, 0), it will mess up your homography. You should use findHomography(src, dst, CV_RANSAC) instead.
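For illustration, a minimal sketch of steps 2 and 3 with RANSAC enabled (OpenCV 2.x C++ API assumed; goodMatches, keypoints1 and keypoints2 are assumed to come from your own matching step, with img1 and img2 the two aerial images):
std::vector<cv::Point2f> srcPts, dstPts;
for (size_t i = 0; i < goodMatches.size(); ++i) {
    srcPts.push_back(keypoints1[goodMatches[i].queryIdx].pt);
    dstPts.push_back(keypoints2[goodMatches[i].trainIdx].pt);
}
// CV_RANSAC rejects outlier correspondences; 3.0 is the reprojection threshold in pixels
cv::Mat H = cv::findHomography(srcPts, dstPts, CV_RANSAC, 3.0);
// Make the output canvas wide enough for both images, then warp img1 into img2's frame
cv::Mat result;
cv::warpPerspective(img1, result, H, cv::Size(img1.cols + img2.cols, img1.rows));
// Very naive version of step 4: paste img2 into the canvas
img2.copyTo(result(cv::Rect(0, 0, img2.cols, img2.rows)));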
You can also try to use warpAffine instead of warpPerspective.
Edit: In the results that you posted in the comments to your question, I had the impression that the matching worked quite reliably. That means that you should be able to get good results with this example as well. Since you mostly seem to be dealing with translation, you could try to filter out the outliers with the following sketched algorithm (a code sketch follows the list):
calculate the average (or median) motion vector x_avg
calculate the normalized dot product <x_avg, x_match>
discard x_match if the dot product is smaller than a threshold
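A rough sketch of that filter, assuming matched point pairs pts1[i] <-> pts2[i] (both std::vector<cv::Point2f>) and a hypothetical threshold value:
#include <cmath>

std::vector<cv::Point2f> motions(pts1.size());
cv::Point2f avg(0.f, 0.f);
for (size_t i = 0; i < pts1.size(); ++i) {
    motions[i] = pts2[i] - pts1[i];  // motion vector of this match
    avg += motions[i];
}
avg *= 1.0f / static_cast<float>(pts1.size());
float avgNorm = std::sqrt(avg.dot(avg));

const float threshold = 0.9f;  // assumed cut-off on the normalized dot product
std::vector<bool> keep(pts1.size(), false);
for (size_t i = 0; i < motions.size(); ++i) {
    float n = std::sqrt(motions[i].dot(motions[i]));
    if (n > 0.f && avgNorm > 0.f)
        keep[i] = (avg.dot(motions[i]) / (n * avgNorm)) >= threshold;
}
Matches whose motion vector points in a very different direction from the average are discarded before computing the homography.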
To make it work for images with smaller overlap, you would have to look at the detector, descriptors and matches. You do not specify which descriptors you work with, but I would suggest using SIFT or SURF descriptors and the corresponding detectors. You should also set the detector parameters to make a dense sampling (i.e., try to detect more features).
You can refer to this answer which is slightly related: OpenCV - Image Stitching
To stitch images using a homography, the most important thing to take care of is finding correspondence points in both images. The fewer the outliers among the correspondence points, the better the generated homography.
Using robust techniques such as RANSAC along with OpenCV's findHomography() function (use CV_RANSAC as the method) will still generate a reasonable homography, provided the percentage of inliers is higher than the percentage of outliers. Also make sure that there are at least 4 inliers among the correspondence points passed to the findHomography function.