Error in calculating exact nearest neighbors in radius with FLANN - c++

I am trying to find the exact neighbours of every node in a large 3D point dataset. The goal is to retrieve, for each point of the dataset, all neighbours within a region of a given radius. FLANN claims that it can retrieve the exact neighbours for low-dimensional data, but when I compare its results with a brute-force search this does not seem to be the case. The neighbours are essential for further calculations, so I need them exactly. I tested increasing the radius slightly, but that does not seem to be the problem. Is anyone aware of how to compute the exact neighbours with FLANN or another C++ library?
The code:
// All nodes to be tested for inclusion in support domain.
flann::Matrix<double> query_nodes = flann::Matrix<double>(&nodes_pos[0].x, nodes_pos.size(), 3);
// Set default search parameters
flann::SearchParams search_parameters = flann::SearchParams();
search_parameters.checks = -1;
search_parameters.sorted = false;
search_parameters.use_heap = flann::FLANN_True;
flann::KDTreeSingleIndexParams index_parameters = flann::KDTreeSingleIndexParams();
flann::KDTreeSingleIndex<flann::L2_3D<double> > index(query_nodes, index_parameters);
index.buildIndex();
// FLANN's L2 distance is the squared Euclidean distance, so the search radius is squared as well.
double l2_radius = (this->support_layer_*grid.spacing)*(this->support_layer_*grid.spacing);
double extension = l2_radius/10.;
l2_radius+= extension;
index.radiusSearch(query_nodes, indices, dists, l2_radius, search_parameters);

Try nanoflann. It is designed for low dimensional spaces and gives exact nearest neighbors. Furthermore, it is just one header file that you can either "install" or just copy to your project.

You should check page 6 onwards of the FLANN manual to fine-tune your search parameters, such as target_precision, which should be set to 1 for "maximum" accuracy.
That parameter often appears as epsilon (ε) in Approximate Nearest Neighbor Search (ANNS), which is used in high-dimensional spaces to try to beat the curse of dimensionality. As far as I can tell, FLANN is usually used in around 128 dimensions, not 3, which may explain the bad performance you are experiencing.
A C++ library that works well in 3 dimensions is CGAL. However, it is much larger than FLANN, because it is a general computational geometry library and provides functionality for many problems, not just NNS.
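Whichever library you end up with, it helps to have a small reference implementation to compare against. The following is only a sketch of a brute-force radius search; the `Point3` type and the function name are made up for illustration and are not part of FLANN, nanoflann or CGAL. It compares squared distances against the squared radius, and it does not report a point as its own neighbour, so if your library returns the query point itself (at distance 0) the counts will differ by one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point3 { double x, y, z; };  // hypothetical point type

// For every point, collect the indices of all other points whose squared
// Euclidean distance is <= radius * radius. This is O(n^2), so it is meant
// for validating results on a subset, not for production use.
std::vector<std::vector<std::size_t>> bruteForceRadiusSearch(
    const std::vector<Point3>& pts, double radius) {
    assert(radius >= 0.0);
    const double r2 = radius * radius;  // compare squared values, no sqrt needed
    std::vector<std::vector<std::size_t>> neighbors(pts.size());
    for (std::size_t i = 0; i < pts.size(); ++i) {
        for (std::size_t j = i + 1; j < pts.size(); ++j) {
            const double dx = pts[i].x - pts[j].x;
            const double dy = pts[i].y - pts[j].y;
            const double dz = pts[i].z - pts[j].z;
            if (dx * dx + dy * dy + dz * dz <= r2) {
                neighbors[i].push_back(j);  // the relation is symmetric,
                neighbors[j].push_back(i);  // so record it for both points
            }
        }
    }
    return neighbors;
}
```

Comparing the per-point neighbour counts from this function with the library output on a few thousand points should show quickly whether the mismatch comes from the radius convention or from the search itself.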


Histogram Binning of Gradient Vectors

I am working on a project that has a small component requiring the comparison of distributions over image gradients. Assume I have computed the image gradients in the x and y directions using a Sobel filter and have a 2-vector (gx, gy) for each pixel. Obviously getting the magnitude and direction is reasonably trivial: the magnitude is sqrt(gx*gx + gy*gy) and the direction is atan2(gy, gx).
However, what is not clear to me is how to bin these two components into a two-dimensional histogram for an arbitrary number of bins.
I had considered something along these lines (written in the browser):
//Assuming normalised magnitudes.
//Histogram dimensions are bins * bins.
int getHistIdx(float mag, float dir, int bins) {
const int magInt = reinterpret_cast<int>(mag);
const int dirInt = reinterpret_cast<int>(dir);
const int magMod = reinterpret_cast<int>(static_cast<float>(1.0));
const int dirMod = reinterpret_cast<int>(static_cast<float>(TWO_PI));
const int idxMag = (magInt % magMod) & bins;
const int idxDir = (dirInt % dirMod) & bins;
return idxMag * bins + idxDir;
}
However, I suspect that the mod operation will introduce a lot of incorrect overlap, i.e. completely different gradients being placed into the same bin.
Any insight into this problem would be very much appreciated.
I would like to avoid using any off the shelf libraries as I want to keep this project as dependency light as possible. Also I intend to implement this in CUDA.
This is more of a "what is a histogram?" question than one about any of your tags. Two things:
In a 2D plane, two directions that are equal modulo 2*pi are in fact the same, so it makes sense to take the modulus.
I see no practical or logical reason for taking the modulus of the norms.
Next, you say you want a "two dimensional histogram", but you return a single number. A 2D histogram, which is what would make sense in your context, is a 3D plot: the plane is indexed by the two values theta and R, while the third axis is the "count".
So first suggestion, return
return Pair<int,int>(idxMag,idxDir);
Then you can make a 2D histogram, or 2 2D histograms.
Regarding the "number of bins"
This is use-case dependent. You need to define the number of bins you want (maybe different for theta and R). Maybe just some constant, such as 10 bins? Maybe it should depend on the number of vectors? In any case, you need a function that receives either the number of vectors or the total set of vectors, and returns the number of bins for each axis. This could be a constant (10 bins) initially, and you can play with it. Once you decide on the number of bins:
Determine the bins
For a bounded case such as 0<theta<2 pi, this is easy. Divide the interval equally into the number of bins, assuming a flat distribution. Your modulus actually handles this well, or would if you had actually taken it with respect to 2*pi, which you didn't. You would still need to determine the bin bounds though.
For R this gets trickier, as it is unbounded. There are two options here, but both rely on the same tactic: choose a maximal bin. Either pick the limit arbitrarily (say R=10), so that any vector longer than that is placed in the "longer than max" bin and the rest is divided equally (for example, though you could choose other distributions), or let the longest vector determine the edge of the maximal bin.
Getting the index
Once you have the bins, you need to look up the magnitude/direction of the current vector in them. If the bins are pairs representing the min/max of each bin (and maybe an index), say in a linked list, then it would be something like this (for mag, for example):
bin = histogram.first;
while ( mag >= bin.max ) bin = bin.next;
magIdx = bin.index;
If the bin does not hold the index, you can just use a counter and increase it in the while loop. Also, for the magnitude, the final bin should hold "infinity" or some large number as its upper limit so that the loop always terminates. Note that this has nothing to do with the modulus, though that would work for your direction, as you have coded. I don't see how it makes sense for the norm.
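With equal-width bins the search above reduces to a division. A minimal sketch, assuming a caller-chosen upper bound `maxMag` for the magnitude (vectors longer than that land in the last bin) and the same number of bins on both axes; the function name `polarBinIndex` and its parameters are invented for this example:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <utility>

// Returns (magnitude bin, direction bin), each in [0, bins).
// maxMag is an assumed upper bound; longer vectors are clamped into the last bin.
std::pair<int, int> polarBinIndex(float gx, float gy, int bins, float maxMag) {
    assert(bins > 0 && maxMag > 0.0f);
    const float twoPi = 6.28318530718f;
    const float mag = std::sqrt(gx * gx + gy * gy);
    float dir = std::atan2(gy, gx);      // in [-pi, pi]
    if (dir < 0.0f) dir += twoPi;        // map to [0, 2*pi]
    int idxMag = static_cast<int>(mag / maxMag * bins);
    idxMag = std::min(idxMag, bins - 1); // overflow bin for long vectors
    // The direction wraps around, so an index equal to bins maps back to bin 0.
    const int idxDir = static_cast<int>(dir / twoPi * bins) % bins;
    return {idxMag, idxDir};
}
```

The histogram itself can then be a flat array of bins * bins counters, incremented at `idxMag * bins + idxDir`. Nothing here needs a library, so it should carry over to a CUDA kernel with an atomic increment in place of the plain one.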
Bottom line though, you have to think a bit about what you want. In any case all the "objects" here are trivial enough to write yourself, or even use small arrays.
I think you should arrange your bins in a square array, and then bin by vx and vy independently.
If your gradients are reasonably evenly distributed, you just need to scan the data first to accumulate the min and max in x and y, and then split each range into equally sized bins.
If the gradients are very unevenly distributed, you might want to sort the (eg) vx first and arrange that the boundaries between each bin exactly evenly divides the values.
An intermediate solution might be to obtain the min and max ignoring the (eg) 10% most extreme values.
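The simplest of these variants (scan for min and max, then split evenly) might look like the sketch below. The `Grad` struct and both function names are placeholders for this example, and the histogram is stored row-major with vx selecting the row.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Grad { float vx, vy; };  // hypothetical gradient type

// Map v in [lo, hi] to a bin in [0, bins); a degenerate range maps to bin 0.
int binOf(float v, float lo, float hi, int bins) {
    if (hi <= lo) return 0;
    const int idx = static_cast<int>((v - lo) / (hi - lo) * bins);
    return std::min(std::max(idx, 0), bins - 1);  // v == hi goes into the last bin
}

// Builds a bins x bins histogram of counts, binning vx and vy independently.
std::vector<int> histogram2D(const std::vector<Grad>& g, int bins) {
    assert(bins > 0);
    std::vector<int> hist(static_cast<std::size_t>(bins) * bins, 0);
    if (g.empty()) return hist;
    // First pass: accumulate min and max in x and y.
    float minX = g[0].vx, maxX = g[0].vx, minY = g[0].vy, maxY = g[0].vy;
    for (const Grad& v : g) {
        minX = std::min(minX, v.vx); maxX = std::max(maxX, v.vx);
        minY = std::min(minY, v.vy); maxY = std::max(maxY, v.vy);
    }
    // Second pass: count each gradient in its cell.
    for (const Grad& v : g) {
        const int row = binOf(v.vx, minX, maxX, bins);
        const int col = binOf(v.vy, minY, maxY, bins);
        ++hist[static_cast<std::size_t>(row) * bins + col];
    }
    return hist;
}
```

For the trimmed variant you would replace the first pass with percentiles of the sorted values; the clamping in `binOf` already takes care of the outliers that fall outside the trimmed range.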

OpenCV transformation from two images

There is a MATLAB example that matches two images and outputs the rotation and scale:
https://de.mathworks.com/help/vision/examples/find-image-rotation-and-scale-using-automated-feature-matching.html?requestedDomain=www.mathworks.com
My goal is to recreate this example using C++. I am using the same method of keypoint detection (Harris) and the keypoints seem to be mostly identical to the ones Matlab finds. So far so good.
cv::goodFeaturesToTrack(image_grayscale, corners, number_of_keypoints, 0.01, 5, mask, 3, true, 0.04);
for (int i = 0; i < corners.size(); i++) {
keypoints.push_back(cv::KeyPoint(corners[i], 5));
}
BRISK is used to extract features from the keypoints.
int Threshl = 120;
int Octaves = 8;
float PatternScales = 1.0f;
cv::Ptr<cv::Feature2D> extractor = cv::BRISK::create(Threshl, Octaves, PatternScales);
extractor->compute(image, mykeypoints, descriptors);
These descriptors are then matched using FlannBasedMatcher.
cv::FlannBasedMatcher matcher;
matcher.match(descriptors32A, descriptors32B, matches);
Now the problem is that about 80% of my matches are wrong and unusable. For the identical set of images Matlab returns only a couple of matches from which only ~20% are wrong. I have tried sorting the Matches in C++ based on their distance value with no success. The values range between 300 and 700 and even the matches with the lowest distance are almost entirely incorrect.
Now 20% of good matches are enough to calculate the offset but a lot of processing power is wasted on checking wrong matches. What would be a better way to sort the correct matches or is there something obvious I am doing wrong?
EDIT:
I have switched from Harris/BRISK to AKAZE, which seems to deliver much better features and matches that can easily be sorted by their distance value. The only downside is the much higher computation time. With two 1000px wide images AKAZE needs half a minute to find the keypoints (on a PC). I reduced this by scaling down the images, which makes for an acceptable ~3-5 seconds.
The method you are using finds a nearest neighbour for each point, no matter how far away it is. Two strategies are common:
1. Match set A to set B and set B to A and keep only matches which exist in both matchings.
2. Use knnMatch with k = 2 and perform a ratio check, i.e. keep only the matches where the first nearest neighbour is a lot closer than the second, e.g.
d1 < 0.8 * d2.
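Both filters are only a few lines once you have the raw matches. The sketch below is library-free on purpose: the `Match` struct is a stand-in with the same three fields as cv::DMatch (queryIdx, trainIdx, distance), and the function names are made up. With OpenCV the input to the ratio filter would be the result of knnMatch with k = 2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a match record; field names mirror cv::DMatch.
struct Match { int queryIdx; int trainIdx; float distance; };

// Strategy 2: keep the best match only if it is clearly closer than the
// second best (d1 < ratio * d2). Each inner vector holds the k = 2 nearest
// neighbours of one query descriptor, sorted by increasing distance.
std::vector<Match> ratioFilter(const std::vector<std::vector<Match>>& knn, float ratio) {
    assert(ratio > 0.0f && ratio <= 1.0f);
    std::vector<Match> good;
    for (const std::vector<Match>& m : knn) {
        if (m.size() >= 2 && m[0].distance < ratio * m[1].distance) {
            good.push_back(m[0]);
        }
    }
    return good;
}

// Strategy 1: keep an A->B match only if the B->A matching maps back to it.
std::vector<Match> crossCheck(const std::vector<Match>& ab, const std::vector<Match>& ba) {
    std::vector<Match> good;
    for (const Match& f : ab) {
        for (const Match& b : ba) {
            if (b.queryIdx == f.trainIdx && b.trainIdx == f.queryIdx) {
                good.push_back(f);
                break;
            }
        }
    }
    return good;
}
```

The 0.8 threshold is just the value from the rule of thumb above; a lower ratio keeps fewer but more reliable matches.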
The MATLAB code uses SURF. OpenCV also provides SURF, SIFT and AKAZE, so try one of these. SURF in particular would be interesting for a comparison.

Function to determine all local maxima of a histogram

Is there an OpenCV function that can give me a list of all the local maxima for a histogram? Maybe there is a function that lets me specify a minimum peak/threshold and will tell me the bins of all those local maxima above that threshold.
If not, is there a function that can sort the bins from highest (most frequent) to lowest (least frequent)? I can then grab the first 20 or so bins and I have my 20 biggest local maxima.
OpenCV's minMaxLoc can be used in this context with a sliding window. If the location of the maximum is on an edge of the window then ignore it, otherwise record it as a maximum. You can use something like the function below (note: this code is more like pseudocode; it has not been tested).
/**
* Assumes a single-channel histogram stored as one row (1 x histbins).
*/
vector<int> findMaxima(const Mat& histogram, int windowsize, int histbins){
vector<int> maximas;
//No maximum recorded yet
int lastmaxima = -1;
for(int i = 0; i <= histbins - windowsize; i++){
//Just some local variables, only maxloc is used.
double minval, maxval;
Point minloc, maxloc;
//Crop the window
Rect window(i,0,windowsize,1);
//Get the maximum (minMaxLoc takes pointers to its outputs)
minMaxLoc(histogram(window), &minval, &maxval, &minloc, &maxloc);
//Check that it is not on the side
if(maxloc.x != 0 && maxloc.x != windowsize-1){
//Translate from cropped window into real position
int originalposition = maxloc.x+i;
//Check that this is a new maximum and not already recorded
if(lastmaxima != originalposition){
maximas.push_back(originalposition);
lastmaxima = originalposition;
}
}
}
return maximas;
}
Of course this is a very simplistic system. You might want to use a multiscale approach with different sliding window sizes. You may also need to apply Gaussian smoothing, depending on your data. Another approach could be to run this for a small window size like 3 or 4 (you need a minimum of 3). Then you could use something else for non-maximum suppression.
For your approach, in which you suggested
"Maybe there is a function that lets me specify a minimum peak/threshold and will tell me the bins of all those local maxima above that threshold."
you could simply apply a threshold before finding the maxima with the above function, and search the thresholded result:
threshold(hist, res, ...additional parameters...);
vector<int> maximas = findMaxima(res, ...other parameters...);
AFAIK OpenCV doesn't have such functionality, but it is possible to implement something similar yourself.
In order to sort histogram bins you can possibly use sortIdx, but as a result you will obtain a list of the largest bins, which is different from local maxima (those should be "surrounded" by smaller values).
To obtain local maxima you can compare each bin with its neighbors (2 in the 1D case). A bin should be larger than its neighbors by some margin to be considered a local maximum.
Depending on the size of the bins, you may want to filter the histogram before this step (for example, convolve it with a Gaussian kernel), since otherwise you'd obtain too many of these maxima, especially for small bin sizes. If you've used a Gaussian kernel, its sigma would be related to the size of the neighborhood in which detected local maxima are "global".
Once you detect those points, you may want to perform non-maximum suppression to replace groups of points that lie very close together with a single point. A simple strategy for that would be to sort those maxima according to some criterion (for example, difference with neighbors), then take one maximum and remove all the points in its neighborhood (its size can be related to the Gaussian kernel sigma), take the next remaining maximum and again remove the points in its neighborhood, and so on until you run out of points or go below some meaningful difference value.
Finally, you may want to sort the remaining candidate points by their absolute values (to get the "largest" local maxima), or by their differences with neighbors (to get the "sharpest" ones).
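As a rough illustration of the neighbor comparison and the suppression step, here is a sketch that works on a plain std::vector<float> rather than a cv::Mat. The function names, the `margin` and `radius` parameters and the choice to rank peaks by bin height during suppression are all assumptions made for the example; smoothing is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Bins that exceed both direct neighbors by at least `margin`.
// The first and last bin have only one neighbor and are never reported.
std::vector<int> localMaxima(const std::vector<float>& hist, float margin) {
    std::vector<int> peaks;
    for (int i = 1; i + 1 < static_cast<int>(hist.size()); ++i) {
        if (hist[i] - hist[i - 1] >= margin && hist[i] - hist[i + 1] >= margin) {
            peaks.push_back(i);
        }
    }
    return peaks;
}

// Greedy non-maximum suppression: visit peaks from highest to lowest bin
// value and drop any peak within `radius` bins of one that was already kept.
std::vector<int> suppress(const std::vector<float>& hist, std::vector<int> peaks, int radius) {
    assert(radius >= 0);
    std::sort(peaks.begin(), peaks.end(),
              [&hist](int a, int b) { return hist[a] > hist[b]; });
    std::vector<int> kept;
    for (int p : peaks) {
        bool tooClose = false;
        for (int k : kept) {
            if (std::abs(p - k) <= radius) { tooClose = true; break; }
        }
        if (!tooClose) kept.push_back(p);
    }
    return kept;  // ordered from highest to lowest peak
}
```

Taking the first 20 entries of the result gives the 20 largest local maxima that are at least `radius` bins apart, which is different from simply taking the 20 fullest bins.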
You may try another approach. We can use this definition of local maximum to implement a simpler algorithm: just move a sliding window of size S along the histogram and pick the maximum at each position. This will have some problems:
in locations with a prominent maximum, multiple window positions will generate points that correspond to the same maximum (can be fixed with non-maximum suppression),
in locations with no or small variation it will return semi-random maxima (can be fixed with a threshold on the variance in the window or on the difference between the maximum and its neighborhood),
in regions with a monotonic histogram it will return the largest value (which is not necessarily a maximum).
Once you perform all the "special case" handling, those 2 approaches would be quite similar, I believe.
Another thing to implement may be a "multi-scale" approach, which can be considered an extension of those 2. Basically it boils down to detecting local maxima for different neighborhood sizes, and then storing them all along with the corresponding neighborhood size, which can be helpful for some purposes.
As you can see, this is quite a vague guide, and there's a reason for that: the type and number of local maxima you want to get will most likely depend on the problem you have in mind. There's no hard and fast rule to decide whether a point should be considered a local maximum, so you should probably start with some simple approach and then refine it for your specific case.

Backpropagation 2-Dimensional Neuron Network C++

I am learning about two-dimensional neural networks, so I am facing many obstacles, but I believe it is worth it and I am really enjoying the learning process.
Here's my plan: to make a 2-D NN recognize images of digits. Images are 5 by 3 grids and I prepared 10 images, from zero to nine. For example, this would be the number 7:
1 1 1
0 0 1
0 0 1
0 0 1
0 0 1
Number 7 has indexes 0,1,2,5,8,11,14 as 1s (or 3,4,6,7,9,10,12,13 as 0s, it doesn't matter) and so on. Therefore, my input layer will be a 5 by 3 neuron layer and I will be feeding it zeros or ones only (nothing in between; which indexes are set depends on the image I am feeding the layer).
My output layer, however, will be a one-dimensional layer of 10 neurons. Depending on which digit was recognized, a certain neuron will fire a value of one and the rest should be zeros (shouldn't fire).
I am done with implementing everything, but I have a problem in the computation and I would really appreciate any help. I am getting an extremely high error rate and extremely low (negative) output values on all output neurons, and the values (error and output) do not change even on the 10,000th pass.
I would love to go further and post my backpropagation methods, since I believe the problem is in them. However, to break down my work I would love to hear some comments first: I want to know if my design is reasonable.
Does my plan make sense?
All the posts talk about ranges (0->1, -1->+1, 0.01->0.5 etc.). Will it work for outputs that are exactly 0 or 1 on the output layer, rather than a range? If yes, how can I control that?
I am using the hyperbolic tangent as my transfer function. Does it make a difference whether I use this, a sigmoid, or some other function?
Any ideas/comments/guidance are appreciated, and thanks in advance.
Well, from the description given above, I think that the design and the approach taken are correct! With respect to the choice of the activation function, remember that those functions help to pick out the neurons which have the largest activation, and that their algebraic properties, such as an easy derivative, help with the definition of backpropagation. Taking this into account, you should not worry about your choice of activation function.
The ranges that you mention above correspond to scaling of the input: it is better to have your input images in the range 0 to 1. This helps to scale the error surface and helps with the speed and convergence of the optimization process. Because your input set is composed of images, and each image is composed of pixels, the minimum and the maximum value that a pixel can attain are 0 and 255, respectively. To scale your input in this example, it is essential to divide each value by 255.
Now, with respect to the training problems: have you tried checking whether your gradient calculation routine is correct, i.e. by evaluating the cost function J numerically? If not, try generating a toy vector theta that contains all the weight matrices involved in your neural network, and evaluate the gradient at each point by using the definition of the gradient. Sorry for the MATLAB example, but it should be easy to port to C++:
perturb = zeros(size(theta));
e = 1e-4;
for p = 1:numel(theta)
% Set perturbation vector
perturb(p) = e;
loss1 = J(theta - perturb);
loss2 = J(theta + perturb);
% Compute Numerical Gradient
numgrad(p) = (loss2 - loss1) / (2*e);
perturb(p) = 0;
end
After evaluating the function, compare the numerical gradient with the gradient calculated by backpropagation. If the difference between the two is less than about 3e-9 for every component, then your implementation is most likely correct.
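A possible C++ port of the loop above, as a sketch: the cost function is passed in as a std::function taking the flattened weight vector, which is an assumption about how your network exposes its cost, and the name `numericalGradient` is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Central-difference approximation of the gradient of J at theta.
// J is assumed to map the flattened weight vector to the scalar cost.
std::vector<double> numericalGradient(
    const std::function<double(const std::vector<double>&)>& J,
    std::vector<double> theta, double e = 1e-4) {
    assert(e > 0.0);
    std::vector<double> numgrad(theta.size());
    for (std::size_t p = 0; p < theta.size(); ++p) {
        const double saved = theta[p];
        theta[p] = saved + e;           // perturb one weight upwards
        const double lossPlus = J(theta);
        theta[p] = saved - e;           // and downwards
        const double lossMinus = J(theta);
        theta[p] = saved;               // restore before moving on
        numgrad[p] = (lossPlus - lossMinus) / (2.0 * e);
    }
    return numgrad;
}
```

Each component costs two full evaluations of J, so this is only practical on a small network such as the 15-input, 10-output one described in the question, and only as a one-off check rather than during training.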
I recommend checking out the UFLDL tutorials offered by the Stanford Artificial Intelligence Laboratory. There you can find a lot of information related to neural networks and their paradigms; they are worth a look!
http://ufldl.stanford.edu/wiki/index.php/Main_Page
http://ufldl.stanford.edu/tutorial/

To implement FlannBasedMatcher

I am doing a project on face recognition from video images. I have extracted the features, and now I need to compare them. I found that FlannBasedMatcher is a good method for this, and it is also very fast. FlannBasedMatcher is already in OpenCV (which I am using), but I would like to implement it myself without any help from OpenCV. Please help me understand what exactly is happening inside FlannBasedMatcher. Any response will be greatly appreciated.
Features are typically compared using some distance metric, such as the Euclidean distance between features that are considered to be points in some multi-dimensional space; one can use the angle between two vectors (that is, feature vectors), which is independent of vector scaling; one can use the Hamming distance for comparing binary strings, etc. The best way depends on the structure and the meaning of your feature vector. For faces it can be the angle between two vectors, expressed through a dot product.
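None of these metrics needs OpenCV. A minimal sketch of the three mentioned above, with float vectors for the real-valued descriptors and bytes for the binary ones; the function names are just for illustration:

```cpp
#include <bitset>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Euclidean distance between two feature vectors of equal length.
double euclidean(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = static_cast<double>(a[i]) - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|).
// Independent of vector scaling; returns 0 if either vector is all zeros.
double cosineSimilarity(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    double dot = 0.0, na = 0.0, nb = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += static_cast<double>(a[i]) * b[i];
        na += static_cast<double>(a[i]) * a[i];
        nb += static_cast<double>(b[i]) * b[i];
    }
    if (na == 0.0 || nb == 0.0) return 0.0;
    return dot / (std::sqrt(na) * std::sqrt(nb));
}

// Hamming distance: number of differing bits between two binary descriptors.
int hamming(const std::vector<std::uint8_t>& a, const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size());
    int count = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        count += static_cast<int>(std::bitset<8>(a[i] ^ b[i]).count());
    }
    return count;
}
```

A linear scan with one of these functions is the exact baseline that any approximate index would be speeding up.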
Now, FLANN is used for finding nearest neighbors and as such is not directly related to feature comparison, though it can help to speed up finding the similar features that are worth comparing (FLANN = Fast Library for Approximate Nearest Neighbors). Thus you won't need to search through all your vectors trying to select the one that has the highest dot product with the query vector, but can instead directly compare a given face (vector) with just a few of the closest faces (vectors).
Finally, addressing a previous answer, in some cases one can use sparse arrays instead of KD trees. They are part of OpenCV too, but can be implemented through hash tables or trees. In sparse arrays you can check the indices of neighboring elements, which is analogous to FLANN nearest neighbors. Of course, sparse arrays are more limited than FLANN: for example, they require an exhaustive search of the neighborhood to get a list of nearest neighbors, but this is still faster than a global search. Here is an example:
int dims = 3;
int sz[] = {1000, 1000, 1000}; // memory efficient: only non-zero elements are stored
SparseMat M3d(dims, sz, CV_32FC3); // three floats per element, to match Vec3f
Point3i idx_sparse;
Vec3f p;
// set an element of the sparse 3D Mat
M3d.ref<Vec3f>(idx_sparse.x, idx_sparse.y, idx_sparse.z) = p;
// iterate
SparseMatIterator it = M3d.begin();
SparseMatIterator it_end = M3d.end();
for (; it != it_end; ++it) {
// access existing element through iterator
Vec3f vec = it.value<Vec3f>();
// copy the element's index so that the stored node is not modified
const int* node_idx = it.node()->idx;
int idx[] = {node_idx[0] + 1, node_idx[1] - 1, node_idx[2] + 2};
// check whether the neighbor exists; find() returns NULL if it does not
const Vec3f* neighbor = M3d.find<Vec3f>(idx);
if (neighbor != NULL) {
Vec3f neighbor_vec = *neighbor;
}
}
It is not that easy. You have to implement a kd-tree with approximate nearest neighbor search. It is described in the paper "An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions" by Arya et al.
If you don't want to do it from scratch and just want to get rid of OpenCV, you can take the original FLANN implementation.