I'm working on code that computes dense SIFT features from a set of images, based on SIFT flow: http://people.csail.mit.edu/celiu/SIFTflow/
I'd like to try building a FLANN index on these images by comparing the "energy" between each pair of images in the SIFT-flow representation.
I have the code to compute the energy from here: http://richardt.name/publications/video-deanaglyph/
Is there a way to create my own distance function for the indexing?
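(It turns out cvflann distances are just functors, so in principle you can plug your own into flann::GenericIndex. Below is a rough sketch modeled on the functors in opencv2/flann/dist.h; SIFTFlowEnergy is a hypothetical name and its body is a placeholder L1 sum, not the actual SIFT-flow energy. Be aware that KD-tree indexes assume metric-like, per-dimension-decomposable distances, so an arbitrary energy may not index correctly.)

// hypothetical custom distance functor, shaped like cvflann::L2 in opencv2/flann/dist.h
struct SIFTFlowEnergy
{
    typedef cvflann::True is_kdtree_distance;
    typedef cvflann::True is_vector_space_distance;
    typedef float ElementType;
    typedef float ResultType;

    template <typename Iterator1, typename Iterator2>
    ResultType operator()(Iterator1 a, Iterator2 b, size_t size, ResultType /*worst_dist*/ = -1) const
    {
        ResultType energy = 0;
        for (size_t i = 0; i < size; ++i, ++a, ++b)
            energy += std::abs(ResultType(*a) - ResultType(*b)); // placeholder, not the real flow energy
        return energy;
    }

    // per-dimension partial distance, required by the KD-tree search
    template <typename U, typename V>
    ResultType accum_dist(const U& a, const V& b, int) const
    {
        return std::abs(ResultType(a) - ResultType(b));
    }
};
// usage: flann::GenericIndex<SIFTFlowEnergy> index(descriptors, cvflann::KDTreeIndexParams());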
RELATED NOTE:
I was finally able to get an alternative (but not fully custom) distance function working with flann::Index. The trick is that you need to use flann::GenericIndex, like so:
flann::GenericIndex<cvflann::ChiSquareDistance<int>> flannIndex(descriptors, cvflann::KDTreeIndexParams());
But you need to give it CV_32S descriptors.
And if you use knnSearch with a non-default distance, you have to provide a CV_32S results Mat and a CV_32F distances Mat.
Here's my full code in case it's helpful (not a lot of documentation out there):
Mat samples;
loadDescriptors(samples);            // load descriptors from a .yml file
samples *= 100000;                   // scale up the float descriptors so they survive conversion to int
samples.convertTo(samples, CV_32S);  // convert float to int

// create the FLANN index
flann::GenericIndex<cvflann::ChiSquareDistance<int>> flannIndex(samples, cvflann::KDTreeIndexParams());
// NOTE the lack of a distance-type argument in the constructor parameters
// (unlike flann::Index)

// now try knnSearch
int k = 10; // find 10 nearest neighbors
Mat results(1, k, CV_32S), dists(1, k, CV_32F);
// (1, k) Mats for the output, types CV_32S and CV_32F

// choose a row from the descriptors Mat to use as the query
Mat responseHistogram = samples.row(60);

flannIndex.knnSearch(responseHistogram, results, dists, k, cvflann::SearchParams(200));

cout << results << endl;
cout << dists << endl;

flannIndex.save(ofToDataPath("indexChi2.txt"));
Using Chi Squared actually seems to work better for me than L2 distance. My feature vectors are BoW histograms in this case.
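For reference, cvflann::ChiSquareDistance accumulates (to my reading of opencv2/flann/dist.h) the symmetric chi-square statistic

\chi^2(x, y) = \sum_i \frac{(x_i - y_i)^2}{x_i + y_i}

skipping terms where x_i + y_i = 0, which makes it a natural fit for comparing BoW histograms.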
I want to calculate a (ranged) histogram of a cv::GpuMat image of type CV_32FC1 using OpenCV 3.4.7. Speed optimization is my major concern.
I read the documentation (https://docs.opencv.org/3.4.7/d8/d0e/group__cudaimgproc__hist.html) of histogram functions in the namespace cv::cuda and found that, given the cv::GpuMat image were integer valued of type CV_8U, CV_16U, or CV_16S, cv::cuda::histRange would be the function of choice. What would be the analogous way for a floating point valued cv::GpuMat image of type CV_32FC1?
The only ways I can think of are to either download the data to CPU memory, run the CPU variant cv::histRange (which supports cv::Mat of type CV_32F), and upload the result back to GPU memory, or to do a quantization (scaling) and type conversion in GPU memory.
Is there a way to circumvent the overhead?
Thanks @timo for your comment and thanks @Gehová for your answer.
After reading through the source code as @timo suggested, I found out that CV_32F is supported, although it's not stated in the documentation.
Suppose you have some cv::cuda::GpuMat image_gpu of type CV_32FC1, e.g. created by
cv::cuda::GpuMat image_gpu(cv::Size(image_width, image_height), CV_32FC1); // note: cv::Size is (width, height)
then you can straightforwardly calculate a ranged histogram. The example below detects the minimal and maximal values of the (non-constant) image_gpu on the device and downloads those two values to the host, creates an evenly spaced binning vector between min and max on the host, uploads that binning vector to the device, and then calculates the ranged histogram on the device using cv::cuda::histRange().
// create a CUDA stream so the operations below can be queued asynchronously
cv::cuda::Stream cuda_stream;
// set number of bins
int num_bins = 100;
// detect min and max of image_gpu
double min_val, max_val;
cv::cuda::minMax(image_gpu, &min_val, &max_val);
// create binning vector at host
float bin_width = static_cast<float>(max_val - min_val) / num_bins;
cv::Mat_<float> bin_edges(1, num_bins + 1);
for (int bin_index = 0; bin_index < num_bins + 1; bin_index++)
{
bin_edges.at<float>(0, bin_index) = static_cast<float>(min_val) + bin_index * bin_width;
}
// make the histogram calculation inclusive regarding the range [min_val, max_val]
bin_edges.at<float>(0, num_bins) += 1E-08F;
// upload binning vector from host to device
cv::cuda::GpuMat bin_edges_gpu;
bin_edges_gpu.create(1, num_bins + 1, CV_32FC1);
bin_edges_gpu.upload(bin_edges, cuda_stream);
cuda_stream.waitForCompletion();
cv::cuda::GpuMat absolute_histogram_gpu;
absolute_histogram_gpu.create(1, num_bins, CV_32SC1);
// calculate the absolute histogram of image_gpu at the device using OpenCV's cuda implementation
cv::cuda::histRange(image_gpu, absolute_histogram_gpu, bin_edges_gpu, cuda_stream);
cuda_stream.waitForCompletion();
// download the absolute histogram of image_gpu from device to host
cv::Mat_<int32_t> absolute_histogram(1, num_bins);
absolute_histogram_gpu.download(absolute_histogram, cuda_stream);
cuda_stream.waitForCompletion();
Create a wrapper for the function nppiHistogramRange_32f_C1R. You can read the code of the OpenCV function you already mentioned.
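A rough, untested sketch of what such a wrapper could look like; the NPP calls and their signatures are assumed from NPP's image statistics headers, and histRange32f is a hypothetical name:

#include <npp.h>
#include <opencv2/core/cuda.hpp>

// hypothetical wrapper: ranged histogram of a CV_32FC1 GpuMat via NPP
void histRange32f(const cv::cuda::GpuMat& src,     // CV_32FC1 image
                  const cv::cuda::GpuMat& levels,  // 1 x (bins + 1) bin edges, CV_32FC1
                  cv::cuda::GpuMat& hist)          // out: 1 x bins counts, CV_32SC1
{
    NppiSize roi = { src.cols, src.rows };
    const int num_levels = levels.cols;
    hist.create(1, num_levels - 1, CV_32SC1);

    // NPP needs a scratch buffer; query its size first
    int buf_size = 0;
    nppiHistogramRangeGetBufferSize_32f_C1R(roi, num_levels, &buf_size);
    cv::cuda::GpuMat buffer(1, buf_size, CV_8UC1);

    nppiHistogramRange_32f_C1R(src.ptr<Npp32f>(), static_cast<int>(src.step), roi,
                               hist.ptr<Npp32s>(), levels.ptr<Npp32f>(),
                               num_levels, buffer.ptr<Npp8u>());
}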
I am trying to use SIFT descriptors directly for image classification. The SIFT detector is created by Ptr<SIFT> sift = SIFT::create(100), so I expect 100 keypoints to be extracted. But the number of actually detected keypoints (sift->detect(img_resiz, keypoints)) is not always 100, sometimes exceeding the preset value. How can that happen?
I want a fixed number of keypoints per image so as to produce descriptors of consistent length (after being reshaped into a row vector) across different images (alternatively I may need further processing based on bag-of-words to bring the SIFT descriptors to the same dimension).
There was a bug in the function KeyPointsFilter::retainBest(std::vector<KeyPoint>& keypoints, int n_points), as you can see here: https://github.com/opencv/opencv/commit/3f3c8823ac22e34a37d74bc824e00a807535b91b.
I could reproduce it with an older version of OpenCV (3.4.5): sometimes you got 1 more KeyPoint than expected, e.g. 101 instead of 100, because of the marked line.
If you don't want to switch to a newer OpenCV version you could do something like:
// Detect SIFT keypoints
std::vector<cv::KeyPoint> keypoints_sift, keypoints_sift_100;
cv::Ptr<cv::xfeatures2d::SiftFeatureDetector> sift = cv::xfeatures2d::SiftFeatureDetector::create(100);
sift->detect(img, keypoints_sift);
std::cout << keypoints_sift.size() << std::endl;

// keep at most the first 100 keypoints (guarding against fewer detections)
for (size_t i = 0; i < std::min<size_t>(100, keypoints_sift.size()); ++i) {
    keypoints_sift_100.push_back(keypoints_sift[i]);
}
So you keep the 100 best keypoints after detection, since they are ranked by their score: https://docs.opencv.org/4.1.0/d5/d3c/classcv_1_1xfeatures2d_1_1SIFT.html
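If you want to be defensive about that ordering (an extra assumption on my part; the docs say they are already ranked), you could sort by response explicitly before truncating:

// requires <algorithm>; sort keypoints by response, strongest first
std::sort(keypoints_sift.begin(), keypoints_sift.end(),
          [](const cv::KeyPoint& a, const cv::KeyPoint& b) { return a.response > b.response; });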
I'm trying to multiply two images of different color models, in my case HSV and YCrCb.
I get a "vector is out of bounds" error every time.
I have checked the sizes of the input images being multiplied, and the numbers of rows and columns. I know the values can exceed 255.
I tried to implement this method (opencv - image multiplication), but the code has way too many Mats that have to be initialized. This also leads me to ask whether images with more than one channel can be multiplied at all. I also tried direct multiplication and it didn't work, so I tried multiplying channel-wise. To make things easier I used the loop method, but then the error occurred.
A short summary of the code and the reason for writing it: I'm using it for skin detection, but I want to further reduce noise. I think this can be done by multiplying the two output images generated by the threshold operations (for HSV and YCrCb). Since these images contain different noise, the output of the multiplication will have even less noise (I have seen the output on different screens; the overlapping regions are very small), so this should detect skin color at almost all times with minimal noise, which will help in tracking skin better.
The code given below is not complete because it never executes to the end. After this point there are only morphological and dilation operations.
This is my first time asking a question on Stack Overflow and I'm still learning OpenCV. Sorry if I have been over-descriptive; all suggestions are welcome. Thank you.
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv/cv.h>
#include <opencv/highgui.h>
#include <iostream>

using namespace cv;
using namespace std;

char key;
Mat image, hsv, ycr;
vector<Mat> channels, ycrs, threshold_output;

int main()
{
    VideoCapture cap(0); // open the default camera
    if (!cap.isOpened()) // check if we succeeded
    {
        cout << "Cannot open the web cam" << endl;
        return -1;
    }
    while (1)
    {
        cap >> image;
        cvtColor(image, ycr, CV_BGR2YCrCb); // convert to YCrCb
        cvtColor(image, hsv, CV_BGR2HSV);   // convert to HSV

        Mat imgThresholded;
        Mat imgThresholded1;
        inRange(ycr, Scalar(0, 140, 105), Scalar(255, 165, 135), imgThresholded1); // YCrCb range
        inRange(hsv, Scalar(0, 48, 150), Scalar(20, 150, 255), imgThresholded);    // HSV range

        split(imgThresholded1, channels);
        split(imgThresholded, ycrs);
        for (int i = 0; i < 3; i++)
        {
            multiply(channels[i], ycrs[i], threshold_output[i], 1, -1);
        } // code breaks here
Even if the inputs to inRange are multi-channel, the output of inRange will be a single-channel CV_8UC1.
The reason is that inRange computes a Cartesian intersection:
Result (x, y) is true (uchar of 255) if ALL of these are true:
For first channel, lower[0] <= img(x, y)[0] <= upper[0], AND
For second channel, lower[1] <= img(x, y)[1] <= upper[1], AND
And so on.
In other words, after it has checked each channel's pixel values against the lower and upper bounds, the logical result is then "boiled down" with a logical-AND operation over the channels of the image.
"Boiled down" is my colloquial way of referring to reduction, or fold, where a function can accept an arbitrary number of arguments and be "reduced" down to a single value. Summation, multiplication, string concatenation, etc.
It is therefore not necessary to use cv::split on the output of cv::inRange. In fact, because the output has only one channel, accessing channels[1] or ycrs[1] is undefined behavior, which will typically raise an exception in a debug build and cause a crash or memory corruption in a release build.
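For completeness, a minimal sketch of how the two masks could be combined without split(); bitwise_and is swapped in for multiply, which works here because both masks only contain 0 and 255:

// both inRange() outputs are single-channel CV_8UC1 masks, so combine them directly
Mat skinMask;
bitwise_and(imgThresholded, imgThresholded1, skinMask); // 255 only where both masks agree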
After pre-processing the image (by thresholding), I find the contours of the image.
Then I want to get the discrete Fourier descriptor of each contour (using the dft() function).
My code follows:
vector<Mat> contourLines1;
vector<Mat> contourLines2;
getContourLine(exC1, contourLines1, binThreshold, numOfErosions);
getContourLine(exC2, contourLines2, binThreshold, numOfErosions);

// calculate fourier descriptor
Mat fd1 = makeFD(contourLines1.front());
Mat fd2 = makeFD(contourLines2.front());

/////////////////////////
void getContourLine(Mat& img, vector<Mat>& objList, int thresh, int k)
{
    threshold(img, img, thresh, 255, THRESH_BINARY);
    erode(img, img, 0, cv::Point(-1, -1), k);
    cv::findContours(img, objList, CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);
}

/////////////////////////
Mat makeFD(Mat& contour)
{
    Mat result;
    dft(contour, result, DFT_ROWS);
    return result;
}
What is the problem? I can't find it. I think the types of the parameters I pass to some functions (such as cv::findContours or dft) are wrong.
The output of findContours is vector<vector<Point>>. You are providing vector<Mat>. This is a legitimate use (although a bit obscure), but you have to remember that the element type of those matrices is int, while dft works only with matrices of floats. This is what causes the crash. You can use the convertTo function to create matrices of the proper type.
Also, I am not sure the output will have any meaning for whatever computation you are doing. As far as I know, the Fourier transform is supposed to work on a signal, not on coordinates extracted from it.
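A minimal sketch of such a conversion; the reshape into one complex row is my assumption about how the descriptor is meant to be computed:

// possible fix for makeFD, assuming 'contour' is one Nx1 CV_32SC2 entry
// of the vector<Mat> filled by findContours
Mat makeFD(Mat& contour)
{
    Mat floatContour, result;
    contour.convertTo(floatContour, CV_32F);   // dft() needs floating-point data
    floatContour = floatContour.reshape(2, 1); // 1 row of N complex samples (x + i*y)
    dft(floatContour, result, DFT_ROWS);       // one N-point transform along the row
    return result;
}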
Just a stylistic remark: a cleaner way to perform the same threshold is
img = (img > thresh);
I have some problems with OpenCV's flann::Index.
I'm creating the index:
Mat samples = Mat::zeros(vfv_net_quie.size(), 24, CV_32F);
for (int i = 0; i < vfv_net_quie.size(); i++)
{
    for (int j = 0; j < 24; j++)
    {
        samples.at<float>(i, j) = (float)vfv_net_quie[i].vfv[j];
    }
}

cv::flann::Index flann_index(
    samples,
    cv::flann::KDTreeIndexParams(4),
    cvflann::FLANN_DIST_EUCLIDEAN
);

flann_index.save("c:\\index.fln");
After that I'm trying to load it and find the nearest neighbours:
cv::flann::Index flann_index(Mat(),
cv::flann::SavedIndexParams("c:\\index.fln"),
cvflann::FLANN_DIST_EUCLIDEAN
);
cv::Mat resps(vfv_reg_quie.size(), K, CV_32F);
cv::Mat nresps(vfv_reg_quie.size(), K, CV_32S);
cv::Mat dists(vfv_reg_quie.size(), K, CV_32F);
flann_index.knnSearch(sample,nresps,dists,K,cv::flann::SearchParams(64));
And I get an access violation in miniflann.cpp at the line
((IndexType*)index)->knnSearch(_query, _indices, _dists, knn,
(const ::cvflann::SearchParams&)get_params(params));
Please help.
You should not load the FLANN file into a temporary Mat(), because that matrix is where the index data is stored. The temporary is destroyed right after the constructor is called, which is why the index isn't pointing anywhere useful when you call knnSearch().
I tried the following:
cv::Mat indexMat;
cv::flann::Index flann_index(
indexMat,
cv::flann::SavedIndexParams("c:\\index.fln"),
cvflann::FLANN_DIST_EUCLIDEAN
);
resulting in:
Reading FLANN index error: the saved data size (100, 64) or type (5) is different from the passed one (0, 0), 0
which means that the matrix has to be initialized with the correct dimensions (which seems very stupid to me, as I don't necessarily know how many elements are stored in my index).
cv::Mat indexMat(samples.size(), CV_32FC1);
cv::flann::Index flann_index(
indexMat,
cv::flann::SavedIndexParams("c:\\index.fln"),
cvflann::FLANN_DIST_EUCLIDEAN
);
does the trick.
The accepted answer is somewhat unclear and misleading about why the input matrix in the cv::flann::Index constructor must have the same dimensions as the matrix used to generate the saved index. I'll elaborate on @Sau's comment with an example.
Suppose a KDTreeIndex was generated using a cv::Mat sample as input, and then saved. When you load it, you must provide the same sample matrix that was used to generate it, something like (using the templated GenericIndex interface):
cv::Mat sample(sample_num, sample_size, ... /* other params */);
cv::flann::SavedIndexParams index_params("c:\\index.fln");
cv::flann::GenericIndex<cvflann::L2<float>> flann_index(sample, index_params);
L2 is the usual Euclidean distance (other types can be found in opencv2/flann/dist.h).
Now the index can be used as shown below to find the K nearest neighbours of a query point:
std::vector<float> query(sample_size);
std::vector<int> indices(K);
std::vector<float> distances(K);
flann_index.knnSearch(query, indices, distances, K, cv::flann::SearchParams(64));
The matrix indices will contain the locations of the nearest neighbours in the matrix sample, which was used at first to generate the index. That's why you need to load the saved index with the very matrix used to generate the index, otherwise the returned vector will contain indices pointing to meaningless "nearest neighbours".
In addition, you get a distances matrix containing how far each found neighbour is from your query point, which you can later use, for example, to perform inverse distance weighting.
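A tiny sketch of such a weighting (my own illustration, not part of the original answer):

// inverse-distance weights from the distances returned by knnSearch
std::vector<float> weights(K);
float weight_sum = 0.f;
for (int i = 0; i < K; ++i) {
    weights[i] = 1.f / (distances[i] + 1e-6f); // small epsilon avoids division by zero
    weight_sum += weights[i];
}
for (int i = 0; i < K; ++i) weights[i] /= weight_sum; // normalize to sum to 1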
Please also note that sample_size has to match between the sample matrix and the query point.