Using the SIFT detector and extractor with a FlannBased matcher, and the dictionary for the BOWKMeansTrainer set up like this:
TermCriteria termCrit(CV_TERMCRIT_ITER, 100, 0.001);
int dictionarySize = 15; // -- Same as number of images given in
int retries = 1;
int flags = KMEANS_PP_CENTERS;
BOWKMeansTrainer trainBowTrainer(dictionarySize, termCrit, retries, flags);
the array of clustered descriptors (the vocabulary) comes out as [128 x 15].
Then, when using the BOWImgDescriptorExtractor as the extractor on a different set of 15 images, with the previously clustered array as its vocabulary, the result comes out as [15 x 15].
Why?
I can't find much on how this all actually works, only where to put things and what values to give them.
The result should always be [n x 15] if you have n images and k=15.
But in the first run you looked at the vocabulary, not at the feature representation of the first set of images. The 128 you are seeing there is the SIFT dimensionality; these are 15 "typical" SIFT vectors, not descriptions of your images.
You need to read up on the BoW model, and on why the outcome should always be a vector of length k (potentially sparse, i.e. with many 0s) for each image. I have the impression you expect this approach to produce one 128-dimensional feature vector for each image. Also, k=15 is probably too small, and the training data set is too small as well.
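To make the two shapes concrete, here is a rough sketch of the second stage (assuming OpenCV 3.x with the xfeatures2d contrib module for SIFT; names such as allDescriptors and images are placeholders, not taken from the question):

cv::Ptr<cv::Feature2D> sift = cv::xfeatures2d::SIFT::create();
cv::Ptr<cv::DescriptorMatcher> matcher = cv::DescriptorMatcher::create("FlannBased");

// The vocabulary returned by cluster() prints as [128 x 15]: 15 rows,
// each one a 128-dimensional SIFT cluster centre ("visual word").
cv::Mat vocabulary = trainBowTrainer.cluster(allDescriptors);

cv::BOWImgDescriptorExtractor bowDE(sift, matcher);
bowDE.setVocabulary(vocabulary);

// For each image the BoW descriptor is ONE row of length k = 15: a histogram of
// how many of its keypoints fall into each visual word, not a SIFT descriptor.
cv::Mat bowHistograms;                  // ends up as [15 x 15] for 15 images
for (const cv::Mat& img : images) {     // 'images': your second set of 15 images
    std::vector<cv::KeyPoint> keypoints;
    sift->detect(img, keypoints);
    cv::Mat bowDescriptor;              // 1 x 15
    bowDE.compute(img, keypoints, bowDescriptor);
    bowHistograms.push_back(bowDescriptor);
}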
There is a MATLAB example that matches two images and outputs the rotation and scale:
https://de.mathworks.com/help/vision/examples/find-image-rotation-and-scale-using-automated-feature-matching.html?requestedDomain=www.mathworks.com
My goal is to recreate this example using C++. I am using the same method of keypoint detection (Harris), and the keypoints seem to be mostly identical to the ones MATLAB finds. So far so good.
cv::goodFeaturesToTrack(image_grayscale, corners, number_of_keypoints, 0.01, 5, mask, 3, true, 0.04);
for (int i = 0; i < corners.size(); i++) {
keypoints.push_back(cv::KeyPoint(corners[i], 5));
}
BRISK is used to extract features from the keypoints.
int Threshl = 120;
int Octaves = 8;
float PatternScales = 1.0f;
cv::Ptr<cv::Feature2D> extractor = cv::BRISK::create(Threshl, Octaves, PatternScales);
extractor->compute(image, mykeypoints, descriptors);
These descriptors are then matched using a FlannBasedMatcher.
cv::FlannBasedMatcher matcher;
matcher.match(descriptors32A, descriptors32B, matches);
Now the problem is that about 80% of my matches are wrong and unusable. For the identical set of images MATLAB returns only a couple of matches, of which only ~20% are wrong. I have tried sorting the matches in C++ based on their distance value, with no success. The values range between 300 and 700, and even the matches with the lowest distance are almost entirely incorrect.
Now 20% of good matches are enough to calculate the offset, but a lot of processing power is wasted on checking wrong matches. What would be a better way of sorting out the correct matches, or is there something obvious I am doing wrong?
EDIT:
I have switched from Harris/BRISK to AKAZE, which seems to deliver much better features and matches that can easily be sorted by their distance value. The only downside is the much higher computation time: with two 1000 px wide images, AKAZE needs half a minute to find the keypoints (on a PC). I reduced this by scaling down the images, which gives an acceptable ~3-5 seconds.
The method you are using finds, for each point, the nearest neighbour, no matter how far away it is. Two strategies are common:
1. Match set A to set B and set B to A and keep only matches which exist in both matchings.
2. Use knnMatch with k = 2 and perform a ratio test, i.e. keep only the matches where the first nearest neighbour is much closer than the second, e.g. d1 < 0.8 * d2 (see the sketch below).
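A minimal sketch of the ratio test with knnMatch, reusing the descriptor matrices from the question (already converted to CV_32F for FLANN); the names knnMatches and goodMatches are just illustrative:

cv::FlannBasedMatcher matcher;
std::vector<std::vector<cv::DMatch>> knnMatches;
matcher.knnMatch(descriptors32A, descriptors32B, knnMatches, 2); // 2 best candidates per query descriptor

std::vector<cv::DMatch> goodMatches;
for (const auto& m : knnMatches) {
    // keep a match only if its best neighbour is clearly better than the second best
    if (m.size() == 2 && m[0].distance < 0.8f * m[1].distance)
        goodMatches.push_back(m[0]);
}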
The MATLAB code uses SURF. OpenCV also provides SURF, SIFT and AKAZE; try one of these. SURF in particular would be interesting for a comparison.
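Since the edit above mentions switching to AKAZE, here is a rough sketch of that route (the image variables are placeholders). AKAZE's default descriptors are binary, so a Hamming-distance matcher is the natural choice, and the same ratio test as above applies:

cv::Ptr<cv::AKAZE> akaze = cv::AKAZE::create();
std::vector<cv::KeyPoint> keypointsA, keypointsB;
cv::Mat descriptorsA, descriptorsB;
akaze->detectAndCompute(imageA, cv::noArray(), keypointsA, descriptorsA);
akaze->detectAndCompute(imageB, cv::noArray(), keypointsB, descriptorsB);

cv::BFMatcher matcher(cv::NORM_HAMMING);          // binary descriptors -> Hamming distance
std::vector<std::vector<cv::DMatch>> knnMatches;
matcher.knnMatch(descriptorsA, descriptorsB, knnMatches, 2);
// ...then filter with the ratio test shown above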
I am quite new to both OpenCV and Support Vector Machines. I want to use an SVM to train on a dataset with two labels and then predict the label of a given set of values. My current set contains about 600 rows with an equal class distribution (300 labelled 1 and 300 labelled -1) and 34 columns.
This is my current code for setting up OpenCV's SVM. I am using OpenCV 3.0.0
// trainingData is an int array with size 600x34
// labels is an int array with size 600, they're the labels corresponding to the trainingData rows
cv::Mat trainingDataMat(600, 34, CV_32FC1, trainingData);
cv::Mat labelsMat(600, 1, CV_32SC1, labels);
cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
cv::Ptr<cv::ml::TrainData> tempData = cv::ml::TrainData::create(trainingDataMat, cv::ml::ROW_SAMPLE, labelsMat);
svm->setType(cv::ml::SVM::C_SVC);
svm->setKernel(cv::ml::SVM::RBF);
svm->setTermCriteria(cv::TermCriteria(cv::TermCriteria::MAX_ITER, 100, 0.001));
// Assign the SVM parameters to the most accurate result
svm->trainAuto(tempData);
// Train the SVM
svm->train(trainingDataMat, cv::ml::ROW_SAMPLE, labelsMat);
// predictRow contains a row of data with 34 columns to predict against the SVM Model
cv::Mat sampleMat(1, 34, CV_32FC1, predictRow);
// Prediction
float response = svm->predict(sampleMat);
std::cout << response << std::endl;
The SVM training seems to work fine. But when I predict a row, the response is always "1" no matter what the input looks like. Even when I try to predict using training rows labelled "-1" that I used earlier, the response is still "1".
I tried to increase the max iteration parameter for the termination criteria to a large number. The training process takes more time but the results are still the same.
I tried the libsvm library (https://www.csie.ntu.edu.tw/~cjlin/libsvm/) to see if the same behaviour occurs. Interestingly, it worked well: I used the Windows "svm-train.exe" and "svm-predict.exe" commands to validate it, and the responses are accurate.
I even tried to run the executables from the OpenCV program by using some dirty system calls and file I/O. The resulting responses for the training rows are correct.
I suspect there is something wrong with my SVM parameters. Even when using the trainAuto function, the SVM model still shows strange behaviour. I wonder if anyone can help me set up the SVM parameters correctly in OpenCV 3.0?
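For reference, a minimal sketch of the cv::ml::SVM flow in OpenCV 3.x as I understand it, assuming the samples are held as float data (the float arrays below are my assumption, not the code from the question):

// Samples must be floating point; wrapping an int array with CV_32FC1 would
// reinterpret the raw bytes rather than convert the values.
float trainingData[600][34] = { /* ... */ };
int labels[600] = { /* ... */ };

cv::Mat trainingDataMat(600, 34, CV_32FC1, trainingData);
cv::Mat labelsMat(600, 1, CV_32SC1, labels);
cv::Ptr<cv::ml::TrainData> data =
    cv::ml::TrainData::create(trainingDataMat, cv::ml::ROW_SAMPLE, labelsMat);

cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
svm->setType(cv::ml::SVM::C_SVC);
svm->setKernel(cv::ml::SVM::RBF);

// trainAuto() cross-validates C and gamma and leaves a trained model behind,
// so it can be followed directly by predict().
svm->trainAuto(data);

float predictRow[34] = { /* ... */ };
cv::Mat sampleMat(1, 34, CV_32FC1, predictRow);
float response = svm->predict(sampleMat);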
I have come across one problem when trying to train data with SVM.
I get some regions (sets of connected pixels) from face images, and the regions from eyes are very similar, so I want to use Hu moments for shape description and an SVM for training.
But the SVM does not work properly: svm.predict afterwards evaluates everything as non-eye; moreover, the same regions that were labelled as eyes and used in the training phase are evaluated as non-eye.
The feature data consists only of the 7 Hu moments. I will post some samples of the source code below, thanks in advance :)
Additional info:
input image:
http://i.stack.imgur.com/GyLO0.png
Setting up a basic SVM for one image:
int image_regions = 10;
Mat training_mat(image_regions ,7,CV_32FC1); // 7 hu moments
Mat labels(image_regions ,1,CV_32FC1); // for labels 1 (eye) and -1 (non eye)
// computing hu moments
Moments moments2=moments(croppedImage,false);
double hu[7];
HuMoments(moments2,hu);
// putting them into the svm training mat
for (int k = 0; k < 7; k++)                       // 7 Hu moments per region
    training_mat.at<float>(counter, k) = hu[k];   // counter is the current region index
if (isEye(...))
{
    labels.at<float>(counter, 0) = 1.0;
}
else
{
    labels.at<float>(counter, 0) = -1.0;
}
//I use the following:
CvSVM svm;
CvSVMParams params;
params.svm_type = CvSVM::C_SVC;
params.kernel_type = CvSVM::LINEAR;
params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER, 1000, 1e-6);
// ... do the above mentioned phase, and then:
svm.train(training_mat, labels, Mat(), Mat(), params);
I hope the following suggestions can help you.
The simplest approach is to use a clustering algorithm and try to cluster the data into two classes. If an algorithm like k-means can do the job, why make things complex by using SVMs and neural nets? I suggest this because your feature vector dimension is very small (7 Hu moments), and so is your number of samples.
Perform feature normalization (see the normalization point below) to make sure the values fall in a limited range.
Check whether your data is really separable. As your data set is small, take a few samples from positive images and a few from negative images and plot the feature vectors. If you can see the difference visually, surely any learning algorithm can do the job for you. As I said earlier, simple tricks can do better than complex math.
Only if you then decide to use SVM you should know the following:
• As I can see from your code, you are using a linear SVM; maybe your data is not separable by a linear kernel. Try a polynomial kernel or other kernels. There is also CvSVM::train_auto in OpenCV; have a look.
• Check whether the feature vector values you are getting are sensible (make sure they are not garbage values).
• You can also perform feature normalization (zero mean and unit variance) before using the data for training; see the sketch after this list.
• Most importantly increase the number of images for training, both positively and negatively labeled.
• Last but not least SVM is not magic, at the end of the day it is just drawing a line between two sets of points. So don’t expect it to classify anything you give it as input.
If nothing works, just improve your feature extraction technique.
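A small sketch of the zero-mean / unit-variance normalization mentioned above, applied column by column to the Hu-moment matrix (training_mat as in the question; the loop itself is my own illustration):

// Normalize each feature column of training_mat to zero mean and unit variance.
// Keep the per-column mean and stddev: the same transform must be applied to any sample at predict time.
for (int j = 0; j < training_mat.cols; j++) {
    Scalar mean, stddev;
    meanStdDev(training_mat.col(j), mean, stddev);
    Mat col = training_mat.col(j);       // a view into the column, modified in place
    col -= mean[0];
    if (stddev[0] > 1e-12)
        col /= stddev[0];
}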
I am having difficulty reading an image, extracting features for training, and testing on new images in OpenCV using SVMs. Can someone please point me to a good link? I have looked at the OpenCV Introduction to Support Vector Machines, but it doesn't help with reading in images, and I am not sure how to incorporate it.
My goal is to classify pixels in an image; these pixels would belong to curves. I understand forming the training matrix (for instance,
image A
1,1 1,2 1,3 1,4 1,5
2,1 2,2 2,3 2,4 2,5
3,1 3,2 3,3 3,4 3,5
I would form my training matrix as a [3][2]={ {1,1} {1,2} {1,3} {1,4} {1,5} {2,1} ..{} }
However, I am a little confused about the labels. From my understanding, I have to specify which row (image) in the training matrix corresponds to a curve or non-curve. But how can I label a training matrix row (image) if some pixels belong to the curve and some do not? For example, in my training matrix [3][2]={ {1,1} {1,2} {1,3} {1,4} {1,5} {2,1} ..{} }, pixels {1,1} and {1,4} belong to the curve but the rest do not.
I've had to deal with this recently, and here's what I ended up doing to get SVM to work for images.
To train your SVM on a set of images, first you have to construct the training matrix for the SVM. This matrix is specified as follows: each row of the matrix corresponds to one image, and each element in that row corresponds to one feature of the class -- in this case, the color of the pixel at a certain point. Since your images are 2D, you will need to convert them to a 1D matrix. The length of each row will be the area of the images (note that the images must be the same size).
Let's say you wanted to train the SVM on 5 different images, and each image was 4x3 pixels. First you would have to initialize the training matrix. The number of rows in the matrix would be 5, and the number of columns would be the area of the image, 4*3 = 12.
int num_files = 5;
int img_area = 4*3;
Mat training_mat(num_files,img_area,CV_32FC1);
Ideally, num_files and img_area wouldn't be hardcoded, but obtained from looping through a directory and counting the number of images and taking the actual area of an image.
The next step is to "fill in" the rows of training_mat with the data from each image. Each element of the image matrix maps, in row-major order, to one position in the corresponding row of the training matrix; for example, if it were the third image, those values would fill the third row of the training matrix. You would have to loop through each image and set the values in the output matrix accordingly.
As for how you would do this in code, you could use reshape(), but I've had issues with that due to matrices not being continuous. In my experience I've done something like this:
Mat img_mat = imread(imgname,0); // I used 0 for greyscale
int ii = 0; // Current column in training_mat
for (int i = 0; i < img_mat.rows; i++) {
    for (int j = 0; j < img_mat.cols; j++) {
        training_mat.at<float>(file_num, ii++) = img_mat.at<uchar>(i, j);
    }
}
Do this for every training image (remembering to increment file_num). After this, you should have your training matrix set up properly to pass into the SVM functions. The rest of the steps should be very similar to examples online.
Note that while doing this, you also have to set up labels for each training image. So for example if you were classifying eyes and non-eyes based on images, you would need to specify which row in the training matrix corresponds to an eye and a non-eye. This is specified as a 1D matrix, where each element in the 1D matrix corresponds to each row in the 2D matrix. Pick values for each class (e.g., -1 for non-eye and 1 for eye) and set them in the labels matrix.
Mat labels(num_files,1,CV_32FC1);
So if the 3rd element in this labels matrix were -1, it means the 3rd row in the training matrix is in the "non-eye" class. You can set these values in the loop where you evaluate each image. One thing you could do is to sort the training data into separate directories for each class, and loop through the images in each directory, and set the labels based on the directory.
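Here is a rough sketch of that directory-based labelling, continuing the 4x3 example (the directory names and file pattern are hypothetical, and cv::glob may not exist in very old OpenCV versions; any other directory listing works too):

// Hypothetical layout: "eyes/" holds one class, "non_eyes/" the other.
std::vector<cv::String> eyeFiles, nonEyeFiles;
cv::glob("eyes/*.png", eyeFiles);
cv::glob("non_eyes/*.png", nonEyeFiles);

int num_files = (int)(eyeFiles.size() + nonEyeFiles.size());
Mat training_mat(num_files, img_area, CV_32FC1);
Mat labels(num_files, 1, CV_32FC1);

int file_num = 0;
for (size_t f = 0; f < eyeFiles.size() + nonEyeFiles.size(); ++f, ++file_num) {
    bool is_eye = (f < eyeFiles.size());
    cv::String name = is_eye ? eyeFiles[f] : nonEyeFiles[f - eyeFiles.size()];
    Mat img_mat = imread(name, 0);                // greyscale, same size as the other images
    int ii = 0;
    for (int i = 0; i < img_mat.rows; i++)
        for (int j = 0; j < img_mat.cols; j++)
            training_mat.at<float>(file_num, ii++) = img_mat.at<uchar>(i, j);
    labels.at<float>(file_num, 0) = is_eye ? 1.0f : -1.0f;  // 1 = eye, -1 = non-eye
}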
The next thing to do is set up your SVM parameters. These values will vary based on your project, but basically you would declare a CvSVMParams object and set the values:
CvSVMParams params;
params.svm_type = CvSVM::C_SVC;
params.kernel_type = CvSVM::POLY;
params.gamma = 3;
// ...etc
There are several examples online on how to set these parameters, like in the link you posted in the question.
Next, you create a CvSVM object and train it based on your data!
CvSVM svm;
svm.train(training_mat, labels, Mat(), Mat(), params);
Depending on how much data you have, this could take a long time. After it's done training, however, you can save the trained SVM so you don't have to retrain it every time.
svm.save("svm_filename"); // saving
svm.load("svm_filename"); // loading
To test your images using the trained SVM, simply read an image, convert it to a 1D matrix, and pass that in to svm.predict():
svm.predict(img_mat_1d);
It will return a value based on what you set as your labels (e.g., -1 or 1, based on my eye/non-eye example above). Alternatively, if you want to test more than one image at a time, you can create a matrix that has the same format as the training matrix defined earlier and pass that in as the argument. The return value will be different, though.
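For completeness, here is a sketch of turning a single test image into the 1D row that predict() expects, mirroring the training loop above (the file name is a placeholder):

Mat test_img = imread("test.png", 0);     // greyscale, must have the same size as the training images
Mat img_mat_1d(1, img_area, CV_32FC1);    // one row, same layout as a training row
int ii = 0;
for (int i = 0; i < test_img.rows; i++)
    for (int j = 0; j < test_img.cols; j++)
        img_mat_1d.at<float>(0, ii++) = test_img.at<uchar>(i, j);

float prediction = svm.predict(img_mat_1d);  // e.g. -1 or 1, matching the labels used above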
Good luck!
I'm very new to OpenCV. I am trying to use the CvNormalBayesClassifier to train my program to learn skin pixel colours.
Currently I have got around 20 human pictures (face/other body parts) under different light conditions and backgrounds. I have also got 20 corresponding responses in which the skin parts are marked red and everything else marked green.
I have trouble understanding how to use the function
bool CvNormalBayesClassifier::train(const CvMat* _train_data, const CvMat* _responses, const CvMat* _var_idx = 0, const CvMat* _sample_idx = 0, bool update = false);
How should I use the two picture libraries I currently have to prepare the values that can be passed in as _train_data and _responses?
Many thanks.
You need to put in train_data the pixel values from your training images, and in responses an index corresponding to the class of each pixel (e.g. 1 for the skin class, 0 for non-skin). var_idx and sample_idx can be left as is; they are used to mask out some of the descriptors or samples in your training set. Set update to true or false depending on how you feed the data: if you pass all descriptors (all the pixels of all your training images) at once, you can leave it false; if you process your training images incrementally (which might be better for memory reasons), you need to update the model.
Let me clarify with some code (not checked, and using the C++ interface to OpenCV, which I strongly recommend over the old C API):
int main(int argc, char **argv)
{
    CvNormalBayesClassifier classifier;
    for (int i = 1; i < argc; ++i) {
        cv::Mat image = cv::imread(argv[i]); // read in your training image
        // read your mask image
        cv::Mat mask = ...
        // little trick: you said red pixels in your mask correspond to skin, so responses are
        // non-zero (255) where the corresponding mask pixel is red and 0 otherwise; the two
        // values only need to identify the two classes.
        cv::Mat responses;
        cv::inRange(mask, CV_RGB(255, 0, 0), CV_RGB(255, 0, 0), responses);
        cv::Mat responsesInt;
        responses.convertTo(responsesInt, CV_32S);                        // train expects integer responses
        responsesInt = responsesInt.reshape(1, image.rows * image.cols);  // one response per pixel (w*h rows)
        // little trick number 2: convert your width x height, N-channel image into a
        // (width*height) x N matrix, as each pixel should be considered one training sample.
        cv::Mat samples;
        image.reshape(1, image.rows * image.cols).convertTo(samples, CV_32F); // train expects float samples
        classifier.train(samples, responsesInt, cv::Mat(), cv::Mat(), true);
    }
    return 0;
}
I did a Google search on this class but didn't find much information; even the official OpenCV documentation does not directly explain the parameters. But I did notice one thing in the OpenCV documentation:
The method trains the Normal Bayes classifier. It follows the conventions of the generic CvStatModel::train() approach with the following limitations:
which directed me to the CvStatModel class, and from there I found something useful. You can probably also take a look at the book from page 471, which gives more details about this class; the book is free on Google Books.