Hu moments and SVM does not work

I have run into a problem when training an SVM.
I extract several regions (sets of connected pixels) from face images. Regions from eyes look very similar to each other, so I want to use Hu moments to describe their shape and an SVM to classify them.
But the SVM does not work properly: svm.predict classifies everything as non-eye. Even the regions that were labeled as eye and used in the training phase are classified as non-eye.
The feature data consists only of the 7 Hu moments. I will post some samples of the source code in a moment. Thanks in advance :)
Additional info:
input image:
http://i.stack.imgur.com/GyLO0.png
Setting up a basic SVM for one image:
int image_regions = 10;
Mat training_mat(image_regions, 7, CV_32FC1); // one row per region, 7 Hu moments per row
Mat labels(image_regions, 1, CV_32FC1);       // labels: 1 (eye) and -1 (non-eye)
// compute the Hu moments of the current region
Moments moments2 = moments(croppedImage, false);
double hu[7];
HuMoments(moments2, hu);
// copy them into the SVM training matrix
for (int k = 0; k < huCounter; k++)
    training_mat.at<float>(counter, k) = hu[k]; // counter is the index of the current region
if (isEye(...))
{
    labels.at<float>(counter, 0) = 1.0;
}
else
{
    labels.at<float>(counter, 0) = -1.0;
}
// I use the following:
CvSVM svm;
CvSVMParams params;
params.svm_type = CvSVM::C_SVC;
params.kernel_type = CvSVM::LINEAR;
params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER, 1000, 1e-6);
// ... do the phase described above, and then:
svm.train(training_mat, labels, Mat(), Mat(), params);

I hope the following suggestions help.
1) The simplest approach is to use a clustering algorithm and try to cluster the data into two classes. If an algorithm like k-means can do the job, why make things complex by using SVMs and neural nets? I suggest this technique because your feature vector is very small (7 Hu moments), as is your number of samples.
2) Perform feature normalization (described in point 4) to make sure the values fall in a limited range.
3) Check whether your data is really separable. As your dataset is small, take a few samples from positive images and a few from negative images and plot the feature vectors. If you can see the difference visually, any learning algorithm should be able to do the job. As I said earlier, simple tricks can do better than complex math.
4) Only if you then decide to use an SVM, you should know the following:
• As I can see from your code, you are using a linear SVM; maybe your data is not separable by a linear kernel. Try a polynomial kernel or other kernels. OpenCV also offers CvSVM::train_auto, which is worth a look.
• Check that the feature vector values you are getting are proper values (make sure they are not garbage).
• You can also normalize the features to zero mean and unit variance before using them for training.
• Most importantly, increase the number of training images, both positively and negatively labeled.
• Last but not least, an SVM is not magic; at the end of the day it just draws a line between two sets of points. So don't expect it to classify anything you give it as input.
If nothing works, just improve your feature extraction technique.
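The zero-mean, unit-variance normalization mentioned above can be sketched in a few lines. This is a minimal illustration in plain Python rather than OpenCV code: the helper names `fit_standardizer` and `standardize` are hypothetical, and the feature values are made up (three columns instead of seven, with very different magnitudes). The statistics are computed on the training rows only and then reused for any new sample.

```python
import math

def fit_standardizer(rows):
    """Return per-column (means, stds) computed from the training rows."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / n)
            for c, m in zip(cols, means)]
    # Guard against constant columns so we never divide by zero.
    stds = [s if s > 0 else 1.0 for s in stds]
    return means, stds

def standardize(rows, means, stds):
    """Scale every column to zero mean and unit variance."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# Made-up training features: each column lives on a different scale.
train = [
    [2.0e-3, 4.0e-7, 1.0e-10],
    [4.0e-3, 8.0e-7, 3.0e-10],
    [6.0e-3, 6.0e-7, 5.0e-10],
    [8.0e-3, 2.0e-7, 7.0e-10],
]
means, stds = fit_standardizer(train)
scaled = standardize(train, means, stds)

# A new sample must be scaled with the statistics of the training set.
new_scaled = standardize([[5.0e-3, 5.0e-7, 4.0e-10]], means, stds)
```

After scaling, every column contributes on a comparable scale, so a linear kernel is no longer dominated by the column with the largest raw values.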

Related

OpenCV Neural network for images processing

I am new to the AI world and am trying to get some practice.
It looks like I need some third-party experience.
Let's say I need to get rid of image defects (the actual task is trickier).
I hope that a trained NN will be able to interpolate the defect area.
For this reason I am trying to create a simple neural network.
Its input is a grayscale image with a defect (72*54), and its target output is the same image with no defect.
The hidden layer has 2*72*54 neurons.
The main piece of code:
cv::Ptr<cv::ml::ANN_MLP> ann = cv::ml::ANN_MLP::create();
int inputsCount = imageSizes.width * imageSizes.height;
std::vector<int> layerSizes = { inputsCount, inputsCount * 2, inputsCount};
ann->setLayerSizes(layerSizes);
ann->setActivationFunction(cv::ml::ANN_MLP::SIGMOID_SYM);
cv::TermCriteria tc(cv::TermCriteria::MAX_ITER + cv::TermCriteria::EPS, 50, 0.1);
ann->setTermCriteria(tc);
ann->setTrainMethod(cv::ml::ANN_MLP::BACKPROP, 0.0001);
std::cout << "Result : " << ann->train(trainData, cv::ml::ROW_SAMPLE, resData) << std::endl;
ann->predict(trainData, predicted);
My training dataset looks like
Trained on a 10-item dataset, the NN gives bad results on these same inputs. I tried different parameters.
But trained on only 2 images, the NN produces output close to the targets (on the training data).
I suppose that it's not an inappropriate approach and that the solution is not so easy.
Maybe someone has some advice about the parameters, the neural network architecture, or the whole approach.

It seems that the termination criteria were fine for just two samples but were not good enough when training with a larger number of samples. Do try adjusting them, and also the learning rate.
Judging by the quality of the pixels that have been restored properly, the network architecture seems to be fine for this task. Once the network works well on 10 samples, I strongly recommend adding more training samples.

The chief problem is that you have way too little data for the given network.
Your NN is fully connected. The weights for pixel 0,0 are entirely separate from those of pixel 1,0, and pixel 0,1 has different weights again. With so many nodes, you have a lot of weights. So while you have plenty of pixels in 10 images, you have nowhere near enough pixels for all the weights.
A convolutional neural network has far fewer weights, as many of its weights are reused. That means that in training, these weights are trained by multiple pixels from each training image.
Not that I'd expect this to work well with just 10 images. The human expectation is derived from years of human vision, literally billions of images.
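To put rough numbers on this argument, the sketch below counts the trainable parameters of the fully connected layout from the question (72*54 inputs, twice as many hidden neurons, 72*54 outputs) and compares them with the number of target pixels in 10 training images. It assumes one bias per neuron. The small convolutional layout (3x3 kernels, 16 intermediate channels) is purely hypothetical and only illustrates how weight sharing shrinks the count.

```python
def dense_params(layer_sizes):
    """Weights plus one bias per neuron for a fully connected network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

def conv_params(kernel, channels_in, channels_out):
    """Weights plus one bias per output channel for a square convolution."""
    return kernel * kernel * channels_in * channels_out + channels_out

width, height = 72, 54
inputs = width * height                      # 3888 pixels per image

mlp_params = dense_params([inputs, 2 * inputs, inputs])
train_targets = 10 * inputs                  # output pixels in 10 training images

# Hypothetical tiny CNN: 1 -> 16 channels, then 16 -> 1 channel, 3x3 kernels.
cnn_params = conv_params(3, 1, 16) + conv_params(3, 16, 1)

print("fully connected parameters:", mlp_params)     # 60477840
print("target pixels in 10 images:", train_targets)  # 38880
print("hypothetical CNN parameters:", cnn_params)    # 305
```

Under these assumptions the fully connected network has roughly 1500 parameters per training target value, which is why it can memorize two images but struggles to fit ten in a useful way.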

Eigenfaces in OpenCV using C++

I have written code to create eigenfaces. I have taken 3 images of different people as input and calculated the eigenvectors and eigenvalues. Since only 3 images are used, I select all three eigenvectors, each of size 36000x1, as the principal components. When I reshape the eigenvectors to view them as images, I get an eigenface for only one person. The other images are almost completely blank.
I am extracting each eigenvector from covevec (the matrix of eigenvectors of the covariance matrix):
col1=covevec.col(0);
col2=covevec.col(1);
col3=covevec.col(2);
I reshape them as follows:
if (!col1.isContinuous() && !col2.isContinuous() && !col3.isContinuous())
{
    col1 = col1.clone();
    col2 = col2.clone();
    col3 = col3.clone();
}
Mat final1, final2, final3;
final1 = col1.reshape(0, 200);
final2 = col2.reshape(0, 200);
final3 = col3.reshape(0, 200);
This is what final2 looks like:
And the other two look like this:
What am I doing wrong?

Your code looks fine, so what is wrong?
Data, data, data: it is crucial when performing computer vision tasks like this. To give yourself an advantage, use a readily available dataset with corresponding test data - This would work
Also, as berak says, normalising the images will help. In Turk & Pentland (which you should read if you haven't) they state:
Step 6.3: compute the M best eigenvectors of $AA^T$: $u_i = A v_i$
(important: normalize $u_i$ such that $\|u_i\| = 1$)
This means that all your training data will be on the same scale, which gives your algorithm a much better chance of success.
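The quoted step can be illustrated on a toy example. The sketch below is plain Python, not the asker's OpenCV code: the matrix `A` holds made-up values (4 "pixels" by 2 "images"), and the helper names are hypothetical. It takes an eigenvector v of the small matrix A^T A, maps it to u = A v, normalizes u to unit length, and leaves u as an eigenvector of the large matrix A A^T.

```python
import math

def transpose(M):
    return [list(col) for col in zip(*M)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def unit(v):
    """Scale v so that its Euclidean norm is 1."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

# Toy data matrix A: 4 "pixels" (rows) x 2 "images" (columns). Values are made up.
A = [[1.0, -1.0],
     [2.0, 0.0],
     [0.0, 1.0],
     [-3.0, 0.0]]
At = transpose(A)

# The small matrix L = A^T A is only 2 x 2, so its eigenpairs have a closed form.
a = sum(x * x for x in At[0])
b = sum(x * y for x, y in zip(At[0], At[1]))
d = sum(y * y for y in At[1])
lam = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b ** 2)  # largest eigenvalue
v = [b, lam - a]  # eigenvector of L for lam (valid because b != 0)

# u = A v is an eigenvector of A A^T with the same eigenvalue; normalize it.
u = unit(matvec(A, v))
```

With 36000-pixel images the same idea avoids decomposing a 36000x36000 matrix, and the normalization keeps all eigenfaces at unit length.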

Scikit-learn RandomForestClassifier() feature selection, just select the train set?

I'm using scikit-learn for machine learning.
I have 800 samples with 2048 features, so I want to reduce the number of features in the hope of getting better accuracy.
It is a multiclass problem (classes 0-5), and the features consist of 1's and 0's: [1,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0....,0]
I'm using the ensemble method RandomForestClassifier().
Should I run feature selection on the training data only?
Is it enough if I use this code:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3)
clf = RandomForestClassifier(n_estimators=200,
                             warm_start=True,
                             criterion='gini',
                             max_depth=13)
clf.fit(X_train, y_train).transform(X_train)
predicted = clf.predict(X_test)
expected = y_test
confusionMatrix = metrics.confusion_matrix(expected, predicted)
I ask because the accuracy didn't get higher. Is everything OK in the code, or am I doing something wrong?
I'll be very grateful for your help.

I'm not sure I understood your question correctly, so I'll answer what I think I understood =)
First, reducing the dimension of your features (from 2048 to 500, for example) might not give you better results. It all depends on the capacity of your model to capture the geometry of your data. For example, you can get much better results with a linear model if you first reduce the dimension with a non-linear method that captures a particular geometry and 'linearizes' it, instead of applying the linear model directly to the raw data. But that is because the data is intrinsically non-linear, so the linear model cannot capture this geometry in the original space (think of a circle in 2D).
In the code you gave, you did not reduce the dimension, though; you split the data into two datasets (the feature dimension is the same, 2048; only the number of samples changed). Training on a smaller dataset usually results in worse accuracy (data = information; when you leave some out, you lose information). But splitting the data lets you test for overfitting in particular, which is very important. Once the best parameters are chosen (see cross-validation), you should train on all the data you have!
Given your 0.7*800=560 samples, I think a depth of 13 is pretty big and you might overfit. You may want to play with this parameter first if you want to improve your accuracy!
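The cross-validation mentioned above can be sketched without any library. The helper `k_fold_indices` below is hypothetical and written in plain Python for illustration; scikit-learn ships its own cross-validation utilities. It splits 800 sample indices into 5 contiguous folds, so every sample is used for testing exactly once. In practice the samples should be shuffled first, and a parameter such as max_depth would be chosen by averaging the score over the folds.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(800, 5))
for train_idx, test_idx in folds:
    # Each round trains on 640 samples and tests on the remaining 160.
    print(len(train_idx), len(test_idx))
```

Once the best parameter value is found this way, the final model is refit on all 800 samples, as the answer recommends.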

1) Often reducing the features space does not help with accuracy, and using a regularized classifier leads to better results.
2) To do feature selection, you need two methods: one to reduce the set of features, another that does the actual supervised task (classification here).
Have you tried just using the standard classifiers? Clearly you tried the RF, but I'd also try a linear method like LinearSVC/LogisticRegression or a kernel SVC.
If you want to do feature selection, what you need to do is something like this:
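from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import LinearSVC
# an L1-penalized linear model picks the features (or maybe start with SelectKBest())
feature_selector = SelectFromModel(LinearSVC(penalty='l1', dual=False))
feature_selector.fit(X_train, y_train)  # fit the selector on the training set only
X_train_reduced = feature_selector.transform(X_train)
X_test_reduced = feature_selector.transform(X_test)
classifier = RandomForestClassifier().fit(X_train_reduced, y_train)
prediction = classifier.predict(X_test_reduced)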
Or you use a pipeline, as here: http://scikit-learn.org/dev/auto_examples/feature_selection/feature_selection_pipeline.html
Maybe we should add a version without the pipeline to the examples?
[cross-posted from the mailing list where this was originally asked]

Dimensionality reduction or feature selection is definitely advisable if you have more features than samples. You could look into Principal Component Analysis and other modules in sklearn.decomposition to reduce the number of features. There is also a useful section on Feature Selection in the scikit-learn documentation.
After fitting sklearn.decomposition.PCA, you could inspect the explained_variance_ratio_ to determine an advisable number of features (n_components) to reduce to (the point of PCA here is to find a reduced number of features that captures most of the variance in your original feature space). Some might like to retain features that have a cumulative explained_variance_ratio_ above 0.9, 0.95 etc, some like to drop features beyond which the explained_variance_ratio_ drops suddenly. Then refit the PCA with the n_components you like, transform your X_train and X_test, and fit your classifier as above.
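The cumulative-variance rule described above comes down to a short loop. The sketch below is plain Python with a hypothetical helper name and made-up ratios standing in for a fitted PCA's explained_variance_ratio_; it returns the smallest number of components whose cumulative ratio reaches a chosen threshold.

```python
from itertools import accumulate

def components_for_variance(ratios, threshold):
    """Smallest k such that the first k ratios sum to at least threshold."""
    for k, cumulative in enumerate(accumulate(ratios), start=1):
        if cumulative >= threshold:
            return k
    return len(ratios)

# Made-up values in place of pca.explained_variance_ratio_ (sorted, sum to 1).
ratios = [0.5, 0.25, 0.125, 0.0625, 0.03125, 0.03125]

k90 = components_for_variance(ratios, 0.90)
k95 = components_for_variance(ratios, 0.95)
print(k90, k95)
```

The chosen k would then be passed as n_components when refitting the PCA.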

OpenCV Linear SVM not training

I've been stuck on this for some time now. OpenCV's SVM implementation doesn't seem to work for a linear kernel. I'm fairly sure there's no bug in the code: when I change the kernel_type to RBF or POLY, keeping everything else as is, it works.
The reason I say it doesn't work is, I save the generated model and check it out. It shows support vector count as 1. Which is not the case in RBF or POLYnomial kernels.
There's nothing special about the code in itself, I've used OpenCV's SVM implementation before, but never a linear kernel. I tried setting the degree to 1 in a POLY kernel and it results in the same model. Which makes me believe something is buggy here.
The code structure, if required:
Mat trainingdata; //acquire from files. done and correct.
Mat testingdata; //acquire from files. done and correct again.
Mat labels; //corresponding labels. checked and correct.
SVM my_svm;
SVMParams my_params;
my_params.svm_type = SVM::C_SVC;
my_params.kernel_type = SVM::LINEAR; //or poly, with my_params.degree = 1.
my_params.C = 0.02; //doesn't matter if I set it to 20000, makes no difference.
my_svm.train( trainingdata, labels, Mat(), Mat(), my_params );
//train_auto(..) function with 10-fold cross-validation takes the same time as above (~2sec)!
Mat responses;
my_svm.predict( testingdata, responses );
//responses matrix is all wrong.
I have 500 samples from one class and 600 from the other class to test, and the correct classifications I get are: 1/500 and 597/600.
Craziest part:
I have done the same experiment with the same data on libSVM's MATLAB wrapper, and it works. Was just trying to do an OpenCV version of it.

It is not a bug that you always get only one support vector with linear CvSVM.
OpenCV optimizes a linear SVM down to one support vector.
The idea here is that the support vectors define the classification margin, and to do the actual classification only the separating hyperplane is needed and it can be defined by only one vector.
The parameter C doesn't matter if your training data is linearly separable. Maybe that is your case.
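Why a single vector is enough for a linear kernel can be shown with a few lines of arithmetic. The sketch below is plain Python with made-up support vectors, labels, dual coefficients and bias; it is not OpenCV's internal code. It collapses the support vectors into one weight vector w = sum(alpha_i * y_i * x_i) and shows that the decision value is the same either way.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Made-up values for illustration only.
support_vectors = [[1.0, 2.0], [2.0, 0.5], [-1.0, -1.5]]
labels = [1, 1, -1]
alphas = [0.25, 0.5, 0.75]
bias = -0.125

def decision_from_support_vectors(x):
    """Kernel form: sum over support vectors with the linear kernel x_i . x"""
    return sum(a * y * dot(sv, x)
               for a, y, sv in zip(alphas, labels, support_vectors)) + bias

# Collapse all support vectors into a single weight vector.
w = [sum(a * y * sv[j] for a, y, sv in zip(alphas, labels, support_vectors))
     for j in range(len(support_vectors[0]))]

def decision_from_w(x):
    """Compressed form: one dot product with the collapsed vector."""
    return dot(w, x) + bias

sample = [0.5, -1.0]
print(decision_from_support_vectors(sample), decision_from_w(sample))
```

Because the linear kernel is just a dot product, the sum can be folded into one vector without changing any prediction, which is consistent with a saved linear model listing a support vector count of 1.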

OpenCV Face Recognition strange result

I have been using OpenCV's SVM and RF for a multi-class face recognition problem with 11 classes and only 5 images per class. I used two kinds of features - initially a toy intensity image feature (just each image resized to 32x32 grayscale) and then the second feature was simply another toy feature using Tan Triggs preprocessing(link). Here is the feature code:
void Feature::makeFeature(cv::Mat &image, cv::Mat &result)
{
    cv::resize( image, image, cv::Size(32, 32), 0, 0, cv::INTER_CUBIC );
    cv::equalizeHist(image, image);
    // Images must be aligned - Only pitch executed, yaw and roll assumed negligible
    algmt->getAlignedImage( image, image ); // image alignment
    // tan triggs
    {
        tan_triggs_preprocessing(image, result);
        result = result.reshape(0, 1); // make a single row vector, needed for the training samples matrix
    }
    // if plain intensity
    {
        // image.copyTo(result);
        // result.convertTo(result, CV_32F, 1.0f/255.0f);
        // result = result.reshape(0, 1); // make a single row vector, needed for the training samples matrix
    }
}
Here the tan_triggs_preprocessing function is the same as the Tan Triggs preprocessing function given in the link. I added one step: I normalized the result to the range 0 to 1.
The results on the test set were not very good for either feature, as expected, but then I made a silly mistake and discovered something strange. When I accidentally gave the training directory as input for both training and test, I got 100% on the plain intensity feature, but the Tan Triggs feature gave the following result:
SVM Training Complete
Total number of correct: 51 and accuracy: 92.7273
RF Training Complete
Total number of correct: 53 and accuracy: 96.3636
I do know that however much you overfit, the result should be perfect when the training set is used as test input. Everything else is standard; both SVM and RF are set up as in the OpenCV examples. Besides, I get 100% for the plain intensity feature, so of course I am mucking something up when using Tan Triggs. Does anyone have any idea what mistake I am making?
I have used other complex features like LTPs and LQPs without issue, but this preprocessing method is something I want to use. I use the Jain-Learned Miller congealing algorithm for alignment as I assume frontals for face recognition, no pose correction.
