libsvm (C++) Always Outputting Same Prediction
I have implemented an OpenCV/C++ wrapper for libsvm. When doing a grid search over SVM parameters (RBF kernel), the prediction always returns the same label. I have created artificial data sets with very easily separable data (and even tried predicting the same data I just trained on), but it still returns the same label.
I have used the MATLAB implementation of libsvm and achieved high accuracy on the same data set. I must be doing something wrong when setting up the problem, but I've gone through the README many times and I can't quite find the issue.
Here is how I set up the libsvm problem, where data is an OpenCV Mat:
const int rowSize = data.rows;
const int colSize = data.cols;
this->_svmProblem = new svm_problem;
std::memset(this->_svmProblem, 0, sizeof(svm_problem));
//dynamically allocate the X matrix...
this->_svmProblem->x = new svm_node*[rowSize];
for(int row = 0; row < rowSize; ++row)
    this->_svmProblem->x[row] = new svm_node[colSize + 1];
//...and the y vector
this->_svmProblem->y = new double[rowSize];
this->_svmProblem->l = rowSize;
for(int row = 0; row < rowSize; ++row)
{
    for(int col = 0; col < colSize; ++col)
    {
        //set the index and the value. indexing starts at 1.
        this->_svmProblem->x[row][col].index = col + 1;
        double tempVal = (double)data.at<float>(row, col);
        this->_svmProblem->x[row][col].value = tempVal;
    }
    //terminate the vector with an index of -1
    this->_svmProblem->x[row][colSize].index = -1;
    this->_svmProblem->x[row][colSize].value = 0;
    //add the label to the y array
    double tempVal = (double)labels.at<float>(row);
    this->_svmProblem->y[row] = tempVal;
}
}/*createProblem()*/
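For completeness, a minimal sketch of the matching teardown for these allocations (the member names are the ones shown above; this cleanup code is not part of the original snippet):

//hypothetical cleanup mirroring the allocations above
for(int row = 0; row < this->_svmProblem->l; ++row)
    delete[] this->_svmProblem->x[row];
delete[] this->_svmProblem->x;
delete[] this->_svmProblem->y;
delete this->_svmProblem;
this->_svmProblem = NULL;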
Here is how I set up the parameters, where svmParams is my own struct for C/Gamma and such:
this->_svmParameter = new svm_parameter;
std::memset(this->_svmParameter,0,sizeof(svm_parameter));
this->_svmParameter->svm_type = svmParams.svmType;
this->_svmParameter->kernel_type = svmParams.kernalType;
this->_svmParameter->C = svmParams.C;
this->_svmParameter->gamma = svmParams.gamma;
this->_svmParameter->nr_weight = 0;
this->_svmParameter->eps = 0.001;
this->_svmParameter->degree = 1;
this->_svmParameter->shrinking = 0;
this->_svmParameter->probability = 1;
this->_svmParameter->cache_size = 100;
I use the provided parameter/problem checking function (svm_check_parameter) and no errors are returned.
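For concreteness, that check is a single call; a minimal sketch (the exception type is the one my wrapper uses elsewhere):

//svm_check_parameter() returns NULL on success, or an error string
const char* errorMsg = svm_check_parameter(this->_svmProblem, this->_svmParameter);
if(errorMsg != NULL)
    throw MLInterfaceException(errorMsg);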
I then train as such:
this->_svmModel = svm_train(this->_svmProblem, this->_svmParameter);
And then predict like so:
float pred = (float)svm_predict(this->_svmModel, x[i]);
If anyone could point out where I'm going wrong here I would greatly appreciate it. Thank you!
EDIT:
Using this code, I printed the contents of the problem:
for(int i = 0; i < rowSize; ++i)
{
    std::cout << "[";
    for(int j = 0; j < colSize + 1; ++j)
    {
        std::cout << " (" << this->_svmProblem->x[i][j].index << ", "
                  << this->_svmProblem->x[i][j].value << ")";
    }
    std::cout << "]" << " <" << this->_svmProblem->y[i] << ">" << std::endl;
}
Here is the output:
[ (1, -1) (2, 0) (-1, 0)] <1>
[ (1, -0.92394) (2, 0) (-1, 0)] <1>
[ (1, -0.7532) (2, 0) (-1, 0)] <1>
[ (1, -0.75977) (2, 0) (-1, 0)] <1>
[ (1, -0.75337) (2, 0) (-1, 0)] <1>
[ (1, -0.76299) (2, 0) (-1, 0)] <1>
[ (1, -0.76527) (2, 0) (-1, 0)] <1>
[ (1, -0.74631) (2, 0) (-1, 0)] <1>
[ (1, -0.85153) (2, 0) (-1, 0)] <1>
[ (1, -0.72436) (2, 0) (-1, 0)] <1>
[ (1, -0.76485) (2, 0) (-1, 0)] <1>
[ (1, -0.72936) (2, 0) (-1, 0)] <1>
[ (1, -0.94004) (2, 0) (-1, 0)] <1>
[ (1, -0.92756) (2, 0) (-1, 0)] <1>
[ (1, -0.9688) (2, 0) (-1, 0)] <1>
[ (1, 0.05193) (2, 0) (-1, 0)] <3>
[ (1, -0.048488) (2, 0) (-1, 0)] <3>
[ (1, 0.070436) (2, 0) (-1, 0)] <3>
[ (1, 0.15191) (2, 0) (-1, 0)] <3>
[ (1, -0.07331) (2, 0) (-1, 0)] <3>
[ (1, 0.019786) (2, 0) (-1, 0)] <3>
[ (1, -0.072793) (2, 0) (-1, 0)] <3>
[ (1, 0.16157) (2, 0) (-1, 0)] <3>
[ (1, -0.057188) (2, 0) (-1, 0)] <3>
[ (1, -0.11187) (2, 0) (-1, 0)] <3>
[ (1, 0.15886) (2, 0) (-1, 0)] <3>
[ (1, -0.0701) (2, 0) (-1, 0)] <3>
[ (1, -0.17816) (2, 0) (-1, 0)] <3>
[ (1, 0.12305) (2, 0) (-1, 0)] <3>
[ (1, 0.058615) (2, 0) (-1, 0)] <3>
[ (1, 0.80203) (2, 0) (-1, 0)] <1>
[ (1, 0.734) (2, 0) (-1, 0)] <1>
[ (1, 0.9072) (2, 0) (-1, 0)] <1>
[ (1, 0.88061) (2, 0) (-1, 0)] <1>
[ (1, 0.83903) (2, 0) (-1, 0)] <1>
[ (1, 0.86604) (2, 0) (-1, 0)] <1>
[ (1, 1) (2, 0) (-1, 0)] <1>
[ (1, 0.77988) (2, 0) (-1, 0)] <1>
[ (1, 0.8578) (2, 0) (-1, 0)] <1>
[ (1, 0.79559) (2, 0) (-1, 0)] <1>
[ (1, 0.99545) (2, 0) (-1, 0)] <1>
[ (1, 0.78376) (2, 0) (-1, 0)] <1>
[ (1, 0.72177) (2, 0) (-1, 0)] <1>
[ (1, 0.72619) (2, 0) (-1, 0)] <1>
[ (1, 0.80149) (2, 0) (-1, 0)] <1>
[ (1, 0.092327) (2, -1) (-1, 0)] <2>
[ (1, 0.019054) (2, -1) (-1, 0)] <2>
[ (1, 0.15287) (2, -1) (-1, 0)] <2>
[ (1, -0.1471) (2, -1) (-1, 0)] <2>
[ (1, -0.068182) (2, -1) (-1, 0)] <2>
[ (1, -0.094567) (2, -1) (-1, 0)] <2>
[ (1, -0.17071) (2, -1) (-1, 0)] <2>
[ (1, -0.16646) (2, -1) (-1, 0)] <2>
[ (1, -0.030421) (2, -1) (-1, 0)] <2>
[ (1, 0.094346) (2, -1) (-1, 0)] <2>
[ (1, -0.14408) (2, -1) (-1, 0)] <2>
[ (1, 0.090025) (2, -1) (-1, 0)] <2>
[ (1, 0.043706) (2, -1) (-1, 0)] <2>
[ (1, 0.15065) (2, -1) (-1, 0)] <2>
[ (1, -0.11751) (2, -1) (-1, 0)] <2>
[ (1, -0.02324) (2, 1) (-1, 0)] <2>
[ (1, 0.0080356) (2, 1) (-1, 0)] <2>
[ (1, -0.17752) (2, 1) (-1, 0)] <2>
[ (1, 0.011135) (2, 1) (-1, 0)] <2>
[ (1, -0.029063) (2, 1) (-1, 0)] <2>
[ (1, 0.15398) (2, 1) (-1, 0)] <2>
[ (1, 0.097746) (2, 1) (-1, 0)] <2>
[ (1, 0.01018) (2, 1) (-1, 0)] <2>
[ (1, 0.015592) (2, 1) (-1, 0)] <2>
[ (1, -0.062793) (2, 1) (-1, 0)] <2>
[ (1, 0.014444) (2, 1) (-1, 0)] <2>
[ (1, -0.1205) (2, 1) (-1, 0)] <2>
[ (1, -0.18011) (2, 1) (-1, 0)] <2>
[ (1, 0.010521) (2, 1) (-1, 0)] <2>
[ (1, 0.036914) (2, 1) (-1, 0)] <2>
Here, the data is printed in the format [ (index, value) ... ] <label>.
The artificial dataset I created just has 3 classes, all of which are easily separable with a non-linear decision boundary. Each row is a feature vector (observation) with 2 features (x coordinate, y coordinate). libsvm requires each vector to be terminated with a node whose index is -1, so I do that.
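To make that terminator convention concrete, a single two-feature vector would be laid out like this (a sketch; xCoord and yCoord are placeholders):

svm_node vec[3];
vec[0].index = 1;  vec[0].value = xCoord; //feature 1: indexing starts at 1
vec[1].index = 2;  vec[1].value = yCoord; //feature 2
vec[2].index = -1; vec[2].value = 0;      //terminator: index -1, value ignored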
EDIT2:
This edit pertains to the C and Gamma values I use for training, as well as to data scaling. I normally scale data to between 0 and 1 (as suggested here: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf). I will scale this fake dataset as well and try again, although I used this exact same dataset with the MATLAB implementation of libsvm and it could separate the unscaled data with 100% accuracy.
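A minimal sketch of the per-column [0, 1] scaling I mean (assuming a CV_32F cv::Mat; the helper name is mine):

#include <opencv2/core/core.hpp>

//scale each feature (column) of a CV_32F matrix into [0, 1]
void scaleColumnsToUnit(cv::Mat& data)
{
    for(int col = 0; col < data.cols; ++col)
    {
        double minVal, maxVal;
        cv::minMaxLoc(data.col(col), &minVal, &maxVal);
        if(maxVal > minVal) //skip constant columns to avoid dividing by zero
            for(int row = 0; row < data.rows; ++row)
                data.at<float>(row, col) =
                    (float)((data.at<float>(row, col) - minVal) / (maxVal - minVal));
    }
}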
For C and Gamma, I also use the values recommended in the guide. I create two vectors and use a doubly nested loop to try all combinations:
std::vector<double> CList, GList;
double baseNum = 2.0;
for(double j = -5; j <= 15; j += 2) //-5 and 15
    CList.push_back(pow(baseNum, j));
for(double j = -15; j <= 3; j += 2) //-15 and 3
    GList.push_back(pow(baseNum, j));
And the loop looks like:
for(auto CIt = CList.begin(); CIt != CList.end(); ++CIt) //for all C's
{
    double C = *CIt;
    for(auto GIt = GList.begin(); GIt != GList.end(); ++GIt) //for all gammas
    {
        double gamma = *GIt;
        svmParams.svmType = C_SVC;
        svmParams.kernalType = RBF;
        svmParams.C = C;
        svmParams.gamma = gamma;
        ......training code etc..........
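The elided training code above has to score each (C, gamma) pair somehow; a minimal sketch of one plausible approach using libsvm's built-in svm_cross_validation() (the 5-fold count and the accuracy bookkeeping are my own assumptions, not the original code):

//assumes this->_svmParameter has been rebuilt from svmParams first
double* target = new double[this->_svmProblem->l];
svm_cross_validation(this->_svmProblem, this->_svmParameter, 5, target);
int correct = 0;
for(int i = 0; i < this->_svmProblem->l; ++i)
    if(target[i] == this->_svmProblem->y[i])
        ++correct;
double accuracy = 100.0 * correct / this->_svmProblem->l;
delete[] target;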
EDIT3:
Since I keep referencing MATLAB, I will show the accuracy differences. Here is a heat map of the accuracy libsvm yields:
And here is the accuracy map MATLAB yields using the same parameters and same C/Gamma grid:
Here is the code used to generate the C/Gamma lists, and how I train:
CList = 2.^(-15:2:15);%(-5:2:15);
GList = 2.^(-15:2:15);%(-15:2:3);
cmd = ['-q -s 0 -t 2 -c ', num2str(C), ' -g ', num2str(gamma)];
model = ovrtrain(yTrain,xTrain,cmd);
EDIT4:
As a sanity check, I reformatted my fake scaled dataset to conform to the format used by libsvm's Unix/Linux command-line tools. I trained and predicted using a C/Gamma pair found in the MATLAB accuracy map. The prediction accuracy was 100%. Thus I am absolutely doing something wrong in the C++ implementation.
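For reference, the terminal run uses libsvm's stock svm-train/svm-predict tools, roughly like this (the file names and the concrete C/gamma values are placeholders):

svm-train -s 0 -t 2 -c <C> -g <gamma> fake_scaled.txt fake.model
svm-predict fake_scaled.txt fake.model predictions.txt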
EDIT5:
I loaded the model trained from the Linux terminal into my C++ wrapper class, then tried predicting the exact same dataset used for training. The accuracy in C++ was still awful! However, I'm very close to narrowing down the source of the problem. MATLAB and the Linux tools both agree at 100% accuracy, the model has already been proven to yield 100% accuracy on the same dataset it was trained on, and yet my C++ wrapper class shows poor performance with that verified model... so there are three possible situations:
1. The method I use to transform cv::Mats into the svm_node* required for prediction has a problem in it.
2. The method I use to predict labels has a problem in it.
3. Both 1 and 2!
The code to really inspect now is how I create the svm_node. Here it is again:
svm_node** LibSVM::createNode(INPUT const cv::Mat& data)
{
    const int rowSize = data.rows;
    const int colSize = data.cols;
    //dynamically allocate the X matrix...
    svm_node** x = new svm_node*[rowSize];
    if(x == NULL)
        throw MLInterfaceException("Could not allocate SVM Node Array.");
    for(int row = 0; row < rowSize; ++row)
    {
        x[row] = new svm_node[colSize + 1]; //+1 here for the index-terminating -1
        if(x[row] == NULL)
            throw MLInterfaceException("Could not allocate SVM Node.");
    }
    for(int row = 0; row < rowSize; ++row)
    {
        for(int col = 0; col < colSize; ++col)
        {
            double tempVal = data.at<double>(row, col);
            x[row][col].value = tempVal;
        }
        x[row][colSize].index = -1;
        x[row][colSize].value = 0;
    }
    return x;
} /*createNode()*/
And prediction:
cv::Mat LibSVM::predict(INPUT const cv::Mat& data)
{
    if(this->_svmModel == NULL)
        throw MLInterfaceException("Cannot predict; no model has been trained or loaded.");
    cv::Mat predMat;
    //create the libsvm representation of data
    svm_node** x = this->createNode(data);
    //perform prediction for each feature vector
    for(int i = 0; i < data.rows; ++i)
    {
        double pred = svm_predict(this->_svmModel, x[i]);
        predMat.push_back<double>(pred);
    }
    //delete all rows and columns of x
    for(int i = 0; i < data.rows; ++i)
        delete[] x[i];
    delete[] x;
    return predMat;
}
EDIT6:
For those of you tuning in at home, I trained a model in C++ (using the optimal C/Gamma found in MATLAB), saved it to file, and then tried predicting on the training data via the Linux terminal. It scored 100%. Something is wrong with my prediction. o_0
EDIT7:
I finally found the issue. I had tremendous bug-tracking help in finding it. I printed the contents of the svm_node** 2D array used for prediction. The node-building code was copied from the createProblem() method, but there was a piece of it that I failed to copy over to the new function: the index of each feature was never written. There should have been one more line inside the inner loop:
x[row][col].index = col + 1; //indexing starts at 1
And the prediction works fine now.
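With that line restored, the inner loop of createNode() reads:

for(int col = 0; col < colSize; ++col)
{
    x[row][col].index = col + 1; //the missing line: indexing starts at 1
    x[row][col].value = data.at<double>(row, col);
}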
It would be useful to see your gamma value; since your data is not normalized, that can make a huge difference.
The gamma in libsvm is inversely proportional to the hypersphere radius, so if those spheres are too small with respect to the input range, everything will always be activated and the model will always output the same value.
So, the two recommendations would be: 1) scale your input values to the range [-1, 1]; 2) play with the gamma values.
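To see why, recall that libsvm's RBF kernel is K(x, x') = exp(-gamma * ||x - x'||^2); a small illustrative sketch:

#include <cmath>

//the RBF kernel libsvm uses: K(x, x') = exp(-gamma * ||x - x'||^2).
//with unscaled inputs, ||x - x'||^2 can be huge, so K collapses toward 0
//for every pair (or toward 1 if gamma is tiny) and the decision function
//goes flat, hence the constant predictions.
double rbfKernel(const double* a, const double* b, int dim, double gamma)
{
    double dist2 = 0.0;
    for(int i = 0; i < dim; ++i)
        dist2 += (a[i] - b[i]) * (a[i] - b[i]);
    return std::exp(-gamma * dist2);
}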