I have two multi-class data sets with 5 labels, one for training and the other for cross validation. These data sets are stored as .csv files, so both implementations read exactly the same data and the files act as a control in this experiment.
I have a C++ wrapper for libsvm, and the MATLAB functions for libsvm.
For both C++ and MATLAB:
Using a C-type SVM with an RBF kernel, I iterate over two lists of C and Gamma values. For each parameter combination, I train on the training data set and then predict the cross validation data set. I store the accuracy of the prediction in a 2D map keyed by the C and Gamma values that yielded it.
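For reference, here is a minimal sketch of that grid-search loop against the standard libsvm C API (svm.h); the function and variable names are mine, and it assumes the training problem and the cross-validation samples have already been loaded from the .csv files, with everything except C and gamma set to libsvm's documented defaults:

#include <svm.h>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

std::map<std::pair<double,double>, double>
grid_search(const svm_problem &prob,            // training set
            const std::vector<svm_node*> &cv_x, // cross-validation samples
            const std::vector<double> &cv_y,    // cross-validation labels
            const std::vector<double> &Cs,
            const std::vector<double> &gammas) {
    std::map<std::pair<double,double>, double> accuracy; // keyed by (C, gamma)
    for (double C : Cs) {
        for (double gamma : gammas) {
            svm_parameter param = {};
            param.svm_type = C_SVC;
            param.kernel_type = RBF;
            param.C = C;
            param.gamma = gamma;
            param.cache_size = 100;  // libsvm defaults from here on
            param.eps = 1e-3;
            param.shrinking = 1;
            svm_model *model = svm_train(&prob, &param);
            int correct = 0;
            for (std::size_t i = 0; i < cv_x.size(); ++i)
                if (svm_predict(model, cv_x[i]) == cv_y[i]) ++correct;
            accuracy[{C, gamma}] = 100.0 * correct / cv_x.size();
            svm_free_and_destroy_model(&model);
        }
    }
    return accuracy;
}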
I've recreated different training and cross validation data sets many, many times. Each time, the C++ and MATLAB accuracies are different; sometimes by a lot! Mostly MATLAB produces higher accuracies, but sometimes the C++ implementation is better.
What could be accounting for these differences? The C/Gamma values I'm trying are the same, as are the remaining SVM parameters (default).
There should be no significant differences, as both the C++ and MATLAB codes use the same svm.c file. So what can be the reason?
An implementation error in your code(s); this is unfortunately the most probable one.
The wrapper you use has some bug, and/or uses a different version of libsvm than your MATLAB code (libsvm is written in pure C and comes with Python, MATLAB and Java wrappers, so your C++ wrapper is "not official"), or your wrapper assumes some additional default values which are not the defaults in the C/MATLAB/Python/Java implementations (see the parameter-checking sketch after this list).
You perform cross validation in a somewhat randomized form (shuffling the data and then folding, which is completely correct and reasonable, but will lead to different results in two different runs).
There is some rounding/conversion performed while loading data from .csv in one (or both) of your codes, which leads to inconsistencies (really not likely to happen, yet still possible).
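One way to rule out the hidden-defaults possibility is to set every svm_parameter field explicitly in the C++ wrapper, mirroring the MATLAB option string (e.g. svmtrain(y, X, '-s 0 -t 2 -c 10 -g 0.5')). A sketch with libsvm's documented defaults spelled out:

svm_parameter param;
param.svm_type     = C_SVC;   // -s 0
param.kernel_type  = RBF;     // -t 2
param.C            = 10;      // -c
param.gamma        = 0.5;     // -g
param.degree       = 3;       // libsvm defaults from here down
param.coef0        = 0;
param.nu           = 0.5;
param.p            = 0.1;
param.cache_size   = 100;
param.eps          = 1e-3;
param.shrinking    = 1;
param.probability  = 0;
param.nr_weight    = 0;       // no per-class weights
param.weight_label = nullptr;
param.weight       = nullptr;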
I trained an SVC using scikit-learn (sklearn.svm.SVC) within a Python Jupyter notebook. I wanted to use the trained classifier in MATLAB R2022a and C++. I needed to verify that all three versions' predictions matched for each implementation of the kernel, decision, and prediction functions. I found some useful guidance from bcorso's implementation of the original libsvm C++ code.
Exporting the structure that represents the model is explained in bcorso's post and is required to call his prediction function implementation:
predict(params, sv, nv, a, b, cs, X)
for it to match sklearn's version for a trained classifier instance, clf:
clf.predict(X)
Once I established this match, I created MATLAB versions of bcorso's kernel,
function [k] = kernel_svm(params, sv, X)
    % Evaluate the kernel between each support vector (the rows of sv) and X.
    % Note: size(sv,1) is used rather than length(sv), which would return the
    % larger matrix dimension and break when there are more features than SVs.
    n_sv = size(sv,1);
    k = zeros(1,n_sv);
    if strcmp(params.kernel,'linear')
        for i = 1:n_sv
            k(i) = dot(sv(i,:),X);
        end
    elseif strcmp(params.kernel,'rbf')
        for i = 1:n_sv
            k(i) = exp(-params.gamma*dot(sv(i,:)-X,sv(i,:)-X));
        end
    else
        uiwait(msgbox('kernel not defined','Error','modal'));
    end
    k = k'; % return as a column vector
end
decision,
function [d] = decision_svm(params, sv, nv, a, b, X)
    %% calculate the kernels
    kvalue = kernel_svm(params, sv, X);
    %% define the start index of the support vectors for each class
    nr_class = length(nv);
    start = zeros(1,nr_class);
    start(1) = 1;
    %% First Class Loop
    for i = 1:(nr_class-1)
        start(i+1) = start(i) + nv(i) - 1;
    end
    %% Other Classes Nested Loops (this simplified version assumes two classes)
    for i = 1:nr_class
        for j = i+1:nr_class
            acc = 0;             % 'sum' would shadow the MATLAB built-in
            si = start(i);       % start of class i support vectors
            sj = start(j);       % end of class i support vectors (start is offset by -1 above)
            ci = nv(i) + 1;      % start of class j support vectors
            cj = ci + nv(j) - 1; % end of class j support vectors
            for k = si:sj
                acc = acc + a(k) * kvalue(k);
            end
            sum1 = acc;
            acc = 0;
            for k = ci:cj
                acc = acc + a(k) * kvalue(k);
            end
            sum2 = acc;
        end
    end
    %% Add class sums and the intercept
    sumd = sum1 + sum2;
    d = -(sumd + b);
end
and predict functions.
function [class, classIndex] = predict_svm(params, sv, nv, a, b, cs, X)
    dec_value = decision_svm(params, sv, nv, a, b, X);
    if dec_value <= 0
        class = cs(1);
        classIndex = 1;
    else
        class = cs(2);
        classIndex = 2; % MATLAB is 1-based; the original 0 was a leftover from the 0-based versions
    end
end
Translating the Python comprehension syntax to a MATLAB/C++ equivalent of the summations required nested for loops in the decision function.
It is also necessary to account for MATLAB indexing (base 1) vs. Python/C++ indexing (base 0).
The trained classifier model is conveyed by params, sv, nv, a, b, cs, which can be gathered within a structure after having exported the sv and a matrices as .csv files from the Python notebook. I simply created a wrapper MATLAB function svcInfo that builds the structure:
svcStruct = svcInfo();
params = svcStruct.params;
sv = svcStruct.sv;
nv = svcStruct.nv;
a = svcStruct.a;
b = svcStruct.b;
cs = svcStruct.cs;
Alternatively, one can save the structure contents as a MATLAB workspace in a .mat file.
The new case for prediction is provided as a vector X,
%Classifier input feature vector
X = [x1 x2 ... xn];
A simplified C++ implementation that follows bcorso's Python version is fairly similar to this MATLAB implementation: it uses the same nested for loops in the decision function, but with zero-based indexing.
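As a rough, untested sketch (the names are mine; it assumes a binary RBF classifier with the support vectors stored row-wise in sv and the dual coefficients in a, in the same order), the C++ decision function could look like this:

#include <cmath>
#include <cstddef>
#include <vector>

struct SvcParams { double gamma; };

// RBF kernel between one support vector and the input feature vector
double rbf_kernel(const std::vector<double> &s,
                  const std::vector<double> &x, double gamma) {
    double d2 = 0.0;
    for (std::size_t f = 0; f < s.size(); ++f) {
        double diff = s[f] - x[f];
        d2 += diff * diff;
    }
    return std::exp(-gamma * d2);
}

// nv[0] and nv[1] are the support-vector counts of the two classes
double decision_svc(const SvcParams &params,
                    const std::vector<std::vector<double>> &sv,
                    const std::vector<int> &nv,
                    const std::vector<double> &a, double b,
                    const std::vector<double> &X) {
    double sum1 = 0.0, sum2 = 0.0;
    for (int k = 0; k < nv[0]; ++k)             // class 0 support vectors
        sum1 += a[k] * rbf_kernel(sv[k], X, params.gamma);
    for (int k = nv[0]; k < nv[0] + nv[1]; ++k) // class 1 support vectors
        sum2 += a[k] * rbf_kernel(sv[k], X, params.gamma);
    return -(sum1 + sum2 + b);
}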
Once tested, I may expand this post with my full C++ version of the MATLAB code shared above.
I'd like to implement an extended Kalman filter in C++ using the Eigen library: I'm interested in robotics, this seems like a good exercise to get better at C++, and it seems like a fun project. I was hoping I could post my code to get some feedback on writing classes and on what my next steps should be. I got the equations from an online class.
What I have so far is below. I've hardcoded a state vector of size 2x1 and an array of measurements as a test, but I would like to change it so I can declare a state vector of any size, and I'll move the array of measurements to a main.cpp file. I just did this in the beginning so I can simply declare an object of this class and quickly test out the functions, and everything seems to be working so far. What I was thinking of doing next is to make another class that takes measurements from some source and converts them into Eigen matrices to pass on to this Kalman filter class. The main questions I have are:
Should I have the measurement update and state prediction as two different functions? Does it really matter? I did that in the first place because I thought it was easier to read.
Should I set the size of things like the state vector in the class constructor or is it better to have something like an initializer function for that?
I read that it's better practice to have class members that are matrices actually be pointers to the matrix, because it makes the class lighter. What does this mean? Is that important if I want to run this on a PC vs something like a raspberry pi?
In the measurementUpdate function, should y, S, K be class members? It'll make the class larger, but then I wouldn't be constructing and destroying the Eigen objects when the program is looping? Is that good practice?
Should there be a class member that takes the measurement inputs or is it better to just pass a value to the measurement update function? Does it matter?
Is it even worth it to try and implement a class for this or is it better to just have a single function that implements the filter?
I was thinking of implementing some getter functions so I can check the state variable and covariance matrix, is it better to just make those members public and not have the getter functions?
Apologies if this is posted in the wrong place and if these are super basic questions; I'm pretty new to this stuff. Thanks for all the help; all advice is appreciated.
header:
#include "eigen3/Eigen/Dense"
#include <iostream>
#include <vector>
class EKF {
public:
EKF();
void filter(Eigen::MatrixXd Z);
private:
void measurementUpdate(Eigen::MatrixXd Z);
void statePrediction();
Eigen::MatrixXd P_; //Initial uncertainty
Eigen::MatrixXd F_; //Linearized state approximation function
Eigen::MatrixXd H_; //Jacobian of linearrized measurement function
Eigen::MatrixXd R_; //Measurement uncertainty
Eigen::MatrixXd I_; //Identity matrix
Eigen::MatrixXd u_; //Mean of state function
Eigen::MatrixXd x_; //Matrix of initial state variables
};
source:
EKF::EKF() {
    double meas[5] = {1.0, 2.1, 1.6, 3.1, 2.4};
    x_.resize(2, 1);
    P_.resize(2, 2);
    u_.resize(2, 1);
    F_.resize(2, 2);
    H_.resize(1, 2);
    R_.resize(1, 1);
    I_.resize(2, 2);
    Eigen::MatrixXd Z(1, 1);
    for(int i = 0; i < 5; i++){
        Z << meas[i];
        measurementUpdate(Z);
        //statePrediction();
    }
}

void EKF::measurementUpdate(Eigen::MatrixXd Z){
    //Calculate measurement residual
    Eigen::MatrixXd y = Z - (H_ * x_);
    Eigen::MatrixXd S = H_ * P_ * H_.transpose() + R_;
    Eigen::MatrixXd K = P_ * H_.transpose() * S.inverse();
    //Calculate posterior state vector and covariance matrix
    x_ = x_ + (K * y);
    P_ = (I_ - (K * H_)) * P_;
}

void EKF::statePrediction(){
    //Predict next state vector
    x_ = (F_ * x_) + u_;
    P_ = F_ * P_ * F_.transpose();
}

void EKF::filter(Eigen::MatrixXd Z){
    measurementUpdate(Z);
    statePrediction();
}
One thing to consider, which will affect the answers to your questions, is how 'generic' a filter you want to make.
There is no restriction in a Kalman filter that the sampling rate of the measurements be constant, nor that you get all the measurements every time. The only restriction is that the measurements appear in increasing time order. If you want to support this, then your measurement function will be passed arrays of variable sizes, the H and R matrices will also be of variable size, and moreover the F and Q matrices (though of constant shape) will need to know the time step; in particular, you will need a function to compute Q (see the sketch after the example below).
As an example of what I mean, consider some sort of survey boat that has a dgps sensor that gives a position every second, a gyro compass that gives the ship's heading twice a second, and an rgps system that gives the range and bearing to a towed buoy every two seconds. In this case we could get measurements like this:
at 0.0 gyro and dgps
at 0.5 gyro
at 1.0 gyro and dgps
at 1.5 gyro
at 2.0 gyro, dgps and rgps
and so on. So we get different numbers of observations at different times.
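A minimal sketch of what that interface could look like (a hypothetical class, not your EKF: the caller passes only the measurements that arrived at this time step, with z, H and R sized for that subset, and F and Q built for the elapsed dt):

#include "eigen3/Eigen/Dense"

struct VariableRateKF {
    Eigen::VectorXd x; // state
    Eigen::MatrixXd P; // state covariance

    // Propagate with a time-dependent transition F(dt) and process noise Q(dt),
    // both computed by the caller for the actual elapsed time
    void predict(const Eigen::MatrixXd &F, const Eigen::MatrixXd &Q) {
        x = F * x;
        P = F * P * F.transpose() + Q;
    }

    // Update with whatever measurements are available at this time step
    void update(const Eigen::VectorXd &z, const Eigen::MatrixXd &H,
                const Eigen::MatrixXd &R) {
        const Eigen::VectorXd y = z - H * x;
        const Eigen::MatrixXd S = H * P * H.transpose() + R;
        const Eigen::MatrixXd K = P * H.transpose() * S.inverse();
        x += K * y;
        P = (Eigen::MatrixXd::Identity(x.size(), x.size()) - K * H) * P;
    }
};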
On a different topic, I've always found it useful to have a way of seeing how well the filter is doing. Somewhat surprisingly, the state covariance matrix is not a way of seeing this: in the linear (as opposed to extended) filter, the state covariance could be computed for all times before you see any data! This is not true for the extended case, as the state covariance depends on the states through the measurement Jacobian, but that is a very weak dependence on the observations.
I think the most useful quality measures are those based on the measurements. Easy ones to compute are the 'innovations' (the difference between the measured values and the values calculated using the predicted state) and the residuals (the difference between the measured values and the values calculated using the updated state). Each of these, over time, should have mean 0. If you want to get fancier, there are the normalised residuals. If ita is the innovations vector, the normalised residuals are
T = inv(S)
u = T*ita
nr[i] = u[i]/sqrt( T[i][i])
The nice thing about the normalised residuals is that each (over time) should have mean 0 but also sd 1 -- if the filter is correctly tuned.
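As an illustration, a small Eigen sketch of the innovations and normalised residuals (the function name and signature are mine):

#include <cmath>
#include "eigen3/Eigen/Dense"

// z: measurement, H/P/R as in the filter, x: the predicted state
Eigen::VectorXd normalisedResiduals(const Eigen::VectorXd &z,
                                    const Eigen::MatrixXd &H,
                                    const Eigen::MatrixXd &P,
                                    const Eigen::MatrixXd &R,
                                    const Eigen::VectorXd &x) {
    const Eigen::VectorXd ita = z - H * x;               // innovations
    const Eigen::MatrixXd S = H * P * H.transpose() + R; // innovation covariance
    const Eigen::MatrixXd T = S.inverse();
    const Eigen::VectorXd u = T * ita;
    Eigen::VectorXd nr(u.size());
    for (int i = 0; i < u.size(); ++i)
        nr(i) = u(i) / std::sqrt(T(i, i)); // each ~N(0,1) over time if well tuned
    return nr;
}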
I am using some legacy C code that passes around lots of raw pointers. To interface with the code, I have to pass a function of the form:
const int N = ...;
T * func(T * x) {
    // TODO Put N elements in x
    return x + N;
}
where this function should write the result into x, and then return x.
Internally, in this function, I am using Eigen extensively to perform some calculations. Then I write the result back to the raw pointer using the Map class. A simple example which mimics what I am doing is this:
const int N = 5;
T * func(T * x) {
    // Do a lot of operations that result in some matrices like
    Eigen::Matrix<T, N, 1> A = ...
    Eigen::Matrix<T, N, 1> B = ...
    Eigen::Map<Eigen::Matrix<T, N, 1>> constraint(x);
    constraint = A - B;
    return x + N;
}
Obviously, there is much more complicated stuff going on internally, but that is the gist of it... Do some calculations with Eigen, then use the Map class to write the result back to the raw pointer.
Now the problem is that when I profile this code with Callgrind and then view the results with KCachegrind, the line
constraint = A - B;
is almost always the bottleneck. This is sort of understandable, because such a line is potentially doing three things:
Constructing the Map object
Performing the calculation
Writing the result to the pointer
So it is understandable that this line might have the longest runtime. But I am a little bit worried that perhaps I am somehow doing an extra copy in that line before the data gets written to the raw pointer.
So is there a better way of writing the result to the raw pointer? Or is that the idiom I should be using?
In the back of my mind, I am wondering if using the placement new syntax would buy me anything here.
Note: This code is mission critical and should run in realtime, so I really need to squeeze every ounce of speed out of it. For instance, getting this call from a runtime of 0.12 seconds to 0.1 seconds would be huge for us. But code legibility is also a huge concern since we are constantly tweaking the model used in the internal calculations.
These two lines of code:
Eigen::Map<Eigen::Matrix<T, N, 1>> constraint(x);
constraint = A - B;
are essentially compiled by Eigen as:
for(int i = 0; i < N; ++i)
    x[i] = A[i] - B[i];
The reality is a bit more complicated because of explicit unrolling and explicit vectorization (both depend on T), but that's essentially it. So the construction of the Map object is essentially a no-op (it is optimized away by any compiler) and no, there is no extra copy going on here.
Actually, if your profiler is able to tell you that the bottleneck lies on this simple expression, then that very likely means that this piece of code has not been inlined, meaning that you did not enable compiler optimization flags (like -O3 with gcc/clang).
My goal is to be able to model signal-dependent Gaussian noise in Halide. I have a model built in OpenCV which I am now porting to Halide. The challenge is that Halide's random number generator is not normally distributed, so I need to use an external function to produce the noise values.
The implementation attempt uses the C++ random number generator to produce normally distributed noise, a Halide Func to produce the signal-dependent standard deviations of the noise at each pixel, and then the noise is added to the pixels in renoise. Below I show the layout of the functions.
// Note: This is an implementation of the noise model found in the paper below:
// "Noise measurement for raw-data of digital imaging sensors by
// automatic segmentation of non-uniform targets"
float get_normal_dist_rand( float mean, float std_dev ) {
    // static: keep the engine's state across calls; a fresh engine on
    // every call would produce the same "random" value each time
    static std::default_random_engine generator;
    std::normal_distribution<float> distribution(mean, std_dev);
    float out = distribution(generator);
    return out;
}
Func make_get_std_dev( Func *in_func ) {
    Var x, y, c;
    float q = 0.0060;
    float p = 0.0500;
    // std_dev = q * sqrt(unnoised_pixel - p)
    Func get_std_dev("get_std_dev");
    get_std_dev(x,y,c) = q * sqrt( (*in_func)(x,y,c) - p );
    return get_std_dev;
}
Func make_renoise( Func *in_func, Func *std_dev ) {
    Var x, y, c;
    // Noise parameters
    // noised_pixel = unnoised_pixel +
    //     gaussian_rand( std_dev = q * sqrt(unnoised_pixel - p) )
    // q and p values do not vary between channels
    Func renoise("renoise");
    renoise(x,y,c) = (*in_func)(x,y,c) +
                     get_normal_dist_rand(0,(*std_dev)(x,y,c));
    return renoise;
}
This makes sense to me, but unfortunately I receive the following error when I try to compile:
../common/pipe_stages.cpp: In function 'Halide::Func make_renoise(Halide::Func*, Halide::Func*)':
../common/pipe_stages.cpp:223:64: error: cannot convert 'std::enable_if<true, Halide::FuncRef>::type {aka Halide::FuncRef}' to 'float' for argument '2' to 'float get_normal_dist_rand(float, float)'
get_normal_dist_rand(0,(*std_dev)(x,y,c));
^
So it seems that the output of a Func cannot be provided to a C++ function. I guess this makes sense as a limitation of Halide, but I don't really see an alternative to implement the signal dependent normally distributed noise. Is there another way to use external C++ functions in Halide? I have seen folks talking about using "extern" but unfortunately documentation on that functionality seems to be quite light, and I am unable to find what I need.
You'll need to use one of our extern mechanisms to bind to C++ code. HalideExtern_* is the easier of the two and will let you make a call to get random numbers one at a time. Alas, test/correctness/c_function.cpp is the immediate example for this; it will help, but could be clearer.
I expect you'll want to request a buffer of random numbers at a time for efficiency reasons. This can be done via the define_extern mechanism. The C++ function has to participate in bounds inference so it is a little more involved. The test for this is correctness/extern_producer.cpp.
I'd expect either transforming our random numbers to be appropriately distributed or implementing the random number generation algorithm in Halide is the right way to go for really fast production code, but that is likely more work than you want to do to get this working initially.
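As a rough, untested sketch of the HalideExtern_* route (the macro family lives in Halide's Extern.h; check the exact form in your version), the declaration and call site could look like:

// declares an Expr-returning wrapper that calls the C++ function per pixel
HalideExtern_2(float, get_normal_dist_rand, float, float);

Func make_renoise( Func *in_func, Func *std_dev ) {
    Var x, y, c;
    Func renoise("renoise");
    // the wrapper accepts Exprs, so the Func output can be passed directly
    renoise(x,y,c) = (*in_func)(x,y,c) +
                     get_normal_dist_rand(0.0f, (*std_dev)(x,y,c));
    return renoise;
}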
You could also use Halide's RNG along with a binomial approximation to the Gaussian:
Expr gaussian_random(Expr sigma) {
    return (random_float() + random_float() + random_float() - 1.5f) * 2 * sigma;
}
Add more instances of random_float() to get closer and closer to a true normal distribution.
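For instance, the classic sum-of-twelve-uniforms version needs no scaling factor, since each uniform contributes variance 1/12 (a sketch, in the same spirit as the function above):

Expr gaussian_random12(Expr sigma) {
    Expr sum = 0.0f;
    for (int i = 0; i < 12; i++)
        sum = sum + random_float(); // 12 uniforms: variance 12 * (1/12) = 1
    return (sum - 6.0f) * sigma;    // subtract the mean, scale by sigma
}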
I read here that it is possible (and I took it to be straightforward) to call Stan routines from a C++ program.
I have some complex log-likelihood functions which I have coded up in C++ and really have no idea how I could code them using the Stan language. Is it possible to call the Monte Carlo routines in Stan using the log-likelihood function I have already coded in C++? If so are there any examples of this?
It seems like quite a natural thing to do but I cannot find any examples or pointers as to how to do this.
Upon further review (you may want to unaccept my previous answer), you could try this: write a .stan program with a user-defined function in the functions block that has the correct signature (and parses) but basically does nothing. Like this:
functions {
  real foo_log(real[] y, vector beta, matrix X, real sigma) {
    return not_a_number(); // replace this after parsing to C++
  }
}
data {
  int<lower=1> N;
  int<lower=1> K;
  matrix[N,K] X;
  real y[N];
}
parameters {
  vector[K] beta;
  real<lower=0> sigma;
}
model {
  y ~ foo(beta, X, sigma);
  // priors here
}
Then, use CmdStan to compile that model, which will generate a .hpp file as an intermediate step. Edit that .hpp file inside the body of foo_log to call your templated C++ function and also #include the header file(s) where your stuff is defined. Then recompile and execute the binary.
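A purely illustrative sketch of that hand-edit (the template parameters and argument types must be copied verbatim from the foo_log that CmdStan actually generated, which varies by Stan version; my_loglikelihood is your existing function):

#include "my_loglikelihood.hpp" // your templated C++ log-likelihood

template <typename T0, typename T1, typename T3>
typename boost::math::tools::promote_args<T0, T1, T3>::type
foo_log(const std::vector<T0> &y,
        const Eigen::Matrix<T1, Eigen::Dynamic, 1> &beta,
        const Eigen::MatrixXd &X,
        const T3 &sigma, std::ostream *pstream__) {
    return my_loglikelihood(y, beta, X, sigma); // was: return not_a_number();
}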
That might actually work for you, but if whatever you are doing is somewhat widely useful, we would love for you to contribute the C++ stuff.
I think your question is a bit different from the one you linked to. He had a complete Stan program and wanted to drive it from C++, whereas you are asking if you could circumvent writing a Stan program by calling an external C++ function to evaluate the log-likelihood. But that would not get you very far because you still have to pass in the data in a form that Stan can handle, declare to Stan what are the unknown parameters (plus their support), etc. So, I don't think you can (or should) evade learning the Stan language.
But it is fairly easy to expose a C++ function to the Stan language, which essentially just involves adding your my_loglikelihood.hpp file in the right place under ${STAN_HOME}/lib/stan_math_${VERSION}/stan/math/, adding an include statement to the math.hpp file in that subdirectory, and editing ${STAN_HOME}/src/stan/lang/function_signatures.h. At that point, your .stan program could look as simple as
data {
  // declare data like y, X, etc.
}
parameters {
  // declare parameters like theta
}
model {
  // call y ~ my_loglikelihood_log(theta, X)
}
But I think the real answer to your question is that if you have already written a C++ function to evaluate the log-likelihood, then rewriting it in the Stan language shouldn't take more than a few minutes. The Stan language is very C-like so that it is easier to parse the .stan file into a C++ source file. Here is a Stan function I wrote for the log-likelihood of a conditionally Gaussian outcome in a regression context:
functions {
  /**
   * Increments the log-posterior with the logarithm of a multivariate normal
   * likelihood with a scalar standard deviation for all errors
   * Equivalent to y ~ normal(intercept + X * beta, sigma) but faster
   * @param beta vector of coefficients (excluding intercept)
   * @param b precomputed vector of OLS coefficients (excluding intercept)
   * @param middle matrix (excluding ones) typically precomputed as crossprod(X)
   * @param intercept scalar (assuming X is centered)
   * @param ybar precomputed sample mean of the outcome
   * @param SSR positive precomputed value of the sum of squared OLS residuals
   * @param sigma positive value for the standard deviation of the errors
   * @param N integer equal to the number of observations
   */
  void ll_mvn_ols_lp(vector beta, vector b, matrix middle,
                     real intercept, real ybar,
                     real SSR, real sigma, int N) {
    increment_log_prob( -0.5 * (quad_form_sym(middle, beta - b) +
                        N * square(intercept - ybar) + SSR) /
                        square(sigma) - # 0.91... is log(sqrt(2 * pi()))
                        N * (log(sigma) + 0.91893853320467267) );
  }
}
which is basically just me dumping what could otherwise be C-syntax into the body of a function in the Stan language that is then callable in the model block of a .stan program.
So, in short, I think it would probably be easiest for you to rewrite your C++ function as a Stan function. However, it is possible that your log-likelihood involves something exotic for which there is currently no corresponding Stan syntax. In that case, you could fall back to exposing that C++ function to the Stan language and ideally making pull requests to the math and stan repositories on GitHub under stan-dev so that other people could use it (although then you would also have to write unit-tests, documentation, etc.).
I'm new to Eigen and I'm working on a sparse LU problem.
I found that if I create a vector b(n), Eigen can compute x(n) for the equation Ax=b.
Questions:
How to display the L & U, which is the factorization result of the original matrix A?
How to insert non-zeros in Eigen? Right now I just test with some small sparse matrix so I insert non-zeros one by one, but if I have a large-scale matrix, how can I input the matrix in my program?
I realize that this question was asked a long time ago. Apparently, referring to the Eigen documentation:
an expression of the matrix L, internally stored as supernodes. The only operation available with this expression is the triangular solve.
So there is no way to convert this expression to an actual sparse matrix to display it. Eigen::FullPivLU performs a dense decomposition and is of no use to us here: using it on a large sparse matrix, we would quickly run out of memory while trying to convert it to dense, and the time required to compute the factorization would increase by several orders of magnitude.
An alternative solution is to use the CSparse library from SuiteSparse:
extern "C" { // we are in C++ now, since you are using Eigen
#include <csparse/cs.h>
}
const cs *p_matrix = ...; // perhaps possible to use Eigen::internal::viewAsCholmod()
css *p_symbolic_decomposition;
csn *p_factor;
p_symbolic_decomposition = cs_sqr(2, p_matrix, 0); // 1 = ordering A + AT, 2 = ATA
p_factor = cs_lu(p_matrix, m_p_symbolic_decomposition, 1.0); // tol = 1.0 for ATA ordering, or use A + AT with a small tol if the matrix has amostly symmetric nonzero pattern and large enough entries on its diagonal
// calculate ordering, symbolic decomposition and numerical decomposition
cs *L = p_factor->L, *U = p_factor->U;
// there they are (perhaps can use Eigen::internal::viewAsEigen())
cs_sfree(p_symbolic_decomposition); cs_nfree(p_factor);
// clean up (deletes the L and U matrices)
Note that although this does not use explicit vectorization as some Eigen functions do, it is still fairly fast. CSparse is also very compact; it is just a single header and about thirty .c files with no external dependencies. It is easy to incorporate in any C++ project, and there is no need to include all of SuiteSparse.
If you apply Eigen::FullPivLU::matrixLU() to the original matrix, you'll receive the combined LU decomposition matrix. To display L and U separately, you can use the method triangularView<mode>(). In the Eigen wiki you can find a good example of it. Inserting nonzeros into matrices depends on the numbers you want to put in. Eigen has convenient syntax, so you can easily insert values in a loop:
for(int i = 0; i < size; i++)
{
    for(int j = size - 1; j > someNumber; j--) // size - 1: j = size would be out of bounds
    {
        matrix(i,j) = yourClass.getNextNumber();
    }
}
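For the large-scale sparse case specifically, the documented Eigen pattern is to collect the nonzeros as Eigen::Triplet entries while reading your input and build the matrix in one call, then factor with Eigen::SparseLU. A minimal sketch (how the triplets are read from your file format is left out):

#include <Eigen/Sparse>
#include <Eigen/SparseLU>
#include <vector>

// entries: (row, col, value) nonzeros gathered while reading the input
Eigen::VectorXd solveSparse(int n,
                            const std::vector<Eigen::Triplet<double>> &entries,
                            const Eigen::VectorXd &b) {
    Eigen::SparseMatrix<double> A(n, n);
    A.setFromTriplets(entries.begin(), entries.end()); // sums duplicates, compresses
    Eigen::SparseLU<Eigen::SparseMatrix<double>> solver;
    solver.analyzePattern(A); // symbolic step (column ordering)
    solver.factorize(A);      // numeric step
    return solver.solve(b);   // x such that A x = b
}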