Subset Armadillo field - C++

If I understand correctly, a field in Armadillo is like a list of arbitrary objects: for instance, a set of matrices of different sizes, or a mix of matrices and vectors. In the documentation I have seen the cube type, which supports slices so you can subset with them. However, there seems to be no specific method to subset fields.
A simplified version of my code is:
arma::mat A = eye(2,2);
arma::mat B = eye(3,3)*3;
arma::mat C = eye(4,4)*4;
arma::field<arma::mat> F(3,1);
F(0,0) = A;
F(1,0) = B;
F(2,1) = C;
// to get matrices B and C
F.slices(1,2);
but I get the error:
Error: field::slices(): indicies out of bounds or incorrectly used

Firstly, there is a small error in the code you presented:
F(2,1) = C;
I assume it should be:
F(2,0) = C;
Secondly, the function slices() is only valid for 3D fields. Your field F, however, is only a 2D field because you only specify rows and columns in the constructor. To access matrices B and C, you can instead use:
arma::field<arma::mat> G = F.subfield(1, 0, 2, 0);
or:
arma::field<arma::mat> G = F.rows(1, 2);
More info on subfield views can be found in the Armadillo documentation.
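For reference, a minimal compilable sketch of the corrected example (assuming a standard Armadillo installation):
#include <armadillo>

int main() {
    arma::mat A = arma::eye(2, 2);
    arma::mat B = arma::eye(3, 3) * 3;
    arma::mat C = arma::eye(4, 4) * 4;

    arma::field<arma::mat> F(3, 1);
    F(0, 0) = A;
    F(1, 0) = B;
    F(2, 0) = C;  // note: (2,0), not (2,1)

    // Rows 1..2 of the 2D field hold B and C.
    arma::field<arma::mat> G = F.rows(1, 2);
    G(0).print("B:");
    G(1).print("C:");
    return 0;
}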

Related

Is there a way to create std::vector<arma::mat> from arma::mat matrices without creating a copy of the matrices?

I am new to C++. For a statistical method, I compute large matrices, e.g. A and B. They are n x n, so for large sample sizes n they become very large. If they are double and n = 70k, I think it might be on the order of 30 GB?
Because the number of matrices needed can vary, I implemented the algorithm to use a vector of matrices and iterate over it for some operations. E.g.
arma::mat A;
arma::mat B;
std::vector<arma::mat> matrices;
matrices = {A, B};
Is there a way to create this std::vector without copying the matrices?
I tried to check whether the memory is the same by doing this:
logger->info("Memory address for A: {}.", (void *)&A);
logger->info("Memory address for matrices.at(0): {}.", (void *)&matrices.at(0));
And it showed different addresses so I assume it is creating a copy but I am not sure.
I tried to use
std::vector<arma::mat> matrices;
matrices.push_back(A);
The memory addresses still differed. With
std::vector<arma::mat> matrices;
matrices.push_back(std::move(A));
the algorithm no longer worked because the moved-from matrices were empty.
You need to reverse the logic: Let the std::vector allocate the memory and create the matrices. Then you work directly with the elements in the vector. For example:
std::vector<arma::mat> matrices;
matrices.resize(2);
arma::mat & A = matrices[0];
arma::mat & B = matrices[1];
// Initialize the values of A and B, then do stuff with them.
Note: It is important to note here that the references to A and B will become invalid if you later add more elements to the vector, i.e. if the vector grows and reallocates. But since these matrices are so large, you want to avoid this at all costs anyway.
If the container needs to grow, you might want to take a look at e.g. std::list.
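For instance, a brief sketch of the std::list variant (the sizes are illustrative):
#include <armadillo>
#include <list>

int main() {
    std::list<arma::mat> matrices;
    // Construct the matrices in place; no copies are made.
    matrices.emplace_back(1000, 1000, arma::fill::zeros);
    matrices.emplace_back(2000, 2000, arma::fill::zeros);
    // Unlike std::vector, references remain valid when the list grows.
    arma::mat& A = matrices.front();
    A(0, 0) = 1.0;
    return 0;
}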
First, allocate the memory in advance. Second, use Armadillo's advanced constructors:
mat(ptr_aux_mem, n_rows, n_cols, copy_aux_mem = true, strict = false)
Pass copy_aux_mem = false so the matrix operates directly on the auxiliary memory instead of copying it.
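For illustration, a minimal sketch of that approach (the buffer name and sizes are assumptions, not code from the question):
#include <armadillo>
#include <vector>

int main() {
    const arma::uword n = 4;
    // Allocate the memory in advance; the buffer must outlive the matrix.
    std::vector<double> buffer(n * n, 0.0);
    // Wrap the buffer without copying: copy_aux_mem = false, strict = true
    // (strict forbids resizing, so the matrix keeps using this memory).
    arma::mat A(buffer.data(), n, n, false, true);
    A.eye();  // writes go directly into the buffer
    return 0;
}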

Multiplying 1xn Eigen::Array with 2xn Eigen::Array, with each column in the 1xn array behaving like a scalar

I have two Eigen::Array which have the same number of columns. One of them, a, has one row, and the other, b, has two rows.
What I want to do, is to multiply every column of b with the entry in the respective column in a, so that it behaves like this:
ArrayXXd result;
result.resizeLike(b);
for (int i = 0; i < a.cols(); ++i)
    result.col(i) = a.col(i)[0] * b.col(i);
However, it's part of a rather long expression with several of such multiplications, and I don't want to have to evaluate intermediate results in temporaries. Therefore, I'd rather get an Eigen expression of the above, like
auto expr = a * b;
This, of course, triggers an assertion, because a.rows() != b.rows().
What I tried, which works, is:
auto expr = a.replicate(2,1) * b;
However, the resulting code is very slow, so I hope there's a better option.
Possibly related.
Eigen provides the possibility to use broadcasting for such cases. However, the one-dimensional array must first be converted into a Vector; as the documentation notes:
broadcasting operations can only be applied with an object of type Vector
This will work in your case:
RowVectorXd av = a;
ArrayXXd expr = b.rowwise() * av.array();
Edit
To avoid copying the data into a new vector, one can use Map:
ArrayXXd expr = b.rowwise() * RowVectorXd::Map(&a(0), a.cols()).array();
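Put together, a minimal compilable sketch of this answer (the array sizes are illustrative):
#include <Eigen/Dense>
#include <iostream>
using namespace Eigen;

int main() {
    ArrayXXd a = ArrayXXd::Random(1, 5);  // 1 x n
    ArrayXXd b = ArrayXXd::Random(2, 5);  // 2 x n

    // Map a's data as a row vector and broadcast it across b's rows.
    ArrayXXd expr = b.rowwise() * RowVectorXd::Map(a.data(), a.cols()).array();
    std::cout << expr << std::endl;
    return 0;
}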
I have posted the same solution to your previous question but here is my answer again:
Define your arrays with a fixed number of rows but a dynamic number of columns, whereas the ArrayXXd type yields an array with a dynamic number of both rows and columns.
Use the fixed-size versions of operations. This should typically give faster code.
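For instance, a sketch of that suggestion (assuming Eigen 3.x; the aliases and function name are illustrative):
#include <Eigen/Dense>

using Array1Xd = Eigen::Array<double, 1, Eigen::Dynamic>;
using Array2Xd = Eigen::Array<double, 2, Eigen::Dynamic>;

// With one row fixed at compile time, a is already a row-vector
// expression, so it broadcasts directly; no conversion or Map is needed.
Array2Xd scaleColumns(const Array1Xd& a, const Array2Xd& b) {
    return b.rowwise() * a;
}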

Eigen and C++11 type inference fails for Cholesky of matrix product

I am trying to take the Cholesky decomposition of the product of a matrix with its transpose, using Eigen and the C++11 "auto" type. The problem comes when I try to do
auto c = a * b;
auto cTc = c.transpose() * c;
auto chol = cTc.llt();
I am using XCode 6.1, Eigen 3.2.2. The type error I get is here.
This minimal example shows the problem on my machine. Change the type of c from auto to MatrixXd to see it work.
#include <iostream>
#include <Eigen/Eigen>
using namespace std;
using namespace Eigen;
int main(int argc, const char* argv[]) {
    MatrixXd a = MatrixXd::Random(100, 3);
    MatrixXd b = MatrixXd::Random(3, 100);
    auto c = a * b;
    auto cTc = c.transpose() * c;
    auto chol = cTc.llt();
    return 0;
}
Is there a way to make this work while still using auto?
As a side question, is there a performance reason to not assert the matrix is a MatrixXd at each stage? Using auto would allow Eigen to keep the answer as whatever weird template expression it fancies. I'm not sure if typing it as MatrixXd would cause problems or not.
The problem is that the first multiplication returns an Eigen::GeneralProduct instead of a MatrixXd, and auto is picking up the return type. You can implicitly create a MatrixXd from an Eigen::GeneralProduct, so when you explicitly declare the type it works correctly. See http://eigen.tuxfamily.org/dox/classEigen_1_1GeneralProduct.html
EDIT: I'm not an expert on the Eigen product or performance characteristics of doing the casting. I just surmised the answer from the error message and confirmed from the online documentation. Profiling is always your best bet for checking the performance of different parts of your code.
Let me summarize what is going on and why it's wrong. First of all, let's replace the auto keywords with the types they are taking:
typedef GeneralProduct<MatrixXd,MatrixXd> Prod;
Prod c = a * b;
GeneralProduct<Transpose<Prod>,Prod> cTc = c.transpose() * c;
Note that Eigen is an expression template library. Here, GeneralProduct<> is an abstract type representing the product. No computations are performed yet. Therefore, if you copy cTc to a MatrixXd as:
MatrixXd d = cTc;
which is equivalent to:
MatrixXd d = c.transpose() * c;
then the product a*b will be carried out twice! So in any case it is much preferable to evaluate a*b within an explicit temporary, and the same for c^T*c:
MatrixXd c = a * b;
MatrixXd cTc = c.transpose() * c;
The last line:
auto chol = cTc.llt();
is also rather wrong. If cTc is an abstract product type, then it tries to instantiate a Cholesky factorization working on an abstract product type, which is not possible. Now, if cTc is a MatrixXd, then your code will work, but this is still not the preferred way, as the method llt() is rather meant for one-liner expressions like:
VectorXd b = ...;
VectorXd x = cTc.llt().solve(b);
If you want a named Cholesky object, then rather use its constructor:
LLT<MatrixXd> chol(cTc);
or even:
LLT<MatrixXd> chol(c.transpose() * c);
which is equivalent unless you have to use c.transpose() * c in other computations.
Finally, depending on the sizes of a and b, it might be preferable to compute cTc as:
MatrixXd cTc = b.transpose() * (a.transpose() * a) * b;
In the future (i.e., Eigen 3.3), Eigen will be able to see:
auto c = a * b;
MatrixXd cTc = c.transpose() * c;
as a product of four matrices m0.transpose() * m1.transpose() * m2 * m3 and put the parentheses in the right place. However, it cannot know that m0==m3 and m1==m2, so if evaluating a*b into a temporary is the preferred way, you will still have to do it yourself.
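For reference, the minimal example rewritten along these lines (a sketch assuming Eigen 3.x; the rhs usage is illustrative):
#include <Eigen/Dense>
using namespace Eigen;

int main() {
    MatrixXd a = MatrixXd::Random(100, 3);
    MatrixXd b = MatrixXd::Random(3, 100);
    MatrixXd c = a * b;                // evaluate the product once
    MatrixXd cTc = c.transpose() * c;  // evaluate once; c is reused
    LLT<MatrixXd> chol(cTc);           // named Cholesky object
    // e.g. solve cTc * x = rhs with: VectorXd x = chol.solve(rhs);
    return 0;
}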
I'm not an expert at Eigen, but libraries like this often return proxy objects from operations and then use implicit conversion or constructors to force the actual work. (Expression Templates are an extreme example of this.) This avoids unnecessary copying of the full matrix of data in many situations. Unfortunately, auto is quite happy to just create an object of the proxy type, which normally users of the library would never explicitly declare. Since you need to ultimately have the numbers calculated, there is not a performance hit per se from casting to a MatrixXd. (Scott Meyers, in "Effective Modern C++", gives the related example of using auto with vector<bool>, where operator[](size_t i) returns a proxy.)
DO NOT use auto with Eigen expressions. I bumped into even more "dramatic" issues with this before, see
eigen auto type deduction in general product
and was advised by one of the Eigen creators (Gael) not to use auto with Eigen expressions.
The cast from an expression to a specific type like MatrixXd should be extremely fast, unless you want lazy evaluation (since the cast forces the result to be evaluated).

libsvm : C++ vs. MATLAB : What's With The Different Accuracies?

I have two multi-class data sets with 5 labels, one for training, and the other for cross validation. These data sets are stored as .csv files, so they act as a control in this experiment.
I have a C++ wrapper for libsvm, and the MATLAB functions for libsvm.
For both C++ and MATLAB:
Using a C-type SVM with an RBF kernel, I iterate over 2 lists of C and Gamma values. For each parameter combination, I train on the training data set and then predict the cross validation data set. I store the accuracy of the prediction in a 2D map keyed by the C and Gamma values that yielded it.
I've recreated different training and cross validation data sets many, many times. Each time, the C++ and MATLAB accuracies are different; sometimes by a lot! Mostly MATLAB produces higher accuracies, but sometimes the C++ implementation is better.
What could be accounting for these differences? The C/Gamma values I'm trying are the same, as are the remaining SVM parameters (default).
There should be no significant differences, as both the C++ and MATLAB codes use the same svm.c file. So what can be the reason?
- an implementation error in your code(s); this is unfortunately the most probable one
- the wrapper you use has some bug and/or uses a different version of libsvm than your MATLAB code (libsvm is written in pure C and comes with Python, MATLAB and Java wrappers, so your C++ wrapper is "not official"), or your wrapper assumes some additional default values which are not the defaults in the C/MATLAB/Python/Java implementations
- you perform cross validation in a somewhat randomized form (shuffling the data and then folding, which is completely correct and reasonable, but will lead to different results in two different runs)
- there is some rounding/conversion performed while loading the data from .csv in one (or both) of your codes which leads to inconsistencies (really not likely to happen, yet still possible)
I trained an SVC using scikit-learn (sklearn.svm.SVC) within a Python Jupyter notebook. I wanted to use the trained classifier in MATLAB v. 2022a and C++. I needed to verify that all three versions' predictions matched for each implementation of the kernel, decision, and prediction functions. I found some useful guidance from bcorso's implementation of the original libsvm C++ code.
Exporting the structure that represents the trained model is explained in bcorso's post and is required to call his prediction function implementation:
predict(params, sv, nv, a, b, cs, X)
so that it matches sklearn's prediction for the trained classifier instance, clf:
clf.predict(X)
Once I established this match, I created MATLAB versions of bcorso's kernel,
function [k] = kernel_svm(params, sv, X)
    k = zeros(1, length(sv));
    if strcmp(params.kernel, 'linear')
        for i = 1:length(sv)
            k(i) = dot(sv(i,:), X);
        end
    elseif strcmp(params.kernel, 'rbf')
        for i = 1:length(sv)
            k(i) = exp(-params.gamma * dot(sv(i,:) - X, sv(i,:) - X));
        end
    else
        uiwait(msgbox('kernel not defined', 'Error', 'modal'));
    end
    k = k';
end
decision,
function [d] = decision_svm(params, sv, nv, a, b, X)
    %% calculate the kernels
    kvalue = kernel_svm(params, sv, X);
    %% define the start and end index for support vectors for each class
    nr_class = length(nv);
    start = zeros(1, nr_class);
    start(1) = 1;
    %% First Class Loop
    for i = 1:(nr_class-1)
        start(i+1) = start(i) + nv(i) - 1;
    end
    %% Other Classes Nested Loops
    for i = 1:nr_class
        for j = i+1:nr_class
            sum = 0;
            si = start(i);        % first class start
            sj = start(j);        % first class end
            ci = nv(i) + 1;       % next class start
            cj = ci + nv(j) - 1;  % next class end
            for k = si:sj
                sum = sum + a(k) * kvalue(k);
            end
            sum1 = sum;
            sum = 0;
            for k = ci:cj
                sum = sum + a(k) * kvalue(k);
            end
            sum2 = sum;
        end
    end
    %% Add class sums and the intercept
    sumd = sum1 + sum2;
    d = -(sumd + b);
end
and predict functions.
function [class, classIndex] = predict_svm(params, sv, nv, a, b, cs, X)
    dec_value = decision_svm(params, sv, nv, a, b, X);
    if dec_value <= 0
        class = cs(1);
        classIndex = 1;
    else
        class = cs(2);
        classIndex = 0;
    end
end
Translating the Python comprehension syntax for the summations into a MATLAB/C++ equivalent required nested for loops in the decision function.
It is also necessary to account for MATLAB indexing (1-based) vs. Python/C++ indexing (0-based).
The trained classifier model is conveyed by params, sv, nv, a, b, cs, which can be gathered within a structure after having exported the sv and a matrices as .csv files from the Python notebook. I simply created a wrapper MATLAB function svcInfo that builds the structure:
svcStruct = svcInfo();
params = svcStruct.params;
sv = svcStruct.sv;
nv = svcStruct.nv;
a = svcStruct.a;
b = svcStruct.b;
cs = svcStruct.cs;
Alternatively, one can save the structure contents as a MATLAB workspace in a .mat file.
The new case for prediction is provided as a vector X,
% Classifier input feature vector
X = [x1 x2 ... xn];
A simplified C++ implementation that follows bcorso's Python version is fairly similar to this MATLAB implementation, in that it uses nested for loops within the decision function, but with zero-based indexing.
Once tested, I may expand this post with the C++ version based on the MATLAB code shared above.
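In the meantime, a rough, untested sketch of what such a C++ version might look like (the SvcModel layout and function signatures are assumptions, not the author's code; binary case only):
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

struct SvcModel {
    std::string kernel;                   // "linear" or "rbf"
    double gamma;                         // RBF width parameter
    std::vector<std::vector<double>> sv;  // support vectors, one per row
    std::vector<double> a;                // dual coefficients
    double b;                             // intercept
    std::vector<int> cs;                  // the two class labels
};

// Kernel value between one support vector and the input X.
static double kernel_svm(const SvcModel& m, const std::vector<double>& s,
                         const std::vector<double>& X) {
    double acc = 0.0;
    if (m.kernel == "linear") {
        for (std::size_t j = 0; j < X.size(); ++j) acc += s[j] * X[j];
        return acc;
    }
    for (std::size_t j = 0; j < X.size(); ++j) {
        const double diff = s[j] - X[j];
        acc += diff * diff;  // squared distance for the RBF kernel
    }
    return std::exp(-m.gamma * acc);
}

// Binary decision value: weighted sum of kernel values plus the intercept,
// with the same sign convention as the MATLAB code above; zero-based indexing.
static double decision_svm(const SvcModel& m, const std::vector<double>& X) {
    double sum = 0.0;
    for (std::size_t i = 0; i < m.sv.size(); ++i)
        sum += m.a[i] * kernel_svm(m, m.sv[i], X);
    return -(sum + m.b);
}

int predict_svm(const SvcModel& m, const std::vector<double>& X) {
    return decision_svm(m, X) <= 0 ? m.cs[0] : m.cs[1];
}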

Can I solve a system of linear equations, in the form Ax = b with A being sparse, using Eigen?

I need to convert some MATLAB code into C++, and I'm stuck with this instruction:
a = K\F
where K is a sparse matrix of size n x n, and F is a column vector of size n.
I know it's easy to solve that using the Eigen library - I have tried the fullPivLu() method, and I've been able to build a working snippet using a Matrix and a Vector.
However, my K is a SparseMatrix<double> (while F is a VectorXd). My declarations:
SparseMatrix<double> K(nec, nec);
VectorXd F(nec);
and it seems that SparseMatrix doesn't have the fullPivLu() method, nor the lu() one.
I've tried, in fact, these two different approaches, taken from the documentation:
//1.
MatrixXd x = K.fullPivLu().solve(F);
//2.
VectorXf x;
K.lu().solve(F, &x);
They don't work, because fullPivLu() and lu() are not members of 'Eigen::SparseMatrix<_Scalar>'
So, I am asking: is there a way to solve a system of linear equations (MATLAB's mldivide, or '\'), using Eigen for C++, with K being a sparse matrix?
Thank you for any help.
Would Eigen::SparseLU work for you?
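For instance, a minimal sketch (assuming Eigen 3.x; the function name is illustrative and error checking is omitted):
#include <Eigen/Dense>
#include <Eigen/Sparse>
using namespace Eigen;

VectorXd sparseSolve(SparseMatrix<double>& K, const VectorXd& F) {
    K.makeCompressed();                     // SparseLU works on compressed storage
    SparseLU<SparseMatrix<double>> solver;
    solver.analyzePattern(K);               // column reordering to reduce fill-in
    solver.factorize(K);                    // numeric LU factorization
    return solver.solve(F);                 // the equivalent of K\F
}
Other sparse solvers in Eigen (e.g. SimplicialLDLT for symmetric positive definite K) follow the same compute/solve pattern.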