Cython replacing list of tuples by C equivalent - c++

I am trying to speed up an already heavily optimized Cython function that takes lists of two-element tuples of doubles as inputs and returns one as output. To do that, I first need everything to be pure C.
In partially cythonized Python, the tuple-based version resembles this:
def f(inputList1, inputList2):
    cdef double p, i10,i11,i20,i21,a0,a1
    a0,a1=inputList1[-1]
    for i10,i11 in inputList1:
        outputList=[]
        #some more computations involving a0 and a1
        for i20,i21 in inputList2:
            p=f(i10,i11,a0,a1,i21,i20) #f returns a double
            if p==0:
                outputList.append((i10,i21))
            elif p>0:
                outputList.append(g(i10,i21,i20,i21)) #g returns two outputs that are automatically packed into a two-element tuple
        if len(outputList)<1:
            return []
        #some more computations
        a0, a1=i10, i11
    return outputList
The details are not relevant; what matters is the speed and the syntax. The code uses tuple unpacking in the for loops to break the tuples apart, appends whole tuples with append(), fetches the last element of a list, and can return an empty list. It also converts the two outputs of g to a tuple.
I am trying to change all the Python to pure C to either increase the speed or at least not hurt it; in my mind this should be doable (maybe I am wrong?). Since g has to become pure C, it has to return a single object (I guess multiple outputs are out of the question?).
My first idea was to use std::vector and std::pair: the lists of tuples become vector[pair[double,double]], and I modified g to return a pair[double,double] instead of two doubles.
cdef vector[pair[double,double]] f(vector[pair[double,double]] inputList1, vector[pair[double,double]] inputList2):
    cdef double p, i10,i11,i20,i21,a0,a1
    #I have to add the pairs as I cannot use the "for a,b in" syntax
    cdef pair[double,double] i1,i2,e
    cdef vector[pair[double,double]] outputList, emptyList
    a0,a1=inputList1.back()
    for i1 in inputList1:
        i10=i1.first
        i11=i1.second
        outputList=emptyList
        #some more computations involving a0 and a1
        for i2 in inputList2:
            i20=i2.first
            i21=i2.second
            p=f(i10,i11,a0,a1,i21,i20) #f returns a double
            if p==0.:
                outputList.push_back(i1) #I am now using push_back and not append
            elif p>0.:
                outputList.push_back(g(i10,i21,i20,i21)) #g now returns a pair
        if outputList.size()<1:
            return outputList
        #some more computations
        a0, a1=i10, i11
    return outputList
Everything is pure C, but it is 3 times slower!
I also tried std::list and list[list[double]], and I again lose a factor of 3 in speed. There I use i1.back() and i1.front() instead of first and second, which I guess costs speed too. What is the reason for the slowdown? Is there a better C object to use? Is it the syntax I am using? Is explicitly writing i20=i2.first and so on what makes it so slow?
In particular, the syntax of g now seems really silly; maybe the bottleneck comes from there:
cdef pair[double,double] g(double a, double b, double c, double d):
    #looks ugly that I have to define res
    cdef pair[double,double] res
    cdef double res_1, res_2
    res_1=computations1(a,b,c,d)
    res_2=computations2(a,b,c,d)
    #looks ugly
    res.first=res_1
    res.second=res_2
    return res
instead of simply returning res_1, res_2 as before.
EDIT: I redid everything to benchmark the different solutions:
- it turns out list[list[double]] is not an option for me, because later in my code I need to access specific elements of the list through indexing
- vector[np.ndarray[DTYPE, ndim=1]] does not work; I guess you cannot form a vector of Python objects
- vector[pair[double,double]] is actually faster than the Python list version!
For 100000 iterations, the total time is:
1.10413002968s for the Python version
0.781275987625s for the C version (vector[pair[double,double]]).
It still looks very ugly, and I would still like to hear whether this is the right approach.
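For comparison, here is a minimal C++ sketch of the vector-of-pairs shape the question ends up with. The body of g and the combine function are hypothetical stand-ins, since the question does not show the real computations. The sketch only illustrates two points: a pair can be built directly in the return statement, and read-only vectors can be taken by const reference so that the call does not copy them.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Pair = std::pair<double, double>;

// Hypothetical stand-in for g: two results returned as one pair,
// built directly in the return statement.
Pair g(double a, double b, double c, double d) {
    assert(true);  // placeholder for any precondition the real g would check
    return std::make_pair(a + b, c * d);
}

// Hypothetical outer loop. The inputs are only read, so they are taken by
// const reference and no vector is copied when the function is called.
std::vector<Pair> combine(const std::vector<Pair>& in1,
                          const std::vector<Pair>& in2) {
    std::vector<Pair> out;
    out.reserve(in1.size() * in2.size());  // one allocation up front
    for (const Pair& i1 : in1) {
        for (const Pair& i2 : in2) {
            out.push_back(g(i1.first, i1.second, i2.first, i2.second));
        }
    }
    return out;
}
```

Whether either point explains the timings above would have to be measured on the real code.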

Related

A vector of polynomials each defined as a function

I'm trying to get a vector of polynomials, but within the vector have each polynomial defined by a function in Pari.
For example, I want to be able to output a vector of this form:
[f(x) = x-1 , f(x) = x^2 - 1, f(x) = x^3 - 1, f(x) = x^4 - 1, f(x) = x^5 - 1]
A simple vector construction of vector( 5, n, f(x) = x^n-1) doesn't work, outputting [(x)->my(i=1);x^i-1, (x)->my(i=2);x^i-1, (x)->my(i=3);x^i-1, (x)->my(i=4);x^i-1, (x)->my(i=5);x^i-1].
Is there a way of doing this quite neatly?
Update:
I have a function which takes a polynomial in two variables (say x and y), replaces one of those variables (say y) with exp(I*t), and then integrates this between t=0 and t=1, giving a single variable polynomial in x: int(T)=intnum(t=0,1,T(x,exp(I*t)))
Because of the way this is defined, I have to explicitly define a polynomial T(x,y)=..., and then calculate int(T). Simply putting in a polynomial, say int(x*y-1), returns:
*** at top-level: int(x*y-1)
*** ^----------
*** in function int: intnum(t=0,1,T(x,exp(I*t)))
*** ^--------------
*** not a function in function call
*** Break loop: type 'break' to go back to GP prompt
I want to be able to do this for many polynomials, without having to manually type T(x,y)=... for every single one. My plan is to try and do this using the apply feature (so, putting all the polynomials in a vector - for a simple example, vector(5, n, x^n*y-1)). However, because of the way I've defined int, I would need to have each entry in the vector defined as T(x,y)=..., which is where my original question spawned from.
Defining T(x,y)=vector(5, n, x^n*y-1) doesn't seem to help with what I want to calculate. And because of how int is defined, I can't think of any other way to go about trying to tackle this.
Any ideas?
The PARI inbuilt intnum function takes as its third argument an expression rather than a function. This expression can make use of the variable t. (Several inbuilt functions behave like this - they are not real functions).
Your int function can be defined as follows:
int(p)=intnum(t=0, 1, subst(p, y, exp(I*t)))
It takes as an argument a polynomial p and then it substitutes for y when required to do so.
You can then use int(x*y), which returns (0.84147098480789650665250232163029899962 + 0.45969769413186028259906339255702339627*I)*x.
Similarly you can use apply with a vector of polynomials. For example:
apply(int, vector(5, n, x^n*y-1))
Coming back to your original proposal: it's not technically wrong and will work. I just wouldn't recommend it over the subst method, except perhaps if you wanted to perform numerical integration over a class of functions that are not representable as polynomials. Let's suppose int is defined as:
int(T)=intnum(t=0,1,T(x,exp(I*t)))
You can invoke it using the syntax int((x,y) -> x*y). The arrow is the PARI syntax for creating an anonymous function. (This is the difference between an expression and a function - you cannot create your own functions that work like PARI inbuilt functions)
You may even use it with a vector of functions:
apply(int, vector(5, n, (x,y)->x^n*y-1))
I am using the syntax (x,y)->x^n*y-1 here, which is preferable to the f(x,y)=x^n*y-1 you had in your question, but they are essentially the same. (The latter form also defines f as a side effect, which is not wanted, so it is better to use anonymous functions.)

Efficient (non-standard) join of two Eigen::VectorXd

I have two Eigen::VectorXd objects, A and B, with the same dimension n.
I want to create a new vector C such that:
If B[i] is NaN, C[i] = A[i]
Otherwise: C[i] = B[i]
As the application is latency-sensitive, I'd like to avoid making copies of A and B.
Right now I'm using a simple for-loop but I'd like advice on how to implement this in a smart(er) way with Eigen.
Try using select:
C = (B.array() == B.array()).select(B, A);
B==B will be true for the values that are not NaN and false otherwise.
For the true values, select returns the first matrix, for false the second.
As noted below by chtz, a more compact way of writing this would be:
C = B.array().isNaN().select(A, B);
In terms of performance, this is not vectorized (at least last time I checked), but does not introduce copies of A and B. It's probably the same as what you wrote (as far as I can tell without seeing code).
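To make the element-wise semantics concrete, here is a plain standard-library sketch of what the select expression computes. It deliberately does not use Eigen, and the function name coalesce_nan is made up for illustration. It relies on the same fact as the first snippet: a NaN never compares equal to itself.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Element-wise: take b[i] unless it is NaN, in which case fall back to a[i].
std::vector<double> coalesce_nan(const std::vector<double>& a,
                                 const std::vector<double>& b) {
    assert(a.size() == b.size());
    std::vector<double> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        // b[i] == b[i] is false only when b[i] is NaN;
        // std::isnan(b[i]) is the more explicit spelling of the same test.
        c[i] = (b[i] == b[i]) ? b[i] : a[i];
    }
    return c;
}
```

Note that compiler options such as -ffast-math allow the compiler to assume NaN never occurs, which can break this kind of test.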

libsvm : C++ vs. MATLAB : What's With The Different Accuracies?

I have two multi-class data sets with 5 labels, one for training, and the other for cross validation. These data sets are stored as .csv files, so they act as a control in this experiment.
I have a C++ wrapper for libsvm, and the MATLAB functions for libsvm.
For both C++ and MATLAB:
Using a C-type SVM with an RBF kernel, I iterate over 2 lists of C and Gamma values. For each parameter combination, I train on the training data set and then predict the cross validation data set. I store the accuracy of the prediction in a 2D map which correlates to the C and Gamma value which yielded the accuracy.
I've recreated different training and cross validation data sets many, many times. Each time, the C++ and MATLAB accuracies are different; sometimes by a lot! Mostly MATLAB produces higher accuracies, but sometimes the C++ implementation is better.
What could be accounting for these differences? The C/Gamma values I'm trying are the same, as are the remaining SVM parameters (default).
There should be no significant differences, as both the C and MATLAB code use the same svm.c file. So what can be the reason?
- an implementation error in your code(s); this is unfortunately the most probable one
- the wrapper you use has a bug and/or uses a different version of libsvm than your MATLAB code (libsvm is written in pure C and comes with Python, MATLAB and Java wrappers, so your C++ wrapper is "not official"), or your wrapper assumes some additional default values that are not the defaults in the C/MATLAB/Python/Java implementations
- you perform cross validation in a somewhat randomized form (shuffling the data and then folding, which is completely correct and reasonable, but will lead to different results in two different runs)
- some rounding/conversion is performed while loading the data from .csv in one (or both) of your codes, which leads to inconsistencies (really not likely to happen, yet still possible)
I trained an SVC using scikit-learn (sklearn.svm.SVC) within a Python Jupyter notebook. I wanted to use the trained classifier in MATLAB v. 2022a and C++. I needed to verify that the predictions of all three versions matched for each implementation of the kernel, decision, and prediction functions. I found some useful guidance in bcorso's implementation of the original libsvm C++ code.
Exporting the structure that represents the trained model is explained in bcorso's post and is required to call his prediction function implementation:
predict(params, sv, nv, a, b, cs, X)
so that it matches sklearn's version for the trained classifier instance, clf:
clf.predict(X)
Once I established this match, I created MATLAB versions of bcorso's kernel,
function [k] = kernel_svm(params, sv, X)
    k = zeros(1,length(sv));
    if strcmp(params.kernel,'linear')
        for i = 1:length(sv)
            k(i) = dot(sv(i,:),X);
        end
    elseif strcmp(params.kernel,'rbf')
        for i = 1:length(sv)
            k(i) =exp(-params.gamma*dot(sv(i,:)-X,sv(i,:)-X));
        end
    else
        uiwait(msgbox('kernel not defined','Error','modal'));
    end
    k = k';
end
decision,
function [d] = decision_svm(params, sv, nv, a, b, X)
    %% calculate the kernels
    kvalue = kernel_svm(params, sv, X);
    %% define the start and end index for support vectors for each class
    nr_class = length(nv);
    start = zeros(1,nr_class);
    start(1) = 1;
    %% First Class Loop
    for i = 1:(nr_class-1)
        start(i+1) = start(i)+ nv(i)-1;
    end
    %% Other Classes Nested Loops
    for i = 1:nr_class
        for j = i+1:nr_class
            sum = 0;
            si = start(i); %first class start
            sj = start(j); %first class end
            ci = nv(i)+1; %next class start
            cj = ci+ nv(j)-1; %next class end
            for k = si:sj
                sum =sum + a(k) * kvalue(k);
            end
            sum1=sum;
            sum = 0;
            for k = ci:cj
                sum = sum + a(k) * kvalue(k);
            end
            sum2=sum;
        end
    end
    %% Add class sums and the intercept
    sumd = sum1 + sum2;
    d = -(sumd +b);
end
and predict functions.
function [class, classIndex] = predict_svm(params, sv, nv, a, b, cs, X)
    dec_value = decision_svm(params, sv, nv, a, b, X);
    if dec_value <= 0
        class = cs(1);
        classIndex = 1;
    else
        class = cs(2);
        classIndex = 0;
    end
end
Translating the Python comprehension syntax into a MATLAB/C++ equivalent of the summations required nested for loops in the decision function.
It is also necessary to account for MATLAB indexing (base 1) vs. Python/C++ indexing (base 0).
The trained classifier model is conveyed by params, sv, nv, a, b, cs, which can be gathered within a structure after having exported the sv and a matrices as .csv files from the Python notebook. I simply created a wrapper MATLAB function svcInfo that builds the structure:
svcStruct = svcInfo();
params = svcStruct.params;
sv= svcStruct.sv;
nv = svcStruct.nv;
a = svcStruct.a;
b = svcStruct.b;
cs = svcStruct.cs;
Or one can save the structure contents as a MATLAB workspace within a .mat file.
The new case for prediction is provided as a vector X,
%Classifier input feature vector
X=[x1 x2...xn];
A simplified C++ implementation that follows bcorso's Python version is fairly similar to this MATLAB implementation in that it uses the nested "for" loops within the decision function, but it uses zero-based indexing.
Once tested, I may expand this post with the C++ version of the MATLAB code shared above.
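As an illustration of the indexing difference only, here is a minimal C++ sketch of the RBF branch of kernel_svm above. It is not the author's C++ version. The function name rbf_kernel and the choice of std::vector for the support vectors are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// k[i] = exp(-gamma * ||sv[i] - x||^2), one value per support vector.
// Loops run from 0 to size-1, where the MATLAB code runs from 1 to length(sv).
std::vector<double> rbf_kernel(const std::vector<std::vector<double>>& sv,
                               const std::vector<double>& x,
                               double gamma) {
    std::vector<double> k(sv.size());
    for (std::size_t i = 0; i < sv.size(); ++i) {
        assert(sv[i].size() == x.size());
        double sq_dist = 0.0;
        for (std::size_t j = 0; j < x.size(); ++j) {
            const double diff = sv[i][j] - x[j];
            sq_dist += diff * diff;  // same as dot(sv(i,:)-X, sv(i,:)-X)
        }
        k[i] = std::exp(-gamma * sq_dist);
    }
    return k;
}
```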

Dividing each element in a container between a given number C++

I was multiplying each element of a container by a number, so I did the following:
local_it begin = magnitudesBegin;
std::advance(begin , 2);
local_it end = magnitudesBegin;
std::advance(end, 14);
std::transform(begin, end, firstHalf.begin(),
std::bind1st(std::multiplies<double>(),100));
It worked wonders. The problem comes when doing the same to divide each element by a number. Here is a working example of my problem:
const std::size_t stabilitySize = 13;
boost::array<double,stabilitySize> secondHalf;
double fundamental = 707;
boost::array<double, stabilitySize> indexes = {{3,4,5,6,7,8,9,10,11,12,13,14,15}};
std::transform(indexes.begin(), indexes.end(), secondHalf.begin(),
std::bind1st(std::divides<double>(),fundamental));
It does divide. But instead of dividing each element in the array by 707, it divides 707 by each element in the array.
std::bind1st(std::divides<double>(),fundamental)
The code above takes a functor std::divides<double> that takes two arguments and fixes the value of the first argument to be fundamental. That is, it fixes the numerator of the operation, so you get fundamental divided by each element, which is the result you observed. If you want to bind fundamental to be the denominator, use std::bind2nd.
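The difference between binding the first and the second argument can be sketched as follows. This version uses std::bind with placeholders from C++11 instead of bind1st/bind2nd, which were deprecated in C++11 and removed in C++17. The two function names are invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Constant bound as FIRST argument: computes numerator / element
// (what bind1st does with std::divides).
std::vector<double> constant_over_each(const std::vector<double>& in,
                                       double numerator) {
    using namespace std::placeholders;
    std::vector<double> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   std::bind(std::divides<double>(), numerator, _1));
    return out;
}

// Constant bound as SECOND argument: computes element / denominator
// (what bind2nd does with std::divides).
std::vector<double> each_over_constant(const std::vector<double>& in,
                                       double denominator) {
    using namespace std::placeholders;
    assert(denominator != 0.0);
    std::vector<double> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   std::bind(std::divides<double>(), _1, denominator));
    return out;
}
```

With std::multiplies the two forms give the same result because multiplication is commutative, which is why the original multiplication code never exposed the argument order.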
You can try the following. std::divides is not symmetric the way std::multiplies is: with bind1st it divides the constant by each of your elements. Multiplying by the reciprocal avoids the problem:
std::bind1st(std::multiplies<double>(),1.0/707.0));
If the number 707.0 is something like a fundamental constant, the division can be seen as a "conversion"; let's call it "x to y" (I don't know what your numbers represent, so replace this with meaningful words). It would be nice to wrap this "x to y" conversion in a free-standing function for reusability. Then use this function with std::transform.
double x_to_y(double x) {
return x / 707.0;
}
...
std::transform(..., x_to_y);
If you have C++11 available, or want to use a lambda library, another option is to write this inline where it is used. You might find this syntax more readable than parameter binding using bind2nd:
std::transform(..., _1 / 707.0); // when using boost::lambda

R List of numeric vectors -> C++ 2d array with Rcpp

I mainly use R, but eventually would like to use Rcpp to interface with some C++ functions that take in and return 2d numeric arrays. So to start out playing around with C++ and Rcpp, I thought I'd just make a little function that converts my R list of variable-length numeric vectors to the C++ equivalent and back again.
require(inline)
require(Rcpp)
test1 = cxxfunction(signature(x='List'), body =
'
  using namespace std;
  List xlist(x);
  int xlen = xlist.size();
  vector< vector<int> > xx;
  for(int i=0; i<xlen; i++) {
    vector<int> test = as<vector<int> > (xlist[i]);
    xx.push_back(test);
  }
  return(wrap(xx));
'
, plugin='Rcpp')
This works like I expect:
> test1(list(1:2, 4:6))
[[1]]
[1] 1 2
[[2]]
[1] 4 5 6
Admittedly I am only part way through the very thorough documentation, but is there a nicer (i.e. more Rcpp-like) way to do the R -> C++ conversion than with the for loop? I am thinking possibly not, since the documentation mentions that (at least with the built-in methods) as "offers less flexibility and currently handles conversion of R objects into primitive types", but I wanted to check because I'm very much a novice in this area.
I will give you bonus points for a reproducible example, and of course for using Rcpp :) And then I will take those away for not asking on the rcpp-devel list...
As for converting STL types: you don't have to, but when you decide to do it, the as<>() idiom is correct. The only 'better way' I can think of is to do name lookup as you would in R itself:
require(inline)
require(Rcpp)
set.seed(42)
xl <- list(U=runif(4), N=rnorm(4), T2df=rt(4,2))
fun <- cxxfunction(signature(x="list"), plugin="Rcpp", body = '
  Rcpp::List xl(x);
  std::vector<double> u = Rcpp::as<std::vector<double> >(xl["U"]);
  std::vector<double> n = Rcpp::as<std::vector<double> >(xl["N"]);
  std::vector<double> t2 = Rcpp::as<std::vector<double> >(xl["T2df"]);
  // do something clever here
  return(R_NilValue);
')
Hope that helps. Otherwise, the list is always open...
PS As for the two-dim array, that is trickier as there is no native C++ two-dim array. If you actually want to do linear algebra, look at RcppArmadillo and RcppEigen.