I want to use Rcpp to make certain parts of my code more efficient. I have a main R function in which my objects are defined, and inside this R function I call several Rcpp functions that take the R data objects as inputs. This is one example of an Rcpp function that is called from the R function:
void calculateClusters ( List order,
NumericVector maxorder,
List rank,
double lambda,
int nbrClass,
NumericVector nbrExamples) {
int current, i;
for ( current = 0; current < nbrClass; current ++ ) {
maxorder [current] = 0;
for ( i = 0; i < nbrExamples [ current ]; i++ ) {
order[ current ][i] = ( int ) ( rank[ current ][i] / lambda ) - 1;
}
if ( order[ current ][i] > maxorder[ current ] ) {
maxorder[ current ] = order[ current ][i];
}
}
}
This function calculates the maximum number of clusters for each class. In native C++ I would define my List as an int** and my NumericVector as an int*. However, in Rcpp this gives an error. I know the fault lies in the subsetting of these Lists (I handled them the same way as int**).
My question is how I can successfully turn these int** into Lists without losing flexibility. For example, the Lists order and distance have the structure order[[1]][1:500], order[[2]][1:500], which is exactly what an int** would give me in C++: order[1][1:500], order[2][1:500]. If there are 3 classes, the order and distance Lists become order[[1]][1:500], order[[2]][1:500], order[[3]][1:500]. How can I do this in Rcpp?
Briefly:
Example from the Rcpp Gallery
Example from the Rcpp Examples package on CRAN
Related
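As a plain C++ illustration of the indexing pattern (not Rcpp itself): a common way around the double-subscript problem is to pull each inner vector out of the outer container once and then index that vector. In Rcpp the analogous step would be something like `NumericVector r = rank[current];` followed by `r[i]`. The sketch below uses `std::vector` so that it compiles without R; the function name `maxOrderPerClass` is made up for the example, and the comparison against the running maximum is placed inside the inner loop so that `i` is always in range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for each class, turn ranks into cluster indices
// (int)(rank / lambda) - 1 and record the largest index seen (starting at 0).
std::vector<int> maxOrderPerClass(const std::vector<std::vector<double>>& rank,
                                  double lambda,
                                  std::vector<std::vector<int>>& order) {
    assert(lambda > 0.0);
    std::vector<int> maxorder(rank.size(), 0);
    order.assign(rank.size(), std::vector<int>());
    for (std::size_t c = 0; c < rank.size(); ++c) {
        const std::vector<double>& r = rank[c];  // take the inner vector once
        std::vector<int>& o = order[c];
        o.resize(r.size());
        for (std::size_t i = 0; i < r.size(); ++i) {
            o[i] = static_cast<int>(r[i] / lambda) - 1;
            if (o[i] > maxorder[c]) {
                maxorder[c] = o[i];  // compare while i is still a valid index
            }
        }
    }
    return maxorder;
}
```

The number of classes is simply `rank.size()`, so going from two classes to three needs no change to the function.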
In my math model, I have two objective functions. I want to give each a weight based on its importance and add them together into a single objective function.
Here are my two objectives added together with weights:
IloExpr objExpression(env);
for (cc = 0; cc < NumberOfCourses; cc++)
for (ww = 0; ww < AvailableWeeks; ww++) {
objExpression += Weight * Diff[cc][ww]; // objective a
}
for (cc = 0; cc < NumberOfCourses; cc++) {
objExpression += (1 - Weight) * (M[cc] * Students[cc]); // objective b
}
IloObjective theObjective(env, objExpression, IloObjective::Minimize);
mod.add(theObjective);
objExpression.end();
I have set the parameters and variables as follows:
const int NumberOfCourses = 15;
const int AvailableWeeks = 8;
const float Weight = 0.5;
double Students[NumberOfCourses];
IloNumVarArray2 Diff(env, NumberOfCourses);
for (cc = 0; cc < NumberOfCourses; cc++)
Diff[cc] = IloNumVarArray(env, AvailableWeeks, 0.0, 8);
IloNumVarArray M(env, NumberOfCourses);
When I run the code, it freezes and breaks at the second objective line.
Also, what should I do if I want to have two separate objectives and get output for each individually?
Check the reference documentation of the IloNumVarArray(env, IloInt) constructor. It says
This constructor creates an extensible array of n numeric variables in
env. Initially, the n elements are empty handles.
In other words, all elements in the newly created array will be NULL (note that this is different from the IloBoolVarArray constructor which creates an array in which all elements are new variables). Thus M is an array of empty handles in your case.
In order to fix your code, you either have to initialize the elements of the array one by one or use a constructor that initializes the variables, for example:
IloNumVarArray M(env, NumberOfCourses, 0, IloInfinity);
Additionally, I suggest that you compile your code in debug mode without NDEBUG being defined. If you compile without NDEBUG then many of the functions in Concert will raise an exception if you use empty handles.
If you want to have separate objectives then your problem becomes a multi-objective problem. You may want to read the respective chapter in the user manual and also refer to the ilodiet.cpp example that ships with CPLEX.
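To see why an array of empty handles only fails at the point of first use, here is a small plain C++ model of the handle pattern. The `Handle` and `Impl` types and both factory functions are hypothetical stand-ins, not Concert classes; the point is only that a default-constructed handle holds a null pointer, so touching it is an error unless a check catches it first.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical implementation object that a handle points to.
struct Impl {
    double lb;
    double ub;
};

// Hypothetical handle: a thin wrapper around a pointer to Impl.
class Handle {
public:
    Handle() = default;  // empty handle, points to nothing
    Handle(double lb, double ub) : impl_(std::make_shared<Impl>(Impl{lb, ub})) {}
    bool isEmpty() const { return impl_ == nullptr; }
    double upperBound() const {
        // A checked build reports the misuse instead of dereferencing null.
        if (isEmpty()) throw std::logic_error("empty handle used");
        return impl_->ub;
    }
private:
    std::shared_ptr<Impl> impl_;
};

// Array of n empty handles: the size is right, but no element is usable yet.
std::vector<Handle> makeEmpty(std::size_t n) {
    return std::vector<Handle>(n);
}

// Array of n handles, each pointing to its own freshly created Impl.
std::vector<Handle> makeInitialized(std::size_t n, double lb, double ub) {
    assert(lb <= ub);
    std::vector<Handle> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.emplace_back(lb, ub);
    }
    return out;
}
```

In this model, calling `upperBound()` on an element from `makeEmpty` throws, while the same call on an element from `makeInitialized` returns the bound.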
I am trying to make a package with Rcpp. I have all of my C++ functions in a single .cpp file as follows:
double meanvec(NumericVector x) {
int n = x.size();
double tot = 0;
for (int i = 0; i < n; i++) {
tot += x[i];
}
tot /= n;
return tot;
}
double inprod(NumericVector u, NumericVector v) {
int m = u.size();
double val = 0;
for (int i = 0; i < m; i++) {
val += u[i] * v[i];
}
return val;
}
NumericVector lincoef(NumericVector x, NumericVector y) {
int n = x.size();
double xm = meanvec(x);
double ym = meanvec(y);
NumericVector xa(n);
for (int i = 0; i < n; i++) {
xa[i] = x[i] - xm;
}
NumericVector ya(n);
for (int i = 0; i < n; i++) {
ya[i] = y[i] - ym;
}
double b1 = inprod(xa, ya) / inprod(xa, xa);
double b0 = ym - (b1 * xm);
NumericVector beta = NumericVector::create(b0, b1);
return beta;
}
Basically, the last function takes two vectors as input and outputs a single vector. I would like to call this function from a separate .R file where I am trying to write another function. Something like this:
#' Title
#'
#' @param x Numeric vector.
#' @param y Numeric vector.
#'
#' @return
#' @export
linfit338 = function(x, y){
beta = .Call(`_pkg338_lincoef`, x, y)
fmod = function(x){
beta[1] + beta[2]*x
}
flist = list(beta, fmod)
return(flist)
}
Here the output is a list, where the first element is a vector from the C++ function being called and the second element is the created function. When I try to install and restart, I get this error message:
RcppExports.o:RcppExports.cpp:(.rdata+0x790): undefined reference to `_pkg338_lincoef'
My guess is that it has something to do with exporting the function. When I add // [[Rcpp::export]] above the lincoef function in the C++ file, I don't get any error message, and my final R function works. However, my whole goal is that I do not want the lincoef function exported at all.
Any way to fix this? I would also be open to suggestions as to how I can improve organizing these files, as this is my first experience building a package with Rcpp.
I think you're probably mixing up the concept of exporting C++ code to be used in R (via // [[Rcpp::export]]), which is entirely different to exporting R functions from your package, i.e. making those functions available to end-users of your package.
To make your Rcpp functions callable from within R at all, you need to // [[Rcpp::export]] them. If you don't do this, none of your C++ code will be available from within your R package.
It sounds like what you would like to do is to use the Rcpp-exported functions within your package but to hide them from end-users. This is a common use case for Rcpp, as it allows you to have an R function that acts as an end-user interface to your C++ code, while leaving you free to alter the C++ implementation in future developments without the risk of breaking existing users' code.
Any function you have created within your package, be it an R function or an Rcpp-exported function, has to actively be exported from your package to make it available to end-users. This is a different concept from // [[Rcpp::export]], which is needed to access C++ functions from within your package's R code.
Any R functions will only be exported from your R package if you specify them in the NAMESPACE file in your project's root directory. Thus to export myfunction() you need to have a line that says export(myfunction) in the NAMESPACE file. You are using roxygen2, which will generate this line automatically as long as you write @export in the roxygen skeleton. An alternative to using roxygen's exporting system is to specify an exportPattern in the NAMESPACE file that uses regex to export only functions whose names match a certain pattern.
My usual workflow is to prefix any Rcpp-exported functions with a period by writing my C++ functions like this:
// [[Rcpp::export(.MyCppFunction)]]
int BoringFunction() { return 0; }
I can now call the C++ function from R like this:
MyRFunction <- function()
{
result <- .MyCppFunction()
return(result)
}
The first line in my NAMESPACE file looks like this:
exportPattern("^[[:alpha:]]+")
Which means that any R function in my package starting with a letter will be exported. Since all the functions I Rcpp::export start with a period, I can use them internally within the R package but they won't be exported to end-users.
In other words, end-users of the package can call MyRFunction() but would get an error if they tried to call .MyCppFunction().
How would one get a view of a PyArrayObject* similar to the following Python code?
# n-column array x
# d is the length of each column
print(x.shape) # => (d, n)
by_column = [x[::,i] for i in range(x.shape[1])]
assert len(by_column) == n
print(by_column[n-1].shape) # => (d,)
So far my code is this:
// my_array is a PyArrayObject*
std::vector<PyArrayObject*> columns = {};
npy_intp* dims = my_array->dimensions;
npy_intp* strides = my_array->strides;
std::vector<int> shape = {};
for (int i = 0; &dims[i] != strides; i++){
shape.push_back(dims[i]);
}
switch (shape.size()) {
case 1: {
// handle 1D array by simply iterating
}
case 2: {
int columns = shape[1];
// What now?
}
}
I'm having trouble finding any reference for doing this in C/C++ in either the documentation or the source code. Could you give an example of how one would do this?
The C/C++ API for numpy seems really convoluted when compared to something like std::vector, and the documentation isn't very beginner-friendly either, so any references to easier guides would be appreciated too.
You should access the internal structure of PyArrayObject via the PyArray_XXX functions like PyArray_NDIM. To get the contents of a sequence, you use PyObject_GetItem with a tuple key, where in your use case the tuple will have a PySliceObject as the first element.
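For intuition about what a column view is, here is a plain C++ sketch of the stride arithmetic behind something like `x[:, i]` on a row-major array. This is not the NumPy C API: `ColumnView` and `columnViews` are hypothetical names, the buffer is assumed to be contiguous doubles of shape (d, n), and the stride is counted in elements (NumPy stores strides in bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view of one column of a row-major (d x n) buffer.
struct ColumnView {
    const double* base;  // address of element (0, col)
    std::size_t length;  // d, the number of rows
    std::size_t stride;  // elements between consecutive rows (n here)
    double operator[](std::size_t r) const { return base[r * stride]; }
};

// Build one view per column without copying any data.
std::vector<ColumnView> columnViews(const std::vector<double>& data,
                                    std::size_t d, std::size_t n) {
    assert(data.size() == d * n);
    std::vector<ColumnView> cols;
    cols.reserve(n);
    for (std::size_t c = 0; c < n; ++c) {
        cols.push_back(ColumnView{data.data() + c, d, n});
    }
    return cols;
}
```

Each view shares memory with `data`, so it is only valid while `data` is alive and unchanged in size, which is the same ownership concern a NumPy view has with its base array.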
I moved some code from R into C++ in order to make my model run faster. The C++ code returns a list of 2 items: a vector called "trace" and a matrix called "weights". Once the C++ code has run, I would like to reassign "weights" and "trace" in R to the values computed by the C++ code. Unfortunately, when I tried to do this I got the error: "Error: cannot change value of locked binding for 'weights'". So I searched for an unbinding function and found unlockBinding. I put that in my R code, but I am still getting the same error as before! Am I putting the unlockBinding function in the wrong place? Am I using it correctly? The items "weights" and "trace" do exist in the global environment, so why are they not unlocking?
I assigned the list that the c++ code returns to the variable "result", then I used the unlockBinding function, then I reassigned "weights" and "trace" to be what was computed in the c++ code. Here is the code:
batch <- function(n.training){
for(i in 1:n.training){
g <- input.correlation()
for(o in 1:nrow(g)){
result <- traceUpdate(g[o,], trace, weights, trace.param, learning.rate)
unlockBinding("weights", .GlobalEnv)
unlockBinding("trace", .GlobalEnv)
weights <<- result$weights
trace <<- result$trace
}
}
}
Here is the part of my C++ code that returns a list of 2 items, one being a matrix "weights" and the other being a vector "trace":
List traceUpdate(NumericVector input, NumericVector trace, NumericMatrix weights, double traceParam, double learningRate){
NumericVector output = forwardPass(input, trace.size(), weights);
for(int i = 0; i<trace.size(); i++){
trace[i] = (1 - traceParam) * trace[i] + traceParam * output[i];
for(int j=0; j<input.size(); j++){
double w = weights(j,i);
if(w >= 0){
weights(j,i) = w + learningRate * trace[i] * input[j] - learningRate * trace[i] * w;
}
}
}
//return weights
return List::create(Rcpp::Named("weights") = weights,
Rcpp::Named("trace") = trace);
}
If I simply reassign the weights like this:
weights <- matrix(0, nrow = 100, ncol = 20)
they do all change to zeroes and I do not get an error.
Also, when looking for solutions online I came across a way to unlock environments in R, but I'm pretty sure that's not what's wrong because the environment is not locked.
I'm new to this site and relatively new to programming so I apologize if my question is dumb or formatted incorrectly etc.
Thank you.
Just received a solution from my professor:
"Instead of treating weights and trace as global variables that are modified from inside the batch function, we can pass them into the function and return them out:"
batch <- function(weights, trace, n.training){
for(i in 1:n.training){
g <- input.correlation()
for(o in 1:nrow(g)){
result <- traceUpdate(g[o,], trace, weights, trace.param, learning.rate)
weights <- result$weights
trace <- result$trace
}
}
return(list(weights=weights, trace=trace))
}
result <- batch(weights, trace, 50)
weights <- result$weights
trace <- result$trace
I am trying to solve an LP model in CPLEX using C++ and Concert Technology.
I want to implement constraints (the subtour elimination constraints, to be more specific) that need to query the values of two of my variables in the current solution:
The variable array xvar represents the edges, and yvar represents the nodes.
I implement these constraints by solving n (= number of nodes) min-cut problems on a modified graph, which is constructed by adding an artificial source and an artificial sink and connecting these to every node of the original graph.
From what I've read so far, I am not sure: do I need a lazy constraint, a callback, or neither?
This is where I create the model, solve it, access the values of the variables in the solution, etc.:
// Step 2: Construct the necessary CPLEX objects and the LP model
IloCplex solver(env);
std::cout<< "Original Graph g: " <<std::endl;
std::cout<< net.g() <<std::endl;
MCFModel model(env, net);
// Step 3: Load the model into cplex and solve
solver.extract(model);
solver.solve();
// Step 4: Extract the solution from the solver
if(solver.getStatus() != IloAlgorithm::Optimal) throw "Could not solve to optimality!";
IloNumArray xsol ( env, net.g().nEdges() );
IloNumArray ysol ( env, net.g().nNodes() );
IloNumArray rsol ( env, net.g().nGroups() );
IloNumArray wisol ( env, net.g().nGroups() );
IloNum ksol;
NumMatrix wsol ( env, net.g().nGroups());
for(IloInt i = 0; i < net.g().nGroups(); i++){
wsol[i] = IloNumArray( env, net.g().nGroups() );
}
solver.getValues(xsol, model.xvar());
solver.getValues(ysol, model.yvar());
solver.getValues(rsol, model.rvar());
solver.getValues(wisol, model.wivar());
ksol=solver.getValue(model.kvar());
for (IloInt i = 0; i < net.g().nGroups(); i++){
wsol[i] = wisol;
}
// Step 5: Print the solution.
The constraint for which I need the current values of the variables xvar and yvar is created here:
//build subset constraint y(S) -x(E(S))>= y_i
void MCFModel::buildSubsetCons(){
IloExpr lhs(m_env);
IloCplex cplex(m_env);
IloNumArray xtemp ( m_env, m_net.g().nEdges() );
IloNumArray ytemp ( m_env, m_net.g().nNodes() );
std::vector<Edge> mg_twin;
std::vector<int> mg_weights;
int mg_s;
int mg_t;
SGraph mgraph;
std::vector<int> f;
int nOrigEdges = m_net.g().nEdges();
int nOrigNodes = m_net.g().nNodes();
cplex.getValues(xtemp, m_xvar);
cplex.getValues(ytemp, m_yvar);
mgraph = m_net.g().mod_graph();
mg_s = mgraph.nNodes()-1;
mg_t = mgraph.nNodes();
std::cout<<"modified graph:"<<std::endl;
std::cout<<mgraph<<std::endl;
// fill the weight of original edges with 1/2*x_e
foreach_edge(e, m_net.g()){
mg_weights.push_back((xtemp[e->idx()])/2);
}
// fill the weight of the edges from artificial source with zero
for(int i=0; i<m_net.g().nNodes(); i++){
mg_weights.push_back(0);
}
// fill the weight of the edges to artificial sink with f(i)
// first step: calculate f(i):
//f.resize(m_net.g().nNodes());
foreach_node(i, m_net.g()){
foreach_adj_edge(e, i, m_net.g()){
f[i] = f[i] + xtemp[e->idx()];
}
f[i] = (-1)*f[i]/2;
f[i] = f[i] + ytemp[i];
}
// second step: fill the weights vector with it
for(int i=0; i<m_net.g().nNodes(); i++){
mg_weights.push_back(f[i]);
}
// calculate the big M = abs(sum_(i in N) f(i))
int M;
foreach_node(i, m_net.g()){
M = M + abs(f[i]);
}
// Build the twin vector of the not artificial edges for mgraph
mg_twin.resize(2*nOrigEdges + 2*nOrigNodes);
for(int i=0; i < nOrigEdges ; ++i){
mg_twin[i] = mgraph.edges()[nOrigEdges + i];
mg_twin[nOrigEdges + i] = mgraph.edges()[i];
}
//Start the PreflowPush for every node in the original graph
foreach_node(v, m_net.g()){
// "contract" the edge between s and v
// this equals to higher the weights of the edge (s,v) to a big value M
// weight of the edge from v to s lies in mg_weights[edges of original graph + index of node v]
mg_weights[m_net.g().nEdges() + v] = M;
//Start PreflowPush for v
PreflowPush<int> pp(mgraph, mg_twin, mg_weights, mg_s, mg_t);
std::cout << "Flowvalue modified graph: " << pp.minCut() << std::endl;
}
}
The object pp solves the min-cut problem on the modified graph mgraph with the artificial source and sink. The original graph is in m_net.g().
When I compile and run it, I get the following error:
terminate called after throwing an instance of 'IloCplex::Exception'
Aborted
It seems to me that it is not possible to access the values of xvar and yvar like this?
I would appreciate any help, since I am quite lost as to how to do this.
Thank you very much!!
Two things...
I. I strongly suggest using a try-catch block to better understand CPLEX exceptions; the message usually tells you the nature of the problem. As a matter of fact, I suggest a try-catch-catch setting, something like:
try {
    //... YOUR CODE ...//
}
catch(IloException& e) {
    cerr << "CPLEX found the following exception: " << e << endl;
    e.end();
}
catch(...) {
    cerr << "An unknown exception was found." << endl;
}
II. The only way to interact with CPLEX during the optimization process is via a Callback, and, in the case of Subtour Elimination Constraints (SECs) you will need to separate both integer and fractional SECs.
II.1 INTEGER: The first one is the easiest one. An O(n) routine would help you identify all the connected components of a node solution; then you could add the subsequent cuts to prevent this particular SEC from appearing in other nodes. You could either enforce your cuts locally, i.e. only on the current sub-tree, using the addLocal() function, or globally, i.e. on the entire Branch-and-Cut tree, using the add() function. In any case, ALWAYS remember to add .end() to terminate the cut container. Otherwise you WILL have serious memory leak issues, trust me with this, lol. This callback needs to be done via a Lazy Constraint Callback (ILOLAZYCONSTRAINTCALLBACK).
II.2 FRACTIONAL: The second one is by far more complex. The easiest way to make it work is to use Professor Lysgaard's CVRPSEP library. It is nowadays the most efficient way of computing capacity cuts, multistar, generalized multistar, framed capacity, strengthened comb and hypotour cuts. Additionally, it is rather easy to link with any existing code. The linkage also needs to be embedded in the solution process, hence a callback is also required. In this case, it would be a User Cut Callback (ILOUSERCUTCALLBACK).
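For the integer case (II.1), the connected-components routine can be sketched in plain C++ as below. This is a minimal sketch under assumptions, not CPLEX code: `subtourComponents` is a hypothetical name, and the caller is assumed to have already read the integer solution inside the callback and kept only the edges whose value is 1 (for example, above 0.5). Turning a component into a cut and adding it to the model is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Group nodes 0..nNodes-1 into connected components using only the edges
// selected in an integer solution. Runs in O(nodes + edges) with an
// iterative depth-first search.
std::vector<std::vector<int>> subtourComponents(
        std::size_t nNodes,
        const std::vector<std::pair<int, int>>& selectedEdges) {
    std::vector<std::vector<int>> adj(nNodes);
    for (const auto& e : selectedEdges) {
        assert(e.first >= 0 && static_cast<std::size_t>(e.first) < nNodes);
        assert(e.second >= 0 && static_cast<std::size_t>(e.second) < nNodes);
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }
    std::vector<int> label(nNodes, -1);  // -1 means "not visited yet"
    std::vector<std::vector<int>> comps;
    for (std::size_t s = 0; s < nNodes; ++s) {
        if (label[s] != -1) continue;
        const int id = static_cast<int>(comps.size());
        comps.emplace_back();
        std::vector<int> stack;
        stack.push_back(static_cast<int>(s));
        label[s] = id;
        while (!stack.empty()) {
            const int u = stack.back();
            stack.pop_back();
            for (int v : adj[u]) {
                if (label[v] == -1) {
                    label[v] = id;
                    stack.push_back(v);
                }
            }
        }
    }
    // Fill each component in increasing node order.
    for (std::size_t v = 0; v < nNodes; ++v) {
        comps[label[v]].push_back(static_cast<int>(v));
    }
    return comps;
}
```

If the routine returns more than one component that contains selected nodes, each such component is a candidate node set S for a subtour elimination cut. Nodes that are not selected in the solution show up as singleton components, and the caller would skip them.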
One is glad to be of service