Dynamic allocation of matrix. How?? (c++ Stroustrup) - c++

I'm using Stroustrup's matrix.h implementation as I have a lot of matrix heavy computation to do. It will make life easier if I can just get the matrix populated!
I'm receiving a complex object with a matrix that is not known until received. Once it enters the method, I can get the row and column count, but I have to use a double i,j loop to pull the values since they are in a cpp17::any structure and I have to convert them using asNumber().
I declare it as follows as part of an object definition:
Matrix<double,2> inputLayer;
In the code that instantiates the object, I have the following code:
int numRows = sourceSeries->rowCount();
int numColumns = sourceSeries->columnCount();
int i,j = 0;
for(i=0; i<numRows; i++){
for(j=0;j<numColumns;j++) {
// make sure you skip the header row in sourceSeries
inputLayer[i][j] = asNumber(sourceSeries->data(i+1,j,ItemDataRole::Display));
}
}
There is nothing like push_back() for the matrix template. The examples I can find in his books and on the web either pre-populate the matrix in the definition or create it from existing lists, which I won't have at this particular time.
Do I need to define a "new" double, receive the asNumber(), then set inputlayer[][] = the "new" double?
I'm hoping not to have to manage the memory like I can do with vectors that release when I go out of scope, which is why I was avoiding "new."
I'm using the boost frameworks as well and I'm wondering if I should try ublas version instead, or just get this one working.

Thanks for the pointers to Eigen, that was so simple! Here's all I had to do:
In the header file:
#include "Eigen/Dense"
using namespace Eigen;
In the object definition of the header file:
Matrix<double, Dynamic, Dynamic> inputLayer;
In the code where I need to read in the matrix:
int numRows = sourceSeries->rowCount();
int numColumns = sourceSeries->columnCount();
int i,j = 0;
// Resize the member declared in the header; declaring a new local MatrixXd here would shadow it.
inputLayer.resize(numRows,numColumns);
for(i=0; i<numRows; i++){
for(j=0;j<numColumns;j++) {
// make sure you skip the header row in sourceSeries
inputLayer(i,j) = asNumber(sourceSeries->data(i+1,j,ItemDataRole::Display));
}
}
Sorry I had to waste so much time trying to get the other code to work, but at least I got real familiar with my debugger and the codebase again. Thanks everyone for the comments!
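For readers who cannot pull in Eigen, the pattern that made the code above work (size the matrix once the dimensions are known, then index into it) can be sketched with the standard library alone. This is only an illustration: DenseMatrix and fromRowsSkippingHeader are hypothetical names, and a plain vector of rows stands in for sourceSeries.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a matrix whose shape is only known at run time.
// Storage is a single std::vector, so memory is released automatically
// when the object goes out of scope (no new/delete needed).
class DenseMatrix {
public:
    DenseMatrix() = default;

    // Allocate rows*cols zero-initialised elements. Call this before indexing.
    void resize(std::size_t rows, std::size_t cols) {
        rows_ = rows;
        cols_ = cols;
        data_.assign(rows * cols, 0.0);
    }

    // Row-major element access (no bounds checking).
    double& operator()(std::size_t i, std::size_t j) { return data_[i * cols_ + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_ = 0;
    std::size_t cols_ = 0;
    std::vector<double> data_;
};

// Fill a matrix from a source whose first row is a header.
// `source` is a hypothetical stand-in for sourceSeries; all rows are
// assumed to have the same length.
DenseMatrix fromRowsSkippingHeader(const std::vector<std::vector<double>>& source) {
    DenseMatrix m;
    if (source.size() < 2) return m;  // nothing beyond the header row
    m.resize(source.size() - 1, source[1].size());
    for (std::size_t i = 0; i < m.rows(); ++i)
        for (std::size_t j = 0; j < m.cols(); ++j)
            m(i, j) = source[i + 1][j];  // i + 1 skips the header row
    return m;
}
```

The original Stroustrup Matrix attempt most likely failed for the same reason this sketch guards against: the matrix was indexed before it had been given a size.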

Related

Sort an Array using C++ in R

I need to sort a data frame of prices, row by row, in ascending order.
Doing it with a for loop in R is quite slow.
A friend of mine suggested using Rcpp.
But I'm having a hard time writing a loop in C++ that works.
#include <Rcpp.h>
// [[Rcpp::export]]
using namespace std;
List min(NumericVector x)
{
for (unsigned int i = 0; i < x.size(); i++) {
vector<int>& vec = x[i];
NumericVector Value sort(vec.begin(), vec.end());
}
Return Value;
}
I'm not used to C++ and I would like to know why it keeps saying that my sort is wrong.
The goal is to sort my data frame by row.
Welcome (again) to StackOverflow and Rcpp! Two big worlds with much to discover...
sort() is available as a member function:
> Rcpp::cppFunction("NumericVector srt(NumericVector x) { return(x.sort()); }")
> srt(c(2,3,4,1.5,3.2))
[1] 1.5 2.0 3.0 3.2 4.0
>
Note that an advanced question is hidden inside this simple one, because the sort() member function sorts in place, so the above mutates its input. That can be convenient ("hey, no new heap object to return") or confusing, depending on your vantage point. We cover it in most Rcpp tutorials, but you may have other, more pressing issues. Keep at it!
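The in-place versus copy distinction is easier to see in plain C++. The sketch below uses std::vector rather than Rcpp types, and the function names are made up for illustration. In Rcpp itself, the usual way to avoid mutating the input is to take a deep copy with Rcpp::clone() before sorting.

```cpp
#include <algorithm>
#include <vector>

// Sorts the caller's vector directly, so the input is modified.
void sortInPlace(std::vector<double>& v) {
    std::sort(v.begin(), v.end());
}

// Takes the argument by value, so the caller's vector is left untouched
// and a sorted copy is returned instead.
std::vector<double> sortedCopy(std::vector<double> v) {
    std::sort(v.begin(), v.end());
    return v;
}

// Sorts every row of a table independently in ascending order,
// which is the row-by-row arrangement the question asks for.
void sortEachRow(std::vector<std::vector<double>>& table) {
    for (auto& row : table) {
        std::sort(row.begin(), row.end());
    }
}
```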

How to pre-allocate memory for a growing Eigen::MatrixXd

I have a growing database in form of an Eigen::MatrixXd. My matrix starts empty and gets rows added one by one until it reaches a maximum predefined (known at compile time) number of rows.
At the moment I grow it like that (from the Eigen docs and many posts here and elsewhere):
MatrixXd new_database(database.rows()+1, database.cols());
new_database << database, new_row;
database = new_database;
But this seems way more inefficient than it needs to be, since it makes a lot of useless memory reallocation and data copying every time a new row is added... Seems like I should be able to pre-allocate a bunch of memory of size MAX_ROWS*N_COLS and let the matrix grow in it, however I can't find an equivalent of std::vector's capacity with Eigen.
Note: I may need to use the matrix at anytime before it is actually full. So I do need a distinction between what would be its size and what would be its capacity.
How can I do this?
EDIT 1: I see there is a MaxSizeAtCompileTime, but I find the doc rather unclear, with no examples. Does anyone know if this is the way to go, how to use this parameter, and how it would interact with resize and conservativeResize?
EDIT 2: The question "C++: Eigen conservativeResize too expensive?" provides another interesting approach while raising a question regarding non-contiguous data... Does anyone have good insight on that matter?
First thing I want to mention before I forget, is that you may want to consider using a row major matrix for storage.
The simplest (and probably best) solution to your question would be to use block operations to access the top rows.
#include <Eigen/Core>
#include <iostream>
using namespace Eigen;
int main(void)
{
const int rows = 5;
const int cols = 6;
MatrixXd database(rows, cols);
database.setConstant(-1.0);
std::cout << database << "\n\n";
for (int i = 0; i < rows; i++)
{
database.row(i) = VectorXd::Constant(cols, i);
// Use block operations instead of the full matrix
std::cout << database.topRows(i+1) << "\n\n";
}
std::cout << database << "\n\n";
return 0;
}
Instead of just printing the matrix, you could do whatever operations you require.
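The size-versus-capacity distinction the question asks about can also be sketched without Eigen: a row-major buffer reserved once for the maximum number of rows, plus a row count derived from how much of it is in use. GrowingTable and its members are hypothetical names. A buffer like this could presumably be viewed as an Eigen matrix through Eigen::Map with a row-major matrix type, but that step is not shown here.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Row-major table that allocates once for maxRows and then grows row by row.
class GrowingTable {
public:
    GrowingTable(std::size_t maxRows, std::size_t cols)
        : cols_(cols), maxRows_(maxRows) {
        data_.reserve(maxRows * cols);  // capacity: one allocation up front
    }

    // Appends one row; fits in the reserved capacity, so nothing is reallocated.
    void appendRow(const std::vector<double>& row) {
        if (row.size() != cols_) throw std::invalid_argument("wrong row length");
        if (rows() == maxRows_) throw std::length_error("table is full");
        data_.insert(data_.end(), row.begin(), row.end());
    }

    // Size: number of rows filled so far (not the reserved maximum).
    std::size_t rows() const { return cols_ == 0 ? 0 : data_.size() / cols_; }
    std::size_t cols() const { return cols_; }
    std::size_t maxRows() const { return maxRows_; }

    double at(std::size_t i, std::size_t j) const { return data_.at(i * cols_ + j); }

    // Contiguous storage of the filled rows only.
    const double* data() const { return data_.data(); }

private:
    std::size_t cols_;
    std::size_t maxRows_;
    std::vector<double> data_;
};
```

Row-major layout matters here for the same reason the answer recommends it: each appended row lands contiguously at the end of the buffer, so the filled part is always one solid block.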

c++ mysqlpp::storequeryresult and std::vector

I'd like to know if there is a quicker way to copy my data from a mysqlpp::StoreQueryResult to a std::vector.
My example is as follows:
I store my query result with query.store() in a StoreQueryResult, and my result is, for example, a table with one column of doubles. Now I want to copy those doubles into a std::vector. The way I'm doing it right now is to access every single double with the [][] operator and copy it to my vector in a for loop.
This works, but it is very time consuming since I'm copying about 277000 doubles in a loop. Is there a way to just copy the column to my vector? The thing is, my other functions use std::vector in their parameter lists. Alternatively, I could change my functions to take a StoreQueryResult, I guess, but I'd prefer a std::vector.
Here is my simplified code:
void foo()
{
vector<double> vec;
mysqlpp::StoreQueryResult sqr;
Query query;
query << "SELECT * FROM tablename";
sqr = query.store();
vec.reserve(sqr.num_rows());
vec.resize(sqr.size());
for(int i=0; i != vec.size(); i++)
{
vec[i] = sqr[i]["my_column"];
}
}
I want something like:
vec = sqr["my_column"] // when my_column is a field with doubles
Thx in advance.
Martin
Ultimately, if you need to copy then you need to copy, and whether you write the loop yourself or get a library function to do it isn't particularly relevant.
What you can do is pre-reserve enough space in the destination vector to avoid repeated re-allocations and copies:
vec.reserve(sqr.num_rows());
It is possible that you wish to create a vector, but then only some values will actually be accessed and used.
In which case we may delay the conversion from mysqlpp::String to another datatype:
std::vector<mysqlpp::String> data(res.num_rows());
for(size_t i=0, n=res.num_rows(); i<n; ++i)
{
data[i] = std::move(res[i]["value"]);
}
Several things are happening here:
We are creating the vector that stores mysqlpp::String. It is an interesting datatype that can be converted to many others. In your case you were using operator double () const.
We get the size once, store it, and then use that value. This is a micro-optimisation, as is using ++i rather than i++; they don't add up to many cycles, but are worth using to keep the code in the spirit of optimisation.
We move the data, rather than copying it. See std::move if you've not encountered it before.
If then you have something like:
double sum = 0.0;
for(size_t i=0, n=data.size(); i<n; i+=2)
{
sum+=double(data[i]);
}
You will only run the conversion routine on ½ of your values.
Of course, if you plan to use the resultant vector several times, you will actually start running the same conversions again and again. So this "optimisation" will actually hurt performance.
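For the eager case, where every value will be used, the reserve-then-convert-once approach can be sketched as follows. The Row alias is a hypothetical stand-in for a row of textual fields and does not use the MySQL++ API; std::stod plays the role of the string-to-double conversion.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one row of a query result: a list of textual fields.
using Row = std::vector<std::string>;

// Copies one column into a vector of doubles, converting each field exactly once.
std::vector<double> extractColumn(const std::vector<Row>& result, std::size_t column) {
    std::vector<double> out;
    out.reserve(result.size());  // single allocation, no regrowth during the loop
    for (const Row& row : result) {
        out.push_back(std::stod(row.at(column)));  // at() throws if the column is missing
    }
    return out;
}
```

Note that reserve() followed by push_back() is enough. Calling both reserve() and resize(), as in the question's code, is redundant because resize() already allocates and value-initialises the elements.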

c++: passing Eigen-defined matrices to functions, and using them - best practice

I have a function which requires me to pass a fairly large matrix (which I created using Eigen) - and ranges from dimensions 200x200 -> 1000x1000. The function is more complex than this, but the bare bones of it are:
#include <Eigen/Dense>
int main()
{
MatrixXi mIndices = MatrixXi::Zero(1000,1000);
MatrixXi* pMatrix = &mIndices;
MatrixXi mTest;
for(int i = 0; i < 10000; i++)
{
mTest = pMatrix[0];
// Then do stuff to the copy
}
}
Is the reason that it takes much longer to run with a larger size of matrix because it takes longer to find the available space in RAM for the array when I set it equal to mTest? When I switch to a sparse array, this seems to be quite a lot quicker.
If I need to pass around large matrices, and I want to minimise the incremental effect of matrix size on runtime, then what is best practice here? At the moment, the same program is running slower in c++ than it is in Matlab, and obviously I would like to speed it up!
Best,
Ben
In the code you show, you are copying a 1,000,000-element matrix 10,000 times. The assignment in the loop creates a full copy on every iteration.
Generally, if you're passing an Eigen matrix to another function, it can be beneficial to accept the argument by reference (const reference if the function only reads it).
However, it's not really clear from your code what you're trying to achieve.
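The cost difference between passing by value and passing by reference can be made visible with a small counting type. BigMatrix below is a hypothetical stand-in for a large matrix, not an Eigen type; its copy operations increment a counter so the number of deep copies can be observed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a large matrix that counts how often it is copied.
struct BigMatrix {
    std::vector<int> values;
    static int copies;  // number of deep copies made so far

    explicit BigMatrix(std::size_t n) : values(n, 0) {}
    BigMatrix(const BigMatrix& other) : values(other.values) { ++copies; }
    BigMatrix& operator=(const BigMatrix& other) {
        values = other.values;
        ++copies;
        return *this;
    }
};
int BigMatrix::copies = 0;

// By value: every call deep-copies all elements before the body runs.
long sumByValue(BigMatrix m) {
    long total = 0;
    for (int v : m.values) total += v;
    return total;
}

// By const reference: the function reads the caller's object, no copy is made.
long sumByReference(const BigMatrix& m) {
    long total = 0;
    for (int v : m.values) total += v;
    return total;
}
```

If the loop body really does need a scratch copy to modify, the copy is unavoidable, but it should be the only one per iteration.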

Most efficient option for building 3D structures using Eigen matrices

I need a 3D matrix/array structure in my code, and right now I'm relying on Eigen for both my matrices and vectors.
Right now I am creating a 3D structure using new:
MatrixXd* cube= new MatrixXd[60];
for (int i=0; i<60; i++) cube[i]=MatrixXd(60,60);
and for accessing the values:
double val;
MatrixXd pos;
for (int i=0; i<60; i++){
pos=cube[i];
for (int j=0; j<60; j++){
for (int k=0; k<60; k++){
val=pos(j,k);
//...
}
}
}
However, this part of the code is currently very slow, which makes me believe that this might not be the most efficient way. Are there any alternatives?
While it was not available when the question was asked, Eigen has been providing a Tensor module for a while now. It is still in an "unsupported" stage (meaning the API may change), but basic functionality should be mostly stable. The documentation is scattered here and here.
A solution I used is to form a fat matrix containing all the matrices you need stacked.
MatrixXd A(60*60,60);
and then access them with block operations
A0 = A.block<60,60>(0*60,0);
...
A5 = A.block<60,60>(5*60,0);
An alternative is to allocate one very large chunk of memory once, and map Eigen matrices onto it:
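// Assumes a 60x60x60 grid of 60x60 matrices; note new double[...] (an array), not new double(...) (a single value).
double* data = new double[60*60 * 60*60*60];
// Matrix (i,j,k) starts 60*60 elements after the previous one.
Map<MatrixXd> Mijk(data + 60*60*(60*(60*k + j) + i), 60, 60);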
At this stage you can use Mijk like a MatrixXd object. However, since this is not a MatrixXd type, if you want to pass it to a function, your function must either:
be of the form foo(Map<MatrixXd> mat)
be a template function: template<typename Der> void foo(const MatrixBase<Der>& mat)
take a Ref<MatrixXd> object which can handle both Map<> and Matrix<> objects without being a template function and without copies.
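The common idea behind the stacked-matrix and mapped-memory suggestions is to keep the whole 3D structure in one contiguous allocation and compute offsets by hand. A minimal sketch of that indexing using only the standard library follows. Cube is a hypothetical class, and the row-major layout (k varying fastest) is an assumption of this sketch; Eigen matrices are column-major by default.

```cpp
#include <cstddef>
#include <vector>

// 3D array stored in one contiguous block, with k varying fastest.
class Cube {
public:
    Cube(std::size_t ni, std::size_t nj, std::size_t nk)
        : ni_(ni), nj_(nj), nk_(nk), data_(ni * nj * nk, 0.0) {}

    // Element access through a flat offset (no bounds checking).
    double& operator()(std::size_t i, std::size_t j, std::size_t k) {
        return data_[(i * nj_ + j) * nk_ + k];
    }
    double operator()(std::size_t i, std::size_t j, std::size_t k) const {
        return data_[(i * nj_ + j) * nk_ + k];
    }

    // Pointer to the start of slice i, a contiguous nj x nk block.
    // Handing out a pointer avoids copying the slice, unlike pos = cube[i].
    const double* slice(std::size_t i) const { return data_.data() + i * nj_ * nk_; }

    std::size_t size() const { return ni_ * nj_ * nk_; }

private:
    std::size_t ni_, nj_, nk_;
    std::vector<double> data_;
};
```

In the question's loop, the statement pos=cube[i] copies a full 60x60 matrix on every outer iteration; reading through a reference or a slice pointer, as above, avoids that copy.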