How to add elements to a triplet matrix using CHOLMOD?

Can anyone please give me a simple example of how to add elements to a triplet matrix using CHOLMOD.
I have tried something like this:
cholmod_triplet *A;
int k;

void add_A_entry(int r, int c, double x)
{
    ((int*)A->i)[k] = r;
    ((int*)A->j)[k] = c;
    ((double*)A->x)[k] = x;
    k++;
}

int main()
{
    k = 0;
    cholmod_common com;
    cholmod_start(&com);
    A = cholmod_allocate_triplet(202, 202, 202*202, -1, CHOLMOD_REAL, &com);

    add_A_entry(2, 2, 1.);
    add_A_entry(4, 1, 2.);
    add_A_entry(2, 10, -1.);

    cholmod_print_triplet(A, "A", &com);
    cholmod_finish(&com);

    return 0;
}
However, this doesn't add any elements to the matrix. I simply get the output:
CHOLMOD triplet: A: 202-by-202, nz 0, lower. OK
Of course, I have tried to find the solution both by searching and in the CHOLMOD documentation, but I found no help.

cholmod_allocate_triplet() sets A->nzmax, which in your case is 202*202. That just defines the space available to add triplets. The actual number of triplets in the matrix is A->nnz, which gets set to zero by cholmod_allocate_triplet().
A->nnz should be used in place of your variable k, and incremented as entries are added.
Tim Davis (CHOLMOD author)
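Based on that answer, a minimal sketch of the fix (only the bookkeeping changes; bounds checking against A->nzmax is left out):

cholmod_triplet *A;

void add_A_entry(int r, int c, double x)
{
    size_t k = A->nnz;            // current number of stored entries
    ((int*)    A->i)[k] = r;
    ((int*)    A->j)[k] = c;
    ((double*) A->x)[k] = x;
    A->nnz = k + 1;               // record the new entry
}

With this change, the entries added in main are actually recorded. Note that with stype -1 the matrix is treated as symmetric with the lower triangle stored, so entries should satisfy row >= column.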

Related

Assigning in high-dimensional Xtensor arrays

I am using the Xtensor library for C++.
I have an xt::zeros({n, n, 3}) array and I would like to assign its (i, j) element an xt::xarray{ , , } so that it stores a 3-dimensional vector at each (i, j). However, the documentation does not mention assigning values - in general I am unable to figure out from the documentation how arrays with multiple coordinates work.
What I have been trying is this
xt::xarray<double> force(Body body1, Body body2){
    // Function to calculate the vector force on body2 from
    // body 1
    xt::xarray<double> pos1 = body1.get_position();
    xt::xarray<double> pos2 = body2.get_position();
    // If the positions are equal return the zero-vector
    if(xt::all(xt::equal(pos1, pos2))) {
        return xt::zeros<double>({1, 3});
    }
    xt::xarray<double> r12 = pos2 - pos1;
    double dist = xt::linalg::norm(r12);
    return -6.67259e-11 * body1.get_mass() * body2.get_mass()/pow(dist, 3) * r12;
}

xt::xarray<double> force_matrix(){
    // Initialize the matrix that will hold the force vectors
    xt::xarray<double> forces = xt::zeros({self_n, self_n, 3});
    // Enter the values into the force matrix
    for (int i = 0; i < self_n; ++i) {
        for (int j = 0; j < self_n; ++j)
            forces({i, j}) = force(self_bodies[i], self_bodies[j]);
    }
}
Here I'm trying to assign the output of the force function to the (i, j)'th entry of the forces array, but that does not seem to work.
In xtensor, assigning and indexing into multidimensional arrays is quite simple. There are two main ways:
Either index with round brackets:
xt::xarray<double> a = xt::zeros<double>({3, 3, 5});
a(0, 1, 3) = 10;
a(1, 1, 0) = -100; ...
or by using the xindex type (which is a std::vector at the moment), and the square brackets:
xindex idx = {0, 1, 3};
a[idx] = 10;
idx[0] = 1;
a[idx] = -100; ...
Hope that helps.
You can also use view to achieve that.
In the inner loop, you could do:
xt::view(forces, i, j, xt::all()) = a_xarray_with_proper_size;
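Putting the two answers together, a rough sketch of the corrected loop from the question might look like this (self_n, self_bodies and force() are the question's own members, force() is assumed to return a flat, length-3 xarray, and header paths can differ between xtensor versions):

#include <xtensor/xarray.hpp>
#include <xtensor/xbuilder.hpp>
#include <xtensor/xview.hpp>

xt::xarray<double> force_matrix()
{
    // n-by-n grid of 3D force vectors
    xt::xarray<double> forces = xt::zeros<double>({self_n, self_n, 3});
    for (int i = 0; i < self_n; ++i) {
        for (int j = 0; j < self_n; ++j) {
            // assign a length-3 vector into the (i, j) slot
            xt::view(forces, i, j, xt::all()) = force(self_bodies[i], self_bodies[j]);
        }
    }
    return forces;
}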

Assigning the function output to a variable

I have a function which returns the address of a 4x2 matrix named 'a'.
This function computes the elements of 'a' internally and returns the address of the matrix. When I use that function, I want to assign its output to a matrix called 'a1', but when I do so, 'a1' becomes a zero matrix. However, when I assign the output to the same 'a' matrix, everything works fine. Can anyone help me? The code is written in the Arduino IDE.
double a[4][2], a1[4][2];

double T0E[4][4] = {
    {0.1632, -0.3420, 0.9254, 297.9772},
    {0.0594, 0.9397, 0.3368, 108.4548},
    {-0.9848, 0, 0.1736, -280.5472},
    {0, 0, 0, 1}
};

const int axis_limits[4][2] =
{
    { -160, 160 },
    { -135, 60 },
    { -135, 135 },
    { -90, 90 }
};

const unsigned int basex = 50, basez = 100, link1 = 200, link2 = 200, link3 = 30, endeff = link3 + 50;

double *inversekinematic(double target[4][4])
{
    // angle 1
    a[0][0] = -asin(target[0][1]);
    a[0][1] = a[0][0];
    if (a[0][0] < axis_limits[0][0] || a[0][0] > axis_limits[0][1] || isnan(a[0][0]))
    {
        bool error = true;
    }
    // angle 2
    double A = sqrt(pow(target[0][3] - cos(a[0][0])*endeff*target[2][2], 2) + pow(target[1][3] - sin(a[0][0])*endeff*target[2][2], 2));
    double N = (A - basex) / link1;
    double M = -(target[2][3] - endeff*target[2][0] - basez) / link2;
    double theta = acos(N / sqrt(pow(N, 2) + pow(M, 2)));
    a[1][0] = theta + acos(sqrt(pow(N, 2) + pow(M, 2)) / 2);
    a[1][1] = theta - acos(sqrt(pow(N, 2) + pow(M, 2)) / 2);
    // angle 3
    for (int i = 0; i <= 1; i++)
    {
        a[2][i] = {asin(-(target[2][3] - endeff*target[2][0] - basez)/link2 - sin(a[1][i])) - a[1][i]};
    }
    // angle 4
    for (int i = 0; i <= 1; i++)
    {
        a[3][i] = {-asin(target[2][0]) - a[1][i] - a[2][i]};
    }
    return &a[4][2];
}

void setup() {
    Serial.begin(9600);
}

void loop() {
    a1[4][2] = {*inversekinematic(T0E)};
}
When you type return &a[4][2]; you are returning the address of the 3rd element of the 5th row. This is out of bounds, since C++ uses zero-based indexing and the array was declared as double a[4][2];. I think what you want to do is just return a; to return the address of the entire matrix.
Also, you're doing some unusual things, like declaring the parameter double target[4][4] with explicit sizes and using braced initializer lists to assign single elements.
To be a little more detailed: in C/C++ you cannot copy one built-in array into another with plain assignment, because an array name decays to a pointer to its first element, and arrays themselves are not assignable. What you will have to do is copy the elements with loops, or perhaps use memcpy(dest, src, size). For example, to copy the contents of double a[4][2] to double b[4][2], you would use something like memcpy(b, a, sizeof(double) * 8);.
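For instance, in the sketch from the question the copy could look like this (a sketch; inversekinematic() still fills the global a as before):

void loop() {
    inversekinematic(T0E);        // fills the global matrix a
    memcpy(a1, a, sizeof(a1));    // copy all 4*2 doubles from a into a1
}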
Two points:
1. your code says the function inversekinematic() returns a pointer to a double, not an array.
2. you return a pointer to a double, but it's always the same address.
Maybe typedefs will help simplify the code?
typedef double Mat42[4][2];
Mat42 a, a1;

Mat42 *inversekinematic(double target[4][4])
{
    // ...
    return &a;
}
But, for the code you've shown, I don't see why you need to return the address of a fixed global value. Perhaps your real code might return the address of 'a' or 'a1', but if it doesn't ...
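If you do go the typedef route, a hedged sketch of the call site could be:

void loop() {
    Mat42 *p = inversekinematic(T0E);   // p points at the global matrix a
    memcpy(a1, *p, sizeof(Mat42));      // copy its contents into a1
}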

Orthogonalization in QR Factorization outputting slightly inaccurate orthogonalized matrix

I am writing code for QR Factorization and for some reason my orthogonal method does not work as intended. Basically, my proj() method is outputting random projections. Here is the code:
apmatrix<double> proj(apmatrix<double> v, apmatrix<double> u)
//Projection of u onto v
{
    //proj(v,u) = [(u dot v)/(v dot v)]*v
    double a = mult(transpose(u,u),v)[0][0], b = mult(transpose(v,v),v)[0][0], c = (a/b);
    apmatrix<double> k;
    k.resize(v.numrows(), v.numcols());
    for(int i = 0; i < v.numrows(); i++)
    {
        for(int j = 0; j < v.numcols(); j++)
        {
            k[i][j] = v[i][j]*c;
        }
    }
    return k;
}
I tested the method by itself with manual matrix inputs, and it seems to work fine. Here is my orthogonal method:
apmatrix<double> orthogonal(apmatrix<double> A) //Orthogonal
{
    /*
    n = (number of columns of A)-1
    x = columns of A
    v0 = x0
    v1 = x1 - proj(v0,x1)
    vn = xn - proj(v0,xn) - proj(v1,xn) - ... - proj(v(n-1),xn)
    V = {v1, v2, ..., vn} or [v0 v1 ... vn]
    */
    apmatrix<double> V, x, v;
    int n = A.numcols();
    V.resize(A.numrows(), n);
    x.resize(A.numrows(), 1);
    v.resize(A.numrows(), 1);
    for(int i = 0; i < A.numrows(); i++)
    {
        x[i][0] = A[i][1];
        v[i][0] = A[i][0];
        V[i][0] = A[i][0];
    }
    for (int c = 1; c < n; c++) //Iterates through each col of A as if each was its own matrix
    {
        apmatrix<double> vn, vc; //vn = Orthogonalized v (avoiding matrix overwriting of v); vc = previously orthogonalized v
        vn = x;
        vc.resize(v.numrows(), 1);
        for(int i = 0; i < c; i++) //Vn = an-(sigma(t=1, n-1, proj(vt, xn))
        {
            for(int k = 0; k < V.numrows(); k++)
                vc[k][0] = V[k][i]; //Sets vc to designated v matrix
            apmatrix<double> temp = proj(vc, x);
            for(int j = 0; j < A.numrows(); j++)
            {
                vn[j][0] -= temp[j][0]; //orthogonalize matrix
            }
        }
        for(int k = 0; k < V.numrows(); k++)
        {
            V[k][c] = vn[k][0]; //Subtracts orthogonalized col to V
            v[k][0] = V[k][c];  //v is redundant. more of a placeholder
        }
        if((c+1) < A.numcols()) //Matrix Out of Bounds Checker
        {
            for(int k = 0; k < A.numrows(); k++)
            {
                vn[k][0] = 0;
                vc[k][0] = 0;
                x[k][0] = A[k][c+1]; //Moves x onto next v
            }
        }
    }
    system("PAUSE");
    return V;
}
For testing purposes, I have been using the 2D Array: [[1,1,4],[1,4,2],[1,4,2],[1,1,0]]. Each column is its own 4x1 matrix. The matrices should be outputted as: [1,1,1,1]T, [-1.5,1.5,1.5,-1.5]T, and [2,0,0,-2]T respectively. What's happening now is that the first column comes out correctly (it's the same matrix), but the second and third come out to something that is potentially similar but not equal to their intended values.
Again, each time I call on the orthogonal method, it outputs something different. I think it's due to the numbers inputted in the proj() method, but I am not fully sure.
The apmatrix class is from the AP College Board, back when they taught C++. It is similar to vectors or ArrayLists in Java.
Here is a link to apmatrix.cpp and to the documentation or conditions (probably more useful), apmatrix.h.
Here is a link to the full code (I added visual markers to see what the computer is doing).
It's fair to assume that all custom methods work as intended (except maybe the matrix regression ones, but those are irrelevant here). Be sure to enter the matrix using the enter method before trying to factorize. The code might be inefficient, partly because I taught myself C++ not too long ago and have been trying different ways to fix it. Thank you for the help!
As said in comments:
@AhmedFasih After doing more tests today, I have found that it is in fact some memory issue. I found that for some reason, if a variable or an apmatrix object is declared within a loop and initialized, and that loop is then reiterated, the memory does not entirely wipe the value stored in that variable or object. This is noted in two places in my code. For whatever reason, I had to set the doubles a, b, and c to 0 in the proj method and the apmatrix<double> dh to 0 in the mult method, or they would store some value in the next iteration. Thank you so much for your help!
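Based on that comment, the fix amounts to explicitly zero-initialising the scalars before they are used, e.g. in proj() (a sketch of just the changed lines; mult() and transpose() are the question's own helpers):

// inside proj(): initialise the accumulators before assigning to them
double a = 0, b = 0, c = 0;
a = mult(transpose(u, u), v)[0][0];
b = mult(transpose(v, v), v)[0][0];
c = a / b;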

Tensor Product Algorithm Optimization

double data[12] = {1, z, z^2, z^3, 1, y, y^2, y^3, 1, x, x^2, x^3};
double result[64] = {
    1,          z,           z^2,             z^3,
    y,          zy,          (z^2)y,          (z^3)y,
    y^2,        z(y^2),      (z^2)(y^2),      (z^3)(y^2),
    y^3,        z(y^3),      (z^2)(y^3),      (z^3)(y^3),
    x,          zx,          (z^2)x,          (z^3)x,
    yx,         zyx,         (z^2)yx,         (z^3)yx,
    (y^2)x,     z(y^2)x,     (z^2)(y^2)x,     (z^3)(y^2)x,
    (y^3)x,     z(y^3)x,     (z^2)(y^3)x,     (z^3)(y^3)x,
    x^2,        z(x^2),      (z^2)(x^2),      (z^3)(x^2),
    y(x^2),     zy(x^2),     (z^2)y(x^2),     (z^3)y(x^2),
    (y^2)(x^2), z(y^2)(x^2), (z^2)(y^2)(x^2), (z^3)(y^2)(x^2),
    (y^3)(x^2), z(y^3)(x^2), (z^2)(y^3)(x^2), (z^3)(y^3)(x^2),
    x^3,        z(x^3),      (z^2)(x^3),      (z^3)(x^3),
    y(x^3),     zy(x^3),     (z^2)y(x^3),     (z^3)y(x^3),
    (y^2)(x^3), z(y^2)(x^3), (z^2)(y^2)(x^3), (z^3)(y^2)(x^3),
    (y^3)(x^3), z(y^3)(x^3), (z^2)(y^3)(x^3), (z^3)(y^3)(x^3)
};
What is the fastest way (fewest operations) to produce result given data? Assume that data is variable in size, but its length is always a multiple of 4 (e.g., 4, 8, 12, etc.).
No Boost. I am trying to keep my dependencies small. STL Algorithms are ok.
HINT: the result array size should always be 4^(data size / 4) (e.g., 4, 16, 64, etc.).
BONUS: If you can compute result just given x, y, z
Additional examples:
double data[4] = {1, z, z^2, z^3};
double result[4] = {1, z, z^2, z^3};
double data[8] = {1, z, z^2, z^3, 1, y, y^2, y^3};
double result[16] = { ... };
I chose the accepted answer code after running this benchmark: https://gist.github.com/1232406. Basically, the top two codes were run and the one with the smallest execution time won.
#include <vector>
#include <array>

void Tensor(std::vector<double>& result, double x, double y, double z) {
    result.resize(64); //almost a noop if already the right size
    double tz = z*z;
    double ty = y*y;
    double tx = x*x;
    std::array<double, 12> data = {0, 0, tz, tz*z, 1, y, ty, ty*y, 1, x, tx, tx*x};
    std::vector<double>::iterator iter = result.begin();
    int yi;
    double xy;
    for(int xi = 0; xi < 4; ++xi) {
        for(yi = 0; yi < 4; ++yi) {
            xy = data[4+yi]*data[8+xi];
            *iter = xy;             //a smart compiler can do these four in parallel
            *(++iter) = z*xy;
            *(++iter) = data[2]*xy;
            *(++iter) = data[3]*xy;
            ++iter;                 //workaround for speed!
        }
    }
}
There's probably at least one bug in here somewhere, but it should be fast, with no dependencies (outside of std::vector/std::array), and it just takes x, y, z. I avoided recursion, so it only works for 3 in/64 out, but the concept can be applied to any number of parameters; you just have to write out each case yourself.
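For reference, a minimal call of this version might look like the following (a sketch; the test values are arbitrary):

int main() {
    std::vector<double> result;
    Tensor(result, 2.0, 3.0, 5.0);   // x, y, z
    // result[0..63] now holds the 4*4*4 tensor product in the question's ordering
    return 0;
}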
A good compiler should autovectorize this; I guess none of my compilers are good:
void tensor(const double *restrict data,
            int dimensions,
            double *restrict result) {
    result[0] = 1.0;
    for (int i = 0; i < dimensions; i++) {
        for (int j = (1 << (i * 2)) - 1; j > -1; j--) {
            double alpha = result[j];
            {
                double *restrict dst = &result[j * 4];
                const double *restrict src = &data[(dimensions - 1 - i) * 4];
                for (int k = 0; k < 4; k++) dst[k] = alpha * src[k];
            }
        }
    }
}
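A call for the three-parameter case from the question might look like this (a sketch; it assumes the function above is compiled as C99, or with restrict swapped for your C++ compiler's __restrict__ extension):

// data laid out as in the question: {1, z, z^2, z^3, 1, y, y^2, y^3, 1, x, x^2, x^3}
double x = 2.0, y = 3.0, z = 5.0;
double data[12] = {1, z, z*z, z*z*z, 1, y, y*y, y*y*y, 1, x, x*x, x*x*x};
double result[64];
tensor(data, 3, result);   // result follows the 64-element ordering from the question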
You should use a dynamic programming approach; that is, reuse previous results. For example, keep the y^2 result and use it when computing (y^2)z instead of computing it again.
#include <vector>
#include <cstddef>
#include <cmath>

void Tensor(std::vector<double>& result, const std::vector<double>& variables, size_t index)
{
    double p1 = variables[index];
    double p2 = p1*p1;
    double p3 = p1*p2;
    if (index == variables.size() - 1) {
        result.push_back(1);
        result.push_back(p1);
        result.push_back(p2);
        result.push_back(p3);
    } else {
        Tensor(result, variables, index+1);
        ptrdiff_t size = result.size();
        for(int j = 0; j < size; ++j)
            result.push_back(result[j]*p1);
        for(int j = 0; j < size; ++j)
            result.push_back(result[j]*p2);
        for(int j = 0; j < size; ++j)
            result.push_back(result[j]*p3);
    }
}

std::vector<double> Tensor(const std::vector<double>& params) {
    std::vector<double> result;
    size_t rsize = 1u << (2*params.size());
    result.reserve(rsize);
    Tensor(result, params, 0);
    return result;
}

int main() {
    std::vector<double> params;
    params.push_back(3.1415926535);
    params.push_back(2.7182818284);
    params.push_back(42);
    params.push_back(65536);
    std::vector<double> result = Tensor(params);
}
I verified that this one compiles and runs (http://ideone.com/IU1eQ). It runs fast, with no dependencies (outside of std::vector). It also takes any number of parameters. Since calling the recursive form is awkward, I made a wrapper. It makes one function call for each parameter, and one dynamic memory allocation (in the wrapper).
You should look into Pascal's pyramid to get a fast solution.
One more thing: it looks like this would be the basis of a finite element solver. Writing your own BLAS-style routines is usually not a good idea; do not reinvent the wheel! I think you should use a BLAS implementation like Intel MKL or a CUDA-based BLAS.

Creating a sparse matrix in CHOLMOD or SuiteSparseQR

In SuiteSparseQR, all of the examples I can find use stdin or a file read to create a sparse matrix. Could someone provide a simple example of how to create one directly in C++?
Even better, the CHOLMOD documentation mentions a sparse2 function available in MATLAB, which behaves the same as sparse. Can this be used in C++?
The data structures used by SuiteSparseQR (e.g. cholmod_sparse) are defined in the CHOLMOD library. You can find more information about them in the CHOLMOD documentation, which is much larger than the one for SuiteSparseQR.
I am assuming that you are trying to solve a linear system; see the CSparse package from Tim Davis, or the Boost matrix libraries, which also have numeric bindings that interface UMFPACK and some LAPACK functions, AFAIK...
CHOLMOD is a pretty awesome project - thanks Tim Davis :)
There is, surprisingly, a lot of code on GitHub that makes use of CHOLMOD, but you have to be logged into GitHub and know what you're looking for!
So, after crawling through the CHOLMOD documentation and source code, and then searching through GitHub for code that uses CHOLMOD, you would find out what to do.
But for most developers who want/need a quick example, here it is below.
*Note that your mileage might vary depending on how you compiled SuiteSparse.
(You might need to use the cholmod_ variant (without the l), i.e. not cholmod_l_; and use int for indexing, not long int).
// example.cpp
#include "SuiteSparseQR.hpp"
#include "SuiteSparse_config.h"

int main (int argc, char **argv)
{
    cholmod_common Common, *cc;
    cholmod_sparse *A;
    cholmod_dense *X, *B;

    // start CHOLMOD
    cc = &Common;
    cholmod_l_start (cc);

    /* A =
    [
        1.1,  0.0, -0.5, 0.7
        0.0, -2.0,  0.0, 0.0
        0.0,  0.0,  0.9, 0.0
        0.0,  0.0,  0.0, 0.6
    ]
    */
    int m = 4;           // num rows in A
    int n = 4;           // num cols in A
    int nnz = 6;         // num non-zero elements in A
    int unsymmetric = 0; // A is non-symmetric: see cholmod.h > search for `stype` for more details

    // In coordinate form (COO) a.k.a. triplet form (zero-based indexing)
    int i[nnz] = {0, 1, 0, 2, 0, 3};                  // row indices
    int j[nnz] = {0, 1, 2, 2, 3, 3};                  // col indices
    double x[nnz] = {1.1, -2.0, -0.5, 0.9, 0.7, 0.6}; // values

    // Set up the cholmod matrix in COO/triplet form
    cholmod_triplet *T = cholmod_l_allocate_triplet(m, n, nnz, unsymmetric, CHOLMOD_REAL, cc);
    T->nnz = nnz;

    for (int ind = 0; ind < nnz; ind++)
    {
        ((long int *) T->i)[ind] = i[ind]; // Notes:
        ((long int *) T->j)[ind] = j[ind]; // (1) casting necessary because these are void* (see cholmod.h)
        ((double *) T->x)[ind] = x[ind];   // (2) direct assignment will cause memory corruption
    }                                      // (3) long int for index pointers corresponds to usage of cholmod_l_* functions

    // convert COO/triplet to CSC (compressed sparse column) format
    A = (cholmod_sparse *) cholmod_l_triplet_to_sparse(T, nnz, cc);
    // note: if you already know CSC format you can skip the triplet allocation and instead use cholmod_allocate_sparse
    //       and assign the member variables: see cholmod.h > cholmod_sparse_struct definition

    // B = ones (size (A,1),1)
    B = cholmod_l_ones (A->nrow, 1, A->xtype, cc);

    // X = A\B
    X = SuiteSparseQR <double> (A, B, cc);

    // Print contents of X
    printf("X = [\n");
    for (int ind = 0; ind < n; ind++)
    {
        printf("%f\n", ((double *) X->x)[ind]);
    }
    printf("]\n");
    fflush(stdout);

    // free everything and finish CHOLMOD
    cholmod_l_free_triplet (&T, cc);
    cholmod_l_free_sparse (&A, cc);
    cholmod_l_free_dense (&X, cc);
    cholmod_l_free_dense (&B, cc);
    cholmod_l_finish (cc);

    return 0;
}
Supposing you have compiled SuiteSparse successfully and you have saved example.cpp in the base directory, then the following should work (on Linux):
gcc example.cpp -I./include -L./lib -lcholmod -lspqr -lsuitesparseconfig -o example
# Add SuiteSparse libraries to your `ld` search path if necessary
LD_LIBRARY_PATH=$(pwd)/lib
export LD_LIBRARY_PATH
./example
Output:
X = [
0.353535
-0.500000
1.111111
1.666667
]