Gaussian Elimination Inverse Matrix - C++

What does the block of code shown below do? I'm confused.
if (temp != j)
    for (k = 0; k < 2 * dimension; k++) {
        temporary = augmentedmatrix[j][k];
        augmentedmatrix[j][k] = augmentedmatrix[temp][k];
        augmentedmatrix[temp][k] = temporary;
    }

Edit:
The original question was how to compute the inverse of a matrix with Gaussian elimination. The OP got stuck on the actual elimination part of the algorithm.
Now, if element 1,1 is not zero, then you are going to zero element 2,1 using the elementary row operations:
F_s,t - interchange rows s and t
F_s,t_(a) - add a times row t to row s
F_s_(a) - multiply row s by a
You could also start by zeroing the 1,2 element instead. (The sketch below shows these three operations as plain loops.)
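As a sketch only (assuming maximum is the compile-time constant used for the augmented matrix further down), the three operations are nothing more than loops over one or two rows:
/* sketch: the three elementary row operations on the augmented matrix */
void swap_rows(float m[][2 * maximum], int dimension, int s, int t)        /* F_s,t */
{
    for (int k = 0; k < 2 * dimension; k++) {
        float tmp = m[s][k];
        m[s][k] = m[t][k];
        m[t][k] = tmp;
    }
}
void add_scaled_row(float m[][2 * maximum], int dimension, int s, int t, float a)  /* F_s,t(a): row s += a * row t */
{
    for (int k = 0; k < 2 * dimension; k++)
        m[s][k] += a * m[t][k];
}
void scale_row(float m[][2 * maximum], int dimension, int s, float a)      /* F_s(a): row s *= a */
{
    for (int k = 0; k < 2 * dimension; k++)
        m[s][k] *= a;
}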
These operations correspond to elementary matrices (and to linear transformation matrices). Because each invertible matrix can be expressed as a product of elementary matrices
A = P1 * ... * Pk * Ql * ... * Q1
which are themselves invertible, we can recover the inverse of A, A_inverse, by applying the corresponding operations to the original matrix A; this is the same as multiplying A by the inverses of those elementary matrices:
A_inverse = Q1_inv * ... * Ql_inv * Pk_inv * ... * P1_inv
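In other words, this is exactly why the augmented-matrix trick below works: if a sequence of elementary row operations E_1, ..., E_r reduces A to the identity, then
E_r * ... * E_1 * A = I, so A_inverse = E_r * ... * E_1 = E_r * ... * E_1 * I,
i.e. applying the very same row operations to the identity matrix produces A_inverse.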
For each row in a matrix, if the row does not consist only of zeros, then the left-most non-zero entry is called the leading coefficient (or pivot) of that row. So if two leading coefficients are in the same column, a row operation of the "add a multiple of another row" type (F_s,t(a) above) can be used to make one of those coefficients zero. Then, by using the row-swapping operation, one can always order the rows so that for every non-zero row, the leading coefficient is to the right of the leading coefficient of the row above. If this is the case, the matrix is said to be in row echelon form: the lower-left part of the matrix contains only zeros, and all of the zero rows are below the non-zero rows. The word "echelon" is used here because one can roughly think of the rows as being ranked by their size, with the largest at the top and the smallest at the bottom.
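For example, the following matrix is in row echelon form (each leading coefficient sits to the right of the one in the row above):
2 1 3
0 5 7
0 0 4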
Now, basically, you should store an augmented matrix
float augmentedmatrix[maximum][2*maximum];
so that you can perform operations on the matrix A and on the identity matrix simultaneously.
Fill the identity matrix:
/* augmenting with identity matrix of similar dimensions */
for (i = 0; i < dimension; i++)
    for (j = dimension; j < 2 * dimension; j++)
        if (i == j % dimension)
            augmentedmatrix[i][j] = 1;
        else
            augmentedmatrix[i][j] = 0;
and perform Gauss-Jordan elimination on the extended matrix by:
/* finding maximum jth column element in last (dimension-j) rows */
/* swapping row which has maximum jth column element */
/* performing row operations to form required identity matrix
out of the input matrix */
What you are missing is:
/* using gauss-jordan elimination */
for (j = 0; j < dimension; j++) {
    temp = j;
    /* finding maximum jth column element in last (dimension-j) rows */
    /* (compare absolute values so that a large negative pivot can be chosen too) */
    for (i = j + 1; i < dimension; i++)
        if (fabs(augmentedmatrix[i][j]) > fabs(augmentedmatrix[temp][j]))
            temp = i;
    if (fabs(augmentedmatrix[temp][j]) < minvalue) {
        printf("\n Elements are too small to deal with !!!");
        return -1;
    }
    /* swapping row which has maximum jth column element */
    if (temp != j)
        for (k = 0; k < 2 * dimension; k++) {
            temporary = augmentedmatrix[j][k];
            augmentedmatrix[j][k] = augmentedmatrix[temp][k];
            augmentedmatrix[temp][k] = temporary;
        }
    /* performing row operations to form required identity matrix
       out of the input matrix */
    for (i = 0; i < dimension; i++)
        if (i != j) {
            /* subtract the right multiple of the pivot row to zero column j in row i */
            r = augmentedmatrix[i][j];
            for (k = 0; k < 2 * dimension; k++)
                augmentedmatrix[i][k] -= augmentedmatrix[j][k] * r / augmentedmatrix[j][j];
        } else {
            /* divide the pivot row by the pivot so it becomes 1 */
            r = augmentedmatrix[i][j];
            for (k = 0; k < 2 * dimension; k++)
                augmentedmatrix[i][k] /= r;
        }
}
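Once this loop finishes, the left half of augmentedmatrix has been reduced to the identity and the right half holds the inverse, so the final step is just a copy (a minimal sketch, where inverse is a hypothetical dimension x dimension output array):
/* copy the right half of the augmented matrix out as the inverse */
for (i = 0; i < dimension; i++)
    for (j = 0; j < dimension; j++)
        inverse[i][j] = augmentedmatrix[i][j + dimension];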

Related

Neural Network Accuracy Plateauing

I am making a neural network for the purpose of identifying letters. Currently, during training, the network seems to plateau at around 12% accuracy. As input, the network takes a 10x10 image (formatted as a 100x1 column vector) and outputs a 26x1 column vector where each element corresponds to a different letter. Right now I don't have a great data set (only 50 samples) but I iterate over it a few hundred times, and each iteration the accuracy doesn't really get any better than 6 / 50 correct. What I consider a correct identification is the element that corresponds to the correct letter being the greatest number in the vector. I was hoping to get a decently good accuracy before moving on and expanding the data set.
ML::Matrix ML::NeuralNetwork::calculate(const Matrix & input)
{
    //all inputs and layers are column vectors
    //weights and biases are std::vector of ML::Matrix
    Matrix resultant = input;
    results.add(resultant); //circular linked list to store the intermediate results
    for (int i = 0; i < weights.size(); ++i) {
        resultant = (weights[i] * resultant) + biases[i];
        resultant.function(sigmoid); //apply sigmoid to every element in the matrix
        results.add(resultant);
    }
    return resultant;
}
void ML::NeuralNetwork::learn(const Matrix & calc, const Matrix & real)
{
    //backpropagation
    ML::Matrix cost = 2 * (calc - real); //derivative of cost function: (calc - real)^2
    for (int i = weights.size() - 1; i >= 0; --i) {
        ML::Matrix dCdB = cost.hadamardProduct(ML::sigDerivative(weights[i] * results[i] + biases[i]));
        ML::Matrix dCdW = dCdB * results[i].transpose();
        cost = weights[i].transpose() * dCdB;
        weights[i] -= learningRate * dCdW;
        biases[i] -= learningRate * dCdB;
    }
}
ML::Matrix ML::Matrix::operator*(const Matrix & other) const throw(ML::MathUndefinedException)
{
    //naive matrix-multiplication and matrix-vector product
    if (columns != other.rows) throw MathUndefinedException();
    Matrix output(rows, other.columns);
    if (other.columns == 1) {
        for (int i = 0; i < rows; ++i) {
            for (int j = 0; j < columns; ++j)
                output.set(i, output.get(i) + get(i, j) * other.get(j));
        }
    }
    else {
        for (int i = 0; i < rows; ++i) {
            for (int j = 0; j < columns; ++j) {
                for (int k = 0; k < other.rows; ++k) {
                    output.set(i, j, output.get(i, j) + get(i, k) * other.get(k, j));
                }
            }
        }
    }
    return output;
}
My network does work better with simpler examples. In a test with 3 inputs and 1 output it plateaus at about 70%, and in another test with only 1 input and 1 output it gets around 99% accuracy, so I am not certain whether there is a problem with the code. While the code is abstracted to handle n layers of any size, I have been testing with around 1-2 hidden layers (a total of 3-4 layers). I have tested various training rates, even non-constant and differential training rates. I have tested each individual matrix manipulation function on its own (hadamardProduct, transposing, matrix addition, etc.), so I am almost certain the problem isn't in one of those functions (thus I didn't show their code, with the exception of matrix multiplication).
All help will be appreciated.

Calculating the determinant of a matrix

I am trying to calculate the determinant of a square matrix using row operations.
I ran into this code but I do not really understand how it works.
What do subi and subj do? Does it use row operations?
What is the logic behind this code?
int c, subi, i, j, subj;
double submat[10][10], d = 0;
if (n == 2) {
    return ((mat[0][0] * mat[1][1]) - (mat[1][0] * mat[0][1]));
}
else {
    for (c = 0; c < n; c++) {
        subi = 0;
        for (int i = 1; i < n; i++) {
            subj = 0;
            for (j = 0; j < n; j++) {
                if (j == c)
                    continue;
                submat[subi][subj] = mat[i][j];
                subj++;
            }
            subi++;
        }
        d = d + (pow(-1, c) * mat[0][c] * determinant(n - 1, submat));
    }
}
return d;
The function, which looks like:
double determinant(int n, double mat[10][10]);
recursively expands along the first row: for each column c it builds the submatrix obtained by deleting the first row and column c (this is what subi and subj index), and calls itself on that (n-1) by (n-1) submatrix. The recursion ends for 2 by 2 matrices.
So this is a recursive function using Laplace expansion to calculate the determinant, whose base case is a 2 by 2 matrix.
However, it does not seem to me to be a good program, because:
what if the input is a 1 by 1 matrix?
submat is limited to a size of 10 by 10
submat is a waste of memory
When the matrix is large, it is better to use LU decomposition (a sketch of the elimination approach follows).
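To follow up on that last point, here is a minimal sketch (not the code above) of computing the determinant by Gaussian elimination with partial pivoting, which is O(n^3); the determinant is the product of the pivots, with the sign flipped once per row swap:
#include <cmath>
#include <utility>
#include <vector>

// determinant via Gaussian elimination with partial pivoting, O(n^3)
double determinant_elim(std::vector<std::vector<double>> a) // works on a copy
{
    int n = (int)a.size();
    double det = 1.0;
    for (int c = 0; c < n; ++c) {
        // pick the row with the largest absolute value in column c
        int pivot = c;
        for (int i = c + 1; i < n; ++i)
            if (std::fabs(a[i][c]) > std::fabs(a[pivot][c]))
                pivot = i;
        if (std::fabs(a[pivot][c]) < 1e-12)
            return 0.0;                  // singular (to working precision)
        if (pivot != c) {
            std::swap(a[pivot], a[c]);
            det = -det;                  // a row swap flips the sign
        }
        det *= a[c][c];
        // eliminate column c below the pivot
        for (int i = c + 1; i < n; ++i) {
            double f = a[i][c] / a[c][c];
            for (int j = c; j < n; ++j)
                a[i][j] -= f * a[c][j];
        }
    }
    return det;
}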

Rotate a matrix n times

I was solving problems on HackerRank when I got stuck at this one.
Problem Statement
You are given a 2D matrix, a, of dimension MxN and a positive integer R. You have to rotate the matrix R times and print the resultant matrix. Rotation should be in anti-clockwise direction.
Rotation of a 4x5 matrix is represented by the following figure. Note that in one rotation, you have to shift elements by one step only (refer to the sample tests for more clarity).
It is guaranteed that the minimum of M and N will be even.
Input
First line contains three space separated integers, M, N and R, where M is the number of rows, N is number of columns in matrix, and R is the number of times the matrix has to be rotated.
Then M lines follow, where each line contains N space separated positive integers. These M lines represent the matrix.
Output
Print the rotated matrix.
Constraints
2 <= M, N <= 300
1 <= R <= 10^9
min(M, N) % 2 == 0
1 <= a_ij <= 10^8, where i ∈ [1..M] & j ∈ [1..N]
What I tried to do was store the circles in a 1D array. Something like this.
while(true)
{
    k = 0;
    for(int j = left; j <= right; ++j) { temp[k] = a[top][j]; ++k; }
    top++;
    if(top > down || left > right) break;
    for(int i = top; i <= down; ++i) { temp[k] = a[i][right]; ++k; }
    right--;
    if(top > down || left > right) break;
    for(int j = right; j >= left; --j) { temp[k] = a[down][j]; ++k; }
    down--;
    if(top > down || left > right) break;
    for(int i = down; i >= top; --i) { temp[k] = a[i][left]; ++k; }
    left++;
    if(top > down || left > right) break;
}
Then I could easily rotate the 1D array by R modulo its length. But then how do I put it back in matrix form? Using a loop again would possibly cause a timeout.
Please don't provide code, but only give suggestions. I want to do it myself.
Solution created:
#include <iostream>
using namespace std;

int main() {
    int m,n,r;
    cin>>m>>n>>r;
    int a[300][300];
    for(int i = 0 ; i < m ; ++i){
        for(int j = 0; j < n ; ++j)
            cin>>a[i][j];
    }
    int left = 0;
    int right = n-1;
    int top = 0;
    int down = m-1;
    int tleft = 0;
    int tright = n-1;
    int ttop = 0;
    int tdown = m-1;
    int b[300][300];
    int k,size;
    int temp[1200];
    while(true){
        k=0;
        for(int i = left; i <= right ; ++i)
        {
            temp[k] = a[top][i];
            // cout<<temp[k]<<" ";
            ++k;
        }
        ++top;
        if(top > down || left > right)
            break;
        for(int i = top; i <= down ; ++i)
        {
            temp[k]=a[i][right];
            // cout<<temp[k]<<" ";
            ++k;
        }
        --right;
        if(top > down || left > right)
            break;
        for(int i = right; i >= left ; --i)
        {
            temp[k] = a[down][i];
            // cout<<temp[k]<<" ";
            ++k;
        }
        --down;
        if(top > down || left > right)
            break;
        for(int i = down; i >= top ; --i)
        {
            temp[k] = a[i][left];
            // cout<<temp[k]<<" ";
            ++k;
        }
        ++left;
        if(top > down || left > right)
            break;
        // ________________________________
        size = k;
        k=0;
        // cout<<size<<endl;
        for(int i = tleft; i <= tright ; ++i)
        {
            b[ttop][i] = temp[(k + (r%size))%size];
            // cout<<(k + (r%size))%size<<" ";
            // int index = (k + (r%size))%size;
            // cout<<index;
            ++k;
        }
        ++ttop;
        for(int i = ttop; i <= tdown ; ++i)
        {
            b[i][tright]=temp[(k + (r%size))%size];
            ++k;
        }
        --tright;
        for(int i = tright; i >= tleft ; --i)
        {
            b[tdown][i] = temp[(k + (r%size))%size];
            ++k;
        }
        --tdown;
        for(int i = tdown; i >= ttop ; --i)
        {
            b[i][tleft] = temp[(k + (r%size))%size];
            ++k;
        }
        ++tleft;
    }
    size=k;
    k=0;
    if(top != ttop){
        for(int i = tleft; i <= tright ; ++i)
        {
            b[ttop][i] = temp[(k + (r%size))%size];
            ++k;
        }
        ++ttop;
    }
    if(right!=tright){
        for(int i = ttop; i <= tdown ; ++i)
        {
            b[i][tright]=temp[(k + (r%size))%size];
            ++k;
        }
        --tright;
    }
    if(down!=tdown){
        for(int i = tright; i >= tleft ; --i)
        {
            b[tdown][i] = temp[(k + (r%size))%size];
            ++k;
        }
        --tdown;
    }
    if(left!=tleft){
        for(int i = tdown; i >= ttop ; --i)
        {
            b[i][tleft] = temp[(k + (r%size))%size];
            ++k;
        }
        ++tleft;
    }
    for(int i = 0 ; i < m ;++i){
        for(int j = 0 ; j < n ;++j)
            cout<<b[i][j]<<" ";
        cout<<endl;
    }
    return 0;
}
You need to break down this problem (it reminds me of an interview question from Google and Facebook):
First solve rotating a sequence by a single position
Then solve rotating a sequence N times
Model each "circle" or ring as an array. You may or may not actually need to store it in a separate data structure
Iterate over each ring and apply the rotating algorithm
Let's consider the case of an array of length L which needs to be rotated R times. Observe that if R is a multiple of L, the array will be unchanged.
Observe too that rotating x times to the right is the same as rotating L - x times to the left (and vice versa).
Thus you can first design an algorithm able to rotate either left or right by exactly one position
Reduce the problem of rotating R times to the left to rotating R modulo L times to the left
If you want to go further, reduce the problem of rotating R modulo L times to the left to rotating left by R modulo L or rotating right by L - (R modulo L), whichever is smaller. This means that if you have 100 elements and you have to do 99 rotations left, you'd better do 1 rotation right and be done with it.
So the complexity will be O(number of circles x circle length x single rotation cost)
With an in-place array this means O(min(M,N) * (N * M)^2)
If you use a doubly linked list as temporary storage, a single rotation is done by removing the front and putting it at the tail (or vice versa to rotate right). So what you can do is copy all the data to a linked list first, run the single-rotation algorithm R modulo L times, copy the linked list back onto the ring positions, and move on to the next ring until all rings are processed.
Copying the ring data to the list is O(L), L <= N*M
A single rotation costs O(1)
All R modulo L rotations are O(L)
Repeat for all min(M,N) rings
With a spare doubly linked list this means a complexity of O(min(M,N) * (N * M))
I would start with a simplifying assumption: M is less than or equal to N. Thus, you are guaranteed to have an even number of rows. (What if M > N? Then transpose the matrix, carry out the algorithm, and transpose the matrix again.)
Because you have an even number of rows, you can easily find the corners of each cycle within the matrix. The outermost cycle has these corners:
a1,1 → aM,1 → aM,N → a1,N
To find the next cycle, move each corner inward, which means incrementing or decrementing the index at each corner as appropriate.
Knowing the sequence of corners allows you to iterate over each cycle and store the values in a one-dimensional vector. In each such vector a, start from index R % a.size() and increment the index a.size() - 1 times to iterate over the rotated elements of the cycle. Copy each element a[i % a.size()] back to the cycle.
Note that we don't actually rotate the vector. We accomplish the rotation by starting from an offset index when we copy elements back to the matrix. Thus, the overall running time of the algorithm is O(MN), which is optimal because it costs O(MN) just to read the input matrix.
I would treat this as a problem that divides the matrix into submatrices. You could probably write a function that shifts the matrix's (and submatrices') outer rows and columns by one each time you call it. Take care to handle the four corners of the matrix appropriately.
Check this out for suggestions on how to shift the columns.
Edit (more detailed):
Read each matrix circle in as a vector, use std::rotate on it by R % vector length, and write it back. At most 150 rings need to be processed.
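A rough sketch of that idea (hypothetical names: a is the matrix, and top/left/bottom/right bound one ring, inclusive; the ring is assumed to be at least 2x2, which holds here because min(M,N) is even):
#include <algorithm>
#include <vector>

// sketch: rotate one ring of the matrix by r steps using std::rotate
void rotate_ring(std::vector<std::vector<int>>& a, int top, int left, int bottom, int right, int r)
{
    // walk the ring once (clockwise here) and collect its values
    std::vector<int> ring;
    for (int j = left; j <= right; ++j)     ring.push_back(a[top][j]);
    for (int i = top + 1; i <= bottom; ++i) ring.push_back(a[i][right]);
    for (int j = right - 1; j >= left; --j) ring.push_back(a[bottom][j]);
    for (int i = bottom - 1; i > top; --i)  ring.push_back(a[i][left]);

    // std::rotate moves ring[shift] to the front; depending on which direction
    // counts as "anti-clockwise" along this traversal you may need
    // ring.size() - shift instead
    int shift = r % (int)ring.size();
    std::rotate(ring.begin(), ring.begin() + shift, ring.end());

    // write the rotated values back along the same walk
    int k = 0;
    for (int j = left; j <= right; ++j)     a[top][j] = ring[k++];
    for (int i = top + 1; i <= bottom; ++i) a[i][right] = ring[k++];
    for (int j = right - 1; j >= left; --j) a[bottom][j] = ring[k++];
    for (int i = bottom - 1; i > top; --i)  a[i][left] = ring[k++];
}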
Each element moves uniquely according to one of four formulas, adding five movements of known sizes (I'll leave the size calculation out since you wanted to figure it out):
formula (one of these four):
left + down + right + up + left
down + right + up + left + down
right + up + left + down + right
up + left + down + right + up
Since the smallest side of the matrix is even, we know there is no element that stays in place. After R rotations, an element has circled around floor(R / formula) times (where "formula" stands for the total length of its path above) but still needs to undergo extra = R % formula shifts. Once you know extra, simply calculate the appropriate placement for the element.

In a matrix put 0 in the row and column of a cell which contains 0 without using extra space

Given a matrix, if a cell contains 0, then we have to make the entire row and column corresponding to that cell 0. For example, if
1 2 3
M = 0 4 5
4 2 0
then the output should be
0 2 0
0 0 0
0 0 0
The method I thought of is as follows:
Make auxiliary arrays row[] and col[]. If cell(i,j) contains 0, then mark row[i] and col[j] as 0 (initially row[] and col[] contain all 1s).
Traverse the whole matrix again; if for cell(i,j) either row[i] or col[j] is 0, then set cell(i,j) to 0.
This takes O(m*n) time and O(m+n) space.
How can I optimize it further, especially in terms of space? Any suggestions for improving the time complexity are also welcome.
Aha, this is an old question.
Use one boolean variable (isZeroInFirstRow) recording whether the first row has any zero element, and one boolean variable (isZeroInFirstCol) recording whether the first column has any zero element.
Then, traverse the whole matrix. If cell(i,j)==0, then set cell(0,j) and cell(i,0) to 0.
Traverse the first row of the matrix. If cell(0,j)==0, then set all elements in column(j) to 0.
Traverse the first column of the matrix. If cell(i,0)==0, then set all elements in row(i) to 0.
If isZeroInFirstRow==true, set all elements in row(0) to 0.
If isZeroInFirstCol==true, set all elements in column(0) to 0.
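A minimal sketch of that approach (hypothetical names, vector-of-vectors storage, non-empty matrix assumed):
#include <vector>

// first-row/first-column bookkeeping, O(1) extra space
void zeroRowsCols(std::vector<std::vector<int>>& mat)
{
    int m = (int)mat.size(), n = (int)mat[0].size();
    bool isZeroInFirstRow = false, isZeroInFirstCol = false;
    for (int j = 0; j < n; ++j) if (mat[0][j] == 0) isZeroInFirstRow = true;
    for (int i = 0; i < m; ++i) if (mat[i][0] == 0) isZeroInFirstCol = true;

    // use row 0 and column 0 as the marker arrays
    for (int i = 1; i < m; ++i)
        for (int j = 1; j < n; ++j)
            if (mat[i][j] == 0) { mat[0][j] = 0; mat[i][0] = 0; }

    // zero out marked rows and columns (leave row 0 / column 0 for last)
    for (int i = 1; i < m; ++i)
        for (int j = 1; j < n; ++j)
            if (mat[0][j] == 0 || mat[i][0] == 0) mat[i][j] = 0;

    if (isZeroInFirstRow) for (int j = 0; j < n; ++j) mat[0][j] = 0;
    if (isZeroInFirstCol) for (int i = 0; i < m; ++i) mat[i][0] = 0;
}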
You can solve this in O(1) space. One solution is to iterate over the matrix; for each 0 you see, fill the corresponding row/column with some marker, 'X' for example.
When you finish, you should have something like this:
X 2 X
M= 0 X X
X X 0
Then you iterate again on the matrix and replace each 'X' with 0 to get:
0 2 0
M= 0 0 0
0 0 0
If you are concerned with storage you may think of using some sparse matrix storage formats to store the resulting matrix, and then free the original dense input.
An example of what I am proposing may be the following (implementing COO format) which should take O(M*N) time:
#include<vector>
#include<iostream>
#include<algorithm>
#include<cstddef>
using namespace std;

int main()
{
    constexpr size_t M = 3;
    constexpr size_t N = 3;
    int matrix[M][N] = {
        {1, 2, 3},
        {0, 4, 5},
        {4, 2, 0}
    };
    vector<size_t> markedRows;
    vector<size_t> markedColumns;
    // Search for zeroes
    for (size_t ii = 0; ii < M; ++ii) {
        for (size_t jj = 0; jj < N; ++jj) {
            if (matrix[ii][jj] == 0) {
                markedRows.push_back(ii);
                markedColumns.push_back(jj);
            }
        }
    }
    // Sort columns (rows are ordered by construction)
    sort(markedColumns.begin(), markedColumns.end());
    // Eliminate duplicates
    markedRows.erase(unique(markedRows.begin(), markedRows.end()), markedRows.end());
    markedColumns.erase(unique(markedColumns.begin(), markedColumns.end()), markedColumns.end());
    // Construct COO matrix format
    vector<size_t> irow;
    vector<size_t> icol;
    vector<int> val;
    for (size_t ii = 0; ii < M; ++ii) {
        for (size_t jj = 0; jj < N; ++jj) {
            if ((find(markedRows.begin(), markedRows.end(), ii) == markedRows.end()) &&
                (find(markedColumns.begin(), markedColumns.end(), jj) == markedColumns.end())) {
                irow.push_back(ii);
                icol.push_back(jj);
                val.push_back(matrix[ii][jj]);
            }
        }
    }
    // FROM HERE YOU NO LONGER NEED MATRIX, AND YOU CAN FREE THE STORAGE
    // Print non zero entries
    for (size_t ii = 0; ii < irow.size(); ++ii) {
        cout << "A[" << irow[ii] << "," << icol[ii] << "] = " << val[ii] << endl;
    }
    return 0;
}
You can use your algorithm without allocating an auxiliary row or column by searching the matrix for a row that contains no zeros and a column that contains no zero elements.
If either of these searches fails, then the resulting matrix will be all zeros, so your work is done by simply setting all elements to zero.
Otherwise, use the row and column you found as the bookkeeping row and column you mentioned, setting the corresponding elements to zero as you find zeros in the remainder of the matrix. Once that pass is done, walk the bookkeeping row, setting the matrix columns to zero for any zero found in the bookkeeping row, and do the same for the bookkeeping column.
Here is an algorithm that can do it in O(M*N) time and O(1) space:
Find the max element in the matrix.
Set Mat[i][j] = max - Mat[i][j] for all (i,j).
Notice that Mat[i][j] will now only have non-negative values.
Use negative values as sentinels, and cells with Mat[i][j] = max mark the original zeros.
Retrieve the original values as Mat[i][j] = max - Mat[i][j].
Simple and easy answer:
Use 2 nested loops to search through all rows and columns; whenever you find a cell equal to 0, set that cell's entire column to zeros and its entire row to zeros. Let me know if it is not clear and I can record a video for it.
int main()
{
    //example matrix dimension: rows (r=6) * columns (c=3)
    int r = 6;
    int c = 3;
    int matrix[r][c];
    for (int i = 0; i < r; ++i) {
        for (int j = 0; j < c; ++j) {
            if (matrix[i][j] == 0) {
                for (int ii = 0; ii < r; ++ii) {
                    matrix[ii][j] = 0;
                }
                for (int jj = 0; jj < c; ++jj) {
                    matrix[i][jj] = 0;
                }
            }
        }
    }
}

Sparse Matrix multiplication like (maxmin) in C++ using Octave libraries

I'm implementing a maxmin function: it works like matrix multiplication, but instead of summing products it takes, for each output entry, the maximum over k of min(a(i,k), b(k,j)). An example of a naive implementation is
double mx = 0;
double mn = 0;
for (i = 0; i < rowsC; i++)
{
    for (j = 0; j < colsC; j++)
    {
        mx = 0;
        for (k = 0; k < colsA; k++)
        {
            if (a(i, k) < b(k, j))
                mn = a(i, k);
            else
                mn = b(k, j);
            if (mn > mx)
                mx = mn;
        }
        c(i, j) = mx;
    }
}
I'm coding it as an Octave oct-file, so I have to use the oct.h data structures. The problem is that I want to implement a sparse version, but usually you need a reference to the next non-zero element in a row or in a column, as in this example (see algorithm 4.3):
http://www.eecs.harvard.edu/~ellard/Q-97/HTML/root/node20.html
There, doing row_p->next gives the next nonzero element of the row (and the same for the column). Is there a way to do the same with the Octave SparseMatrix class? Or is there another way of implementing sparse matrix multiplication that I can adapt for my maxmin function?
I don't know if anyone would ever be interested, but I managed to find a solution.
The code of the solution is part of fl-core1.0, a fuzzy logic core package for Octave, and it is released under the LGPL license.
(The code relies on some Octave functions.)
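The key point is that Octave's SparseMatrix exposes its column-compressed storage directly: the indices cidx(j) .. cidx(j+1)-1 belong to column j, ridx(k) gives the row of the k-th stored nonzero and data(k) its value, which plays the role of the row_p->next pointers from the linked page (for rows, the code below works on the transpose, since the storage is column compressed). A minimal sketch of walking one column this way, assuming it is compiled as part of an oct-file:
#include <octave/oct.h>

// sketch: visit the nonzero entries of column j of an Octave SparseMatrix
static void walk_column(const SparseMatrix& s, octave_idx_type j)
{
    for (octave_idx_type k = s.cidx(j); k < s.cidx(j + 1); ++k) {
        octave_idx_type row = s.ridx(k);   // row index of this nonzero entry
        double value = s.data(k);          // its value
        octave_stdout << "s(" << row << "," << j << ") = " << value << "\n";
    }
}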
// Calculate the S-Norm/T-Norm composition of sparse matrices (single thread)
void sparse_compose(octave_value_list args)
{
    // Create constant versions of the input matrices to prevent them from being filled with zeros on reading.
    // a is the const reference to the transpose of a because octave sparse matrices are column compressed
    // (to cycle on the rows, we cycle on the columns of the transpose).
    SparseMatrix atmp = args(0).sparse_matrix_value();
    const SparseMatrix a = atmp.transpose();
    const SparseMatrix b = args(1).sparse_matrix_value();
    // Declare variables for the T-Norm and S-Norm values
    float snorm_val;
    float tnorm_val;
    // Initialize the result sparse matrix
    sparseC = SparseMatrix((int)colsB, (int)rowsA, (int)(colsB*rowsA));
    // Initialize the number of nonzero elements in the sparse matrix c
    int nel = 0;
    sparseC.xcidx(0) = 0;
    // Calculate the composition for each element
    for (int i = 0; i < rowsC; i++)
    {
        for (int j = 0; j < colsC; j++)
        {
            // Get the index of the first element of the i-th column of a transpose (i-th row of a)
            // and the index of the first element of the j-th column of b
            int ka = a.cidx(i);
            int kb = b.cidx(j);
            snorm_val = 0;
            // Check that the values of the matrix are really not 0 (it happens if the column of a or b has no values),
            // because otherwise cidx(i) or cidx(j) returns the first nonzero element of the previous column
            if (a(a.ridx(ka), i) != 0 && b(b.ridx(kb), j) != 0)
            {
                // Cycle on the i-th column of a transpose (i-th row of a) and the j-th column of b
                // From a.cidx(i) to a.cidx(i+1)-1 there are all the nonzero elements of column i of a transpose (i-th row of a)
                // From b.cidx(j) to b.cidx(j+1)-1 there are all the nonzero elements of column j of b
                while ((ka <= (a.cidx(i+1)-1)) && (kb <= (b.cidx(j+1)-1)))
                {
                    // If a.ridx(ka) == b.ridx(kb) is true, then there's a nonzero value on the same row,
                    // so there's a k for which a'(k, i) (equal to a(i, k)) and b(k, j) are both nonzero
                    if (a.ridx(ka) == b.ridx(kb))
                    {
                        tnorm_val = calc_tnorm(a.data(ka), b.data(kb));
                        snorm_val = calc_snorm(snorm_val, tnorm_val);
                        ka++;
                        kb++;
                    }
                    // If a.ridx(ka) < b.ridx(kb), ka should become the index of the next nonzero element on column i of a
                    // transpose (row i of a)
                    else if (a.ridx(ka) < b.ridx(kb))
                        ka++;
                    // If a.ridx(ka) > b.ridx(kb), kb should become the index of the next nonzero element on column j of b
                    else
                        kb++;
                }
            }
            if (snorm_val != 0)
            {
                // Equivalent to sparseC(i, j) = snorm_val;
                sparseC.xridx(nel) = j;
                sparseC.xdata(nel++) = snorm_val;
            }
        }
        sparseC.xcidx(i+1) = nel;
    }
    // Compress the result sparse matrix because it is initialized with a number of nonzero elements probably greater than the real one
    sparseC.maybe_compress();
    // Transpose the result
    sparseC = sparseC.transpose();
}