I want to split an Eigen dynamic-size array by columns evenly over OpenMP threads.
                        thread 0 | thread 1 | thread 2
[[0, 1, 2],             [[0],    | [[1],    | [[2],
 [3, 4, 5],   becomes:   [3],    |  [4],    |  [5],
 [6, 7, 8]]              [6]]    |  [7]]    |  [8]]
I can use the block method to do that, but I am not sure whether Eigen recognizes that the subarray for each thread occupies contiguous memory.
The documentation of the Block type lists an InnerPanel template parameter with the following description:
InnerPanel is true, if the block maps to a set of rows of a row major
matrix or to set of columns of a column major matrix (optional). The
parameter allows to determine at compile time whether aligned access
is possible on the block expression.
Does Eigen know that vectorization over the subarray for each OpenMP thread is possible because each subarray actually occupies contiguous memory?
If not, how can I make Eigen aware of this?
Program:
#include <Eigen/Eigen>
#include <iostream>
int main() {
    // The dimensions of the matrix are not necessarily 8 x 8.
    // They are only known at run time.
    Eigen::MatrixXi x(8,8);
    x.fill(0);
    int n_parts = 3;
    #pragma omp parallel for
    for (int i = 0; i < n_parts; ++i) {
        int st = i * x.cols() / n_parts;
        int en = (i + 1) * x.cols() / n_parts;
        x.block(0, st, x.rows(), en - st).fill(i);
    }
    std::cout << x << "\n";
}
Result (g++ test.cpp -I<path to eigen includes> -fopenmp -lgomp):
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2
To make sure that a block expression is indeed occupying contiguous memory, use middleCols (or leftCols or rightCols) instead:
#include <Eigen/Core>
#include <iostream>
template<typename XprType, int BlockRows, int BlockCols, bool InnerPanel>
void inspectBlock(const Eigen::Block<XprType, BlockRows, BlockCols, InnerPanel>& block)
{
    // __PRETTY_FUNCTION__ is a GCC/Clang extension
    std::cout << __PRETTY_FUNCTION__ << '\n';
}
int main() {
    Eigen::MatrixXi x(8,8);
    inspectBlock(x.block(0, 1, x.rows(), 2));
    inspectBlock(x.middleCols(1, 2));
}
Result:
void inspectBlock(const Eigen::Block<XprType, BlockRows, BlockCols, InnerPanel>&) [with XprType = Eigen::Matrix<int, -1, -1>; int BlockRows = -1; int BlockCols = -1; bool InnerPanel = false]
void inspectBlock(const Eigen::Block<XprType, BlockRows, BlockCols, InnerPanel>&) [with XprType = Eigen::Matrix<int, -1, -1>; int BlockRows = -1; int BlockCols = -1; bool InnerPanel = true]
Note: -1 is the value of Eigen::Dynamic, i.e., not fixed at compile time.
And of course, if your matrix were row major, you could split it using topRows, middleRows or bottomRows instead.
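Why a panel of whole columns is contiguous can be sketched without Eigen. The snippet below is only an illustration, assuming plain column-major storage with no padding (the default layout of MatrixXi); the helper names columnRange and panelOffsets are made up for this example and are not part of Eigen.

```cpp
#include <cstddef>
#include <utility>

// Half-open column range [st, en) handled by `part` when `cols` columns are
// split over `nParts` workers (same arithmetic as the loop in the question).
std::pair<int, int> columnRange(int cols, int nParts, int part) {
    int st = part * cols / nParts;
    int en = (part + 1) * cols / nParts;
    return {st, en};
}

// In column-major storage element (r, c) lives at offset c * rows + r, so the
// columns [st, en) cover one unbroken run of offsets [st * rows, en * rows).
std::pair<std::size_t, std::size_t> panelOffsets(int rows, int st, int en) {
    return {static_cast<std::size_t>(st) * rows,
            static_cast<std::size_t>(en) * rows};
}
```

For an 8 x 8 matrix and three parts this gives the column ranges [0, 2), [2, 5) and [5, 8), matching the output in the question; the second part covers offsets 16 up to (but not including) 40 with no gaps. A general block that starts at a nonzero row would instead skip elements between consecutive columns.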
I am wondering how I would initialize a matrix of 0s given that I want to use a vector<vector<bool>> type, and whether it is possible to change the elements one by one in the boolean matrix the same way you would with an integer matrix (e.g. matrix[row][col] = 1).
EDIT:
for example to make an NxN matrix I am trying:
int n = 5;
std::vector<std::vector<bool>> (n, std::vector<bool>(n, false))
this gives me the following error
error: no match for call to ‘(std::vector<std::vector<bool> >) (int&, std::vector<bool>)’
for reference, I get the same error if I do something like this:
int n = 5;
std::vector<bool> row(n, false);
std::vector<std::vector<bool>> (n, row)
Sure you can. A vector of vector of boolean values may not necessarily be the most efficient means for this(a), but it's certainly doable:
#include <iostream>
#include <string>
#include <vector>
using tMatrix = std::vector<std::vector<bool>>;
// Take the matrix by const reference to avoid copying it on every call.
void dumpMatrix(const std::string &desc, const tMatrix &matrix) {
    std::cout << desc << ":\n";
    for (const auto &row: matrix) {
        for (const auto &item: row) {
            std::cout << ' ' << item;
        }
        std::cout << '\n';
    }
}
int main() {
    tMatrix matrix = { {1, 0, 0}, {1, 1, 1}, {0, 1, 0}, {0, 0, 0}};
    //tMatrix matrix(2, std::vector<bool>(3, false));
    dumpMatrix("before", matrix);
    matrix[0][2] = 1;
    dumpMatrix("after", matrix);
}
The output of that program shows that both aspects work, the initialisation and the ability to change a single item:
before:
1 0 0 <- note this final bit (row 0, column 2) ...
1 1 1
0 1 0
0 0 0
after:
1 0 1 <- ... has changed here
1 1 1
0 1 0
0 0 0
As an aside, the reason your definition of the matrix isn't working is the presence of the word row. There's no place for names in that type definition, you just need the types:
tMatrix matrix(5, std::vector<bool>(5, false));
I've added a similar line to my code above, commented out. If you replace the current declaration of matrix with that, you'll see:
before:
0 0 0
0 0 0
after:
0 0 1
0 0 0
(a) Unless you need to resize the matrix, you may be better off with a std::array or std::bitset.
Your mistake was trying to name the inner vector passed to the outer vector's constructor:
std::vector<std::vector<bool>> matrix(n, std::vector<bool> row(n, false))
// You can't name the temporary ^^^
It should just be:
std::vector<std::vector<bool>> matrix(n, std::vector<bool>(n, false));
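Both answers boil down to the same constructor call. A minimal sketch of it wrapped in small helpers follows; makeBoolMatrix and countSet are names invented for this example.

```cpp
#include <cstddef>
#include <vector>

using BoolMatrix = std::vector<std::vector<bool>>;

// Build a rows x cols matrix with every element set to false.
BoolMatrix makeBoolMatrix(std::size_t rows, std::size_t cols) {
    return BoolMatrix(rows, std::vector<bool>(cols, false));
}

// Count the elements that are currently true.
std::size_t countSet(const BoolMatrix &m) {
    std::size_t n = 0;
    for (const auto &row : m)
        for (bool item : row)
            if (item) ++n;
    return n;
}
```

After BoolMatrix m = makeBoolMatrix(5, 5); a single element is changed with m[0][2] = true; exactly as it would be with an integer matrix.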
I found a bug in my code and can't figure out the error. I tried debugging by printing each variable step by step, but I can't find my error. Here is what I have and what I want to do:
I have a matrix A:
0000
0101
1010
1111
And I have a matrix B:
10000
21000
30100
41100
20010
21010
40110
41110
30001
41001
30101
41101
40011
41011
40111
41111
The matrix B has 16 rows and 5 columns. The matrix A has 4 rows and 4 columns. Now I declare a matrix C that has 4 rows and 16 columns.
What I want to do is to calculate the inner product of each row of B with a corresponding row of A. By corresponding I mean that the first column of B should define the row of A that I want to multiply. So the B matrix in fact also contains four-dimensional vectors, and the first element of each row selects the row of A. One could say this first column of B is an index for choosing the row of A. Because C++ starts counting at zero, I subtract one from my index. Here is my code:
std::vector< std::vector<int> > C(4, std::vector<int>(16));
std::vector<int> index(4);
std::vector<int> vectorA(4);
std::vector<int> vectorB(4);
for( int y = 0; y < 16; y++)
{
for(int i=0; i<4; ++i){
vectorA[i] = A[ B[y][0]-1 ][i];
}
for( int x = 1; x < 4; x++)
{
vectorB[x -1] = B[y][x];
}
C[ B[y][0] -1][index[ B[y][0] -1] ] = inner_product(vectorA.begin(), vectorA.end(), vectorB.begin(), 0);
index[B[y][0]-1] += 1;
}
This results in my matrix C:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
2 2 3 1 2 1 2 2 3 0 0 0 0 0 0 0
The first two rows are correct, but rows three and four are wrong.
The correct solution has to be (except possibly for the ordering in rows 3 and 4):
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0
4 3 3 2 3 2 3 2 2 0 0 0 0 0 0 0
Where is my problem? Please help, it drives me crazy :( I tried printing each variable step by step but can't find why it is wrong.
Thanks and greetings.
I have to agree with the other comments: Your code is kind of confusing. You should really simplify the access of vectors by index.
The first simple thing you should do is change the first column of B to be zero-based. Everything in C++ is zero-based. Adopt it. Do not start adjusting for it in your code by subtracting one. (This does not gain much simplicity, but it is symptomatic of your code.)
Another source of confusion is that you use the first column of B as an index into A. This might follow from the problem you'd like to solve, but it makes things unclear: the first column of B has a totally different meaning from the other columns. Always code in a way that keeps objects separated by their meaning.
For me the most confusing thing is, that I really do not get what you're up to. With inner product you mean dot product, right? You have 2 sets of vectors you want to calculate the dot product of. This should result in a set of scalars, a 1D vector not a 2D matrix. You do some special stuff with this index vector, which makes the result being a 2D matrix. But you haven't explained the purpose/system behind it. Why do you need a vector for index, not just a scalar??
The index vector is the ugliest and most complex part of your code. Without having a clue what you are up to, I would still guess that you will find out what is going wrong once you start printing the full index vector on every iteration and check whether it changes the way you expect.
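One way to follow that advice, sketched under assumptions: keep the row selector and the vector data in separate fields instead of packing both into one row of B. The IndexedVector struct and the dotWithSelectedRow helper below are hypothetical names, and the selector is zero-based.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// One entry of B: which row of A to use (zero-based) and the actual vector.
struct IndexedVector {
    std::size_t rowOfA;
    std::vector<int> values;
};

// Dot product of the selected row of A with the entry's vector.
// Assumes A[entry.rowOfA] and entry.values have the same length.
int dotWithSelectedRow(const std::vector<std::vector<int>> &A,
                       const IndexedVector &entry) {
    const std::vector<int> &a = A[entry.rowOfA];
    return std::inner_product(a.begin(), a.end(), entry.values.begin(), 0);
}
```

With this layout there is no "- 1" adjustment and no need to skip the first element when computing the product.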
I don't know the rationale behind OP's choices, so I can't properly comment on the design of the code provided, but as far as I can tell there are some mistakes in the example input too.
Given the A and B matrices as presented, the inner product of the second and third rows of A with the corresponding rows of B is always 0:
                                       B[1] { 2, 1, 0, 0, 0 },
row "2" or A[1] is { 0, 1, 0, 1 }  <-  B[4] { 2, 0, 0, 1, 0 },
                                       B[5] { 2, 1, 0, 1, 0 },
The same holds for the successive row. The expected output can be obtained only if those two rows of A are swapped, which is what I did in my code.
vectorA and vectorB and the corresponding copy loops aren't really necessary and are probably the cause of the wrong output:
for( int x = 1; x < 4; x++)
{   //          ^^^^^ this should be <= to reach the last element
    vectorB[x -1] = B[y][x];
}
My code, with the updated input and the direct use of A and B is:
#include <iostream>
#include <vector>
#include <numeric>
using vec_t = std::vector<int>; // I assume a C++11 compliant compiler
using mat_t = std::vector<vec_t>;
using std::cout;
int main() {
    mat_t A{
        { 0, 0, 0, 0 },
        { 1, 0, 1, 0 }, // <-- those lines are swapped
        { 0, 1, 0, 1 }, // <--
        { 1, 1, 1, 1 }
    };
    mat_t B{
        { 1, 0, 0, 0, 0 },
        { 2, 1, 0, 0, 0 },
        { 3, 0, 1, 0, 0 },
        { 4, 1, 1, 0, 0 },
        { 2, 0, 0, 1, 0 },
        { 2, 1, 0, 1, 0 },
        { 4, 0, 1, 1, 0 },
        { 4, 1, 1, 1, 0 },
        { 3, 0, 0, 0, 1 },
        { 4, 1, 0, 0, 1 },
        { 3, 0, 1, 0, 1 },
        { 4, 1, 1, 0, 1 },
        { 4, 0, 0, 1, 1 },
        { 4, 1, 0, 1, 1 },
        { 4, 0, 1, 1, 1 },
        { 4, 1, 1, 1, 1 }
    };
    mat_t C(4, vec_t(16));
    vec_t pos(4);
    for ( int i = 0; i < 16; ++i )
    {
        int row = B[i][0] - 1;
        int col = pos[row];
        // B[i].begin() + 1 skips the first element (the row selector)
        int prod = std::inner_product( A[row].begin(), A[row].end(),
                                       B[i].begin() + 1, 0 );
        C[row][col] = prod;
        if ( prod )
            ++pos[row];
    }
    for ( auto & r : C )
    {
        for ( int x : r ) {
            cout << ' ' << x;
        }
        cout << '\n';
    }
    return 0;
}
The output is:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0
2 2 3 2 3 2 3 3 4 0 0 0 0 0 0 0
I don't know if the ordering of the last row is as expected, but it mimics the logic of OP's code.
How can I use the Eigen library to compute the lower triangular part of an input matrix without changing the column order?
For example, for the matrix:
A=[1 2 3;4 5 6 ;7 8 9]
I want the result to be:
1 0 0
4 0 0
7 0 0
Your text and your example don't match. I'll go through the three possible ways I understood your question. First, we'll set up the matrix:
Matrix3d mat;
mat << 1, 2, 3, 4, 5, 6, 7, 8, 9;
If you wanted the actual lower triangular matrix, you would use:
std::cout << Matrix3d(mat.triangularView<Lower>()) << "\n\n";
or similar. The result is:
1 0 0
4 5 0
7 8 9
Note the 5,8,9 which are missing from your example. If you just wanted the left-most column, you would use:
std::cout << mat.col(0) << "\n\n";
which gives
1
4
7
If (as the second part of your example shows) you want mat * [1, 0, 0] then you could either do the matrix multiplication (not recommended) or just construct the result:
Matrix3d z = Matrix3d::Zero();
z.col(0) = mat.col(0);
std::cout << z << "\n\n";
which gives the same result as your example:
1 0 0
4 0 0
7 0 0
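For readers without Eigen at hand, what the first variant computes can be sketched on plain nested vectors. This is only an illustration of the definition (entries above the main diagonal become zero), not how Eigen implements triangularView; the function name lowerTriangular is made up here.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Return a copy of `m` with every entry above the main diagonal set to zero.
Matrix lowerTriangular(const Matrix &m) {
    Matrix result = m;
    for (std::size_t r = 0; r < result.size(); ++r)
        for (std::size_t c = r + 1; c < result[r].size(); ++c)
            result[r][c] = 0.0;
    return result;
}
```

Applied to the 3 x 3 matrix holding 1 to 9, this keeps 1, 4, 5, 7, 8 and 9 and zeroes the rest, so the column order is untouched.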
I am using Microsoft Visual Studio 2008 on Windows 7 x64. I am trying to solve the linear system Ax=b using csparse, where A is positive definite.
    | 1 0 0 1 |
A = | 0 3 1 0 |
    | 0 1 2 1 |
    | 1 0 1 2 |
    | 1 |
b = | 1 |
    | 1 |
    | 1 |
I have used the following code:
int Ncols = 4, Nrows = 4, nnz = 10;
int cols[] = {0, 3, 1, 2, 1, 2, 3, 0, 2, 3};
int rows[] = {0, 0, 1, 1, 2, 2, 2, 3, 3, 3};
double vals[] = {1, 1, 3, 1, 1, 2, 1, 1, 1, 2};
cs *Operator = cs_spalloc(Ncols,Nrows,nnz,1,1);
int j;
for(j = 0; j < nnz; j++)
{
Operator->i[j] = rows[j];
Operator->p[j] = cols[j];
Operator->x[j] = vals[j];
Operator->nz++;
}
for(j = 0; j < nnz; j++)
cout << Operator->i[j] << " " << Operator->p[j] << " " << Operator->x[j] << endl;
Operator = cs_compress(Operator);
for(j = 0; j < nnz; j++)
cout << Operator->i[j] << " " << Operator->p[j] << " " << Operator->x[j] << endl;
// Right hand side
double b[] = {1, 1, 1, 1};
// Solving Ax = b
int status = cs_cholsol(0, Operator, &b[0]); // status = 0 means error.
To make sure that I have created the sparse matrix correctly, I printed the row and column indices as well as the values to the console before and after cs_compress. The following is the result of this print-out.
Before:
0 0 1
0 3 1
1 1 3
1 2 1
2 1 1
2 2 2
2 3 1
3 0 1
3 2 1
3 3 2
After:
0 0 1
3 2 1
1 4 3
2 7 1
1 10 1
2 -6076574517017313795 2
3 -6076574518398440533 1
0 -76843842582893653 1
2 0 1
3 0 2
Because of the garbage values that can be observed above after calling cs_compress, the solution of Ax=b does not match the one that I have calculated with MATLAB. MATLAB gives the following solution.
    | 2.0000 |
x = | 0.0000 |
    | 1.0000 |
    |-1.0000 |
Interestingly, I don't have this problem with the following code, which solves Ax=b where A is a 3×3 identity matrix.
int Ncols = 3, Nrows = 3, nnz = Nrows;
cs *Operator = cs_spalloc(Ncols,Nrows,nnz,1,1);
int j;
for(j = 0; j < nnz; j++) {
Operator->i[j] = j;
Operator->p[j] = j;
Operator->x[j] = 1.0;
Operator->nz++;
}
Operator = cs_compress(Operator);
double b[] = {1, 2, 3};
int status = cs_cholsol(0, Operator, &b[0]); // status = 1 means no error.
Could someone please help me fix the problem that I have with cs_compress?
Having never worked with csparse before, I skimmed the source code.
When you call cs_spalloc() to create Operator, you are creating a triplet matrix (indicated by setting the last parameter to 1). But after the call to cs_compress(), the result is no longer a triplet matrix (you can detect this by checking that Operator->nz is -1 after compression). So it is an error to traverse the matrix as if it were.
You can use the cs_print() API to print your sparse matrix.
As an aside, your code leaks memory, since the compressed matrix is a new allocation, and the original uncompressed matrix was not freed by cs_compress().
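To see why the p array looks so different after compression, here is a minimal sketch of a triplet to compressed-column conversion. It is not CSparse's code; the Compressed struct and the compressTriplets function are hypothetical, and the sketch assumes valid indices and does not sum duplicate entries.

```cpp
#include <cstddef>
#include <vector>

// Compressed-column form: colPtr has nCols + 1 entries, and the row indices
// and values of column c sit at positions colPtr[c] .. colPtr[c + 1] - 1.
struct Compressed {
    std::vector<int> colPtr;
    std::vector<int> rowIdx;
    std::vector<double> values;
};

Compressed compressTriplets(int nCols, const std::vector<int> &rows,
                            const std::vector<int> &cols,
                            const std::vector<double> &vals) {
    const std::size_t nnz = vals.size();
    Compressed out;
    out.colPtr.assign(nCols + 1, 0);
    out.rowIdx.resize(nnz);
    out.values.resize(nnz);

    // Count the entries in each column, then turn counts into start offsets.
    for (std::size_t k = 0; k < nnz; ++k) ++out.colPtr[cols[k] + 1];
    for (int c = 0; c < nCols; ++c) out.colPtr[c + 1] += out.colPtr[c];

    // Scatter each triplet into the next free slot of its column.
    std::vector<int> next(out.colPtr.begin(), out.colPtr.end() - 1);
    for (std::size_t k = 0; k < nnz; ++k) {
        const int dest = next[cols[k]]++;
        out.rowIdx[dest] = rows[k];
        out.values[dest] = vals[k];
    }
    return out;
}
```

Fed with the rows, cols and vals arrays from the question, this yields colPtr = 0 2 4 7 10 and rowIdx = 0 3 1 2 1 2 3 0 2 3, which are exactly the meaningful entries of the "After" print-out. Only Ncols + 1 column pointers are defined in this form, which is consistent with the remaining slots of p printing as garbage.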
array_2D = new ushort * [nx];
// Allocate each member of the "main" array
//
for (ii = 0; ii < nx; ii++)
array_2D[ii] = new ushort[ny];
// Allocate "main" array
array_3D = new ushort ** [numexp];
// Allocate each member of the "main" array
for(kk=0;kk<numexp;kk++)
array_3D[kk]= new ushort * [nx];
for(kk=0;kk<numexp;kk++)
for(ii=0;ii<nx;ii++)
array_3D[kk][ii]= new ushort[ny];
The values of numexp, nx and ny are obtained from the user.
Is this the correct way to dynamically allocate a 3D array? We know that the code works for the 2D array. If this is not correct, can anyone suggest a better method?
I think the simplest way to allocate and deal with a multidimensional array is to use one big 1d array (or better yet a std::vector) and provide an interface to index into it correctly.
This is easiest to think about first in 2 dimensions. Consider a 2D array with "x" and "y" axes:
     x=0 1 2
y=0    a b c
  1    d e f
  2    g h i
We can represent this using a 1-d array, rearranged as follows:
y=     0 0 0 1 1 1 2 2 2
x=     0 1 2 0 1 2 0 1 2
array: a b c d e f g h i
So our 2d array is simply
unsigned int maxX = 0;
unsigned int maxY = 0;
std::cout << "Enter x and y dimensions: ";
std::cin >> maxX >> maxY;
int *array = new int[maxX*maxY];
// write to the location where x = 1, y = 2
int x = 1;
int y = 2;
array[y*maxX/*jump to correct row*/+x/*shift into correct column*/] = 0;
The most important thing is to wrap up the accessing into a neat interface so you only have to figure this out once.
In a similar way we can work with 3-d arrays:
z =    0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
y =    0 0 0 1 1 1 2 2 2 0 0 0 1 1 1 2 2 2
x =    0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2
array: a b c d e f g h i j k l m n o p q r
Once you figure out how to index into the array correctly and put this code in a common place, you don't have to deal with the nastiness of pointers to arrays of pointers to arrays of pointers. You'll only have to do one delete [] at the end.
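A minimal sketch of such an interface, assuming a std::vector as the backing store (so no delete [] is needed at all) and x as the fastest-varying index; the class name Array3D and its members are invented for this example, and no bounds checking is done.

```cpp
#include <cstddef>
#include <vector>

// 3-d array stored in one contiguous 1-d buffer.
class Array3D {
public:
    Array3D(std::size_t maxX, std::size_t maxY, std::size_t maxZ)
        : maxX_(maxX), maxY_(maxY), data_(maxX * maxY * maxZ) {}

    // Position of (x, y, z) in the flat buffer: skip z full planes,
    // then y full rows, then x elements.
    std::size_t index(std::size_t x, std::size_t y, std::size_t z) const {
        return (z * maxY_ + y) * maxX_ + x;
    }

    // Read/write access to one element (no bounds checking).
    unsigned short &at(std::size_t x, std::size_t y, std::size_t z) {
        return data_[index(x, y, z)];
    }

private:
    std::size_t maxX_, maxY_;
    std::vector<unsigned short> data_;
};
```

With 3 x 3 planes, index(1, 2, 0) is 7, the slot that holds h in the 2-d table above. Callers only ever write a.at(x, y, z), so the index arithmetic lives in exactly one place.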
Looks fine to me, as long as an array of arr[numexp][nx][ny] is what you wanted.
A little tip: you can put the allocation of the third dimension into the loop of the second dimension, i.e. allocate each third-dimension array while its parent subarray is being allocated:
ushort*** array_3D = new ushort**[nx];
for(int i=0; i<nx; ++i){
    array_3D[i] = new ushort*[ny];
    for(int j=0; j<ny; ++j)
        array_3D[i][j] = new ushort[nz];
}
And of course, the general hint: Do that with std::vectors to not have to deal with that nasty (de)allocation stuff. :)
#include <vector>
int main(){
    using namespace std;
    typedef unsigned short ushort;
    typedef vector<ushort> usvec;
    // Example sizes only; in the real program these come from the user.
    const int numexp = 2, nx = 3, ny = 4;
    // Sizes, outermost first: numexp (dimension 1), nx (dimension 2), ny (dimension 3).
    vector<vector<usvec> > my3DVector(numexp, vector<usvec>(nx, vector<ushort>(ny)));
}