I have a piece of code in which the number of nested for loops is determined by n, which is known at compile time. Each for loop iterates over the values 0 and 1. Currently, my code looks something like this:
for(int in=0;in<2;in++){
for(int in_1=0;in_1<2;in_1++){
for(int in_2=0;in_2<2;in_2++){
// ... n times
for(int i2=0;i2<2;i2++){
for(int i1=0;i1<2;i1++){
d[in][in_1][in_2]...[i2][i1] =updown(in)+updown(in_1)+...+updown(i1);
}
}
// ...
}
}
}
Now my question is whether one can write it in a more compact form.
The n bits in_k can be interpreted as the binary representation of one integer less than 2^n.
This makes it easy to work with a 1-D array (vector) d[.].
In practice, an integer j corresponds to
j = in[0] + 2*in[1] + ... + 2^(n-1)*in[n-1]
A direct implementation is O(N log N), where N = 2^n.
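For comparison, the direct approach could be sketched as follows: each flat index j is decoded bit by bit, which costs n = log2(N) calls per entry. The name fill_direct is made up for this illustration, and up_down is only a stand-in for the question's updown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the question's updown(); assumed to take a single bit (0 or 1).
int up_down(int b) { return b ? 1 : 0; }

// Hypothetical helper: fills d[j] for every j < 2^n by summing up_down over
// the n bits of j. Bit k of j plays the role of the loop variable in[k].
std::vector<int> fill_direct(int n) {
    assert(n >= 0 && n < 31);
    const std::size_t size = std::size_t{1} << n;
    std::vector<int> d(size, 0);
    for (std::size_t j = 0; j < size; ++j) {
        for (int k = 0; k < n; ++k) {
            d[j] += up_down(static_cast<int>((j >> k) & 1u));
        }
    }
    return d;
}
```

Calling fill_direct(4) should give the same 16 values as the recursive program further down.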
A recursive solution is possible, for example using
f(val, n) = updown(val%2) + f(val/2, n-1) and f(val, 0) = 0.
With memoization (not implemented here), this would have O(N) complexity.
Result:
0 : 0
1 : 1
2 : 1
3 : 2
4 : 1
5 : 2
6 : 2
7 : 3
8 : 1
9 : 2
10 : 2
11 : 3
12 : 2
13 : 3
14 : 3
15 : 4
#include <iostream>
#include <vector>
int up_down (int b) {
if (b) return 1;
return 0;
}
int f(int val, int n) {
if (n == 0) return 0; // base case f(val, 0) = 0
return up_down (val%2) + f(val/2, n-1);
}
int main() {
const int n = 4;
int size = 1;
for (int i = 0; i < n; ++i) size *= 2;
std::vector<int> d(size, 0);
for (int i = 0; i < size; ++i) {
d[i] = f(i, n);
}
for (int i = 0; i < size; ++i) {
std::cout << i << " : " << d[i] << '\n';
}
return 0;
}
As mentioned above, the recursive approach reaches O(N) complexity only if memoization is implemented.
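A memoized variant might look like the sketch below. It is only one way to do it: the cache layout (one table per recursion depth) and the names f_memo and fill_memoized are assumptions of this example, not part of the program above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the question's updown(); assumed to take a single bit (0 or 1).
int up_down(int b) { return b ? 1 : 0; }

// memo[k][v] caches f(v, k) for v < 2^k. The value -1 marks "not computed
// yet", which is safe here because the sums are never negative.
int f_memo(int val, int n, std::vector<std::vector<int>>& memo) {
    if (n == 0) return 0;
    int& slot = memo[n][val];
    if (slot < 0) {
        slot = up_down(val % 2) + f_memo(val / 2, n - 1, memo);
    }
    return slot;
}

// Hypothetical driver: builds the tables and fills d[0 .. 2^n - 1].
std::vector<int> fill_memoized(int n) {
    assert(n >= 0 && n < 31);
    std::vector<std::vector<int>> memo(n + 1);
    for (int k = 0; k <= n; ++k) {
        memo[k].assign(std::size_t{1} << k, -1);
    }
    std::vector<int> d(std::size_t{1} << n);
    for (std::size_t i = 0; i < d.size(); ++i) {
        d[i] = f_memo(static_cast<int>(i), n, memo);
    }
    return d;
}
```

The tables hold 2^n + 2^(n-1) + ... + 1 < 2N entries in total and each entry is computed once, which is where the O(N) bound comes from.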
Another possibility is a simple iterative approach, which also achieves O(N) complexity
(here N represents the total number of data elements).
#include <iostream>
#include <vector>
int up_down (int b) {
if (b) return 1;
return 0;
}
int main() {
const int n = 4;
int size = 1;
for (int i = 0; i < n; ++i) size *= 2;
std::vector<int> d(size, 0);
int size_block = 1;
for (int i = 0; i < n; ++i) {
for (int j = size_block-1; j >= 0; --j) {
d[2*j+1] = d[j] + up_down(1);
d[2*j] = d[j] + up_down(0);
}
size_block *= 2;
}
for (int i = 0; i < size; ++i) {
std::cout << i << " : " << d[i] << '\n';
}
return 0;
}
You can refactor your code slightly like this:
for(int in=0;in<2;in++) {
auto& dn = d[in];
auto updown_n = updown(in);
for(int in_1=0;in_1<2;in_1++) {
// dn_1 == d[in][in_1]
auto& dn_1 = dn[in_1];
// updown_n_1 == updown(in)+updown(in_1)
auto updown_n_1 = updown_n + updown(in_1);
for(int in_2=0;in_2<2;in_2++) {
// dn_2 == d[in][in_1][in_2]
auto& dn_2 = dn_1[in_2];
// updown_n_2 == updown(in)+updown(in_1)+updown(in_2)
auto updown_n_2 = updown_n_1 + updown(in_2);
.
.
.
for(int i2=0;i2<2;i2++) {
// d2 == d[in][in_1][in_2]...[i2]
auto& d2 = d3[i2];
// updown_2 = updown(in)+updown(in_1)+updown(in_2)+...+updown(i2)
auto updown_2 = updown_3 + updown(i2);
for(int i1=0;i1<2;i1++) {
// d1 == d[in][in_1][in_2]...[i2][i1]
auto& d1 = d2[i1];
// updown_1 = updown(in)+updown(in_1)+updown(in_2)+...+updown(i2)+updown(i1)
auto updown_1 = updown_2 + updown(i1);
// d[in][in_1][in_2]...[i2][i1] = updown(in)+updown(in_1)+...+updown(i1);
d1 = updown_1;
}
}
}
}
}
And make this into a recursive function now:
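// Requires <cstddef> and <type_traits>.
// The two-argument overloads are declared first so that the calls below can find them.
// Innermost level (N == 0): d is a single element, store the accumulated sum.
template<std::size_t N, typename T, typename U>
typename std::enable_if<N == 0>::type loop(T& d, U updown_result) {
d = updown_result;
}
// Intermediate levels (N != 0): peel off one dimension and accumulate.
template<std::size_t N, typename T, typename U>
typename std::enable_if<N != 0>::type loop(T& d, U updown_result) {
for (int i = 0; i < 2; ++i) {
loop<N-1>(d[i], updown_result + updown(i));
}
}
// Entry point: N is the number of dimensions of d.
template<std::size_t N, typename T>
void loop(T& d) {
for (int i = 0; i < 2; ++i) {
loop<N-1>(d[i], updown(i));
}
}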
If your type is int d[2][2][2]...[2][2]; or int*****... d;, you can also stop when the type isn't an array or pointer instead of manually specifying N (or change the check to whatever the type of d[0][0][0]...[0][0] is).
Here's a version that does that with a recursive lambda (it needs C++17 for if constexpr):
auto loop = [](auto& self, auto& d, auto updown_result) -> void {
using d_t = typename std::remove_cv<typename std::remove_reference<decltype(d)>::type>::type;
if constexpr (!std::is_array<d_t>::value && !std::is_pointer<d_t>::value) {
// Last level of nesting
d = updown_result;
} else {
for (int i = 0; i < 2; ++i) {
self(self, d[i], updown_result + updown(i));
}
}
};
for (int i = 0; i < 2; ++i) {
loop(loop, d[i], updown(i));
}
I am assuming that it is a multi-dimensional matrix. You may have to solve it mathematically first and then write the respective equations in the program.
I wrote a sparse matrix class based on block compressed row storage (BCRS). I have written almost all the methods, but I have no idea how to write the method findValue(i,j) that takes the two indices of the original matrix. The storage consists of four vectors:
ba_ stores the non-zero blocks (rectangular blocks in which at least one element is different from zero) of the matrix in top-down, left-to-right order
an_ is the vector of indices that point to the first element of each block in the vector ba_
aj_ stores the block-column index of each block in the blocked matrix
ai_ stores the index of the first block of each row in the blocked matrix
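To make the four vectors concrete, here is a small standalone sketch that builds them from a dense matrix. The struct Bcrs and the function build_bcrs are hypothetical names used only for this illustration. They follow the 1-based convention for an, aj and ai that the class below uses, but they are not the class itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Bcrs {                     // hypothetical plain struct for illustration
    std::vector<int> ba;          // stored blocks, one after another, row-major inside a block
    std::vector<std::size_t> an;  // 1-based position in ba of each block's first element
    std::vector<std::size_t> aj;  // 1-based block-column of each stored block
    std::vector<std::size_t> ai;  // 1-based index of the first stored block of each block-row
};

Bcrs build_bcrs(const std::vector<std::vector<int>>& dense,
                std::size_t br, std::size_t bc) {
    assert(!dense.empty() && dense.size() % br == 0 && dense[0].size() % bc == 0);
    Bcrs m;
    m.ai.push_back(1);
    for (std::size_t i = 0; i < dense.size() / br; ++i) {
        std::size_t blocksInRow = 0;
        for (std::size_t j = 0; j < dense[0].size() / bc; ++j) {
            // A block is stored only if at least one of its elements is non-zero.
            bool nonzero = false;
            for (std::size_t r = 0; r < br; ++r)
                for (std::size_t c = 0; c < bc; ++c)
                    if (dense[i * br + r][j * bc + c] != 0) nonzero = true;
            if (!nonzero) continue;
            m.an.push_back(m.ba.size() + 1);
            m.aj.push_back(j + 1);
            for (std::size_t r = 0; r < br; ++r)
                for (std::size_t c = 0; c < bc; ++c)
                    m.ba.push_back(dense[i * br + r][j * bc + c]);
            ++blocksInRow;
        }
        m.ai.push_back(m.ai.back() + blocksInRow);
    }
    return m;
}
```

For the 4 x 4 matrix {{1,2,0,0},{0,3,0,0},{0,0,0,0},{0,0,4,0}} with 2 x 2 blocks, only the top-left and bottom-right blocks are stored, so ba holds 8 values, an is {1, 5}, aj is {1, 2} and ai is {1, 2, 3}.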
The picture should make this clear:
Here is the class. I currently use two methods to get the result, findBlockIndex and findValue(i,j,Brows,Bcols), but I need to get a value from the original i,j indices using findValue(i,j), where i,j are the indices in the full (dense) matrix.
# include <iostream>
# include <vector>
# include <string>
# include <initializer_list>
# include "MatrixException.H"
# include <sstream>
# include <fstream>
# include <algorithm>
# include <iomanip>
// forward declarations
template <typename T, std::size_t R, std::size_t C>
class BCRSmatrix ;
template <typename T, std::size_t R, std::size_t C>
std::ostream& operator<<(std::ostream& os , const BCRSmatrix<T,R,C>& m );
template <typename T, std::size_t Br, std::size_t Bc >
std::vector<T> operator*(const BCRSmatrix<T,Br,Bc>& m, const std::vector<T>& x );
template <typename data_type, std::size_t BR , std::size_t BC>
class BCRSmatrix {
template <typename T, std::size_t R, std::size_t C>
friend std::ostream& operator<<(std::ostream& os , const BCRSmatrix<T,R,C>& m );
template <typename T, std::size_t Br,std::size_t Bc>
friend std::vector<T> operator*(const BCRSmatrix<T,Br,Bc>& m, const std::vector<T>& x );
public:
constexpr BCRSmatrix(std::initializer_list<std::vector<data_type>> dense );
constexpr BCRSmatrix(const std::string& );
virtual ~BCRSmatrix() = default ;
auto constexpr print_block(const std::vector<std::vector<data_type>>& dense,
std::size_t i, std::size_t j) const noexcept ;
auto constexpr validate_block(const std::vector<std::vector<data_type>>& dense,
std::size_t i, std::size_t j) const noexcept ;
auto constexpr insert_block(const std::vector<std::vector<data_type>>& dense,
std::size_t i, std::size_t j) noexcept ;
auto constexpr printBCRS() const noexcept ;
auto constexpr printBlockMatrix() const noexcept ;
auto constexpr size1() const noexcept { return denseRows ;}
auto constexpr size2() const noexcept { return denseCols ;}
auto constexpr printBlock(std::size_t i) const noexcept ;
auto constexpr print() const noexcept ;
private:
std::size_t bn ;
std::size_t bBR ;
std::size_t nnz ;
std::size_t denseRows ;
std::size_t denseCols ;
std::vector<data_type> ba_ ;
std::vector<std::size_t> an_ ;
std::vector<std::size_t> ai_ ;
std::vector<std::size_t> aj_ ;
std::size_t index =0 ;
auto constexpr findBlockIndex(const std::size_t r, const std::size_t c) const noexcept ;
auto constexpr recomposeMatrix() const noexcept ;
auto constexpr findValue(
const std::size_t i, const std::size_t j,
const std::size_t rBlock, const std::size_t cBlock
) const noexcept ;
};
//--------------------------- IMPLEMENTATION
template <typename T, std::size_t BR, std::size_t BC>
constexpr BCRSmatrix<T,BR,BC>::BCRSmatrix(std::initializer_list<std::vector<T>> dense_ )
{
this->denseRows = dense_.size();
auto it = *(dense_.begin());
this->denseCols = it.size();
if( denseRows % BR != 0 || denseCols % BC != 0 )
{
throw InvalidSizeException("Error: dense matrix size is not a multiple of the block size");
}
std::vector<std::vector<T>> dense(dense_);
bBR = BR*BC ;
bn = denseRows*denseCols/(BR*BC) ;
ai_.resize(denseRows/BR +1);
ai_[0] = 1;
for(std::size_t i = 0; i < dense.size() / BR ; i++)
{
auto rowCount =0;
for(std::size_t j = 0; j < dense[i].size() / BC ; j++)
{
if(validate_block(dense,i,j))
{
aj_.push_back(j+1);
insert_block(dense, i, j);
rowCount ++ ;
}
}
ai_[i+1] = ai_[i] + rowCount ;
}
printBCRS();
}
template <typename T, std::size_t BR, std::size_t BC>
constexpr BCRSmatrix<T,BR,BC>::BCRSmatrix(const std::string& fname)
{
std::ifstream f(fname , std::ios::in);
if(!f)
{
throw OpeningFileException("error opening file in constructor !");
}
else
{
std::vector<std::vector<T>> dense;
std::string line, tmp;
T elem = 0 ;
std::vector<T> row;
std::size_t i=0, j=0 ;
while(getline(f, line))
{
row.clear();
std::istringstream ss(line);
if(i==0)
{
while(ss >> elem)
{
row.push_back(elem);
j++;
}
}
else
{
while(ss >> elem)
row.push_back(elem);
}
dense.push_back(row);
i++;
}
this->denseRows = i;
this->denseCols = j;
bBR = BR*BC ;
bn = denseRows*denseCols/(BR*BC) ;
ai_.resize(denseRows/BR +1);
ai_[0] = 1;
for(std::size_t i = 0; i < dense.size() / BR ; i++)
{
auto rowCount =0;
for(std::size_t j = 0; j < dense[i].size() / BC ; j++)
{
if(validate_block(dense,i,j))
{
aj_.push_back(j+1);
insert_block(dense, i, j);
rowCount ++ ;
}
}
ai_[i+1] = ai_[i] + rowCount ;
}
}
printBCRS();
}
template <typename T,std::size_t BR, std::size_t BC>
inline auto constexpr BCRSmatrix<T,BR,BC>::printBlockMatrix() const noexcept
{
for(auto i=0 ; i < denseRows / BR ; i++)
{
for(auto j=1 ; j <= denseCols / BC ; j++)
{
std::cout << findBlockIndex(i,j) << ' ' ;
}
std::cout << std::endl;
}
}
template <typename T,std::size_t BR,std::size_t BC>
inline auto constexpr BCRSmatrix<T,BR,BC>::printBlock(std::size_t i) const noexcept
{
auto w = i-1 ;
auto k = 0;
for(std::size_t i = 0 ; i < BR ; ++i)
{
for(std::size_t j=0 ; j < BC ; ++j )
{
std::cout << std::setw(8) << ba_.at(an_.at(w)-1+k) << ' ';
k++;
}
}
}
template <typename T,std::size_t BR, std::size_t BC>
inline auto constexpr BCRSmatrix<T,BR,BC>::print_block(const std::vector<std::vector<T>>& dense,
std::size_t i, std::size_t j) const noexcept
{
for(std::size_t m = i * BR ; m < BR * (i + 1); ++m)
{
for(std::size_t n = j * BC ; n < BC * (j + 1); ++n)
std::cout << dense[m][n] << ' ';
std::cout << '\n';
}
}
template <typename T,std::size_t BR, std::size_t BC>
inline auto constexpr BCRSmatrix<T,BR,BC>::validate_block(const std::vector<std::vector<T>>& dense,
std::size_t i, std::size_t j) const noexcept
{
bool nonzero = false ;
for(std::size_t m = i * BR ; m < BR * (i + 1); ++m)
{
for(std::size_t n = j * BC ; n < BC * (j + 1); ++n)
{
if(dense[m][n] != 0) nonzero = true;
}
}
return nonzero ;
}
template <typename T,std::size_t BR, std::size_t BC>
inline auto constexpr BCRSmatrix<T,BR,BC>::insert_block(const std::vector<std::vector<T>>& dense,
std::size_t i, std::size_t j) noexcept
{
bool firstElem = true ;
for(std::size_t m = i * BR ; m < BR * (i + 1); ++m)
{
for(std::size_t n = j * BC ; n < BC * (j + 1); ++n)
{
if(firstElem)
{
an_.push_back(index+1);
firstElem = false ;
}
ba_.push_back(dense[m][n]);
index ++ ;
}
}
}
template <typename T, std::size_t BR,std::size_t BC>
auto constexpr BCRSmatrix<T,BR,BC>::findBlockIndex(const std::size_t r, const std::size_t c) const noexcept
{
for(auto j= ai_.at(r) ; j < ai_.at(r+1) ; j++ )
{
if( aj_.at(j-1) == c )
{
return j ;
}
}
return std::size_t{0} ; // 0 means the block (r,c) is not stored
}
template <typename T, std::size_t BR, std::size_t BC>
auto constexpr BCRSmatrix<T,BR,BC>::printBCRS() const noexcept
{
std::cout << "ba_ : " ;
for(auto &x : ba_ )
std::cout << x << ' ' ;
std::cout << std::endl;
std::cout << "an_ : " ;
for(auto &x : an_ )
std::cout << x << ' ' ;
std::cout << std::endl;
std::cout << "aj_ : " ;
for(auto &x : aj_ )
std::cout << x << ' ' ;
std::cout << std::endl;
std::cout << "ai_ : " ;
for(auto &x : ai_ )
std::cout << x << ' ' ;
std::cout << std::endl;
}
template <typename T, std::size_t BR, std::size_t BC>
auto constexpr BCRSmatrix<T,BR,BC>::print() const noexcept
{
//for each BCRS row
for(auto i=0 ; i < denseRows / BR ; i++){
//for each Block sub row.
for(auto rBlock = 0; rBlock < BR; rBlock++){
//for each BCSR col.
for(auto j = 1; j <= denseCols / BC; j++){
//for each Block sub col.
for(auto cBlock = 0; cBlock < BC; cBlock++){
std::cout<< findValue(i, j, rBlock, cBlock) <<'\t';
}
}
std::cout << std::endl;
}
}
}
template <typename T, std::size_t BR,std::size_t BC>
auto constexpr BCRSmatrix<T,BR,BC>::recomposeMatrix() const noexcept
{
std::vector<std::vector<T>> sparseMat(denseRows, std::vector<T>(denseCols, 0));
auto BA_i = 0, AJ_i = 0;
//for each BCSR row
for(auto r = 0; r < denseRows/BR; r++){
//for each Block in row
for(auto nBlock = 0; nBlock < ai_.at(r+1)-ai_.at(r); nBlock++){
//for each subMatrix (Block)
for(auto rBlock = 0; rBlock < BR; rBlock++){
for(auto cBlock = 0; cBlock < BC; cBlock++){
//insert value
sparseMat.at(rBlock + r*BR).at(cBlock + (aj_.at(AJ_i)-1)*BC) = ba_.at(BA_i);
++BA_i;
}
}
++AJ_i;
}
}
return sparseMat;
}
template <typename T, std::size_t BR,std::size_t BC>
auto constexpr BCRSmatrix<T,BR,BC>::findValue(
const std::size_t i, const std::size_t j,
const std::size_t rBlock, const std::size_t cBlock
) const noexcept
{
auto index = findBlockIndex(i,j);
if(index != 0)
return ba_.at(an_.at(index-1)-1 + cBlock + rBlock*BC);
else
return T(0);
}
template <typename T, std::size_t BR,std::size_t BC>
std::ostream& operator<<(std::ostream& os , const BCRSmatrix<T,BR,BC>& m )
{
for(auto i=0 ; i < m.denseRows / BR ; i++)
{
//for each Block sub row.
for(auto rBlock = 0; rBlock < BR; rBlock++)
{
//for each BCSR col.
for(auto j = 1; j <= m.denseCols / BC; j++)
{
//for each Block sub col.
for(auto cBlock = 0; cBlock < BC; cBlock++)
{
os << m.findValue(i, j, rBlock, cBlock) <<'\t';
}
}
os << std::endl;
}
}
return os;
}
template <typename T, std::size_t BR, std::size_t BC>
std::vector<T> operator*(const BCRSmatrix<T,BR,BC>& m, const std::vector<T>& x )
{
std::vector<T> y(x.size());
if(m.size1() != x.size())
{
std::string to = "x" ;
std::string mess = "Error occurred in operator*: attempt to perform product between op1: "
+ std::to_string(m.size1()) + to + std::to_string(m.size2()) +
" and op2: " + std::to_string(x.size());
throw InvalidSizeException(mess.c_str());
}
else
{
auto brows = m.denseRows/BR ;
auto bnze = m.an_.size() ;
auto z=0;
for(auto b=0 ; b < brows ; b++)
{
for(auto j= m.ai_.at(b) ; j <= m.ai_.at(b+1)-1; j++ )
{
for(auto k=0 ; k < BR ; k++ )
{
for(auto t=0 ; t < BC ; t++)
{
y.at(BR*b+k) += m.ba_.at(z) * x.at(BC*(m.aj_.at(j-1)-1)+t) ; // the row offset uses the block height BR
z++ ;
}
}
}
}
}
return y;
}
and this is the main
# include "BCSmatrix.H"
using namespace std;
int main(){
BCRSmatrix<int,2,2> bbcsr1 = {{11,12,13,14,0,0},{0,22,23,0,0,0},{0,0,33,34,35,36},{0,0,0,44,45,0},
{0,0,0,0,0,56},{0,0,0,0,0,66}};
BCRSmatrix<int,2,2> bbcsr2 = {{11,12,0,0,0,0,0,0} ,{0,22,0,0,0,0,0,0} ,{31,32,33,0,0,0,0,0},
{41,42,43,44,0,0,0,0}, {0,0,0,0,55,56,0,0},{0,0,0,0,0,66,67,0},{0,0,0,0,0,0,77,78},{0,0,0,0,0,0,87,88}};
BCRSmatrix<int,2,4> bbcsr3 = {{11,12,0,0,0,0,0,0} ,{0,22,0,0,0,0,0,0} ,{31,32,33,0,0,0,0,0},
{41,42,43,44,0,0,0,0}, {0,0,0,0,55,56,0,0},{0,0,0,0,0,66,67,0},{0,0,0,0,0,0,77,78},{0,0,0,0,0,0,87,88}};
bbcsr3.printBlockMatrix();
bbcsr3.print();
BCRSmatrix<int,2,2> bbcsr4("input17.dat");
bbcsr4.printBlockMatrix();
BCRSmatrix<int,2,4> bbcsr5("input18.dat");
bbcsr5.printBlockMatrix();
cout << bbcsr5 ;
BCRSmatrix<int,4,4> bbcsr6("input18.dat");
bbcsr6.printBlockMatrix();
bbcsr6.print();
cout << bbcsr4 ; //.print();
BCRSmatrix<int,2,4> bbcsr7("input20.dat");
cout << bbcsr7;
bbcsr7.printBlockMatrix();
std::vector<int> v1 = {3,4,0,1,6,8,1,19};
std::vector<int> v01 = {3,4,0,1,6,8,1,19,15,2};
std::vector<int> v2 = bbcsr4 *v1 ;
for(auto& x : v2)
cout << x << ' ' ;
cout << endl;
BCRSmatrix<double,2,2> bbcsr8("input21.dat");
bbcsr8.print() ;
bbcsr8.printBlockMatrix();
return 0;
}
how to write the method findValue(i,j) that takes the 2 indices of the original matrix
It is similar to the previous findValue method:
template <typename T, std::size_t BR,std::size_t BC>
auto constexpr BCRSmatrix<T,BR,BC>::myNewfindValue(const std::size_t i, const std::size_t j) const noexcept{
auto index = findBlockIndex(i/BR, j/BC);
if(index != 0)
return ba_.at(an_.at(index-1)-1 + j%BC + (i%BR)*BC);
else
return T(0);
}
To call this function you have to make a small change to your findBlockIndex: replace if( aj_.at(j-1) == c ) with if( aj_.at(j-1) == c+1 ). Then you have to modify the for statements in the other functions from for(auto j = 1; j <= .. to for(auto j = 0; j < ...
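The index arithmetic can be checked in isolation with a sketch like the following. It is a hypothetical free function (find_value is not a member of the class), it takes the four vectors as parameters, and it assumes the 1-based values in an, aj and ai described in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical free-function version of the lookup. i and j are 0-based
// indices into the dense matrix; an, aj and ai hold 1-based values.
int find_value(const std::vector<int>& ba,
               const std::vector<std::size_t>& an,
               const std::vector<std::size_t>& aj,
               const std::vector<std::size_t>& ai,
               std::size_t BR, std::size_t BC,
               std::size_t i, std::size_t j) {
    assert(BR > 0 && BC > 0);
    const std::size_t blockRow = i / BR;  // block-row that holds dense row i
    const std::size_t blockCol = j / BC;  // block-column that holds dense column j
    assert(blockRow + 1 < ai.size());
    // Scan the blocks stored for this block-row: 1-based range [ai[r], ai[r+1]).
    for (std::size_t k = ai[blockRow]; k < ai[blockRow + 1]; ++k) {
        if (aj[k - 1] == blockCol + 1) {
            const std::size_t start = an[k - 1] - 1;  // 0-based offset of the block in ba
            return ba[start + (i % BR) * BC + (j % BC)];
        }
    }
    return 0;  // the block is not stored, so the element is zero
}
```

With ba = {1,2,0,3,0,0,4,0}, an = {1,5}, aj = {1,2}, ai = {1,2,3} and 2 x 2 blocks, the element at (3,2) lies in the second stored block at local position (1,0), so the lookup reads ba[6].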
Let me know if there are problems or this is not the answer you were looking for.
I hope to be of help to you,
best regards Marco.
Rename the original findValue to findVal, then define a new findValue that takes exactly 2 arguments, defined as follows (I know it is horrible):
template <typename T, std::size_t BS>
T constexpr SqBCSmatrix<T,BS>::findValue(const std::size_t r, const std::size_t c) const noexcept
{
//for each BCRS row
for(auto i=0 ,k=0; i < denseRows / BS ; i++){
//for each Block sub row.
for(auto rBlock = 0; rBlock < BS; k++ ,rBlock++){
//for each BCSR col.
for(auto j = 1 , l=0; j <= denseCols / BS; j++){
//for each Block sub col.
for(auto cBlock = 0; cBlock < BS; l++ , cBlock++){
if(k == r && c == l )
return findVal(i,j,rBlock, cBlock);
}
}
}
}
return 0;
}
I'm implementing a matrix class in modified compressed sparse column format, but I have no idea how to perform the product. This format stores all the non-zero elements in 2 vectors (values and indices), constructed in this way:
aa_ is the vector of values: its first matrix-dimension elements store the diagonal values, followed by all the non-zero off-diagonal values
ja_ stores in its first matrix-dimension elements the positions of the non-zero off-diagonal elements, in this way: ja_[0] = matrix dimension + 1, then ja_[i+1] - ja_[i] = number of non-zero off-diagonal elements of column i+1, and ja_[ja_[i]] = row index of the non-zero element.
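As a rough illustration of this layout, the sketch below builds the two vectors from a small dense matrix and multiplies by a vector, column by column. The names Mcsc, build_mcsc and multiply are made up for this example. It follows the 1-based convention of the constructor shown further down (ja[0] = dim + 2, one unused slot at aa[dim]), which is an assumption taken from that code rather than from the format description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Mcsc {                     // hypothetical plain struct for illustration
    std::vector<int> aa;          // [0, dim): diagonal, [dim]: unused, then off-diagonals by column
    std::vector<std::size_t> ja;  // [0, dim]: 1-based column pointers, then 1-based row indices
    std::size_t dim = 0;
};

Mcsc build_mcsc(const std::vector<std::vector<int>>& dense) {
    Mcsc m;
    m.dim = dense.size();
    for (const auto& row : dense) assert(row.size() == m.dim);  // square only
    m.aa.assign(m.dim + 1, 0);
    m.ja.assign(m.dim + 1, 0);
    m.ja[0] = m.dim + 2;  // 1-based position of the first off-diagonal entry
    for (std::size_t c = 0; c < m.dim; ++c) {
        std::size_t count = 0;
        for (std::size_t r = 0; r < m.dim; ++r) {
            if (r == c) {
                m.aa[c] = dense[r][c];
            } else if (dense[r][c] != 0) {
                m.aa.push_back(dense[r][c]);
                m.ja.push_back(r + 1);
                ++count;
            }
        }
        m.ja[c + 1] = m.ja[c] + count;
    }
    return m;
}

// y = A * x: diagonal first, then each column c scatters x[c] into the
// rows of its off-diagonal entries.
std::vector<int> multiply(const Mcsc& m, const std::vector<int>& x) {
    assert(x.size() == m.dim);
    std::vector<int> y(m.dim);
    for (std::size_t i = 0; i < m.dim; ++i) y[i] = m.aa[i] * x[i];
    for (std::size_t c = 0; c < m.dim; ++c)
        for (std::size_t k = m.ja[c] - 1; k < m.ja[c + 1] - 1; ++k)
            y[m.ja[k] - 1] += m.aa[k] * x[c];
    return y;
}
```

For {{1,0,2},{0,3,0},{4,5,6}} this gives aa = {1,3,6,0,4,5,2} and ja = {5,6,7,8,3,3,1}, and multiplying by {1,2,3} yields {7,6,32}.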
If you want to read more about this format, you can look here: Modified compressed format
I've implemented a class, but I would like to figure out how to perform the matrix product. I hope somebody can help me with this.
# include <iostream>
# include <initializer_list>
# include <iomanip>
# include <cassert>
# include <cmath>
# include <vector>
template <typename data_type> class MCSCmatrix ;
template <typename T>
std::vector<T> operator*(const MCSCmatrix<T>& A ,const std::vector<T>& x)noexcept ;
template <typename data_type>
class MCSCmatrix {
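public:
constexpr MCSCmatrix(std::initializer_list<std::vector<data_type>> row); // declaration of the constructor defined below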
template <typename T>
friend std::vector<T> operator*(const MCSCmatrix<T>& A ,const std::vector<T>& x) noexcept ;
auto constexpr printMCSC() const noexcept ;
private:
std::vector<data_type> aa_ ; // diagonal values followed by the non-zero off-diagonal values
std::vector<std::size_t> ja_ ;
std::size_t dim ;
};
template <typename T>
inline constexpr MCSCmatrix<T>::MCSCmatrix( std::initializer_list<std::vector<T>> row)
{
this->dim = row.size();
auto il = *(row.begin());
if(this-> dim != il.size())
{
throw InvalidSizeException("Matrix Must be square in Modified CSC format ");
}
std::vector<std::vector<T>> temp(row);
aa_.resize(dim+1);
ja_.resize(dim+1);
//std::size_t elemCount = 0;
ja_[0] = dim+2 ;
auto elemCount = 0;
for(auto c = 0 ; c < temp[0].size() ; c++ )
{
elemCount =0 ;
for(auto r = 0 ; r < temp.size() ; r++)
{
if(c==r)
{
aa_[c] = temp[r][c] ;
}
else if(c != r && temp[r][c] !=0)
{
aa_.push_back(temp[r][c]);
ja_.push_back(r+1);
elemCount++ ;
}
}
ja_[c+1] = ja_[c] + elemCount ;
}
printMCSC();
}
template <typename T>
inline auto constexpr MCSCmatrix<T>::printMCSC() const noexcept
{
std::cout << "aa: " ;
for(auto& x : aa_ )
std::cout << x << ' ' ;
std::cout << std::endl;
std::cout << "ja: " ;
for(auto& x : ja_ )
std::cout << x << ' ' ;
std::cout << std::endl;
}
template <typename T>
std::vector<T> operator*(const MCSCmatrix<T>& A ,const std::vector<T>& x) noexcept
{
assert(A.dim == x.size());
std::vector<T> b(x.size());
for(auto i=0 ; i < A.dim ; i++ )
b.at(i) = A.aa_.at(i) * x.at(i) ; // diagonal value
for(auto i=0; i< A.dim ; i++)
{
for(auto k=A.ja_.at(i)-1 ; k < A.ja_.at(i+1)-1 ; k++ )
{
b.at(A.ja_.at(k)-1) += A.aa_.at(k)* x.at(i);
}
}
return b;
}
and here the main function:
# include "ModCSCmatrix.H"
using namespace std;
int main(){
MCSCmatrix<int> m1 = {{11,12,13,14,0,0},{0,22,23,0,0,0},{0,0,33,34,35,36},{0,0,0,44,45,0},
{0,0,0,0,0,56},{0,0,0,0,0,66}};
m1.printMCSC();
MCSCmatrix<double> m100 = {{1.01, 0 , 2.34,0}, {0, 4.07, 0,0},{3.12,0,6.08,0},{1.06,0,2.2,9.9} };
std::vector<double> v1={0,1.3,4.2,0.8};
std::vector<double> v2 = m100*v1 ;
for(auto& x : v2)
cout << x << ' ' ;
cout << endl;
return 0;
}
I'm implementing a modified compressed sparse row matrix [reference],
but I have a problem with matrix * vector multiplication. I wrote the function, but I can't find the bug!
The class uses 2 containers (std::vector) to store:
the diagonal elements (aa_[0] to aa_[dim])
the non-zero off-diagonal values (aa_[dim+2] to aa_[size_of_non_zero])
pointers to the first element of each row (ja_[0] to ja_[dim])
for these pointers the following rule is used: ja_[0]=dim+1 ; ja_[i+1]-ja_[i] = number of elements in the i-th row
the column indices, stored at ja_[ja_[row]]; since the row pointers described above occupy ja_[0] to ja_[dim+1], the column indices are in ja_[dim+2] to ja_[size_of_non_zero_elements]
Here is the minimal code:
# include <initializer_list>
# include <vector>
# include <iostream>
# include <string>
# include <cstdlib>
# include <cassert>
# include <iomanip>
# include <cmath>
# include <set>
# include <fstream>
# include <stdexcept>
template <typename data_type>
class MCSRmatrix {
public:
using itype = std::size_t ;
template <typename T>
friend std::vector<T> operator*(const MCSRmatrix<T>& A, const std::vector<T>& x ) noexcept ;
public:
constexpr MCSRmatrix( std::initializer_list<std::initializer_list<data_type>> rows);
private:
std::vector<data_type> aa_ ; // vector of value
std::vector<itype> ja_ ; // pointer vector
int dim ;
};
//constructor
template <typename T>
constexpr MCSRmatrix<T>::MCSRmatrix( std::initializer_list<std::initializer_list<T>> rows)
{
this->dim = rows.size();
auto _rows = *(rows.begin());
aa_.resize(dim+1);
ja_.resize(dim+1);
if(dim != _rows.size())
{
throw std::runtime_error("error matrix must be square");
}
itype w = 0 ;
ja_.at(w) = dim+2 ;
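// i and j are 1-based row/column counters; they are declared separately because
// a single auto declaration cannot deduce an iterator and an integer together
itype i = 1 ;
for(auto ii = rows.begin() ; ii != rows.end() ; ++ii, i++)
{
itype j = 1 ;
itype elemCount = 0 ;
for(auto ij = ii->begin() ; ij != ii->end() ; ++ij, j++ )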
{
if(i==j)
aa_[i-1] = *ij ;
else if( i != j && *ij != 0 )
{
ja_.push_back(j);
aa_.push_back(*ij);
elemCount++ ;
}
ja_[i] = ja_[i-1] + elemCount;
}
}
for(auto& x : aa_ )
std::cout << x << ' ' ;
std::cout << std::endl;
for(auto& x : ja_ )
std::cout << x << ' ' ;
std::cout << std::endl;
}
template <typename T>
std::vector<T> operator*(const MCSRmatrix<T>& A, const std::vector<T>& x ) noexcept
{
std::vector<T> b(A.dim);
for(auto i=0; i < A.dim ; i++ )
b.at(i) = A.aa_.at(i)* x.at(i) ;
for(auto i=0; i< A.dim ; i++)
{
for(auto k=A.ja_.at(i) ; k < A.ja_.at(i+1)-1 ; k++ )
{
b.at(i) += A.aa_.at(k)* x.at(A.ja_.at(k));
}
}
return b;
}
and finally the main
# include "ModCSRmatrix.H"
using namespace std;
int main(){
std::vector<double> v1={0,1.3,4.2,0.8};
MCSRmatrix<double> m1 = {{1.01, 0 , 2.34,0}, {0, 4.07, 0,0},{3.12,0,6.08,0},{1.06,0,2.2,9.9} };
std::vector<double> v2 = m1*v1 ;
for(auto& x : v2)
cout << x << ' ' ;
cout << endl;
}
but the result is different from the result obtained in Octave!
I've corrected the code and now it compiles. It gives me the result:
0 5.291 25.536 9.68
but the correct result obtained using Octave is:
9.8280 5.2910 25.5360 17.1600
The strange thing is that the same code written in Fortran works!
MODULE MSR
IMPLICIT NONE
CONTAINS
subroutine amuxms (n, x, y, a,ja)
real*8 x(*), y(*), a(*)
integer n, ja(*)
integer i, k
do 10 i=1, n
y(i) = a(i)*x(i)
10 continue
do 100 i = 1,n
do 99 k=ja(i), ja(i+1)-1
y(i) = y(i) + a(k) *x(ja(k))
99 continue
100 continue
return
end
END MODULE
PROGRAM MSRtest
USE MSR
IMPLICIT NONE
INTEGER :: i
REAL(KIND(0.D0)), DIMENSION(4) :: y, x= (/0.,1.3,4.2,0.8/)
REAL(KIND(0.D0)), DIMENSION(9) :: AA = (/ 1.01, 4.07, 6.08, 9.9, 0., 2.34, 3.12, 1.06, 2.2/)
INTEGER , DIMENSION(9) :: JA = (/6, 7, 7, 8, 10, 3, 1, 1, 3/)
WRITE(6,FMT='(4F8.3)') (x(I), I=1,4)
CALL amuxms(4,x,y,aa,ja)
WRITE(6,FMT='(4F8.3)') (y(I), I=1,4)
END PROGRAM
In the above code the values of aa and ja are the ones produced by the C++ constructor, printed with this member
template <typename T>
inline auto constexpr MCSRmatrix<T>::printMCSR() const noexcept
{
for(auto& x : aa_ )
std::cout << x << ' ' ;
std::cout << std::endl;
for(auto& x : ja_ )
std::cout << x << ' ' ;
std::cout << std::endl;
}
which is called at the end of the constructor! I have now added the lines of this member at the end of the constructor, so if you try the constructor you get exactly the same vectors as written in the Fortran code.
Thanks, I followed your advice @Paul H. and rewrote operator* as follows:
(I didn't change the ja_ indexing because my class already has a lot of more or less debugged methods.)
template <typename T>
std::vector<T> operator*(const MCSRmatrix<T>& A, const std::vector<T>& x ) noexcept
{
std::vector<T> b(A.dim);
for(auto i=0; i < A.dim ; i++ )
b.at(i) = A.aa_.at(i)* x.at(i) ;
for(auto i=0; i< A.dim ; i++)
{
//for(auto k=A.ja_.at(i) ; k <= A.ja_.at(i+1)-1 ; k++ )
auto k=A.ja_.at(i)-1;
do
{
b.at(i) += A.aa_.at(k)* x.at(A.ja_.at(k)-1);
k++ ;
}while (k < A.ja_.at(i+1)-1 ); // ;
}
return b;
}
As you can see, I subtract 1 from every ja_ value used as an index:
x.at(A.ja_.at(k)-1) instead of x.at(A.ja_.at(k))
a different start for the index k: k=A.ja_.at(i)-1
and a different end of the loop (I've used a do-while instead of a for)
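One thing worth checking with the do-while: its body runs once before the condition is tested, so a row without off-diagonal entries (ja_[i] == ja_[i+1]) still picks up one term. In the example this happens for the second row, which adds aa_[6]*x[0], and it goes unnoticed only because x[0] is 0. A plain for loop with the same 1-based offsets avoids this. The sketch below is a hypothetical free function (multiply_one_based is not part of the class) that takes aa and ja directly:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch only: aa and ja are passed in directly instead of living in
// MCSRmatrix, and ja keeps the 1-based values produced by the constructor.
std::vector<double> multiply_one_based(const std::vector<double>& aa,
                                       const std::vector<std::size_t>& ja,
                                       const std::vector<double>& x) {
    const std::size_t dim = x.size();
    assert(aa.size() == ja.size() && ja.size() > dim);
    std::vector<double> b(dim);
    for (std::size_t i = 0; i < dim; ++i) b[i] = aa[i] * x[i];  // diagonal part
    for (std::size_t i = 0; i < dim; ++i) {
        // The range is empty when ja[i] == ja[i+1], so rows without
        // off-diagonal entries are skipped entirely.
        for (std::size_t k = ja[i] - 1; k < ja[i + 1] - 1; ++k) {
            b[i] += aa[k] * x[ja[k] - 1];
        }
    }
    return b;
}
```

With the aa and ja values from the Fortran test and x = {1, 1, 1, 1}, the second entry of the result should be exactly the diagonal contribution 4.07, since that row has no off-diagonal entries.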
The debugger is your friend! For future reference, here is a link to a very good blog post on debugging small programs: How to debug small programs.
There are a couple of off-by-one mistakes in your code. If you create the 4 x 4 matrix used as an example in the reference you linked to, you will see that the ja_ values you calculate are all off by one. Your Fortran version works because arrays in Fortran are indexed from 1 by default, not 0. So in class MCSRmatrix change
ja_.at(w) = dim+2;
to
ja_.at(w) = dim+1;
and
ja_.push_back(j);
to
ja_.push_back(j-1);
Then in your operator* method change
for(auto k=A.ja_.at(i) ; k < A.ja_.at(i+1)-1 ; k++ )
to
for(auto k = A.ja_.at(i); k < A.ja_.at(i+1); k++)
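Put together, the three changes amount to storing everything 0-based. A compact standalone sketch of that variant is shown below. Mcsr, build_mcsr and multiply are hypothetical names for this illustration, not the class from the question, and the row pointer is taken from aa.size() rather than from a running count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Mcsr {                     // hypothetical stand-in for MCSRmatrix
    std::vector<double> aa;       // [0, dim): diagonal, [dim]: unused, then off-diagonals by row
    std::vector<std::size_t> ja;  // [0, dim]: 0-based row pointers, then 0-based column indices
    std::size_t dim = 0;
};

Mcsr build_mcsr(const std::vector<std::vector<double>>& dense) {
    Mcsr m;
    m.dim = dense.size();
    m.aa.assign(m.dim + 1, 0.0);
    m.ja.assign(m.dim + 1, 0);
    m.ja[0] = m.dim + 1;  // first off-diagonal slot, counted from 0
    for (std::size_t i = 0; i < m.dim; ++i) {
        assert(dense[i].size() == m.dim);  // square matrices only
        for (std::size_t j = 0; j < m.dim; ++j) {
            if (i == j) {
                m.aa[i] = dense[i][j];
            } else if (dense[i][j] != 0.0) {
                m.aa.push_back(dense[i][j]);
                m.ja.push_back(j);
            }
        }
        m.ja[i + 1] = m.aa.size();  // one past the last entry of row i
    }
    return m;
}

std::vector<double> multiply(const Mcsr& m, const std::vector<double>& x) {
    assert(x.size() == m.dim);
    std::vector<double> b(m.dim);
    for (std::size_t i = 0; i < m.dim; ++i) b[i] = m.aa[i] * x[i];
    for (std::size_t i = 0; i < m.dim; ++i)
        for (std::size_t k = m.ja[i]; k < m.ja[i + 1]; ++k)
            b[i] += m.aa[k] * x[m.ja[k]];
    return b;
}
```

For the 4 x 4 matrix from the question this should give ja = {5, 6, 6, 7, 9, 2, 0, 0, 2}, and the product with {0, 1.3, 4.2, 0.8} should match the Octave values up to rounding.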