Signature for matrix-vector product function - C++

I am relatively new to C++ and still confused about how to pass and return arrays as arguments. I would like to write a simple matrix-vector-product c = A * b function, with a signature like
times(A, b, c, m, n)
where A is a two-dimensional array, b is the input array, c is the result array, and m and n are the dimensions of A. I want to specify array dimensions through m and n, not through A.
The body of the (parallel) function is
int i, j;
double sum;
#pragma omp parallel for default(none) private(i, j, sum) shared(m, n, A, b, c)
for (i = 0; i < m; ++i) {
    sum = 0.0;
    for (j = 0; j < n; j++) {
        sum += A[i][j] * b[j];
    }
    c[i] = sum;
}
What is the correct signature for a function like this?
Now suppose I want to create the result array c in the function and return it. How can I do this?

So instead of the "you should rather" answer (which I will leave up, because you really should rather!), here is the "what you asked for" answer.
I would use a std::vector to hold your array data (because it has O(1) move capabilities) rather than a std::array (which saves you an indirection, but costs more to move around). std::vector is the C++ "improvement" of a malloc'd (and realloc'd) buffer, while std::array is the C++ "improvement" of a char foo[27]; style buffer.
#include <cassert>
#include <vector>

std::vector<double> times(std::vector<double> const& A, std::vector<double> const& b,
                          size_t m, size_t n)
{
    assert(A.size() == m * n && b.size() == n);
    std::vector<double> c(m);   // an m x n matrix times an n-vector gives m entries
    // .. your code goes in here.
    // Instead of A[i][j], do A[i*n + j] (row-major) or A[j*m + i] (column-major),
    // depending on how you want the matrix laid out in memory.
    return c; // moved (or elided) out of the function -- an O(1) handoff, not an O(n) copy
}
You'll note I changed the signature slightly, so that it returns the std::vector instead of taking it as a parameter. I did this because I can, and it looks prettier!
If you really must pass c in to the function, pass it in as a std::vector<double>& -- a reference to a std::vector.
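A minimal sketch of that reference-parameter variant, under the same row-major layout assumption as above:
#include <cassert>
#include <cstddef>
#include <vector>

void times(std::vector<double> const& A, std::vector<double> const& b,
           std::vector<double>& c, size_t m, size_t n)
{
    assert(A.size() == m * n && b.size() == n);
    c.assign(m, 0.0);                       // the result has m entries
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j)
            c[i] += A[i * n + j] * b[j];    // row-major indexing
}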

This is the answer you should use... So a good way to solve this one involves creating a struct or class to wrap your array (well, buffer of data -- I'd use a std::vector). And instead of a signature like times(A, b, c, m, n), go with this kind of syntax:
Matrix<4,4> M;
ColumnMatrix<4> V;
ColumnMatrix<4> C = M*V;
where the width/height of M are in the <4,4> numbers.
A quick sketch of the Matrix class might be (somewhat incomplete -- no const access, for example)
#include <cassert>
#include <cstddef>
#include <vector>

template<size_t rows, size_t columns>
class Matrix
{
private:
    std::vector<double> values;
public:
    struct ColumnSlice
    {
        Matrix<rows,columns>* matrix;
        size_t row_number;
        double& operator[](size_t column) const
        {
            size_t index = row_number * columns + column;
            assert(matrix && index < matrix->values.size());
            return matrix->values[index];
        }
        ColumnSlice( Matrix<rows,columns>* matrix_, size_t row_number_ ):
            matrix(matrix_), row_number(row_number_)
        {}
    };
    ColumnSlice operator[](size_t row)
    {
        assert(row < rows); // note: zero based indexes
        return ColumnSlice(this, row);
    }
    Matrix() {values.resize(rows*columns);}
    template<size_t other_columns>
    Matrix<rows, other_columns> operator*( Matrix<columns, other_columns> const& other ) const
    {
        Matrix<rows, other_columns> retval;
        // TODO: matrix multiplication code goes here
        return retval; // NRVO/move makes this cheap; no std::move needed
    }
};
template<size_t rows>
using ColumnMatrix = Matrix< rows, 1 >;
template<size_t columns>
using RowMatrix = Matrix< 1, columns >;
The above uses C++11 features (called C++0x at the time) that your compiler might not have; it can also be written without them.
The point of all of this? You can have math that both looks like math and does the right thing in C++, while being really darn efficient, and that is the "proper" C++ way to do it.
You can also program in a C-like way using some features of C++ (like std::vector to handle array memory management) if you are more used to it. But that is a different answer to this question. :)
(Note: code above has not been compiled, nor is it a complete Matrix implementation. There are template based Matrix implementations in the wild you can find, however.)
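For what it's worth, here is a rough sketch of the kind of loop that would go where the TODO above sits, written as a standalone function over a flat row-major buffer so it compiles on its own (the function name and parameters are illustrative, not part of the class above):
#include <cstddef>
#include <vector>

// Sketch: C = A * B for flat row-major buffers.
// A is rows x inner, B is inner x cols, C is rows x cols.
std::vector<double> multiply(std::vector<double> const& A,
                             std::vector<double> const& B,
                             std::size_t rows, std::size_t inner, std::size_t cols)
{
    std::vector<double> C(rows * cols, 0.0);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t k = 0; k < inner; ++k)       // r-k-c order keeps the B accesses sequential
            for (std::size_t c = 0; c < cols; ++c)
                C[r * cols + c] += A[r * inner + k] * B[k * cols + c];
    return C;
}
Inside the member operator* the same loops would read from this->values and the other matrix's storage and write into retval.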

Normal vector-matrix multiplication is as follows:
friend Vector operator*(const Vector &v, const Matrix &m);
But if you want to pass the dimensions separately, it's as follows:
friend Vector mul(const Vector &v, const Matrix &m, int size_x, int size_y);
Since the Vector and Matrix would be 1d and 2d arrays, they would look like this:
struct Vector { float *array; };
struct Matrix { float *matrix; };
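A minimal sketch of how that mul could look, assuming the Matrix stores its elements row-major in a single float buffer and that the caller owns (and later frees) the result -- none of which is spelled out above:
#include <cstdlib>

struct Vector { float *array; };
struct Matrix { float *matrix; };

// v has size_x elements; m is size_y rows by size_x columns, stored row-major.
Vector mul(const Vector &v, const Matrix &m, int size_x, int size_y)
{
    Vector result;
    result.array = static_cast<float*>(std::malloc(sizeof(float) * size_y));
    for (int row = 0; row < size_y; ++row) {
        float sum = 0.0f;
        for (int col = 0; col < size_x; ++col)
            sum += m.matrix[row * size_x + col] * v.array[col];
        result.array[row] = sum;
    }
    return result;   // caller must free(result.array)
}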

Related

How can we speedup matrix multiplication where matrices are initialized using vectors of vectors (2D vector) in C++

I have written a function for matrix multiplication where matrices are defined using vectors of vectors containing double values.
vector<vector<double> > mat_mul(vector<vector<double> >A, vector<vector<double> >B){
vector<vector<double> >result(A.size(),vector<double>(B[0].size(),0));
if(A[0].size()==B.size()){
const int N=A[0].size();
for(int i=0;i<A.size();i++){
for(int j=0;j<B[0].size();j++){
for(int k=0;k<A[0].size();k++)
result[i][j]+=A[i][k]*B[k][j];
}
}
}
return result;
}
The code seems to work fine but is very slow for matrices as large as 400x400.
How can we speed this up? I have seen other answers, but they discuss matrix multiplication in general rather than any speedup specific to vectors of vectors.
Any help is highly appreciated.
#include <cstddef>
#include <utility>
#include <vector>
using std::vector;

struct matrix : vector<double>
{
    using base = vector<double>;
    using size_type = std::pair<std::size_t, std::size_t>;
    void resize(size_type const& dims, double const v = 0) {
        rows = dims.second;
        base::resize(dims.first * rows, v);   // fill any new elements with v
    }
    size_type size() const { return { cols(), rows }; }
    double& operator[](size_type const& idx) {
        base& vec = *this;
        return vec[idx.first + idx.second * cols()];
    }
    double operator[](size_type const& idx) const {
        base const& vec = *this;
        return vec[idx.first + idx.second * cols()];
    }
private:
    std::size_t cols() const { return rows ? base::size() / rows : 0; }
    std::size_t rows = 0;
};
///... (needs <tuple> for tuple_cat/make_tuple/get)
auto const size = std::tuple_cat(A.size(), std::make_tuple(B.size().second));
matrix result;
result.resize({std::get<0>(size), std::get<2>(size)});
for(std::size_t i = 0; i < std::get<0>(size); ++i)
    for(std::size_t j = 0; j < std::get<1>(size); ++j)
        for(std::size_t k = 0; k < std::get<2>(size); ++k)
            result[{i,k}] += A[{i,j}] * B[{j,k}];
I just skipped lots of details, such as non-default constructors, which are needed if you want a pretty initialization syntax. Moreover, as a matrix, this type will need lots of arithmetic operators.
Another approach would be a type-erased 2D raw array, but that would require defining an assignment operator as well as copy and move constructors. So this std::vector-based solution seems to be the simplest implementation.
Also, if the dimensions are fixed at compile-time, a template alias can do:
template<typename T, std::size_t cols, std::size_t rows>
using array_2d = std::array<std::array<T, rows>, cols>;   // note: use T, not a hard-coded double

array_2d<double, icount, kcount> result{};   // icount/kcount must be compile-time constants
You are using the naive algorithm for matrix multiplication. It is extremely cache-unfriendly, and you are hit by the full memory latency.
First, split up the operations so your inner loop repeatedly accesses data that fits into your cache.
Second, calculate four sums simultaneously to avoid latency penalties.
Third, use FMA (fused multiply-add) instructions, which calculate a product and a sum in the same time as a product.
Fourth, use vector registers.
Fifth, use multiple threads.
Or just use a package like LINPACK that optimises these things for you. (A sketch of the first point follows below.)
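As an illustration of the first point: simply swapping the two inner loops already makes the accesses to B sequential within each inner vector. A minimal sketch, keeping the vector-of-vectors layout and passing by const reference instead of by value:
#include <cstddef>
#include <vector>
using std::vector;

// Same contract as the original mat_mul, but with the j and k loops swapped
// so B is walked row by row (contiguous within each inner vector).
vector<vector<double>> mat_mul_ikj(const vector<vector<double>>& A,
                                   const vector<vector<double>>& B)
{
    vector<vector<double>> result(A.size(), vector<double>(B[0].size(), 0.0));
    if (A[0].size() != B.size())
        return result;
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t k = 0; k < B.size(); ++k) {
            const double a = A[i][k];               // reused across the whole row of B
            for (std::size_t j = 0; j < B[0].size(); ++j)
                result[i][j] += a * B[k][j];
        }
    return result;
}
Blocking (tiling), multiple accumulators, and threading can then be layered on top of this.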

Matrix multiplication using multiple threads

So I am trying to compute (M by N matrix) times (N by 1 vector) operations with threads into a resulting vector. The question in my book says that I should think about how many threads to use, and I assume since the result matrix should be M by 1 then I should use M threads, one for each set of operations.
M is height, and N is width.
To create the threads I use
thread* myThreads = new thread[height];
Then I create height threads, each running the MatrixMultThreads function, and at the end I join all the threads.
for (int i = 0; i < height; i++)
{
    myThreads[i] = thread(MatrixMultThreads, my2DArray, vector, height, width);
}
for (int i = 0; i < height; i++)
{
    myThreads[i].join();
}
What I am having trouble figuring out is how I should sum up all the resulting values in the correct order. How would I tell each specific thread what to do?
I was thinking maybe I should create a global variable step_i and set it to 0; then each time the function is called I can increment that variable. Then, since I can pass the width of the array, I go through each step_i and add arr[i][j] * vector[j].
What I am having trouble figuring out is how should I sum up all the resulting values in the correct order.
They can be summed out-of-order, which is why this is a good problem to solve with multi-threading. If ordering matters to a specific problem, you can't improve it with multithreading (to be clear, if any sub-problem can be solved out-of-order then that sub-problem is a potential candidate for multithreading).
One solution to your problem is to set up a solution vector at the call site, then pass the corresponding element by reference (also the MatrixMultiply function needs to know which problem it's solving):
void MatrixMultiply(const Array2d& matrix,
                    const vector<int>& vec, int row, int& solution);
// ...
vector<int> result(height);
for (int i = 0; i < height; i++)
{
    // std::ref/std::cref are needed: std::thread copies its arguments by default
    threads[i] = thread(MatrixMultiply, std::cref(array2d), std::cref(array1d),
                        i, std::ref(result[i]));
}
Your 2D array should really provide info on its height and width without having to pass these values explicitly.
BONUS INFO:
We could make this solution much more OOP in a way that you'll want to reuse for future problems (and some experienced programmers seem to miss this trick for using arrays):
MatrixMultiply function is really similar to a dot-product function:
template <typename V1, typename V2>
auto DotProduct(const V1& vec1, const V2& vec2)
{
    auto result = vec1[0] * vec2[0];
    for (size_t i = 1; i < vec1.size(); ++i)
        result += vec1[i] * vec2[i];
    return result;
}

template <typename V1, typename V2, typename T>
auto DotProduct(const V1& vec1, const V2& vec2, T& result)
{
    result = DotProduct(vec1, vec2);
}
(The above allows the vectors to be any objects that support size() and [] as expected.)
We can write a wrapper class around std::vector that can be used by our array class to handle all the indexing for us; like this:
template <typename T, typename A>
class SubVector
{
    const typename std::vector<T,A>::iterator m_it;
    const size_t m_size, m_interval_size;
public:
    SubVector (std::vector<T,A>& v, size_t start, size_t sub_size, size_t i_size = 1)
        : m_it(v.begin() + start), m_size(sub_size), m_interval_size(i_size)
    {}
    auto size () const
    {
        return m_size;
    }
    const T& operator [] (size_t i) const
    {
        return m_it[i*m_interval_size];
    }
    T& operator [] (size_t i)
    {
        return m_it[i*m_interval_size];
    }
};
Then you could use this in some kind of Vectorise method in your array; like this:
template <typename T, typename A = std::allocator<T>>
class Array2D
{
    std::vector<T,A> m_data;
    size_t m_width, m_height;
public:
    // your normal methods
    // (not const, because SubVector keeps a non-const iterator into m_data)
    auto VectoriseRow(int r)
    {
        return SubVector(m_data, r*m_width, m_width);
    }
    auto VectoriseColumn(int c)
    {
        return SubVector(m_data, c, m_height, m_width);
    }
};
(Note: We could add the Vectorise feature to std::array or boost::multi_array by just writing a wrapper around them, which makes our array class more generic and saves us from having to do all the work. boost actually has this sort of feature inbuilt with array_view.)
Now our call site can be like so:
vector<int> result(height);
for (int i = 0; i < height; i++)
{
    // a lambda is used here because DotProduct is a function template and
    // result[i] needs to be written through a reference
    threads[i] = thread([&, i] { DotProduct(array2d.VectoriseRow(i), array1d, result[i]); });
}
This might seem like a more verbose way of solving the original problem (because it is), but if you use multi-dimensional arrays in your coding you'll find you no longer have to write multi-array-specific functions, or handle ugly indices for sub-problems (even in 1D problems, like Mean of Means). When dealing with those sorts of problems, you'll invariably want to reuse something like the above code.
You can store the result of each row dotted with the Nx1 vector in an Mx1 vector and then do the sums.
By the way, you would be much better off using OpenMP for such a problem; it automates most of the thread management according to the number of cores on your machine, since here you might spawn a lot of threads:
https://www.openmp.org/
http://www.bowdoin.edu/~ltoma/teaching/cs3225-GIS/fall17/Lectures/openmp.html
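To give an idea of how little code that takes, here is a minimal OpenMP sketch of the same M x N times N x 1 product (the names are illustrative, not taken from the question; compile with -fopenmp or your compiler's equivalent):
#include <cstddef>
#include <vector>

// result[i] = sum_j matrix[i][j] * vec[j]; OpenMP hands out rows to its thread pool.
std::vector<int> multiply_omp(const std::vector<std::vector<int>>& matrix,
                              const std::vector<int>& vec)
{
    std::vector<int> result(matrix.size(), 0);
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(matrix.size()); ++i) {
        int sum = 0;
        for (std::size_t j = 0; j < vec.size(); ++j)
            sum += matrix[i][j] * vec[j];
        result[i] = sum;
    }
    return result;
}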

How to initialize double matrix with NaN element?

In my code I have a matrix of double like this:
double** matrix = new double*[10];
for(int i=0;i<10;i++)
    matrix[i] = new double[10];
I want to have a NaN value in every cell of this matrix when I initialize it. Is it possible to do this automatically, or is the only solution:
for(int i=0;i<10;i++)
    for(int j=0;j<10;j++)
        matrix[i][j] = nan("");
Is it possible to arrange that, when the matrix is constructed, it doesn't use the default constructor of double (which I assumed inserts 0.0 into every matrix[i][j]) but inserts nan("") instead?
double doesn't have a default constructor, i.e. double values are uninitialized by default.
To avoid explicitly implementing the loops, you can use std::vector:
#include <cmath>
#include <vector>
...
std::vector<std::vector<double>> matrix(10, std::vector<double>(10, nan("")));
or:
#include <cmath>
#include <vector>
using namespace std;
...
vector<vector<double>> matrix(10, vector<double>(10, nan("")));
First, strongly avoid using raw pointers in C++ yourself - it's almost always a bad idea. If there's no container class that fits, use std::unique_ptr. So your code becomes:
auto matrix = std::make_unique<std::unique_ptr<double[]>[]>(10);
for(int i = 0; i < 10; i++) {
    matrix[i] = std::make_unique<double[]>(10);   // each row is its own unique_ptr<double[]>
}
This code is still not what you want. It's usually not a good idea to create your NxN matrix using N calls to new, or N constructions of a vector. Make a single allocation of NxN doubles, and then either wrap it in a class MyMatrix which supports a 2-parameter square-brace operator (C++23; on older compilers use operator() instead), i.e.
template <typename T>
class MyMatrix {
    // etc. etc
    const T& operator[](size_t i, size_t j) const { return data_[i*n + j]; }
    T&       operator[](size_t i, size_t j)       { return data_[i*n + j]; }
};
or (not-recommended) have the pointers point into the single-allocation region:
size_t n = 10;
auto matrix_data = std::make_unique<double[]>(n * n);
auto matrix = std::make_unique<double*[]>(n);
for(size_t i = 0; i < n; i++) {
    matrix[i] = matrix_data.get() + i * n;
}
In each of these cases you can later use std::fill to set all matrix values to NaN, outside of any loop.
The last example above can also be transformed into using vectors (which is probably a better idea than just the raw pointers if you're not using your own class):
size_t n = 10;
auto matrix_data = std::vector<double>(n * n);
auto matrix = std::vector<double*>(n);
for(auto& row : matrix) {
    auto row_index = &row - matrix.data();   // index of this row pointer
    row = &matrix_data[row_index * n];
}
Again, I don't recommend this - it's still a C-like way to enable a my_matrix[i][j] syntax, while using a wrapper class gets you my_matrix[i,j] without needing extra storage, with initialization to NaN or another value (in the constructor), and without following two pointers each time you access it.
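To make the std::fill suggestion above concrete, here is a minimal sketch using the vector variant and std::nan from <cmath>:
#include <algorithm>
#include <cmath>
#include <vector>

int main()
{
    const std::size_t n = 10;
    std::vector<double> matrix_data(n * n);
    std::fill(matrix_data.begin(), matrix_data.end(), std::nan(""));
    // equivalently, construct it already filled:
    // std::vector<double> matrix_data(n * n, std::nan(""));
}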
If you want to use statically sized arrays, you would be better off using std::array. For easier use of multi-dimensional std::array you can use a template alias:
template <class T, size_t ROW, size_t COL>
using Matrix = std::array<std::array<T, COL>, ROW>;
You can set the values in the matrix with std::array::fill applied to each row, e.g.
Matrix<double, 3, 4> m = {};
for (auto& row : m)
    row.fill(42.0);
You can also create a compile-time constant matrix object initialized with a default value, to skip the initialization at runtime, with a simple constexpr function.
template<typename T, size_t R, size_t C>
constexpr auto makeArray(T&& x) {
    Matrix<T, R, C> m = {};
    for(size_t i = 0; i != R; ++i) {
        for(size_t j = 0; j != C; ++j) {
            m[i][j] = std::forward<T>(x);
        }
    }
    return m;
}
auto constexpr m = makeArray<double, 3, 4>(23.42);
I am going to repeat the advice given to prefer C++ constructs over C constructs. They are more type-safe and IMHO almost always more convenient to use; e.g. passing std::array objects as parameters is no different from passing any other objects. If you are coming from a C background and have no further C++ experience, I would recommend reading some tutorial text that does not first introduce C, e.g. A Tour of C++.

Constructor for a class containing a vec of vecs

I'm trying to learn how to create a class which properly initializes a vector of vectors to implement a matrix. The code below doesn't work: after running the constructor the vector has size 0. The program prints 0, and attempts to access elements of it result in errors.
My first question is what is wrong with the code and how to fix it.
My second question is if there is a better approach to creating a class to implement a matrix dynamically using vectors or similar objects from the STL.
Code:
class Matrix{
    std::vector<std::vector<int> > Mat;
public:
    Matrix(int n, int m);
    void set(int a, int b, int Value);
    int get(int a, int b);
    void size();
};

Matrix::Matrix(int n, int m){
    std::vector<std::vector<int> > Mat(n, vector<int>(m));
}

void Matrix::size(){
    std::cout << std::endl << Mat.size() << std::endl;
}

int Matrix::get(int a, int b){
    return Mat[a][b];
}

void Matrix::set(int a, int b, int Value){
    Mat[a][b]=Value;
}

int main(int argc, char** argv) {
    Matrix M(10,10);
    M.size();
    return 0;
}
1) Current code
The problem is that Mat is already constructed when you enter the body of your constructor. What you do is just redefine a local Mat which hides the member having the same name and which vanishes as soon as you exit the constructor.
Try this instead:
Matrix::Matrix(int n, int m) : Mat(n, vector<int>(m)) {
}
2) Are there better approaches?
It all depends on what you intend to do, what your constraints are, and what the trade-offs are:
If the size of your matrices is not always defined at compile time, this kind of implementation is fairly good, and the code is very readable.
If you have many rather small vectors, one alternative could be to flatten the internal representation of the matrix to a one-dimensional vector (see the sketch after this list). You'd spare some vectors, but would have to calculate the flattened index in the getter and the setter. If your Matrix class provided a lot of matrix operations, this would make the code less readable (i.e. Mat[n][m] vs. Mat[n*width+m]).
If the size of your matrix is determined at compile time (e.g. you use only 2D or 3D matrices), it could make sense to use std::array instead of std::vector: the compiler can then make use of the known sizes to generate faster code.
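A minimal sketch of that flattened alternative, matching the interface from the question (the get/set/size names are the question's; the row-major layout is an assumption):
#include <cstddef>
#include <vector>

class Matrix {
    std::size_t width;           // number of columns (m)
    std::vector<int> data;       // element (a, b) lives at a * width + b
public:
    Matrix(int n, int m) : width(m), data(static_cast<std::size_t>(n) * m) {}
    int  get(int a, int b) const      { return data[a * width + b]; }
    void set(int a, int b, int value) { data[a * width + b] = value; }
    std::size_t size() const          { return width ? data.size() / width : 0; }  // number of rows
};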
This code:
Matrix::Matrix(int n, int m){
    std::vector<std::vector<int> > Mat(n, vector<int>(m));
}
will default construct the member-variable Mat, and then, separately, try to construct a local variable Mat, unrelated to the member. To initialize the member variable, you'll want to use a member initializer list:
Matrix::Matrix(int n, int m)
: Mat(n, std::vector<int>(m))
{ }
As a side-note, size() should return the size rather than print it, and if your getter returned an int& instead of an int, you wouldn't need a setter with the duplicated code (a small sketch follows below).
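A minimal sketch of that reference-returning getter, reusing the Mat member from the question:
#include <cstddef>
#include <vector>

class Matrix {
    std::vector<std::vector<int>> Mat;
public:
    Matrix(int n, int m) : Mat(n, std::vector<int>(m)) {}
    int&       get(int a, int b)       { return Mat[a][b]; }   // M.get(2, 3) = 42; replaces set()
    const int& get(int a, int b) const { return Mat[a][b]; }
    std::size_t size() const           { return Mat.size(); }
};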
In the constructor you should initialize the member like this instead:
Matrix::Matrix(int n, int m) {
    Mat = std::vector<std::vector<int> >(n, std::vector<int>(m));
}
or use the member-initializer-list approach mentioned in the other answers, which avoids the default construction followed by an assignment.

What is the equivalent matrix-like C-array of a nested std::vector (for C and C++ interop)?

What is the equivalent matrix-like C-array of a nested std::vector (for C and C++ interop)?
For example, if one wanted to treat std::vector<std::vector<int>> as some kind of int arr[n][m], where n is the dimension of the outer vector and m of the inner vector, then what structure would one use in C?
This is motivated by wanting to have a similar correspondence between matrices in C and C++ as for vectors in:
https://stackoverflow.com/a/1733150/4959635
Based on additional information in the comments, let me suggest you do something like this instead:
#include <vector>

class TwoDimVector {
public:
    TwoDimVector(int num_cols, int num_rows)
        : m_num_rows(num_rows)
        , m_num_cols(num_cols)
        , m_data(m_num_cols * m_num_rows, 0)
    { }
    int & ix(int row, int col) {
        return m_data[m_num_cols * row + col];
    }
    const int m_num_rows;
    const int m_num_cols;
private:
    std::vector<int> m_data;
};
When you do nested vectors, there's a lot of extra work happening. Also, with nested vectors the data is not contiguous, making it hard to work with any C APIs. Notice that with this data structure the size is fixed at construction time and is accessible. It is designed to be row-contiguous, so for C interoperability you can get at raw pointers like so:
TwoDimVector tdv(4,3);
int * raw = &tdv.ix(0,0);
int * raw_second_row = &tdv.ix(1,0);
Just note: if you pass this into a function, be sure to pass by reference:
void do_work(TwoDimVector & tdv) {
...
}
If you don't pass by reference, it will copy everything, which is a bunch of (typically unnecessary) work.
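To show the C-interop angle, here is a sketch of a C-style function that only sees the flat row-major buffer; sum_all is a made-up example, not part of the answer above:
#include <cstddef>

// A C-style API that works on a flat row-major buffer.
extern "C" int sum_all(const int* data, std::size_t num_rows, std::size_t num_cols)
{
    int total = 0;
    for (std::size_t i = 0; i < num_rows * num_cols; ++i)
        total += data[i];
    return total;
}

// Usage with the TwoDimVector above (constructor order is cols, rows):
//   TwoDimVector tdv(4, 3);                       // 4 columns, 3 rows
//   int total = sum_all(&tdv.ix(0, 0), 3, 4);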
Maybe something like this code:
void translate(const vector<vector<int>>& vec){
    int m = vec.size(), n = 0;
    for (const vector<int>& deep : vec) // search for the maximum size of the nested vectors
    {
        if ((int)deep.size() > n)
            n = deep.size();
    }
    int arr[m][n];   // variable-length array: not standard C++!
    m = 0;
    for (const vector<int>& deep : vec){
        n = 0;       // restart the column index for each row
        for (int x : deep)
        {
            arr[m][n] = x;
            ++n;
        }
        ++m;
    }
    // So, I really don't know how you can return this array :(
}
You see, this code is BAD, you mustn't do it!!!
If you are writing C++ you should use std::vector - it is easier.
C-style arrays are a heritage from C; you shouldn't use them.