How to modify a method at runtime in C++?

I got here trying to transpose a matrix in O(1). So right now I have something like this:
#include <utility>
#include <vector>
template <class T>
class Matrix { // intended for math matrices
public:
    Matrix(int Rows, int Cols)
        : matrix(Rows, std::vector<T>(Cols, 0)), transpose(false), cols(Cols), rows(Rows) {}
    Matrix(const Matrix<T> &M)
        : matrix(M.matrix), transpose(M.transpose), cols(M.cols), rows(M.rows) {}
    ~Matrix() {}
    void t() {
        transpose = !transpose;
        std::swap(cols, rows);
    }
    T& operator()(int row, int col) {
        if (transpose)
            return matrix[col][row];
        else
            return matrix[row][col];
    }
private:
    std::vector<std::vector<T> > matrix;
    bool transpose;
    int cols;
    int rows;
};
In that code I have what I want: t() is O(1) and operator() is also O(1). But operator() is used a lot and I want to take away the if.
So, to verify if I can improve performance I want to have something like this:
#include <utility>
#include <vector>
template <class T>
class Matrix { // intended for math matrices
public:
    Matrix(int Rows, int Cols)
        : matrix(Rows, std::vector<T>(Cols, 0)), transpose(false), cols(Cols), rows(Rows) {}
    Matrix(const Matrix<T> &M)
        : matrix(M.matrix), transpose(M.transpose), cols(M.cols), rows(M.rows) {}
    ~Matrix() {}
    T& originalShape(int row, int col) { return matrix[row][col]; }
    T& transposedMatrix(int row, int col) { return matrix[col][row]; }
    void t() {
        transpose = !transpose;
        std::swap(cols, rows);
        if (transpose)
            &operator() = &transposedMatrix; // not valid C++: this is what I would like to express
        else
            &operator() = &originalShape;    // not valid C++
    }
    T& operator()(int row, int col) { return matrix[row][col]; }
private:
    std::vector<std::vector<T> > matrix;
    bool transpose;
    int cols;
    int rows;
};
Of course, that doesn't work. And I didn't find anything useful for this case.
More info about performance impact of t() and operator():
I have read that some libraries implement t() in either O(1) or O(rows*cols), depending on how the matrix is going to be used. Doing t() in O(1) seems like a good first step: if I later call a method that I know will access the matrix row by row, I can do the copying transpose at that moment.
As for removing the if: the idea is to put all the weight of the operation in t() and to have the fastest possible operator(), because t() would be called once in a while and operator() a lot. I also want to know how to do this because it might be helpful in another situation.
The question
Maybe my English is not good enough to make this clear: the objective of the question is to find a good way to change the behavior of operator(). It is not to improve Matrix() (advice is much appreciated, though), nor to discard any way of changing the behavior of operator() just because it might not be better than the if. Ultimately, I will analyze, code and post whichever has the best performance between what I have and what I get as answers. But to know what performs better I need those codes/algorithms/patterns/whatever, and I think this may help other people in different but similar situations.

If you store your matrix as a single vector, you can write the indexing function like this:
T& operator()(int row, int col){
return matrix[col*colstep + row*rowstep];
}
Initially rowstep is 1 and colstep is rows. The transpose operator swaps these two values and the sizes.
You do have one extra multiplication; you'll have to measure whether that is better or worse than the extra if. Note that branch prediction will guess right most of the time when you access many or all matrix elements in a loop.
You will still have the problem of accessing data in a non-optimal order, so it is worthwhile to write algorithms that take the storage order into account.
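To make the stride idea concrete, here is a minimal sketch. The class name StridedMatrix and the member names are my own, chosen for illustration; the bounds assert is an addition you can drop. It assumes the same starting layout as described above (rowstep is 1, colstep is rows).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class name; a sketch of the stride idea, not a complete matrix type.
template <class T>
class StridedMatrix {
public:
    StridedMatrix(std::size_t rows, std::size_t cols)
        : data_(rows * cols), rows_(rows), cols_(cols),
          rowstep_(1), colstep_(rows) {}  // column-major to start with

    // O(1) transpose: swap the strides and the sizes, leave the data alone.
    void t() {
        std::swap(rowstep_, colstep_);
        std::swap(rows_, cols_);
    }

    // Branch-free element access: one extra multiplication instead of an if.
    T& operator()(std::size_t row, std::size_t col) {
        assert(row < rows_ && col < cols_);
        return data_[col * colstep_ + row * rowstep_];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::vector<T> data_;
    std::size_t rows_, cols_;
    std::size_t rowstep_, colstep_;
};
```

After m.t(), the element that was at m(r, c) is found at m(c, r), and no element has been moved in memory. Whether this beats the if version is something only a measurement on your own workload can tell.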

Related

How can we speedup matrix multiplication where matrices are initialized using vectors of vectors (2D vector) in C++

I have written a function for matrix multiplication where matrices are defined using vectors of vectors containing double values.
vector<vector<double> > mat_mul(vector<vector<double> >A, vector<vector<double> >B){
vector<vector<double> >result(A.size(),vector<double>(B[0].size(),0));
if(A[0].size()==B.size()){
const int N=A[0].size();
for(int i=0;i<A.size();i++){
for(int j=0;j<B[0].size();j++){
for(int k=0;k<A[0].size();k++)
result[i][j]+=A[i][k]*B[k][j];
}
}
}
return result;
}
The code seems to work fine but is very slow for matrices as large as 400x400.
How can we speed this up? I have seen other answers, but they discuss matrix multiplication in general, not speeding it up for a vector of vectors.
Any help is highly appreciated.
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

struct matrix : std::vector<double>
{
    using base = std::vector<double>;
    using size_type = std::pair<std::size_t, std::size_t>; // {cols, rows}
    void resize(size_type const& dims, double const v = 0) {
        rows = dims.second;
        base::resize(dims.first * rows, v);
    }
    size_type size() const { return { cols(), rows }; }
    double& operator[](size_type const& idx) {
        base& vec = *this;
        return vec[idx.first + idx.second * cols()];
    }
    double operator[](size_type const& idx) const {
        base const& vec = *this;
        return vec[idx.first + idx.second * cols()];
    }
private:
    // guard against division by zero on an empty matrix
    std::size_t cols() const { return rows ? base::size() / rows : 0; }
    std::size_t rows = 0;
};
///...
auto const size = std::tuple_cat(A.size(), std::make_tuple(B.size().second));
matrix result;
result.resize({std::get<0>(size), std::get<2>(size)});
// std::size_t loop variables avoid narrowing errors in the braced indices
for (std::size_t i = 0; i < std::get<0>(size); ++i)
    for (std::size_t j = 0; j < std::get<1>(size); ++j)
        for (std::size_t k = 0; k < std::get<2>(size); ++k)
            result[{i,k}] += A[{i,j}] * B[{j,k}];
I have skipped lots of details, such as the non-default constructors that are needed if you want a pretty initialization syntax. Moreover, as a matrix, this type will need lots of arithmetic operators.
Another approach would be a type-erased 2D raw array, but that would require defining an assignment operator as well as copy and move constructors. So this std::vector-based solution seems to be the simplest implementation.
Also, if the dimensions are fixed at compile-time, a template alias can do:
template<typename T, std::size_t cols, std::size_t rows>
using array_2d = std::array<std::array<T, rows>, cols>; // use T, not a hard-coded double
array_2d<double, icount, kcount> result{};
You are using the naive algorithm for matrix multiplication. It is extremely cache-unfriendly, and you are hit by the full latency.
First, split up the operations so your inner loop repeatedly accesses data that fits into your cache.
Second, calculate four sums simultaneously to avoid penalties for latency.
Third, use FMA (fused multiply-add) instructions, which calculate a product and a sum in the same time as a product alone.
Fourth, use vector registers.
Fifth, use multiple threads.
Or just use a package like LINPACK that does these optimisations for you.
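As a rough illustration of the first point only, here is a sketch that stores each matrix in one flat row-major vector and reorders the loops to i-k-j, so the inner loop walks both B and the result sequentially in memory. This is a simpler stand-in for real cache blocking, not the blocked algorithm itself; the function name mat_mul_flat and its signature are assumptions of mine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Multiplies an n x m matrix A by an m x p matrix B, both row-major in flat vectors.
// Function name and signature are illustrative only.
std::vector<double> mat_mul_flat(const std::vector<double>& A,
                                 const std::vector<double>& B,
                                 std::size_t n, std::size_t m, std::size_t p) {
    assert(A.size() == n * m && B.size() == m * p);
    std::vector<double> C(n * p, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < m; ++k) {
            const double a = A[i * m + k];      // reused across the whole inner loop
            for (std::size_t j = 0; j < p; ++j) {
                C[i * p + j] += a * B[k * p + j];  // contiguous reads and writes
            }
        }
    }
    return C;
}
```

The arguments are taken by const reference, so nothing is copied on the way in. How much this helps at 400x400 depends on the compiler and machine, so measure before and after.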

Matrix multiplication using multiple threads

So I am trying to compute (M by N matrix) times (N by 1 vector) operations with threads into a resulting vector. The question in my book says that I should think about how many threads to use, and I assume since the result matrix should be M by 1 then I should use M threads, one for each set of operations.
M is height, and N is width.
To create the threads I use
thread* myThreads = new thread[height];
Then I call the MatrixMultThreads function i times. At the end I join all the threads.
for (int i = 0; i < height; i++)
{
myThreads[i] = thread(MatrixMultThreads, my2DArray, vector, height, width);
}
for (int i = 0; i < height; i++)
{
myThreads[i].join();
}
What I am having trouble figuring out is how should I sum up all the resulting values in the correct order. How would I tell each specific thread what to do.
I was thinking that maybe I should create a global variable step_i and set it to 0, then increment that variable each time the function is called. Then, since I can pass the width of the array, I go through each step_i and add arr[i][j] * vector[j].
"What I am having trouble figuring out is how should I sum up all the resulting values in the correct order."
They can be summed out-of-order, which is why this is a good problem to solve with multi-threading. If ordering matters to a specific problem, you can't improve it with multithreading (to be clear, if any sub-problem can be solved out-of-order then that sub-problem is a potential candidate for multithreading).
One solution to your problem is to set up a solution vector at the call site, then pass the corresponding element by reference (also the MatrixMultiply function needs to know which problem it's solving):
void MatrixMultiply(const Array2d& matrix,
const vector<int>& vec, int row, int& solution);
// ...
vector<int> result(height);
for (int i = 0; i < height; i++)
{
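// std::thread copies its arguments, so references must be wrapped with std::ref/std::cref (from <functional>)
threads[i] = thread(MatrixMultiply, std::cref(array2d), std::cref(array1d), i, std::ref(result[i]));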
}
Your 2D array should really provide info on its height and width without having to pass these values explicitly.
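Put together, the approach could look like the sketch below. Array2d here is a hypothetical stand-in of my own for whatever 2D container you use (row-major storage that knows its own size), and MultiplyThreaded is an illustrative wrapper name. Each thread writes only to its own element of the result, so no locking is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the 2D array: row-major storage that knows its size.
struct Array2d {
    std::size_t height, width;
    std::vector<int> data;
    int at(std::size_t r, std::size_t c) const { return data[r * width + c]; }
};

// Computes one element of the result: row `row` of the matrix dotted with vec.
void MatrixMultiply(const Array2d& matrix, const std::vector<int>& vec,
                    std::size_t row, int& solution) {
    assert(vec.size() == matrix.width);
    int sum = 0;
    for (std::size_t j = 0; j < matrix.width; ++j)
        sum += matrix.at(row, j) * vec[j];
    solution = sum;
}

// One thread per row; every thread owns exactly one slot of result.
std::vector<int> MultiplyThreaded(const Array2d& matrix, const std::vector<int>& vec) {
    std::vector<int> result(matrix.height);
    std::vector<std::thread> threads;
    threads.reserve(matrix.height);
    for (std::size_t i = 0; i < matrix.height; ++i)
        threads.emplace_back(MatrixMultiply, std::cref(matrix), std::cref(vec),
                             i, std::ref(result[i]));
    for (std::thread& t : threads)
        t.join();  // all results are complete once every thread has been joined
    return result;
}
```

Using a std::vector of threads instead of new thread[height] also removes the need for a matching delete[].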
BONUS INFO:
We could make this solution much more OOP in a way that you'll want to reuse for future problems (and some experienced programmers seem to miss this trick for using arrays):
MatrixMultiply function is really similar to a dot-product function:
template <typename V1, typename V2>
auto DotProduct(const V1& vec1, const V2& vec2)
{
auto result = vec1[0] * vec2[0];
for (size_t i = 1; i < vec1.size(); ++i)
result += vec1[i] * vec2[i];
return result;
}
template <typename V1, typename V2, typename T>
void DotProduct(const V1& vec1, const V2& vec2, T& result)
{
result = DotProduct(vec1, vec2);
}
(The above allows the vectors to be any objects that provide size() and [] as expected.)
We can write a wrapper class around std::vector that can be used by our array class to handle all the indexing for us; like this:
template <typename T, typename A>
class SubVector
{
const typename std::vector<T,A>::iterator m_it;
const size_t m_size, m_interval_size;
public:
SubVector (std::vector<T,A>& v, size_t start, size_t sub_size, size_t i_size = 1)
: m_it(v.begin() + start), m_size(sub_size), m_interval_size(i_size)
{}
auto size () const
{
return m_size;
}
const T& operator [] (size_t i) const
{
return m_it[i*m_interval_size];
}
T& operator [] (size_t i)
{
return m_it[i*m_interval_size];
}
};
Then you could use this in some kind of Vectorise method in your array; like this:
template <typename T, typename A = std::allocator<T>>
class Array2D
{
std::vector<T,A> m_data;
size_t m_width, m_height;
public:
// your normal methods
// non-const: SubVector stores a mutable iterator into m_data
auto VectoriseRow(int r)
{
return SubVector<T, A>(m_data, r*m_width, m_width);
}
auto VectoriseColumn(int c)
{
return SubVector<T, A>(m_data, c, m_height, m_width);
}
};
(Note: We could add the Vectorise feature to std::array or boost::multi_array by just writing a wrapper around them, which makes our array class more generic and saves us from having to do all the work. boost actually has this sort of feature inbuilt with array_view.)
Now our call site can be like so:
vector<int> result(height);
for (int i = 0; i < height; i++)
{
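// DotProduct is an overloaded template, so it cannot be passed by name; wrap the call in a lambda
threads[i] = thread([&, i] { DotProduct(array2d.VectoriseRow(i), array1d, result[i]); });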
}
This might seem like a more verbose way of solving the original problem (because it is), but if you use multi-dimensional arrays in your coding you'll find you no longer have to write multi-array-specific functions, or handle ugly indices for sub-problems (even in 1D problems, like Mean of Means). When dealing with those sorts of problems, you'll invariably want to reuse something like the above code.
You can store the results of the rows dot the Nx1 vector in a Mx1 vector and then do the sum.
By the way, you would be much better off using OpenMP for such a problem. It automates most of the thread management according to the number of cores on your machine, whereas here you might spawn a lot of threads:
https://www.openmp.org/
http://www.bowdoin.edu/~ltoma/teaching/cs3225-GIS/fall17/Lectures/openmp.html

Passing a 2D array of unknown size to a function C++

Basically, my problem is: I have the user to define the size (N,M) of a 2D array and then I declare:
int matrix[N][M];
then I need to pass this uninitialized matrix to a function that reads some data from a .csv file and puts it into the matrix, so I tried:
void readwrite(int &matrix[N][], const int N, const int M){....};
int main(){
....
cin>>N;
cin>>M;
int matrix[N][M];
readwrite(matrix,N,M);
};
However, when I compile it, it gives me the following error: "N was not declared in this scope".
Any ideas of how to make this work?
Thank y'all!
What the OP is trying is so annoyingly difficult to get right, and the benefits of pulling it off are so minuscule compared to the costs, that... Well, I'll quote from the Classics.
The only winning move is not to play.
-Joshua, WarGames
You cannot safely pass a dynamically allocated 2D array in C++ into a function because you always have to know at least one dimension at compile time.
I could point over at Passing a 2D array to a C++ function because that looks like a good duplicate. I won't because it's referring to statically allocated arrays.
You can play silly casting games to force the array into the function, and then cast it back on the inside. I'm not going to explain how to do this because it is epic-class stupid and should be a firing offense.
You can pass a pointer to a pointer, int **, but the construction and destruction logic is a grotesque set of new and loops. Further, the end result scatters the allocated memory around the RAM, crippling the processor's attempts at prediction and caching. On a modern processor, if you can't predict and cache, you are throwing away the greater part of your CPU's performance.
What you want to do is stay one dimensional. A 1D array is easy to pass. The indexing arithmetic is dead simple and easy to predict. It's all one memory block so cache hits are more likely than not.
Making a 1D array is simple: Don't. Use std::vector instead.
std::vector<int> arr(rows*columns);
If you have to, because the assignment spec says "No Vectors!", well, you're stuck.
int * arr = new int[rows*columns];
Note I'm using rows and columns not M and N. When faced with M and N which is which? Who knows, who cares, and why do this to yourself in the first place? Give your variables good, descriptive names and enjoy the time savings of being able to read your code when you are debugging it later.
The guts of usage are identical with array and vector:
int test = arr[row * columns + column];
Will recover the element in 2D space at [row][column]. I shouldn't have to explain what any of those variables mean. Death to M and N.
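A small worked example of that arithmetic, as a sketch: flat_index and make_grid are hypothetical helper names of my own, and the fill pattern (10 * row + column) is just there to make each cell recognisable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: maps a 2D position to its offset in row-major 1D storage.
std::size_t flat_index(std::size_t row, std::size_t column, std::size_t columns) {
    assert(column < columns);
    return row * columns + column;
}

// Fills a rows x columns grid so that each cell holds 10 * row + column.
std::vector<int> make_grid(std::size_t rows, std::size_t columns) {
    std::vector<int> arr(rows * columns);
    for (std::size_t row = 0; row < rows; ++row)
        for (std::size_t column = 0; column < columns; ++column)
            arr[flat_index(row, column, columns)] = static_cast<int>(10 * row + column);
    return arr;
}
```

With 4 columns, the element at row 2, column 1 lives at offset 2 * 4 + 1 = 9.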
Defining a function is:
void function (std::vector<int> & arr, size_t rows, size_t columns)
or (yuck)
void function (int * arr, size_t rows, size_t columns)
Note that rows and columns are of type size_t. size_t is unsigned (a negative array size is not something you want, so why allow it?) and it is guaranteed to be big enough to hold the largest possible array index you can use. In other words it is a much better fit than int. But why pass rows and columns everywhere? The smart thing to do at this point is make a wrapper around the array and its control variables and then bolt on a few functions to make the thing easier to use.
template<class TYPE>
class Matrix
{
private:
size_t rows, columns;
std::vector<TYPE> matrix;
public:
// no default constructor. Matrix is BORN ready.
Matrix(size_t numrows, size_t numcols):
rows(numrows), columns(numcols), matrix(rows * columns)
{
}
// vector handles the Rule of Three for you. Don't need copy and move constructors
// a destructor or assignment and move operators
// element accessor function
TYPE & operator()(size_t row, size_t column)
{
// check bounds here
return matrix[row * columns + column];
}
// constant element accessor function
TYPE operator()(size_t row, size_t column) const
{
// check bounds here
return matrix[row * columns + column];
}
// stupid little getter functions in case you need to know how big the matrix is
size_t getRows() const
{
return rows;
}
size_t getColumns() const
{
return columns;
}
// and a handy-dandy stream output function
friend std::ostream & operator<<(std::ostream & out, const Matrix & in)
{
for (size_t i = 0; i < in.getRows(); i++)
{
for (size_t j = 0; j < in.getColumns(); j++)
{
out << in(i,j) << ' ';
}
out << '\n';
}
return out;
}
};
Rough bash-out of what the array version would have to look like just to show the benefits of allowing vector to do its job. Not tested. May contain howlers. The point is a lot more code and a lot more room for error.
template<class TYPE>
class ArrayMatrix
{
private:
size_t rows, columns;
TYPE * matrix;
public:
ArrayMatrix(size_t numrows, size_t numcols):
rows(numrows), columns(numcols), matrix(new TYPE[rows * columns])
{
}
// Array version needs the copy and move constructors to deal with that damn pointer
ArrayMatrix(const ArrayMatrix & source):
rows(source.rows), columns(source.columns), matrix(new TYPE[rows * columns])
{
for (size_t i = 0; i < rows * columns; i++)
{
matrix[i] = source.matrix[i];
}
}
ArrayMatrix(ArrayMatrix && source):
rows(source.rows), columns(source.columns), matrix(source.matrix)
{
source.rows = 0;
source.columns = 0;
source.matrix = nullptr;
}
// and it also needs a destructor
~ArrayMatrix()
{
delete[] matrix;
}
TYPE & operator()(size_t row, size_t column)
{
// check bounds here
return matrix[row * columns + column];
}
TYPE operator()(size_t row, size_t column) const
{
// check bounds here
return matrix[row * columns + column];
}
// and also needs assignment and move operator
ArrayMatrix<TYPE> & operator=(const ArrayMatrix &source)
{
ArrayMatrix temp(source);
swap(*this, temp); // copy and swap idiom. Read link below.
// not following it exactly because operator=(ArrayMatrix source)
// collides with operator=(ArrayMatrix && source) of move operator
return *this;
}
ArrayMatrix<TYPE> & operator=(ArrayMatrix && source)
{
if (this != &source) // guard against self-move, which would leave a dangling pointer
{
delete[] matrix;
rows = source.rows;
columns = source.columns;
matrix = source.matrix;
source.rows = 0;
source.columns = 0;
source.matrix = nullptr;
}
return *this;
}
size_t getRows() const
{
return rows;
}
size_t getColumns() const
{
return columns;
}
friend std::ostream & operator<<(std::ostream & out, const ArrayMatrix & in)
{
for (size_t i = 0; i < in.getRows(); i++)
{
for (size_t j = 0; j < in.getColumns(); j++)
{
out << in(i,j) << ' ';
}
out << std::endl;
}
return out;
}
//helper for swap.
friend void swap(ArrayMatrix& first, ArrayMatrix& second)
{
std::swap(first.rows, second.rows);
std::swap(first.columns, second.columns);
std::swap(first.matrix, second.matrix);
}
};
Creating one of these is
Matrix<int> arr(rows, columns);
Now passing the array around is
void func(Matrix<int> & arr);
Using the array is
int test = arr(row, column);
All of the indexing math is hidden from sight.
Other references:
What is the copy-and-swap idiom?
What is The Rule of Three?
int &matrix[N][] - N has to be a compile-time constant, not just const and not at all a parameter. And reference to an array is declared like: int (&matrix)[size]
Try passing int **matrix, and you'll also need to change the way you create this array. Variable length arrays are not supported in C++, so you'll need to allocate its memory dynamically. Or rather, stick with std::vector, or std::array if you know the sizes at compile time.
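For the compile-time case, the reference-to-array syntax can be combined with a template so the compiler deduces both sizes. This is only a sketch for arrays whose dimensions are constants; fill_sequential and checked_get are illustrative names of my own, and it does not help with sizes read from cin.

```cpp
#include <cassert>
#include <cstddef>

// Both dimensions are deduced at compile time from the argument's type.
template <std::size_t Rows, std::size_t Cols>
void fill_sequential(int (&matrix)[Rows][Cols]) {
    int next = 0;
    for (std::size_t r = 0; r < Rows; ++r)
        for (std::size_t c = 0; c < Cols; ++c)
            matrix[r][c] = next++;  // 0, 1, 2, ... in row-major order
}

// Bounds-checked read; the sizes come from the array type, not from extra parameters.
template <std::size_t Rows, std::size_t Cols>
int checked_get(int (&matrix)[Rows][Cols], std::size_t r, std::size_t c) {
    assert(r < Rows && c < Cols);
    return matrix[r][c];
}
```

Called as fill_sequential(grid) with int grid[2][3], the template is instantiated with Rows = 2 and Cols = 3.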

c++ overloading operator [] efficiency

I just wanted to know how to overload the operator [] to access a matrix within a class, and I found how to do that here.
But, I have a question about this: Which way of changing a matrix will be more efficient?
1: Overloading the operator []: (code extracted from the previous link)
class CMatrix {
public:
int rows, cols;
int **arr;
public:
int* operator[]( int const y )
{
return &arr[0][y];
}
....
Edit: I relied too much on the other example: Should it work this way?
int* operator[]( int const x )
{
return &arr[x];
}
2:Using a "normal" method:
class CMatrix {
public:
int rows, cols;
int **arr;
public:
void changematrix( int i, int j, int n)
{
arr[i][j]=n;
}
...
Edit: Fixed const on changematrix
Write correct, readable code and let the compiler worry about efficiency.
If you have performance problems, and if when you run a profiler to measure the performance (which you MUST do before trying to optimize) this code shows up as a performance issue, then you can:
1) Examine the assembly language interpretation of the code generated by the compiler with full optimization enabled, or
2) Try it both ways, measure, and pick the faster one.
The results will be highly dependent on the compiler you are using and the flags you specify for that compiler.
Neither method will work as you have declared
int **arr;
gives no row length information for:
return &arr[0][y];
arr[i][j]=n;
I would go for a third option, using operator() instead of operator[]:
int& operator()(size_t i, size_t j) {
return arr[i][j];
}
int operator()(size_t i, size_t j) const {
return arr[i][j];
}
The user will then do:
CMatrix m = ...
m(1,2) = 5;
std::cout << m(1,2);
Other than that, I would really consider whether the way of laying out the data internally is the most efficient. Is this a jagged array? Or do all rows have the same width? If this represents a rectangular array (i.e. all rows have the same number of columns) you might be better off storing all elements in a single 1D array, and using some basic arithmetic to locate the correct element.
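A minimal sketch of that single-buffer layout, assuming a rectangular matrix of int. FlatMatrix is a hypothetical name of my own, not the asker's CMatrix. It offers operator() as suggested above and also keeps the m[i][j] syntax by letting operator[] return a pointer to the start of row i.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of a rectangular matrix on one contiguous buffer (hypothetical class name).
class FlatMatrix {
public:
    FlatMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0) {}

    int& operator()(std::size_t i, std::size_t j) {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];
    }
    int operator()(std::size_t i, std::size_t j) const {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];
    }

    // m[i][j] still works: operator[] returns a pointer to the start of row i.
    // Only the row index can be checked here; the column index is unchecked.
    int* operator[](std::size_t i) {
        assert(i < rows_);
        return data_.data() + i * cols_;
    }

private:
    std::size_t rows_, cols_;
    std::vector<int> data_;
};
```

Both access styles compute the same offset, so any speed difference between them is something to measure rather than assume.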

Signature for matrix-vector product function

I am relatively new to C++ and still confused how to pass and return arrays as arguments. I would like to write a simple matrix-vector-product c = A * b function, with a signature like
times(A, b, c, m, n)
where A is a two-dimensional array, b is the input array, c is the result array, and m and n are the dimensions of A. I want to specify array dimensions through m and n, not through A.
The body of the (parallel) function is
int i, j;
double sum;
#pragma omp parallel for default(none) private(i, j, sum) shared(m, n, A, b, c)
for (i = 0; i < m; ++i) {
sum = 0.0;
for (j = 0; j < n; j++) {
sum += A[i][j] * b[j];
}
c[i] = sum;
}
What is the correct signature for a function like this?
Now suppose I want to create the result array c in the function and return it. How can I do this?
So instead of the "you should rather" answer (which I will leave up, because you really should rather!), here is the "what you asked for" answer.
I would use std::vector to hold your array data (because they have O(1) move capabilities) rather than a std::array (which saves you an indirection, but costs more to move around). std::vector is the C++ "improvement" of a malloc'd (and realloc'd) buffer, while std::array is the C++ "improvement" of a char foo[27]; style buffer.
std::vector<double> times(std::vector<double> const& A, std::vector<double> const& b, size_t m, size_t n)
{
std::vector<double> c;
Assert(A.size() == m*n);
c.resize(m); // one result element per row of A
// .. your code goes in here.
// Instead of A[x][y], do A[x*n+y] or A[y*m+x] depending on if you want row-major or
// column-major order in memory.
return c; // the vector is moved (or the copy elided) out of this function in O(1)
}
You'll note I changed the signature slightly, so that it returns the std::vector instead of taking it as a parameter. I did this because I can, and it looks prettier!
If you really must pass c in to the function, pass it in as a std::vector<double>& -- a reference to a std::vector.
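Filled in, the function could look like the sketch below. It assumes A is stored row-major in a flat vector and uses the standard assert in place of the Assert macro. The OpenMP pragma from the question is left out here; it could go directly above the outer loop, and since sum and j are declared inside the loops they would already be private to each thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// c = A * b, where A is an m x n matrix stored row-major in a flat vector.
std::vector<double> times(const std::vector<double>& A,
                          const std::vector<double>& b,
                          std::size_t m, std::size_t n) {
    assert(A.size() == m * n && b.size() == n);
    std::vector<double> c(m);  // one result element per row of A
    for (std::size_t i = 0; i < m; ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < n; ++j)
            sum += A[i * n + j] * b[j];
        c[i] = sum;
    }
    return c;
}
```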
This is the answer you should use... So a good way to solve this one involves creating a struct or class to wrap your array (well, buffer of data -- I'd use a std::vector). And instead of a signature like times(A, b, c, m, n), go with this kind of syntax:
Matrix<4,4> M;
ColumnMatrix<4> V;
ColumnMatrix<4> C = M*V;
where the width/height of M are in the <4,4> numbers.
A quick sketch of the Matrix class might be (somewhat incomplete -- no const access, for example)
template<size_t rows, size_t columns>
class Matrix
{
private:
std::vector<double> values;
public:
struct ColumnSlice
{
Matrix<rows,columns>* matrix;
size_t row_number;
double& operator[](size_t column) const
{
size_t index = row_number * columns + column;
Assert(matrix && index < matrix->values.size());
return matrix->values[index];
}
ColumnSlice( Matrix<rows,columns>* matrix_, size_t row_number_ ):
matrix(matrix_), row_number(row_number_)
{}
};
ColumnSlice operator[](size_t row)
{
Assert(row < rows); // note: zero based indexes
return ColumnSlice(this, row);
}
Matrix() {values.resize(rows*columns);}
template<size_t other_columns>
Matrix<rows, other_columns> operator*( Matrix<columns, other_columns> const& other ) const
{
Matrix<rows, other_columns> retval;
// TODO: matrix multiplication code goes here
return retval; // a local returned by value is moved or elided automatically
}
};
template<size_t rows>
using ColumnMatrix = Matrix< rows, 1 >;
template<size_t columns>
using RowMatrix = Matrix< 1, columns >;
The above uses C++0x features your compiler might not have, and can be done without these features.
The point of all of this? You can have math that both looks like math and does the right thing in C++, while being really darn efficient, and that is the "proper" C++ way to do it.
You can also program in a C-like way using some features of C++ (like std::vector to handle array memory management) if you are more used to it. But that is a different answer to this question. :)
(Note: code above has not been compiled, nor is it a complete Matrix implementation. There are template based Matrix implementations in the wild you can find, however.)
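To show how the multiplication left as a TODO above might be filled in, here is a reduced sketch. FixedMatrix and ColumnVector are hypothetical names of my own, element access goes through operator() instead of the slice class, and only the parts needed for M*V are included.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reduced sketch of a matrix whose dimensions are template parameters.
template <std::size_t Rows, std::size_t Columns>
class FixedMatrix {
public:
    FixedMatrix() : values_(Rows * Columns, 0.0) {}

    double& operator()(std::size_t r, std::size_t c) {
        assert(r < Rows && c < Columns);
        return values_[r * Columns + c];
    }
    double operator()(std::size_t r, std::size_t c) const {
        assert(r < Rows && c < Columns);
        return values_[r * Columns + c];
    }

    // Mismatched inner dimensions simply fail to compile.
    template <std::size_t OtherColumns>
    FixedMatrix<Rows, OtherColumns>
    operator*(const FixedMatrix<Columns, OtherColumns>& other) const {
        FixedMatrix<Rows, OtherColumns> result;
        for (std::size_t i = 0; i < Rows; ++i)
            for (std::size_t k = 0; k < Columns; ++k)
                for (std::size_t j = 0; j < OtherColumns; ++j)
                    result(i, j) += (*this)(i, k) * other(k, j);
        return result;
    }

private:
    std::vector<double> values_;
};

template <std::size_t Rows>
using ColumnVector = FixedMatrix<Rows, 1>;
```

With this, ColumnVector<2> C = M * V; compiles for a FixedMatrix<2, 3> M and a ColumnVector<3> V, while a ColumnVector<4> V is rejected at compile time.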
Normal vector-matrix multiplication is as follows:
friend Vector operator*(const Vector &v, const Matrix &m);
But if you want to pass the dimensions separately, it's as follows:
friend Vector mul(const Vector &v, const Matrix &m, int size_x, int size_y);
Since the Vector and Matrix would be 1d and 2d arrays, they would look like this:
struct Vector { float *array; };
struct Matrix { float *matrix; };