C++ Matrix Class with Operator Overloading

I was implementing a small dense matrix class and instead of plain get/set methods I wanted to use operator overloading to make the API more usable and coherent.
What I want to achieve is pretty simple:
template<typename T>
class Matrix
{
public:
/* ... Leaving out CTOR and Memory Management for Simplicity */
T operator() (unsigned x, unsigned y){/* ... */ }
};
Matrix<int> m(10,10);
int value = m(5,3); // get the value at index 5,3
m(5,3) = 99; // set the value at index 5,3
While getting the value is straightforward by overloading operator(), I can't get my head around defining the setter. From what I understood, operator precedence means operator() would be called before the assignment; however, it is not possible to overload operator() to return a correct lvalue.
What is the best approach to solve this problem?

I dispute that "it's not possible" to do the correct thing:
struct Matrix
{
int & operator()(size_t i, size_t j) { return data[i * Cols + j]; }
const int & operator()(size_t i, size_t j) const { return data[i * Cols + j]; }
/* ... */
private:
const size_t Rows, Cols;
int data[Rows * Cols]; // not real code!
};
Now you can say, m(2,3) = m(3,2) = -1; etc.
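A compilable variant of the sketch above might look as follows. It is only a sketch: the std::vector storage, the constructor parameters and the read() helper are assumptions added for illustration and are not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical runnable variant: std::vector replaces the "not real code" array.
class Matrix
{
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0) {}

    // Non-const overload: returns an lvalue, so m(i, j) = v compiles.
    int& operator()(std::size_t i, std::size_t j)
    {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];
    }

    // Const overload: selected for const objects, gives read-only access.
    const int& operator()(std::size_t i, std::size_t j) const
    {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];
    }

private:
    std::size_t rows_, cols_;
    std::vector<int> data_;
};

// Reads through a const reference, which selects the const overload.
inline int read(const Matrix& m, std::size_t i, std::size_t j) { return m(i, j); }
```

With this in place, Matrix m(4, 4); m(2, 3) = m(3, 2) = -1; assigns through the non-const overload, while read(m, 2, 3) goes through the const one.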

The answer to your question is what Kerrek already stated: you can provide an overload by changing the signature of the operator, which in this case can be achieved by modifying the const-ness of the function.
But I would recommend that you at least consider providing a separate setter for the values. The reason is that once you return references to your internal data structures you lose control of what external code does with your data. It might be ok in this case, but consider that if you decided to add range validation to the implementation (i.e. verify that no value in the matrix is above X or below Y), or wished to optimize some calculation on the matrix (say the sum of all of the elements in the matrix is an often checked value, and you want to optimize away the calculation by pre-caching the value and updating it on each field change) it is much easier to control with a method that receives the value to set.
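To make the trade-off concrete, here is a minimal sketch of the setter approach. The class name CheckedMatrix, the members get(), set() and sum(), and the choice to fill the matrix with the lower bound are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical class: every write goes through set(), so the class can
// validate the value and keep a cached sum up to date.
class CheckedMatrix
{
public:
    CheckedMatrix(std::size_t rows, std::size_t cols, int lo, int hi)
        : rows_(rows), cols_(cols), lo_(lo), hi_(hi),
          data_(rows * cols, lo),
          sum_(static_cast<long long>(lo) * static_cast<long long>(rows * cols)) {}

    int get(std::size_t i, std::size_t j) const { return data_[index(i, j)]; }

    void set(std::size_t i, std::size_t j, int value)
    {
        if (value < lo_ || value > hi_)
            throw std::out_of_range("value outside allowed range");
        int& cell = data_[index(i, j)];
        sum_ += static_cast<long long>(value) - cell;  // adjust the cached sum
        cell = value;
    }

    // Constant-time because the sum is maintained on every write.
    long long sum() const { return sum_; }

private:
    std::size_t index(std::size_t i, std::size_t j) const
    {
        assert(i < rows_ && j < cols_);
        return i * cols_ + j;
    }

    std::size_t rows_, cols_;
    int lo_, hi_;
    std::vector<int> data_;
    long long sum_;
};
```

Neither the validation nor the cached sum could be enforced if operator() handed out an int& to the caller.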

Related

Initialization not Working

I am trying to solve the Shortest Path problem in C++. For that I have created the following Graph() constructor.
Graph::Graph(int NumberOfVertices){
this->NumberOfVertices=NumberOfVertices;
cout<<"Graph()"<<endl;
//declaring the weight matrix
WeightMatrix=new int*[NumberOfVertices];
for(int a=0;a<NumberOfVertices;a++)
WeightMatrix[a]=new int[NumberOfVertices];
//initialising the weight matrix
WeightMatrix[NumberOfVertices][NumberOfVertices]={0};
ShortestPathArray=new int[NumberOfVertices];
}
I have two questions.
Why is a simple declaration like
WeightMatrix=new int[NumberOfVertices][NumberOfVertices] not allowed? I tried doing so but there were errors. I found the solution online, but am not able to understand it.
The initialization step is not working. The code doesn't proceed further than this statement WeightMatrix[NumberOfVertices][NumberOfVertices]={0};
When I comment out this step everything works fine.
Question #1:
The type of WeightMatrix is int**, so you cannot initialize it with new int[...].
As you seem to have already fixed in your code, the right way is to initialize it with new int*[...].
Question #2:
Initializing an array to a list of values is allowed only at declaration. For example:
int WeightMatrix[M][N] =
{
{1,2,3,...},
{4,5,6,...},
...
};
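The statement in your constructor instead targets the single element WeightMatrix[NumberOfVertices][NumberOfVertices], which lies outside the rows you allocated. You can fix the error by changing this: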
WeightMatrix[NumberOfVertices][NumberOfVertices]={0};
To this:
for (int i=0; i<NumberOfVertices; i++)
for (int j=0; j<NumberOfVertices; j++)
WeightMatrix[i][j] = 0;
Why is a simple declaration like WeightMatrix=new int[NumberOfVertices][NumberOfVertices] not allowed? I tried doing so but there were errors. I found the solution online, but am not able to understand it.
It should help to compare this to creation of an array on the stack, for which you can do:
int my_array[X][Y];
When you later say my_array[x][y], the compiler's record of the value Y is used to find the int value at address &my_array[0][0] + x * Y + y. But, when you use new and specify a dimension at run-time, the compiler isn't obliged to store the dimension(s) involved - that would adversely affect run-time memory usage and performance. Without such dimensions though, the compiler can't support a [x][y] notation, so it's misleading to let you use new as if it were creating a multi-dimensional array. In practice, implementations sometimes store the single allowed array dimension in some extra memory they ask for when you use new[] so they can iterate over the right number of elements to call destructors, but they might want to avoid that for types such as int that require no destruction.
The expectation is that you'll work out the total number of elements you need:
int* weightMatrix = new int[NumberOfVertices * NumberOfVertices];
Then what's conceptually weightMatrix[x][y] can be stored at weightMatrix[x * NumberOfVertices + y] (or if you prefer weightMatrix[x + NumberOfVertices * y]).
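The index arithmetic can be sketched as below. The helper names flatIndex and makeWeightMatrix are hypothetical, and std::vector is used instead of a raw new[] purely so the example cleans up after itself and starts out zeroed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: maps (x, y) to a position in a flat, row-major buffer
// holding an n-by-n matrix.
inline std::size_t flatIndex(std::size_t x, std::size_t y, std::size_t n)
{
    assert(x < n && y < n);
    return x * n + y;
}

// Builds an n-by-n weight matrix in a single allocation; every entry starts at zero.
inline std::vector<int> makeWeightMatrix(std::size_t n)
{
    return std::vector<int>(n * n, 0);
}
```

For n = 4, the element that is conceptually at [2][3] lives at position 2 * 4 + 3 = 11 of the flat buffer.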
I recommend writing a simple class that has an operator to provide a convenient notation à la matrix(x, y):
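template <typename T>
class Matrix
{
public:
// A default argument cannot refer to another parameter, so the square case gets its own constructor.
explicit Matrix(size_t X) : X_(X), Y_(X), p_(new T[X * X]) { }
Matrix(size_t X, size_t Y) : X_(X), Y_(Y), p_(new T[X * Y]) { }
~Matrix() { delete[] p_; }
T& operator()(size_t x, size_t y) { return p_[x * Y_ + y]; }
const T& operator()(size_t x, size_t y) const { return p_[x * Y_ + y]; }
private:
// Copying is not handled here; disable or implement it before real use.
size_t X_, Y_;
T* p_;
};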
Then you can write simpler, cleaner and more robust client code:
Matrix<int> matrix(20, 10);
matrix(4, 2) = 13;
You can also easily put checks in operator() to catch out-of-bound indexing during development and testing.
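One way such a check could look is sketched here. The class name BoundsCheckedMatrix and the private checked() helper are hypothetical; the sketch throws std::out_of_range so a bad index is reported in every build, whereas an assert would be the usual choice for checks that should disappear in release builds.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical variant of the class above with an index check on every access.
// Not suitable for T = bool, because std::vector<bool> does not hand out T&.
template <typename T>
class BoundsCheckedMatrix
{
public:
    BoundsCheckedMatrix(std::size_t X, std::size_t Y) : X_(X), Y_(Y), data_(X * Y) {}

    T& operator()(std::size_t x, std::size_t y) { return data_[checked(x, y)]; }
    const T& operator()(std::size_t x, std::size_t y) const { return data_[checked(x, y)]; }

private:
    // Validates both indices, then returns the flat row-major position.
    std::size_t checked(std::size_t x, std::size_t y) const
    {
        if (x >= X_ || y >= Y_)
            throw std::out_of_range("matrix index out of range");
        return x * Y_ + y;
    }

    std::size_t X_, Y_;
    std::vector<T> data_;
};
```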

Implementing the A(:,k)=b; Matlab-like syntax in a C++ matrix library

I have developed an expression templates-based C++ matrix class of my own. I have overloaded the () operator so that I can read or write matrix elements as, for example,
cout << A(i,j) << endl;
and
A(i,j)=b;
respectively.
I have also implemented a Range class to enable Matlab-like reads as
cout << A(Range(3,5),Range(0,10)) << endl;
The template Matrix class is exemplified as
template <typename OutType>
class Matrix
{
private:
int Rows_; //number of Rows
int Columns_; //number of Columns
OutType *data_; //row-major order allocation
public:
// --- Access operators
inline OutType & operator()(const int i, const int j) { return data_[IDX2R(i,j,GetColumns())]; }
inline OutType operator()(const int i, const int j) const { return data_[IDX2R(i,j,GetColumns())]; }
// --- SubExpressions - Range Range
inline Expr<SubMatrixExpr<const OutType*,OutType>,OutType> operator()(Range range1, Range range2)
{ typedef SubMatrixExpr<const OutType*,OutType> SExpr;
return Expr<SExpr,OutType>(SExpr(data_,Rows_,Columns_,range1.numelements_,range2.numelements_,range1.start_,range1.step_,range2.start_,range2.step_),
range1.numelements_,range2.numelements_);
}
};
I would now like to enable Matlab-like assignments as
A(Range(3,5),Range(0,10))=B;
where B is an appropriate matrix.
I think that, to achieve the Matlab-like syntax above, two possibilities would be
1. overloading the () operator so that it returns an array of pointers, and then overloading the = operator so that it can act between an array of pointers and a Matrix;
2. exploiting the existing overload of the () operator shown above and overloading the = operator so that it can act between an expression and a Matrix.
Maybe the first option is not very convenient, especially for very large matrices.
Am I correct? Are there other more efficient/effective possibilities using perhaps more sophisticated C++ features (e.g., move semantics)?
Thank you very much for your help.
I think your best bet is to have a non-const version of operator()(Range, Range) return a proxy object that has an overloaded assignment operator that knows how to assign to a range (back into the original matrix for example).
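A minimal sketch of that proxy idea follows. The Range and Matrix types here are deliberately simplified stand-ins (a Range is just a start and a count, with no step), and the name SubMatrixRef is hypothetical; none of this is the expression-template code from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simplified Range: a start index and an element count.
struct Range
{
    std::size_t start;
    std::size_t count;
};

class Matrix
{
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) { return data_[i * cols_ + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    // Proxy returned by the non-const operator()(Range, Range).
    class SubMatrixRef
    {
    public:
        SubMatrixRef(Matrix& m, Range r, Range c) : m_(m), r_(r), c_(c) {}

        // Assigning a Matrix copies its elements into the referenced block.
        // The source is assumed not to alias the target matrix.
        SubMatrixRef& operator=(const Matrix& src)
        {
            assert(src.rows() == r_.count && src.cols() == c_.count);
            for (std::size_t i = 0; i < r_.count; ++i)
                for (std::size_t j = 0; j < c_.count; ++j)
                    m_(r_.start + i, c_.start + j) = src(i, j);
            return *this;
        }

    private:
        Matrix& m_;
        Range r_, c_;
    };

    SubMatrixRef operator()(Range r, Range c)
    {
        assert(r.start + r.count <= rows_ && c.start + c.count <= cols_);
        return SubMatrixRef(*this, r, c);
    }

private:
    std::size_t rows_, cols_;
    std::vector<double> data_;
};
```

With these definitions, A(Range{1, 2}, Range{2, 3}) = B; writes a 2-by-3 matrix B into rows 1 to 2 and columns 2 to 4 of A without building any intermediate array of pointers.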

How to index and assign elements in a tensor using identical call signatures?

OK, I've been googling around for too long, I'm just not sure what to call this technique, so I figured it's better to just ask here on SO. Please point me in the right direction if this has an obvious name and/or solution I've overlooked.
For the laymen: a tensor is the logical extension of the matrix, in the same way a matrix is the logical extension of the vector. A vector is a rank-1 tensor (in programming terms, a 1D array of numbers), a matrix is a rank-2 tensor (a 2D array of numbers), and a rank-N tensor is then simply an N-D array of numbers.
Now, suppose I have something like this Tensor class:
template<typename T = double> // possibly also with size parameters
class Tensor
{
private:
T *M; // Tensor data (C-array)
// alternatively, std::vector<T> *M
// or std::array<T> *M
// etc., or possibly their constant-sized versions
// using Tensor<>'s template parameters
public:
... // insert trivial fluffy stuff here
// read elements
const T & operator() (size_t a, size_t b) const {
... // error checks etc.
return M[a + rows*b];
}
// write elements
T & operator() (size_t a, size_t b) {
... // error checks etc.
return M[a + rows*b];
}
...
};
With these definitions of operator()(...), indexing and assigning individual elements then have the same call signature:
Tensor<> B(5,5);
double a = B(3,4); // operator() (size_t,size_t) used to both GET elements
B(3,4) = 5.5; // and SET elements
It is fairly trivial to extend this up to arbitrary tensor rank. But what I'd like to be able to implement is a more high-level way of indexing/assigning elements:
Tensor<> B(5,5);
Tensor<> C = B( Slice(0,4,2), 2 ); // operator() (Slice(),size_t) used to GET elements
B( Slice(0,4,2), 2 ) = C; // and SET elements
// (C is another tensor of the correct dimensions)
I am aware that std::valarray (and many others for that matter) does a very similar thing already, but it's not my objective to just accomplish the behavior; my objective here is to learn how to elegantly, efficiently and safely add the following functionality to my Tensor<> class:
// Indexing/assigning with Tensor<bool>
B( B>0 ) += 1.0;
// Indexing/assigning arbitrary amount of dimensions, each dimension indexed
// with either Tensor<bool>, size_t, Tensor<size_t>, or Slice()
B( Slice(0,2,FINAL), 3, Slice(0,3,FINAL), 4 ) = C;
// double indexing/assignment operation
B(3, Slice(0,4,FINAL))(mask) = C; // [mask] == Tensor<bool>
.. etc.
Note that it's my intention to use operator[] for non-checked versions of operator(). Alternatively, I'll stick more to the std::vector<> approach of using .at() methods for checked versions of operator[]. Anyway, this is a design choice and besides the issue right now.
I've conjured up the following incomplete "solution". This method is only really manageable for vectors/matrices (rank-1 or rank-2 tensors), and has many undesirable side-effects:
// define a simple slice class
class Slice
{
private:
size_t
start, stride, end;
public:
Slice(size_t s, size_t e) : start(s), stride(1), end(e) {}
Slice(size_t s, size_t S, size_t e) : start(s), stride(S), end(e) {}
...
};
template<typename T = double>
class Tensor
{
... // same as before
public:
// define two operators() for use with slices:
// version for retrieving data
const Tensor<T> & operator() (Slice r, size_t c) const {
// use slicing logic to construct return tensor
...
return M;
}
// version for assigning data
Sass operator() (Slice r, size_t c) {
// returns Sass object, defined below
return Sass(*this, r,c);
}
protected:
class Sass
{
friend class Tensor<T>;
private:
Tensor<T>& M;
const Slice &R;
const size_t c;
public:
Sass(Tensor<T> &M, const Slice &R, const size_t c)
: M(M)
, R(R)
, c(c)
{}
operator Tensor<T>() const { return M; }
Tensor<T> & operator= (const Tensor<T> &M2) {
// use R/c to copy contents of M2 into M using the same
// Slice-logic as in "Tensor<T>::operator()(...) const" above
...
return M;
}
}; // end of Sass
}; // end of Tensor
But this just feels wrong...
For each of the indexing/assignment methods outlined above, I'd have to define a separate Tensor<T>::Sass::Sass(...) constructor, a new Tensor<T>::Sass::operator=(...), and a new Tensor<T>::operator()(...) for each and every such operation. Moreover, the Tensor<T>::Sass::operator=(...) would need to contain much of the same stuff that's already in the corresponding Tensor<T>::operator()(...), and making everything suitable for a Tensor<> of arbitrary rank makes this approach quite ugly, way too verbose and more importantly, completely unmanageable.
So, I'm under the impression there is a much more effective approach to all this.
Any suggestions?
First of all I'd like to point out some design issues:
T & operator() (size_t a, size_t b) const;
suggests you can't alter the matrix through this method, because it's const. But you are giving back a non-const reference to a matrix element, so in fact you can alter it. This only compiles because of the raw pointer you are using. I suggest using std::vector instead, which does the memory management for you and will give you an error because vector's const version of operator[] gives a const reference like it should.
Regarding your actual question, I am not sure what the parameters of the Slice constructor should do, nor what a Sass object is meant to be (I am no native speaker, and "Sass" gives me only one translation in the dictionary, meaning something like "impudence", "impertinence").
However, I suppose with a slice you want to create an object that gives access to a subset of a matrix, defined by the slice's parameters.
I would advise against using operator() for every way to access the matrix. op() with two indices to access a given element seems natural. Using a similar operator to get a whole matrix seems less intuitive to me.
Here's an idea: make a Slice class that holds a reference to a Matrix and the necessary parameters that define which part of the Matrix is represented by the Slice. That way a Slice would be something like a proxy to the Matrix subset it defines, similar to a pair of iterators which can be seen as a proxy to a subrange of the container they are pointing to. Give your Matrix a pair of slice() methods (const and nonconst) that give back a Slice/ConstSlice, referencing the Matrix you call the method on. That way, you can even put checks into the method to see if the Slice's parameters make sense for the Matrix it refers to. If it makes sense and is necessary, you can also add a conversion operator, to convert a Slice into a Matrix of its own.
Overloading operator() again and again and using the parameters as a mask, as linear indices and other stuff is more confusing than helping imo. operator() is slick if it does something natural which everybody expects from it. It only obfuscates the code if it is used everywhere. Use named methods instead.
Not an answer, just a note to follow up my comment:
Tensor<bool> T(false);
// T (whatever its rank) contains all false
auto lazy = T(Slice(0,4,2));
// if I use lazy here, it will be all false
T = true;
// now T contains all true
// if I use lazy here, it will be all true
This may be what you want, or it might be unexpected.
In general, this can work cleanly with immutable tensors, but allowing mutation gives the same class of problem as COW strings.
If you allow for your Tensor to implicitly be a double you can return only Tensors from your operator() overload.
operator double() {
return M.size() == 1 ? M[0] : std::numeric_limits<double>::quiet_NaN();
};
That should allow for
double a = B(3,4);
Tensor<> a = B(Slice(1,2,3),4);
Getting operator() to work with multiple overloads for Slice and integer is another issue. I'd probably just use Slice and create another implicit conversion so integers can be Slices, then maybe use the variable-argument ellipsis.
const Tensor<T> & operator() (int numOfDimensions, ...)
The variable-argument route is kind of a kludge, though; it is probably best to just provide 8 overloads taking 1 to 8 Slice parameters.

Own matrix class multiply operator

I wrote an IntegerMatrix class to add my own methods to work with matrices. Now I've written a function like this:
IntegerMatrix** IntegerMatrix::multiplyMatrix(IntegerMatrix** table2)
(It's a double pointer because I'm holding a huge array of pointers to 4x4 2D arrays.) so I simply could do this:
matrix1.multiplyMatrix(matrix2)
One little problem is the * isn't defined for my own class. So I thought to overload this operator that I could do something like this:
sum += this->table[i][k] * table2[k][j];
But how can I get the right i and k in the overloaded operator, which is defined like this:
IntegerMatrix IntegerMatrix::operator*(const IntegerMatrix & k);
The only problem I can't figure out right now is how to get the right values ?
EDIT:
I've rewritten this and now I have:
IntegerMatrix IntegerMatrix::operator*(const IntegerMatrix & table2)
{
int i, j, k;
int sum;
IntegerMatrix * result = new IntegerMatrix(SIZE);
for (i = 0; i < SIZE; i++) {
for (j = 0; j < SIZE; j++) {
sum = 0;
for (k = 0; k < SIZE; k++) {
sum += this->table[i][k] * table2[k][j];
}
result[i][j] = sum;
}
}
return *result;
}
That gives me just an error on the [] :
Binary '[' : 'IntegerMatrix' does not define this operator or a conversion to a type acceptable to the predefined operator.
I don't understand your question, but here's a brief demo of how matrix multiplication normally works:
class IntegerMatrix {
int table[3][3];
public:
IntegerMatrix& operator*=(const IntegerMatrix& rhs) {
//multiply table by rhs.table, store in table.
return *this;
}
};
IntegerMatrix operator*(IntegerMatrix lhs, const IntegerMatrix& rhs)
{return lhs*=rhs;} //lhs is a copy, so we can alter and return it
FOR YOUR EDIT
You have the code
IntegerMatrix * result = new IntegerMatrix(SIZE); //pointer to an IntegerMatrix
...
result[i][j] = sum; //assign sum to the jth index of matrix # i
when in actuality, I presume you wanted
result->table[i][j] = sum; //sum to the ixj index of the result matrix.
Also, your function is leaky, because you have a new, but no delete. This is easy to fix in your case, since you don't need the new. (Are you from a Java or C# background?)
IntegerMatrix result(SIZE);
...
result.table[i][j] = sum;
...
return result;
Unrelated to all of the above, you might actually want to provide a [] operator for your Integer Matrix.
class row {
int* data;
int size;
public:
row(int* d, int s) :data(d), size(s) {}
int& operator[](int offset) {
assert(offset<size);
return data[offset];
}
};
row operator[](int column) {
assert(column<SIZE);
return row(table[column], SIZE);
}
And this would allow you to write:
IntegerMatrix result;
result[i][j] = sum;
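Put together, the row proxy could be wired into the class roughly like this. It is a sketch under assumptions: SIZE is fixed at 4 and the table is a plain member array, neither of which is stated in the question.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size variant of IntegerMatrix; SIZE = 4 is an assumption.
class IntegerMatrix
{
public:
    static const int SIZE = 4;

    // Lightweight view of one row; it does not own the data it points to.
    class row
    {
        int* data;
        int size;
    public:
        row(int* d, int s) : data(d), size(s) {}
        int& operator[](int offset)
        {
            assert(offset >= 0 && offset < size);
            return data[offset];
        }
    };

    IntegerMatrix() : table() {}  // value-initialization zeroes every element

    // The first subscript selects the row; the returned proxy handles the second.
    row operator[](int r)
    {
        assert(r >= 0 && r < SIZE);
        return row(table[r], SIZE);
    }

private:
    int table[SIZE][SIZE];
};
```

The expression result[i][j] then resolves to IntegerMatrix::operator[] followed by row::operator[], and both indices are checked in debug builds.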
You may be carrying over some artifacts, in sort of a Cargo-Cult programming sense. :-/
For instance: I'm guessing that the double indirections (**) on your prototype for multiplyMatrix are there because you saw multidimensional arrays of integers around somewhere...stuff like:
void printMatrix(int ** myMatrix, int rows, int columns);
The double-indirection is just a pointer-to-a-pointer. It's a way of achieving the specific implementation point of passing low-level C-style 2D arrays as parameters. But it's not something you have to tack on any time you're working with an abstract class that happens to represent a Matrix. So once you've encapsulated the matrix size and data itself inside the IntegerMatrix class, you don't want something like this:
void printMatrix(IntegerMatrix ** myMatrix);
More likely you'd want to pass in a simple reference to the class which is encapsulating the data, like this:
void printMatrix(IntegerMatrix const & myMatrix);
You should actually return a new matrix from your multiplication function, at least if you're using it to implement an operator overload...because semantically it does not make sense for people to write things like a * b; and have that modify a. (It can, but you shouldn't.) So you are left with either the choice of returning a matrix value instance:
IntegerMatrix IntegerMatrix::multiplyMatrix(IntegerMatrix const & rhs);
...or returning a pointer to a new object:
IntegerMatrix * IntegerMatrix::multiplyMatrix(IntegerMatrix const & rhs);
Returning by pointer has historically been chosen by many libraries because returning by value from a local variable in the function would involve making a copy at the time of return. Returning a pointer is fast (it "copies" only one 32-bit/64-bit number) while copying an instance of an object and large blocks of data inside it is slow. So a lot of libraries would just use Matrix pointers everywhere...with the problem that it becomes hard to know whose responsibility it is to ultimately delete the object. Smart pointers are one way of ensuring this:
unique_ptr<IntegerMatrix> IntegerMatrix::multiplyMatrix(IntegerMatrix const & rhs);
But C++11 has some sneaky ability to be just as fast without the mess. If you return something by value from a function and the compiler is sure that value isn't going to be used again (since it's going out of scope), then it can be "moved" about as fast as a pointer could. This requires that you support move construction by RValue reference, and there's all kinds of trickiness in that.
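As a rough illustration of returning by value, the sketch below stores the elements in a std::vector, which already provides move construction, so the class needs no hand-written move code. The at() accessors and the square-only layout are assumptions made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical square matrix; std::vector supplies copy and move operations.
class IntegerMatrix
{
public:
    explicit IntegerMatrix(std::size_t n) : n_(n), data_(n * n, 0) {}

    int& at(std::size_t i, std::size_t j) { return data_[i * n_ + j]; }
    int at(std::size_t i, std::size_t j) const { return data_[i * n_ + j]; }

    // Returns the product by value. The local result is either constructed
    // directly in the caller's object or moved, not deep-copied.
    IntegerMatrix multiplyMatrix(const IntegerMatrix& rhs) const
    {
        assert(n_ == rhs.n_);
        IntegerMatrix result(n_);
        for (std::size_t i = 0; i < n_; ++i)
            for (std::size_t j = 0; j < n_; ++j) {
                int sum = 0;
                for (std::size_t k = 0; k < n_; ++k)
                    sum += at(i, k) * rhs.at(k, j);
                result.at(i, j) = sum;
            }
        return result;
    }

private:
    std::size_t n_;
    std::vector<int> data_;
};
```

The caller simply writes IntegerMatrix c = a.multiplyMatrix(b); and never has to decide who deletes the result.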
There's really a lot of nuance. If you're doing this as an educational exercise, I'd suggest taking it slowly and going through a tutorial that walks you through every step instead of jumping straight into the fire. And if you're using low-level C arrays and dynamic allocations inside your matrix, change them to a std::vector of std::vector.
For one IntegerMatrix object you're using this->table[i][k] to refer to the array where you're holding the matrix data, while for the table2 object reference and the result pointer, you're using table2[k][j] and result[i][j].
I think that what you want to do is something like:
IntegerMatrix IntegerMatrix::operator*(const IntegerMatrix & table2)
{
int i, j, k;
int sum;
IntegerMatrix result(SIZE); // local object instead of new, so nothing leaks
for (i = 0; i < SIZE; i++) {
for (j = 0; j < SIZE; j++) {
sum = 0;
for (k = 0; k < SIZE; k++) {
sum += this->table[i][k] * table2.table[k][j];
}
result.table[i][j] = sum;
}
}
return result;
}

Is the book wrong?

template <typename T>
class Table {
public:
Table();
Table(int m, int n);
Table(int m, int n, const T& value);
Table(const Table<T>& rhs);
~Table();
Table<T>& operator=(const Table& rhs);
T& operator()(int i, int j);
int numRows()const;
int numCols()const;
void resize(int m, int n);
void resize(int m, int n, const T& value);
private:
// Make private because this method should only be used
// internally by the class.
void destroy();
private:
int mNumRows;
int mNumCols;
T** mDataMatrix;
};
template <typename T>
void Table<T>::destroy() {
// Does the matrix exist?
if (mDataMatrix) {
for (int i = 0; i < _m; ++i) {
// Does the ith row exist?
if (mDataMatrix[i]) {
// Yes, delete it.
delete[]mDataMatrix[i];
mDataMatrix[i] = 0;
}
}
// Delete the row-array.
delete[] mDataMatrix;
mDataMatrix = 0;
}
mNumRows = 0;
mNumCols = 0;
}
This is a code sample I got from a book. It demonstrates how to destroy or free a 2x2 matrix where mDataMatrix is the pointer to array of pointers.
What I don't understand is this part:
for(int i = 0; i < _m; ++i) {
// Does the ith row exist?
if (mDataMatrix[i]) {
//.….
}
}
I don't know why the book uses _m for the maximum number of row pointers. It isn't even a variable defined in the class; the variable for the number of rows is mNumRows. Maybe it is some compiler-predefined variable? Another thing I am quite confused about is why it is ++i, the prefix operator, and not i++. Will it make a difference if I change it to i++?
Another thing I am quite confused about is why it is ++i, the prefix operator, and not i++. Will it make a difference if I change it to i++?
Because ++i is more natural and easier to understand: increment i and then yield the variable i as a result. i++ on the other hand means copy the current value of i somewhere (let's call it temp), increment i, and then yield the value temp as a result.
Also, for user-defined types, i++ is potentially slower than ++i.
Note that ++i as a loop increment does not imply the increment happens before entering the loop body or something. (This seems to be a common misconception among beginners.) If you're not using ++i or i++ as part of a larger expression, the semantics are exactly the same, because prefix and postfix increment only differ in their result (incremented variable vs. old value), not in their side effect (incrementing the variable).
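For a user-defined type the difference becomes visible in the operator signatures. The Counter type below is hypothetical and exists only to show the two forms side by side.

```cpp
#include <cassert>

// Hypothetical type used only to show the two operator forms.
struct Counter
{
    int value;

    // Prefix form: increment, then return the object itself by reference.
    Counter& operator++()
    {
        ++value;
        return *this;
    }

    // Postfix form (the unused int parameter selects it): copy the current
    // state, increment, then return the old copy by value.
    Counter operator++(int)
    {
        Counter old = *this;
        ++value;
        return old;
    }
};
```

The extra copy in the postfix form is the "temp" described above; for an int the compiler can drop it when the result is unused, but for a heavier type it may not be free.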
Without seeing the entire class code, it is hard to tell for your first question, but if it hasn't been defined as part of the class, my guess would be that it is a typo.
As for your second question, ++i vs. i++: the prefix increment operator (++i) returns the object you are incrementing, whereas the postfix increment operator returns a copy of the object in its original state, i.e.:
int i=1;
std::cout << i++ << std::endl; // output: 1
std::cout << i << std::endl; // output: 2
std::cout << ++i << std::endl; // output: 3
As for whether the code changes with the postfix form: no, it works the same in loops and makes basically no difference for integer types. For user-defined types, however, it may be more efficient to use the prefix increment, and it is the style many C++ programmers use by default.
If the _m variable isn't defined anywhere, this is an error. From the context it looks like it should contain the number of rows that are allocated with new somewhere (probably in the constructor, or there might be methods like addRow). If that number is always mNumRows, then mNumRows would be appropriate for the loop in the destructor.
If you use ++i or i++ in that for loop doesn't make any difference. Both variants increment the integer, and the return value of the expression (that would be different) isn't used anywhere.
I can't speak to the first part of the question, but I can explain the pre- versus post- increment dilemma.
The prefix versions of increment and decrement can be slightly more efficient and are generally preferred. In the end, though, any extra overhead caused by using i++ over ++i is negligible unless the loop is being executed many, many times.
As others have said, the prefix operator is preferred for performance reasons when dealing with user-defined types. The reason it has no impact on the for loop is because the test involving the value of the variable (i.e. i < _m) is performed before the operation that modifies the variable is performed.
The real mess with this book is the way it illustrates a 2x2 matrix. The problem here is that for 4 elements you have 3 blocks of memory allocated, and not only does it slow down the program but it is certainly much trickier to handle.
The usual technique is much simpler:
T* mData = new T[2*2];
And then you access it like so:
T& operator()(size_t r, size_t c) { return mData[r * mNbColumns + c]; }
This is a bit more work (you have to multiply the row index by the number of columns if you are row major), but then the destroy is incredibly easy:
template <class T>
void Table<T>::destroy()
{
delete[] mData;
mData = 0;
mNbRows = 0;
mNbColumns = 0;
}
Also note that there is no need for an if here: it's fine to call delete[] on a null pointer; it just doesn't do anything.
Finally, I have no idea why your book is using int for coordinates. Do negative coordinates have any meaning in the context of this Table class? If not, you're better off using an unsigned integral type (like size_t) and throwing the book away.