Static pointer to member array for safe operator[] access - c++

I was looking at the source code of boost::gil and I came across this comment and corresponding code in the 2D point class.
const T& operator[](std::size_t i) const { return this->*mem_array[i]; }
T& operator[](std::size_t i) { return this->*mem_array[i]; }
...
private:
// this static array of pointers to member variables makes operator[]
// safe and doesn't seem to exhibit any performance penalty
static T point2<T>::* const mem_array[num_dimensions];
http://www.boost.org/doc/libs/develop/boost/gil/utilities.hpp
Questions:
What does this do exactly?
How does this make operator[] safe?

The definition of the array is relevant – it is
template <typename T>
T point2<T>::* const point2<T>::mem_array[point2<T>::num_dimensions]
= { &point2<T>::x, &point2<T>::y };
The indirection through a pointer-to-member makes it possible to access the x coordinate of a point p as either p.x or p[0], and similarly for p.y and p[1].
This is otherwise sometimes accomplished through (probably undefined) pointer trickery or a (possibly less efficient) branch on the index.
It is of course not absolutely safe since there is no bounds-checking, but it's safe in the sense of being standards-compliant and well-defined.
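To make the mechanism concrete, here is a minimal sketch of the same idea. Point2 is a made-up stand-in, not the actual boost::gil class, and the assert is my own addition to make the missing bounds check visible in debug builds.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a 2D point; not the actual boost::gil class.
template <typename T>
struct Point2 {
    static const std::size_t num_dimensions = 2;

    T x;
    T y;

    // Index 0 selects x, index 1 selects y.
    const T& operator[](std::size_t i) const {
        assert(i < num_dimensions);  // added here; the original has no check
        return this->*mem_array[i];
    }
    T& operator[](std::size_t i) {
        assert(i < num_dimensions);
        return this->*mem_array[i];
    }

private:
    // Table of pointers to data members, shared by all Point2<T> objects.
    static T Point2<T>::* const mem_array[num_dimensions];
};

// Out-of-class definition of the static table.
template <typename T>
T Point2<T>::* const Point2<T>::mem_array[Point2<T>::num_dimensions] = {
    &Point2<T>::x, &Point2<T>::y
};
```

With this, p[0] and p.x name the same object, and the lookup never has to assume that x and y are laid out like an array.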

Related

c++: Is it valid to subtract an iterator from an element pointer to get a valid index?

I came across the following code:
for (int i = 0; i < subspan.size(); i++) {
...
int size = size_table[&(subspan[i]) - fullspan.begin()];
...
}
subspan and fullspan are both of type std::span (actually absl::Span from Google's Abseil library, but they seem to be pretty much the same as std::span) and are views into the same data array (with fullspan spanning the entire array).
Is this valid and well defined code? It seems to depend on the iterator being converted to the corresponding pointer value when the - operator is applied together with a lhs pointer.
Is it valid to subtract an iterator from an element pointer to get a valid index?
It could be, depending on how the iterator is defined. For example, it works if the iterator is a pointer of the same type, and points to an element of the same array.
However, no generic iterator concept specifies such operation, and so such operation isn't guaranteed to work with any standard iterator. Hence, it's not a portable assumption that it would work in generic code.
Is this valid and well defined code?
The iterator type in question is defined to be the pointer type, so that condition is satisfied. Abseil is neither thoroughly documented nor specified, so it's hard to say whether that's an intentional feature or an incidental implementation detail. If it's the latter, then the code may break in future versions of Abseil.
Reading the implementation of absl::Span, we have:
template <typename T>
class Span {
...
public:
using element_type = T;
using pointer = T*;
using const_pointer = const T*;
using reference = T&;
...
using iterator = pointer;
...
constexpr iterator begin() const noexcept { return data(); }
constexpr reference operator[](size_type i) const noexcept { return *(data() + i); }
...
}
So your expression boils down to plain pointer arithmetic.
Note that there is no check on whether both spans refer to the same underlying array, but you stated that they do.
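If you would rather not depend on the iterator type being a pointer, one option is to subtract two pointers explicitly. The sketch below uses std::vector and a made-up helper named index_of purely for illustration; the same-array precondition is an assumption the caller has to guarantee, just as in the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: index of `elem` within the contiguous array starting
// at `base`. Both operands are plain pointers, so this is ordinary pointer
// subtraction and does not depend on what the container's iterator type is.
// Precondition (not checked): `elem` is an element of that same array.
template <typename T>
std::ptrdiff_t index_of(const T* base, const T& elem) {
    return &elem - base;
}

// Example: position within `full` of the i-th element of a sub-range that
// starts at offset `sub_offset`.
inline std::ptrdiff_t example(const std::vector<int>& full,
                              std::size_t sub_offset, std::size_t i) {
    assert(sub_offset + i < full.size());
    const int* sub = full.data() + sub_offset;  // view into the same array
    return index_of(full.data(), sub[i]);
}
```

For a span type, the equivalent would be to use its data() member instead of begin() on the right-hand side of the subtraction.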

Trying to understand the access operator of a Matrix Multiplication in C++

I am trying to understand the access operator for a Matrix Multiplication.
template<typename T>
class Matrix44
{
public:
Matrix44() {}
// The next two lines are totally confusing for me
const T* operator [] (uint8_t i) const { return m[i]; }
T* operator [] (uint8_t i) { return m[i]; }
// initialize the coefficients of the matrix with the coefficients of the identity matrix
T m[4][4] = {{1,0,0,0},{0,1,0,0},{0,0,1,0},{0,0,0,1}}; // Why can you do this m[4][4]
};
typedef Matrix44<float> Matrix44f;
So what I understood is that they defined an own access operator for accessing the matrix indices:
Matrix44f mat;
mat[0][3] = 1.f;
But how does that relate to their definition
...
const T* operator [] (uint8_t i) const { return m[i]; }
T* operator [] (uint8_t i) { return m[i]; }
Thank you very much for helping a C++ noob <3
Source: https://www.scratchapixel.com/lessons/mathematics-physics-for-computer-graphics/geometry/matrices
As quite a few people have already pointed out, it is quite a poor matrix implementation: it lets you make assumptions about the internal implementation and has quite a few flaws. But I would be lying if I said I had not already seen implementations like this in research codes. ;) Instead of bashing the implementation, I would like to briefly point out how it works, as it nonetheless shows a few particularities of C++.
Overview
// Class for a generic data type T (e.g. T = float)
template<typename T>
class Matrix44 {
public:
// Constructor
Matrix44() {
return;
}
// Access operator for constant objects
const T* operator [] (uint8_t i) const {
return m[i];
}
// Access operator for non-constant objects
T* operator [] (uint8_t i) {
return m[i];
}
// Declaration of a stack-allocated array as class member
T m[4][4]
// Initialisation of this class member
= {{1,0,0,0},{0,1,0,0},{0,0,1,0},{0,0,0,1}};
};
// Alias Matrix44f for a matrix of floats (T = float)
typedef Matrix44<float> Matrix44f;
Stack allocated arrays
In C++, unlike in many other programming languages, one can allocate arrays on the stack as well as on the heap. A stack-allocated one-dimensional array (of integers) can be declared as
int v[3];
while
int m[2][3];
would be the declaration of a multidimensional (in this case two-dimensional) row-major array which can be initialised with
int m[2][3] = {{1,2,3},{4,5,6}};
(similar to how you could initialise a vector of vectors with an initialiser list) and whose elements can then be accessed with a similar syntax m[i][j] (also similar to a vector of vectors std::vector<std::vector<T>>, Watch out: No out-of-bound checks!).
As you can see, your class is only a wrapper for such a stack-allocated two-dimensional array of a generic type T,
T m[4][4];
a so-called template class. Instantiating the class as Matrix44<float> makes it a matrix of T = float, while for a matrix Matrix44<int> the underlying data type would be int. With typedef Matrix44<float> Matrix44f at the bottom, a new alias for this data type was created, so one can declare a variable Matrix44f mat.
Stack-allocated means that its size is quite limited compared to a heap-allocated array, which is hardly a limitation for a 4x4 array though (but why would you not extend this implementation at least to NxN?). Furthermore, the way it is written it does not even have a proper constructor (Matrix44() {}) but is instead default-initialised with an identity matrix
{{1,0,0,0},{0,1,0,0},{0,0,1,0},{0,0,0,1}}
(why would somebody do that?).
Operator []
C++ gives you the possibility to overload operators. So one can define (overload) basic operations such as addition operator +, multiplication operator *, comparisons etc. Similarly one can also overload [] (before C++23 only with a single argument, which is why numerical libraries such as Eigen opt to overload the () operator instead), () and ,. But there is no such thing as a [][] overload! To allow matrix access from outside the class with [][], similar to the stack-allocated array, the class defines an operator [] that returns a pointer to the first element of row i of the underlying two-dimensional array: return m[i] is equivalent to return &m[i][0]. One can then apply the built-in operator [] of the pointer, where p[j] is equivalent to *(p+j), to move this pointer by j entries and access the element m[i][j]. (This is sort of similar to using a vector of vectors instead of a stack-allocated array, returning a std::vector<T>& from the operator [] and then using the [] operator of the vector to access the precise element.)
Member functions can be declared constant (see the const after the argument list) which means they can also be called for constant objects while if not declared constant they can only be called for non-constant objects. For constant objects the constant version is called while for non-constant objects the non-constant implementation is called (const overloading). In order to allow assignments the non-constant version
T* operator [] (uint8_t i) {
return m[i];
}
returns a non-constant pointer T* which can be modified while the constant implementation returns a pointer to a constant variable const T*.
const T* operator [] (uint8_t i) const {
return m[i];
}
This is the reason why there are two of these implementations. For a constant object the latter one will be called which does not allow assignment (but you can read the element 1, 2 with m[1][2]) while for a non-constant one the first one which allows assignment (e.g. m[1][2] = 1 will set the element 1, 2 to 1).
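To see the two steps of mat[i][j] separately, here is a small sketch. Grid and read_element are names I made up for the illustration; the operator[] pair follows the same pattern as in your class.

```cpp
#include <cassert>
#include <cstddef>

// Minimal 2x3 wrapper mirroring the pattern above (names are illustrative).
class Grid {
public:
    const int* operator[](std::size_t i) const { return m[i]; }  // read-only row
    int* operator[](std::size_t i) { return m[i]; }              // writable row

    int m[2][3] = {{1, 2, 3}, {4, 5, 6}};
};

// Spells out g[i][j] step by step.
inline int read_element(const Grid& g, std::size_t i, std::size_t j) {
    const int* row = g.operator[](i);  // const overload: pointer to m[i][0]
    assert(row == &g.m[i][0]);
    return *(row + j);                 // same as row[j], i.e. m[i][j]
}
```

On a non-const Grid, g[0][1] = 42 picks the second overload, so the int* it returns can be written through; on a const Grid the same assignment would not compile.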
First, remember that the only value of this class is to give value semantics to a C-array and that is not yet achieved with the shown code. (Sort of what std::array<T, N> does.)
Besides that, this is a bad implementation that relies on pointer decay: the reference to the 4-element array (once you take the first index) decays into a pointer in your implementation.
At best the decay loses useful type information.
So, as it is, m[i] returns a reference to a C-array and it is later decayed into a pointer by the return type of operator[] of the class.
A better implementation could simply be decltype(auto) operator[](int i){return m[i];}.
If you want a more explicit description of what is going on you can do
using reference = T(&)[4]; // reference to a 4-element subarray
using const_reference = const T(&)[4]; // reference to a read-only 4-element subarray
const_reference operator[](int i) const { return m[i]; }
reference operator[](int i) { return m[i]; }
https://godbolt.org/z/EjbWY691x
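As a rough illustration of what "useful type information" means here: when operator[] returns a reference to the row instead of a decayed pointer, the row length stays visible to the compiler. Matrix44Ref, extent_of and row_sum below are hypothetical names for this sketch only, and the assert is my own addition.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: recovers the length of a C array passed by reference.
template <typename T, std::size_t N>
constexpr std::size_t extent_of(T (&)[N]) { return N; }

template <typename T>
class Matrix44Ref {
public:
    using row_ref = T (&)[4];              // reference to a row
    using const_row_ref = const T (&)[4];  // reference to a read-only row

    const_row_ref operator[](std::size_t i) const {
        assert(i < 4);
        return m[i];
    }
    row_ref operator[](std::size_t i) {
        assert(i < 4);
        return m[i];
    }

    T m[4][4] = {{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}};
};

// Sums a row with a range-for, which works because the row is still an array.
template <typename T>
T row_sum(const Matrix44Ref<T>& mat, std::size_t i) {
    T total = T();
    for (const T& value : mat[i]) total += value;
    return total;
}
```

With the pointer-returning version neither extent_of(mat[0]) nor the range-for would compile, because a T* carries no length.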
At this point, it might be better to rely on an existing good wrapper and simply say
template<class T> using Matrix44 = std::array<std::array<T, 4>, 4>;
If the initialization to identity is important (not a good idea IMO)...
template<typename T>
class Matrix44
{
public:
Matrix44() {}
// The next two lines are totally confusing for me
decltype(auto) operator [] (uint8_t i) const { return m[i]; }
decltype(auto) operator [] (uint8_t i) { return m[i]; }
// initialize the coefficients of the matrix with the coefficients of the identity matrix
std::array<std::array<T, 4>, 4> m = {{{1,0,0,0},{0,1,0,0},{0,0,1,0},{0,0,0,1}}}; // extra braces needed for nested std::array
};
typedef Matrix44<float> Matrix44f;
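For the alias-based variant, a sketch of how the identity could be produced explicitly rather than baked into a default constructor. The functions identity and element are my own hypothetical helpers, not part of any library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

template <class T>
using Matrix44 = std::array<std::array<T, 4>, 4>;

// Hypothetical helper: builds an identity matrix explicitly.
template <class T>
Matrix44<T> identity() {
    Matrix44<T> result{};  // value-initialised: every element is T()
    for (std::size_t i = 0; i < 4; ++i) result[i][i] = T(1);
    return result;
}

// Hypothetical accessor with a debug-only bounds check, since
// std::array::operator[] itself does not check the index.
template <class T>
T& element(Matrix44<T>& m, std::size_t row, std::size_t col) {
    assert(row < 4 && col < 4);
    return m[row][col];
}
```

Because std::array has value semantics, copying such a matrix copies all 16 elements and == compares them element-wise, which is the behaviour the raw C array member does not give you on its own.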

What is the design rationale behind the resize method of std::vector?

Many methods within the template class vector take a const reference to value_type objects, for instance:
void push_back (const value_type& val);
while resize takes its value_type parameter by value:
void resize (size_type n, value_type val = value_type());
As a non-expert C++ programmer I can only think of disadvantages with this choice (for instance, if sizeof(value_type) is big enough, stack overflow may occur). What I would like to ask people with more insight into the language is thus:
What is the design rationale behind this choice?
void resize( size_type count, T value = T() );
This signature was removed in C++11.
C++11 has two overloads of resize():
void resize( size_type count );
void resize( size_type count, const value_type& value);
which is pretty much straightforward to understand. The first one uses default constructed objects of type value_type to fill vector when resizing, the second takes a value from which it makes copies when resizing.
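A small sketch of the two overloads in use. grow_default and grow_with are names invented for this example, and the asserts only restate the postcondition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Grows `v` by `extra` value-initialised elements (first overload).
inline void grow_default(std::vector<std::string>& v, std::size_t extra) {
    const std::size_t old_size = v.size();
    v.resize(old_size + extra);
    assert(v.size() == old_size + extra);
}

// Grows `v` by `extra` copies of `fill` (second overload, const reference).
inline void grow_with(std::vector<std::string>& v, std::size_t extra,
                      const std::string& fill) {
    const std::size_t old_size = v.size();
    v.resize(old_size + extra, fill);
    assert(v.size() == old_size + extra);
}
```

For std::string the new elements from the first overload are empty strings; shrinking with either overload simply drops elements from the end.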
This seems to have been a design defect, and it has since been fixed.
Quoting from C++ Standard Library defect report 679:
The C++98 standard specifies that one member function alone of the containers passes its parameter (T) by value instead of by const reference:
void resize(size_type sz, T c = T());
This fact has been discussed / debated repeatedly over the years, the first time being even before C++98 was ratified. The rationale for passing this parameter by value has been:
So that self referencing statements are guaranteed to work, for example:
v.resize(v.size() + 1, v[0]);
However this rationale is not convincing as the signature for push_back is:
void push_back(const T& x);
And push_back has similar semantics to resize (append). And push_back must also work in the self referencing case:
v.push_back(v[0]); // must work
The problem with passing T by value is that it can be significantly more expensive than passing by reference. The converse is also true, however when it is true it is usually far less dramatic (e.g. for scalar types).
Even with move semantics available, passing this parameter by value can be expensive. Consider for example vector<vector<int>>:
std::vector<int> x(1000);
std::vector<std::vector<int>> v;
...
v.resize(v.size()+1, x);
In the pass-by-value case, x is copied once to the parameter of resize. And then internally, since the code can not know at compile time by how much resize is growing the vector, x is usually copied (not moved) a second time from resize's parameter into its proper place within the vector.
With pass-by-const-reference, the x in the above example need be copied only once. In this case, x has an expensive copy constructor and so any copies that can be saved represents a significant savings.
If we can be efficient for push_back, we should be efficient for resize as well. The resize taking a reference parameter has been coded and shipped in the CodeWarrior library with no reports of problems which I am aware of.
Proposed resolution:
Change 23.3.3 [deque], p2:
class deque {
...
void resize(size_type sz, const T& c);
Change 23.3.3.3 [deque.capacity], p3:
void resize(size_type sz, const T& c);
Change 23.3.5 [list], p2:
class list {
...
void resize(size_type sz, const T& c);
Change 23.3.5.3 [list.capacity], p3:
void resize(size_type sz, const T& c);
Change 23.3.6 [vector], p2:
class vector {
...
void resize(size_type sz, const T& c);
Change 23.3.6.3 [vector.capacity], p11:
void resize(size_type sz, const T& c);
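The self-referencing case from the quoted issue can be tried out with the const-reference overload. This is only a sketch: repeat_front is a made-up helper, and it assumes the vector is non-empty. As the issue argues, a conforming implementation is expected to cope with the argument referring into the vector being resized, just as it must for push_back.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends `extra` copies of the current first element. The argument v[0]
// refers into the vector being resized, which may reallocate.
inline void repeat_front(std::vector<std::vector<int>>& v, std::size_t extra) {
    assert(!v.empty());  // v[0] must exist
    v.resize(v.size() + extra, v[0]);
}
```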

Class storing pointers to vector in C++

I'm implementing a class Aviary, which can store pointers to Bird-objects. Now, I have the following:
class Aviary {
public:
const Bird &operator[](const size_t index) const {
return birds[index];
}
Bird &operator[](const size_t index) {
return birds[index];
}
private:
std::vector<Bird*> birds;
The Bird-objects are stored as pointers in order to avoid object-slicing. However, there is a problem with the operator[]-implementation (Reference to type 'const Bird' could not bind to an lvalue of 'const value_type' (aka 'Bird *const')).
How do I implement the operator[] properly?
Since you store pointers, you should dereference the pointer to return a reference.
const Bird &operator[](const size_t index) const {
return *birds[index];
}
Bird &operator[](const size_t index) {
return *birds[index];
}
Side note: use smart pointers, instead of raw pointers.
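A sketch of what the smart-pointer variant might look like. The Bird hierarchy and the add member are assumptions made up to keep the example complete; your real classes will differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical Bird hierarchy, only here to make the sketch complete.
struct Bird {
    virtual ~Bird() = default;
    virtual std::string sound() const { return "tweet"; }
};
struct Duck : Bird {
    std::string sound() const override { return "quack"; }
};

class Aviary {
public:
    void add(std::unique_ptr<Bird> bird) {
        assert(bird);  // a null bird could not be dereferenced later
        birds.push_back(std::move(bird));
    }

    // Dereference the stored pointer to hand out a reference to the object.
    const Bird& operator[](std::size_t index) const { return *birds[index]; }
    Bird& operator[](std::size_t index) { return *birds[index]; }

private:
    std::vector<std::unique_ptr<Bird>> birds;  // owns the birds, no slicing
};
```

The vector now owns the birds and deletes them when the Aviary is destroyed, while callers still see plain Bird& and virtual dispatch keeps working.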
Two side notes:
The const on a parameter passed by value (const size_t index) is not part of the function's signature, so callers are unaffected by it; it only prevents the function body from modifying its own copy. You can try declaring it with const and removing the const in the implementation: the compiler will correctly consider that your implementation matches the declaration.
The canonical way to implement the non-const version of operator[] is as follows:
Bird &operator[](size_t index) {
return const_cast<Bird&>(const_cast<const Aviary*>(this)->operator[](index));
}
I know all those const_cast look ugly, but they are both safe, and this is the right way to ensure that both versions of operator[] do the same (you just need to maintain the const version from now on), while also making sure that you are not doing any non-const operation in the const version.
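Both side notes in one compilable sketch. Samples is a hypothetical class standing in for Aviary, and I used static_cast to add the const, which has the same effect as the const_cast on this.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Samples {
public:
    explicit Samples(std::size_t n) : values(n, 0) {}

    // Declared without const on the by-value parameter...
    const int& operator[](std::size_t index) const;
    int& operator[](std::size_t index);

private:
    std::vector<int> values;
};

// ...and defined with it: both spellings name the same function, the const
// only stops the body from modifying its local copy of `index`.
const int& Samples::operator[](const std::size_t index) const {
    assert(index < values.size());
    return values[index];
}

// Non-const version forwards to the const one. Casting the const away is
// fine here because *this is known to be non-const in this overload.
int& Samples::operator[](const std::size_t index) {
    return const_cast<int&>(static_cast<const Samples&>(*this)[index]);
}
```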
Apart from that, the problem with your code is that you are returning pointers, not (references to) the values pointed by them, as Luchian and ForEveR have already pointed out.
You need to dereference:
return *(birds[index]);
birds[index] is a Bird*, so you can't directly return it as a Bird&.

Overloading subscript operator []

Why does it have to be a member function of a class, and is it good to return a reference to a private member?
class X
{
public:
int& operator[] (const size_t);
const int &operator[] (const size_t) const;
private:
static std::vector<int> data;
};
int v[] = {0, 1, 2, 3, 4, 5};
std::vector<int> X::data(v, v+6);
int& X::operator[] (const size_t index)
{
return data[index];
}
const int& X::operator[] (const size_t index) const
{
return data[index];
}
As to why is it required to have [] as a member, you can read this question (by yours sincerely). Seems it's just the way it is with no really really convincing explanation.
As to why return reference?
Because you want to provide a way not only to read, but also (for non-const objects) to modify the data. If the return type weren't a reference (or some proxy),
v[i] = 4;
wouldn't work.
HTH
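The difference can be checked at compile time. ByReference and ByValue are two made-up wrappers that differ only in the return type of operator[]; the static_asserts express "x[i] = 4 compiles" for each.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Two hypothetical wrappers that differ only in what operator[] returns.
struct ByReference {
    std::vector<int> data = std::vector<int>(3, 0);
    int& operator[](std::size_t i) {
        assert(i < data.size());
        return data[i];
    }
};
struct ByValue {
    std::vector<int> data = std::vector<int>(3, 0);
    int operator[](std::size_t i) {
        assert(i < data.size());
        return data[i];
    }
};

// x[i] = 4 compiles only when x[i] is something assignable.
static_assert(
    std::is_assignable<decltype(std::declval<ByReference&>()[0]), int>::value,
    "int& can be assigned to");
static_assert(
    !std::is_assignable<decltype(std::declval<ByValue&>()[0]), int>::value,
    "a temporary int cannot be assigned to");
```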
It needs to be a member function according to 13.5.5:
operator[] shall be a non-static member function with exactly one parameter. It implements the subscripting syntax
A reference to a private member is completely OK and pretty common. You hide the details from the user of your class, but still provide the functionality you need (ability to modify individual elements)
Your data variable likely shouldn't be static though, unless you really want to share it among all instances of your class
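To show what the static changes, here is a sketch with two hypothetical variants of the class, one with per-object storage and one keeping the static member. The names and the size of 6 are chosen only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical variant of X with per-object storage.
class PerInstance {
public:
    PerInstance() : data(6, 0) {}
    int& operator[](std::size_t i) {
        assert(i < data.size());
        return data[i];
    }
    const int& operator[](std::size_t i) const { return data[i]; }

private:
    std::vector<int> data;  // one vector per object
};

// Hypothetical variant that keeps the static member from the question.
class Shared {
public:
    int& operator[](std::size_t i) { return data[i]; }
    const int& operator[](std::size_t i) const { return data[i]; }

private:
    static std::vector<int> data;  // one vector for all objects
};
std::vector<int> Shared::data(6, 0);
```

Writing through one Shared object is visible through every other Shared object, whereas two PerInstance objects stay independent.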
For the first question, it is just the way they decided it had to be, i.e. you can't do:
T operator[]( const X &, size_t );
as an external function.
And yes, you are fine returning a reference to a private member: non-const if you allow your users to write there, const otherwise.
In your example though data is static, which does not make sense if that is the source for what it returns.
What would the syntax be for calling a non-member operator[]? Any syntax for that would be awkward. operator[] takes one parameter within the [ and ], and that is usually an index or some kind of data necessary to find an object.
Also, yes, it is a good idea to return a reference, even if it's to a private member. That is exactly what STL vectors do, and just about any other class I can think of that I've ever used that provides operator[]. It is advisable to stick to that usage.