Let's say I have something like the following method in my container class:
Datatype& operator[](const unsigned int Index) // I know this should use size_t instead.
{
return *(BasePointer + Index); // Where BasePointer is the start of the array.
}
I'd like to implement some sort of bounds-checking for the MyInstance[Index] = Value usage so the container resizes automatically if the user tries to change a value outside its range. However, I want something else to happen if the user tries to access a value outside the container's range, e.g. MyVariable = MyInstance[Index]. How can I detect how operator[] is being used?
Sketch:
return a proxy object instead of the actual data entry. The proxy object then defines operator = to handle the assignment case, and an implicit conversion operator for the reading-out case.
template <typename T>
class AccessorProxy {
friend class Container<T>;
public:
AccessorProxy(Container<T>& data, unsigned index)
: data(data), index(index) { }
void operator =(T const& new_value) {
// Expand array.
}
operator const T&() const {
// Do bounds check.
return *(data.inner_array + index);
}
private:
AccessorProxy(const AccessorProxy& rhs)
: data(rhs.data), index(rhs.index) {}
AccessorProxy& operator=(const AccessorProxy&);
Container<T>& data;
unsigned index;
};
template <typename T>
class ConstAccessorProxy {
friend class Container<T>;
public:
ConstAccessorProxy(const Container<T>& data, unsigned index)
: data(data), index(index) { }
operator const T&() const {
// Do bounds check.
return *(data.inner_array + index);
}
private:
ConstAccessorProxy(const ConstAccessorProxy& rhs)
: data(rhs.data), index(rhs.index) {}
ConstAccessorProxy& operator=(const ConstAccessorProxy&);
const Container<T>& data;
unsigned index;
};
AccessorProxy<Datatype> operator[](const unsigned int Index)
{
return AccessorProxy<Datatype>(*this, Index);
}
ConstAccessorProxy<Datatype> operator[] const (const unsigned int Index)
{
return ConstAccessorProxy<Datatype>(*this, Index);
}
The accessor classes will likely need to be be friends of the container class.
Finding ways to avoid the code duplication is left as an exercise to the reader. :)
Use a dummy class type to represent expressions like MyInstance[Index] and delay figuring out what to do until that expression is used.
class MyContainer {
private:
class IndexExpr {
public:
// Get data from container:
operator const Datatype&() const;
// Expand container if necessary, then store data:
Datatype& operator=(const Datatype& value);
// Treat MyInstance[i] = MyInstance[j]; as expected:
Datatype& operator=(const IndexExpr& rhs)
{ return *this = static_cast<const Datatype&>(rhs); }
private:
IndexExpr(MyContainer& cont, unsigned int ind);
MyContainer& container_;
unsigned int index_;
friend class MyContainer;
};
public:
IndexExpr operator[](unsigned int Index)
{ return IndexExpr(*this, Index); }
// No IndexExpr needed when container is const:
const Datatype& operator[](unsigned int Index) const;
// ...
};
This is not a perfect answer to "how to detect", but, if the user is accessing the operator[] via a const instance, then throw an exception if the index is out of bounds.
i.e.
Datatype const& operator[]() const { .. // don't modify here, throw exception
However, if the user is accessing the instance via a non const instance, then by all means expand if the index is out of bounds (and within your acceptable ranges)
Datatype& operator[]() { .. // modify here
Basically, you are using the const attribute of the instance to determine what your semantics would be (as done in std::map - i.e. trying to call operator[] on a const instance of a map results in a compiler error - i.e. there is no const qualified operator[] for map, because the function is guaranteed to create a mapping if the key does not exist already.)
Related
Edit: MyClass has been renamed to ReverseStringAccess for disambiguation.
I have a class which encapsulates a vector<string>. The class has an overloaded operator[] which can be used to read and modify the contents of the vector. This is how it looks (minimally):
class ReverseStringAccess {
public:
ReverseStringAccess() {}
ReverseStringAccess(vector<string> _arr) arr(_arr) {}
string& operator[](int index) {
return arr[index];
}
private:
vector<string> arr;
};
Now I need to be able to modify the contents of each string in the vector without directly accessing the vector (i.e. some sort of operator[][] which only works with vectors that are members of this class). The problem is that using ReverseStringAccess[][] will result in the default behavior of operator[] on strings. For example, this statement:
ReverseStringAccess[i][j]
would give the jth character of the ith string in the vector. But I want it to (for example) instead get the (length - j - 1)th character of the ith string in the vector, where length is the length of the ith string.
Is this possible? If yes, how?
If you want my_object[i][k] to not invoke std::string::operator[] then don't return a std::string& from MyClass::operator[], but a proxy that implements the desired []:
class MyClass {
public:
MyClass() {}
MyClass(vector<string> _arr) arr(_arr) {}
struct MyProxy {
std::string& str;
char& operator[](size_t index) { /.../ }
const char& operator[](size_t index) const { /.../ }
};
MyProxy operator[](int index) {
return {arr[index]};
}
private:
vector<string> arr;
};
The question is how to achieve reverse indexing on members of vector<string> via [][].
This can be achieved via a wrapper class for std::string:
template<typename T>
class reverse_access {
T& ref;
public:
// create wrapper given reference to T
reverse_access(T& ref_)
: ref(ref_)
{}
// reverse access operator []
auto operator[](size_t i) -> decltype(ref[0])
{ return ref[ ref.size() - 1 - i]; }
auto operator[](size_t i) const -> decltype(ref[0])
{ return ref[ ref.size() - 1 - i]; }
// allow implicit conversion to std::string reference if needed
operator T&() { return ref; }
operator const T&() const { return ref; }
};
// inside MyClass:
reverse_access<string> operator[](int index) {
return reverse_access<string>(arr[index]);
}
reverse_access<const string> operator[](int index) const {
return reverse_access<const string>(arr[index]);
}
And use return reverse_access<std::string>(arr[index]); in operator[] in MyClass.
The added conversion operator allows implicit conversion to a reference to the original type.
Another option is to use a different operator that can take 2 arguments: myclass(vec_index,str_index):
string::reference operator()(size_t vec_index, size_t str_index)
{
return arr[vec_index][ arr[vec_index].size() - str_index ];
}
string::const_reference operator()(size_t vec_index, size_t str_index) const
{
return arr[vec_index][ arr[vec_index].size() - str_index ];
}
A third solution is to reverse all strings on construction of ReverseStringAccess:
ReverseStringAccess(vector<string> _arr) arr(_arr)
{
for (auto& str : arr)
std::reverse(str.begin(), str.end());
}
Suppose I have a class that implements operator[], e.g.:
class Array {
public:
Array(size_t s) : data(new int[s]) {}
~Array() { delete[] data; }
int& operator[](size_t index) {
return data[index];
}
private:
int* data;
};
Is there a way to create a random access iterator from the class without having to manually create the iterator class and all its methods? I could manually define the class as follows:
class ArrayIterator : public std::iterator<std::random_access_iterator_tag, int> {
public:
ArrayIterator(Array& a) : arr(a), index(0) {}
reference operator[](difference_type i) {
return arr[index + i];
}
ArrayIterator& operator+=(difference_type i) {
index += i;
return *this;
}
bool operator==(const ArrayIterator& rhs) {
return &arr == &rhs.arr && index == rhs.index;
}
// More methods here...
private:
Array& arr;
difference_type index;
};
But doing so is time consuming since there are so many methods to implement, and each iterator for a class with operator[] would have the exact same logic. It seems it would be possible for the compiler to do this automatically, so is there a way to avoid implementing the entire iterator?
Is there a way to create a random access iterator from the class without having to manually create the iterator class and all its methods?
The simplest way to create a random-access iterator is to just use a raw pointer, which satisfies all of the requirements of the RandomAccessIterator concept (the STL even provides a default template specialization of std::iterator_traits for raw pointers), eg:
class Array {
public:
Array(size_t s) : data(new int[s]), dataSize(s) {}
~Array() { delete[] data; }
int& operator[](size_t index) {
return data[index];
}
size_t size() const { return dataSize; }
int* begin() { return data; }
int* end() { return data+dataSize; }
const int* cbegin() const { return data; }
const int* cend() const { return data+dataSize; }
private:
int* data;
size_t dataSize;
};
Implementing the random access operator by using operator[] may work, but it can be very inefficient and that's why compilers dont do that automatically. Just imagine adding operator[] to a class like std::list, where "going to element i" may take up to i steps. Incrementing an iterator based on operator[] would then have complexity O(n), where n is the size of the list. However, users of random access iterators, expect a certain efficiency, typically O(1).
How to avoid dangling reference in array subscription operator in some vector implementation below? If realloc changes the pointer then references previously obtained from operator[] are no longer valid. I cannot use new/delete for this. I have to use malloc/realloc/free.
template <class T>
class Vector
{
public:
typedef size_t size_type_;
...
T& operator[] (const size_type_);
void push_back (const T&);
...
private:
const size_type_ page_size_;
size_type_ size_;
size_type_ capacity_;
T* buffer_;
};
template<class T>
inline
T&
some_namespace::Vector<T>::operator[] (const size_type_ index)
{
ASSERT(index < size_);
return buffer_[index];
}
template<class T>
inline
void
some_namespace::Vector<T>::push_back(const T& val)
{
if (size_ >= capacity_)
{
capacity_ += page_size_;
T* temp = (T*)(realloc(buffer_, capacity_*sizeof(T)));
if (temp != NULL)
{
buffer_ = temp;
}
else
{
free(buffer_);
throw some_namespace::UnexpectedEvent();
}
}
buffer_[size_++] = val;
}
By the way, the source of dangling reference in the code was this:
v_.push_back(v_[id]);
where v_ is an instance of Vector. To guard against this the new push_back is:
template<class T>
inline
void
some_namespace::Vector<T>::push_back(const T& val)
{
if (size_ >= capacity_)
{
const T val_temp = val; // copy val since it may come from the buffer_
capacity_ += page_size_;
T* temp = (T*)(realloc(buffer_, capacity_*sizeof(T)));
if (temp != NULL)
{
buffer_ = temp;
}
else
{
free(buffer_);
throw some_namespace::UnexpectedEvent();
}
buffer_[size_++] = val_temp;
}
else
{
buffer_[size_++] = val;
}
}
There are basically three things you can do:
Accept it as is and document it. This is what the STL does and it's actually quite reasonable.
Don't return a reference, return a copy. Of course, this means you can't assign to the vector's element anymore (v[42] = "Hello"). In that case you want a single operator[] which is marked const.
Return a proxy object which stores a reference to the vector itself and the index. You add an assignment to the proxy from const T& to allow writing to the vector's element and you need to provide some way to access/read the value, most likely an implicit operator const T&. Since the proxy object asks the vector on each access for the current position of the element, this works even if the vector changed the location of those elements between calls. Just be careful with multi-threading.
Here's a sketch for the proxy, untested and incomplete:
template<typename T> class Vector
{
private:
// ...
// internal access to the elements
T& ref( const size_type_ i );
const T& ref( const size_type_ i ) const;
class Proxy
{
private:
// store a reference to a vector and the index
Vector& v_;
size_type_ i_;
// only the vector (our friend) is allowed to create a Proxy
Proxy( Vector& v, size_type_ i ) : v_(v), i_(i) {}
friend class Vector;
public:
// the user that receives a Proxy can write/assign values to it...
Proxy& operator=( const T& v ) { v_.ref(i_) = v; return *this; }
// ...or the Proxy can convert itself in a value for reading
operator const T&() const { return v_.ref(i_); }
};
// the Proxy needs access to the above internal element accessors
friend class Proxy;
public:
// and now we return a Proxy for v[i]
Proxy operator[]( const size_type_ i )
{
return Proxy( *this, i );
}
};
Note that the above is incomplete and the whole technique has some drawbacks. The most significant problem is that the Proxy "leaks" in the API and therefore some use cases are not met. You need to adapt the technique to your environment and see if it fits.
A question related to a custom Vector class in C++.
template <typename T>
class Vector
{ ...
private:
T * mData; int mSize;
public:
proxy_element operator[](const size_type index) { return proxy_element(*this, index); }
const T& operator[](const size_type index) const { return mData[index]; }
};
template <typename T>
class proxy_element
{ ...
proxy_element(Vector<T>& m_parent, const size_type index);
proxy_elem& operator=(const T& rhs); // modifies data so invalidate on other memories
bool operator==(const proxy_elem& rhs) // only read, just copy data back.
...
}
The reason for using proxy_element class is to distinguish and optimize read and writes operations, considering that the vector data can reside in GPU device memories as well. So any read operation require only to copy latest data back (if any) but a readwrite/write operation require invalidating data in device memories.
This design work well when the element type is primitive. However for more complex element types, there is one issue:
struct person{ int age; double salary; };
int main()
{
Vector<person> v1(10);
v[1].age = 10; // gives error as operator[] returns proxy_element for which "." operator has no meaning
}
AFAIK, the "." operator cannot be overload in C++. One obvious solution is to not use proxy_elem and just return regular reference (T &), assuming that each access is a write access, but that will be inefficient for obvious reasons.
Is there any other work around which gives me "." operator working while retaining ability to distinguish between read and write operations?
One option is to make such data types immutable (private member variables, initialised by a constructor, and the only setter is the class's assignment operator). This way, the only means to change anything is to assign to an entire instance of the class, which can be channeled through a proxy_element.
Marcelo Cantos's answer is, of course, the proper way to do things. However, there is the complicated and crazy workaround of specialization. (Not recommended.)
//if it's a class, inherit from it to get public members
template<class T>
class proxy_element : public T {
...
proxy_element(Vector<T>& m_parent, const size_type index);
proxy_elem& operator=(const T& rhs); // modifies data so invalidate on other memories
bool operator==(const proxy_elem& rhs) // only read, just copy data back.
...
};
//pretend to be a pointer
template<>
class proxy_element<T*> {
...
proxy_element(Vector<T>& m_parent, const size_type index);
proxy_elem& operator=(const T& rhs); // modifies data so invalidate on other memories
bool operator==(const proxy_elem& rhs) // only read, just copy data back.
...
};
//otherwise, pretend to be primitive
#define primitive_proxy(T) \
template<> class proxy_element {
...
proxy_element(Vector<T>& m_parent, const size_type index);
proxy_elem& operator=(const T& rhs); // modifies data so invalidate on other memories
bool operator==(const proxy_elem& rhs) // only read, just copy data back.
...
};
primitive_proxy(char)
primitive_proxy(unsigned char)
primitive_proxy(signed char) //this is distinct from char remember
primitive_proxy(short)
primitive_proxy(unsigned short)
primitive_proxy(int)
primitive_proxy(unsigned int)
primitive_proxy(long)
primitive_proxy(unsigned long)
primitive_proxy(long long)
primitive_proxy(unsigned long long)
primitive_proxy(char16_t) //if GCC
primitive_proxy(char32_t) //if GCC
primitive_proxy(wchar_t)
primitive_proxy(float)
primitive_proxy(double)
primitive_proxy(long double)
Let's say I have a container class called MyContainerClass that holds integers.
The [] operator, as you know, can be overloaded so the user can more intuitively access values as if the container were a regular array. For example:
MyContainerClass MyInstance;
// ...
int ValueAtIndex = MyInstance[3]; // Gets the value at the index of 3.
The obvious return type for operator[] would be int, but then the user wouldn't be able to do something like this:
MyContainerClass MyInstance;
MyInstance[3] = 5;
So, what should the return type for operator[] be?
The obvious return type is int& :)
For increased elaboration:
int &operator[](ptrdiff_t i) { return myarray[i]; }
int const& operator[](ptrdiff_t i) const { return myarray[i]; }
// ^ could be "int" too. Doesn't matter for a simple type as "int".
This should be a reference:
int &
class MyContainerClass {
public:
int& operator[](unsigned int index);
int operator[](unsigned int index) const;
// ...
};
Returning a reference lets the user use the result as an lvalue, as in your example MyInstance[3] = 5;. Adding a const overload makes sure they can't do that if MyInstance is a const variable or reference.
But sometimes you want things to look like that but don't really have an int you can take a reference to. Or maybe you want to allow multiple types on the right-hand side of MyInstance[3] = expr;. In this case, you can use a dummy object which overloads assignment:
class MyContainerClass {
private:
class Index {
public:
Index& operator=(int val);
Index& operator=(const string& val);
private:
Index(MyContainerClass& cont, unsigned int ind);
MyContainerClass& m_cont;
unsigned int m_ind;
friend class MyContainerClass;
};
public:
Index operator[](unsigned int ind) { return Index(*this, ind); }
int operator[](unsigned int ind) const;
// ...
};
int&
returning a reference allows you too use the returned value as a left-hand side of the assignment.
same reason why operator<<() returns an ostream&, which allows you to write cout << a << b;