How to overload the operators to call a setter function on an operator[] call? - c++

How can I overload the operators of a class, so that using syntax of
classInstance[index] = value;
performs
classInstance.cfgfile.Write(index,value)
background info; feel free to skip.
The application we develop uses a memory-mapped access to a segment of NVRAM - actually, mapped are just two registers, address and data. You write to the address register, then either write or read the data register. After initialization, the reads and writes are performed by a simple [] overload of the class holding the reference to the segment of memory. You refer to the instance's [] giving a namespaced index of the cell you want to read and write and it does its thing.
int& IndirectMemory::operator[](RTCMemIndex idx)
{
*midx_reg = idx;
return *mdata_reg;
}
(code stripped of irrelevant elements like mutexes and sanity checks).
Everything works fine... as long as the NVRAM works fine. This specific chip is out of production, and the ones 'out in the wild' have now begun dying of old age. Their functionality is of low significance to our use, and we could shift their role to the flash memory with nearly no impact (just a little more flash wear) if the chip goes corrupt. The thing is, we want to store the flash record in our config format, which uses getters and setters.
int TCfgFile::ReadKey(std::string Key);
void TCfgFile::WriteKey(std::string Key,int data);
And in many places of the code we have calls to NVRAM through IndirectMemory[Some_Register] = Some_Value; writing assorted things that change frequently and that we want to persist through reboot. I'd like to retain this syntax and behavior, but be able to write to the file if NVRAM is detected to be corrupted or manually disabled through a config entry.
The net is rife with examples of using operator[] for setting given value just by returning the reference to it. For example:
unsigned long operator [](int i) const {return registers[i];}
unsigned long & operator [](int i) {return registers[i];}
In that case if I call, say, reg[3] = 1; the [] will return a reference to the element#3 and the default operator= will write to the reference just fine.
But since I can't return a reference to a key in the file (.WriteKey() just performs a complete write, returning success or error), and operator= doesn't take an index, I'm afraid this simple option won't help.

You can use a proxy class to solve this. Since value can't be passed into classInstance we need to make an object that operator[] can return that will get the value of value and knows which instance to apply the operation to. Using
struct Proxy
{
classInstance_type& to_apply;
index_type index;
Proxy(classInstance_type& to_apply, index_type index) : to_apply(to_apply), index(index) {}
Proxy& operator=(value_type const & value)
{
to_apply.cfgfile.Write(index,value);
return *this;
}
};
your class's operator[] would look like
Proxy operator[](index_type index)
{
return Proxy{*this, index};
}
and then when you do classInstance[index] = value; you call Proxy's operator= which has a reference to the object to call, the index to use, and the value you also need.
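A fuller sketch of the proxy idea applied to the question's setting might look like the following. All names here (`FakeCfgFile`, `Storage`, `use_file`, the `"reg<N>"` key format) are assumptions made for illustration; they stand in for `TCfgFile` and `IndirectMemory`, and a plain array plays the role of the NVRAM. The proxy also gets a conversion operator so that reads go through the getter.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the question's TCfgFile (getter/setter only).
struct FakeCfgFile {
    std::map<std::string, int> keys;
    int ReadKey(const std::string& key) const {
        auto it = keys.find(key);
        return it == keys.end() ? 0 : it->second;
    }
    void WriteKey(const std::string& key, int data) { keys[key] = data; }
};

// Hypothetical stand-in for IndirectMemory: a plain array plays the NVRAM.
class Storage {
public:
    class Proxy {
    public:
        Proxy(Storage& owner, int index) : owner_(owner), index_(index) {}
        // storage[i] = v;  is routed to the setter
        Proxy& operator=(int value) { owner_.write(index_, value); return *this; }
        // int v = storage[i];  is routed to the getter
        operator int() const { return owner_.read(index_); }
    private:
        Storage& owner_;
        int index_;
    };

    explicit Storage(bool use_file) : use_file_(use_file), nvram_() {}
    Proxy operator[](int index) { return Proxy(*this, index); }

    FakeCfgFile cfgfile;

private:
    int read(int index) const {
        return use_file_ ? cfgfile.ReadKey(key(index)) : nvram_[index];
    }
    void write(int index, int value) {
        if (use_file_) cfgfile.WriteKey(key(index), value);
        else nvram_[index] = value;
    }
    // Assumed key naming scheme: register 3 becomes "reg3".
    static std::string key(int index) { return "reg" + std::to_string(index); }

    bool use_file_;
    int nvram_[16];
};

void demo_storage() {
    Storage nvram(false);      // NVRAM healthy: writes go to the array
    nvram[3] = 7;
    int a = nvram[3];
    assert(a == 7);
    assert(nvram.cfgfile.keys.empty());

    Storage fallback(true);    // NVRAM disabled: same syntax, routed to the file
    fallback[3] = 9;
    int b = fallback[3];
    assert(b == 9);
    assert(fallback.cfgfile.keys.at("reg3") == 9);
}
```

One caveat with any proxy: `auto x = storage[3];` deduces the proxy type rather than `int`, so call sites that rely on `auto` would need an explicit `int`.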

You can also do this without a proxy class. You can make operator[] store the index and return a reference to *this, and then overload operator= of that class to call Write with the stored index and the assigned value.
#include <iostream>
struct Foo {
void Write(int idx, int value) {
std::cout << "Write(" << idx << ", " << value << ")\n";
}
Foo& operator[](int idx) {
this->index = idx;
return *this;
}
void operator=(int value) {
this->Write(this->index, value);
}
int index;
};
int main() {
Foo f;
f[5] = 10;
}
Prints: Write(5, 10)

Related

C++ Return temporary values and objects that cannot be copied

I know that references can extend the lifetime of a return value in C++. With this philosophy, I tried the following: I have three classes, 'tensor', 'view' and 'mutable_view'. Operator () on a tensor returns a "const view" object. This view has a private copy constructor so that the view cannot be copied, as it keeps information about the tensor that might not survive beyond the current statement.
#include <iostream>
#include <algorithm>
struct tensor {
int data[10];
class view {
const int *const data;
view();
view(const view &);
public:
view(const int *new_data) : data(new_data) {}
int operator*() const { return *data; }
};
class mutable_view {
int *const data;
mutable_view();
mutable_view(const mutable_view &);
public:
mutable_view(int *new_data) : data(new_data) {}
void operator=(const view &v) {
*data = *v;
}
};
tensor(int n) {
std::fill(data, data+10, n);
}
const view operator()(int ndx) const {
return view(data + ndx);
}
mutable_view at(int ndx) {
return mutable_view(data + ndx);
}
};
int main()
{
tensor a(1);
tensor b(2);
b.at(2) = a(2);
for (int i = 0; i < 10; i++)
std::cout << "a[i] = " << a.data[i] << std::endl;
for (int i = 0; i < 10; i++)
std::cout << "b[i] = " << b.data[i] << std::endl;
return 0;
}
The problem is that, while this code works in gcc (depends on the version), icc signals a warning and open64 simply does not build it: it demands that the constructors from 'view' be public. Reading icc's message the idea seems to be that the right hand value could be potentially copied by the compiler and thus constructors are needed.
Is this really true? Is there a workaround that preserves the syntax I want to build? By the way they are built, and in order to avoid inefficient implementations based on shared_ptr or other stuff, I need to keep the 'view' objects un-copiable.
Edit 1:
tensor cannot control the lifetime of the views. The views are created by the accessors and their lifetime is limited to the statement where they are used, for the following reasons:
These views are only used for two things: (i) copying data, (ii) extracting portions of the tensor.
The tensors are multidimensional arrays that implement copy-on-write semantics, which means that the views cannot be long-lived objects: if the data changes, they expire.
Edit 2:
Changed the pseudocode description (guys, if you see '...' do you expect it to be compilable?) with one that builds on 'icc' and does not on clang/open64
Go ahead and let the default copy constructors be public. And document that a view or mutable_view is "invalidated" when its tensor is changed or destroyed.
This parallels how the Standard Library deals with iterators, pointers, and references that have a lifetime which depends on another object.
As others already pointed out you missed () here:
const view operator(int ndx) const;
Anyway this declaration means that return value is copied. If you want to avoid copying just return reference for an object:
const view& operator()(int ndx) const;
As I understand it, 'tensor' is a container of 'views', so it manages their lifetime and it's safe to return a reference. For the same reason tensor::at should return a reference to mutable_view:
mutable_view& at(int ndx);
Another question is about default constructor of 'view' - it looks like 'tensor' has to be a friend of 'view' to be able to create its instances
By the way - prefer using 'size_t' as index type instead of just 'int'
My overall feeling of this code - you are trying to implement kind of domain language. Maybe it's better to focus on concrete calculation task?

Simple C++ getter/setters

Lately I'm writing my getter and setters as (note: real classes do more things in getter/setter):
struct A {
const int& value() const { return value_; } // getter
int& value() { return value_; } // getter/setter
private:
int value_;
};
which allows me to do the following:
auto a = A{2}; // non-const object a
// create copies by "default" (value always returns a ref!):
int b = a.value(); // b = 2, is a copy of value :)
auto c = a.value(); // c = 2, is a copy of value :)
// create references explicitly:
auto& d = a.value(); // d is a ref to a.value_ :)
decltype(a.value()) e = a.value(); // e is a ref to a.value_ :)
a.value() = 3; // sets a.value_ = 3 :)
cout << b << " " << c << " " << d << " " << e << endl; // 2 2 3 3
const auto ca = A{1};
const auto& f = ca.value(); // f is a const ref to ca.value_ :)
auto& g = ca.value(); // no compiler error! :(
// g = 4; // compiler error :)
decltype(ca.value()) h = ca.value(); // h is a const ref to ca.value_ :)
//ca.value() = 2; // compiler error! :)
cout << f << " " << g << " " << h << endl; // 1 1 1
This approach doesn't allow me to:
validate the input for the setter (which is a big BUT),
return by value in the const member function (because I want the compiler to catch assignment to const objects: ca.value() = 2). Update: see cluracan answer below.
However, I'm still using this a lot because
most of the time I don't need that,
this allows me to decouple the implementation details of my classes from their interface, which is just what I want.
Example:
struct A {
const int& value(const std::size_t i) const { return values_[i]; }
int& value(const std::size_t i) { return values_[i]; }
private:
std::vector<int> values_;
// Storing the values in a vector/list/etc is an implementation detail.
// - I can validate the index, but not the value :(
// - I can change the type of values, without affecting clients :)
};
Now to the questions:
Are there any other disadvantages of this approach that I'm failing to see?
Why do people prefer:
getter/setters methods with different names?
passing the value as a parameter?
just for validating input or are there any other main reasons?
Generally, using accessors/mutators at all is a design smell suggesting that your class's public interface is incomplete. Typically speaking you want a useful public interface that provides meaningful functionality rather than simply get/set (which is just one or two steps better than we were in C with structs and functions). Every time you want to write a mutator, and many of the times you want to write an accessor, first take a step back and ask yourself "do I *really* need this?".
Just idiom-wise people may not be prepared to expect such a function so it will increase a maintainer's time to grok your code.
The same-named methods are almost the same as the public member: just use a public member in that case. When the methods do two different things, name them two different things.
The "mutator" returning by non-const reference would allow for a wide variety of aliasing problems where someone stashes off an alias to the member, relying on it to exist later. By using a separate setter function you prevent people from aliasing to your private data.
This approach doesn't allow me to:
return by value in the const member function (because I want the compiler to catch assignment to const objects ca.value() = 2).
I don't get what you mean. If you mean what I think you mean - you're going to be pleasantly surprised :) Just try to have the const member return by value and see if you can do ca.value()=2...
But my main question, if you want some kind of input validation, why not use a dedicated setter and a dedicated getter
struct A {
int value() const { return value_; } // getter
void value(int v) { value_=v; } // setter
private:
int value_;
};
It will even reduce the amount of typing! (by one '=') when you set. The only downside to this is that you can't pass the value by reference to a function that modifies it.
Regarding your second example after the edit, with the vector - using your getter/setter makes even more sense than your original example as you want to give access to the values (allow the user to change the values) but NOT to the vector (you don't want the user to be able to change the size of the vector).
So even though in the first example I really would recommend making the member public, in the second one it is clearly not an option, and using this form of getters / setters really is a good option if no input validation is needed.
Also, when I have classes like your second type (with the vector) I like giving access to the begin and end iterators. This allows more flexibility of using the data with standard tools (while still not allowing the user to change the vector size, and allowing easy change in container type)
Another bonus to this is that random access iterators have an operator[] (like pointers) so you can do
vector<int>::iterator A::value_begin() {return values_.begin();}
vector<int>::const_iterator A::value_begin()const{return values_.begin();}
...
a.value_begin()[252]=3;
int b=a.value_begin()[4];
vector<int> c(a.value_begin(),a.value_end());
(although it maybe ugly enough that you'd still want your getters/setters in addition to this)
REGARDING INPUT VALIDATION:
In your example, the assignment happens in the calling code. If you want to validate user input, you need to pass the value to be validated into your struct object. This means you need to use member functions (methods). For example,
struct A {
// getter
int getValue() const { return value_; }
// setter
void setValue(const int& value) {
// validate value here
value_ = value;
}
private:
int value_;
};
By the way, .NET properties are implemented as methods under the hood.
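As a small illustration of the validation point, here is a sketch of a getter that returns by value paired with a setter that checks its argument before storing it. The class name `Percentage`, the helper `rejects`, and the 0..100 rule are all made up for the example.

```cpp
#include <cassert>
#include <stdexcept>

// Sketch: getter returns by value, setter validates before storing.
// The 0..100 range is an arbitrary rule chosen for illustration.
class Percentage {
public:
    explicit Percentage(int v) { set_value(v); }
    int value() const { return value_; }
    void set_value(int v) {
        if (v < 0 || v > 100) throw std::out_of_range("value must be in [0, 100]");
        value_ = v;
    }
private:
    int value_;
};

// Returns true if the setter rejected the value and left the object unchanged.
bool rejects(Percentage& p, int bad) {
    const int before = p.value();
    try {
        p.set_value(bad);
    } catch (const std::out_of_range&) {
        return p.value() == before;
    }
    return false;
}

void demo_percentage() {
    Percentage p(10);
    p.set_value(42);
    assert(p.value() == 42);
    assert(rejects(p, 101));   // invalid input never reaches value_
    assert(p.value() == 42);
}
```

With a reference-returning `value()`, the assignment happens in the caller and there is no place to put the range check, which is the limitation described above.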

Operator Overloading: C++

I have a question about Operator Overloading in C++.
For an assignment, I have to write a class which encompasses an array, sort of like the ArrayList in Java.
One of the things I have to do is keep track of the size of the array. Size is the amount of elements included, whereas capacity is the maximum amount which CAN be included before the class has to expand the array.
Client code specifies the size when they call the constructor. However, when new elements are added, I have to figure out a way to change the size.
My teacher said something about being able to overload an operator for different sides of an equality. Is this a real thing, or did I misunderstand her? If this works, it would be the optimal solution to my problem.
My current overloading for the [] operator is:
int & ArrayWrapper::operator [] (int position){
if(position == _size){
if(_size == _capacity){
changeCapacity(_capacity+10);
}
}
return _array[position];
}
This works fine for retrieval, but I'd like to have it so that if someone calls it from the left hand side of a '=' then it checks to see if it needs to expand the size or not.
EDIT: If this isn't a real thing, can anyone think of a different solution to the problem? One solution I thought of is to have the getSize() method just go through the entire array every time it is called, but I'd really rather not use that solution because it seems cheesy.
EDIT: For clarification, I'm not asking whether or not my expansion of an array works. I need to add 1 to size every time a new element is added. For example, if the client creates an array of size 15 and capacity 25, and then tries to add something to Array[15], that SHOULD increase the size to 16. I was wondering if there was a way to do that with overloading.
A simple approach, which doesn't quite do what you want, is to overload on whether the array is const or mutable.
This doesn't distinguish whether the element is used on the left-hand side of an assignment (as an lvalue) or on the right (as an rvalue); only whether the array is allowed to be modified or not.
// Mutable overload (returns a mutable reference)
int & operator[](size_t position) {
if (position >= _size) {
if (position >= _capacity) {
// increase capacity
}
// increase size
}
return _array[position];
}
// Const overload (returns a value or const reference)
int operator[](size_t position) const {
if (position >= _size) {
throw std::out_of_range("Array position out of range");
}
return _array[position];
}
If you really want to tell whether you're being assigned to or not, then you'll have to return a proxy for the reference. This overloads assignment to write to the array, and provides a conversion operator to get the value of the element:
class proxy {
public:
proxy(ArrayWrapper & array, size_t position) :
_array(array), _position(position) {}
operator int() const {
if (_position >= _array._size) {
throw std::out_of_range("Array position out of range");
}
return _array._array[_position];
}
proxy & operator=(int value) {
if (_position >= _array._size) {
if (_position >= _array._capacity) {
// increase capacity
}
// increase size
}
_array._array[_position] = value;
return *this;
}
private:
ArrayWrapper & _array;
size_t _position;
};
You probably need to declare this a friend of ArrayWrapper; then just return this from operator[]:
proxy ArrayWrapper::operator[](size_t position) {
return proxy(*this, position);
}
This approach is fine. There's an error in the code, though: what happens if someone calls that operator with a position that's equal to the current size of the array plus 100?
The question is whether you really want different behavior depending on
which side of the = you are. Your basic idea will work fine, but will
expand the array regardless of the side you're on, e.g.:
ArrayWrapper a(10);
std::cout << a[20] << std::endl;
will result in expanding the array. Most of the time, in such cases,
the preferred behavior would be for the code above to raise an exception,
but for
ArrayWrapper a(10);
a[20] = 3.14159;
to work. This is possible using proxies: first, you define double
ArrayWrapper::get( int index ) const and void ArrayWrapper::set( int
index, double newValue ); the getter will throw an exception if the
index is out of bounds, but the setter will extend the array. Then,
operator[] returns a proxy, along the lines of:
class ArrayWrapper::Proxy
{
ArrayWrapper* myOwner;
int myIndex;
public:
Proxy( ArrayWrapper& owner, int index )
: myOwner( &owner )
, myIndex( index )
{
}
Proxy const& operator=( double newValue ) const
{
myOwner->set( myIndex, newValue );
return *this;
}
operator double() const
{
return myOwner->get( myIndex );
}
};
In case you're not familiar with the operator double(), it's an
overloaded conversion operator. The way this works is that if the
operator[] is on the left side of an assignment, it will actually be
the proxy which gets assigned to, and the assignment operator of the
proxy forwards to the set() function. Otherwise, the proxy will
implicitly convert to double, and this conversion forwards to the
get() function.
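Putting the pieces above together, a compilable sketch could look like this. It is not the asker's `ArrayWrapper`: the name `GrowingArray` is hypothetical, and a `std::vector<double>` replaces the hand-managed array so that size and capacity bookkeeping is delegated to the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Sketch only: std::vector replaces the hand-managed array from the question.
class GrowingArray {
public:
    class Proxy {
    public:
        Proxy(GrowingArray& owner, std::size_t index) : owner_(&owner), index_(index) {}
        // Write path: a[i] = x; may grow the array.
        const Proxy& operator=(double value) const {
            owner_->set(index_, value);
            return *this;
        }
        // Read path: double x = a[i]; throws if i is out of range.
        operator double() const { return owner_->get(index_); }
    private:
        GrowingArray* owner_;
        std::size_t index_;
    };

    explicit GrowingArray(std::size_t size) : data_(size, 0.0) {}

    double get(std::size_t index) const {
        if (index >= data_.size()) throw std::out_of_range("index out of range");
        return data_[index];
    }
    void set(std::size_t index, double value) {
        if (index >= data_.size()) data_.resize(index + 1, 0.0);  // gaps are zero-filled
        data_[index] = value;
    }
    std::size_t size() const { return data_.size(); }

    Proxy operator[](std::size_t index) { return Proxy(*this, index); }
    double operator[](std::size_t index) const { return get(index); }

private:
    std::vector<double> data_;
};

void demo_growing_array() {
    GrowingArray a(10);
    a[20] = 3.5;                 // write past the end: grows to 21 elements
    assert(a.size() == 21);
    double x = a[20];            // read through the conversion operator
    assert(x == 3.5);
    bool threw = false;
    try {
        double y = a[50];        // read past the end: throws, does not grow
        (void)y;
    } catch (const std::out_of_range&) {
        threw = true;
    }
    assert(threw);
    assert(a.size() == 21);
}
```

A known gap in this sketch: `a[1] = a[2];` selects the proxy's implicit copy assignment rather than `operator=(double)`, so a production version would also need an assignment overload taking a `Proxy`.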

defining operator [ ] for both reading and writing

In the book "The C++ Programming Language", the author gives the following example along with several statements:
Defining an operator, such as [], to be used for both reading and writing is difficult where it is not acceptable simply to return a reference and let the user decide what to do with it.
Cref, is to help implement a subscript operator that distinguishes between reading and writing.
Why is [] difficult to define when it is to be used for both reading and writing?
How does the definition of class Cref help to solve this issue?
class String{
struct Srep;
Srep *rep;
public:
class Cref;
// some definitions here
void check (int i) const { if (i<0 || rep->sz<=i) throw Range( );}
char read( int i) const {return rep->s[i];}
void write(int i, char c){ rep=rep->get_own_copy(); rep->s[i]=c;}
Cref operator[] (int i){ check(i); return Cref(*this, i);}
char operator[] (int i) const{check(i); return rep->s[i];}
};
class String::Cref{
friend class String;
String& s;
int i;
Cref(String& ss, int ii): s(ss),i(ii) {}
public:
operator char( ) { return s.read(i);}
void operator=(char c){s.write(i,c);}
};
If you don't define a class Cref that solves this issue, then you have to do what std::map does:
template <class K, class V> class map{
V& operator[](K const & key);
};
This returns a reference, which must be backed by a valid memory location, and therefore
std::map<string,string> m;
m["foo"];
assert(m.find("foo") != m.end());
The assertion will succeed (meaning, "foo" is now a valid key in the map) even though you never assigned something to m["foo"].
This counterintuitive behavior can be fixed by the Cref class in your example -- it can perform the appropriate logic to create m["foo"] only when you assign to the reference, and ensure that m.find("foo") == m.end() if you didn't perform some assignment when you tried to read the nonexistent m["foo"].
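The `std::map` behavior described above is easy to check directly. This short sketch (the function name `demo_map_subscript` is just for illustration) contrasts `find()`, which never inserts, with `operator[]`, which inserts a default-constructed value when the key is missing.

```cpp
#include <cassert>
#include <map>
#include <string>

void demo_map_subscript() {
    std::map<std::string, std::string> m;
    assert(m.find("foo") == m.end());  // lookup with find() does not insert
    assert(m.empty());

    m["foo"];                          // merely naming the element inserts it
    assert(m.size() == 1);
    assert(m.find("foo") != m.end());
    assert(m["foo"].empty());          // the inserted value is an empty string
}
```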
Likewise, in your String class (which is a reference-counted string -- strings share their string data, and a new copy is created when you change a string whose data is shared with another string), you'd have to make a copy when using operator[] to read characters. The use of the Cref class, allows you to ensure that you only make a copy when using operator[] to write.
String s;
s[0] = 5;
will call String::operator [](int) and then String::Cref::operator =(char).
However,
String s;
char c = s[0];
will call String::operator [](int) and then String::Cref::operator char().
When reading, String::Cref::operator char is called, and when writing String::Cref::operator = is called - this allows you to distinguish between reading and writing.
Why is [] difficult to define when it is to be used for both reading and writing?
It's because the non-const operator[] is called whenever you have a non-const object, even if you're using it in a read-only fashion.
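The overload-selection rule can be observed with a small counter class. `Tracker` below is a hypothetical type that does nothing except record which `operator[]` overload the compiler picked.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical class that only records which overload was selected.
struct Tracker {
    std::string text;
    mutable int const_calls;
    int mutable_calls;

    explicit Tracker(const std::string& s) : text(s), const_calls(0), mutable_calls(0) {}

    char  operator[](std::size_t i) const { ++const_calls; return text[i]; }
    char& operator[](std::size_t i)       { ++mutable_calls; return text[i]; }
};

void demo_overload_selection() {
    Tracker t("abc");
    char c = t[0];                 // a pure read, but t is non-const
    assert(c == 'a');
    assert(t.mutable_calls == 1);  // the non-const overload was chosen
    assert(t.const_calls == 0);

    const Tracker& ct = t;
    char d = ct[1];                // only a const access path picks the const overload
    assert(d == 'b');
    assert(t.const_calls == 1);
}
```

Because the choice depends only on the constness of the object, a copy-on-write string with a reference-returning `operator[]` would have to unshare its buffer on every subscript of a non-const string, reads included. That is the cost the `Cref` proxy avoids.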

optimize output value using a class and public member

Suppose you have a function that you call many times, and every call returns a big object. I've optimized this using a functor that returns void and stores the result in a public member:
#include <vector>
const int N = 100;
std::vector<double> fun(const std::vector<double> & v, const int n)
{
std::vector<double> output = v;
output[n] *= output[n];
return output;
}
class F
{
public:
F() : output(N) {};
std::vector<double> output;
void operator()(const std::vector<double> & v, const int n)
{
output = v;
output[n] *= n;
}
};
int main()
{
std::vector<double> start(N,10.);
std::vector<double> end(N);
double a;
// first solution
for (unsigned long int i = 0; i != 10000000; ++i)
a = fun(start, 2)[3];
// second solution
F f;
for (unsigned long int i = 0; i != 10000000; ++i)
{
f(start, 2);
a = f.output[3];
}
}
Yes, I could use inline or optimize this in some other way, but here I want to stress one point: with the functor, the variable output is declared and constructed only once, whereas with the function it is constructed on every call. The second solution is twice as fast as the first with g++ -O1 or g++ -O2. What do you think: is it an ugly optimization?
Edit:
To clarify my aim: I have to evaluate the function more than 10M times, but I need the output only a few random times. It's important that the input is not changed, which is why I declared it as a const reference. In this example the input is always the same, but in the real world the input changes and is a function of the previous output of the function.
A more common approach is to create an object with a large enough reserved size outside the function and pass it to the function by pointer or by reference. You can then reuse this object across several calls to your function, which avoids repeated memory allocation.
In both cases you are allocating new vector many many times.
What you should do is to pass both input and output objects to your class/function:
void fun(const std::vector<double> & in, const int n, std::vector<double> & out)
{
out[n] *= in[n];
}
This way you separate your logic from the algorithm. You create the std::vector once and pass it to the function as many times as you want. Notice that no unnecessary copy or allocation is made.
p.s. it's been awhile since I did c++. It may not compile right away.
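A compilable variant of the output-parameter idea might look like the sketch below. The name `fun_into` is hypothetical, and unlike the snippet above it copies the input into the buffer explicitly so that the result matches the question's `fun()`, which squares element `n`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch: same computation as the question's fun(), written into a caller-owned buffer.
void fun_into(const std::vector<double>& in, std::size_t n, std::vector<double>& out) {
    out = in;            // copies the elements; implementations typically reuse
                         // out's existing storage when its capacity is sufficient
    out[n] *= out[n];    // square element n, as in the question's fun()
}

void demo_fun_into() {
    const std::vector<double> start(100, 10.0);
    std::vector<double> out;           // created once, reused across calls
    for (int i = 0; i < 3; ++i) {
        fun_into(start, 2, out);
    }
    assert(out.size() == 100);
    assert(out[2] == 100.0);           // 10 * 10
    assert(out[3] == 10.0);            // untouched element
    assert(start[2] == 10.0);          // the input is not modified
}
```

The element copy still happens on every call; what the caller-owned buffer can save is the repeated allocation and deallocation of the result vector.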
It's not an ugly optimization. It's actually a fairly decent one.
I would, however, hide output and make an operator[] member to access its members. Why? Because you just might be able to perform a lazy evaluation optimization by moving all the math to that function, thus only doing that math when the client requests that value. Until the user asks for it, why do it if you don't need to?
Edit:
Just checked the standard. Behavior of the assignment operator is based on insert(). Notes for that function state that an allocation occurs if new size exceeds current capacity. Of course this does not seem to explicitly disallow an implementation from reallocating even if otherwise...I'm pretty sure you'll find none that do and I'm sure the standard says something about it somewhere else. Thus you've improved speed by removing allocation calls.
You should still hide the internal vector. You'll have more chance to change implementation if you use encapsulation. You could also return a reference (maybe const) to the vector from the function and retain the original syntax.
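The lazy-evaluation suggestion could be sketched as follows. `LazySquared` is a hypothetical class, and it assumes the computation is the one in the question's `fun()` (element `n` squared, every other element unchanged), so nothing needs to be copied at all.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of the lazy idea: keep a pointer to the input and compute on access.
// The caller must keep the input vector alive while the view is in use.
class LazySquared {
public:
    LazySquared(const std::vector<double>& in, std::size_t n) : in_(&in), n_(n) {}

    // Element n is squared on demand; all others are read straight from the input.
    double operator[](std::size_t i) const {
        const double v = (*in_)[i];
        return i == n_ ? v * v : v;
    }

private:
    const std::vector<double>* in_;
    std::size_t n_;
};

void demo_lazy_squared() {
    const std::vector<double> start(100, 10.0);
    LazySquared view(start, 2);    // no copy of the 100 elements is made
    assert(view[2] == 100.0);      // computed only when requested
    assert(view[3] == 10.0);
}
```

This only pays off when, as the question's edit says, the output is needed for a few elements on a few calls; if each output feeds the next input, the values still have to be materialized somewhere.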
I played with this a bit, and came up with the code below. I keep thinking there's a better way to do this, but it's escaping me for now.
The key differences:
I'm allergic to public member variables, so I made output private, and put getters around it.
Having the operator return void isn't necessary for the optimization, so I have it return the value as a const reference so we can preserve return value semantics.
I took a stab at generalizing the approach into a templated base class, so you can then define derived classes for a particular return type, and not re-define the plumbing. This assumes the object you want to create takes a one-arg constructor, and the function you want to call takes in one additional argument. I think you'd have to define other templates if this varies.
Enjoy...
#include <vector>
template<typename T, typename ConstructArg, typename FuncArg>
class ReturnT
{
public:
ReturnT(ConstructArg arg): output(arg){}
virtual ~ReturnT() {}
const T& operator()(const T& in, FuncArg arg)
{
output = in;
this->doOp(arg);
return this->getOutput();
}
const T& getOutput() const {return output;}
protected:
T& getOutput() {return output;}
private:
virtual void doOp(FuncArg arg) = 0;
T output;
};
class F : public ReturnT<std::vector<double>, std::size_t, const int>
{
public:
F(std::size_t size) : ReturnT<std::vector<double>, std::size_t, const int>(size) {}
private:
virtual void doOp(const int n)
{
this->getOutput()[n] *= n;
}
};
int main()
{
const int N = 100;
std::vector<double> start(N,10.);
double a;
// second solution
F f(N);
for (unsigned long int i = 0; i != 10000000; ++i)
{
a = f(start, 2)[3];
}
}
It seems quite strange (I mean the need for optimization at all). I think a decent compiler should perform return value optimization in such cases. Maybe all you need is to enable it.