Eliminating temporaries in operator overloading - c++

Note: as noted by sellibitze, I am not up to date on rvalue references, so the methods I propose contain mistakes; read his answer to understand which.
I was reading one of Linus' rants yesterday and there is (somewhere) a rant against operator overloading.
The complaint it seems is that if you have an object of type S then:
S a = b + c + d + e;
may involve a lot of temporaries.
In C++03, we have copy elision to prevent this:
S a = ((b + c) + d) + e;
I would hope that the last ... + e is optimized, but I wonder how many temporaries are created with user defined operator+.
Someone in the thread suggested the use of Expression Templates to deal with the issue.
Now, this thread dates back to 2007, but nowadays when we think elimination of temporaries, we think Move.
So I was thinking about the set of overload operators we should write not to eliminate temporaries, but to limit the cost of their construction (stealing resources).
S&& operator+(S&& lhs, S const& rhs) { return lhs += rhs; }
S&& operator+(S const& lhs, S&& rhs) { return rhs += lhs; } // *
S&& operator+(S&& lhs, S&& rhs) { return lhs += rhs; }
Does this set of operators seem sufficient? Is this generalizable (in your opinion)?
*: this implementation supposes commutativity, it doesn't work for the infamous string.

If you're thinking about a custom, move-enabled string class, the proper way to exploit every combination of argument value categories is:
S operator+(S const& lhs, S const& rhs);
S operator+(S && lhs, S const& rhs);
S operator+(S const& lhs, S && rhs);
S operator+(S && lhs, S && rhs);
The functions return a prvalue instead of an xvalue. Returning xvalues is usually a very dangerous thing – std::move and std::forward are the obvious exceptions. If you were to return an rvalue reference you'd break code like:
for (char c : my_string + other_string) {
//...
}
This loop behaves (according to 6.5.4/1 in N3092) as if the code is:
auto&& range = my_string + other_string;
This in turn results in a dangling reference. The temporary object's life-time is not extended because your operator+ doesn't return a prvalue. Returning the objects by value is perfectly fine. It'll create temporary objects but these objects are rvalues, so we can steal their resources to make it very effective.
Secondly, your code should also not compile for the same reason this won't compile:
int&& foo(int&& x) { return x; }
Inside the function's body x is an lvalue and you can't initialize the "return value" (in this case the rvalue reference) with an lvalue expression. So, you'd need an explicit cast.
Thirdly, you're missing a const&+const& overload. In case both of your arguments are lvalues, the compiler won't find a usable operator+ in your case.
If you don't want so many overloads, you could also write:
S operator+(S value, S const& x)
{
value += x;
return value;
}
I intentionally didn't write return value+=x; because this operator probably returns an lvalue reference which would have led to copy construction of the return value. With the two lines I wrote the return value will be move constructed from value.
S x = a + b + c + d;
At least this case is very efficient because there is no unnecessary copying involved even if the compiler isn't able to elide the copies – thanks to a move-enabled string class. Actually, with a class like std::string you can exploit its fast swap member function and make it effective in C++03 as well provided you have a reasonably smart compiler (like GCC):
S operator+(S value, S const& x) // pass-by-value to exploit copy elisions
{
S result;
result.swap(value);
result += x;
return result; // NRVO applicable
}
See Dave Abrahams' article Want Speed? Pass by Value. But these simple operators won't be as effective given:
S x = a + (b + (c + d));
Here the left hand side of the operator is always an lvalue. Since operator+ takes its left hand side by value this leads to many copies. The four overloads from above deal perfectly with this example, too.
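To make the copy counts above concrete, here is a minimal sketch of the four overloads. The class S below is a hypothetical wrapper around std::string with a copy counter added purely for illustration; it is not the asker's class, and the bodies are only one plausible way to implement the declarations.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical move-enabled string wrapper that counts copy constructions.
struct S {
    std::string data;
    static int copies;
    explicit S(std::string s) : data(std::move(s)) {}
    S(S const& o) : data(o.data) { ++copies; }
    S(S&& o) noexcept : data(std::move(o.data)) {}
    S& operator+=(S const& rhs) { data += rhs.data; return *this; }
};
int S::copies = 0;

// Both operands are lvalues: one copy is unavoidable.
S operator+(S const& lhs, S const& rhs) { S r(lhs); r += rhs; return r; }
// Left operand is a temporary: append to it and move it out.
S operator+(S&& lhs, S const& rhs) { lhs += rhs; return std::move(lhs); }
// Right operand is a temporary: prepend to it (concatenation is not commutative).
S operator+(S const& lhs, S&& rhs) { rhs.data.insert(0, lhs.data); return std::move(rhs); }
// Both are temporaries: reuse the left one.
S operator+(S&& lhs, S&& rhs) { lhs += rhs; return std::move(lhs); }
```

With these definitions, both a + b + c + d and a + (b + (c + d)) should copy exactly once (for the innermost pair of lvalues) and move for every other step.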
It's been a while since I read Linus' old rant. If he was complaining about unnecessary copies with respect to std::string, this complaint is no longer valid in C++0x, but it was hardly valid before. You can efficiently concatenate many strings in C++03:
S result = a;
result += b;
result += c;
result += d;
But in C++0x you can also use operator+ and std::move. This will be very efficient, too.
I actually looked at the Git source code and its string management (strbuf.h). It looks well thought through. Except for the detach/attach feature you get the same thing with a move-enabled std::string with the obvious advantage that the resource is automatically managed by the class itself as opposed to the user who needs to remember to call the right functions at the right times (strbuf_init, strbuf_release).

Related

Proper way to facilitate MOVE operation when overriding operator on C++

I'm not really familiar with how move works in C++, and I need some help to clarify my understanding. I want to overload operator+, and I have a couple of questions about it.
ap_n operator+(const ap_n& n, const ap_n& m) {
ap_n tmp {n};
return tmp += m;
}
My first question is how to make temporary objects movable. As shown in my function above, both arguments are not meant to be mutated, therefore I need to create a third object to do the operation upon.
How can I make my return value usable for move operation. Is the return value supposed to be a reference as ap_n&? Should the return object be encapsulated by std::move(tmp)? Or is it alright as is?
How does C++ decide when an object is rvalue, or decide that a move operation is suitable on an object, and how can I tell the program that an object is safe to use for move operation.
ap_n operator+(const ap_n&, const ap_n&); // defined as in first question
ap_n operator+(ap_n&& n, const ap_n& m) { return n += m; }
ap_n operator+(const ap_n& n, ap_n&& m) { return m += n; }
ap_n operator+(ap_n&& n, ap_n&& m) { return n += m; }
My second question is whether it is necessary to create variants of function that accept rvalue arguments. Right now I have 4 functions, as shown, to be able to accept normal objects and rvalue objects.
Is writing all the combinations possible like this necessary? If I remove all but the first function, would the program still be able to perform move operation correctly?
As a debugging tip, something that can help with getting these things right is to print a message in the move constructor
ap_n(ap_n&& o): x_(std::move(o.x_)) { std::cerr << "Move constructed\n"; }
plus similar messages in other constructors and the destructor. Then you get a clear chronology of when and how instances are created and destroyed.
How can I make my return value usable for move operation. Is the return value supposed to be a reference as ap_n&? Should the return object be encapsulated by std::move(tmp)? Or is it alright as is?
Return the result by value. (Don't return a reference to a local variable, since the local variable goes out of scope immediately, making the reference invalid.) You might find this short article useful: Tip of the Week #77: Temporaries, Moves, and Copies. For more depth, check out cppreference on copy elision.
how to make temporary objects movable
By defining move constructor/assignment to your type.
Is the return value supposed to be a reference as ap_n&?
Not for operator+ when you return new object.
operator += on the other hand returns reference to lhs, so returns ap_n&.
How can I make my return value usable for move operation. Should the return object be encapsulated by std::move(tmp)? Or is it alright as is?
From return statement,
there is an automatic move when returning directly a local variable.
so return tmp; is sufficient.
return std::move(tmp); prevents NRVO
return tmp += m; does a copy, as you don't return "directly" tmp.
You should do:
ap_n operator+(const ap_n& n, const ap_n& m) {
ap_n tmp {n};
tmp += m;
return tmp; // NRVO, or automatic move
}
return std::move(tmp += m); would prevent NRVO, and do the move.
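The difference between the two return forms can be measured with a small instrumented type. This is only a sketch: Counted and the two function names are made up for the illustration and stand in for ap_n and its operator+.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for ap_n that counts copy and move constructions.
struct Counted {
    int value = 0;
    static int copies;
    static int moves;
    Counted() = default;
    explicit Counted(int v) : value(v) {}
    Counted(Counted const& o) : value(o.value) { ++copies; }
    Counted(Counted&& o) noexcept : value(o.value) { ++moves; }
    Counted& operator+=(Counted const& rhs) { value += rhs.value; return *this; }
};
int Counted::copies = 0;
int Counted::moves = 0;

Counted add_copying(Counted const& n, Counted const& m) {
    Counted tmp{n};   // first copy
    return tmp += m;  // operator+= yields Counted&, an lvalue: second copy
}

Counted add_direct(Counted const& n, Counted const& m) {
    Counted tmp{n};   // the only copy
    tmp += m;
    return tmp;       // NRVO or implicit move, never a copy
}
```

Calling add_copying should bump the copy counter twice per call, add_direct only once.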
How does C++ decide when an object is rvalue,
Roughly,
variables are l-values, as they have names.
function returning l-value reference (ap_n&) returns l-value.
function returning r-value reference (ap_n&&), or by value (ap_n) returns r-value.
or decide that a move operation is suitable on an object, and how can I tell the program that an object is safe to use for move operation.
Overload resolution selects the best match among the valid candidates.
So a move requires a function taking its parameter by value or by r-value reference (or forwarding reference).
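A small sketch of how overload resolution picks between the two kinds of parameter. Widget, make_widget and category are names invented for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Widget {};                        // hypothetical type
Widget make_widget() { return Widget{}; }

// Two overloads that report which one overload resolution selected.
std::string category(Widget const&) { return "lvalue"; }
std::string category(Widget&&)      { return "rvalue"; }
```

A named variable selects the const& overload, while a function result or std::move(w) selects the && overload. A named rvalue reference such as Widget&& ref is itself an lvalue when used in an expression, so category(ref) reports "lvalue".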
My second question is
It seems not the second ;-)
whether it is necessary to create variants of function that accept rvalue arguments. Right now I have 4 functions, as shown, to be able to accept normal objects and rvalue objects.
Is writing all the combinations possible like this necessary?
A single function taking its arguments by const reference or by value can be enough in the general case,
unless you want that optimization, so the extra overloads are mostly for library writers or performance-critical code.
Note that your overloads should be rewritten to actually perform a move (reusing the temporary passed in):
ap_n operator+(ap_n&& n, const ap_n& m)
{
n += m;
return std::move(n); // C++11, C++14, C++17
// return n; // C++20 alternative: implicit move from the rvalue reference parameter
}
or
ap_n&& operator+(ap_n&& n, const ap_n& m)
{
return std::move(n += m);
}
If I remove all but the first function, would the program still be able to perform move operation correctly?
With
ap_n operator+(const ap_n& n, const ap_n& m) {
ap_n tmp {n}; // copy
tmp += m;
return tmp; // NRVO, or automatic move
}
You have one copy and one NRVO/move, for any kind of arguments.
With
ap_n&& operator+(ap_n&& n, const ap_n& m) {
return std::move(n += m);
}
you have no moves, but you should be careful about lifetime of references, as
auto ok = ap_n() + ap_n(); // 1 extra move
auto&& dangling = ap_n() + ap_n(); // No move, but no lifetime extension...
With
ap_n operator+(ap_n&& n, const ap_n& m) {
n += m;
return std::move(n); // C++11, C++14, C++17: explicit move
// return n; // C++20 alternative: automatic move
}
you have 1 move, no copies.
auto ok = ap_n() + ap_n(); // 1 extra move possibly elided pre-C++17, 0 extra moves since C++17
auto&& ok2 = ap_n() + ap_n(); // No moves, and lifetime extension...
So with the extra overloads, you may trade a copy for a move.
Taking notes from Jarod42's answer, the following is the revised code.
ap_n operator+(const ap_n& n, const ap_n& m) {
ap_n tmp {n};
tmp += m;
return tmp; // Allows copy, NRVO, or move
}
ap_n operator+(ap_n&& n, const ap_n& m) {
n += m;
return std::move(n); // Allows copy, or move
}
ap_n operator+(const ap_n& n, ap_n&& m) {
m += n;
return std::move(m); // Allows copy, or move
}
The number of functions is also reduced to 3, since a call where both arguments are rvalue references will automatically use the second function.
Please tell me if I'm still misunderstanding this.
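One point in the three-overload version that seems worth checking: when both arguments are rvalues, the second overload is the better match for the first argument and the third is the better match for the second, so as far as I can tell neither wins and the call is ambiguous. The sketch below tests that with a detection trait. The ap_n here is a hypothetical stand-in with a single int member, and can_add is a helper invented for the example.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct ap_n {   // hypothetical stand-in for the question's type
    int value;
    ap_n& operator+=(ap_n const& o) { value += o.value; return *this; }
};

// The three overloads from the revised code.
ap_n operator+(ap_n const& n, ap_n const& m) { ap_n tmp(n); tmp += m; return tmp; }
ap_n operator+(ap_n&& n, ap_n const& m) { n += m; return std::move(n); }
ap_n operator+(ap_n const& n, ap_n&& m) { m += n; return std::move(m); }

// Detects whether `L + R` is a well-formed expression.
template <class L, class R, class = void>
struct can_add : std::false_type {};
template <class L, class R>
struct can_add<L, R, decltype(void(std::declval<L>() + std::declval<R>()))>
    : std::true_type {};
```

If the trait reports false for two rvalues, adding back a fourth overload taking (ap_n&&, ap_n&&) should resolve the ambiguity.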

C++ STL - equivalent of operator function object templates with assignment?

Are there assignment operator objects in C++? Like std::plus, but to do +=? (Likewise minus, multiplies, divides, etc.)
EDIT - Motivation:
I thought it would be preferable to avoid the extra copy by using the function objects (std::plus(), etc.) in the following code.
template<typename Op>
static vector<int>& memberwiseAssignOp(vector<int>& lhs, vector<int> rhs, Op op)
{
size_t const len = rhs.size();
if (len > lhs.size())
{
lhs.resize(len);
}
transform(lhs.begin(), lhs.begin() + len, rhs.begin(), lhs.begin(), op); // only the first len elements have a partner in rhs
return lhs;
}
vector<int>& operator+=(vector<int>& lhs, vector<int> rhs)
{
return memberwiseAssignOp(lhs, rhs, plus<int>());
}
vector<int>& operator-=(vector<int>& lhs, vector<int> rhs)
{
return memberwiseAssignOp(lhs, rhs, minus<int>());
}
More generally than just "no, it's not there", there's the simple fact that overloads of assignment operators need to be done as member functions, so it really can't be done. I guess, since what we're dealing with aren't really operators though, it could be done to the extent that a function could be written to receive a non-const reference to an object, and modify the object to which that referred.
I'm not at all sure you'd gain a whole lot from this though. The types for which it made much difference would really be those types for which it was substantially cheaper to modify an existing object than to overwrite an old object with a new value.
At one time (before C++11), that may have been a fair number of types. Since the introduction of rvalue references you can get roughly the same effect (but much more cleanly) by moving from the old object to the new object, and modifying as you see fit along the way.
In theory, there are probably still a few places that wouldn't work out quite as nicely. The obvious example would be an object that (directly) contains a lot of data, so moving still basically works out to copying.
Here is a list of Functional Operations.
As you can see there are no assignment operators.
You could of course just have this functionality yourself like follows:
a = std::plus<int>()(a, b);
But this might not be helpful at all.
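If you do want a compound-assignment function object, writing one is short. The sketch below uses a made-up name, plus_assign, since the standard library has no such type, and a simplified memberwise_assign helper loosely modelled on the question's memberwiseAssignOp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical function object; not part of the standard library.
struct plus_assign {
    template <class T, class U>
    T& operator()(T& lhs, U const& rhs) const { return lhs += rhs; }
};

// Applies op(lhs[i], rhs[i]) in place, growing lhs with zeros if rhs is longer.
template <class Op>
std::vector<int>& memberwise_assign(std::vector<int>& lhs,
                                    std::vector<int> const& rhs, Op op) {
    if (rhs.size() > lhs.size()) lhs.resize(rhs.size());
    for (std::size_t i = 0; i < rhs.size(); ++i) op(lhs[i], rhs[i]);
    return lhs;
}
```

Because the functor modifies its first argument in place, no temporary result has to be written back, which is the saving the question was after.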

Named objects vs. temporary objects: Is it better to avoid named objects when possible?

The following is an excerpt I found from a coding style documentation for a library:
Where possible, it can be better to use a temporary rather than
storing a named object, eg:
DoSomething( XName("blah") );
rather than
XName n("blah"); DoSomething( n );
as this makes it easier for the compiler to optimise the call, may
reduce the stack size of the function, etc. Don't forget to consider
the lifetime of the temporary, however.
Assuming the object does not need to be modified and lifetime issues are not a problem, is this guideline true? I was thinking that in this day and age it wouldn't make a difference. However, in some cases you couldn't avoid a named object:
XName n("blah");
// Do other stuff to mutate n
DoSomething( n );
Also, with move semantics, we can write code like this since the temporaries are eliminated:
std::string s1 = ...;
std::string s2 = ...;
std::string s3 = ...;
DoSomething( s1 + s2 + s3 );
rather than (I've heard that the compiler can optimize better with the following in C++03):
std::string s1 = ...;
std::string s2 = ...;
std::string s3 = ...;
s1 += s2;
s1 += s3; // Instead of s1 + s2 + s3
DoSomething(s1);
(Of course, the above may boil down to measure and see for yourself, but I was wondering if the general guideline mentioned above has any truth to it)
The main job of the compiler frontend is to remove names from everything to resolve the underlying semantic structures.
Tending to avoid names does help avoid taking addresses of objects unnecessarily, which can unintuitively stop the compiler from manipulating data. But there are enough ways to get the address of a temporary that it's all but moot. And named objects are special in that they are not eligible for constructor elision in C++, but as you mention, move semantics eliminate most expensive unnecessary copy-construction.
Just focus on writing readable code.
Your first example does eliminate a copy of n, but in C++11 you can use move semantics instead: DoSomething( std::move( n ) ).
In the example s1 + s2 + s3, it's also true that C++11 makes things more efficient, but move semantics and elimination of temporaries are different things. A move constructor just makes construction of a temporary less expensive.
I was also under the misimpression that C++11 would eliminate the temporaries, as long as you used the idiom
// What you should use in C++03
foo operator + ( foo lhs, foo const & rhs ) { return lhs += rhs; }
This is actually untrue; lhs is a named object, not a temporary, and it is not eligible for the return value optimization form of copy elision. In fact, in C++11 this will produce a copy, not a move! You would need to fix this with std::move( lhs += rhs );.
// What you should use in C++11
foo operator + ( foo lhs, foo const & rhs ) { return std::move( lhs += rhs ); }
Your example uses std::string, not foo, and that operator+ is defined (essentially, and since C++03) as
// What the C++03 Standard Library uses
string operator + ( string const & lhs, string const & rhs )
{ return string( lhs ) += rhs; } // Returns rvalue expression, as if moved.
This strategy has similar properties to the above, because a temporary is disqualified for copy elision once it is bound to a reference. There are two potential fixes, which give a choice between speed and safety. Neither fix is compatible with the first idiom, which with move already implements the safe style (and as such is what you should use!).
Safe style.
Here there are no named objects, but the temporary bound to the lhs argument cannot be directly constructed into the result; binding to a reference stops copy elision.
// What the C++11 Standard Library uses (in addition to the C++03 library style)
foo operator + ( foo && lhs, foo const & rhs )
{ return std::move( lhs += rhs ); }
Unsafe style.
A second overload accepting an rvalue reference and returning the same reference eliminates the intermediate temporary completely (no reliance on elision), allowing a chain of + calls to be converted perfectly into += calls. But unfortunately it also disqualifies the remaining temporary at the start of the call chain from lifetime extension, by binding it to a reference. So the returned reference is valid until the semicolon, but then it's going away and nothing can stop it. So this is mainly useful inside something like a template expression library, with documented restrictions on what results can be bound to a local reference.
// No temporary, but don't bind this result to a local!
foo && operator + ( foo && lhs, foo const & rhs )
{ return std::move( lhs += rhs ); }
Evaluating library documentation as such requires a little bit of evaluation of the library authors' skill. If they say to do things a certain quirky way because it's always more efficient, be skeptical because C++ isn't purposely designed to be quirky, but it is designed to be efficient.
However in the case of expression templates where temporaries include complicated type computations which would be interrupted by assignment to a named variable of concrete type, you should absolutely listen to what the authors say. In such a case, they would be presumably much more knowledgeable.
I think the accepted answer is wrong. Avoiding naming temporary objects is better.
The reason is that if you have
struct T { ... };
T foo(T obj) { return obj; }
// ...
T t;
foo(t);
then t will be copy-constructed, and this cannot be optimized out if the copy-constructor has observable side-effects.
By contrast, if you had said foo(T()), then calling the copy-constructor could be avoided completely, irrespective of potential side-effects.
Therefore, avoiding naming temporary objects is better practice in general.
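The claim can be checked by counting constructor calls. Tracked below is a hypothetical instrumented type standing in for T in the snippet above.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts copy and move constructions.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(Tracked const&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Takes its argument by value and returns it, as in the answer's foo.
Tracked foo(Tracked obj) { return obj; }
```

Passing a named object copies it into the parameter, passing a temporary does not, and std::move on the named object trades the copy for a move when the object is no longer needed.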
Here are a few points:
You can never know exactly what the compiler will optimize and what it will not. Optimization is a complex thing. Optimizer writers tend to be very careful not to break something. It is possible to hit a bug where the optimizer mistakenly decides that something should not be optimized. Coding standards in compilers are extremely high. Nevertheless they are written by humans.
This particular coding style excerpt does not seem very reasonable to me. These days compilers are almost always good. It is hard to imagine that the optimizer will get confused by XName n("blah"); DoSomething(n); - this code is too simple.
I would put similar coding guideline this way:
Write your code in a way that is easy to understand and modify;
Once performance problems are observed, look into generated code and think
how to please the compiler.
It is better to address the problem in this order, not the opposite way.

Has anyone found the need to declare the return parameter of a copy assignment operator const?

The copy assignment operator has the usual signature:
my_class & operator = (my_class const & rhs);
Does the following signature have any practical use?
my_class const & operator = (my_class const & rhs);
You can only define one or the other, but not both.
The principal reason to make the return type of copy-assignment a non-const reference is that it is a requirement for "Assignable" in the standard.
If you make the return type a const reference then your class won't meet the requirements for use in any of the standard library containers.
Don't do that. It prevents a client from writing something like:
(a = b).non_const_method();
instead of the longer form:
a = b;
a.non_const_method();
While you may not like the shorthand style, it's really up to the user of the library to decide how they want to write the code.
An answer that mirrors one from Overloading assignment operator in C++:
Returning a const& will still allow assignment chaining:
a = b = c;
But will disallow some of the more unusual uses:
(a = b) = c;
Note that this makes the assignment operator have semantics similar to what it has in C, where the value returned by the = operator is not an lvalue. In C++, the standard changed it so the = operator returns the type of the left operand, so the result is an lvalue. But as Steve Jessop noted in a comment to another answer, while that makes it so the compiler will accept
(a = b) = c;
even for built-ins, the result is undefined behavior for built-ins since a is modified twice with no intervening sequence point. That problem is avoided for non-builtins with an operator=() because the operator=() function call serves as a sequence point.
I see no problem returning a const& unless you want to specifically allow the lvalue semantics (and design the class to ensure it acts sensibly with those semantics). If your users want to do something unusual with the result of operator=(), I'd prefer that the class disallow it rather than hope it gets it right by chance instead of design.
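What each return type permits can be checked at compile time. In this sketch ConstRet, MutRet and assign_result are names invented for the example; they are not from the question.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct ConstRet {   // hypothetical: assignment returns a const reference
    int v = 0;
    ConstRet const& operator=(ConstRet const& rhs) { v = rhs.v; return *this; }
};

struct MutRet {     // hypothetical: assignment returns the usual non-const reference
    int v = 0;
    MutRet& operator=(MutRet const& rhs) { v = rhs.v; return *this; }
    void bump() { ++v; }
};

// Type of the expression `a = b` for an lvalue `a` of type X.
template <class X>
using assign_result = decltype(std::declval<X&>() = std::declval<X const&>());
```

Chaining a = b = c works with either class. Assigning to the result of an assignment, or calling a non-const member such as bump() on it, only compiles for MutRet.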
Also, note that while you said:
You can only define one or the other, but not both.
that's because the function signature in C++ doesn't take into account the return value type. You could however have multiple operator=() assignment operators that take different parameters and return different types appropriate to the parameter types:
my_class& operator=( my_class& rhs);
my_class const& operator=(my_class const& rhs);
I'm not entirely sure what this would buy you though. The object being assigned to (that is presumably the reference being returned) is non-const in both cases, so there's no logical reason to return a const& just because the right-hand side of the = is const. But maybe I'm missing something...
Effective C++ explains that this would break compatibility with the built-in types of C++.
You can do this with plain ints:
(x = y) = z;
so Meyers reasons that, however silly this looks, one should be able to do the same with one's own type as well.
This example appears in the 2nd Edition, although no longer in the 3rd. However, this quote from the 3rd Edition, Item 10, still says the same:
[...] assignment returns a reference to its left-hand argument, and that's the convention you should follow when you implement assignment operators for your classes:
class Widget {
public:
...
Widget& operator=(const Widget& rhs) // return type is a reference to
{ // the current class
...
return *this; // return the left-hand object
}
...
};
Why is everyone obsessing over (a = b) = c? Has that ever been written by accident?
There is probably some unforeseen utility of the result of assignment being altered. You don't just make arbitrary rules against made-up examples that look funny. Semantically there is no reason that it should be const, so do not declare it const for lexical side effects.
Here is an example of somewhat reasonable code that breaks for const & assignment:
my_class &ref = a = b;
As with any other usage of const, const should be the default unless you really want to let the user change the object.
Yes, it should be const. Otherwise clients can do this:
class MyClass
{
public:
MyClass & operator = (MyClass const & rhs);
};
void Foo() {
MyClass a, b, c;
(a = b) = c; //Yikes!
}

What is an overloaded operator in C++?

I realize this is a basic question but I have searched online, been to cplusplus.com, read through my book, and I can't seem to grasp the concept of overloaded operators. A specific example from cplusplus.com is:
// vectors: overloading operators example
#include <iostream>
using namespace std;
class CVector {
public:
int x,y;
CVector () {};
CVector (int,int);
CVector operator + (CVector);
};
CVector::CVector (int a, int b) {
x = a;
y = b;
}
CVector CVector::operator+ (CVector param) {
CVector temp;
temp.x = x + param.x;
temp.y = y + param.y;
return (temp);
}
int main () {
CVector a (3,1);
CVector b (1,2);
CVector c;
c = a + b;
cout << c.x << "," << c.y;
return 0;
}
From http://www.cplusplus.com/doc/tutorial/classes2/ but reading through it I'm still not understanding them at all. I just need a basic example of the point of the overloaded operator (which I assume is the "CVector CVector::operator+ (CVector param)").
There's also this example from wikipedia:
Time operator+(const Time& lhs, const Time& rhs)
{
Time temp = lhs;
temp.seconds += rhs.seconds;
if (temp.seconds >= 60)
{
temp.seconds -= 60;
temp.minutes++;
}
temp.minutes += rhs.minutes;
if (temp.minutes >= 60)
{
temp.minutes -= 60;
temp.hours++;
}
temp.hours += rhs.hours;
return temp;
}
From "http://en.wikipedia.org/wiki/Operator_overloading"
For the current assignment I'm working on, I need to overload a ++ and a -- operator.
Thanks in advance for the information and sorry about the somewhat vague question, unfortunately I'm just not sure on it at all.
Operator overloading is the technique that C++ provides to let you define how the operators in the language can be applied to non-built in objects.
In your example for the Time class, the overload for the + operator is:
Time operator+(const Time& lhs, const Time& rhs);
With that overload, you can now perform addition operations on Time objects in a 'natural' fashion:
Time t1 = some_time_initializer;
Time t2 = some_other_time_initializer;
Time t3 = t1 + t2; // calls operator+( t1, t2)
The overload for an operator is just a function with the special name "operator" followed by the symbol for the operator being overloaded. Most operators can be overloaded - ones that cannot are:
. .* :: and ?:
You can call the function directly by name, but usually don't (the point of operator overloading is to be able to use the operators normally).
The overloaded function that gets called is determined by normal overload resolution on the arguments to the operator - that's how the compiler knows to call the operator+() that uses the Time argument types from the example above.
One additional thing to be aware of when overloading the ++ and -- increment and decrement operators is that there are two versions of each - the prefix and the postfix forms. The postfix version of these operators takes an extra int parameter (which is passed 0 and has no purpose other than to differentiate between the two types of operator). The C++ standard has the following examples:
class X {
public:
X& operator++(); //prefix ++a
X operator++(int); //postfix a++
};
class Y { };
Y& operator++(Y&); //prefix ++b
Y operator++(Y&, int); //postfix b++
You should also be aware that the overloaded operators do not have to perform operations that are similar to the built in operators - being more or less normal functions they can do whatever you want. For example, the standard library's IO stream interface uses the shift operators for output and input to/from streams - which is really nothing like bit shifting. However, if you try to be too fancy with your operator overloads, you'll cause much confusion for people who try to follow your code (maybe even you when you look at your code later).
Use operator overloading with care.
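Since the assignment calls for ++ and --, here is a minimal sketch of both forms for each. Counter is a hypothetical class made up for the illustration.

```cpp
#include <cassert>

class Counter {   // hypothetical example type
    int n_;
public:
    explicit Counter(int n = 0) : n_(n) {}
    int value() const { return n_; }

    // Prefix forms: modify the object and return it by reference.
    Counter& operator++() { ++n_; return *this; }
    Counter& operator--() { --n_; return *this; }

    // Postfix forms: the unused int parameter only marks them as postfix.
    // They return a copy of the value from before the change.
    Counter operator++(int) { Counter old(*this); ++*this; return old; }
    Counter operator--(int) { Counter old(*this); --*this; return old; }
};
```

Writing the postfix versions in terms of the prefix ones keeps the actual increment logic in one place.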
An operator in C++ is just a function with a special name. So instead of saying Add(int,int) you say operator +(int,int).
Now as any other function, you can overload it to say work on other types. In your vector example, if you overload operator + to take CVector arguments (ie. operator +(CVector, CVector)), you can then say:
CVector a,b,res;
res=a+b;
Since ++ and -- are unary (they take only one argument), to overload the prefix form you'd write something like:
type& operator ++(type& p)
{
++p.value; // modify the operand itself
return p;
}
Where type is any type that has a field called value. You get the idea.
What you found in those references are not bad examples of when you'd want operator overloading (giving meaning to vector addition, for example), but they're horrible code when it comes down to the details.
For example, this is much more realistic, showing delegating to the compound assignment operator and proper marking of a const member function:
class Vector2
{
double m_x, m_y;
public:
Vector2(double x, double y) : m_x(x), m_y(y) {}
// Vector2(const Vector2& other) = default;
// Vector2& operator=(const Vector2& other) = default;
Vector2& operator+=(const Vector2& addend) { m_x += addend.m_x; m_y += addend.m_y; return *this; }
Vector2 operator+(const Vector2& addend) const { Vector2 sum(*this); return sum += addend; }
};
From your comments above, you don't see the point of all this operator overloading?
Operator overloading is simply 'syntactic sugar' hiding a method call, and making code somewhat clearer in many cases.
Consider a simple Integer class wrapping an int. You would write add and other arithmetic methods, possibly increment and decrement as well, requiring a method call such as my_int.add(5). Now renaming the add method to operator+ allows my_int + 5, which is more intuitive and clearer, cleaner code. But all it is really doing is hiding a call to your operator+ (renamed add?) method.
Things do get a bit more complex though, as operator + for numbers is well understood by everyone above 2nd grade. But as in the string example above, operators should usually only be applied where they have an intuitive meaning. The Apples example is a good example of where NOT to overload operators.
But applied to say, a List class, something like myList + anObject, should be intuitively understood as 'add anObject to myList', hence the use of the + operator. And operator '-' as meaning 'Removal from the list'.
As I said above, the point of all this is to make code (hopefully) clearer, as in the List example, which would you rather code? (and which do you find easier to read?) myList.add( anObject ) or myList + anObject? But in the background, a method (your implementation of operator+, or add) is being called either way. You can almost think of the compiler rewriting the code: my_int + 5 would become my_int.operator+(5)
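That rewriting can be tried directly. The Integer class below is a hypothetical minimal wrapper, written only to show that the operator syntax and the explicit member call are the same thing.

```cpp
#include <cassert>

class Integer {   // hypothetical wrapper around an int
    int v_;
public:
    explicit Integer(int v) : v_(v) {}
    int get() const { return v_; }

    // An ordinary member function whose name happens to be "operator+".
    Integer operator+(int rhs) const { return Integer(v_ + rhs); }
};
```

Both my_int + 5 and my_int.operator+(5) call the same function and give the same result.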
All the examples given, such as Time and Vector classes, have intuitive definitions for the operators. Vector addition... again, easier to code (and read) v1 = v2 + v3 than v1 = v2.add(v3). This is where all the caution you are likely to read about not going overboard with operators in your classes comes from, because for most classes they just won't make sense. But of course there is nothing stopping you putting an operator & into a class like Apple, just don't expect others to know what it does without seeing the code for it!
'Overloading' the operator simply means you are supplying the compiler with another definition for that operator, applied to instances of your class. Rather like overloading methods, same name... different parameters...
Hope this helps...
The "operator" in this case is the + symbol.
The idea here is that an operator does something. An overloaded operator does something different.
So, in this case, the '+' operator, normally used to add two numbers, is being "overloaded" to allow for adding vectors or time.
EDIT: Adding two integers is built-in to c++; the compiler automatically understands what you mean when you do
int x, y = 2, z = 2;
x = y + z;
Objects, on the other hand, can be anything, so using a '+' between two objects doesn't inherently make any sense. If you have something like
Apple apple1, apple2, apple3;
apple3 = apple1 + apple2;
What does it mean when you add two Apple objects together? Nothing, until you overload the '+' operator and tell the compiler what it is that you mean when you add two Apple objects together.
An overloaded operator is when you use an operator to work with types that C++ doesn't "natively" support for that operator.
For example, you can typically use the binary "+" operator to add numeric values (floats, ints, doubles, etc.). You can also add an integer type to a pointer - for instance:
char foo[] = "A few words";
char *p = &(foo[3]); // Points to "e"
char *q = foo + 3; // Also points to "e"
But that's it! You can't do any more natively with a binary "+" operator.
However, operator overloading lets you do things the designers of C++ didn't build into the language - like use the + operator to concatenate strings - for instance:
std::string a("A short"), b(" string.");
std::string c = a + b; // c is "A short string."
Once you wrap your head around that, the Wikipedia examples will make more sense.
An operator would be "+", "-" or "+=". These perform different operations on existing objects, and each in fact comes down to a method call. Unlike normal method calls, operators look much more natural to a human reader. Writing "1 + 2" just looks more normal and is shorter than "add(1,2)". If you overload an operator, you define the method it executes for your type.
In your first example, the "+" operator's method is overloaded, so that you can use it for vector-addition.
I would suggest that you copy the first example into an editor and play a little around with it. Once you understand what the code does, my suggestion would be to implement vector subtraction and multiplication.
Before starting out, there are many operators out there! Here is a list of all C++ operators: list.
With this being said, operator overloading in C++ is a way to make a certain operator behave in a particular way for an object.
For example, if you use the increment/decrement operators (++ and --) on an object, the compiler will not understand what needs to be incremented/decremented in the object because it is not a primitive type (int, char, float...). You must define the appropriate behavior for the compiler to understand what you mean. Operator overloading basically tells the compiler what must be accomplished when the increment/decrement operators are used with the object.
Also, you must pay attention to the fact that there are both postfix and prefix forms of increment and decrement, which becomes very important with iterators. Note that the syntax for overloading these two types of operators differs. Here is how you can overload them: Overloading the increment and decrement operators
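As a rough sketch of the difference between the two signatures, consider a hypothetical Counter class (the name and members are made up for this example). The prefix form takes no parameter and returns a reference to the updated object. The postfix form takes an unused int parameter, which exists only to distinguish the overload, and returns a copy of the previous state.

```cpp
#include <cassert>

// Hypothetical counter type, used only to show the two signatures.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}

    // Prefix form (++c): increment, then return the updated object by reference.
    Counter& operator++() {
        ++value_;
        return *this;
    }

    // Postfix form (c++): the unused int parameter only selects this overload.
    // It saves the old state, increments the object, and returns the saved copy.
    Counter operator++(int) {
        Counter old = *this;
        ++(*this);
        return old;
    }

    int value() const { return value_; }

private:
    int value_;
};

// Small self-check of both forms.
inline void check_counter() {
    Counter c(5);
    Counter before = c++;   // before holds 5, c now holds 6
    assert(before.value() == 5);
    assert(c.value() == 6);
    Counter& same = ++c;    // c now holds 7, same refers to c itself
    assert(same.value() == 7);
    assert(&same == &c);
}
```

Implementing the postfix form in terms of the prefix form keeps the two consistent. The extra copy made by the postfix form is the reason ++it is usually preferred over it++ for iterators.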
The accepted answer by Michael Burr is quite good in explaining the technique, but from the comments it seems that besides the 'how' you are interested in the 'why'. The main reasons to provide operator overloads for a given type are improving readability and providing a required interface.
If you have a type for which there is a single commonly understood meaning for an operator in the domain of your problem, then providing that as an operator overload makes code more readable:
std::complex<double> a(1,2), b(3,4), c( 5, 6 );
std::complex<double> d = a + b + c; // compare to d = a.add(b).add(c);
std::complex<double> e = (a + d) + (b + c); // e = a.add(d).add( b.add(c) );
If your type has a given property that will naturally be expressed with an operator, you can overload that particular operator for your type. Consider for example, that you want to compare your objects for equality. Providing operator== (and operator!=) can give you a simple readable way of doing so. This has the advantage of fulfilling a common interface that can be used with algorithms that depend on equality:
struct type {
type( int x ) : value(x) {}
int value;
};
bool operator==( type const & lhs, type const & rhs )
{ return lhs.value == rhs.value; }
bool operator!=( type const & lhs, type const & rhs )
{ return !(lhs == rhs); }
std::vector<type> getObjects(); // creates and fills a vector
int main() {
std::vector<type> objects = getObjects();
type t( 5 );
std::find( objects.begin(), objects.end(), t );
}
Note that when the find algorithm is implemented, it depends on == being defined. The implementation of find will work with primitive types as well as with any user defined type that has an equality operator defined. There is a common single interface that makes sense. Compare that with the Java version, where comparison of object types must be performed through the .equals member function, while comparing primitive types can be done with ==. By allowing you to overload the operators you can work with user defined types in the same way that you can with primitive types.
The same goes for ordering. If there is a well defined order in the domain of your class, then providing operator< is a simple way of implementing that order. Code will be readable, and your type will be usable in all situations where an ordering is required (strictly speaking, a strict weak ordering), as inside associative containers:
bool operator<( type const & lhs, type const & rhs )
{
return lhs.value < rhs.value; // compare the members; lhs < rhs here would recurse forever
}
std::map<type, int> m; // m will use the natural `operator<` order
A common pitfall since operator overloading was introduced into the language is that of the 'golden hammer': once you have a golden hammer everything looks like a nail, and operator overloading has been abused.
It is important to note that the reason for overloading in the first place is improving readability. Readability is only improved if, when a programmer looks at the code, the intention of each operation is clear at first glance, without having to read the definitions. When you see two complex numbers being added like a + b, you know what the code is doing. If the definition of the operator is not natural (say you decide to implement it as adding only the real parts), then the code becomes harder to read than if you had provided a (member) function. The same happens if the meaning of the operation is not well defined for your type:
MyVector a, b;
MyVector c = a + b;
What is c? Is it a vector where each element i is the sum of the respective elements from a and b, or is it a vector created by concatenating the elements of a before the elements of b? To understand the code, you would need to go to the definition of the operation, and that means that overloading the operator is less readable than providing a function:
MyVector c = append( a, b );
The set of operators that can be overloaded is not restricted to the arithmetic and relational operators. You can overload operator[] to index into a type, or operator() to create a callable object that can be used as a function (these are called functors) or that will simplify usage of the class:
class vector {
public:
int operator[]( int );
};
vector v;
std::cout << v[0] << std::endl;
class matrix {
public:
int operator()( int row, int column );
// before C++23, operator[] cannot be overloaded with more than 1 argument
};
matrix m;
std::cout << m( 3,4 ) << std::endl;
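The matrix declaration above leaves out the storage. One possible way to fill it in is sketched below. The Matrix class, its row-major layout, and its member names are assumptions made for illustration, not the answer's actual implementation. Bounds checking is omitted to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major matrix; name and layout are illustrative only.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0) {}

    // Non-const access returns a reference, so m(r, c) = x works.
    int& operator()(std::size_t row, std::size_t col) {
        return data_[row * cols_ + col];
    }

    // Const access for read-only matrices returns the element by value.
    int operator()(std::size_t row, std::size_t col) const {
        return data_[row * cols_ + col];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> data_;  // element (r, c) lives at index r * cols_ + c
};

// Small self-check: write through the non-const overload, read through the const one.
inline void check_matrix() {
    Matrix m(4, 5);
    m(3, 4) = 42;
    Matrix const& view = m;
    assert(view(3, 4) == 42);
    assert(view(0, 0) == 0);
}
```

Providing both a const and a non-const overload is a common pattern for element access, since it lets the same m(row, column) syntax work for reading and for writing.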
There are other uses of operator overloading. In particular, operator, (the comma operator) can be overloaded in really fancy ways for metaprogramming purposes, but that is probably much more complex than what you really care about for now.
Another use of operator overloading, AFAIK unique to C++, is the ability to overload the assignment operator. If you have:
class CVector
{
// ...
private:
size_t capacity;
size_t length;
double* data;
};
void func()
{
CVector a, b;
// ...
a = b;
}
Then a.data and b.data will point to the same location, and if you modify a, you affect b as well. That's probably not what you want. But you can write:
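CVector& CVector::operator=(const CVector& rhs)
{
if (this == &rhs) return (*this); // self-assignment: nothing to do
double* copy = new double[rhs.length]; // allocate first, so a failed new leaves *this intact
memcpy(copy, rhs.data, rhs.length * sizeof(double)); // requires <cstring>
delete[] data;
data = copy;
capacity = length = rhs.length;
return (*this);
}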
and get a deep copy.
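An alternative that is often suggested for this kind of class is the copy-and-swap idiom, which is a different technique from the hand-written assignment above. The sketch below uses a hypothetical Buffer class standing in for CVector; its names and members are assumptions made for this example. The assignment operator takes its argument by value, so the copy constructor does the deep copy, and then the two objects swap their contents.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning array, standing in for the CVector class above.
class Buffer {
public:
    explicit Buffer(std::size_t n) : length_(n), data_(new double[n]()) {}

    // The copy constructor performs the deep copy.
    Buffer(Buffer const& other)
        : length_(other.length_), data_(new double[other.length_]) {
        std::copy(other.data_, other.data_ + other.length_, data_);
    }

    // rhs is a by-value copy of the right-hand side. Swapping hands the old
    // storage to rhs, whose destructor then releases it. Self-assignment
    // needs no special case.
    Buffer& operator=(Buffer rhs) {
        swap(rhs);
        return *this;
    }

    ~Buffer() { delete[] data_; }

    void swap(Buffer& other) {
        std::swap(length_, other.length_);
        std::swap(data_, other.data_);
    }

    double& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return length_; }

private:
    std::size_t length_;
    double* data_;
};

// Small self-check: after assignment the two objects own separate storage.
inline void check_deep_copy() {
    Buffer a(3);
    Buffer b(3);
    b[0] = 1.5;
    a = b;        // deep copy
    a[0] = 9.0;   // must not affect b
    assert(b[0] == 1.5);
    assert(a[0] == 9.0);
    Buffer& alias = a;
    a = alias;    // self-assignment through an alias is safe
    assert(a[0] == 9.0);
}
```

The trade-off is one extra allocation even when the existing storage could have been reused. In exchange the operator is short and leaves the object unchanged if the copy throws.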
Operator overloading allows you to give your own meaning to an operator.
For example, consider the following code snippet:
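const char* str1 = "String1";
const char* str2 = "String2";
char str3[20];
str3 = str1 + str2; // does not compile: pointers cannot be added, arrays cannot be assigned
You cannot overload "+" for built-in types such as char*, but a string class can overload it to concatenate two strings, which is what std::string does. Doesn't this look more programmer-friendly?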
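Here is a minimal sketch of how a class type can provide that concatenation. Text is a hypothetical wrapper invented for this example; real code would normally use std::string directly, which already overloads +. The non-member operator+ is built on top of operator+=, a common way to keep the two consistent.

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal string wrapper, for illustration only.
class Text {
public:
    // Non-explicit on purpose, so string literals convert to Text implicitly.
    Text(char const* s) : chars_(s) {}

    // Compound assignment appends rhs to this object.
    Text& operator+=(Text const& rhs) {
        chars_ += rhs.chars_;
        return *this;
    }

    std::string const& str() const { return chars_; }

private:
    std::string chars_;
};

// Non-member operator+: lhs is taken by value so it can be modified and returned.
inline Text operator+(Text lhs, Text const& rhs) {
    lhs += rhs;
    return lhs;
}

// Small self-check of the concatenation.
inline void check_text() {
    Text str1("String1");
    Text str2("String2");
    Text str3 = str1 + str2;
    assert(str3.str() == "String1String2");
    assert(str1.str() == "String1");  // operands are left unchanged
}
```

Because operator+ is a non-member and the constructor is not explicit, a mixed expression such as "prefix" + str1 also works, as long as at least one operand is a Text.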