Related
What is the copy-and-swap idiom and when should it be used? What problems does it solve? Does it change for C++11?
Related:
What are your favorite C++ Coding Style idioms: Copy-swap
Copy constructor and = operator overload in C++: is a common function possible?
What is copy elision and how it optimizes copy-and-swap idiom
C++: dynamically allocating an array of objects?
Overview
Why do we need the copy-and-swap idiom?
Any class that manages a resource (a wrapper, like a smart pointer) needs to implement The Big Three. While the goals and implementation of the copy-constructor and destructor are straightforward, the copy-assignment operator is arguably the most nuanced and difficult. How should it be done? What pitfalls need to be avoided?
The copy-and-swap idiom is the solution, and elegantly assists the assignment operator in achieving two things: avoiding code duplication, and providing a strong exception guarantee.
How does it work?
Conceptually, it works by using the copy-constructor's functionality to create a local copy of the data, then takes the copied data with a swap function, swapping the old data with the new data. The temporary copy then destructs, taking the old data with it. We are left with a copy of the new data.
In order to use the copy-and-swap idiom, we need three things: a working copy-constructor, a working destructor (both are the basis of any wrapper, so should be complete anyway), and a swap function.
A swap function is a non-throwing function that swaps two objects of a class, member for member. We might be tempted to use std::swap instead of providing our own, but this would be impossible; std::swap uses the copy-constructor and copy-assignment operator within its implementation, and we'd ultimately be trying to define the assignment operator in terms of itself!
(Not only that, but unqualified calls to swap will use our custom swap operator, skipping over the unnecessary construction and destruction of our class that std::swap would entail.)
An in-depth explanation
The goal
Let's consider a concrete case. We want to manage, in an otherwise useless class, a dynamic array. We start with a working constructor, copy-constructor, and destructor:
#include <algorithm> // std::copy
#include <cstddef> // std::size_t
class dumb_array
{
public:
// (default) constructor
dumb_array(std::size_t size = 0)
: mSize(size),
mArray(mSize ? new int[mSize]() : nullptr)
{
}
// copy-constructor
dumb_array(const dumb_array& other)
: mSize(other.mSize),
mArray(mSize ? new int[mSize] : nullptr)
{
// note that this is non-throwing, because of the data
// types being used; more attention to detail with regards
// to exceptions must be given in a more general case, however
std::copy(other.mArray, other.mArray + mSize, mArray);
}
// destructor
~dumb_array()
{
delete [] mArray;
}
private:
std::size_t mSize;
int* mArray;
};
This class almost manages the array successfully, but it needs operator= to work correctly.
A failed solution
Here's how a naive implementation might look:
// the hard part
dumb_array& operator=(const dumb_array& other)
{
if (this != &other) // (1)
{
// get rid of the old data...
delete [] mArray; // (2)
mArray = nullptr; // (2) *(see footnote for rationale)
// ...and put in the new
mSize = other.mSize; // (3)
mArray = mSize ? new int[mSize] : nullptr; // (3)
std::copy(other.mArray, other.mArray + mSize, mArray); // (3)
}
return *this;
}
And we say we're finished; this now manages an array, without leaks. However, it suffers from three problems, marked sequentially in the code as (n).
The first is the self-assignment test.
This check serves two purposes: it's an easy way to prevent us from running needless code on self-assignment, and it protects us from subtle bugs (such as deleting the array only to try and copy it). But in all other cases it merely serves to slow the program down, and act as noise in the code; self-assignment rarely occurs, so most of the time this check is a waste.
It would be better if the operator could work properly without it.
The second is that it only provides a basic exception guarantee. If new int[mSize] fails, *this will have been modified. (Namely, the size is wrong and the data is gone!)
For a strong exception guarantee, it would need to be something akin to:
dumb_array& operator=(const dumb_array& other)
{
if (this != &other) // (1)
{
// get the new data ready before we replace the old
std::size_t newSize = other.mSize;
int* newArray = newSize ? new int[newSize]() : nullptr; // (3)
std::copy(other.mArray, other.mArray + newSize, newArray); // (3)
// replace the old data (all are non-throwing)
delete [] mArray;
mSize = newSize;
mArray = newArray;
}
return *this;
}
The code has expanded! Which leads us to the third problem: code duplication.
Our assignment operator effectively duplicates all the code we've already written elsewhere, and that's a terrible thing.
In our case, the core of it is only two lines (the allocation and the copy), but with more complex resources this code bloat can be quite a hassle. We should strive to never repeat ourselves.
(One might wonder: if this much code is needed to manage one resource correctly, what if my class manages more than one?
While this may seem to be a valid concern, and indeed it requires non-trivial try/catch clauses, this is a non-issue.
That's because a class should manage one resource only!)
A successful solution
As mentioned, the copy-and-swap idiom will fix all these issues. But right now, we have all the requirements except one: a swap function. While The Rule of Three successfully entails the existence of our copy-constructor, assignment operator, and destructor, it should really be called "The Big Three and A Half": any time your class manages a resource it also makes sense to provide a swap function.
We need to add swap functionality to our class, and we do that as follows†:
class dumb_array
{
public:
// ...
friend void swap(dumb_array& first, dumb_array& second) // nothrow
{
// enable ADL (not necessary in our case, but good practice)
using std::swap;
// by swapping the members of two objects,
// the two objects are effectively swapped
swap(first.mSize, second.mSize);
swap(first.mArray, second.mArray);
}
// ...
};
(Here is the explanation why public friend swap.) Now not only can we swap our dumb_array's, but swaps in general can be more efficient; it merely swaps pointers and sizes, rather than allocating and copying entire arrays. Aside from this bonus in functionality and efficiency, we are now ready to implement the copy-and-swap idiom.
Without further ado, our assignment operator is:
dumb_array& operator=(dumb_array other) // (1)
{
swap(*this, other); // (2)
return *this;
}
And that's it! With one fell swoop, all three problems are elegantly tackled at once.
Why does it work?
We first notice an important choice: the parameter argument is taken by-value. While one could just as easily do the following (and indeed, many naive implementations of the idiom do):
dumb_array& operator=(const dumb_array& other)
{
dumb_array temp(other);
swap(*this, temp);
return *this;
}
We lose an important optimization opportunity. Not only that, but this choice is critical in C++11, which is discussed later. (On a general note, a remarkably useful guideline is as follows: if you're going to make a copy of something in a function, let the compiler do it in the parameter list.‡)
Either way, this method of obtaining our resource is the key to eliminating code duplication: we get to use the code from the copy-constructor to make the copy, and never need to repeat any bit of it. Now that the copy is made, we are ready to swap.
Observe that upon entering the function that all the new data is already allocated, copied, and ready to be used. This is what gives us a strong exception guarantee for free: we won't even enter the function if construction of the copy fails, and it's therefore not possible to alter the state of *this. (What we did manually before for a strong exception guarantee, the compiler is doing for us now; how kind.)
At this point we are home-free, because swap is non-throwing. We swap our current data with the copied data, safely altering our state, and the old data gets put into the temporary. The old data is then released when the function returns. (Where upon the parameter's scope ends and its destructor is called.)
Because the idiom repeats no code, we cannot introduce bugs within the operator. Note that this means we are rid of the need for a self-assignment check, allowing a single uniform implementation of operator=. (Additionally, we no longer have a performance penalty on non-self-assignments.)
And that is the copy-and-swap idiom.
What about C++11?
The next version of C++, C++11, makes one very important change to how we manage resources: the Rule of Three is now The Rule of Four (and a half). Why? Because not only do we need to be able to copy-construct our resource, we need to move-construct it as well.
Luckily for us, this is easy:
class dumb_array
{
public:
// ...
// move constructor
dumb_array(dumb_array&& other) noexcept ††
: dumb_array() // initialize via default constructor, C++11 only
{
swap(*this, other);
}
// ...
};
What's going on here? Recall the goal of move-construction: to take the resources from another instance of the class, leaving it in a state guaranteed to be assignable and destructible.
So what we've done is simple: initialize via the default constructor (a C++11 feature), then swap with other; we know a default constructed instance of our class can safely be assigned and destructed, so we know other will be able to do the same, after swapping.
(Note that some compilers do not support constructor delegation; in this case, we have to manually default construct the class. This is an unfortunate but luckily trivial task.)
Why does that work?
That is the only change we need to make to our class, so why does it work? Remember the ever-important decision we made to make the parameter a value and not a reference:
dumb_array& operator=(dumb_array other); // (1)
Now, if other is being initialized with an rvalue, it will be move-constructed. Perfect. In the same way C++03 let us re-use our copy-constructor functionality by taking the argument by-value, C++11 will automatically pick the move-constructor when appropriate as well. (And, of course, as mentioned in previously linked article, the copying/moving of the value may simply be elided altogether.)
And so concludes the copy-and-swap idiom.
Footnotes
*Why do we set mArray to null? Because if any further code in the operator throws, the destructor of dumb_array might be called; and if that happens without setting it to null, we attempt to delete memory that's already been deleted! We avoid this by setting it to null, as deleting null is a no-operation.
†There are other claims that we should specialize std::swap for our type, provide an in-class swap along-side a free-function swap, etc. But this is all unnecessary: any proper use of swap will be through an unqualified call, and our function will be found through ADL. One function will do.
‡The reason is simple: once you have the resource to yourself, you may swap and/or move it (C++11) anywhere it needs to be. And by making the copy in the parameter list, you maximize optimization.
††The move constructor should generally be noexcept, otherwise some code (e.g. std::vector resizing logic) will use the copy constructor even when a move would make sense. Of course, only mark it noexcept if the code inside doesn't throw exceptions.
Assignment, at its heart, is two steps: tearing down the object's old state and building its new state as a copy of some other object's state.
Basically, that's what the destructor and the copy constructor do, so the first idea would be to delegate the work to them. However, since destruction mustn't fail, while construction might, we actually want to do it the other way around: first perform the constructive part and, if that succeeded, then do the destructive part. The copy-and-swap idiom is a way to do just that: It first calls a class' copy constructor to create a temporary object, then swaps its data with the temporary's, and then lets the temporary's destructor destroy the old state.
Since swap() is supposed to never fail, the only part which might fail is the copy-construction. That is performed first, and if it fails, nothing will be changed in the targeted object.
In its refined form, copy-and-swap is implemented by having the copy performed by initializing the (non-reference) parameter of the assignment operator:
T& operator=(T tmp)
{
this->swap(tmp);
return *this;
}
There are some good answers already. I'll focus mainly on what I think they lack - an explanation of the "cons" with the copy-and-swap idiom....
What is the copy-and-swap idiom?
A way of implementing the assignment operator in terms of a swap function:
X& operator=(X rhs)
{
swap(rhs);
return *this;
}
The fundamental idea is that:
the most error-prone part of assigning to an object is ensuring any resources the new state needs are acquired (e.g. memory, descriptors)
that acquisition can be attempted before modifying the current state of the object (i.e. *this) if a copy of the new value is made, which is why rhs is accepted by value (i.e. copied) rather than by reference
swapping the state of the local copy rhs and *this is usually relatively easy to do without potential failure/exceptions, given the local copy doesn't need any particular state afterwards (just needs state fit for the destructor to run, much as for an object being moved from in >= C++11)
When should it be used? (Which problems does it solve [/create]?)
When you want the assigned-to objected unaffected by an assignment that throws an exception, assuming you have or can write a swap with strong exception guarantee, and ideally one that can't fail/throw..†
When you want a clean, easy to understand, robust way to define the assignment operator in terms of (simpler) copy constructor, swap and destructor functions.
Self-assignment done as a copy-and-swap avoids oft-overlooked edge cases.‡
When any performance penalty or momentarily higher resource usage created by having an extra temporary object during the assignment is not important to your application. ⁂
† swap throwing: it's generally possible to reliably swap data members that the objects track by pointer, but non-pointer data members that don't have a throw-free swap, or for which swapping has to be implemented as X tmp = lhs; lhs = rhs; rhs = tmp; and copy-construction or assignment may throw, still have the potential to fail leaving some data members swapped and others not. This potential applies even to C++03 std::string's as James comments on another answer:
#wilhelmtell: In C++03, there is no mention of exceptions potentially thrown by std::string::swap (which is called by std::swap). In C++0x, std::string::swap is noexcept and must not throw exceptions. – James McNellis Dec 22 '10 at 15:24
‡ assignment operator implementation that seems sane when assigning from a distinct object can easily fail for self-assignment. While it might seem unimaginable that client code would even attempt self-assignment, it can happen relatively easily during algo operations on containers, with x = f(x); code where f is (perhaps only for some #ifdef branches) a macro ala #define f(x) x or a function returning a reference to x, or even (likely inefficient but concise) code like x = c1 ? x * 2 : c2 ? x / 2 : x;). For example:
struct X
{
T* p_;
size_t size_;
X& operator=(const X& rhs)
{
delete[] p_; // OUCH!
p_ = new T[size_ = rhs.size_];
std::copy(p_, rhs.p_, rhs.p_ + rhs.size_);
}
...
};
On self-assignment, the above code delete's x.p_;, points p_ at a newly allocated heap region, then attempts to read the uninitialised data therein (Undefined Behaviour), if that doesn't do anything too weird, copy attempts a self-assignment to every just-destructed 'T'!
⁂ The copy-and-swap idiom can introduce inefficiencies or limitations due to the use of an extra temporary (when the operator's parameter is copy-constructed):
struct Client
{
IP_Address ip_address_;
int socket_;
X(const X& rhs)
: ip_address_(rhs.ip_address_), socket_(connect(rhs.ip_address_))
{ }
};
Here, a hand-written Client::operator= might check if *this is already connected to the same server as rhs (perhaps sending a "reset" code if useful), whereas the copy-and-swap approach would invoke the copy-constructor which would likely be written to open a distinct socket connection then close the original one. Not only could that mean a remote network interaction instead of a simple in-process variable copy, it could run afoul of client or server limits on socket resources or connections. (Of course this class has a pretty horrid interface, but that's another matter ;-P).
This answer is more like an addition and a slight modification to the answers above.
In some versions of Visual Studio (and possibly other compilers) there is a bug that is really annoying and doesn't make sense. So if you declare/define your swap function like this:
friend void swap(A& first, A& second) {
std::swap(first.size, second.size);
std::swap(first.arr, second.arr);
}
... the compiler will yell at you when you call the swap function:
This has something to do with a friend function being called and this object being passed as a parameter.
A way around this is to not use friend keyword and redefine the swap function:
void swap(A& other) {
std::swap(size, other.size);
std::swap(arr, other.arr);
}
This time, you can just call swap and pass in other, thus making the compiler happy:
After all, you don't need to use a friend function to swap 2 objects. It makes just as much sense to make swap a member function that has one other object as a parameter.
You already have access to this object, so passing it in as a parameter is technically redundant.
I would like to add a word of warning when you are dealing with C++11-style allocator-aware containers. Swapping and assignment have subtly different semantics.
For concreteness, let us consider a container std::vector<T, A>, where A is some stateful allocator type, and we'll compare the following functions:
void fs(std::vector<T, A> & a, std::vector<T, A> & b)
{
a.swap(b);
b.clear(); // not important what you do with b
}
void fm(std::vector<T, A> & a, std::vector<T, A> & b)
{
a = std::move(b);
}
The purpose of both functions fs and fm is to give a the state that b had initially. However, there is a hidden question: What happens if a.get_allocator() != b.get_allocator()? The answer is: It depends. Let's write AT = std::allocator_traits<A>.
If AT::propagate_on_container_move_assignment is std::true_type, then fm reassigns the allocator of a with the value of b.get_allocator(), otherwise it does not, and a continues to use its original allocator. In that case, the data elements need to be swapped individually, since the storage of a and b is not compatible.
If AT::propagate_on_container_swap is std::true_type, then fs swaps both data and allocators in the expected fashion.
If AT::propagate_on_container_swap is std::false_type, then we need a dynamic check.
If a.get_allocator() == b.get_allocator(), then the two containers use compatible storage, and swapping proceeds in the usual fashion.
However, if a.get_allocator() != b.get_allocator(), the program has undefined behaviour (cf. [container.requirements.general/8].
The upshot is that swapping has become a non-trivial operation in C++11 as soon as your container starts supporting stateful allocators. That's a somewhat "advanced use case", but it's not entirely unlikely, since move optimizations usually only become interesting once your class manages a resource, and memory is one of the most popular resources.
In the assignment operator of a class, you usually need to check if the object being assigned is the invoking object so you don't screw things up:
Class& Class::operator=(const Class& rhs) {
if (this != &rhs) {
// do the assignment
}
return *this;
}
Do you need the same thing for the move assignment operator? Is there ever a situation where this == &rhs would be true?
? Class::operator=(Class&& rhs) {
?
}
First, the Copy and Swap is not always the correct way to implement Copy Assignment. Almost certainly in the case of dumb_array, this is a sub-optimal solution.
The use of Copy and Swap is for dumb_array is a classic example of putting the most expensive operation with the fullest features at the bottom layer. It is perfect for clients who want the fullest feature and are willing to pay the performance penalty. They get exactly what they want.
But it is disastrous for clients who do not need the fullest feature and are instead looking for the highest performance. For them dumb_array is just another piece of software they have to rewrite because it is too slow. Had dumb_array been designed differently, it could have satisfied both clients with no compromises to either client.
The key to satisfying both clients is to build the fastest operations in at the lowest level, and then to add API on top of that for fuller features at more expense. I.e. you need the strong exception guarantee, fine, you pay for it. You don't need it? Here's a faster solution.
Let's get concrete: Here's the fast, basic exception guarantee Copy Assignment operator for dumb_array:
dumb_array& operator=(const dumb_array& other)
{
if (this != &other)
{
if (mSize != other.mSize)
{
delete [] mArray;
mArray = nullptr;
mArray = other.mSize ? new int[other.mSize] : nullptr;
mSize = other.mSize;
}
std::copy(other.mArray, other.mArray + mSize, mArray);
}
return *this;
}
Explanation:
One of the more expensive things you can do on modern hardware is make a trip to the heap. Anything you can do to avoid a trip to the heap is time & effort well spent. Clients of dumb_array may well want to often assign arrays of the same size. And when they do, all you need to do is a memcpy (hidden under std::copy). You don't want to allocate a new array of the same size and then deallocate the old one of the same size!
Now for your clients who actually want strong exception safety:
template <class C>
C&
strong_assign(C& lhs, C rhs)
{
swap(lhs, rhs);
return lhs;
}
Or maybe if you want to take advantage of move assignment in C++11 that should be:
template <class C>
C&
strong_assign(C& lhs, C rhs)
{
lhs = std::move(rhs);
return lhs;
}
If dumb_array's clients value speed, they should call the operator=. If they need strong exception safety, there are generic algorithms they can call that will work on a wide variety of objects and need only be implemented once.
Now back to the original question (which has a type-o at this point in time):
Class&
Class::operator=(Class&& rhs)
{
if (this == &rhs) // is this check needed?
{
// ...
}
return *this;
}
This is actually a controversial question. Some will say yes, absolutely, some will say no.
My personal opinion is no, you don't need this check.
Rationale:
When an object binds to an rvalue reference it is one of two things:
A temporary.
An object the caller wants you to believe is a temporary.
If you have a reference to an object that is an actual temporary, then by definition, you have a unique reference to that object. It can't possibly be referenced by anywhere else in your entire program. I.e. this == &temporary is not possible.
Now if your client has lied to you and promised you that you're getting a temporary when you're not, then it is the client's responsibility to be sure that you don't have to care. If you want to be really careful, I believe that this would be a better implementation:
Class&
Class::operator=(Class&& other)
{
assert(this != &other);
// ...
return *this;
}
I.e. If you are passed a self reference, this is a bug on the part of the client that should be fixed.
For completeness, here is a move assignment operator for dumb_array:
dumb_array& operator=(dumb_array&& other)
{
assert(this != &other);
delete [] mArray;
mSize = other.mSize;
mArray = other.mArray;
other.mSize = 0;
other.mArray = nullptr;
return *this;
}
In the typical use case of move assignment, *this will be a moved-from object and so delete [] mArray; should be a no-op. It is critical that implementations make delete on a nullptr as fast as possible.
Caveat:
Some will argue that swap(x, x) is a good idea, or just a necessary evil. And this, if the swap goes to the default swap, can cause a self-move-assignment.
I disagree that swap(x, x) is ever a good idea. If found in my own code, I will consider it a performance bug and fix it. But in case you want to allow it, realize that swap(x, x) only does self-move-assignemnet on a moved-from value. And in our dumb_array example this will be perfectly harmless if we simply omit the assert, or constrain it to the moved-from case:
dumb_array& operator=(dumb_array&& other)
{
assert(this != &other || mSize == 0);
delete [] mArray;
mSize = other.mSize;
mArray = other.mArray;
other.mSize = 0;
other.mArray = nullptr;
return *this;
}
If you self-assign two moved-from (empty) dumb_array's, you don't do anything incorrect aside from inserting useless instructions into your program. This same observation can be made for the vast majority of objects.
<Update>
I've given this issue some more thought, and changed my position somewhat. I now believe that assignment should be tolerant of self assignment, but that the post conditions on copy assignment and move assignment are different:
For copy assignment:
x = y;
one should have a post-condition that the value of y should not be altered. When &x == &y then this postcondition translates into: self copy assignment should have no impact on the value of x.
For move assignment:
x = std::move(y);
one should have a post-condition that y has a valid but unspecified state. When &x == &y then this postcondition translates into: x has a valid but unspecified state. I.e. self move assignment does not have to be a no-op. But it should not crash. This post-condition is consistent with allowing swap(x, x) to just work:
template <class T>
void
swap(T& x, T& y)
{
// assume &x == &y
T tmp(std::move(x));
// x and y now have a valid but unspecified state
x = std::move(y);
// x and y still have a valid but unspecified state
y = std::move(tmp);
// x and y have the value of tmp, which is the value they had on entry
}
The above works, as long as x = std::move(x) doesn't crash. It can leave x in any valid but unspecified state.
I see three ways to program the move assignment operator for dumb_array to achieve this:
dumb_array& operator=(dumb_array&& other)
{
delete [] mArray;
// set *this to a valid state before continuing
mSize = 0;
mArray = nullptr;
// *this is now in a valid state, continue with move assignment
mSize = other.mSize;
mArray = other.mArray;
other.mSize = 0;
other.mArray = nullptr;
return *this;
}
The above implementation tolerates self assignment, but *this and other end up being a zero-sized array after the self-move assignment, no matter what the original value of *this is. This is fine.
dumb_array& operator=(dumb_array&& other)
{
if (this != &other)
{
delete [] mArray;
mSize = other.mSize;
mArray = other.mArray;
other.mSize = 0;
other.mArray = nullptr;
}
return *this;
}
The above implementation tolerates self assignment the same way the copy assignment operator does, by making it a no-op. This is also fine.
dumb_array& operator=(dumb_array&& other)
{
swap(other);
return *this;
}
The above is ok only if dumb_array does not hold resources that should be destructed "immediately". For example if the only resource is memory, the above is fine. If dumb_array could possibly hold mutex locks or the open state of files, the client could reasonably expect those resources on the lhs of the move assignment to be immediately released and therefore this implementation could be problematic.
The cost of the first is two extra stores. The cost of the second is a test-and-branch. Both work. Both meet all of the requirements of Table 22 MoveAssignable requirements in the C++11 standard. The third also works modulo the non-memory-resource-concern.
All three implementations can have different costs depending on the hardware: How expensive is a branch? Are there lots of registers or very few?
The take-away is that self-move-assignment, unlike self-copy-assignment, does not have to preserve the current value.
</Update>
One final (hopefully) edit inspired by Luc Danton's comment:
If you're writing a high level class that doesn't directly manage memory (but may have bases or members that do), then the best implementation of move assignment is often:
Class& operator=(Class&&) = default;
This will move assign each base and each member in turn, and will not include a this != &other check. This will give you the very highest performance and basic exception safety assuming no invariants need to be maintained among your bases and members. For your clients demanding strong exception safety, point them towards strong_assign.
First, you got the signature of the move-assignment operator wrong. Since moves steal resources from the source object, the source has to be a non-const r-value reference.
Class &Class::operator=( Class &&rhs ) {
//...
return *this;
}
Note that you still return via a (non-const) l-value reference.
For either type of direct assignment, the standard isn't to check for self-assignment, but to make sure a self-assignment doesn't cause a crash-and-burn. Generally, no one explicitly does x = x or y = std::move(y) calls, but aliasing, especially through multiple functions, may lead a = b or c = std::move(d) into being self-assignments. An explicit check for self-assignment, i.e. this == &rhs, that skips the meat of the function when true is one way to ensure self-assignment safety. But it's one of the worst ways, since it optimizes a (hopefully) rare case while it's an anti-optimization for the more common case (due to branching and possibly cache misses).
Now when (at least) one of the operands is a directly temporary object, you can never have a self-assignment scenario. Some people advocate assuming that case and optimize the code for it so much that the code becomes suicidally stupid when the assumption is wrong. I say that dumping the same-object check on users is irresponsible. We don't make that argument for copy-assignment; why reverse the position for move-assignment?
Let's make an example, altered from another respondent:
dumb_array& dumb_array::operator=(const dumb_array& other)
{
if (mSize != other.mSize)
{
delete [] mArray;
mArray = nullptr; // clear this...
mSize = 0u; // ...and this in case the next line throws
mArray = other.mSize ? new int[other.mSize] : nullptr;
mSize = other.mSize;
}
std::copy(other.mArray, other.mArray + mSize, mArray);
return *this;
}
This copy-assignment handles self-assignment gracefully without an explicit check. If the source and destination sizes differ, then deallocation and reallocation precede the copying. Otherwise, just the copying is done. Self-assignment does not get an optimized path, it gets dumped into the same path as when the source and destination sizes start off equal. The copying is technically unnecessary when the two objects are equivalent (including when they're the same object), but that's the price when not doing an equality check (value-wise or address-wise) since said check itself would be a waste most of the time. Note that the object self-assignment here will cause a series of element-level self-assignments; the element type has to be safe for doing this.
Like its source example, this copy-assignment provides the basic exception safety guarantee. If you want the strong guarantee, then use the unified-assignment operator from the original Copy and Swap query, which handles both copy- and move-assignment. But the point of this example is to reduce safety by one rank to gain speed. (BTW, we're assuming that the individual elements' values are independent; that there's no invariant constraint limiting some values compared to others.)
Let's look at a move-assignment for this same type:
class dumb_array
{
//...
void swap(dumb_array& other) noexcept
{
// Just in case we add UDT members later
using std::swap;
// both members are built-in types -> never throw
swap( this->mArray, other.mArray );
swap( this->mSize, other.mSize );
}
dumb_array& operator=(dumb_array&& other) noexcept
{
this->swap( other );
return *this;
}
//...
};
void swap( dumb_array &l, dumb_array &r ) noexcept { l.swap( r ); }
A swappable type that needs customization should have a two-argument free function called swap in the same namespace as the type. (The namespace restriction lets unqualified calls to swap to work.) A container type should also add a public swap member function to match the standard containers. If a member swap is not provided, then the free-function swap probably needs to be marked as a friend of the swappable type. If you customize moves to use swap, then you have to provide your own swapping code; the standard code calls the type's move code, which would result in infinite mutual recursion for move-customized types.
Like destructors, swap functions and move operations should be never-throw if at all possible, and probably marked as such (in C++11). Standard library types and routines have optimizations for non-throwable moving types.
This first version of move-assignment fulfills the basic contract. The source's resource markers are transferred to the destination object. The old resources won't be leaked since the source object now manages them. And the source object is left in a usable state where further operations, including assignment and destruction, can be applied to it.
Note that this move-assignment is automatically safe for self-assignment, since the swap call is. It's also strongly exception safe. The problem is unnecessary resource retention. The old resources for the destination are conceptually no longer needed, but here they are still around only so the source object can stay valid. If the scheduled destruction of the source object is a long way off, we're wasting resource space, or worse if the total resource space is limited and other resource petitions will happen before the (new) source object officially dies.
This issue is what caused the controversial current guru advice concerning self-targeting during move-assignment. The way to write move-assignment without lingering resources is something like:
class dumb_array
{
//...
dumb_array& operator=(dumb_array&& other) noexcept
{
delete [] this->mArray; // kill old resources
this->mArray = other.mArray;
this->mSize = other.mSize;
other.mArray = nullptr; // reset source
other.mSize = 0u;
return *this;
}
//...
};
The source is reset to default conditions, while the old destination resources are destroyed. In the self-assignment case, your current object ends up committing suicide. The main way around it is to surround the action code with an if(this != &other) block, or screw it and let clients eat an assert(this != &other) initial line (if you're feeling nice).
An alternative is to study how to make copy-assignment strongly exception safe, without unified-assignment, and apply it to move-assignment:
class dumb_array
{
//...
dumb_array& operator=(dumb_array&& other) noexcept
{
dumb_array temp{ std::move(other) };
this->swap( temp );
return *this;
}
//...
};
When other and this are distinct, other is emptied by the move to temp and stays that way. Then this loses its old resources to temp while getting the resources originally held by other. Then the old resources of this get killed when temp does.
When self-assignment happens, the emptying of other to temp empties this as well. Then target object gets its resources back when temp and this swap. The death of temp claims an empty object, which should be practically a no-op. The this/other object keeps its resources.
The move-assignment should be never-throw as long as move-construction and swapping are also. The cost of also being safe during self-assignment is a few more instructions over low-level types, which should be swamped out by the deallocation call.
I'm in the camp of those that want self-assignment safe operators, but don't want to write self-assignment checks in the implementations of operator=. And in fact I don't even want to implement operator= at all, I want the default behaviour to work 'right out of the box'. The best special members are those that come for free.
That being said, the MoveAssignable requirements present in the Standard are described as follows (from 17.6.3.1 Template argument requirements [utility.arg.requirements], n3290):
Expression Return type Return value Post-condition
t = rv T& t t is equivalent to the value of rv before the assignment
where the placeholders are described as: "t [is a] modifiable lvalue of type T;" and "rv is an rvalue of type T;". Note that those are requirements put on the types used as arguments to the templates of the Standard library, but looking elsewhere in the Standard I notice that every requirement on move assignment is similar to this one.
This means that a = std::move(a) has to be 'safe'. If what you need is an identity test (e.g. this != &other), then go for it, or else you won't even be able to put your objects into std::vector! (Unless you don't use those members/operations that do require MoveAssignable; but nevermind that.) Notice that with the previous example a = std::move(a), then this == &other will indeed hold.
As your current operator= function is written, since you've made the rvalue-reference argument const, there is no way you could "steal" the pointers and change the values of the incoming rvalue reference... you simply can't change it, you could only read from it. I would only see an issue if you were to start calling delete on pointers, etc. in your this object like you would in a normal lvaue-reference operator= method, but that sort of defeats the point of the rvalue-version ... i.e., it would seem redundant to use the rvalue version to basically do the same operations normally left to a const-lvalue operator= method.
Now if you defined your operator= to take a non-const rvalue-reference, then the only way I could see a check being required was if you passed the this object to a function that intentionally returned a rvalue reference rather than a temporary.
For instance, suppose someone tried to write an operator+ function, and utilize a mix of rvalue references and lvalue references in order to "prevent" extra temporaries from being created during some stacked addition operation on the object-type:
struct A; //defines operator=(A&& rhs) where it will "steal" the pointers
//of rhs and set the original pointers of rhs to NULL
A&& operator+(A& rhs, A&& lhs)
{
//...code
return std::move(rhs);
}
A&& operator+(A&& rhs, A&&lhs)
{
//...code
return std::move(rhs);
}
int main()
{
A a;
a = (a + A()) + A(); //calls operator=(A&&) with reference bound to a
//...rest of code
}
Now, from what I understand about rvalue references, doing the above is discouraged (i.e,. you should just return a temporary, not rvalue reference), but, if someone were to still do that, then you'd want to check to make sure the incoming rvalue-reference was not referencing the same object as the this pointer.
My answer is still that move assignment doesn't have to be save against self assigment, but it has a different explanation. Consider std::unique_ptr. If I were to implement one, I would do something like this:
unique_ptr& operator=(unique_ptr&& x) {
delete ptr_;
ptr_ = x.ptr_;
x.ptr_ = nullptr;
return *this;
}
If you look at Scott Meyers explaining this he does something similar. (If you wander why not to do swap - it has one extra write). And this is not safe for self assignment.
Sometimes this is unfortunate. Consider moving out of the vector all even numbers:
src.erase(
std::partition_copy(src.begin(), src.end(),
src.begin(),
std::back_inserter(even),
[](int num) { return num % 2; }
).first,
src.end());
This is ok for integers but I don't believe you can make something like this work with move semantics.
To conclude: move assignment to the object itself is not ok and you have to watch out for it.
Small update.
I disagree with Howard, which is a bad idea, but still - I think self move
assignment of "moved out" objects should work, because swap(x, x) should work. Algorithms love these things! It's always nice when a corner case just works. (And I am yet to see a case where it's not for free. Doesn't mean it doesn't exist though).
This is how assigning unique_ptrs is implemented in libc++:
unique_ptr& operator=(unique_ptr&& u) noexcept { reset(u.release()); ...}
It's safe for self move assignment.
Core Guidelines think it should be OK to self move assign.
There is a situation that (this == rhs) I can think of.
For this statement:
Myclass obj;
std::move(obj) = std::move(obj)
What is the copy-and-swap idiom and when should it be used? What problems does it solve? Does it change for C++11?
Related:
What are your favorite C++ Coding Style idioms: Copy-swap
Copy constructor and = operator overload in C++: is a common function possible?
What is copy elision and how it optimizes copy-and-swap idiom
C++: dynamically allocating an array of objects?
Overview
Why do we need the copy-and-swap idiom?
Any class that manages a resource (a wrapper, like a smart pointer) needs to implement The Big Three. While the goals and implementation of the copy-constructor and destructor are straightforward, the copy-assignment operator is arguably the most nuanced and difficult. How should it be done? What pitfalls need to be avoided?
The copy-and-swap idiom is the solution, and elegantly assists the assignment operator in achieving two things: avoiding code duplication, and providing a strong exception guarantee.
How does it work?
Conceptually, it works by using the copy-constructor's functionality to create a local copy of the data, then takes the copied data with a swap function, swapping the old data with the new data. The temporary copy then destructs, taking the old data with it. We are left with a copy of the new data.
In order to use the copy-and-swap idiom, we need three things: a working copy-constructor, a working destructor (both are the basis of any wrapper, so should be complete anyway), and a swap function.
A swap function is a non-throwing function that swaps two objects of a class, member for member. We might be tempted to use std::swap instead of providing our own, but this would be impossible; std::swap uses the copy-constructor and copy-assignment operator within its implementation, and we'd ultimately be trying to define the assignment operator in terms of itself!
(Not only that, but unqualified calls to swap will use our custom swap operator, skipping over the unnecessary construction and destruction of our class that std::swap would entail.)
An in-depth explanation
The goal
Let's consider a concrete case. We want to manage, in an otherwise useless class, a dynamic array. We start with a working constructor, copy-constructor, and destructor:
#include <algorithm> // std::copy
#include <cstddef> // std::size_t
class dumb_array
{
public:
// (default) constructor
dumb_array(std::size_t size = 0)
: mSize(size),
mArray(mSize ? new int[mSize]() : nullptr)
{
}
// copy-constructor
dumb_array(const dumb_array& other)
: mSize(other.mSize),
mArray(mSize ? new int[mSize] : nullptr)
{
// note that this is non-throwing, because of the data
// types being used; more attention to detail with regards
// to exceptions must be given in a more general case, however
std::copy(other.mArray, other.mArray + mSize, mArray);
}
// destructor
~dumb_array()
{
delete [] mArray;
}
private:
std::size_t mSize;
int* mArray;
};
This class almost manages the array successfully, but it needs operator= to work correctly.
A failed solution
Here's how a naive implementation might look:
// the hard part
dumb_array& operator=(const dumb_array& other)
{
if (this != &other) // (1)
{
// get rid of the old data...
delete [] mArray; // (2)
mArray = nullptr; // (2) *(see footnote for rationale)
// ...and put in the new
mSize = other.mSize; // (3)
mArray = mSize ? new int[mSize] : nullptr; // (3)
std::copy(other.mArray, other.mArray + mSize, mArray); // (3)
}
return *this;
}
And we say we're finished; this now manages an array, without leaks. However, it suffers from three problems, marked sequentially in the code as (n).
The first is the self-assignment test.
This check serves two purposes: it's an easy way to prevent us from running needless code on self-assignment, and it protects us from subtle bugs (such as deleting the array only to try and copy it). But in all other cases it merely serves to slow the program down, and act as noise in the code; self-assignment rarely occurs, so most of the time this check is a waste.
It would be better if the operator could work properly without it.
The second is that it only provides a basic exception guarantee. If new int[mSize] fails, *this will have been modified. (Namely, the size is wrong and the data is gone!)
For a strong exception guarantee, it would need to be something akin to:
dumb_array& operator=(const dumb_array& other)
{
if (this != &other) // (1)
{
// get the new data ready before we replace the old
std::size_t newSize = other.mSize;
int* newArray = newSize ? new int[newSize]() : nullptr; // (3)
std::copy(other.mArray, other.mArray + newSize, newArray); // (3)
// replace the old data (all are non-throwing)
delete [] mArray;
mSize = newSize;
mArray = newArray;
}
return *this;
}
The code has expanded! Which leads us to the third problem: code duplication.
Our assignment operator effectively duplicates all the code we've already written elsewhere, and that's a terrible thing.
In our case, the core of it is only two lines (the allocation and the copy), but with more complex resources this code bloat can be quite a hassle. We should strive to never repeat ourselves.
(One might wonder: if this much code is needed to manage one resource correctly, what if my class manages more than one?
While this may seem to be a valid concern, and indeed it requires non-trivial try/catch clauses, this is a non-issue.
That's because a class should manage one resource only!)
A successful solution
As mentioned, the copy-and-swap idiom will fix all these issues. But right now, we have all the requirements except one: a swap function. While The Rule of Three successfully entails the existence of our copy-constructor, assignment operator, and destructor, it should really be called "The Big Three and A Half": any time your class manages a resource it also makes sense to provide a swap function.
We need to add swap functionality to our class, and we do that as follows†:
class dumb_array
{
public:
// ...
friend void swap(dumb_array& first, dumb_array& second) // nothrow
{
// enable ADL (not necessary in our case, but good practice)
using std::swap;
// by swapping the members of two objects,
// the two objects are effectively swapped
swap(first.mSize, second.mSize);
swap(first.mArray, second.mArray);
}
// ...
};
(Here is the explanation why public friend swap.) Now not only can we swap our dumb_array's, but swaps in general can be more efficient; it merely swaps pointers and sizes, rather than allocating and copying entire arrays. Aside from this bonus in functionality and efficiency, we are now ready to implement the copy-and-swap idiom.
Without further ado, our assignment operator is:
dumb_array& operator=(dumb_array other) // (1)
{
swap(*this, other); // (2)
return *this;
}
And that's it! With one fell swoop, all three problems are elegantly tackled at once.
Why does it work?
We first notice an important choice: the parameter argument is taken by-value. While one could just as easily do the following (and indeed, many naive implementations of the idiom do):
dumb_array& operator=(const dumb_array& other)
{
dumb_array temp(other);
swap(*this, temp);
return *this;
}
We lose an important optimization opportunity. Not only that, but this choice is critical in C++11, which is discussed later. (On a general note, a remarkably useful guideline is as follows: if you're going to make a copy of something in a function, let the compiler do it in the parameter list.‡)
Either way, this method of obtaining our resource is the key to eliminating code duplication: we get to use the code from the copy-constructor to make the copy, and never need to repeat any bit of it. Now that the copy is made, we are ready to swap.
Observe that upon entering the function that all the new data is already allocated, copied, and ready to be used. This is what gives us a strong exception guarantee for free: we won't even enter the function if construction of the copy fails, and it's therefore not possible to alter the state of *this. (What we did manually before for a strong exception guarantee, the compiler is doing for us now; how kind.)
At this point we are home-free, because swap is non-throwing. We swap our current data with the copied data, safely altering our state, and the old data gets put into the temporary. The old data is then released when the function returns. (Where upon the parameter's scope ends and its destructor is called.)
Because the idiom repeats no code, we cannot introduce bugs within the operator. Note that this means we are rid of the need for a self-assignment check, allowing a single uniform implementation of operator=. (Additionally, we no longer have a performance penalty on non-self-assignments.)
And that is the copy-and-swap idiom.
What about C++11?
The next version of C++, C++11, makes one very important change to how we manage resources: the Rule of Three is now The Rule of Four (and a half). Why? Because not only do we need to be able to copy-construct our resource, we need to move-construct it as well.
Luckily for us, this is easy:
class dumb_array
{
public:
// ...
// move constructor
dumb_array(dumb_array&& other) noexcept ††
: dumb_array() // initialize via default constructor, C++11 only
{
swap(*this, other);
}
// ...
};
What's going on here? Recall the goal of move-construction: to take the resources from another instance of the class, leaving it in a state guaranteed to be assignable and destructible.
So what we've done is simple: initialize via the default constructor (a C++11 feature), then swap with other; we know a default constructed instance of our class can safely be assigned and destructed, so we know other will be able to do the same, after swapping.
(Note that some compilers do not support constructor delegation; in this case, we have to manually default construct the class. This is an unfortunate but luckily trivial task.)
Why does that work?
That is the only change we need to make to our class, so why does it work? Remember the ever-important decision we made to make the parameter a value and not a reference:
dumb_array& operator=(dumb_array other); // (1)
Now, if other is being initialized with an rvalue, it will be move-constructed. Perfect. In the same way C++03 let us re-use our copy-constructor functionality by taking the argument by-value, C++11 will automatically pick the move-constructor when appropriate as well. (And, of course, as mentioned in previously linked article, the copying/moving of the value may simply be elided altogether.)
And so concludes the copy-and-swap idiom.
Footnotes
*Why do we set mArray to null? Because if any further code in the operator throws, the destructor of dumb_array might be called; and if that happens without setting it to null, we attempt to delete memory that's already been deleted! We avoid this by setting it to null, as deleting null is a no-operation.
†There are other claims that we should specialize std::swap for our type, provide an in-class swap along-side a free-function swap, etc. But this is all unnecessary: any proper use of swap will be through an unqualified call, and our function will be found through ADL. One function will do.
‡The reason is simple: once you have the resource to yourself, you may swap and/or move it (C++11) anywhere it needs to be. And by making the copy in the parameter list, you maximize optimization.
††The move constructor should generally be noexcept, otherwise some code (e.g. std::vector resizing logic) will use the copy constructor even when a move would make sense. Of course, only mark it noexcept if the code inside doesn't throw exceptions.
Assignment, at its heart, is two steps: tearing down the object's old state and building its new state as a copy of some other object's state.
Basically, that's what the destructor and the copy constructor do, so the first idea would be to delegate the work to them. However, since destruction mustn't fail, while construction might, we actually want to do it the other way around: first perform the constructive part and, if that succeeded, then do the destructive part. The copy-and-swap idiom is a way to do just that: It first calls a class' copy constructor to create a temporary object, then swaps its data with the temporary's, and then lets the temporary's destructor destroy the old state.
Since swap() is supposed to never fail, the only part which might fail is the copy-construction. That is performed first, and if it fails, nothing will be changed in the targeted object.
In its refined form, copy-and-swap is implemented by having the copy performed by initializing the (non-reference) parameter of the assignment operator:
T& operator=(T tmp)
{
this->swap(tmp);
return *this;
}
There are some good answers already. I'll focus mainly on what I think they lack - an explanation of the "cons" with the copy-and-swap idiom....
What is the copy-and-swap idiom?
A way of implementing the assignment operator in terms of a swap function:
X& operator=(X rhs)
{
swap(rhs);
return *this;
}
The fundamental idea is that:
the most error-prone part of assigning to an object is ensuring any resources the new state needs are acquired (e.g. memory, descriptors)
that acquisition can be attempted before modifying the current state of the object (i.e. *this) if a copy of the new value is made, which is why rhs is accepted by value (i.e. copied) rather than by reference
swapping the state of the local copy rhs and *this is usually relatively easy to do without potential failure/exceptions, given the local copy doesn't need any particular state afterwards (just needs state fit for the destructor to run, much as for an object being moved from in >= C++11)
When should it be used? (Which problems does it solve [/create]?)
When you want the assigned-to objected unaffected by an assignment that throws an exception, assuming you have or can write a swap with strong exception guarantee, and ideally one that can't fail/throw..†
When you want a clean, easy to understand, robust way to define the assignment operator in terms of (simpler) copy constructor, swap and destructor functions.
Self-assignment done as a copy-and-swap avoids oft-overlooked edge cases.‡
When any performance penalty or momentarily higher resource usage created by having an extra temporary object during the assignment is not important to your application. ⁂
† swap throwing: it's generally possible to reliably swap data members that the objects track by pointer, but non-pointer data members that don't have a throw-free swap, or for which swapping has to be implemented as X tmp = lhs; lhs = rhs; rhs = tmp; and copy-construction or assignment may throw, still have the potential to fail leaving some data members swapped and others not. This potential applies even to C++03 std::string's as James comments on another answer:
#wilhelmtell: In C++03, there is no mention of exceptions potentially thrown by std::string::swap (which is called by std::swap). In C++0x, std::string::swap is noexcept and must not throw exceptions. – James McNellis Dec 22 '10 at 15:24
‡ assignment operator implementation that seems sane when assigning from a distinct object can easily fail for self-assignment. While it might seem unimaginable that client code would even attempt self-assignment, it can happen relatively easily during algo operations on containers, with x = f(x); code where f is (perhaps only for some #ifdef branches) a macro ala #define f(x) x or a function returning a reference to x, or even (likely inefficient but concise) code like x = c1 ? x * 2 : c2 ? x / 2 : x;). For example:
struct X
{
T* p_;
size_t size_;
X& operator=(const X& rhs)
{
delete[] p_; // OUCH!
p_ = new T[size_ = rhs.size_];
std::copy(p_, rhs.p_, rhs.p_ + rhs.size_);
}
...
};
On self-assignment, the above code delete's x.p_;, points p_ at a newly allocated heap region, then attempts to read the uninitialised data therein (Undefined Behaviour), if that doesn't do anything too weird, copy attempts a self-assignment to every just-destructed 'T'!
⁂ The copy-and-swap idiom can introduce inefficiencies or limitations due to the use of an extra temporary (when the operator's parameter is copy-constructed):
struct Client
{
IP_Address ip_address_;
int socket_;
X(const X& rhs)
: ip_address_(rhs.ip_address_), socket_(connect(rhs.ip_address_))
{ }
};
Here, a hand-written Client::operator= might check if *this is already connected to the same server as rhs (perhaps sending a "reset" code if useful), whereas the copy-and-swap approach would invoke the copy-constructor which would likely be written to open a distinct socket connection then close the original one. Not only could that mean a remote network interaction instead of a simple in-process variable copy, it could run afoul of client or server limits on socket resources or connections. (Of course this class has a pretty horrid interface, but that's another matter ;-P).
This answer is more like an addition and a slight modification to the answers above.
In some versions of Visual Studio (and possibly other compilers) there is a bug that is really annoying and doesn't make sense. So if you declare/define your swap function like this:
friend void swap(A& first, A& second) {
std::swap(first.size, second.size);
std::swap(first.arr, second.arr);
}
... the compiler will yell at you when you call the swap function:
This has something to do with a friend function being called and this object being passed as a parameter.
A way around this is to not use friend keyword and redefine the swap function:
void swap(A& other) {
std::swap(size, other.size);
std::swap(arr, other.arr);
}
This time, you can just call swap and pass in other, thus making the compiler happy:
After all, you don't need to use a friend function to swap 2 objects. It makes just as much sense to make swap a member function that has one other object as a parameter.
You already have access to this object, so passing it in as a parameter is technically redundant.
I would like to add a word of warning when you are dealing with C++11-style allocator-aware containers. Swapping and assignment have subtly different semantics.
For concreteness, let us consider a container std::vector<T, A>, where A is some stateful allocator type, and we'll compare the following functions:
void fs(std::vector<T, A> & a, std::vector<T, A> & b)
{
a.swap(b);
b.clear(); // not important what you do with b
}
void fm(std::vector<T, A> & a, std::vector<T, A> & b)
{
a = std::move(b);
}
The purpose of both functions fs and fm is to give a the state that b had initially. However, there is a hidden question: What happens if a.get_allocator() != b.get_allocator()? The answer is: It depends. Let's write AT = std::allocator_traits<A>.
If AT::propagate_on_container_move_assignment is std::true_type, then fm reassigns the allocator of a with the value of b.get_allocator(), otherwise it does not, and a continues to use its original allocator. In that case, the data elements need to be swapped individually, since the storage of a and b is not compatible.
If AT::propagate_on_container_swap is std::true_type, then fs swaps both data and allocators in the expected fashion.
If AT::propagate_on_container_swap is std::false_type, then we need a dynamic check.
If a.get_allocator() == b.get_allocator(), then the two containers use compatible storage, and swapping proceeds in the usual fashion.
However, if a.get_allocator() != b.get_allocator(), the program has undefined behaviour (cf. [container.requirements.general/8].
The upshot is that swapping has become a non-trivial operation in C++11 as soon as your container starts supporting stateful allocators. That's a somewhat "advanced use case", but it's not entirely unlikely, since move optimizations usually only become interesting once your class manages a resource, and memory is one of the most popular resources.
In the assignment operator of a class, you usually need to check if the object being assigned is the invoking object so you don't screw things up:
Class& Class::operator=(const Class& rhs) {
if (this != &rhs) {
// do the assignment
}
return *this;
}
Do you need the same thing for the move assignment operator? Is there ever a situation where this == &rhs would be true?
? Class::operator=(Class&& rhs) {
?
}
First, the Copy and Swap is not always the correct way to implement Copy Assignment. Almost certainly in the case of dumb_array, this is a sub-optimal solution.
The use of Copy and Swap is for dumb_array is a classic example of putting the most expensive operation with the fullest features at the bottom layer. It is perfect for clients who want the fullest feature and are willing to pay the performance penalty. They get exactly what they want.
But it is disastrous for clients who do not need the fullest feature and are instead looking for the highest performance. For them dumb_array is just another piece of software they have to rewrite because it is too slow. Had dumb_array been designed differently, it could have satisfied both clients with no compromises to either client.
The key to satisfying both clients is to build the fastest operations in at the lowest level, and then to add API on top of that for fuller features at more expense. I.e. you need the strong exception guarantee, fine, you pay for it. You don't need it? Here's a faster solution.
Let's get concrete: Here's the fast, basic exception guarantee Copy Assignment operator for dumb_array:
dumb_array& operator=(const dumb_array& other)
{
if (this != &other)
{
if (mSize != other.mSize)
{
delete [] mArray;
mArray = nullptr;
mArray = other.mSize ? new int[other.mSize] : nullptr;
mSize = other.mSize;
}
std::copy(other.mArray, other.mArray + mSize, mArray);
}
return *this;
}
Explanation:
One of the more expensive things you can do on modern hardware is make a trip to the heap. Anything you can do to avoid a trip to the heap is time & effort well spent. Clients of dumb_array may well want to often assign arrays of the same size. And when they do, all you need to do is a memcpy (hidden under std::copy). You don't want to allocate a new array of the same size and then deallocate the old one of the same size!
Now for your clients who actually want strong exception safety:
template <class C>
C&
strong_assign(C& lhs, C rhs)
{
swap(lhs, rhs);
return lhs;
}
Or maybe if you want to take advantage of move assignment in C++11 that should be:
template <class C>
C&
strong_assign(C& lhs, C rhs)
{
lhs = std::move(rhs);
return lhs;
}
If dumb_array's clients value speed, they should call the operator=. If they need strong exception safety, there are generic algorithms they can call that will work on a wide variety of objects and need only be implemented once.
Now back to the original question (which has a type-o at this point in time):
Class&
Class::operator=(Class&& rhs)
{
if (this == &rhs) // is this check needed?
{
// ...
}
return *this;
}
This is actually a controversial question. Some will say yes, absolutely, some will say no.
My personal opinion is no, you don't need this check.
Rationale:
When an object binds to an rvalue reference it is one of two things:
A temporary.
An object the caller wants you to believe is a temporary.
If you have a reference to an object that is an actual temporary, then by definition, you have a unique reference to that object. It can't possibly be referenced by anywhere else in your entire program. I.e. this == &temporary is not possible.
Now if your client has lied to you and promised you that you're getting a temporary when you're not, then it is the client's responsibility to be sure that you don't have to care. If you want to be really careful, I believe that this would be a better implementation:
Class&
Class::operator=(Class&& other)
{
assert(this != &other);
// ...
return *this;
}
I.e. If you are passed a self reference, this is a bug on the part of the client that should be fixed.
For completeness, here is a move assignment operator for dumb_array:
dumb_array& operator=(dumb_array&& other)
{
assert(this != &other);
delete [] mArray;
mSize = other.mSize;
mArray = other.mArray;
other.mSize = 0;
other.mArray = nullptr;
return *this;
}
In the typical use case of move assignment, *this will be a moved-from object and so delete [] mArray; should be a no-op. It is critical that implementations make delete on a nullptr as fast as possible.
Caveat:
Some will argue that swap(x, x) is a good idea, or just a necessary evil. And this, if the swap goes to the default swap, can cause a self-move-assignment.
I disagree that swap(x, x) is ever a good idea. If found in my own code, I will consider it a performance bug and fix it. But in case you want to allow it, realize that swap(x, x) only does self-move-assignemnet on a moved-from value. And in our dumb_array example this will be perfectly harmless if we simply omit the assert, or constrain it to the moved-from case:
dumb_array& operator=(dumb_array&& other)
{
assert(this != &other || mSize == 0);
delete [] mArray;
mSize = other.mSize;
mArray = other.mArray;
other.mSize = 0;
other.mArray = nullptr;
return *this;
}
If you self-assign two moved-from (empty) dumb_array's, you don't do anything incorrect aside from inserting useless instructions into your program. This same observation can be made for the vast majority of objects.
<Update>
I've given this issue some more thought, and changed my position somewhat. I now believe that assignment should be tolerant of self assignment, but that the post conditions on copy assignment and move assignment are different:
For copy assignment:
x = y;
one should have a post-condition that the value of y should not be altered. When &x == &y then this postcondition translates into: self copy assignment should have no impact on the value of x.
For move assignment:
x = std::move(y);
one should have a post-condition that y has a valid but unspecified state. When &x == &y then this postcondition translates into: x has a valid but unspecified state. I.e. self move assignment does not have to be a no-op. But it should not crash. This post-condition is consistent with allowing swap(x, x) to just work:
template <class T>
void
swap(T& x, T& y)
{
// assume &x == &y
T tmp(std::move(x));
// x and y now have a valid but unspecified state
x = std::move(y);
// x and y still have a valid but unspecified state
y = std::move(tmp);
// x and y have the value of tmp, which is the value they had on entry
}
The above works, as long as x = std::move(x) doesn't crash. It can leave x in any valid but unspecified state.
I see three ways to program the move assignment operator for dumb_array to achieve this:
dumb_array& operator=(dumb_array&& other)
{
delete [] mArray;
// set *this to a valid state before continuing
mSize = 0;
mArray = nullptr;
// *this is now in a valid state, continue with move assignment
mSize = other.mSize;
mArray = other.mArray;
other.mSize = 0;
other.mArray = nullptr;
return *this;
}
The above implementation tolerates self assignment, but *this and other end up being a zero-sized array after the self-move assignment, no matter what the original value of *this is. This is fine.
dumb_array& operator=(dumb_array&& other)
{
if (this != &other)
{
delete [] mArray;
mSize = other.mSize;
mArray = other.mArray;
other.mSize = 0;
other.mArray = nullptr;
}
return *this;
}
The above implementation tolerates self assignment the same way the copy assignment operator does, by making it a no-op. This is also fine.
dumb_array& operator=(dumb_array&& other)
{
swap(other);
return *this;
}
The above is ok only if dumb_array does not hold resources that should be destructed "immediately". For example if the only resource is memory, the above is fine. If dumb_array could possibly hold mutex locks or the open state of files, the client could reasonably expect those resources on the lhs of the move assignment to be immediately released and therefore this implementation could be problematic.
The cost of the first is two extra stores. The cost of the second is a test-and-branch. Both work. Both meet all of the requirements of Table 22 MoveAssignable requirements in the C++11 standard. The third also works modulo the non-memory-resource-concern.
All three implementations can have different costs depending on the hardware: How expensive is a branch? Are there lots of registers or very few?
The take-away is that self-move-assignment, unlike self-copy-assignment, does not have to preserve the current value.
</Update>
One final (hopefully) edit inspired by Luc Danton's comment:
If you're writing a high level class that doesn't directly manage memory (but may have bases or members that do), then the best implementation of move assignment is often:
Class& operator=(Class&&) = default;
This will move assign each base and each member in turn, and will not include a this != &other check. This will give you the very highest performance and basic exception safety assuming no invariants need to be maintained among your bases and members. For your clients demanding strong exception safety, point them towards strong_assign.
First, you got the signature of the move-assignment operator wrong. Since moves steal resources from the source object, the source has to be a non-const r-value reference.
Class &Class::operator=( Class &&rhs ) {
//...
return *this;
}
Note that you still return via a (non-const) l-value reference.
For either type of direct assignment, the standard isn't to check for self-assignment, but to make sure a self-assignment doesn't cause a crash-and-burn. Generally, no one explicitly does x = x or y = std::move(y) calls, but aliasing, especially through multiple functions, may lead a = b or c = std::move(d) into being self-assignments. An explicit check for self-assignment, i.e. this == &rhs, that skips the meat of the function when true is one way to ensure self-assignment safety. But it's one of the worst ways, since it optimizes a (hopefully) rare case while it's an anti-optimization for the more common case (due to branching and possibly cache misses).
Now when (at least) one of the operands is a directly temporary object, you can never have a self-assignment scenario. Some people advocate assuming that case and optimize the code for it so much that the code becomes suicidally stupid when the assumption is wrong. I say that dumping the same-object check on users is irresponsible. We don't make that argument for copy-assignment; why reverse the position for move-assignment?
Let's make an example, altered from another respondent:
dumb_array& dumb_array::operator=(const dumb_array& other)
{
if (mSize != other.mSize)
{
delete [] mArray;
mArray = nullptr; // clear this...
mSize = 0u; // ...and this in case the next line throws
mArray = other.mSize ? new int[other.mSize] : nullptr;
mSize = other.mSize;
}
std::copy(other.mArray, other.mArray + mSize, mArray);
return *this;
}
This copy-assignment handles self-assignment gracefully without an explicit check. If the source and destination sizes differ, then deallocation and reallocation precede the copying. Otherwise, just the copying is done. Self-assignment does not get an optimized path, it gets dumped into the same path as when the source and destination sizes start off equal. The copying is technically unnecessary when the two objects are equivalent (including when they're the same object), but that's the price when not doing an equality check (value-wise or address-wise) since said check itself would be a waste most of the time. Note that the object self-assignment here will cause a series of element-level self-assignments; the element type has to be safe for doing this.
Like its source example, this copy-assignment provides the basic exception safety guarantee. If you want the strong guarantee, then use the unified-assignment operator from the original Copy and Swap query, which handles both copy- and move-assignment. But the point of this example is to reduce safety by one rank to gain speed. (BTW, we're assuming that the individual elements' values are independent; that there's no invariant constraint limiting some values compared to others.)
Let's look at a move-assignment for this same type:
class dumb_array
{
//...
void swap(dumb_array& other) noexcept
{
// Just in case we add UDT members later
using std::swap;
// both members are built-in types -> never throw
swap( this->mArray, other.mArray );
swap( this->mSize, other.mSize );
}
dumb_array& operator=(dumb_array&& other) noexcept
{
this->swap( other );
return *this;
}
//...
};
void swap( dumb_array &l, dumb_array &r ) noexcept { l.swap( r ); }
A swappable type that needs customization should have a two-argument free function called swap in the same namespace as the type. (The namespace restriction lets unqualified calls to swap to work.) A container type should also add a public swap member function to match the standard containers. If a member swap is not provided, then the free-function swap probably needs to be marked as a friend of the swappable type. If you customize moves to use swap, then you have to provide your own swapping code; the standard code calls the type's move code, which would result in infinite mutual recursion for move-customized types.
Like destructors, swap functions and move operations should be never-throw if at all possible, and probably marked as such (in C++11). Standard library types and routines have optimizations for non-throwable moving types.
This first version of move-assignment fulfills the basic contract. The source's resource markers are transferred to the destination object. The old resources won't be leaked since the source object now manages them. And the source object is left in a usable state where further operations, including assignment and destruction, can be applied to it.
Note that this move-assignment is automatically safe for self-assignment, since the swap call is. It's also strongly exception safe. The problem is unnecessary resource retention. The old resources for the destination are conceptually no longer needed, but here they are still around only so the source object can stay valid. If the scheduled destruction of the source object is a long way off, we're wasting resource space, or worse if the total resource space is limited and other resource petitions will happen before the (new) source object officially dies.
This issue is what caused the controversial current guru advice concerning self-targeting during move-assignment. The way to write move-assignment without lingering resources is something like:
class dumb_array
{
//...
dumb_array& operator=(dumb_array&& other) noexcept
{
delete [] this->mArray; // kill old resources
this->mArray = other.mArray;
this->mSize = other.mSize;
other.mArray = nullptr; // reset source
other.mSize = 0u;
return *this;
}
//...
};
The source is reset to default conditions, while the old destination resources are destroyed. In the self-assignment case, your current object ends up committing suicide. The main way around it is to surround the action code with an if(this != &other) block, or screw it and let clients eat an assert(this != &other) initial line (if you're feeling nice).
An alternative is to study how to make copy-assignment strongly exception safe, without unified-assignment, and apply it to move-assignment:
class dumb_array
{
//...
dumb_array& operator=(dumb_array&& other) noexcept
{
dumb_array temp{ std::move(other) };
this->swap( temp );
return *this;
}
//...
};
When other and this are distinct, other is emptied by the move to temp and stays that way. Then this loses its old resources to temp while getting the resources originally held by other. Then the old resources of this get killed when temp does.
When self-assignment happens, the emptying of other to temp empties this as well. Then target object gets its resources back when temp and this swap. The death of temp claims an empty object, which should be practically a no-op. The this/other object keeps its resources.
The move-assignment should be never-throw as long as move-construction and swapping are also. The cost of also being safe during self-assignment is a few more instructions over low-level types, which should be swamped out by the deallocation call.
I'm in the camp of those that want self-assignment safe operators, but don't want to write self-assignment checks in the implementations of operator=. And in fact I don't even want to implement operator= at all, I want the default behaviour to work 'right out of the box'. The best special members are those that come for free.
That being said, the MoveAssignable requirements present in the Standard are described as follows (from 17.6.3.1 Template argument requirements [utility.arg.requirements], n3290):
Expression Return type Return value Post-condition
t = rv T& t t is equivalent to the value of rv before the assignment
where the placeholders are described as: "t [is a] modifiable lvalue of type T;" and "rv is an rvalue of type T;". Note that those are requirements put on the types used as arguments to the templates of the Standard library, but looking elsewhere in the Standard I notice that every requirement on move assignment is similar to this one.
This means that a = std::move(a) has to be 'safe'. If what you need is an identity test (e.g. this != &other), then go for it, or else you won't even be able to put your objects into std::vector! (Unless you don't use those members/operations that do require MoveAssignable; but nevermind that.) Notice that with the previous example a = std::move(a), then this == &other will indeed hold.
As your current operator= function is written, since you've made the rvalue-reference argument const, there is no way you could "steal" the pointers and change the values of the incoming rvalue reference... you simply can't change it, you could only read from it. I would only see an issue if you were to start calling delete on pointers, etc. in your this object like you would in a normal lvaue-reference operator= method, but that sort of defeats the point of the rvalue-version ... i.e., it would seem redundant to use the rvalue version to basically do the same operations normally left to a const-lvalue operator= method.
Now if you defined your operator= to take a non-const rvalue-reference, then the only way I could see a check being required was if you passed the this object to a function that intentionally returned a rvalue reference rather than a temporary.
For instance, suppose someone tried to write an operator+ function, and utilize a mix of rvalue references and lvalue references in order to "prevent" extra temporaries from being created during some stacked addition operation on the object-type:
struct A; //defines operator=(A&& rhs) where it will "steal" the pointers
//of rhs and set the original pointers of rhs to NULL
A&& operator+(A& rhs, A&& lhs)
{
//...code
return std::move(rhs);
}
A&& operator+(A&& rhs, A&&lhs)
{
//...code
return std::move(rhs);
}
int main()
{
A a;
a = (a + A()) + A(); //calls operator=(A&&) with reference bound to a
//...rest of code
}
Now, from what I understand about rvalue references, doing the above is discouraged (i.e,. you should just return a temporary, not rvalue reference), but, if someone were to still do that, then you'd want to check to make sure the incoming rvalue-reference was not referencing the same object as the this pointer.
My answer is still that move assignment doesn't have to be save against self assigment, but it has a different explanation. Consider std::unique_ptr. If I were to implement one, I would do something like this:
unique_ptr& operator=(unique_ptr&& x) {
delete ptr_;
ptr_ = x.ptr_;
x.ptr_ = nullptr;
return *this;
}
If you look at Scott Meyers explaining this he does something similar. (If you wander why not to do swap - it has one extra write). And this is not safe for self assignment.
Sometimes this is unfortunate. Consider moving out of the vector all even numbers:
src.erase(
std::partition_copy(src.begin(), src.end(),
src.begin(),
std::back_inserter(even),
[](int num) { return num % 2; }
).first,
src.end());
This is ok for integers but I don't believe you can make something like this work with move semantics.
To conclude: move assignment to the object itself is not ok and you have to watch out for it.
Small update.
I disagree with Howard, which is a bad idea, but still - I think self move
assignment of "moved out" objects should work, because swap(x, x) should work. Algorithms love these things! It's always nice when a corner case just works. (And I am yet to see a case where it's not for free. Doesn't mean it doesn't exist though).
This is how assigning unique_ptrs is implemented in libc++:
unique_ptr& operator=(unique_ptr&& u) noexcept { reset(u.release()); ...}
It's safe for self move assignment.
Core Guidelines think it should be OK to self move assign.
There is a situation that (this == rhs) I can think of.
For this statement:
Myclass obj;
std::move(obj) = std::move(obj)
What is the copy-and-swap idiom and when should it be used? What problems does it solve? Does it change for C++11?
Related:
What are your favorite C++ Coding Style idioms: Copy-swap
Copy constructor and = operator overload in C++: is a common function possible?
What is copy elision and how it optimizes copy-and-swap idiom
C++: dynamically allocating an array of objects?
Overview
Why do we need the copy-and-swap idiom?
Any class that manages a resource (a wrapper, like a smart pointer) needs to implement The Big Three. While the goals and implementation of the copy-constructor and destructor are straightforward, the copy-assignment operator is arguably the most nuanced and difficult. How should it be done? What pitfalls need to be avoided?
The copy-and-swap idiom is the solution, and elegantly assists the assignment operator in achieving two things: avoiding code duplication, and providing a strong exception guarantee.
How does it work?
Conceptually, it works by using the copy-constructor's functionality to create a local copy of the data, then takes the copied data with a swap function, swapping the old data with the new data. The temporary copy then destructs, taking the old data with it. We are left with a copy of the new data.
In order to use the copy-and-swap idiom, we need three things: a working copy-constructor, a working destructor (both are the basis of any wrapper, so should be complete anyway), and a swap function.
A swap function is a non-throwing function that swaps two objects of a class, member for member. We might be tempted to use std::swap instead of providing our own, but this would be impossible; std::swap uses the copy-constructor and copy-assignment operator within its implementation, and we'd ultimately be trying to define the assignment operator in terms of itself!
(Not only that, but unqualified calls to swap will use our custom swap operator, skipping over the unnecessary construction and destruction of our class that std::swap would entail.)
An in-depth explanation
The goal
Let's consider a concrete case. We want to manage, in an otherwise useless class, a dynamic array. We start with a working constructor, copy-constructor, and destructor:
#include <algorithm> // std::copy
#include <cstddef> // std::size_t
class dumb_array
{
public:
// (default) constructor
dumb_array(std::size_t size = 0)
: mSize(size),
mArray(mSize ? new int[mSize]() : nullptr)
{
}
// copy-constructor
dumb_array(const dumb_array& other)
: mSize(other.mSize),
mArray(mSize ? new int[mSize] : nullptr)
{
// note that this is non-throwing, because of the data
// types being used; more attention to detail with regards
// to exceptions must be given in a more general case, however
std::copy(other.mArray, other.mArray + mSize, mArray);
}
// destructor
~dumb_array()
{
delete [] mArray;
}
private:
std::size_t mSize;
int* mArray;
};
This class almost manages the array successfully, but it needs operator= to work correctly.
A failed solution
Here's how a naive implementation might look:
// the hard part
dumb_array& operator=(const dumb_array& other)
{
if (this != &other) // (1)
{
// get rid of the old data...
delete [] mArray; // (2)
mArray = nullptr; // (2) *(see footnote for rationale)
// ...and put in the new
mSize = other.mSize; // (3)
mArray = mSize ? new int[mSize] : nullptr; // (3)
std::copy(other.mArray, other.mArray + mSize, mArray); // (3)
}
return *this;
}
And we say we're finished; this now manages an array, without leaks. However, it suffers from three problems, marked sequentially in the code as (n).
The first is the self-assignment test.
This check serves two purposes: it's an easy way to prevent us from running needless code on self-assignment, and it protects us from subtle bugs (such as deleting the array only to try and copy it). But in all other cases it merely serves to slow the program down, and act as noise in the code; self-assignment rarely occurs, so most of the time this check is a waste.
It would be better if the operator could work properly without it.
The second is that it only provides a basic exception guarantee. If new int[mSize] fails, *this will have been modified. (Namely, the size is wrong and the data is gone!)
For a strong exception guarantee, it would need to be something akin to:
dumb_array& operator=(const dumb_array& other)
{
if (this != &other) // (1)
{
// get the new data ready before we replace the old
std::size_t newSize = other.mSize;
int* newArray = newSize ? new int[newSize]() : nullptr; // (3)
std::copy(other.mArray, other.mArray + newSize, newArray); // (3)
// replace the old data (all are non-throwing)
delete [] mArray;
mSize = newSize;
mArray = newArray;
}
return *this;
}
The code has expanded! Which leads us to the third problem: code duplication.
Our assignment operator effectively duplicates all the code we've already written elsewhere, and that's a terrible thing.
In our case, the core of it is only two lines (the allocation and the copy), but with more complex resources this code bloat can be quite a hassle. We should strive to never repeat ourselves.
(One might wonder: if this much code is needed to manage one resource correctly, what if my class manages more than one?
While this may seem to be a valid concern, and indeed it requires non-trivial try/catch clauses, this is a non-issue.
That's because a class should manage one resource only!)
A successful solution
As mentioned, the copy-and-swap idiom will fix all these issues. But right now, we have all the requirements except one: a swap function. While The Rule of Three successfully entails the existence of our copy-constructor, assignment operator, and destructor, it should really be called "The Big Three and A Half": any time your class manages a resource it also makes sense to provide a swap function.
We need to add swap functionality to our class, and we do that as follows†:
class dumb_array
{
public:
// ...
friend void swap(dumb_array& first, dumb_array& second) // nothrow
{
// enable ADL (not necessary in our case, but good practice)
using std::swap;
// by swapping the members of two objects,
// the two objects are effectively swapped
swap(first.mSize, second.mSize);
swap(first.mArray, second.mArray);
}
// ...
};
(Here is the explanation why public friend swap.) Now not only can we swap our dumb_array's, but swaps in general can be more efficient; it merely swaps pointers and sizes, rather than allocating and copying entire arrays. Aside from this bonus in functionality and efficiency, we are now ready to implement the copy-and-swap idiom.
Without further ado, our assignment operator is:
dumb_array& operator=(dumb_array other) // (1)
{
swap(*this, other); // (2)
return *this;
}
And that's it! With one fell swoop, all three problems are elegantly tackled at once.
Why does it work?
We first notice an important choice: the parameter argument is taken by-value. While one could just as easily do the following (and indeed, many naive implementations of the idiom do):
dumb_array& operator=(const dumb_array& other)
{
dumb_array temp(other);
swap(*this, temp);
return *this;
}
We lose an important optimization opportunity. Not only that, but this choice is critical in C++11, which is discussed later. (On a general note, a remarkably useful guideline is as follows: if you're going to make a copy of something in a function, let the compiler do it in the parameter list.‡)
Either way, this method of obtaining our resource is the key to eliminating code duplication: we get to use the code from the copy-constructor to make the copy, and never need to repeat any bit of it. Now that the copy is made, we are ready to swap.
Observe that upon entering the function that all the new data is already allocated, copied, and ready to be used. This is what gives us a strong exception guarantee for free: we won't even enter the function if construction of the copy fails, and it's therefore not possible to alter the state of *this. (What we did manually before for a strong exception guarantee, the compiler is doing for us now; how kind.)
At this point we are home-free, because swap is non-throwing. We swap our current data with the copied data, safely altering our state, and the old data gets put into the temporary. The old data is then released when the function returns. (Where upon the parameter's scope ends and its destructor is called.)
Because the idiom repeats no code, we cannot introduce bugs within the operator. Note that this means we are rid of the need for a self-assignment check, allowing a single uniform implementation of operator=. (Additionally, we no longer have a performance penalty on non-self-assignments.)
And that is the copy-and-swap idiom.
What about C++11?
The next version of C++, C++11, makes one very important change to how we manage resources: the Rule of Three is now The Rule of Four (and a half). Why? Because not only do we need to be able to copy-construct our resource, we need to move-construct it as well.
Luckily for us, this is easy:
class dumb_array
{
public:
// ...
// move constructor
dumb_array(dumb_array&& other) noexcept ††
: dumb_array() // initialize via default constructor, C++11 only
{
swap(*this, other);
}
// ...
};
What's going on here? Recall the goal of move-construction: to take the resources from another instance of the class, leaving it in a state guaranteed to be assignable and destructible.
So what we've done is simple: initialize via the default constructor (a C++11 feature), then swap with other; we know a default constructed instance of our class can safely be assigned and destructed, so we know other will be able to do the same, after swapping.
(Note that some compilers do not support constructor delegation; in this case, we have to manually default construct the class. This is an unfortunate but luckily trivial task.)
Why does that work?
That is the only change we need to make to our class, so why does it work? Remember the ever-important decision we made to make the parameter a value and not a reference:
dumb_array& operator=(dumb_array other); // (1)
Now, if other is being initialized with an rvalue, it will be move-constructed. Perfect. In the same way C++03 let us re-use our copy-constructor functionality by taking the argument by-value, C++11 will automatically pick the move-constructor when appropriate as well. (And, of course, as mentioned in previously linked article, the copying/moving of the value may simply be elided altogether.)
And so concludes the copy-and-swap idiom.
Footnotes
*Why do we set mArray to null? Because if any further code in the operator throws, the destructor of dumb_array might be called; and if that happens without setting it to null, we attempt to delete memory that's already been deleted! We avoid this by setting it to null, as deleting null is a no-operation.
†There are other claims that we should specialize std::swap for our type, provide an in-class swap along-side a free-function swap, etc. But this is all unnecessary: any proper use of swap will be through an unqualified call, and our function will be found through ADL. One function will do.
‡The reason is simple: once you have the resource to yourself, you may swap and/or move it (C++11) anywhere it needs to be. And by making the copy in the parameter list, you maximize optimization.
††The move constructor should generally be noexcept, otherwise some code (e.g. std::vector resizing logic) will use the copy constructor even when a move would make sense. Of course, only mark it noexcept if the code inside doesn't throw exceptions.
Assignment, at its heart, is two steps: tearing down the object's old state and building its new state as a copy of some other object's state.
Basically, that's what the destructor and the copy constructor do, so the first idea would be to delegate the work to them. However, since destruction mustn't fail, while construction might, we actually want to do it the other way around: first perform the constructive part and, if that succeeded, then do the destructive part. The copy-and-swap idiom is a way to do just that: It first calls a class' copy constructor to create a temporary object, then swaps its data with the temporary's, and then lets the temporary's destructor destroy the old state.
Since swap() is supposed to never fail, the only part which might fail is the copy-construction. That is performed first, and if it fails, nothing will be changed in the targeted object.
In its refined form, copy-and-swap is implemented by having the copy performed by initializing the (non-reference) parameter of the assignment operator:
T& operator=(T tmp)
{
this->swap(tmp);
return *this;
}
There are some good answers already. I'll focus mainly on what I think they lack - an explanation of the "cons" with the copy-and-swap idiom....
What is the copy-and-swap idiom?
A way of implementing the assignment operator in terms of a swap function:
X& operator=(X rhs)
{
swap(rhs);
return *this;
}
The fundamental idea is that:
the most error-prone part of assigning to an object is ensuring any resources the new state needs are acquired (e.g. memory, descriptors)
that acquisition can be attempted before modifying the current state of the object (i.e. *this) if a copy of the new value is made, which is why rhs is accepted by value (i.e. copied) rather than by reference
swapping the state of the local copy rhs and *this is usually relatively easy to do without potential failure/exceptions, given the local copy doesn't need any particular state afterwards (just needs state fit for the destructor to run, much as for an object being moved from in >= C++11)
When should it be used? (Which problems does it solve [/create]?)
When you want the assigned-to objected unaffected by an assignment that throws an exception, assuming you have or can write a swap with strong exception guarantee, and ideally one that can't fail/throw..†
When you want a clean, easy to understand, robust way to define the assignment operator in terms of (simpler) copy constructor, swap and destructor functions.
Self-assignment done as a copy-and-swap avoids oft-overlooked edge cases.‡
When any performance penalty or momentarily higher resource usage created by having an extra temporary object during the assignment is not important to your application. ⁂
† swap throwing: it's generally possible to reliably swap data members that the objects track by pointer, but non-pointer data members that don't have a throw-free swap, or for which swapping has to be implemented as X tmp = lhs; lhs = rhs; rhs = tmp; and copy-construction or assignment may throw, still have the potential to fail leaving some data members swapped and others not. This potential applies even to C++03 std::string's as James comments on another answer:
#wilhelmtell: In C++03, there is no mention of exceptions potentially thrown by std::string::swap (which is called by std::swap). In C++0x, std::string::swap is noexcept and must not throw exceptions. – James McNellis Dec 22 '10 at 15:24
‡ assignment operator implementation that seems sane when assigning from a distinct object can easily fail for self-assignment. While it might seem unimaginable that client code would even attempt self-assignment, it can happen relatively easily during algo operations on containers, with x = f(x); code where f is (perhaps only for some #ifdef branches) a macro ala #define f(x) x or a function returning a reference to x, or even (likely inefficient but concise) code like x = c1 ? x * 2 : c2 ? x / 2 : x;). For example:
struct X
{
T* p_;
size_t size_;
X& operator=(const X& rhs)
{
delete[] p_; // OUCH!
p_ = new T[size_ = rhs.size_];
std::copy(p_, rhs.p_, rhs.p_ + rhs.size_);
}
...
};
On self-assignment, the above code delete's x.p_;, points p_ at a newly allocated heap region, then attempts to read the uninitialised data therein (Undefined Behaviour), if that doesn't do anything too weird, copy attempts a self-assignment to every just-destructed 'T'!
⁂ The copy-and-swap idiom can introduce inefficiencies or limitations due to the use of an extra temporary (when the operator's parameter is copy-constructed):
struct Client
{
IP_Address ip_address_;
int socket_;
X(const X& rhs)
: ip_address_(rhs.ip_address_), socket_(connect(rhs.ip_address_))
{ }
};
Here, a hand-written Client::operator= might check if *this is already connected to the same server as rhs (perhaps sending a "reset" code if useful), whereas the copy-and-swap approach would invoke the copy-constructor which would likely be written to open a distinct socket connection then close the original one. Not only could that mean a remote network interaction instead of a simple in-process variable copy, it could run afoul of client or server limits on socket resources or connections. (Of course this class has a pretty horrid interface, but that's another matter ;-P).
This answer is more like an addition and a slight modification to the answers above.
In some versions of Visual Studio (and possibly other compilers) there is a bug that is really annoying and doesn't make sense. So if you declare/define your swap function like this:
friend void swap(A& first, A& second) {
std::swap(first.size, second.size);
std::swap(first.arr, second.arr);
}
... the compiler will yell at you when you call the swap function:
This has something to do with a friend function being called and this object being passed as a parameter.
A way around this is to not use friend keyword and redefine the swap function:
void swap(A& other) {
std::swap(size, other.size);
std::swap(arr, other.arr);
}
This time, you can just call swap and pass in other, thus making the compiler happy:
After all, you don't need to use a friend function to swap 2 objects. It makes just as much sense to make swap a member function that has one other object as a parameter.
You already have access to this object, so passing it in as a parameter is technically redundant.
I would like to add a word of warning when you are dealing with C++11-style allocator-aware containers. Swapping and assignment have subtly different semantics.
For concreteness, let us consider a container std::vector<T, A>, where A is some stateful allocator type, and we'll compare the following functions:
void fs(std::vector<T, A> & a, std::vector<T, A> & b)
{
a.swap(b);
b.clear(); // not important what you do with b
}
void fm(std::vector<T, A> & a, std::vector<T, A> & b)
{
a = std::move(b);
}
The purpose of both functions fs and fm is to give a the state that b had initially. However, there is a hidden question: What happens if a.get_allocator() != b.get_allocator()? The answer is: It depends. Let's write AT = std::allocator_traits<A>.
If AT::propagate_on_container_move_assignment is std::true_type, then fm reassigns the allocator of a with the value of b.get_allocator(), otherwise it does not, and a continues to use its original allocator. In that case, the data elements need to be swapped individually, since the storage of a and b is not compatible.
If AT::propagate_on_container_swap is std::true_type, then fs swaps both data and allocators in the expected fashion.
If AT::propagate_on_container_swap is std::false_type, then we need a dynamic check.
If a.get_allocator() == b.get_allocator(), then the two containers use compatible storage, and swapping proceeds in the usual fashion.
However, if a.get_allocator() != b.get_allocator(), the program has undefined behaviour (cf. [container.requirements.general/8].
The upshot is that swapping has become a non-trivial operation in C++11 as soon as your container starts supporting stateful allocators. That's a somewhat "advanced use case", but it's not entirely unlikely, since move optimizations usually only become interesting once your class manages a resource, and memory is one of the most popular resources.