Is it legal to implement assignment operators as "destroy + construct"? - c++

I frequently need to implement C++ wrappers for "raw" resource handles, like file handles, Win32 OS handles and similar. When doing this, I also need to implement move operators, since the default compiler-generated ones will not clear the moved-from object, yielding double-delete problems.
When implementing the move assignment operator, I prefer to call the destructor explicitly and in-place recreate the object with placement new. That way, I avoid duplication of the destructor logic. In addition, I often implement copy assignment in terms of copy+move (when relevant). This leads to the following code:
/** Canonical move-assignment operator.
Assumes no const or reference members. */
TYPE& operator = (TYPE && other) noexcept {
if (&other == this)
return *this; // self-assign
static_assert(std::is_final<TYPE>::value, "class must be final");
static_assert(noexcept(this->~TYPE()), "dtor must be noexcept");
this->~TYPE();
static_assert(noexcept(TYPE(std::move(other))), "move-ctor must be noexcept");
new(this) TYPE(std::move(other));
return *this;
}
/** Canonical copy-assignment operator. */
TYPE& operator = (const TYPE& other) {
if (&other == this)
return *this; // self-assign
TYPE copy(other); // may throw
static_assert(noexcept(operator = (std::move(copy))), "move-assignment must be noexcept");
operator = (std::move(copy));
return *this;
}
It strikes me as odd, but I have not seen any recommendations online for implementing the move+copy assignment operators in this "canonical" way. Instead, most websites tend to implement the assignment operators in a type-specific way that must be manually kept in sync with the constructors & destructor when maintaining the class.
Are there any arguments (besides performance) against implementing the move & copy assignment operators in this type-independent "canonical" way?
UPDATE 2019-10-08 based on UB comments:
I've read through http://eel.is/c++draft/basic.life#8 that seem to cover the case in question. Extract:
If, after the lifetime of an object has ended ..., a new object is
created at the storage location which the original object occupied, a
pointer that pointed to the original object, a reference that referred
to the original object, ... will
automatically refer to the new object and, ..., can be used to manipulate the new object, if ...
There's some obvious conditions thereafter related to the same type and const/reference members, but they seem to be required for any assignment operator implementation.
Please correct me if I'm wrong, but this seems to me like my "canonical" sample is well behaved and not UB(?)
UPDATE 2019-10-10 based on copy-and-swap comments:
The assignment implementations can be merged into a single method that takes a value argument instead of reference. This also seem to eliminate the need for the static_assert and self-assignment checks. My new proposed implementation then becomes:
/** Canonical copy/move-assignment operator.
Assumes no const or reference members. */
TYPE& operator = (TYPE other) noexcept {
static_assert(!std::has_virtual_destructor<TYPE>::value, "dtor cannot be virtual");
this->~TYPE();
new(this) TYPE(std::move(other));
return *this;
}

There is a strong argument against your "canonical" implementation — it is wrong.
You end the lifetime the original object and create a new object in its place. However, pointers, references, etc. to the original object are not automatically updated to point to the new object — you have to use std::launder. (This sentence is wrong for most classes; see Davis Herring’s comment.) Then, the destructor is automatically called on the original object, triggering undefined behavior.
Reference: (emphasis mine) [class.dtor]/16
Once a destructor is invoked for an object, the object no longer
exists; the behavior is undefined if the destructor is invoked for an
object whose lifetime has ended. [ Example: If the destructor
for an automatic object is explicitly invoked, and the block is
subsequently left in a manner that would ordinarily invoke implicit
destruction of the object, the behavior is undefined.
— end example ]
[basic.life]/1
[...] The lifetime of an object o of type T ends when:
if T is a class type with a non-trivial destructor ([class.dtor]), the destructor call starts, or
the storage which the object occupies is released, or is reused by an object that is not nested within o ([intro.object]).
(Depending on whether the destructor of your class is trivial, the line of code that ends the lifetime of the object is different. If the destructor is non-trivial, explicitly calling the destructor ends the lifetime of the object; otherwise, the placement new reuses the storage of the current object, ending its lifetime. In either case, the lifetime of the object has been ended when the assignment operator returns.)
You may think that this is yet another "any sane implementation will do the right thing" kind of undefined behavior, but actually many compiler optimization involve caching values, which take advantage of this specification. Therefore, your code can break at any time when the code is compiled under a different optimization level, by a different compiler, with a different version of the same compiler, or when the compiler just had a terrible day and is in a bad mood.
The actual "canonical" way is to use the copy-and-swap idiom:
// copy constructor is implemented normally
C::C(const C& other)
: // ...
{
// ...
}
// move constructor = default construct + swap
C::C(C&& other) noexcept
: C{}
{
swap(*this, other);
}
// assignment operator = (copy +) swap
C& C::operator=(C other) noexcept // C is taken by value to handle both copy and move
{
swap(*this, other);
return *this;
}
Note that, here, you need to provide a custom swap function instead of using std::swap, as mentioned by Howard Hinnant:
friend void swap(C& lhs, C& rhs) noexcept
{
// swap the members
}
If used properly, copy-and-swap incurs no overhead if the relevant functions are properly inlined (which should be pretty trivial). This idiom is very commonly used, and an average C++ programmer should have little trouble understanding it. Instead of being afraid that it will cause confusion, just take 2 minutes to learn it and then use it.
This time, we are swapping the values of the objects, and the lifetime of the object is not affected. The object is still the original object, just with a different value, not a brand new object. Think of it this way: you want to stop a kid from bullying others. Swapping values is like civilly educating them, whereas "destroying + constructing" is like killing them making them temporarily dead and giving them a brand new brain (possibly with the help of magic). The latter method can have some undesirable side effects, to say the least.
Like any other idiom, use it when appropriate — don't just use it for the sake of using it.

I believe the example in http://eel.is/c++draft/basic.life#8 clearly proves that assignment operators can be implemented through inplace "destroy + construct" assuming certain limitations related to non-const, non-overlapping objects and more.

Related

Is it feasible to optimize the assignment operator with swap [duplicate]

What is the copy-and-swap idiom and when should it be used? What problems does it solve? Does it change for C++11?
Related:
What are your favorite C++ Coding Style idioms: Copy-swap
Copy constructor and = operator overload in C++: is a common function possible?
What is copy elision and how it optimizes copy-and-swap idiom
C++: dynamically allocating an array of objects?
Overview
Why do we need the copy-and-swap idiom?
Any class that manages a resource (a wrapper, like a smart pointer) needs to implement The Big Three. While the goals and implementation of the copy-constructor and destructor are straightforward, the copy-assignment operator is arguably the most nuanced and difficult. How should it be done? What pitfalls need to be avoided?
The copy-and-swap idiom is the solution, and elegantly assists the assignment operator in achieving two things: avoiding code duplication, and providing a strong exception guarantee.
How does it work?
Conceptually, it works by using the copy-constructor's functionality to create a local copy of the data, then takes the copied data with a swap function, swapping the old data with the new data. The temporary copy then destructs, taking the old data with it. We are left with a copy of the new data.
In order to use the copy-and-swap idiom, we need three things: a working copy-constructor, a working destructor (both are the basis of any wrapper, so should be complete anyway), and a swap function.
A swap function is a non-throwing function that swaps two objects of a class, member for member. We might be tempted to use std::swap instead of providing our own, but this would be impossible; std::swap uses the copy-constructor and copy-assignment operator within its implementation, and we'd ultimately be trying to define the assignment operator in terms of itself!
(Not only that, but unqualified calls to swap will use our custom swap operator, skipping over the unnecessary construction and destruction of our class that std::swap would entail.)
An in-depth explanation
The goal
Let's consider a concrete case. We want to manage, in an otherwise useless class, a dynamic array. We start with a working constructor, copy-constructor, and destructor:
#include <algorithm> // std::copy
#include <cstddef> // std::size_t
class dumb_array
{
public:
// (default) constructor
dumb_array(std::size_t size = 0)
: mSize(size),
mArray(mSize ? new int[mSize]() : nullptr)
{
}
// copy-constructor
dumb_array(const dumb_array& other)
: mSize(other.mSize),
mArray(mSize ? new int[mSize] : nullptr)
{
// note that this is non-throwing, because of the data
// types being used; more attention to detail with regards
// to exceptions must be given in a more general case, however
std::copy(other.mArray, other.mArray + mSize, mArray);
}
// destructor
~dumb_array()
{
delete [] mArray;
}
private:
std::size_t mSize;
int* mArray;
};
This class almost manages the array successfully, but it needs operator= to work correctly.
A failed solution
Here's how a naive implementation might look:
// the hard part
dumb_array& operator=(const dumb_array& other)
{
if (this != &other) // (1)
{
// get rid of the old data...
delete [] mArray; // (2)
mArray = nullptr; // (2) *(see footnote for rationale)
// ...and put in the new
mSize = other.mSize; // (3)
mArray = mSize ? new int[mSize] : nullptr; // (3)
std::copy(other.mArray, other.mArray + mSize, mArray); // (3)
}
return *this;
}
And we say we're finished; this now manages an array, without leaks. However, it suffers from three problems, marked sequentially in the code as (n).
The first is the self-assignment test.
This check serves two purposes: it's an easy way to prevent us from running needless code on self-assignment, and it protects us from subtle bugs (such as deleting the array only to try and copy it). But in all other cases it merely serves to slow the program down, and act as noise in the code; self-assignment rarely occurs, so most of the time this check is a waste.
It would be better if the operator could work properly without it.
The second is that it only provides a basic exception guarantee. If new int[mSize] fails, *this will have been modified. (Namely, the size is wrong and the data is gone!)
For a strong exception guarantee, it would need to be something akin to:
dumb_array& operator=(const dumb_array& other)
{
if (this != &other) // (1)
{
// get the new data ready before we replace the old
std::size_t newSize = other.mSize;
int* newArray = newSize ? new int[newSize]() : nullptr; // (3)
std::copy(other.mArray, other.mArray + newSize, newArray); // (3)
// replace the old data (all are non-throwing)
delete [] mArray;
mSize = newSize;
mArray = newArray;
}
return *this;
}
The code has expanded! Which leads us to the third problem: code duplication.
Our assignment operator effectively duplicates all the code we've already written elsewhere, and that's a terrible thing.
In our case, the core of it is only two lines (the allocation and the copy), but with more complex resources this code bloat can be quite a hassle. We should strive to never repeat ourselves.
(One might wonder: if this much code is needed to manage one resource correctly, what if my class manages more than one?
While this may seem to be a valid concern, and indeed it requires non-trivial try/catch clauses, this is a non-issue.
That's because a class should manage one resource only!)
A successful solution
As mentioned, the copy-and-swap idiom will fix all these issues. But right now, we have all the requirements except one: a swap function. While The Rule of Three successfully entails the existence of our copy-constructor, assignment operator, and destructor, it should really be called "The Big Three and A Half": any time your class manages a resource it also makes sense to provide a swap function.
We need to add swap functionality to our class, and we do that as follows†:
class dumb_array
{
public:
// ...
friend void swap(dumb_array& first, dumb_array& second) // nothrow
{
// enable ADL (not necessary in our case, but good practice)
using std::swap;
// by swapping the members of two objects,
// the two objects are effectively swapped
swap(first.mSize, second.mSize);
swap(first.mArray, second.mArray);
}
// ...
};
(Here is the explanation why public friend swap.) Now not only can we swap our dumb_array's, but swaps in general can be more efficient; it merely swaps pointers and sizes, rather than allocating and copying entire arrays. Aside from this bonus in functionality and efficiency, we are now ready to implement the copy-and-swap idiom.
Without further ado, our assignment operator is:
dumb_array& operator=(dumb_array other) // (1)
{
swap(*this, other); // (2)
return *this;
}
And that's it! With one fell swoop, all three problems are elegantly tackled at once.
Why does it work?
We first notice an important choice: the parameter argument is taken by-value. While one could just as easily do the following (and indeed, many naive implementations of the idiom do):
dumb_array& operator=(const dumb_array& other)
{
dumb_array temp(other);
swap(*this, temp);
return *this;
}
We lose an important optimization opportunity. Not only that, but this choice is critical in C++11, which is discussed later. (On a general note, a remarkably useful guideline is as follows: if you're going to make a copy of something in a function, let the compiler do it in the parameter list.‡)
Either way, this method of obtaining our resource is the key to eliminating code duplication: we get to use the code from the copy-constructor to make the copy, and never need to repeat any bit of it. Now that the copy is made, we are ready to swap.
Observe that upon entering the function that all the new data is already allocated, copied, and ready to be used. This is what gives us a strong exception guarantee for free: we won't even enter the function if construction of the copy fails, and it's therefore not possible to alter the state of *this. (What we did manually before for a strong exception guarantee, the compiler is doing for us now; how kind.)
At this point we are home-free, because swap is non-throwing. We swap our current data with the copied data, safely altering our state, and the old data gets put into the temporary. The old data is then released when the function returns. (Where upon the parameter's scope ends and its destructor is called.)
Because the idiom repeats no code, we cannot introduce bugs within the operator. Note that this means we are rid of the need for a self-assignment check, allowing a single uniform implementation of operator=. (Additionally, we no longer have a performance penalty on non-self-assignments.)
And that is the copy-and-swap idiom.
What about C++11?
The next version of C++, C++11, makes one very important change to how we manage resources: the Rule of Three is now The Rule of Four (and a half). Why? Because not only do we need to be able to copy-construct our resource, we need to move-construct it as well.
Luckily for us, this is easy:
class dumb_array
{
public:
// ...
// move constructor
dumb_array(dumb_array&& other) noexcept ††
: dumb_array() // initialize via default constructor, C++11 only
{
swap(*this, other);
}
// ...
};
What's going on here? Recall the goal of move-construction: to take the resources from another instance of the class, leaving it in a state guaranteed to be assignable and destructible.
So what we've done is simple: initialize via the default constructor (a C++11 feature), then swap with other; we know a default constructed instance of our class can safely be assigned and destructed, so we know other will be able to do the same, after swapping.
(Note that some compilers do not support constructor delegation; in this case, we have to manually default construct the class. This is an unfortunate but luckily trivial task.)
Why does that work?
That is the only change we need to make to our class, so why does it work? Remember the ever-important decision we made to make the parameter a value and not a reference:
dumb_array& operator=(dumb_array other); // (1)
Now, if other is being initialized with an rvalue, it will be move-constructed. Perfect. In the same way C++03 let us re-use our copy-constructor functionality by taking the argument by-value, C++11 will automatically pick the move-constructor when appropriate as well. (And, of course, as mentioned in previously linked article, the copying/moving of the value may simply be elided altogether.)
And so concludes the copy-and-swap idiom.
Footnotes
*Why do we set mArray to null? Because if any further code in the operator throws, the destructor of dumb_array might be called; and if that happens without setting it to null, we attempt to delete memory that's already been deleted! We avoid this by setting it to null, as deleting null is a no-operation.
†There are other claims that we should specialize std::swap for our type, provide an in-class swap along-side a free-function swap, etc. But this is all unnecessary: any proper use of swap will be through an unqualified call, and our function will be found through ADL. One function will do.
‡The reason is simple: once you have the resource to yourself, you may swap and/or move it (C++11) anywhere it needs to be. And by making the copy in the parameter list, you maximize optimization.
††The move constructor should generally be noexcept, otherwise some code (e.g. std::vector resizing logic) will use the copy constructor even when a move would make sense. Of course, only mark it noexcept if the code inside doesn't throw exceptions.
Assignment, at its heart, is two steps: tearing down the object's old state and building its new state as a copy of some other object's state.
Basically, that's what the destructor and the copy constructor do, so the first idea would be to delegate the work to them. However, since destruction mustn't fail, while construction might, we actually want to do it the other way around: first perform the constructive part and, if that succeeded, then do the destructive part. The copy-and-swap idiom is a way to do just that: It first calls a class' copy constructor to create a temporary object, then swaps its data with the temporary's, and then lets the temporary's destructor destroy the old state.
Since swap() is supposed to never fail, the only part which might fail is the copy-construction. That is performed first, and if it fails, nothing will be changed in the targeted object.
In its refined form, copy-and-swap is implemented by having the copy performed by initializing the (non-reference) parameter of the assignment operator:
T& operator=(T tmp)
{
this->swap(tmp);
return *this;
}
There are some good answers already. I'll focus mainly on what I think they lack - an explanation of the "cons" with the copy-and-swap idiom....
What is the copy-and-swap idiom?
A way of implementing the assignment operator in terms of a swap function:
X& operator=(X rhs)
{
swap(rhs);
return *this;
}
The fundamental idea is that:
the most error-prone part of assigning to an object is ensuring any resources the new state needs are acquired (e.g. memory, descriptors)
that acquisition can be attempted before modifying the current state of the object (i.e. *this) if a copy of the new value is made, which is why rhs is accepted by value (i.e. copied) rather than by reference
swapping the state of the local copy rhs and *this is usually relatively easy to do without potential failure/exceptions, given the local copy doesn't need any particular state afterwards (just needs state fit for the destructor to run, much as for an object being moved from in >= C++11)
When should it be used? (Which problems does it solve [/create]?)
When you want the assigned-to objected unaffected by an assignment that throws an exception, assuming you have or can write a swap with strong exception guarantee, and ideally one that can't fail/throw..†
When you want a clean, easy to understand, robust way to define the assignment operator in terms of (simpler) copy constructor, swap and destructor functions.
Self-assignment done as a copy-and-swap avoids oft-overlooked edge cases.‡
When any performance penalty or momentarily higher resource usage created by having an extra temporary object during the assignment is not important to your application. ⁂
† swap throwing: it's generally possible to reliably swap data members that the objects track by pointer, but non-pointer data members that don't have a throw-free swap, or for which swapping has to be implemented as X tmp = lhs; lhs = rhs; rhs = tmp; and copy-construction or assignment may throw, still have the potential to fail leaving some data members swapped and others not. This potential applies even to C++03 std::string's as James comments on another answer:
#wilhelmtell: In C++03, there is no mention of exceptions potentially thrown by std::string::swap (which is called by std::swap). In C++0x, std::string::swap is noexcept and must not throw exceptions. – James McNellis Dec 22 '10 at 15:24
‡ assignment operator implementation that seems sane when assigning from a distinct object can easily fail for self-assignment. While it might seem unimaginable that client code would even attempt self-assignment, it can happen relatively easily during algo operations on containers, with x = f(x); code where f is (perhaps only for some #ifdef branches) a macro ala #define f(x) x or a function returning a reference to x, or even (likely inefficient but concise) code like x = c1 ? x * 2 : c2 ? x / 2 : x;). For example:
struct X
{
T* p_;
size_t size_;
X& operator=(const X& rhs)
{
delete[] p_; // OUCH!
p_ = new T[size_ = rhs.size_];
std::copy(p_, rhs.p_, rhs.p_ + rhs.size_);
}
...
};
On self-assignment, the above code delete's x.p_;, points p_ at a newly allocated heap region, then attempts to read the uninitialised data therein (Undefined Behaviour), if that doesn't do anything too weird, copy attempts a self-assignment to every just-destructed 'T'!
⁂ The copy-and-swap idiom can introduce inefficiencies or limitations due to the use of an extra temporary (when the operator's parameter is copy-constructed):
struct Client
{
IP_Address ip_address_;
int socket_;
X(const X& rhs)
: ip_address_(rhs.ip_address_), socket_(connect(rhs.ip_address_))
{ }
};
Here, a hand-written Client::operator= might check if *this is already connected to the same server as rhs (perhaps sending a "reset" code if useful), whereas the copy-and-swap approach would invoke the copy-constructor which would likely be written to open a distinct socket connection then close the original one. Not only could that mean a remote network interaction instead of a simple in-process variable copy, it could run afoul of client or server limits on socket resources or connections. (Of course this class has a pretty horrid interface, but that's another matter ;-P).
This answer is more like an addition and a slight modification to the answers above.
In some versions of Visual Studio (and possibly other compilers) there is a bug that is really annoying and doesn't make sense. So if you declare/define your swap function like this:
friend void swap(A& first, A& second) {
std::swap(first.size, second.size);
std::swap(first.arr, second.arr);
}
... the compiler will yell at you when you call the swap function:
This has something to do with a friend function being called and this object being passed as a parameter.
A way around this is to not use friend keyword and redefine the swap function:
void swap(A& other) {
std::swap(size, other.size);
std::swap(arr, other.arr);
}
This time, you can just call swap and pass in other, thus making the compiler happy:
After all, you don't need to use a friend function to swap 2 objects. It makes just as much sense to make swap a member function that has one other object as a parameter.
You already have access to this object, so passing it in as a parameter is technically redundant.
I would like to add a word of warning when you are dealing with C++11-style allocator-aware containers. Swapping and assignment have subtly different semantics.
For concreteness, let us consider a container std::vector<T, A>, where A is some stateful allocator type, and we'll compare the following functions:
void fs(std::vector<T, A> & a, std::vector<T, A> & b)
{
a.swap(b);
b.clear(); // not important what you do with b
}
void fm(std::vector<T, A> & a, std::vector<T, A> & b)
{
a = std::move(b);
}
The purpose of both functions fs and fm is to give a the state that b had initially. However, there is a hidden question: What happens if a.get_allocator() != b.get_allocator()? The answer is: It depends. Let's write AT = std::allocator_traits<A>.
If AT::propagate_on_container_move_assignment is std::true_type, then fm reassigns the allocator of a with the value of b.get_allocator(), otherwise it does not, and a continues to use its original allocator. In that case, the data elements need to be swapped individually, since the storage of a and b is not compatible.
If AT::propagate_on_container_swap is std::true_type, then fs swaps both data and allocators in the expected fashion.
If AT::propagate_on_container_swap is std::false_type, then we need a dynamic check.
If a.get_allocator() == b.get_allocator(), then the two containers use compatible storage, and swapping proceeds in the usual fashion.
However, if a.get_allocator() != b.get_allocator(), the program has undefined behaviour (cf. [container.requirements.general/8].
The upshot is that swapping has become a non-trivial operation in C++11 as soon as your container starts supporting stateful allocators. That's a somewhat "advanced use case", but it's not entirely unlikely, since move optimizations usually only become interesting once your class manages a resource, and memory is one of the most popular resources.

Why do I need the copy-and-swap idiom? [duplicate]

What is the copy-and-swap idiom and when should it be used? What problems does it solve? Does it change for C++11?
Related:
What are your favorite C++ Coding Style idioms: Copy-swap
Copy constructor and = operator overload in C++: is a common function possible?
What is copy elision and how it optimizes copy-and-swap idiom
C++: dynamically allocating an array of objects?
Overview
Why do we need the copy-and-swap idiom?
Any class that manages a resource (a wrapper, like a smart pointer) needs to implement The Big Three. While the goals and implementation of the copy-constructor and destructor are straightforward, the copy-assignment operator is arguably the most nuanced and difficult. How should it be done? What pitfalls need to be avoided?
The copy-and-swap idiom is the solution, and elegantly assists the assignment operator in achieving two things: avoiding code duplication, and providing a strong exception guarantee.
How does it work?
Conceptually, it works by using the copy-constructor's functionality to create a local copy of the data, then takes the copied data with a swap function, swapping the old data with the new data. The temporary copy then destructs, taking the old data with it. We are left with a copy of the new data.
In order to use the copy-and-swap idiom, we need three things: a working copy-constructor, a working destructor (both are the basis of any wrapper, so should be complete anyway), and a swap function.
A swap function is a non-throwing function that swaps two objects of a class, member for member. We might be tempted to use std::swap instead of providing our own, but this would be impossible; std::swap uses the copy-constructor and copy-assignment operator within its implementation, and we'd ultimately be trying to define the assignment operator in terms of itself!
(Not only that, but unqualified calls to swap will use our custom swap operator, skipping over the unnecessary construction and destruction of our class that std::swap would entail.)
An in-depth explanation
The goal
Let's consider a concrete case. We want to manage, in an otherwise useless class, a dynamic array. We start with a working constructor, copy-constructor, and destructor:
#include <algorithm> // std::copy
#include <cstddef> // std::size_t
class dumb_array
{
public:
// (default) constructor
dumb_array(std::size_t size = 0)
: mSize(size),
mArray(mSize ? new int[mSize]() : nullptr)
{
}
// copy-constructor
dumb_array(const dumb_array& other)
: mSize(other.mSize),
mArray(mSize ? new int[mSize] : nullptr)
{
// note that this is non-throwing, because of the data
// types being used; more attention to detail with regards
// to exceptions must be given in a more general case, however
std::copy(other.mArray, other.mArray + mSize, mArray);
}
// destructor
~dumb_array()
{
delete [] mArray;
}
private:
std::size_t mSize;
int* mArray;
};
This class almost manages the array successfully, but it needs operator= to work correctly.
A failed solution
Here's how a naive implementation might look:
// the hard part
dumb_array& operator=(const dumb_array& other)
{
if (this != &other) // (1)
{
// get rid of the old data...
delete [] mArray; // (2)
mArray = nullptr; // (2) *(see footnote for rationale)
// ...and put in the new
mSize = other.mSize; // (3)
mArray = mSize ? new int[mSize] : nullptr; // (3)
std::copy(other.mArray, other.mArray + mSize, mArray); // (3)
}
return *this;
}
And we say we're finished; this now manages an array, without leaks. However, it suffers from three problems, marked sequentially in the code as (n).
The first is the self-assignment test.
This check serves two purposes: it's an easy way to prevent us from running needless code on self-assignment, and it protects us from subtle bugs (such as deleting the array only to try and copy it). But in all other cases it merely serves to slow the program down, and act as noise in the code; self-assignment rarely occurs, so most of the time this check is a waste.
It would be better if the operator could work properly without it.
The second is that it only provides a basic exception guarantee. If new int[mSize] fails, *this will have been modified. (Namely, the size is wrong and the data is gone!)
For a strong exception guarantee, it would need to be something akin to:
dumb_array& operator=(const dumb_array& other)
{
if (this != &other) // (1)
{
// get the new data ready before we replace the old
std::size_t newSize = other.mSize;
int* newArray = newSize ? new int[newSize]() : nullptr; // (3)
std::copy(other.mArray, other.mArray + newSize, newArray); // (3)
// replace the old data (all are non-throwing)
delete [] mArray;
mSize = newSize;
mArray = newArray;
}
return *this;
}
The code has expanded! Which leads us to the third problem: code duplication.
Our assignment operator effectively duplicates all the code we've already written elsewhere, and that's a terrible thing.
In our case, the core of it is only two lines (the allocation and the copy), but with more complex resources this code bloat can be quite a hassle. We should strive to never repeat ourselves.
(One might wonder: if this much code is needed to manage one resource correctly, what if my class manages more than one?
While this may seem to be a valid concern, and indeed it requires non-trivial try/catch clauses, this is a non-issue.
That's because a class should manage one resource only!)
A successful solution
As mentioned, the copy-and-swap idiom will fix all these issues. But right now, we have all the requirements except one: a swap function. While The Rule of Three successfully entails the existence of our copy-constructor, assignment operator, and destructor, it should really be called "The Big Three and A Half": any time your class manages a resource it also makes sense to provide a swap function.
We need to add swap functionality to our class, and we do that as follows†:
class dumb_array
{
public:
// ...
friend void swap(dumb_array& first, dumb_array& second) // nothrow
{
// enable ADL (not necessary in our case, but good practice)
using std::swap;
// by swapping the members of two objects,
// the two objects are effectively swapped
swap(first.mSize, second.mSize);
swap(first.mArray, second.mArray);
}
// ...
};
(Here is the explanation why public friend swap.) Now not only can we swap our dumb_array's, but swaps in general can be more efficient; it merely swaps pointers and sizes, rather than allocating and copying entire arrays. Aside from this bonus in functionality and efficiency, we are now ready to implement the copy-and-swap idiom.
Without further ado, our assignment operator is:
dumb_array& operator=(dumb_array other) // (1)
{
swap(*this, other); // (2)
return *this;
}
And that's it! With one fell swoop, all three problems are elegantly tackled at once.
Why does it work?
We first notice an important choice: the parameter argument is taken by-value. While one could just as easily do the following (and indeed, many naive implementations of the idiom do):
dumb_array& operator=(const dumb_array& other)
{
dumb_array temp(other);
swap(*this, temp);
return *this;
}
We lose an important optimization opportunity. Not only that, but this choice is critical in C++11, which is discussed later. (On a general note, a remarkably useful guideline is as follows: if you're going to make a copy of something in a function, let the compiler do it in the parameter list.‡)
Either way, this method of obtaining our resource is the key to eliminating code duplication: we get to use the code from the copy-constructor to make the copy, and never need to repeat any bit of it. Now that the copy is made, we are ready to swap.
Observe that upon entering the function that all the new data is already allocated, copied, and ready to be used. This is what gives us a strong exception guarantee for free: we won't even enter the function if construction of the copy fails, and it's therefore not possible to alter the state of *this. (What we did manually before for a strong exception guarantee, the compiler is doing for us now; how kind.)
At this point we are home-free, because swap is non-throwing. We swap our current data with the copied data, safely altering our state, and the old data gets put into the temporary. The old data is then released when the function returns. (Where upon the parameter's scope ends and its destructor is called.)
Because the idiom repeats no code, we cannot introduce bugs within the operator. Note that this means we are rid of the need for a self-assignment check, allowing a single uniform implementation of operator=. (Additionally, we no longer have a performance penalty on non-self-assignments.)
And that is the copy-and-swap idiom.
What about C++11?
The next version of C++, C++11, makes one very important change to how we manage resources: the Rule of Three is now The Rule of Four (and a half). Why? Because not only do we need to be able to copy-construct our resource, we need to move-construct it as well.
Luckily for us, this is easy:
class dumb_array
{
public:
// ...
// move constructor
dumb_array(dumb_array&& other) noexcept ††
: dumb_array() // initialize via default constructor, C++11 only
{
swap(*this, other);
}
// ...
};
What's going on here? Recall the goal of move-construction: to take the resources from another instance of the class, leaving it in a state guaranteed to be assignable and destructible.
So what we've done is simple: initialize via the default constructor (a C++11 feature), then swap with other; we know a default constructed instance of our class can safely be assigned and destructed, so we know other will be able to do the same, after swapping.
(Note that some compilers do not support constructor delegation; in this case, we have to manually default construct the class. This is an unfortunate but luckily trivial task.)
Why does that work?
That is the only change we need to make to our class, so why does it work? Remember the ever-important decision we made to make the parameter a value and not a reference:
dumb_array& operator=(dumb_array other); // (1)
Now, if other is being initialized with an rvalue, it will be move-constructed. Perfect. In the same way C++03 let us re-use our copy-constructor functionality by taking the argument by-value, C++11 will automatically pick the move-constructor when appropriate as well. (And, of course, as mentioned in previously linked article, the copying/moving of the value may simply be elided altogether.)
And so concludes the copy-and-swap idiom.
Footnotes
*Why do we set mArray to null? Because if any further code in the operator throws, the destructor of dumb_array might be called; and if that happens without setting it to null, we attempt to delete memory that's already been deleted! We avoid this by setting it to null, as deleting null is a no-operation.
†There are other claims that we should specialize std::swap for our type, provide an in-class swap along-side a free-function swap, etc. But this is all unnecessary: any proper use of swap will be through an unqualified call, and our function will be found through ADL. One function will do.
‡The reason is simple: once you have the resource to yourself, you may swap and/or move it (C++11) anywhere it needs to be. And by making the copy in the parameter list, you maximize optimization.
††The move constructor should generally be noexcept, otherwise some code (e.g. std::vector resizing logic) will use the copy constructor even when a move would make sense. Of course, only mark it noexcept if the code inside doesn't throw exceptions.
Assignment, at its heart, is two steps: tearing down the object's old state and building its new state as a copy of some other object's state.
Basically, that's what the destructor and the copy constructor do, so the first idea would be to delegate the work to them. However, since destruction mustn't fail, while construction might, we actually want to do it the other way around: first perform the constructive part and, if that succeeded, then do the destructive part. The copy-and-swap idiom is a way to do just that: It first calls a class' copy constructor to create a temporary object, then swaps its data with the temporary's, and then lets the temporary's destructor destroy the old state.
Since swap() is supposed to never fail, the only part which might fail is the copy-construction. That is performed first, and if it fails, nothing will be changed in the targeted object.
In its refined form, copy-and-swap is implemented by having the copy performed by initializing the (non-reference) parameter of the assignment operator:
T& operator=(T tmp)
{
this->swap(tmp);
return *this;
}
There are some good answers already. I'll focus mainly on what I think they lack - an explanation of the "cons" with the copy-and-swap idiom....
What is the copy-and-swap idiom?
A way of implementing the assignment operator in terms of a swap function:
X& operator=(X rhs)
{
swap(rhs);
return *this;
}
The fundamental idea is that:
the most error-prone part of assigning to an object is ensuring any resources the new state needs are acquired (e.g. memory, descriptors)
that acquisition can be attempted before modifying the current state of the object (i.e. *this) if a copy of the new value is made, which is why rhs is accepted by value (i.e. copied) rather than by reference
swapping the state of the local copy rhs and *this is usually relatively easy to do without potential failure/exceptions, given the local copy doesn't need any particular state afterwards (just needs state fit for the destructor to run, much as for an object being moved from in >= C++11)
When should it be used? (Which problems does it solve [/create]?)
When you want the assigned-to objected unaffected by an assignment that throws an exception, assuming you have or can write a swap with strong exception guarantee, and ideally one that can't fail/throw..†
When you want a clean, easy to understand, robust way to define the assignment operator in terms of (simpler) copy constructor, swap and destructor functions.
Self-assignment done as a copy-and-swap avoids oft-overlooked edge cases.‡
When any performance penalty or momentarily higher resource usage created by having an extra temporary object during the assignment is not important to your application. ⁂
† swap throwing: it's generally possible to reliably swap data members that the objects track by pointer, but non-pointer data members that don't have a throw-free swap, or for which swapping has to be implemented as X tmp = lhs; lhs = rhs; rhs = tmp; and copy-construction or assignment may throw, still have the potential to fail leaving some data members swapped and others not. This potential applies even to C++03 std::string's as James comments on another answer:
#wilhelmtell: In C++03, there is no mention of exceptions potentially thrown by std::string::swap (which is called by std::swap). In C++0x, std::string::swap is noexcept and must not throw exceptions. – James McNellis Dec 22 '10 at 15:24
‡ assignment operator implementation that seems sane when assigning from a distinct object can easily fail for self-assignment. While it might seem unimaginable that client code would even attempt self-assignment, it can happen relatively easily during algo operations on containers, with x = f(x); code where f is (perhaps only for some #ifdef branches) a macro ala #define f(x) x or a function returning a reference to x, or even (likely inefficient but concise) code like x = c1 ? x * 2 : c2 ? x / 2 : x;). For example:
struct X
{
T* p_;
size_t size_;
X& operator=(const X& rhs)
{
delete[] p_; // OUCH!
p_ = new T[size_ = rhs.size_];
std::copy(p_, rhs.p_, rhs.p_ + rhs.size_);
}
...
};
On self-assignment, the above code delete's x.p_;, points p_ at a newly allocated heap region, then attempts to read the uninitialised data therein (Undefined Behaviour), if that doesn't do anything too weird, copy attempts a self-assignment to every just-destructed 'T'!
⁂ The copy-and-swap idiom can introduce inefficiencies or limitations due to the use of an extra temporary (when the operator's parameter is copy-constructed):
struct Client
{
IP_Address ip_address_;
int socket_;
X(const X& rhs)
: ip_address_(rhs.ip_address_), socket_(connect(rhs.ip_address_))
{ }
};
Here, a hand-written Client::operator= might check if *this is already connected to the same server as rhs (perhaps sending a "reset" code if useful), whereas the copy-and-swap approach would invoke the copy-constructor which would likely be written to open a distinct socket connection then close the original one. Not only could that mean a remote network interaction instead of a simple in-process variable copy, it could run afoul of client or server limits on socket resources or connections. (Of course this class has a pretty horrid interface, but that's another matter ;-P).
This answer is more like an addition and a slight modification to the answers above.
In some versions of Visual Studio (and possibly other compilers) there is a bug that is really annoying and doesn't make sense. So if you declare/define your swap function like this:
friend void swap(A& first, A& second) {
std::swap(first.size, second.size);
std::swap(first.arr, second.arr);
}
... the compiler will yell at you when you call the swap function:
This has something to do with a friend function being called and this object being passed as a parameter.
A way around this is to not use friend keyword and redefine the swap function:
void swap(A& other) {
std::swap(size, other.size);
std::swap(arr, other.arr);
}
This time, you can just call swap and pass in other, thus making the compiler happy:
After all, you don't need to use a friend function to swap 2 objects. It makes just as much sense to make swap a member function that has one other object as a parameter.
You already have access to this object, so passing it in as a parameter is technically redundant.
I would like to add a word of warning when you are dealing with C++11-style allocator-aware containers. Swapping and assignment have subtly different semantics.
For concreteness, let us consider a container std::vector<T, A>, where A is some stateful allocator type, and we'll compare the following functions:
void fs(std::vector<T, A> & a, std::vector<T, A> & b)
{
a.swap(b);
b.clear(); // not important what you do with b
}
void fm(std::vector<T, A> & a, std::vector<T, A> & b)
{
a = std::move(b);
}
The purpose of both functions fs and fm is to give a the state that b had initially. However, there is a hidden question: What happens if a.get_allocator() != b.get_allocator()? The answer is: It depends. Let's write AT = std::allocator_traits<A>.
If AT::propagate_on_container_move_assignment is std::true_type, then fm reassigns the allocator of a with the value of b.get_allocator(), otherwise it does not, and a continues to use its original allocator. In that case, the data elements need to be swapped individually, since the storage of a and b is not compatible.
If AT::propagate_on_container_swap is std::true_type, then fs swaps both data and allocators in the expected fashion.
If AT::propagate_on_container_swap is std::false_type, then we need a dynamic check.
If a.get_allocator() == b.get_allocator(), then the two containers use compatible storage, and swapping proceeds in the usual fashion.
However, if a.get_allocator() != b.get_allocator(), the program has undefined behaviour (cf. [container.requirements.general/8].
The upshot is that swapping has become a non-trivial operation in C++11 as soon as your container starts supporting stateful allocators. That's a somewhat "advanced use case", but it's not entirely unlikely, since move optimizations usually only become interesting once your class manages a resource, and memory is one of the most popular resources.

Implementing the copy constructor in terms of operator=

If the operator= is properly defined, is it OK to use the following as copy constructor?
MyClass::MyClass(MyClass const &_copy)
{
*this = _copy;
}
If all members of MyClass have a default constructor, yes.
Note that usually it is the other way around:
class MyClass
{
public:
MyClass(MyClass const&); // Implemented
void swap(MyClass&) throw(); // Implemented
MyClass& operator=(MyClass rhs) { rhs.swap(*this); return *this; }
};
We pass by value in operator= so that the copy constructor gets called. Note that everything is exception safe, since swap is guaranteed not to throw (you have to ensure this in your implementation).
EDIT, as requested, about the call-by-value stuff: The operator= could be written as
MyClass& MyClass::operator=(MyClass const& rhs)
{
MyClass tmp(rhs);
tmp.swap(*this);
return *this;
}
C++ students are usually told to pass class instances by reference because the copy constructor gets called if they are passed by value. In our case, we have to copy rhs anyway, so passing by value is fine.
Thus, the operator= (first version, call by value) reads:
Make a copy of rhs (via the copy constructor, automatically called)
Swap its contents with *this
Return *this and let rhs (which contains the old value) be destroyed at method exit.
Now, we have an extra bonus with this call-by-value. If the object being passed to operator= (or any function which gets its arguments by value) is a temporary object, the compiler can (and usually does) make no copy at all. This is called copy elision.
Therefore, if rhs is temporary, no copy is made. We are left with:
Swap this and rhs contents
Destroy rhs
So passing by value is in this case more efficient than passing by reference.
It is more advisable to implement operator= in terms of an exception safe copy constructor. See Example 4. in this from Herb Sutter for an explanation of the technique and why it's a good idea.
http://www.gotw.ca/gotw/059.htm
This implementation implies that the default constructors for all the data members (and base classes) are available and accessible from MyClass, because they will be called first, before making the assignment. Even in this case, having this extra call for the constructors might be expensive (depending on the content of the class).
I would still stick to separate implementation of the copy constructor through initialization list, even if it means writing more code.
Another thing: This implementation might have side effects (e.g. if you have dynamically allocated members).
While the end result is the same, the members are first default initialized, only copied after that.
With 'expensive' members, you better copy-construct with an initializer list.
struct C {
ExpensiveType member;
C( const C& other ): member(other.member) {}
};
};
I would say this is not okay if MyClass allocates memory or is mutable.
yes.
personally, if your class doesn't have pointers though I'd not overload the equal operator or write the copy constructor and let the compiler do it for you; it will implement a shallow copy and you'll know for sure that all member data is copied, whereas if you overload the = op; and then add a data member and then forget to update the overload you'll have a problem.
#Alexandre - I am not sure about passing by value in assignment operator. What is the advantage you will get by calling copy constructor there? Is this going to fasten the assignment operator?
P.S. I don't know how to write comments. Or may be I am not allowed to write comments.
It is technically OK, if you have a working assignment operator (copy operator).
However, you should prefer copy-and-swap because:
Exception safety is easier with copy-swap
Most logical separation of concerns:
The copy-ctor is about allocating the resources it needs (to copy the other stuff).
The swap function is (mostly) only about exchanging internal "handles" and doesn't need to do resource (de)allocation
The destructor is about resource deallocation
Copy-and-swap naturally combines these three function in the assignment/copy operator

What is the copy-and-swap idiom?

What is the copy-and-swap idiom and when should it be used? What problems does it solve? Does it change for C++11?
Related:
What are your favorite C++ Coding Style idioms: Copy-swap
Copy constructor and = operator overload in C++: is a common function possible?
What is copy elision and how it optimizes copy-and-swap idiom
C++: dynamically allocating an array of objects?
Overview
Why do we need the copy-and-swap idiom?
Any class that manages a resource (a wrapper, like a smart pointer) needs to implement The Big Three. While the goals and implementation of the copy-constructor and destructor are straightforward, the copy-assignment operator is arguably the most nuanced and difficult. How should it be done? What pitfalls need to be avoided?
The copy-and-swap idiom is the solution, and elegantly assists the assignment operator in achieving two things: avoiding code duplication, and providing a strong exception guarantee.
How does it work?
Conceptually, it works by using the copy-constructor's functionality to create a local copy of the data, then takes the copied data with a swap function, swapping the old data with the new data. The temporary copy then destructs, taking the old data with it. We are left with a copy of the new data.
In order to use the copy-and-swap idiom, we need three things: a working copy-constructor, a working destructor (both are the basis of any wrapper, so should be complete anyway), and a swap function.
A swap function is a non-throwing function that swaps two objects of a class, member for member. We might be tempted to use std::swap instead of providing our own, but this would be impossible; std::swap uses the copy-constructor and copy-assignment operator within its implementation, and we'd ultimately be trying to define the assignment operator in terms of itself!
(Not only that, but unqualified calls to swap will use our custom swap operator, skipping over the unnecessary construction and destruction of our class that std::swap would entail.)
An in-depth explanation
The goal
Let's consider a concrete case. We want to manage, in an otherwise useless class, a dynamic array. We start with a working constructor, copy-constructor, and destructor:
#include <algorithm> // std::copy
#include <cstddef> // std::size_t
class dumb_array
{
public:
// (default) constructor
dumb_array(std::size_t size = 0)
: mSize(size),
mArray(mSize ? new int[mSize]() : nullptr)
{
}
// copy-constructor
dumb_array(const dumb_array& other)
: mSize(other.mSize),
mArray(mSize ? new int[mSize] : nullptr)
{
// note that this is non-throwing, because of the data
// types being used; more attention to detail with regards
// to exceptions must be given in a more general case, however
std::copy(other.mArray, other.mArray + mSize, mArray);
}
// destructor
~dumb_array()
{
delete [] mArray;
}
private:
std::size_t mSize;
int* mArray;
};
This class almost manages the array successfully, but it needs operator= to work correctly.
A failed solution
Here's how a naive implementation might look:
// the hard part
dumb_array& operator=(const dumb_array& other)
{
if (this != &other) // (1)
{
// get rid of the old data...
delete [] mArray; // (2)
mArray = nullptr; // (2) *(see footnote for rationale)
// ...and put in the new
mSize = other.mSize; // (3)
mArray = mSize ? new int[mSize] : nullptr; // (3)
std::copy(other.mArray, other.mArray + mSize, mArray); // (3)
}
return *this;
}
And we say we're finished; this now manages an array, without leaks. However, it suffers from three problems, marked sequentially in the code as (n).
The first is the self-assignment test.
This check serves two purposes: it's an easy way to prevent us from running needless code on self-assignment, and it protects us from subtle bugs (such as deleting the array only to try and copy it). But in all other cases it merely serves to slow the program down, and act as noise in the code; self-assignment rarely occurs, so most of the time this check is a waste.
It would be better if the operator could work properly without it.
The second is that it only provides a basic exception guarantee. If new int[mSize] fails, *this will have been modified. (Namely, the size is wrong and the data is gone!)
For a strong exception guarantee, it would need to be something akin to:
dumb_array& operator=(const dumb_array& other)
{
if (this != &other) // (1)
{
// get the new data ready before we replace the old
std::size_t newSize = other.mSize;
int* newArray = newSize ? new int[newSize]() : nullptr; // (3)
std::copy(other.mArray, other.mArray + newSize, newArray); // (3)
// replace the old data (all are non-throwing)
delete [] mArray;
mSize = newSize;
mArray = newArray;
}
return *this;
}
The code has expanded! Which leads us to the third problem: code duplication.
Our assignment operator effectively duplicates all the code we've already written elsewhere, and that's a terrible thing.
In our case, the core of it is only two lines (the allocation and the copy), but with more complex resources this code bloat can be quite a hassle. We should strive to never repeat ourselves.
(One might wonder: if this much code is needed to manage one resource correctly, what if my class manages more than one?
While this may seem to be a valid concern, and indeed it requires non-trivial try/catch clauses, this is a non-issue.
That's because a class should manage one resource only!)
A successful solution
As mentioned, the copy-and-swap idiom will fix all these issues. But right now, we have all the requirements except one: a swap function. While The Rule of Three successfully entails the existence of our copy-constructor, assignment operator, and destructor, it should really be called "The Big Three and A Half": any time your class manages a resource it also makes sense to provide a swap function.
We need to add swap functionality to our class, and we do that as follows†:
class dumb_array
{
public:
// ...
friend void swap(dumb_array& first, dumb_array& second) // nothrow
{
// enable ADL (not necessary in our case, but good practice)
using std::swap;
// by swapping the members of two objects,
// the two objects are effectively swapped
swap(first.mSize, second.mSize);
swap(first.mArray, second.mArray);
}
// ...
};
(Here is the explanation why public friend swap.) Now not only can we swap our dumb_array's, but swaps in general can be more efficient; it merely swaps pointers and sizes, rather than allocating and copying entire arrays. Aside from this bonus in functionality and efficiency, we are now ready to implement the copy-and-swap idiom.
Without further ado, our assignment operator is:
dumb_array& operator=(dumb_array other) // (1)
{
swap(*this, other); // (2)
return *this;
}
And that's it! With one fell swoop, all three problems are elegantly tackled at once.
Why does it work?
We first notice an important choice: the parameter argument is taken by-value. While one could just as easily do the following (and indeed, many naive implementations of the idiom do):
dumb_array& operator=(const dumb_array& other)
{
dumb_array temp(other);
swap(*this, temp);
return *this;
}
We lose an important optimization opportunity. Not only that, but this choice is critical in C++11, which is discussed later. (On a general note, a remarkably useful guideline is as follows: if you're going to make a copy of something in a function, let the compiler do it in the parameter list.‡)
Either way, this method of obtaining our resource is the key to eliminating code duplication: we get to use the code from the copy-constructor to make the copy, and never need to repeat any bit of it. Now that the copy is made, we are ready to swap.
Observe that upon entering the function that all the new data is already allocated, copied, and ready to be used. This is what gives us a strong exception guarantee for free: we won't even enter the function if construction of the copy fails, and it's therefore not possible to alter the state of *this. (What we did manually before for a strong exception guarantee, the compiler is doing for us now; how kind.)
At this point we are home-free, because swap is non-throwing. We swap our current data with the copied data, safely altering our state, and the old data gets put into the temporary. The old data is then released when the function returns. (Where upon the parameter's scope ends and its destructor is called.)
Because the idiom repeats no code, we cannot introduce bugs within the operator. Note that this means we are rid of the need for a self-assignment check, allowing a single uniform implementation of operator=. (Additionally, we no longer have a performance penalty on non-self-assignments.)
And that is the copy-and-swap idiom.
What about C++11?
The next version of C++, C++11, makes one very important change to how we manage resources: the Rule of Three is now The Rule of Four (and a half). Why? Because not only do we need to be able to copy-construct our resource, we need to move-construct it as well.
Luckily for us, this is easy:
class dumb_array
{
public:
// ...
// move constructor
dumb_array(dumb_array&& other) noexcept ††
: dumb_array() // initialize via default constructor, C++11 only
{
swap(*this, other);
}
// ...
};
What's going on here? Recall the goal of move-construction: to take the resources from another instance of the class, leaving it in a state guaranteed to be assignable and destructible.
So what we've done is simple: initialize via the default constructor (a C++11 feature), then swap with other; we know a default constructed instance of our class can safely be assigned and destructed, so we know other will be able to do the same, after swapping.
(Note that some compilers do not support constructor delegation; in this case, we have to manually default construct the class. This is an unfortunate but luckily trivial task.)
Why does that work?
That is the only change we need to make to our class, so why does it work? Remember the ever-important decision we made to make the parameter a value and not a reference:
dumb_array& operator=(dumb_array other); // (1)
Now, if other is being initialized with an rvalue, it will be move-constructed. Perfect. In the same way C++03 let us re-use our copy-constructor functionality by taking the argument by-value, C++11 will automatically pick the move-constructor when appropriate as well. (And, of course, as mentioned in previously linked article, the copying/moving of the value may simply be elided altogether.)
And so concludes the copy-and-swap idiom.
Footnotes
*Why do we set mArray to null? Because if any further code in the operator throws, the destructor of dumb_array might be called; and if that happens without setting it to null, we attempt to delete memory that's already been deleted! We avoid this by setting it to null, as deleting null is a no-operation.
†There are other claims that we should specialize std::swap for our type, provide an in-class swap along-side a free-function swap, etc. But this is all unnecessary: any proper use of swap will be through an unqualified call, and our function will be found through ADL. One function will do.
‡The reason is simple: once you have the resource to yourself, you may swap and/or move it (C++11) anywhere it needs to be. And by making the copy in the parameter list, you maximize optimization.
††The move constructor should generally be noexcept, otherwise some code (e.g. std::vector resizing logic) will use the copy constructor even when a move would make sense. Of course, only mark it noexcept if the code inside doesn't throw exceptions.
Assignment, at its heart, is two steps: tearing down the object's old state and building its new state as a copy of some other object's state.
Basically, that's what the destructor and the copy constructor do, so the first idea would be to delegate the work to them. However, since destruction mustn't fail, while construction might, we actually want to do it the other way around: first perform the constructive part and, if that succeeded, then do the destructive part. The copy-and-swap idiom is a way to do just that: It first calls a class' copy constructor to create a temporary object, then swaps its data with the temporary's, and then lets the temporary's destructor destroy the old state.
Since swap() is supposed to never fail, the only part which might fail is the copy-construction. That is performed first, and if it fails, nothing will be changed in the targeted object.
In its refined form, copy-and-swap is implemented by having the copy performed by initializing the (non-reference) parameter of the assignment operator:
T& operator=(T tmp)
{
this->swap(tmp);
return *this;
}
There are some good answers already. I'll focus mainly on what I think they lack - an explanation of the "cons" with the copy-and-swap idiom....
What is the copy-and-swap idiom?
A way of implementing the assignment operator in terms of a swap function:
X& operator=(X rhs)
{
swap(rhs);
return *this;
}
The fundamental idea is that:
the most error-prone part of assigning to an object is ensuring any resources the new state needs are acquired (e.g. memory, descriptors)
that acquisition can be attempted before modifying the current state of the object (i.e. *this) if a copy of the new value is made, which is why rhs is accepted by value (i.e. copied) rather than by reference
swapping the state of the local copy rhs and *this is usually relatively easy to do without potential failure/exceptions, given the local copy doesn't need any particular state afterwards (just needs state fit for the destructor to run, much as for an object being moved from in >= C++11)
When should it be used? (Which problems does it solve [/create]?)
When you want the assigned-to objected unaffected by an assignment that throws an exception, assuming you have or can write a swap with strong exception guarantee, and ideally one that can't fail/throw..†
When you want a clean, easy to understand, robust way to define the assignment operator in terms of (simpler) copy constructor, swap and destructor functions.
Self-assignment done as a copy-and-swap avoids oft-overlooked edge cases.‡
When any performance penalty or momentarily higher resource usage created by having an extra temporary object during the assignment is not important to your application. ⁂
† swap throwing: it's generally possible to reliably swap data members that the objects track by pointer, but non-pointer data members that don't have a throw-free swap, or for which swapping has to be implemented as X tmp = lhs; lhs = rhs; rhs = tmp; and copy-construction or assignment may throw, still have the potential to fail leaving some data members swapped and others not. This potential applies even to C++03 std::string's as James comments on another answer:
#wilhelmtell: In C++03, there is no mention of exceptions potentially thrown by std::string::swap (which is called by std::swap). In C++0x, std::string::swap is noexcept and must not throw exceptions. – James McNellis Dec 22 '10 at 15:24
‡ assignment operator implementation that seems sane when assigning from a distinct object can easily fail for self-assignment. While it might seem unimaginable that client code would even attempt self-assignment, it can happen relatively easily during algo operations on containers, with x = f(x); code where f is (perhaps only for some #ifdef branches) a macro ala #define f(x) x or a function returning a reference to x, or even (likely inefficient but concise) code like x = c1 ? x * 2 : c2 ? x / 2 : x;). For example:
struct X
{
T* p_;
size_t size_;
X& operator=(const X& rhs)
{
delete[] p_; // OUCH!
p_ = new T[size_ = rhs.size_];
std::copy(p_, rhs.p_, rhs.p_ + rhs.size_);
}
...
};
On self-assignment, the above code delete's x.p_;, points p_ at a newly allocated heap region, then attempts to read the uninitialised data therein (Undefined Behaviour), if that doesn't do anything too weird, copy attempts a self-assignment to every just-destructed 'T'!
⁂ The copy-and-swap idiom can introduce inefficiencies or limitations due to the use of an extra temporary (when the operator's parameter is copy-constructed):
struct Client
{
IP_Address ip_address_;
int socket_;
X(const X& rhs)
: ip_address_(rhs.ip_address_), socket_(connect(rhs.ip_address_))
{ }
};
Here, a hand-written Client::operator= might check if *this is already connected to the same server as rhs (perhaps sending a "reset" code if useful), whereas the copy-and-swap approach would invoke the copy-constructor which would likely be written to open a distinct socket connection then close the original one. Not only could that mean a remote network interaction instead of a simple in-process variable copy, it could run afoul of client or server limits on socket resources or connections. (Of course this class has a pretty horrid interface, but that's another matter ;-P).
This answer is more like an addition and a slight modification to the answers above.
In some versions of Visual Studio (and possibly other compilers) there is a bug that is really annoying and doesn't make sense. So if you declare/define your swap function like this:
friend void swap(A& first, A& second) {
std::swap(first.size, second.size);
std::swap(first.arr, second.arr);
}
... the compiler will yell at you when you call the swap function:
This has something to do with a friend function being called and this object being passed as a parameter.
A way around this is to not use friend keyword and redefine the swap function:
void swap(A& other) {
std::swap(size, other.size);
std::swap(arr, other.arr);
}
This time, you can just call swap and pass in other, thus making the compiler happy:
After all, you don't need to use a friend function to swap 2 objects. It makes just as much sense to make swap a member function that has one other object as a parameter.
You already have access to this object, so passing it in as a parameter is technically redundant.
I would like to add a word of warning when you are dealing with C++11-style allocator-aware containers. Swapping and assignment have subtly different semantics.
For concreteness, let us consider a container std::vector<T, A>, where A is some stateful allocator type, and we'll compare the following functions:
void fs(std::vector<T, A> & a, std::vector<T, A> & b)
{
a.swap(b);
b.clear(); // not important what you do with b
}
void fm(std::vector<T, A> & a, std::vector<T, A> & b)
{
a = std::move(b);
}
The purpose of both functions fs and fm is to give a the state that b had initially. However, there is a hidden question: What happens if a.get_allocator() != b.get_allocator()? The answer is: It depends. Let's write AT = std::allocator_traits<A>.
If AT::propagate_on_container_move_assignment is std::true_type, then fm reassigns the allocator of a with the value of b.get_allocator(), otherwise it does not, and a continues to use its original allocator. In that case, the data elements need to be swapped individually, since the storage of a and b is not compatible.
If AT::propagate_on_container_swap is std::true_type, then fs swaps both data and allocators in the expected fashion.
If AT::propagate_on_container_swap is std::false_type, then we need a dynamic check.
If a.get_allocator() == b.get_allocator(), then the two containers use compatible storage, and swapping proceeds in the usual fashion.
However, if a.get_allocator() != b.get_allocator(), the program has undefined behaviour (cf. [container.requirements.general/8].
The upshot is that swapping has become a non-trivial operation in C++11 as soon as your container starts supporting stateful allocators. That's a somewhat "advanced use case", but it's not entirely unlikely, since move optimizations usually only become interesting once your class manages a resource, and memory is one of the most popular resources.

Who deletes the copied instance in + operator ? (c++)

I searched how to implement + operator properly all over the internet and all the results i found do the following steps :
const MyClass MyClass::operator+(const MyClass &other) const
{
MyClass result = *this; // Make a copy of myself. Same as MyClass result(*this);
result += other; // Use += to add other to the copy.
return result; // All done!
}
I have few questions about this "process" :
Isn't that stupid to implement + operator this way, it calls the assignment operator(which copies the class) in the first line and then the copy constructor in the return (which also copies the class , due to the fact that the return is by value, so it destroys the first copy and creates a new one.. which is frankly not really smart ... )
When i write a=b+c, the b+c part creates a new copy of the class, then the 'a=' part copies the copy to himself.
who deletes the copy that b+c created ?
Is there a better way to implement + operator without coping the class twice, and also without any memory issues ?
thanks in advance
That's effectively not an assignment operator, but a copy constructor. An operation like addition creates a new value, after all, so it has to be created somewhere. This is more efficient than it seems, since the compiler is free to do Return Value Optimization, which means it can construct the value directly where it will next be used.
The result is declared as a local variable, and hence goes away with the function call - except if RVO (see above) is used, in which case it was never actually created in the function, but in the caller.
Not really; this method is much more efficient than it looks at first.
Under the circumstances, I'd probably consider something like:
MyClass MyClass::operator+(MyClass other) {
other += *this;
return other;
}
Dave Abrahams wrote an article a while back explaining how this works and why this kind of code is usually quite efficient even though it initially seems like it shouldn't be.
Edit (thank you MSalters): Yes, this does assume/depend upon the commutative property holding for MyClass. If a+b != b+a, then the original code is what you want (most of the same reasoning applies).
it calls the assignment operator(which copies the class) in the first line
No, this is copy-initialization (through constructor).
then the copy constructor in the return (which also copies the class
Compilers can (and typically do) elide this copy using NRVO.
When i write a=b+c, the b+c part creates a new copy of the class, then the 'a=' part copies the copy to himself. who deletes the copy that b+c created
The compiler, as any other temporary value. They are deleted at the end of full-expression (in this case, it means at or after ; at the end of line.
Is there a better way to implement + operator without coping the class twice, and also without any memory issues ?
Not really. It's not that inefficient.
This appears to be the correct way to implement operator+. A few points:
MyClass result = *this does not use the assignment operator, it should be calling the copy constructor, as if it were written MyClass result(*this).
The returned value when used in a = b + c is called a temporary, and the compiler is responsible for deleting it (which will probably happen at the end of the statement ie. the semicolon, after everything else has been done). You don't have to worry about that, the compiler will always clean up temporaries.
There's no better way, you need the copy. The compiler, however, is allowed to optimise away the temporary copies, so not as many as you think may be made. In C++0x though, you can use move constructors to improve performance by transfering ownership of the content of a temporary rather than copying it in its entirity.
I'll try my best to answer:
Point (1): No, it does not call the assignment operator. Instead it calls a constructor. Since you need to construct the object anyway (since operator+ returns a copy), this does not introduce extra operations.
Point (2): The temporary result is created in stack and hence does not introduce memory problem (it is destroyed when function exits). On return, a temporary is created so that an assignment (or copy constructor) can be used to assign the results to a (in a=b+c;) even after result is destroyed. This temporary is destroyed automatically by the compiler.
Point (3): The above is what the standard prescribes. Remember that compiler implementors are allowed to optimize the implementation as long as the effect is the same as what the standard prescribed. I believe, compilers in reality optimize away many of the copying that occurs here. Using the idiom above is readable and is not actually inefficient.
P.S. I sometime prefer to implement operator+ as a non-member to leverage implicit conversion for both sides of the operators (only if it makes sense).
There are no memory issues (provided that the assignment operator, and copy constructor are well written). Simply because all the memory for these objects is taken on the stack and managed by the compiler. Furthermore, compilers do optimize this and perform all the operations directly on the final a instead of copying twice.
This is the proper way of implementing the operator+ in C++. Most of the copies you are so afraid of will get elided by the compiler and will be subject to move semantics in C++0x.
The class is a temporary and will be deleted. If you bind the temporary to a const& the life time of the temporary will be extended to the life time of the const reference.
May implementing it as a freefunction is a little more obvious. The first parameter in MyClass::operator+ is an implicit this and the compiler will rewrite the function to operator+(const MyClass&, const MyClass&) anyway.
As far as I remember, Stroustrup's 'The C++ Programming Language' recommends to implement operators as member functions only when internal representation is affected by operation and as external functions when not. operator+ does not need to access internal representation if implemented based on operator+=, which does.
So you would have:
class MyClass
{
public:
MyClass& operator+=(const MyClass &other)
{
// Implementation
return *this;
}
};
MyClass operator+(const MyClass &op1, const MyClass &op2)
{
MyClass r = op1;
return r += op2;
}