Sometimes I want to make classes/structs with const members. I realize that this is a bad idea for multiple reasons, but for the sake of argument let's pretend the only reason is that it makes a well-formed operator = a hassle, to say the least. However, I contrived a fairly simple work-around to it, as demonstrated by this struct:
struct S {
const int i;
S(int i) : i(i) {}
S(const S& other) : i(other.i) {}
S& operator =(const S& other) {
new (this) S(other);
return *this;
}
};
Ignoring destructors and move semantics, is there any really big reason why this shouldn't be done? It seems to me like a more type-safe version of
S& operator =(const S& other) {
const_cast<int&>(i) = other.i;
return *this;
}
So, the summary of the question is this: is there any major reason placement-new should not be used to implement copy assignment to have the same semantics as a copy construction?
I don't believe that placement new is a problem here but the const_cast which produces undefined behavior:
C++ 10.1.7.1-4
Except that any class member declared mutable (10.1.1) can be modified, any attempt to modify a const object during its lifetime (6.6.3) results in undefined behavior.
You'll probably get away with this until compiler starts to optimize things.
The other problem is the use of a placement new on a piece memory occupied by living (non-destroyed) object. But you'll probably get away with this while object in question has a trivial destructor.
is there any really big reason why this shouldn't be done?
You must be absolutely sure that every derived class defines its own assignment operator, even if it is trivial. Because an implicitly defined copy-assignment operator of a derived class will screw everything. It'll call S::operator= which will re-create a wrong type of object in its place.
Such destroy-and-construct assignment operator can't be re-used by any derived class. So, not only you are forcing derived classes to provide an explicit copy operator, but you're forcing them to stick to the same destroy-and-construct idiom in their assignment operator.
You must be absolutely sure that no other thread is accessing the object while it is being destroyed-and-constructed by such assignment operator.
A class may have some data members that must not be affected by the assignment operator. For example, a thread-safe class may have some kind of mutex or critical section member, with some other thread waiting on them right when the current thread is going to destroy-and-construct that mutex...
Performance-wise, it has virtually no advantage over standard copy-and-swap idiom. So what would be the gain in going through all the pain mentioned above?
Related
I frequently need to implement C++ wrappers for "raw" resource handles, like file handles, Win32 OS handles and similar. When doing this, I also need to implement move operators, since the default compiler-generated ones will not clear the moved-from object, yielding double-delete problems.
When implementing the move assignment operator, I prefer to call the destructor explicitly and in-place recreate the object with placement new. That way, I avoid duplication of the destructor logic. In addition, I often implement copy assignment in terms of copy+move (when relevant). This leads to the following code:
/** Canonical move-assignment operator.
Assumes no const or reference members. */
TYPE& operator = (TYPE && other) noexcept {
if (&other == this)
return *this; // self-assign
static_assert(std::is_final<TYPE>::value, "class must be final");
static_assert(noexcept(this->~TYPE()), "dtor must be noexcept");
this->~TYPE();
static_assert(noexcept(TYPE(std::move(other))), "move-ctor must be noexcept");
new(this) TYPE(std::move(other));
return *this;
}
/** Canonical copy-assignment operator. */
TYPE& operator = (const TYPE& other) {
if (&other == this)
return *this; // self-assign
TYPE copy(other); // may throw
static_assert(noexcept(operator = (std::move(copy))), "move-assignment must be noexcept");
operator = (std::move(copy));
return *this;
}
It strikes me as odd, but I have not seen any recommendations online for implementing the move+copy assignment operators in this "canonical" way. Instead, most websites tend to implement the assignment operators in a type-specific way that must be manually kept in sync with the constructors & destructor when maintaining the class.
Are there any arguments (besides performance) against implementing the move & copy assignment operators in this type-independent "canonical" way?
UPDATE 2019-10-08 based on UB comments:
I've read through http://eel.is/c++draft/basic.life#8 that seem to cover the case in question. Extract:
If, after the lifetime of an object has ended ..., a new object is
created at the storage location which the original object occupied, a
pointer that pointed to the original object, a reference that referred
to the original object, ... will
automatically refer to the new object and, ..., can be used to manipulate the new object, if ...
There's some obvious conditions thereafter related to the same type and const/reference members, but they seem to be required for any assignment operator implementation.
Please correct me if I'm wrong, but this seems to me like my "canonical" sample is well behaved and not UB(?)
UPDATE 2019-10-10 based on copy-and-swap comments:
The assignment implementations can be merged into a single method that takes a value argument instead of reference. This also seem to eliminate the need for the static_assert and self-assignment checks. My new proposed implementation then becomes:
/** Canonical copy/move-assignment operator.
Assumes no const or reference members. */
TYPE& operator = (TYPE other) noexcept {
static_assert(!std::has_virtual_destructor<TYPE>::value, "dtor cannot be virtual");
this->~TYPE();
new(this) TYPE(std::move(other));
return *this;
}
There is a strong argument against your "canonical" implementation — it is wrong.
You end the lifetime the original object and create a new object in its place. However, pointers, references, etc. to the original object are not automatically updated to point to the new object — you have to use std::launder. (This sentence is wrong for most classes; see Davis Herring’s comment.) Then, the destructor is automatically called on the original object, triggering undefined behavior.
Reference: (emphasis mine) [class.dtor]/16
Once a destructor is invoked for an object, the object no longer
exists; the behavior is undefined if the destructor is invoked for an
object whose lifetime has ended. [ Example: If the destructor
for an automatic object is explicitly invoked, and the block is
subsequently left in a manner that would ordinarily invoke implicit
destruction of the object, the behavior is undefined.
— end example ]
[basic.life]/1
[...] The lifetime of an object o of type T ends when:
if T is a class type with a non-trivial destructor ([class.dtor]), the destructor call starts, or
the storage which the object occupies is released, or is reused by an object that is not nested within o ([intro.object]).
(Depending on whether the destructor of your class is trivial, the line of code that ends the lifetime of the object is different. If the destructor is non-trivial, explicitly calling the destructor ends the lifetime of the object; otherwise, the placement new reuses the storage of the current object, ending its lifetime. In either case, the lifetime of the object has been ended when the assignment operator returns.)
You may think that this is yet another "any sane implementation will do the right thing" kind of undefined behavior, but actually many compiler optimization involve caching values, which take advantage of this specification. Therefore, your code can break at any time when the code is compiled under a different optimization level, by a different compiler, with a different version of the same compiler, or when the compiler just had a terrible day and is in a bad mood.
The actual "canonical" way is to use the copy-and-swap idiom:
// copy constructor is implemented normally
C::C(const C& other)
: // ...
{
// ...
}
// move constructor = default construct + swap
C::C(C&& other) noexcept
: C{}
{
swap(*this, other);
}
// assignment operator = (copy +) swap
C& C::operator=(C other) noexcept // C is taken by value to handle both copy and move
{
swap(*this, other);
return *this;
}
Note that, here, you need to provide a custom swap function instead of using std::swap, as mentioned by Howard Hinnant:
friend void swap(C& lhs, C& rhs) noexcept
{
// swap the members
}
If used properly, copy-and-swap incurs no overhead if the relevant functions are properly inlined (which should be pretty trivial). This idiom is very commonly used, and an average C++ programmer should have little trouble understanding it. Instead of being afraid that it will cause confusion, just take 2 minutes to learn it and then use it.
This time, we are swapping the values of the objects, and the lifetime of the object is not affected. The object is still the original object, just with a different value, not a brand new object. Think of it this way: you want to stop a kid from bullying others. Swapping values is like civilly educating them, whereas "destroying + constructing" is like killing them making them temporarily dead and giving them a brand new brain (possibly with the help of magic). The latter method can have some undesirable side effects, to say the least.
Like any other idiom, use it when appropriate — don't just use it for the sake of using it.
I believe the example in http://eel.is/c++draft/basic.life#8 clearly proves that assignment operators can be implemented through inplace "destroy + construct" assuming certain limitations related to non-const, non-overlapping objects and more.
Background:
I have a complicated class with many variables. I have a sound and tested copy constructor:
Applepie::Applepie( const Applepie ©) :
m_crust(copy.m_crust),
m_filling(copy.m_filling)
{
}
Some of the member variable copy constructors called in the intializer list perform allocation.
Question:
I need to create operator=. Rather than duplicating the existing constuctor with assignment instead of initialization list, and freeing memory that's being replaced, and etc etc etc, can I simply do the following:
Applepie& Applepie::operator=( const Applepie ©)
{
if( this != ©)
{
this->~Applepie(); // release own object
new(this) Applepie(copy); // placement new copy constructor
}
return *this;
}
In other words, is destroy self followed by a placement new copy constructor semantically identical to operator= ?
This seems to have the potential to dramatically reduce repeat code and confirming that each variable is initialized properly, at the cost of potential slight loss of efficiency during assignment. Am I missing something more obscure?
Rationale:
My actual class has about 30 varaibles. I am concerned about the fact that both my copy constructor and my assignment operator have to copy all thirty, and that the code might diverge, causing the two operations to do things differently.
As Herb Sutter in "Exceptional C++" states, it is not exception safe. That means, if anything is going wrong during new or construction of the new object, the left hand operand of the assignment is in bad (undefined) state, calling for more trouble. I would strongly recommend using the copy & swap idiom.
Applepie& Applepie::operator=(Applepie copy)
{
swap(m_crust, copy.m_crust);
swap(m_filling, copy.m_filling);
return *this;
}
When your object uses the Pimpl idiom (pointer to implementation) also, the swap is done by changing only two pointers.
In addition to Rene's answer, there is also the problem of what would happen if ApplePie was a base class of the actual object: ApplePie would be replacing the object with an object of the wrong type!
I read about this from "Effective c++" ,this is Col.10.
It say it's a good way to have assignment operators return a reference to *this.
I wrote a code snippet to test this idea. I overridden the assignment operator here.And tested it. Everything is fine.
But when I remove that operator overriding, everything is the same. That means, the chaining assignment still works well. So, what am I missing? Why is that? Need some explanation from you guys, THank you.
#include <iostream>
using namespace std;
class Widget{
public:
Widget& operator=(int rhs)
{
return *this;
}
int value;
};
int main()
{
Widget mywidget;
mywidget.value = 1;
Widget mywidget2;
mywidget2.value = 2;
Widget mywidget3 ;
mywidget3.value = 3;
mywidget = mywidget2 = mywidget3;
cout << mywidget.value<<endl;
cout << mywidget2.value<<endl;
cout << mywidget3.value<<endl;
}
If you remove completely the operator= method, a default operator= will be created by the compiler, which implements shallow copy1 and returns a reference to *this.
Incidentally, when you write
mywidget = mywidget2 = mywidget3;
you're actually calling this default operator=, since your overloaded operator is designed to work with ints on the right side.
The chained assignment will stop working, instead, if you return, for example, a value, a const reference (=>you'll get compilation errors) or a reference to something different from *this (counterintuitive stuff will start to happen).
Partially related: the copy and swap idiom, i.e. the perfect way to write an assignment operator. Strongly advised read if you need to write an operator=
The default operator= will perform as if there were an assignment between each member of the left hand operand and each member of the right hand one. This means that for primitive types it will be a "brutal" bitwise copy, which in 90% of cases isn't ok for pointers to owned resources.
The question touches two different concepts, whether you should define operator= and whether in doing so you should return a reference to the object.
You should consider the rule of the three: if you define one of copy constructor, assignment operator or destructor you should define the three of them. The rationale around that rule is that if you need to provide a destructor it means that you are managing a resource, and in doing so, chances are that the default copy constructor and assignment operator won't cut it. As an example, if you hold memory through a raw pointer, then you need to release the memory in the destructor. If you don't provide the copy constructor and assignment operator, then the pointer will be copied and two different objects will try to release the memory held by the pointer.
While a pointer is the most common example, this applies to any resource. The exception is classes where you disable copy construction and assignment --but then again you are somehow defining them to be disabled.
On the second part of the question, or whether you should return a reference to the object, you should. The reason, as with all other operator overloads is that it is usually a good advice to mimic what the existing operators for basic types do. This is sometimes given by a quote: when overloading operators, do as ints do.
Widget& operator=(int rhs)
This allows you to assign an int to a Widget - e.g. mywidget = 3;
Make a Widget& operator=(Widget const & rhs) - it'll be called for your mywidget = mywidget2 = mywidget3; line.
You do not need an operator=(Widget const & rhs) though - the default should do fine.
Also, it may be a good idea to add e.g. cout << "operator=(int rhs)\n"; to your custom operator - then you'd see that it didn't get called at all in your code.
If the operator= is properly defined, is it OK to use the following as copy constructor?
MyClass::MyClass(MyClass const &_copy)
{
*this = _copy;
}
If all members of MyClass have a default constructor, yes.
Note that usually it is the other way around:
class MyClass
{
public:
MyClass(MyClass const&); // Implemented
void swap(MyClass&) throw(); // Implemented
MyClass& operator=(MyClass rhs) { rhs.swap(*this); return *this; }
};
We pass by value in operator= so that the copy constructor gets called. Note that everything is exception safe, since swap is guaranteed not to throw (you have to ensure this in your implementation).
EDIT, as requested, about the call-by-value stuff: The operator= could be written as
MyClass& MyClass::operator=(MyClass const& rhs)
{
MyClass tmp(rhs);
tmp.swap(*this);
return *this;
}
C++ students are usually told to pass class instances by reference because the copy constructor gets called if they are passed by value. In our case, we have to copy rhs anyway, so passing by value is fine.
Thus, the operator= (first version, call by value) reads:
Make a copy of rhs (via the copy constructor, automatically called)
Swap its contents with *this
Return *this and let rhs (which contains the old value) be destroyed at method exit.
Now, we have an extra bonus with this call-by-value. If the object being passed to operator= (or any function which gets its arguments by value) is a temporary object, the compiler can (and usually does) make no copy at all. This is called copy elision.
Therefore, if rhs is temporary, no copy is made. We are left with:
Swap this and rhs contents
Destroy rhs
So passing by value is in this case more efficient than passing by reference.
It is more advisable to implement operator= in terms of an exception safe copy constructor. See Example 4. in this from Herb Sutter for an explanation of the technique and why it's a good idea.
http://www.gotw.ca/gotw/059.htm
This implementation implies that the default constructors for all the data members (and base classes) are available and accessible from MyClass, because they will be called first, before making the assignment. Even in this case, having this extra call for the constructors might be expensive (depending on the content of the class).
I would still stick to separate implementation of the copy constructor through initialization list, even if it means writing more code.
Another thing: This implementation might have side effects (e.g. if you have dynamically allocated members).
While the end result is the same, the members are first default initialized, only copied after that.
With 'expensive' members, you better copy-construct with an initializer list.
struct C {
ExpensiveType member;
C( const C& other ): member(other.member) {}
};
};
I would say this is not okay if MyClass allocates memory or is mutable.
yes.
personally, if your class doesn't have pointers though I'd not overload the equal operator or write the copy constructor and let the compiler do it for you; it will implement a shallow copy and you'll know for sure that all member data is copied, whereas if you overload the = op; and then add a data member and then forget to update the overload you'll have a problem.
#Alexandre - I am not sure about passing by value in assignment operator. What is the advantage you will get by calling copy constructor there? Is this going to fasten the assignment operator?
P.S. I don't know how to write comments. Or may be I am not allowed to write comments.
It is technically OK, if you have a working assignment operator (copy operator).
However, you should prefer copy-and-swap because:
Exception safety is easier with copy-swap
Most logical separation of concerns:
The copy-ctor is about allocating the resources it needs (to copy the other stuff).
The swap function is (mostly) only about exchanging internal "handles" and doesn't need to do resource (de)allocation
The destructor is about resource deallocation
Copy-and-swap naturally combines these three function in the assignment/copy operator
Consider a class of which copies need to be made. The vast majority of the data elements in the copy must strictly reflect the original, however there are select few elements whose state is not to be preserved and need to be reinitialized.
Is it bad form to call a default assignment operator from the copy constructor?
The default assignment operator will behave well with Plain Old Data( int,double,char,short) as well user defined classes per their assignment operators. Pointers would need to be treated separately.
One drawback is that this method renders the assignment operator crippled since the extra reinitialization is not performed. It is also not possible to disable the use of the assignment operator thus opening up the option of the user to create a broken class by using the incomplete default assignment operator A obj1,obj2; obj2=obj1; /* Could result is an incorrectly initialized obj2 */ .
It would be good to relax the requirement that to a(orig.a),b(orig.b)... in addition to a(0),b(0) ... must be written. Needing to write all of the initialization twice creates two places for errors and if new variables (say double x,y,z) were to be added to the class, initialization code would need to correctly added in at least 2 places instead of 1.
Is there a better way?
Is there be a better way in C++0x?
class A {
public:
A(): a(0),b(0),c(0),d(0)
A(const A & orig){
*this = orig; /* <----- is this "bad"? */
c = int();
}
public:
int a,b,c,d;
};
A X;
X.a = 123;
X.b = 456;
X.c = 789;
X.d = 987;
A Y(X);
printf("X: %d %d %d %d\n",X.a,X.b,X.c,X.d);
printf("Y: %d %d %d %d\n",Y.a,Y.b,Y.c,Y.d);
Output:
X: 123 456 789 987
Y: 123 456 0 987
Alternative Copy Constructor:
A(const A & orig):a(orig.a),b(orig.b),c(0),d(orig.d){} /* <-- is this "better"? */
As brone points out, you're better off implementing assignment in terms of copy construction. I prefer an alternative idiom to his:
T& T::operator=(T t) {
swap(*this, t);
return *this;
}
It's a bit shorter, and can take advantage of some esoteric language features to improve performance. Like any good piece of C++ code, it also has some subtleties to watch for.
First, the t parameter is intentionally passed by value, so that the copy constructor will be called (most of the time) and we can modify is to our heart's content without affecting the original value. Using const T& would fail to compile, and T& would trigger some surprising behaviour by modifying the assigned-from value.
This technique also requires swap to be specialized for the type in a way that doesn't use the type's assignment operator (as std::swap does), or it will cause an infinite recursion. Be careful of any stray using std::swap or using namespace std, as they will pull std::swap into scope and cause problems if you didn't specialize swap for T. Overload resolution and ADL will ensure the correct version of swap is used if you have defined it.
There are a couple of ways to define swap for a type. The first method uses a swap member function to do the actual work and has a swap specialization that delegates to it, like so:
class T {
public:
// ....
void swap(T&) { ... }
};
void swap(T& a, T& b) { a.swap(b); }
This is pretty common in the standard library; std::vector, for example, has swapping implemented this way. If you have a swap member function you can just call it directly from the assignment operator and avoid any issues with function lookup.
Another way is to declare swap as a friend function and have it do all of the work:
class T {
// ....
friend void swap(T& a, T& b);
};
void swap(T& a, T& b) { ... }
I prefer the second one, as swap() usually isn't an integral part of the class' interface; it seems more appropriate as a free function. It's a matter of taste, however.
Having an optimized swap for a type is a common method of achieving some of the benefits of rvalue references in C++0x, so it's a good idea in general if the class can take advantage of it and you really need the performance.
With your version of the copy constructor the members are first default-constructed and then assigned.
With integral types this doesn't matter, but if you had non-trivial members like std::strings this is unneccessary overhead.
Thus, yes, in general your alternative copy constructor is better, but if you only have integral types as members it doesn't really matter.
Essentially, what you are saying is that you have some members of your class which don't contribute to the identity of the class. As it currently stands you have this expressed by using the assignment operator to copy class members and then resetting those members which shouldn't be copied. This leaves you with an assignment operator that is inconsistent with the copy constructor.
Much better would be to use the copy and swap idiom, and express which members shouldn't be copied in the copy constructor. You still have one place where the "don't copy this member" behaviour is expressed, but now your assignment operator and copy constructor are consistent.
class A
{
public:
A() : a(), b(), c(), d() {}
A(const A& other)
: a(other.a)
, b(other.b)
, c() // c isn't copied!
, d(other.d)
A& operator=(const A& other)
{
A tmp(other); // doesn't copy other.c
swap(tmp);
return *this;
}
void Swap(A& other)
{
using std::swap;
swap(a, other.a);
swap(b, other.b);
swap(c, other.c); // see note
swap(d, other.d);
}
private:
// ...
};
Note: in the swap member function, I have swapped the c member. For the purposes of use in the assignment operator this preserves the behaviour to match that of the copy constructor: it re-initializes the c member. If you leave the swap function public, or provide access to it through a swap free function you should make sure that this behaviour is suitable for other uses of swap.
Personally I think the broken assignment operator is killer. I always say that people should read the documentation and not do anything it tells them not to, but even so it's just too easy to write an assignment without thinking about it, or use a template which requires the type to be assignable. There's a reason for the noncopyable idiom: if operator= isn't going to work, it's just too dangerous to leave it accessible.
If I remember rightly, C++0x will let you do this:
private:
A &operator=(const A &) = default;
Then at least it's only the class itself which can use the broken default assignment operator, and you'd hope that in this restricted context it's easier to be careful.
I would call it bad form, not because you double-assign all your objects, but because in my experience it's often bad form to rely on the default copy constructor / assignment operator for a specific set of functionality. Since these are not in the source anywhere, it's hard to tell that the behavior you want depends on their behavior. For instance, what if someone in a year wants to add a vector of strings to your class? You no longer have the plain old datatypes, and it would be very hard for a maintainer to know that they were breaking things.
I think that, nice as DRY is, creating subtle un-specified requirements is much worse from a maintenance point of view. Even repeating yourself, as bad as that is, is less evil.
I think the better way is not to implement a copy constructor if the behaviour is trivial (in your case it appears to be broken: at least assignment and copying should have similar semantics but your code suggests this won't be so - but then I suppose it is a contrived example). Code that is generated for you cannot be wrong.
If you need to implement those methods, most likely the class could do with a fast swap method and thus be able to reuse the copy constructor to implement the assignment operator.
If you for some reason need to provide a default shallow copy constructor, then C++0X has
X(const X&) = default;
But I don't think there is an idiom for weird semantics. In this case using assignment instead of initialization is cheap (since leaving ints uninitialized doesn't cost anything), so you might just as well do it like this.