Purpose of assignment operator overloading in C++ - c++

I'm trying to understand the purpose of overloading some operators in C++.
Conceptually, an assignment statement can be easily implemented via:
Destruction of the old object followed by copy construction of the new object
Copy construction of the new object, followed by a swap with the old object, followed by destruction of the old object
In fact, often, the copy-and-swap implementation is the implementation of assignment in real code.
Why, then, does C++ allow the programmer to overload the assignment operator, instead of just performing the above?
Was it intended to allow a scenario in which assignment is faster than destruction + construction?
If so, when does that happen? And if not, then what use case was it intended to support?

1) Reference counting
Suppose you have a resource that is ref-counted and it is wrapped in objects.
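MyObject& operator=(const MyObject& v) {
    this->resource = refCount(v.resource);  // v is a reference, so use '.', not '->'
    return *this;
}
// Here only the internal resource handle is copied, not the whole object.
// Note that "MyObject A = B;" would call the copy constructor; this is an assignment:
A = B;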
2) Or you just want to copy the fields without fancy semantics.
MyObject& operator=(const MyObject& v) {
    this->someField = v.someField;
    return *this;
}
// So this statement should generate just one function call and not a fancy
// collection of temporaries that require construction and destruction.
A = B;
In both cases the assignment can run faster than destroying and reconstructing the object, because it is a single function call that touches only the fields that need to change.
3) Also what about types...
Use the operator to handle assigning other types to your type.
MyObject& operator=(const MyObject& v) {
    this->someField = v.someField;
    return *this;
}
MyObject& operator=(int x) {
    this->value = x;
    return *this;
}
MyObject& operator=(float y) {
    this->floatValue = y;
    return *this;
}

first, note that "Destruction of the old object followed by copy construction of the new object" is not exception safe.
but re "Copy construction of the new object, followed by a swap with the old object, followed by destruction of the old object", that's the swap idiom for implementing an assignment operator, and it's exception safe if done correctly.
in some cases a custom assignment operator can be faster than the swap idiom. for example, direct arrays of POD type can't really be swapped except by way of lower level assignments. so there for the swap idiom you can expect an overhead proportional to the array size.
however, historically there wasn't much focus on swapping and exception safety.
bjarne wanted exceptions originally (if i recall correctly), but they didn't get into the language until 1989 or thereabouts. so the original c++ way of programming was more focused on assignments. to the degree that a failing constructor signalled its failure by assigning 0 to this… i think, that in those days your question would not have made sense. it was just assignments all over.
typewise, some objects have identity, and others have value. it makes sense to assign to value objects, but for identity objects one typically wants to limit the ways that the object can be modified. while this doesn't require the ability to customize copy assignment (only to make it unavailable), with that ability one doesn't need any other language support.
and i think likewise for any other specific reasons one can think of: probably no such reason really requires the general ability, but the general ability is sufficient to cover it all, so it lowers the overall language complexity.
a good source to get more definitive answer than my hunches, recollections and gut feelings, is bjarne's "the design and evolution of c++" book.
probably the question has a definitive answer there.

Destruction of the old object, followed by copy construction of
the new, will not usually work. And the swap idiom is
guaranteed not to work unless the class provides a special swap
function—std::swap uses assignment in its unspecialized
implementation, and using it directly in the assignment operator
will lead to endless recursion.
And of course, the user may want to do something special, e.g.
make the assignment operator private.
And finally, what is almost certainly an overruling reason: the
default assignment operator has to be compatible with C.

Actually, after seeing juanchopanza's answer (which was deleted), I think I ended up figuring it out myself.
Copy-assignment operators allow classes like basic_string to avoid allocating resources unnecessarily when they can re-use them (in this case, memory).
So when you assign to a basic_string, an overloaded copy assignment operator would avoid allocating memory, and would just copy the string data directly to the existing buffer.
If the object had to be destroyed and constructed again, the buffer would have to be reallocated, which would be potentially much more costly for a small string.
(Note that vector could benefit from this too, but only if it knew that the elements' copy constructors would never throw exceptions. Otherwise it would need to maintain its exception safety and actually perform a copy-and-swap.)
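A minimal sketch of that buffer reuse, assuming a hypothetical Buffer class (this is not how basic_string is actually implemented; the names Buffer, size_, capacity_ and data_ are made up for illustration). The assignment operator only allocates when the existing buffer is too small:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical string-like class, for illustration only.
class Buffer {
public:
    explicit Buffer(const char* s)
        : size_(std::strlen(s)), capacity_(size_), data_(new char[capacity_ + 1]) {
        std::memcpy(data_, s, size_ + 1);
    }
    Buffer(const Buffer& other)
        : size_(other.size_), capacity_(other.size_), data_(new char[capacity_ + 1]) {
        std::memcpy(data_, other.data_, size_ + 1);
    }
    Buffer& operator=(const Buffer& other) {
        if (this == &other) return *this;
        if (other.size_ > capacity_) {
            // Allocate first, so a failed allocation leaves *this untouched.
            char* fresh = new char[other.size_ + 1];
            delete[] data_;
            data_ = fresh;
            capacity_ = other.size_;
        }
        // Copying chars cannot throw, so overwriting in place is safe here.
        std::memcpy(data_, other.data_, other.size_ + 1);
        size_ = other.size_;
        return *this;
    }
    ~Buffer() { delete[] data_; }

    const char* c_str() const { return data_; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    std::size_t capacity_;
    char* data_;
};
```

Assigning a shorter Buffer to a longer one keeps the original allocation; a destroy-then-construct scheme would have to free and allocate again.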

It allows you to use assignment of other types as well.
You could have a class Person with an assignment operator that assigns an ID.
But besides that, you don't always want to copy all the members as they are.
The default assignment only does a shallow copy.
For example, if the class contains pointers or locks, you don't always want to copy them from the other object.
Usually when you have pointers you want to use a deep copy, and maybe create a copy of the object that the pointers are pointed to.
And if you have locks, you want them to be specific to the object, and you don't want to copy their state from the other object.
It is actually a common practice to provide your own copy constructor and assignment operator if your class holds pointers as members.

I have often used it like a converting constructor, but for objects that already exist, i.e. assigning a value of a member variable's type (and so on) directly to an object.

Related

Why not a consolidated Copy constructor and Assignment operator available in C++?

I understand the scenarios in which the respective functions (copy constructor and assignment operator) are called. Both functions do essentially the same thing: they allocate memory for dynamic data members and copy the data from the passed-in object, so that the two objects hold identical data. In that case, why doesn't C++ provide a single consolidated function that is called in both scenarios, instead of complicating things with two variants?
They are not the same, and it would be a pain in the neck if someone forced them to be.
Copy construction is a way of creating an object. Amongst other things, base member initialisers can be used. In multi-threaded code, you don't need to worry so much about mutual exclusion units in a constructor since you cannot create the same object simultaneously!
An assignment operator does a rather different thing. It operates on an object that already exists and should return a reference to self. An implementation can do subtly different things here cf. copy construction. For example, a string class might not release resources if the new assigned string is smaller.
In simple cases they may well do the same thing and the return value of an assignment discarded. But in such cases you can rely on the ones that the compiler generates automatically.
They are so not the same. They might be the same in special cases, but generally, no.
when you have something like this:
std::vector myVec = myOtherVec;
It looks like assignment, but actually the copy constructor is being called.
The copy constructor starts an object from nothing.
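A small sketch that makes the distinction observable, assuming a hypothetical Tracer type that records which special member function ran last:

```cpp
#include <string>

// Hypothetical tracer type, for illustration only.
struct Tracer {
    std::string last = "default constructor";
    Tracer() = default;
    Tracer(const Tracer&) : last("copy constructor") {}
    Tracer& operator=(const Tracer&) {
        last = "copy assignment";
        return *this;
    }
};

std::string how_initialized() {
    Tracer source;
    Tracer target = source;  // declaration with '=': copy construction
    return target.last;
}

std::string how_assigned() {
    Tracer source;
    Tracer target;           // target already exists at this point
    target = source;         // copy assignment
    return target.last;
}
```

Here how_initialized() reports the copy constructor even though the statement uses '=', while how_assigned() reports the assignment operator.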
This falls back to the basic question of what's the difference between malloc (the old C way to reserve memory) and new. The difference is: new calls the constructor of your object, which is very important in C++; otherwise we'd be talking about garbage memory that stays uninitialized unless it is initialized explicitly.
For example, in the internal implementation of std::vector, there is a size variable that tracks the number of elements actively acknowledged by the user with push_back() or resize (we're not talking about reserve).
Now imagine how it's implemented:
template <typename T>
class vector
{
    int size;
    T* theArray;
    void reserveMyMemory(); // ignoring allocators for simplicity
};
What's the difference between the copy constructor and assignment operator?
Assignment operator: Just copies size and the array content.
Copy constructor: Must reserve memory and initialize variables, then copy.
Now imagine that reserving memory requires checking the size and whether theArray is nullptr. What happens if the copy constructor internally uses the assignment operator? A catastrophe, because the values are not initialized yet. So you need a constructor to start with.
In this case, the copy constructor is more general, because it should initialize variables, then copy the elements it has to copy. Of course, this whole example is just a demonstration. Don't take it literally for std::vector, the STL doesn't work that way.

Can we say bye to copy constructors?

Copy constructors were traditionally ubiquitous in C++ programs. However, I doubt whether there has been a good reason for that since C++11.
Even when the program logic didn't need copying objects, copy constructors (usu. default) were often included for the sole purpose of object reallocation. Without a copy constructor, you couldn't store objects in a std::vector or even return an object from a function.
However, since C++11, move constructors have been responsible for object reallocation.
Another use case for copy constructors was, simply, making clones of objects. However, I'm quite convinced that a .copy() or .clone() method is better suited for that role than a copy constructor because...
Copying objects isn't really commonplace. Certainly it's sometimes necessary for an object's interface to contain a "make a duplicate of yourself" method, but only sometimes. And when it is the case, explicit is better than implicit.
Sometimes an object could expose several different .copy()-like methods, because in different contexts the copy might need to be created differently (e.g. shallower or deeper).
In some contexts, we'd want the .copy() methods to do non-trivial things related to program logic (increment some counter, or perhaps generate a new unique name for the copy). I wouldn't accept any code that has non-obvious logic in a copy constructor.
Last but not least, a .copy() method can be virtual if needed, which makes it possible to solve the problem of slicing.
The only cases where I'd actually want to use a copy constructor are:
RAII handles of copiable resources (quite obviously)
Structures that are intended to be used like built-in types, like math vectors or matrices -
simply because they are copied often and vec3 b = a.copy() is too verbose.
(Side note: I've considered the fact that a copy constructor is needed for CAS, but CAS is needed for operator=(const T&), which I consider redundant based on the exact same reasoning;
.copy() + operator=(T&&) = default would be preferred if you really need this.)
For me, that's quite enough incentive to use T(const T&) = delete everywhere by default and provide a .copy() method when needed. (Perhaps also a private T(const T&) = default just to be able to write copy() or virtual copy() without boilerplate.)
Q: Is the above reasoning correct or am I missing any good reasons why logic objects actually need or somehow benefit from copy constructors?
Specifically, am I correct in that move constructors took over the responsibility of object reallocation in C++11 completely? I'm using "reallocation" informally for all the situations when an object needs to be moved someplace else in the memory without altering its state.
The problem is what is the word "object" referring to.
If objects are the resources that variables refer to (as in Java, or in C++ through pointers, using classical OOP paradigms), every "copy between variables" is a "sharing", and if single ownership is imposed, "sharing" becomes "moving".
If objects are the variables themselves, then since each variable has to have its own history, you cannot "move" if you cannot or do not want to impose the destruction of one value in favor of another.
Consider, for example, std::string:
std::string a="Aa";
std::string b=a;
...
b = "Bb";
Do you expect the value of a to change, or the code not to compile? If not, then copy is needed.
Now consider this:
std::string a="Aa";
std::string b=std::move(a);
...
b = "Bb";
Now a is left in a valid but unspecified state (in practice usually empty), since its value (better, the dynamic memory that contains it) has been "moved" to b. The value of b is then changed, and the old "Aa" discarded.
In essence, move works only if explicitly called or if the right argument is "temporary", like in
a = b+c;
where the resource held by the return value of operator+ is clearly not needed after the assignment, hence moving it to a, rather than copying it into another place held by a and then deleting it, is more efficient.
Move and copy are two different things. Move is not "THE replacement for copy". It is a more efficient way to avoid a copy, but only in those cases where an object is not required to generate a clone of itself.
Short answer
Is the above reasoning correct or am I missing any good reasons why logic objects actually need or somehow benefit from copy constructors?
Automatically generated copy constructors are a great benefit in separating resource management from program logic; classes implementing logic do not need to worry about allocating, freeing or copying resources at all.
In my opinion, any replacement would need to do the same, and doing that for named functions feels a bit weird.
Long answer
When considering copy semantics, it's useful to divide types into four categories:
Primitive types, with semantics defined by the language;
Resource management (or RAII) types, with special requirements;
Aggregate types, which simply copy each member;
Polymorphic types.
Primitive types are what they are, so they are beyond the scope of the question; I'm assuming that a radical change to the language, breaking decades of legacy code, won't happen. Polymorphic types can't be copied (while maintaining the dynamic type) without user-defined virtual functions or RTTI shenanigans, so they are also beyond the scope of the question.
So the proposal is: mandate that RAII and aggregate types implement a named function, rather than a copy constructor, if they should be copied.
This makes little difference to RAII types; they just need to declare a differently-named copy function, and users just need to be slightly more verbose.
However, in the current world, aggregate types do not need to declare an explicit copy constructor at all; one will be generated automatically to copy all the members, or deleted if any are uncopyable. This ensures that, as long as all the member types are correctly copyable, so is the aggregate.
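This can be checked with type traits. A minimal sketch, assuming two hypothetical aggregates, Settings and Session, neither of which declares any copy operation itself:

```cpp
#include <memory>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical aggregate whose members are all copyable.
struct Settings {
    std::string name;
    std::vector<int> values;
};

// Hypothetical aggregate with one move-only member.
struct Session {
    std::string name;
    std::unique_ptr<int> handle;
};

static_assert(std::is_copy_constructible<Settings>::value,
              "copy constructor is generated from the members");
static_assert(!std::is_copy_constructible<Session>::value,
              "copy constructor is deleted because unique_ptr is not copyable");
static_assert(std::is_move_constructible<Session>::value,
              "moving is still available");
```

Adding a member to Settings later needs no change to any copy code; the generated constructor picks it up automatically.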
In your world, there are two possibilities:
Either the language knows about your copy-function, and can automatically generate one (perhaps only if explicitly requested, i.e. T copy() = default;, since you want explicitness). In my opinion, automatically generating named functions based on the same named function in other types feels more like magic than the current scheme of generating "language elements" (constructors and operator overloads), but perhaps that's just my prejudice speaking.
Or it's left to the user to correctly implement copying semantics for aggregates. This is error-prone (since you could add a member and forget to update the function), and breaks the current clean separation between resource management and program logic.
And to address the points you make in favour:
Copying (non-polymorphic) objects is commonplace, although as you say it's less common now that they can be moved when possible. It's just your opinion that "explicit is better" or that T a(b); is less explicit than T a(b.copy());
Agreed, if an object doesn't have clearly defined copy semantics, then it should have named functions to cover whatever options it offers. I don't see how that affects how normal objects should be copied.
I've no idea why you think that a copy constructor shouldn't be allowed to do things that a named function could, as long as they are part of the defined copy semantics. You argue that copy constructors shouldn't be used because of artificial restrictions that you place on them yourself.
Copying polymorphic objects is an entirely different kettle of fish. Forcing all types to use named functions just because polymorphic ones must won't give the consistency you seem to be arguing for, since the return types would have to be different. Polymorphic copies will need to be dynamically allocated and returned by pointer; non-polymorphic copies should be returned by value. In my opinion, there is little value in making these different operations look similar without being interchangeable.
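To illustrate how different the two kinds of copy look, here is a sketch assuming a hypothetical Shape hierarchy and a hypothetical Vec3 value type. The polymorphic copy goes through a virtual function and returns a pointer; the value type is simply copied:

```cpp
#include <memory>
#include <string>

// Hypothetical polymorphic base: copies must preserve the dynamic type.
struct Shape {
    virtual ~Shape() = default;
    virtual std::unique_ptr<Shape> clone() const = 0;
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    double radius = 1.0;
    std::unique_ptr<Shape> clone() const override {
        // Uses Circle's (generated) copy constructor internally.
        return std::make_unique<Circle>(*this);
    }
    std::string name() const override { return "circle"; }
};

// Hypothetical non-polymorphic value type: copied by value.
struct Vec3 {
    double x, y, z;
};
```

Note that clone() is itself implemented with the copy constructor, so the named function sits on top of copy construction rather than replacing it.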
One case where copy constructors come in useful is when implementing the strong exception guarantees.
To illustrate the point, let's consider the resize function of std::vector. The function might be implemented roughly as follows:
void std::vector::resize(std::size_t n)
{
if (n > capacity())
{
T *newData = new T [n];
for (std::size_t i = 0; i < capacity(); i++)
newData[i] = std::move(m_data[i]);
delete[] m_data;
m_data = newData;
}
else
{ /* ... */ }
}
If the resize function were to have a strong exception guarantee we need to ensure that, if an exception is thrown, the state of the std::vector before the resize() call is preserved.
If T has no move constructor, then we will default to the copy constructor. In this case, if the copy constructor throws an exception, we can still provide strong exception guarantee: we simply delete the newData array and no harm to the std::vector has been done.
However, if we were using the move constructor of T and it threw an exception, then we have a bunch of Ts that were moved into the newData array. Rolling this operation back isn't straight-forward: if we try to move them back into the m_data array the move constructor of T may throw an exception again!
To resolve this issue we have the std::move_if_noexcept function. This function will use the move constructor of T if it is marked as noexcept, otherwise the copy constructor will be used. This allows us to implement std::vector::resize in such a way as to provide a strong exception guarantee.
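A small sketch of how std::move_if_noexcept chooses between the two constructors, assuming two hypothetical element types that count how they were constructed:

```cpp
#include <utility>

// Hypothetical element type whose move constructor is NOT declared noexcept.
struct MayThrowOnMove {
    int copies = 0;
    int moves = 0;
    MayThrowOnMove() = default;
    MayThrowOnMove(const MayThrowOnMove& o) : copies(o.copies + 1), moves(o.moves) {}
    MayThrowOnMove(MayThrowOnMove&& o) : copies(o.copies), moves(o.moves + 1) {}
};

// Hypothetical element type whose move constructor is declared noexcept.
struct NeverThrowsOnMove {
    int copies = 0;
    int moves = 0;
    NeverThrowsOnMove() = default;
    NeverThrowsOnMove(const NeverThrowsOnMove& o) : copies(o.copies + 1), moves(o.moves) {}
    NeverThrowsOnMove(NeverThrowsOnMove&& o) noexcept : copies(o.copies), moves(o.moves + 1) {}
};

// Relocates one element the way a reallocating container might, and reports
// whether the copy constructor was the one selected.
template <typename T>
bool relocation_copied() {
    T source;
    T target(std::move_if_noexcept(source));
    return target.copies == 1 && target.moves == 0;
}
```

With MayThrowOnMove the source is left intact because it was copied, which is what makes rolling back after an exception possible.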
For completeness, I should mention that C++11 std::vector::resize does not provide a strong exception guarantee in all cases. According to www.cplusplus.com we have the following guarantees:
If n is less than or equal to the size of the container, the function never throws exceptions (no-throw guarantee).
If n is greater and a reallocation happens, there are no changes in the container in case of exception (strong guarantee) if the type of the elements is either copyable or no-throw moveable.
Otherwise, if an exception is thrown, the container is left with a valid state (basic guarantee).
Here's the thing. Moving is the new default- the new minimum requirement. But copying is still often a useful and convenient operation.
Nobody should bend over backwards to offer a copy constructor anymore. But it is still useful for your users to have copyability if you can offer it simply.
I would not ditch copy constructors any time soon, but I admit that for my own types, I only add them when it becomes clear I need them- not immediately. So far this is very, very few types.

Implementing copy-constructor using assignment operator [duplicate]

Are there some drawbacks of such implementation of copy-constructor?
Foo::Foo(const Foo& i_foo)
{
*this = i_foo;
}
As I remember, some book recommended calling the copy constructor from the assignment operator and using the well-known swap trick, but I don't remember why...
Yes, that's a bad idea. All member variables of user-defined types will be initialized first, and then immediately overwritten.
That swap trick is this:
Foo& operator=(Foo rhs) // note the copying
{
rhs.swap(*this); //swap our internals with the copy of rhs
return *this;
} // rhs, now containing our old internals, will be destroyed
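For context, a fuller sketch of the same trick, assuming a hypothetical IntArray class that owns a heap array. The member swap exchanges the raw members directly, so it does not itself rely on assignment of IntArray:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical owning array class, for illustration only.
class IntArray {
public:
    explicit IntArray(std::size_t n, int value = 0) : size_(n), data_(new int[n]) {
        std::fill(data_, data_ + n, value);
    }
    IntArray(const IntArray& other) : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }
    ~IntArray() { delete[] data_; }

    // Swaps the members one by one; this cannot throw.
    void swap(IntArray& other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    // The parameter is taken by value, so the copy (which may throw) is made
    // before *this is modified at all.
    IntArray& operator=(IntArray rhs) {
        rhs.swap(*this);
        return *this;
    }

    std::size_t size() const { return size_; }
    int operator[](std::size_t i) const { return data_[i]; }

private:
    std::size_t size_;
    int* data_;
};
```

Self-assignment needs no special check here: the copy is made first, then swapped in.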
There are both potential drawbacks and potential gains from calling operator=() in your constructor.
Drawbacks:
Your constructor will initialize all the member variables whether you specify values or not, and then operator= will assign to them again. This adds run-time overhead. You will need to make smart decisions about when this will create unacceptable behavior in your code.
Your constructor and operator= become tightly coupled. Everything you need to do when instantiating your object will also be done when copying your object. Again, you have to be smart about determining if this is a problem.
Gains:
The codebase becomes less complex and easier to maintain. Once again, be smart about evaluating this gain. If you have a struct with 2 string members, it's probably not worth it. On the other hand if you have a class with 50 data members (you probably shouldn't but that's a story for another post) or data members that have a complex relationship to one another, there could be a lot of benefit by having just one init function instead of two or more.
You're looking for Scott Meyers' Effective C++, Item 12: "Copy all parts of an object", whose summary states:
Copying functions should be sure to copy all of an object's data members and all of its base class parts.
Don't try to implement one of the copying functions in terms of the other. Instead, put common functionality in a third function that both
call.

Why would the assignment operator ever do something different than its matching constructor?

I was reading some boost code, and came across this:
inline sparse_vector &assign_temporary(sparse_vector &v) {
swap(v);
return *this;
}
template<class AE>
inline sparse_vector &operator=(const sparse_vector<AE> &ae) {
self_type temporary(ae);
return assign_temporary(temporary);
}
It seems to be mapping all of the constructors to assignment operators. Great. But why did C++ ever opt to make them do different things? All I can think of is scoped_ptr?
why did C++ ever opt to make them do different things?
Because assignment works on a fully constructed object. In resource managing classes, this means that every member pointer already points to a resource. Contrast this to a constructor, where the members don't have any meaning prior to executing it.
By the way, in the very early days of C++, T a(b); was actually defined as T a; a = b;, but this proved to be inefficient, hence the introduction of the copy constructor.
Notice the name "assign_temporary" indicates that the assignment is from an object that is temporary and therefore can be destroyed in the process of assignment. The assignment operator takes some regular object that you might want to use later, so it is not an option to destroy it during the assignment. In this Boost code, "assign_temporary" is synonymous with the rvalue reference assignment operator, while the assignment operator that you have shown above is the standard const (lvalue) reference assignment operator, so you would, therefore, expect this kind of mismatch between the two.
I agree, though, the assignment operator is typically implemented using the copy and swap trick (create a copy with the copy constructor, then swap with the copy). However, the standard already defines the automatic implementation of the assignment operator by the compiler in the absence of an explicit definition, and so changing the default implementation would potentially break existing code.
For the very reason that some classes sometimes shouldn't be allowed to be assigned or even copied. For instance, using RAII you can build a mutex class which opens a lock and then closes the lock upon expiration. If you were allowed to copy or assign such a class you could conceivably pass it outside of the function scope. This could cause bad things in the hands of bad people.
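A minimal sketch of such a class, assuming a hypothetical ScopedLock wrapper around std::mutex (the standard library already offers std::lock_guard for this purpose). Deleting both copy operations makes copying a compile-time error:

```cpp
#include <mutex>
#include <type_traits>

// Hypothetical RAII lock: acquires in the constructor, releases in the destructor.
class ScopedLock {
public:
    explicit ScopedLock(std::mutex& m) : mutex_(m) { mutex_.lock(); }
    ~ScopedLock() { mutex_.unlock(); }

    // Copying would lead to a double unlock, so both operations are disabled.
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    std::mutex& mutex_;
};

static_assert(!std::is_copy_constructible<ScopedLock>::value, "not copyable");
static_assert(!std::is_copy_assignable<ScopedLock>::value, "not assignable");

// Increments the counter while holding the mutex; the lock is released on return.
int guarded_increment(std::mutex& m, int& counter) {
    ScopedLock lock(m);
    return ++counter;
}
```

Before C++11 the same effect was achieved by declaring the two copy operations private and leaving them undefined.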

Would this constructor be acceptable practice?

Let's assume I have a C++ class that has a properly implemented copy constructor and an overloaded = operator. By properly implemented I mean they work and perform a deep copy:
Class1::Class1(const Class1 &class1)
{
// Perform copy
}
Class1& Class1::operator=(const Class1 *class1)
{
// perform copy
return *this;
}
Now lets say I have this constructor as well:
Class1::Class1(Class1 *class1)
{
*this = *class1;
}
My question is: would the above constructor be acceptable practice? This is code that I've inherited and am maintaining.
I would say, "no", for the following reasons:
A traditional copy constructor accepts its argument as a const reference, not as a pointer.
Even if you were to accept a pointer as a parameter, it really ought to be const Class1* to signify that the argument will not be modified.
This copy constructor is inefficient (or won't work!) because all members of Class1 are default-initialized, and then copied using operator=
operator= has the same problem; it should accept a reference, not a pointer.
The traditional way to "re-use" the copy constructor in operator= is the copy-and-swap idiom. I would suggest implementing the class that way.
Personally, I don't think it's good practice.
For the constructor, it's hard to think of a place where an implicit conversion from a pointer to an object to the object itself would be useful.
There's no reason for the pointer to be to non-const, and if you have available pointer to the class it is not hard to dereference it, and so clearly state your intention of wanting to copy the object using the copy constructor.
Similarly, for the non-standard assignment operator why allow assignment from a pointer when correctly dereferencing at the call site is clearer and more idiomatic?
I believe a somewhat more important issue than what has been discussed so far is that your non-standard assignment operator does not stop the compiler from generating the standard one. Since you've decided that you need to create an assignment operator (good bet since you made the copy constructor), the default is almost certainly not sufficient. Thus a user of this class could fall prey to this problem during what would seem very basic and standard use of an object to almost anyone.
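A sketch of that pitfall, assuming a hypothetical Tracked class with only the pointer-taking operator=. The compiler still generates operator=(const Tracked&), and ordinary assignment silently uses it instead of the hand-written one:

```cpp
#include <type_traits>

// Hypothetical class with a non-standard, pointer-taking assignment operator.
struct Tracked {
    int value = 0;
    int custom_calls = 0;

    // Not a copy assignment operator: its parameter is a pointer.
    Tracked& operator=(const Tracked* other) {
        value = other->value;
        ++custom_calls;
        return *this;
    }
};

// The implicitly generated operator=(const Tracked&) is still present.
static_assert(std::is_copy_assignable<Tracked>::value,
              "compiler-generated copy assignment still exists");
```

Writing a = b; for two Tracked objects therefore bypasses the custom operator entirely; only a = &b; reaches it.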
Objects and pointers to objects are two very different things. Typically, when you're passing objects around, you expect that they're going to be copied (though, ideally functions would take const refs where possible to reduce/eliminate unnecessary copies). When you're passing a pointer around, you don't expect any copying to take place. You're passing around a pointer to a specific object and, depending on the code, it could really matter that you deal with that specific object and not a copy of it.
Assignment operators and constructors that take pointers to the type - especially constructors which can be used for implicit conversion - are really going to muddle things and stand a high chance of creating unintended copies, which not only could be a performance issue, but it could cause bugs.
I can't think of any good reason why you would ever want to make the conversion between a pointer to a type and the type itself implicit - or even explicit. The built-in way to do that is to dereference the object. I suppose that there might be some set of specific circumstances which I can't think of where this sort of thing might be necessary or a good idea, but I really doubt it. Certainly, I would strongly advise against doing it unless you have a specific and good reason for doing so.