Efficiency difference between copy and move constructor - c++

C++11 introduced the concept of rvalue references. I was reading about it and found the following:
class Base
{
public:
    Base();               // Default ctor
    Base(int t);          // Parameterized ctor
    Base(const Base& b);  // Copy ctor
    Base(Base&& b);       // Move ctor
};
void foo(Base b)   // Function 1
{}
void foo(Base& b)  // Function 2
{}
int main()
{
    Base b(10);
    foo(b);       // Line 1 (I know of the ambiguity, but let's ignore it for the purpose of understanding)
    foo(Base());  // Line 2
    foo(2);       // Line 3
}
Now with my limited understanding, my observations are as follows :
Line 1 will simply call the copy constructor as argument is an lvalue.
Line 2 before C++11 would have called copy constructor and all those temporary copy stuff, but with move constructor defined, that would be called here.
Line 3 will again call move constructor as 2 will be implicitly converted to Base type (rvalue).
Please correct and explain if any of above observation is wrong.
Now, here are my questions:
I know that once we move an object, its data will be lost at the calling location. So, in the above example, how can I change Line 2 to move the object "b" into foo (is it by using std::move(b))?
I have read that a move constructor is more efficient than a copy constructor. How? The only situation I can think of is where memory on the heap need not be allocated again in the case of a move constructor. Does this statement hold true when we don't have any memory on the heap?
Is it even more efficient than passing by reference (no, right?)?

First on your "understandings":
As far as I can see, they are right in principle, but you should be aware of copy elision, which can prevent the program from calling any copy or move constructor at all. Whether that happens depends on your compiler and its settings.
On your Questions:
Yes, you have to call foo(std::move(b)) to call a function that takes an rvalue with an lvalue. std::move performs the cast. Note: std::move itself does not move anything.
Using the move constructor "might" be more efficient. In truth, it only enables programmers to implement more efficient constructors. For example, consider a vector that is a class around a pointer to an array holding the data (similar to std::vector): if you copy it, you have to copy the data; if you move it, you can just pass the pointer along and set the old one to nullptr.
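To make the pointer-stealing idea concrete, here is a minimal sketch. IntBuffer and demo_buffer are hypothetical names made up for this illustration (this is not how std::vector is actually implemented), and assignment is deliberately left out to keep it short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical vector-like class: owns a heap array through a raw pointer.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // Copy: allocate a new array and copy every element, O(n).
    IntBuffer(const IntBuffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + other.size_, data_);
    }

    // Move: take over the pointer and leave the source empty, O(1).
    IntBuffer(IntBuffer&& other) noexcept
        : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    IntBuffer& operator=(const IntBuffer&) = delete;  // left out of the sketch

    ~IntBuffer() { delete[] data_; }  // deleting nullptr is a no-op

    std::size_t size() const { return size_; }
    const int* data() const { return data_; }

private:
    std::size_t size_;
    int* data_;
};

// Usage sketch: call this from main().
void demo_buffer() {
    IntBuffer a(1000);
    const int* original = a.data();

    IntBuffer copy(a);                 // new allocation, elements copied
    assert(copy.data() != original);
    assert(copy.size() == 1000);

    IntBuffer moved(std::move(a));     // same allocation, just handed over
    assert(moved.data() == original);
    assert(a.data() == nullptr);
}
```

The copy costs an allocation plus 1000 element copies, while the move is two assignments and two resets regardless of the size.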
But as I read in Effective Modern C++ by Scott Meyers: do not think your program will be faster only because you use std::move everywhere.
That depends on the usage of the input. If you do not need a copy in the function, it will in most cases be more efficient to just pass the object by (const) reference. If you need a copy, there are several ways of doing it, for example the copy-and-swap idiom. But as a

Line 2 before C++11 would have called copy constructor and all those temporary copy stuff, but with move constructor defined, that would be called here.
Correct, except any decent optimizer would "elide" the copy, so that before C++11 the copy would have been avoided, and post C++11 the move would have been avoided. Same for line 3.
I know once we move an object it's data will be lost at calling location.
Depends on how the move constructor/assignment is implemented. If you don't know, this is what you must assume.
So, i above example how can i change Line 2 to move object "b" in foo (is it using std::move(b) ?).
Exactly. std::move casts the expression to an rvalue, and therefore the move constructor is invoked.
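A small sketch of that, if it helps. Tracked, Origin and how_was_parameter_built are hypothetical names invented for the illustration; the type simply remembers which constructor created it.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that records which constructor produced it.
enum class Origin { Default, Copy, Move };

struct Tracked {
    Origin origin;
    Tracked() : origin(Origin::Default) {}
    Tracked(const Tracked&) : origin(Origin::Copy) {}
    Tracked(Tracked&&) noexcept : origin(Origin::Move) {}
};

// Takes its argument by value, so the parameter is built by copy or by move.
Origin how_was_parameter_built(Tracked t) { return t.origin; }

// Usage sketch: call this from main().
void demo_std_move() {
    Tracked b;
    // b is an lvalue: the parameter is copy-constructed.
    assert(how_was_parameter_built(b) == Origin::Copy);
    // std::move(b) is only a cast to an rvalue; the move constructor does the work.
    assert(how_was_parameter_built(std::move(b)) == Origin::Move);
}
```

Passing a temporary such as Tracked() is left out on purpose: with copy elision the parameter may be constructed directly, so neither the copy nor the move constructor would be observed.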
I have read move constructor is more efficient than copy constructor.
It can be, in some cases. For example the move constructor of std::vector is much faster than copy.
I can think of only situation where we have memory on heap need not to be allocated again in case of move constructor. Does this statement hold true when we don't have any memory on heap?
The statement isn't universally true, since for objects with a trivial copy constructor, the move constructor isn't any more efficient. But owning dynamic memory isn't strictly a requirement for a more efficient move. More generally, a move can be more efficient if the object owns any external resource, which could be dynamic memory, or it could be, for example, a reference counter or a file descriptor that must be released in the destructor and therefore re-acquired or re-calculated on copy, which can be avoided on move.
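As a rough sketch of the "external resource" case, the following uses a counter as a stand-in for something expensive such as opening a file descriptor. Handle and ResourceLog are hypothetical names for this illustration only; no real resource is touched.

```cpp
#include <cassert>
#include <utility>

// Hypothetical bookkeeping for an expensive external resource
// (think of a file descriptor or a connection).
struct ResourceLog {
    static int acquired;
    static int released;
};
int ResourceLog::acquired = 0;
int ResourceLog::released = 0;

class Handle {
public:
    Handle() : owns_(true) { ++ResourceLog::acquired; }

    // Copy: both objects must own something, so a second resource is acquired.
    Handle(const Handle& other) : owns_(other.owns_) {
        if (owns_) ++ResourceLog::acquired;
    }

    // Move: ownership is transferred; nothing is acquired or released.
    Handle(Handle&& other) noexcept : owns_(other.owns_) { other.owns_ = false; }

    Handle& operator=(const Handle&) = delete;  // assignment left out of the sketch

    ~Handle() { if (owns_) ++ResourceLog::released; }

    bool owns() const { return owns_; }

private:
    bool owns_;
};

// Usage sketch: call this from main().
void demo_handle() {
    const int acquired_before = ResourceLog::acquired;
    const int released_before = ResourceLog::released;
    {
        Handle a;                 // acquires one resource
        Handle b(a);              // copy: acquires a second one
        Handle c(std::move(a));   // move: takes a's resource, acquires nothing
        assert(ResourceLog::acquired == acquired_before + 2);
        assert(!a.owns() && b.owns() && c.owns());
    }
    // Only the two owning objects released anything.
    assert(ResourceLog::released == released_before + 2);
}
```

Three objects were created but only two acquisitions happened, which is the saving a move can offer even when no heap memory is involved.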
Is it even more efficient than passing by reference (no, right?)?
Indeed not. However, if you intend to move the object within the function where you pass it by reference, then you would have to pass a non-const reference and therefore not be able to pass temporaries.
In short: Reference is great for giving temporary access to an object that you keep, move is great for giving the ownership away.

Related

Arguments for the copy constructor

Why use references to the parameters of the copy constructor?
I found a lot of information saying that it is to avoid unlimited calls, but I still can't understand it.
When you pass to a method by value, a copy is made of the argument. Copying uses the copy constructor, so you get a chicken and egg situation with infinite recursive calls to the copy constructor.
Response to comment:
Passing by reference does not make a copy of the object being passed. It simply passes the address of the object (hidden behind the reference syntax), so the object inside the copy constructor (or any method to which an object is passed by reference) is the same object as the one outside.
As well as solving the chicken-and-egg problem here, passing by reference is usually faster (for larger objects, that is, larger than the size of a pointer).
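If you want to see the difference for yourself, a sketch like the following counts copy-constructor calls. Counted, by_value and by_reference are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Hypothetical type that counts how often its copy constructor runs.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

void by_value(Counted) {}             // parameter is a new object: needs a copy
void by_reference(const Counted&) {}  // parameter is an alias: no new object

// Usage sketch: call this from main().
void demo_parameter_passing() {
    Counted c;
    const int before = Counted::copies;

    by_reference(c);
    assert(Counted::copies == before);      // nothing was copied

    by_value(c);
    assert(Counted::copies == before + 1);  // the copy constructor ran once
}
```

If the copy constructor itself took its parameter by value, building that parameter would need the copy constructor again, which is the endless chain described above.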
Response to further comment:
You could write a kind of copy constructor that passed by pointer, and it would work in the same way as passing by reference. But it would be fiddly to call explicitly and impossible to call implicitly.
Declaration:
class X
{
public:
X();
X(const X* const pOther);
};
The explicit copy:
X x1;
X x2(&x1); // Have to take address
The implicit copy:
void foo (X copyOfX); // Pass by value, copy made
...
X x1;
foo (x1); // Copy constructor called implicitly if correctly declared
// But not matched if declared with pointer
foo (&x1); // Copy constructor with pointer might (?) be matched
// But function call to foo isn't
Ultimately, such a thing would not be regarded as a C++ copy constructor.
This code:
class MyClass {
public:
MyClass();
MyClass(MyClass c);
};
does not compile. That is, because the second line here:
MyClass a;
MyClass b(a);
should theoretically cause the infinite loop you're talking about: it would have to construct a copy of a before calling the constructor for b. However, if the copy constructor looks like this:
MyClass(const MyClass& c);
Then no copies are required to be made before calling the copy constructor.
From this webpage
A copy constructor is called when an object is passed by value. Copy
constructor itself is a function. So if we pass an argument by value
in a copy constructor, a call to copy constructor would be made to
call copy constructor which becomes a non-terminating chain of calls.
Therefore compiler doesn’t allow parameters to be passed by value.
If the argument were passed by value, the copy constructor would call itself, entering an infinite recursion. The link above explains the basics of the copy constructor pretty well.

Guaranteed copy elision in C++17 and emplace_back(...)

emplace_back(...) was introduced with C++11 to prevent the creation of temporary objects. Now with C++17, pure rvalues (prvalues) are even purer, so that they do not lead to the creation of temporaries anymore (see this question for more). I still do not fully understand the consequences of these changes: do we still need emplace_back(...), or can we just go back and use push_back(...) again?
Both push_back and emplace_back member functions create a new object of its value_type T at some place of the pre-allocated buffer. This is accomplished by the vector's allocator, which, by default, uses the placement new mechanism for this construction (placement new is basically just a way of constructing an object at a specified place in memory).
However:
emplace_back perfect-forwards its arguments to the constructor of T, thus the constructor that is the best match for these arguments is selected.
push_back(T&&) internally uses the move constructor (if it exists and does not throw) to initialize the new element. This call of move constructor cannot be elided and is always used.
Consider the following situation:
std::vector<std::string> v;
v.push_back(std::string("hello"));
Here the converting constructor first creates a temporary std::string from the string literal, and then std::string's move constructor is always called to initialize the vector's element from that temporary. In this case:
v.emplace_back("hello");
there is no move constructor called and the vector's element is initialized by std::string's converting constructor directly.
This does not necessarily mean that push_back is less efficient. Compiler optimizations might eliminate all the additional instructions, and in the end both cases might produce exactly the same assembly code. It is just not guaranteed.
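The difference can be observed with a type that counts its move-constructor calls. This is only a sketch: Label is a hypothetical type made up for the illustration, and reserve() is called up front so that no reallocation adds moves of its own.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type that counts calls to its move constructor.
struct Label {
    static int moves;
    std::string text;
    Label(const char* s) : text(s) {}                // converting constructor
    Label(const Label& other) : text(other.text) {}
    Label(Label&& other) noexcept : text(std::move(other.text)) { ++moves; }
};
int Label::moves = 0;

// Usage sketch: call this from main().
void demo_emplace() {
    std::vector<Label> v;
    v.reserve(2);  // avoid reallocation, which would move existing elements

    const int before = Label::moves;

    v.push_back(Label("hello"));          // temporary, then one move into the vector
    assert(Label::moves == before + 1);

    v.emplace_back("world");              // constructed in place from the argument
    assert(Label::moves == before + 1);   // no additional move

    assert(v.size() == 2 && v[1].text == "world");
}
```

The counter shows what is called at the language level; as said above, an optimizer may still generate identical code for cheap types.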
By the way, if push_back passed arguments by value — void push_back(T param); — then this would be a case for the application of copy elision. Namely, in:
v.push_back(std::string("hello"));
the parameter param would be constructed by a move-constructor from the temporary. This move-construction would be a candidate for copy elision. However, this approach would not at all change anything about the mandatory move-construction for vector's element inside push_back body.
You can see here: std::vector::push_back that this method requires either CopyInsertable or MoveInsertable, and that it takes either const T& value or T&& value, so I don't see how elision could be of use here.
The new rules of mandatory copy elision are of use in the following example:
struct Data {
Data() {}
Data(const Data&) = delete;
Data(Data&&) = delete;
};
Data create() {
return Data{}; // error before c++17
}
void foo(Data) {}
int main()
{
Data pf = create();
foo(Data{}); // error before c++17
}
So, you have a class that does not support copy or move operations, perhaps because they would be too expensive. The example above is a kind of factory function that always works. With the new rules you don't need to worry about whether the compiler will actually use elision, even if your class supports copy/move.
I don't see how the new rules would make push_back faster. emplace_back is still more efficient, not because of copy elision but because it creates the object in place, forwarding the arguments to its constructor.

Does D have a move constructor?

I am referencing this SO answer Does D have something akin to C++0x's move semantics?
Next, you can override C++'s constructor(constructor &&that) by defining this(Struct that). Likewise, you can override the assign with opAssign(Struct that). In both cases, you need to make sure that you destroy the values of that.
He gives an example like this:
// Move operations
this(UniquePtr!T that) {
this.ptr = that.ptr;
that.ptr = null;
}
Will the variable that always get moved? Or could it happen that the variable that could get copied in some situations?
It would be unfortunate if I would only null the ptr on a temporary copy.
Well, you can also take a look at this SO question:
Questions about postblit and move semantics
The way that a struct is copied in D is that its memory is blitted, and then if it has a postblit constructor, its postblit constructor is called. And if the compiler determines that a copy isn't actually necessary, then it will just not call the postblit constructor and will not call the destructor on the original object. So, it will have moved the object rather than copy it.
In particular, according to TDPL (p.251), the language guarantees that
All anonymous rvalues are moved, not copied. A call to this(this)
is never inserted when the source is an anonymous rvalue (i.e., a
temporary as featured in the function hun above).
All named temporaries that are stack-allocated inside a function and
then returned elide a call to this(this).
There is no guarantee that other potential elisions are observed.
So, in other cases, the compiler may or may not elide copies, depending on the current compiler implementation and optimization level (e.g. if you pass an lvalue to a function that takes it by value, and that variable is never referenced again after the function call).
So, if you have
void foo(Bar bar)
{}
then whether the argument to foo gets moved or not depends on whether it was an rvalue or an lvalue. If it's an rvalue, it will be moved, whereas if it's an lvalue, it probably won't be (but might depending on the calling code and the compiler).
So, if you have
void foo(UniquePtr!T ptr)
{}
ptr will be moved if foo was passed an rvalue and may or may not be moved if it's passed an lvalue (though generally not). So, what happens with the internals of UniquePtr depends on how you implemented it. If UniquePtr disabled the postblit constructor so that it can't be copied, then passing an rvalue will move the argument, and passing an lvalue will result in a compilation error (since the rvalue is guaranteed to be moved, whereas the lvalue is not).
Now, what you have is
this(UniquePtr!T that)
{
this.ptr = that.ptr;
that.ptr = null;
}
which appears to act like the current type has the same members as those of its argument. So, I assume that what you're actually trying to do here is a copy constructor / move constructor for UniquePtr and not a constructor for an arbitrary type that takes a UniquePtr!T. And if that's what you're doing, then you'd want a postblit constructor - this(this) - and not one that takes the same type as the struct itself (since D does not have copy constructors). So, if what you want is a copy constructor, then you do something like
this(this)
{
// Do any deep copying you want here. e.g.
arr = arr.dup;
}
But if a bitwise copy of your struct's elements works for your type, then you don't need a postblit constructor. But moving is built-in, so you don't need to declare a move constructor regardless (a move will just blit the struct's members). Rather, if what you want is to guarantee that the object is moved and never copied, then what you want to do is disable the struct's postblit constructor. e.g.
@disable this(this);
Then any and all times that you pass a UniquePtr!T anywhere, it's guaranteed to be a move or a compilation error. And while I would have thought that you might have to disable opAssign separately to disable assignment, from the looks of it (based on the code that I just tested), you don't even have to disable assignment separately. Disabling the postblit constructor also disables the assignment operator. But if that weren't the case, then you'd just have to disable opAssign as well.
JMD's answer covers the theoretical part of move semantics; I can extend it with a very simplified example implementation:
struct UniquePtr(T)
{
private T* ptr;
    @disable this(this);
UniquePtr release()
{
scope(exit) this.ptr = null;
return UniquePtr(this.ptr);
}
}
// some function that takes argument by value:
void foo ( UniquePtr!int ) { }
auto p = UniquePtr!int(new int);
// won't compile, postblit constructor is disabled
foo(p);
// ok, release() returns a new rvalue which is
// guaranteed to be moved without copying
foo(p.release());
// release also resets previous pointer:
assert(p.ptr is null);
I think I can answer it myself. Quoting "The D Programming Language":
All anonymous rvalues are moved, not copied. A call to this ( this ) is never inserted
when the source is an anonymous rvalue (i.e., a temporary as featured in
the function hun above).
If I understood it correctly, this means that this(Struct that) will never be a copy, because it only accepts rvalues in the first place.

Implementing the copy constructor in terms of operator=

If the operator= is properly defined, is it OK to use the following as copy constructor?
MyClass::MyClass(MyClass const &_copy)
{
*this = _copy;
}
If all members of MyClass have a default constructor, yes.
Note that usually it is the other way around:
class MyClass
{
public:
MyClass(MyClass const&); // Implemented
void swap(MyClass&) throw(); // Implemented
MyClass& operator=(MyClass rhs) { rhs.swap(*this); return *this; }
};
We pass by value in operator= so that the copy constructor gets called. Note that everything is exception safe, since swap is guaranteed not to throw (you have to ensure this in your implementation).
EDIT, as requested, about the call-by-value stuff: The operator= could be written as
MyClass& MyClass::operator=(MyClass const& rhs)
{
MyClass tmp(rhs);
tmp.swap(*this);
return *this;
}
C++ students are usually told to pass class instances by reference because the copy constructor gets called if they are passed by value. In our case, we have to copy rhs anyway, so passing by value is fine.
Thus, the operator= (first version, call by value) reads:
Make a copy of rhs (via the copy constructor, automatically called)
Swap its contents with *this
Return *this and let rhs (which contains the old value) be destroyed at method exit.
Now, we have an extra bonus with this call-by-value. If the object being passed to operator= (or any function which gets its arguments by value) is a temporary object, the compiler can (and usually does) make no copy at all. This is called copy elision.
Therefore, if rhs is temporary, no copy is made. We are left with:
Swap this and rhs contents
Destroy rhs
So passing by value is in this case more efficient than passing by reference.
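For completeness, here is a compilable sketch of the whole idiom. IntArray is a hypothetical class invented for the illustration, and noexcept is used where the older snippet above writes throw().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical resource-owning class used to illustrate copy-and-swap.
class IntArray {
public:
    explicit IntArray(std::size_t n, int fill = 0)
        : size_(n), data_(new int[n]) {
        std::fill(data_, data_ + n, fill);
    }

    IntArray(const IntArray& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    ~IntArray() { delete[] data_; }

    // Must not throw: it only exchanges the two members.
    void swap(IntArray& other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    // rhs is already a private copy (or an elided temporary). After the swap
    // it holds our old array, which its destructor frees on exit.
    IntArray& operator=(IntArray rhs) {
        rhs.swap(*this);
        return *this;
    }

    std::size_t size() const { return size_; }
    int operator[](std::size_t i) const { return data_[i]; }

private:
    std::size_t size_;
    int* data_;
};

// Usage sketch: call this from main().
void demo_copy_and_swap() {
    IntArray a(3, 7);
    IntArray b(1, 0);

    b = a;                                // copy into rhs, then swap
    assert(b.size() == 3 && b[2] == 7);
    assert(a.size() == 3 && a[0] == 7);   // the source is untouched

    IntArray& alias = b;
    b = alias;                            // self-assignment: rhs is a separate copy
    assert(b.size() == 3 && b[1] == 7);

    b = IntArray(2, 5);                   // temporary: the copy can be elided
    assert(b.size() == 2 && b[0] == 5);
}
```

If the copy constructor throws, operator= is never entered and *this is left unchanged, which is where the exception safety comes from.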
It is more advisable to implement operator= in terms of an exception-safe copy constructor. See Example 4 in this article by Herb Sutter for an explanation of the technique and why it's a good idea.
http://www.gotw.ca/gotw/059.htm
This implementation implies that the default constructors for all the data members (and base classes) are available and accessible from MyClass, because they will be called first, before making the assignment. Even in this case, having this extra call for the constructors might be expensive (depending on the content of the class).
I would still stick to separate implementation of the copy constructor through initialization list, even if it means writing more code.
Another thing: This implementation might have side effects (e.g. if you have dynamically allocated members).
While the end result is the same, the members are first default initialized, only copied after that.
With 'expensive' members, you better copy-construct with an initializer list.
struct C {
    ExpensiveType member;
    C( const C& other ): member(other.member) {}
};
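The extra default construction can be made visible with a counting member. This is only a sketch: Member, ViaAssignment and ViaInitList are hypothetical names made up for the illustration.

```cpp
#include <cassert>

// Hypothetical member type that counts how it gets initialized and assigned.
struct Member {
    static int default_constructions;
    static int copy_constructions;
    static int assignments;
    int value;
    Member() : value(0) { ++default_constructions; }
    Member(const Member& o) : value(o.value) { ++copy_constructions; }
    Member& operator=(const Member& o) {
        value = o.value;
        ++assignments;
        return *this;
    }
};
int Member::default_constructions = 0;
int Member::copy_constructions = 0;
int Member::assignments = 0;

// Copy constructor written in terms of operator=.
struct ViaAssignment {
    Member member;
    ViaAssignment() {}
    ViaAssignment(const ViaAssignment& other) { *this = other; }
    ViaAssignment& operator=(const ViaAssignment& other) {
        member = other.member;
        return *this;
    }
};

// Copy constructor written with an initializer list.
struct ViaInitList {
    Member member;
    ViaInitList() {}
    ViaInitList(const ViaInitList& other) : member(other.member) {}
};

// Usage sketch: call this from main().
void demo_member_init() {
    ViaAssignment a;
    ViaInitList b;
    const int d0 = Member::default_constructions;
    const int c0 = Member::copy_constructions;
    const int s0 = Member::assignments;

    ViaAssignment a2(a);   // member is default-constructed, then assigned
    (void)a2;
    assert(Member::default_constructions == d0 + 1);
    assert(Member::assignments == s0 + 1);
    assert(Member::copy_constructions == c0);

    ViaInitList b2(b);     // member is copy-constructed directly
    (void)b2;
    assert(Member::default_constructions == d0 + 1);
    assert(Member::assignments == s0 + 1);
    assert(Member::copy_constructions == c0 + 1);
}
```

So the operator= route costs two operations per member (default construction plus assignment) where the initializer list costs one.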
I would say this is not okay if MyClass allocates memory or is mutable.
yes.
Personally, though, if your class doesn't have pointers, I would not overload the assignment operator or write the copy constructor, and would let the compiler do it for you. It will implement a shallow (memberwise) copy, and you'll know for sure that all member data is copied, whereas if you overload operator=, then add a data member and forget to update the overload, you'll have a problem.
@Alexandre - I am not sure about passing by value in the assignment operator. What advantage do you get by calling the copy constructor there? Is this going to speed up the assignment operator?
P.S. I don't know how to write comments. Or may be I am not allowed to write comments.
It is technically OK, if you have a working assignment operator (copy operator).
However, you should prefer copy-and-swap because:
Exception safety is easier with copy-swap
Most logical separation of concerns:
The copy-ctor is about allocating the resources it needs (to copy the other stuff).
The swap function is (mostly) only about exchanging internal "handles" and doesn't need to do resource (de)allocation
The destructor is about resource deallocation
Copy-and-swap naturally combines these three functions in the assignment (copy) operator

Who deletes the copied instance in + operator ? (c++)

I searched how to implement + operator properly all over the internet and all the results i found do the following steps :
const MyClass MyClass::operator+(const MyClass &other) const
{
MyClass result = *this; // Make a copy of myself. Same as MyClass result(*this);
result += other; // Use += to add other to the copy.
return result; // All done!
}
I have few questions about this "process" :
Isn't it stupid to implement the + operator this way? It calls the assignment operator (which copies the class) in the first line and then the copy constructor in the return (which also copies the class, because the return is by value, so it destroys the first copy and creates a new one, which is frankly not really smart).
When I write a=b+c, the b+c part creates a new copy of the class, then the 'a=' part copies the copy to itself.
Who deletes the copy that b+c created?
Is there a better way to implement the + operator without copying the class twice, and also without any memory issues?
thanks in advance
That's effectively not an assignment operator, but a copy constructor. An operation like addition creates a new value, after all, so it has to be created somewhere. This is more efficient than it seems, since the compiler is free to do Return Value Optimization, which means it can construct the value directly where it will next be used.
The result is declared as a local variable, and hence goes away with the function call - except if RVO (see above) is used, in which case it was never actually created in the function, but in the caller.
Not really; this method is much more efficient than it looks at first.
Under the circumstances, I'd probably consider something like:
MyClass MyClass::operator+(MyClass other) {
other += *this;
return other;
}
Dave Abrahams wrote an article a while back explaining how this works and why this kind of code is usually quite efficient even though it initially seems like it shouldn't be.
Edit (thank you MSalters): Yes, this does assume/depend upon the commutative property holding for MyClass. If a+b != b+a, then the original code is what you want (most of the same reasoning applies).
it calls the assignment operator(which copies the class) in the first line
No, this is copy-initialization (through constructor).
then the copy constructor in the return (which also copies the class
Compilers can (and typically do) elide this copy using NRVO.
When I write a=b+c, the b+c part creates a new copy of the class, then the 'a=' part copies the copy to itself. Who deletes the copy that b+c created?
The compiler, as with any other temporary value. Temporaries are destroyed at the end of the full-expression (in this case, that means at the ; at the end of the line).
Is there a better way to implement the + operator without copying the class twice, and also without any memory issues?
Not really. It's not that inefficient.
This appears to be the correct way to implement operator+. A few points:
MyClass result = *this does not use the assignment operator, it should be calling the copy constructor, as if it were written MyClass result(*this).
The returned value when used in a = b + c is called a temporary, and the compiler is responsible for deleting it (which will probably happen at the end of the statement ie. the semicolon, after everything else has been done). You don't have to worry about that, the compiler will always clean up temporaries.
There's no better way; you need the copy. The compiler, however, is allowed to optimise away the temporary copies, so fewer copies may be made than you think. In C++0x, though, you can use move constructors to improve performance by transferring ownership of the contents of a temporary rather than copying it in its entirety.
I'll try my best to answer:
Point (1): No, it does not call the assignment operator. Instead it calls a constructor. Since you need to construct the object anyway (since operator+ returns a copy), this does not introduce extra operations.
Point (2): The temporary result is created on the stack and hence does not introduce a memory problem (it is destroyed when the function exits). On return, a temporary is created so that an assignment (or copy constructor) can be used to assign the result to a (in a=b+c;) even after result is destroyed. This temporary is destroyed automatically by the compiler.
Point (3): The above is what the standard prescribes. Remember that compiler implementors are allowed to optimize the implementation as long as the effect is the same as what the standard prescribes. I believe compilers in reality optimize away much of the copying that occurs here. Using the idiom above is readable and is not actually inefficient.
P.S. I sometimes prefer to implement operator+ as a non-member to leverage implicit conversion for both sides of the operator (only if it makes sense).
There are no memory issues (provided that the assignment operator, and copy constructor are well written). Simply because all the memory for these objects is taken on the stack and managed by the compiler. Furthermore, compilers do optimize this and perform all the operations directly on the final a instead of copying twice.
This is the proper way of implementing the operator+ in C++. Most of the copies you are so afraid of will get elided by the compiler and will be subject to move semantics in C++0x.
The class is a temporary and will be deleted. If you bind the temporary to a const& the life time of the temporary will be extended to the life time of the const reference.
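Both lifetime rules can be checked with a destructor counter. This is a sketch only; Noisy and make are hypothetical names for the illustration, and the checks are written relative to a snapshot so they hold whether or not the compiler elides copies inside make().

```cpp
#include <cassert>

// Hypothetical type that counts destructor calls.
struct Noisy {
    static int destroyed;
    int value;
    Noisy() : value(7) {}
    ~Noisy() { ++destroyed; }
};
int Noisy::destroyed = 0;

Noisy make() { return Noisy(); }

// Usage sketch: call this from main().
void demo_temporary_lifetime() {
    // A plain temporary is destroyed at the end of the full-expression.
    const int before = Noisy::destroyed;
    int v = make().value;
    assert(v == 7);
    assert(Noisy::destroyed > before);

    // Bound to a const reference, the temporary lives as long as the reference.
    int after_bind = 0;
    {
        const Noisy& ref = make();
        after_bind = Noisy::destroyed;
        assert(ref.value == 7);
        assert(Noisy::destroyed == after_bind);   // still alive here
    }
    assert(Noisy::destroyed == after_bind + 1);   // destroyed with the reference
}
```

In neither case does the programmer delete anything; the compiler inserts the destructor call at the right point.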
Maybe implementing it as a free function is a little more obvious. The first parameter in MyClass::operator+ is an implicit this, so the member version effectively behaves like operator+(const MyClass&, const MyClass&) anyway.
As far as I remember, Stroustrup's 'The C++ Programming Language' recommends to implement operators as member functions only when internal representation is affected by operation and as external functions when not. operator+ does not need to access internal representation if implemented based on operator+=, which does.
So you would have:
class MyClass
{
public:
MyClass& operator+=(const MyClass &other)
{
// Implementation
return *this;
}
};
MyClass operator+(const MyClass &op1, const MyClass &op2)
{
    MyClass r = op1;
    r += op2;   // modify the copy
    return r;   // returning the named local lets the compiler elide or move it
}
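Here is the same layout as a complete sketch that also shows the implicit-conversion benefit of a non-member operator+ mentioned in an earlier answer. Meters is a hypothetical class made up for the illustration, with a deliberately non-explicit constructor from int.

```cpp
#include <cassert>

// Hypothetical value type; the non-explicit constructor allows int -> Meters.
class Meters {
public:
    Meters(int v) : value_(v) {}

    // Member: it modifies the internal representation.
    Meters& operator+=(const Meters& other) {
        value_ += other.value_;
        return *this;
    }

    int value() const { return value_; }

private:
    int value_;
};

// Non-member: built only on the public operator+=.
Meters operator+(const Meters& lhs, const Meters& rhs) {
    Meters result = lhs;   // copy construction, not assignment
    result += rhs;
    return result;         // named local: eligible for NRVO
}

// Usage sketch: call this from main().
void demo_operator_plus() {
    Meters a(2), b(3);
    Meters c = a + b;
    assert(c.value() == 5);
    assert(a.value() == 2 && b.value() == 3);   // operands are unchanged

    assert((a + 1).value() == 3);   // right operand converted from int
    assert((1 + a).value() == 3);   // left operand converted too
}
```

With a member operator+, the expression 1 + a would not compile, because no conversion is applied to the implicit this operand.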