Guaranteed copy elision in C++17 and emplace_back(...)


emplace_back(...) was introduced with C++11 to prevent the creation of temporary objects. Now, with C++17, pure rvalues (prvalues) are even purer, so they no longer lead to the creation of temporaries (see this question for more). I still do not fully understand the consequences of these changes: do we still need emplace_back(...), or can we just go back and use push_back(...) again?

Both push_back and emplace_back member functions create a new object of its value_type T at some place of the pre-allocated buffer. This is accomplished by the vector's allocator, which, by default, uses the placement new mechanism for this construction (placement new is basically just a way of constructing an object at a specified place in memory).
However:
emplace_back perfect-forwards its arguments to the constructor of T, thus the constructor that is the best match for these arguments is selected.
push_back(T&&) initializes the new element from its reference parameter, using the move constructor if one is available (and the copy constructor otherwise). Because the argument has already been bound to a reference, this constructor call cannot be elided and is always performed.
Consider the following situation:
std::vector<std::string> v;
v.push_back(std::string("hello"));
Here the converting constructor first creates a temporary std::string from the string literal, and then std::string's move constructor is always called to initialize the vector's element from that temporary. In this case:
v.emplace_back("hello");
there is no move constructor called and the vector's element is initialized by std::string's converting constructor directly.
This does not necessarily mean that push_back is less efficient. Compiler optimizations might eliminate all the additional instructions, and both cases might end up producing exactly the same assembly code. It is just not guaranteed.
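The difference described above can be made visible with an instrumented type. The following is a minimal sketch, not a benchmark: Tracked and the two helper functions are hypothetical names introduced here for illustration, and reserve(1) is called so that moves caused by reallocation do not distort the count.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical instrumented type: counts move constructions.
struct Tracked {
    static int moves;
    std::string text;

    Tracked(const char* s) : text(s) {}  // converting constructor
    Tracked(Tracked&& other) noexcept : text(std::move(other.text)) { ++moves; }
    Tracked(const Tracked&) = delete;
};
int Tracked::moves = 0;

// Number of move constructions caused by push_back of a temporary.
int moves_for_push_back() {
    Tracked::moves = 0;
    std::vector<Tracked> v;
    v.reserve(1);                   // rule out moves caused by reallocation
    v.push_back(Tracked("hello"));  // temporary first, then a move into the buffer
    assert(v.size() == 1);
    return Tracked::moves;
}

// Number of move constructions caused by emplace_back with forwarded arguments.
int moves_for_emplace_back() {
    Tracked::moves = 0;
    std::vector<Tracked> v;
    v.reserve(1);
    v.emplace_back("hello");        // element is built in place from const char*
    assert(v.size() == 1);
    return Tracked::moves;
}
```

Calling moves_for_push_back() reports one move construction, while moves_for_emplace_back() reports none. Because the counter is an observable side effect, the optimizer may not remove the move in the push_back case here.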
By the way, if push_back passed arguments by value — void push_back(T param); — then this would be a case for the application of copy elision. Namely, in:
v.push_back(std::string("hello"));
the parameter param would be constructed by a move-constructor from the temporary. This move-construction would be a candidate for copy elision. However, this approach would not at all change anything about the mandatory move-construction for vector's element inside push_back body.
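The by-value variant can be sketched as follows, assuming C++17 (earlier standards only permit, rather than require, the elision of the parameter's construction). ByValueStore and Counted are hypothetical names; std::vector itself has no by-value push_back.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical instrumented type: counts move constructions.
struct Counted {
    static int moves;
    int value;

    explicit Counted(int v) : value(v) {}
    Counted(Counted&& o) noexcept : value(o.value) { ++moves; }
    Counted(const Counted&) = delete;
};
int Counted::moves = 0;

// Hypothetical container whose push_back takes its argument by value.
class ByValueStore {
public:
    ByValueStore() { items_.reserve(4); }  // avoid reallocation moves in the demo
    void push_back(Counted param) {
        // param is a named object, so getting it into the storage
        // still costs one move construction.
        items_.emplace_back(std::move(param));
    }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<Counted> items_;
};

int moves_for_by_value_push_back() {
    Counted::moves = 0;
    ByValueStore store;
    store.push_back(Counted(42));  // the prvalue initializes param directly
    assert(store.size() == 1);
    return Counted::moves;
}
```

The function returns 1: the construction of param from the temporary is elided, but the move from param into the stored element remains.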

You can see here: std::vector::push_back that this method requires the type to be either CopyInsertable or MoveInsertable, and that it takes either const T& value or T&& value, so I don't see how elision could be of use here.
The new rules of mandatory copy elision are of use in the following example:
struct Data {
Data() {}
Data(const Data&) = delete;
Data(Data&&) = delete;
};
Data create() {
return Data{}; // error before c++17
}
void foo(Data) {}
int main()
{
Data pf = create();
foo(Data{}); // error before c++17
}
So, you have a class which does not support copy/move operations, perhaps because they would be too expensive. The example above is a kind of factory method that always works. With the new rules you don't need to worry about whether the compiler will actually perform elision, even if your class does support copy/move.
I don't see how the new rules would make push_back faster. emplace_back is still more efficient, not because of copy elision but because it creates the object in place by forwarding its arguments to the constructor.
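In-place construction can be illustrated with a type that can be neither copied nor moved. This sketch uses std::list rather than std::vector, because std::vector::emplace_back additionally requires the element type to be MoveInsertable (the vector may have to relocate its elements). Pinned and sum_of_emplaced are hypothetical names.

```cpp
#include <cassert>
#include <list>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int a, b;
    Pinned(int a_, int b_) : a(a_), b(b_) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

int sum_of_emplaced() {
    std::list<Pinned> items;
    items.emplace_back(1, 2);  // constructed directly inside the list node
    items.emplace_back(3, 4);
    // items.push_back(Pinned(5, 6));  // would not compile: needs a copy or move constructor
    assert(items.size() == 2);

    int sum = 0;
    for (const Pinned& p : items) sum += p.a + p.b;
    return sum;
}
```

emplace_back only needs the (int, int) constructor, whereas push_back would need to copy or move from its reference parameter, which this type forbids.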

Related

Struct member address after returning

The following code emulates some code I'm working with. Basically struct Foo allocates an std::vector member (d_vec) and then defines some other member to be a pointer to the vector's contents (d_buf).
#include <cstddef>
#include <vector>
struct Foo
{
Foo(std::size_t n)
: d_vec(n, 0.)
, d_buf(d_vec.data())
{}
std::vector<double> d_vec;
double* d_buf;
};
Now, the following looks fine to me:
void buildAndUseFoo()
{
Foo f{10};
// do stuff with f.d_buf, it is safe
// ...
}
What I am not sure about is this:
Foo buildAndReturnFoo()
{
Foo f{10};
return f;
}
void someMethod()
{
auto f = buildAndReturnFoo();
// is it safe to use f.d_buf?
// ...
}
I wonder if the d_vec's address could change from when it's created inside buildAndReturnFoo() to when it's used in someMethod(). Then if I attempted to dereference it, I would get undefined behavior.
Note: I have tested printing the addresses and they happened to be the same but I'd like to be sure this is guaranteed, and that I wasn't relying on "luck".
Note #2: I'm aware of safer approaches; I'm just looking to learn about this scenario.

Your struct is dangerous to copy or move in any kind of situation, not restricted to function returns.
When such a struct is copied/moved, the d_buf of the destination object still points to the original vector’s data. That’s almost certainly not what you intended. So you need to respect the spirit of the rule of 5[*] and implement a copy ctor, copy assignment operator, move ctor and move assignment operator that all do the right thing, i.e. update where d_buf points to. Or disable copy and/or move by deleting those functions.
The alternative is to get rid of d_buf. Replace it with a member function buffer() that accesses the vector’s data() on the fly. Because getting that pointer is a cheap operation I’d lean towards this solution.
[*] The rule of 5 states that if you need to implement at least one of copy ctor, copy assignment, move ctor, move assignment or destructor, you need all five of them. Your struct doesn’t manage any resources explicitly, so you don’t need a destructor and technically it’s not the full Rule of 5. But its spirit still applies.
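Both points of this answer can be sketched in a few lines. Foo mirrors the struct from the question; SafeFoo is a hypothetical variant that drops d_buf in favor of a buffer() accessor, and the helper function names are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Layout from the question: caches a pointer into its own vector.
struct Foo {
    explicit Foo(std::size_t n) : d_vec(n, 0.), d_buf(d_vec.data()) {}
    std::vector<double> d_vec;
    double* d_buf;
};

// Hypothetical variant without the cached pointer.
struct SafeFoo {
    explicit SafeFoo(std::size_t n) : d_vec(n, 0.) {}
    double* buffer() { return d_vec.data(); }  // looked up on every call
    std::vector<double> d_vec;
};

// The implicit copy deep-copies d_vec but copies d_buf as a plain pointer.
bool copy_of_foo_aliases_original() {
    Foo original(10);
    Foo copy = original;
    return copy.d_buf == original.d_vec.data() &&
           copy.d_buf != copy.d_vec.data();
}

SafeFoo build_and_return_safe_foo() {
    SafeFoo f(10);
    return f;  // copy, move or elision: buffer() stays correct either way
}

bool safe_foo_buffer_is_consistent() {
    SafeFoo f = build_and_return_safe_foo();
    SafeFoo g = f;  // an independent copy with its own storage
    assert(f.d_vec.size() == 10 && g.d_vec.size() == 10);
    return f.buffer() == f.d_vec.data() &&
           g.buffer() == g.d_vec.data() &&
           g.buffer() != f.buffer();
}
```

Both functions return true: the copied Foo keeps pointing into the original object's storage, while SafeFoo always reports the storage of the object it is called on.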

The observed behavior may be caused by:
C++11 move semantics, if the vector is move-constructed by transferring its data to the new one
C++17 elision of copy/move operations, a.k.a. RVO/NRVO
For the first reason, transferring the data without reallocating is very likely to happen, but is not enforced by the standard.
For the second reason, elision is also very likely to happen, even before C++17, but f is a named variable, so it does not fall under the mandatory elision of copy/move operations.
So the observed behavior is very likely in practice, but is not guaranteed.

Use std::move on named return value to guarantee no copying?

UPDATE: To be even more explicit, and avoid misunderstandings: What I am asking is, in case of returning a named value, does the C++17 standard GUARANTEE that the move constructor will be invoked if I do std::move on the return value? I understand that if not using std::move, compilers are allowed, but not required, to entirely elide the copy and move constructors and just construct the return value in the calling function directly. That is not what I want to do in my example; I want guarantees.
Consider
class A; // Class with heap-allocated memory and 'sane' move constructor/move assignment.
A a_factory(/* some args to construct an A object */)
{
// Code to process args to be able to build an A object.
A a(/* args */); // A named instance of A, so would require non-guaranteed NRVO.
return std::move(a);
}
void foo()
{
A result = a_factory();
}
In this scenario, does the C++ standard guarantee that no copying will take place when constructing the result object, i.e. do we have guaranteed move construction?
I do understand the drawbacks of explicit std::move on a return value, e.g. in cases where class A is unmovable, we cannot do late materialization of temporaries and get 0 copy even without a move constructor in the class. But my specific question is this - I come from a hard real-time background and the current status of NRVO not being guaranteed by the standard is less than ideal. I do know the 2 specific cases where C++17 made (non-named) RVO mandatory, but this is not my question.
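The scenario can be instrumented as in the sketch below. The definition of A here is a hypothetical stand-in for the class in the question, with counters added. Since the operand of return std::move(a) is an xvalue and not the name of a local variable, the return statement is not eligible for elision, and overload resolution picks the move constructor. The initialization of result from the returned prvalue adds no further copy or move in C++17. Note that some compilers warn about this pattern as a pessimizing move.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the class A of the question.
struct A {
    static int copies;
    static int moves;
    std::vector<int> data;  // heap-allocated payload

    explicit A(std::size_t n) : data(n, 0) {}
    A(const A& other) : data(other.data) { ++copies; }
    A(A&& other) noexcept : data(std::move(other.data)) { ++moves; }
};
int A::copies = 0;
int A::moves = 0;

A a_factory(std::size_t n) {
    A a(n);
    return std::move(a);  // xvalue operand: no NRVO, the move constructor runs
}

struct Counts { int copies; int moves; };

Counts counts_for_factory_call() {
    A::copies = 0;
    A::moves = 0;
    A result = a_factory(1000);  // prvalue: initializes result directly
    assert(result.data.size() == 1000);
    return Counts{A::copies, A::moves};
}
```

counts_for_factory_call() reports zero copies and exactly one move.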

Copy elision when creating object inside emplace()

I see a lot of code at work where people use emplace and emplace_back with a temporary object, like this:
struct A {
A(int, int);
};
vector<A> v;
v.emplace_back(A(1, 2));
I know that the whole point of emplace_back is to be able to pass the parameters directly, like this:
v.emplace_back(1, 2);
But unfortunately this is not clear to a few people. But let's not dwell on that....
My question is: is the compiler able to optimize this and skip the create and copy? Or should I really try to fix these occurrences?
For your reference... we're working with C++14.

My question is: is the compiler able to optimize this and skip the create and copy? Or should I really try to fix these occurrences?
It can't avoid a copy, in the general case. Since emplace_back accepts by forwarding references, it must create temporaries from a pure standardese perspective. Those references must bind to objects, after all.
Copy elision is a set of rules that allows a copy (or move) constructor call to be avoided, and the copy elided, even if the constructor and the corresponding destructor have side effects. It applies only in specific circumstances, and passing arguments by reference is not one of them. So for non-trivial types, where the object copies can't be optimized away under the as-if rule, the compiler's hands are tied if it aims to be standard-conformant.
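A small counting experiment makes this concrete. The sketch below uses a hypothetical type Pair in place of A and reserves capacity up front so that only the insertion itself is measured.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for A, counting move constructions.
struct Pair {
    static int moves;
    int x, y;

    Pair(int x_, int y_) : x(x_), y(y_) {}
    Pair(const Pair& o) : x(o.x), y(o.y) {}
    Pair(Pair&& o) noexcept : x(o.x), y(o.y) { ++moves; }
};
int Pair::moves = 0;

// emplace_back receives a temporary: it binds to a forwarding reference,
// and the element is then move-constructed from it.
int moves_when_emplacing_temporary() {
    Pair::moves = 0;
    std::vector<Pair> v;
    v.reserve(1);
    v.emplace_back(Pair(1, 2));
    assert(v.size() == 1);
    return Pair::moves;
}

// emplace_back receives the constructor arguments: no temporary Pair exists.
int moves_when_emplacing_arguments() {
    Pair::moves = 0;
    std::vector<Pair> v;
    v.reserve(1);
    v.emplace_back(1, 2);
    assert(v.size() == 1);
    return Pair::moves;
}
```

The first function returns 1 and the second returns 0, which is the extra move that refactoring to v.emplace_back(1, 2) removes.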

The easy answer is no; elision doesn't work with perfect forwarding. But this is c++ so the answer is actually yes.
It requires a touch of boilerplate:
#include <iostream>
#include <type_traits>
#include <utility>
#include <vector>
struct A {
A(int, int){std::cout << "A(int,int)\n"; }
A(A&&){std::cout<<"A(A&&)\n";}
};
template<class F>
struct maker_t {
F f;
template<class T>
operator T()&&{ return f(); }
};
template<class F>
maker_t<std::decay_t<F>> maker( F&& f ) { return {std::forward<F>(f)}; }
std::vector<A> v;
v.emplace_back(maker([]{ return A(1,2); }));
live example.
Output is one call to A(int,int). No move occurs. In c++17 the making doesn't even require that a move constructor exist (but the vector does, as it thinks it may have to move the elements in an already allocated buffer). In c++14 the moves are simply elided.

is the compiler able to optimize this and skip the create and copy?
There is not necessarily a copy involved. If a move constructor is available, there will be a move. This cannot be optimized away: the direct-initialization case just calls the A(int, int) constructor, while in the other case the move constructor is called in addition (including its side effects).
Therefore, if possible, you should refactor that code.

Efficiency difference between copy and move constructor

C++11 introduced the new concept of rvalue references. I was reading about it somewhere and found the following:
class Base
{
public:
Base(); // Default ctor
Base(int t); // Parameterized ctor
Base(const Base& b); // Copy ctor
Base(Base&& b); // Move ctor
};
void foo(Base b) // Function 1
{}
void foo(Base& b) // Function 2
{}
int main()
{
Base b(10);
foo(b); // Line 1 (I know of the ambiguity, but let's ignore it for the purpose of understanding)
foo(Base()); // Line 2
foo(2); // Line 3
}
Now with my limited understanding, my observations are as follows :
Line 1 will simply call the copy constructor as argument is an lvalue.
Line 2 before C++11 would have called copy constructor and all those temporary copy stuff, but with move constructor defined, that would be called here.
Line 3 will again call move constructor as 2 will be implicitly converted to Base type (rvalue).
Please correct and explain if any of the above observations are wrong.
Now, here are my questions:
I know that once we move an object, its data will be lost at the calling location. So, in the above example, how can I change Line 2 to move object "b" into foo (is it by using std::move(b)?).
I have read move constructor is more efficient than copy constructor. How? I can think of only situation where we have memory on heap need not to be allocated again in case of move constructor. Does this statement hold true when we don't have any memory on heap?
Is it even more efficient than passing by reference (no, right?)?

First on your "understandings":
As far as I can see, they are right in principle, but you should be aware of copy elision, which can prevent the program from calling any copy/move constructor. This depends on your compiler (and its settings).
On your Questions:
Yes, you have to call foo(std::move(b)) to call a function which takes an rvalue with an lvalue. std::move will do the cast. Note: std::move itself does not move anything.
Using the move constructor "might" be more efficient. In truth, it only enables programmers to implement more efficient constructors. As an example, consider a vector which is a class around a pointer to an array that holds the data (similar to std::vector): if you copy it, you have to copy the data; if you move it, you can just pass the pointer along and set the old one to nullptr.
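The vector-like class described above can be sketched as follows. IntBuffer and the two helper functions are hypothetical names introduced for this illustration; assignment is deleted only to keep the sketch short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class wrapping a pointer to a heap-allocated array.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t n) : size_(n), data_(new int[n]()) {}
    ~IntBuffer() { delete[] data_; }

    // Copy: allocate a new array and copy every element, O(n).
    IntBuffer(const IntBuffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Move: take over the pointer and null out the source, O(1).
    IntBuffer(IntBuffer&& other) noexcept
        : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    IntBuffer& operator=(const IntBuffer&) = delete;
    IntBuffer& operator=(IntBuffer&&) = delete;

    const int* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};

bool move_steals_allocation() {
    IntBuffer source(1000);
    const int* original = source.data();
    IntBuffer target(std::move(source));
    assert(target.size() == 1000);
    return target.data() == original &&
           source.data() == nullptr && source.size() == 0;
}

bool copy_allocates_new_storage() {
    IntBuffer source(1000);
    IntBuffer target(source);
    return target.data() != source.data() && target.size() == source.size();
}
```

Both functions return true: the move hands over the existing allocation in constant time, while the copy has to allocate and fill a second array.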
But as I read in Effective Modern C++ by Scott Meyers: do not think your program will be faster only because you use std::move everywhere.
That depends on how the input is used. If you do not need a copy in the function, it will in most cases be more efficient to just pass the object by (const) reference. If you need a copy, there are several ways of doing it, for example the copy-and-swap idiom. But as a

Line 2 before C++11 would have called copy constructor and all those temporary copy stuff, but with move constructor defined, that would be called here.
Correct, except any decent optimizer would "elide" the copy, so that before C++11 the copy would have been avoided, and post C++11 the move would have been avoided. Same for line 3.
I know that once we move an object, its data will be lost at the calling location.
Depends on how the move constructor/assignment is implemented. If you don't know, this is what you must assume.
So, in the above example, how can I change Line 2 to move object "b" into foo (is it by using std::move(b)?).
Exactly. std::move casts the expression to an rvalue, and therefore the move constructor is invoked.
I have read move constructor is more efficient than copy constructor.
It can be, in some cases. For example the move constructor of std::vector is much faster than copy.
I can think of only situation where we have memory on heap need not to be allocated again in case of move constructor. Does this statement hold true when we don't have any memory on heap?
The statement isn't universally true, since for objects with a trivial copy constructor, the move constructor isn't any more efficient. But owning dynamic memory isn't strictly a requirement for a more efficient move. More generally, a move can be more efficient if the object owns any external resource, which could be dynamic memory, or it could be, for example, a reference counter or a file descriptor that must be released in the destructor and therefore re-acquired or re-calculated on copy, which can be avoided on move.
Is it even more efficient than passing by reference (no, right?)?
Indeed not. However, if you intend to move the object within the function where you pass it by reference, then you would have to pass a non-const reference and therefore not be able to pass temporaries.
In short: Reference is great for giving temporary access to an object that you keep, move is great for giving the ownership away.
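Which constructor runs for each kind of argument can be checked with a counting version of Base. This is a sketch assuming C++17, where the temporaries in cases 2 and 3 are guaranteed to be constructed directly in the parameter (earlier compilers typically elide the move as well, but are not required to). Only the by-value foo is kept to avoid the ambiguity mentioned in the question; Tally and tally_for are hypothetical names.

```cpp
#include <cassert>
#include <utility>

// Counting version of the Base class from the question.
struct Base {
    static int copies;
    static int moves;
    int value;

    Base() : value(0) {}
    Base(int t) : value(t) {}
    Base(const Base& b) : value(b.value) { ++copies; }
    Base(Base&& b) noexcept : value(b.value) { ++moves; }
};
int Base::copies = 0;
int Base::moves = 0;

void foo(Base) {}  // takes its argument by value

struct Tally { int copies; int moves; };

// Runs one of the calls discussed above and reports the constructor counts.
Tally tally_for(int which) {
    assert(which >= 1 && which <= 4);
    Base::copies = 0;
    Base::moves = 0;
    Base b(10);
    switch (which) {
        case 1: foo(b); break;             // lvalue: copy constructor
        case 2: foo(Base()); break;        // prvalue: built directly in the parameter
        case 3: foo(2); break;             // converted directly into the parameter
        case 4: foo(std::move(b)); break;  // xvalue: move constructor
    }
    return Tally{Base::copies, Base::moves};
}
```

Case 1 reports one copy, case 4 reports one move, and cases 2 and 3 report neither.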

Does D have a move constructor?

I am referencing this SO answer Does D have something akin to C++0x's move semantics?
Next, you can override C++'s constructor(constructor &&that) by defining this(Struct that). Likewise, you can override the assign with opAssign(Struct that). In both cases, you need to make sure that you destroy the values of that.
He gives an example like this:
// Move operations
this(UniquePtr!T that) {
this.ptr = that.ptr;
that.ptr = null;
}
Will the variable that always get moved? Or could it happen that the variable that could get copied in some situations?
It would be unfortunate if I would only null the ptr on a temporary copy.

Well, you can also take a look at this SO question:
Questions about postblit and move semantics
The way that a struct is copied in D is that its memory is blitted, and then if it has a postblit constructor, its postblit constructor is called. And if the compiler determines that a copy isn't actually necessary, then it will just not call the postblit constructor and will not call the destructor on the original object. So, it will have moved the object rather than copy it.
In particular, according to TDPL (p.251), the language guarantees that
All anonymous rvalues are moved, not copied. A call to this(this)
is never inserted when the source is an anonymous rvalue (i.e., a
temporary as featured in the function hun above).
All named temporaries that are stack-allocated inside a function and
then returned elide a call to this(this).
There is no guarantee that other potential elisions are observed.
So, in other cases, the compiler may or may not elide copies, depending on the current compiler implementation and optimization level (e.g. if you pass an lvalue to a function that takes it by value, and that variable is never referenced again after the function call).
So, if you have
void foo(Bar bar)
{}
then whether the argument to foo gets moved or not depends on whether it was an rvalue or an lvalue. If it's an rvalue, it will be moved, whereas if it's an lvalue, it probably won't be (but might depending on the calling code and the compiler).
So, if you have
void foo(UniquePtr!T ptr)
{}
ptr will be moved if foo was passed an rvalue and may or may not be moved if it's passed an lvalue (though generally not). So, what happens with the internals of UniquePtr depends on how you implemented it. If UniquePtr disabled the postblit constructor so that it can't be copied, then passing an rvalue will move the argument, and passing an lvalue will result in a compilation error (since the rvalue is guaranteed to be moved, whereas the lvalue is not).
Now, what you have is
this(UniquePtr!T that)
{
this.ptr = that.ptr;
that.ptr = null;
}
which appears to act like the current type has the same members as those of its argument. So, I assume that what you're actually trying to do here is a copy constructor / move constructor for UniquePtr and not a constructor for an arbitrary type that takes a UniquePtr!T. And if that's what you're doing, then you'd want a postblit constructor - this(this) - and not one that takes the same type as the struct itself (since D does not have copy constructors). So, if what you want is a copy constructor, then you do something like
this(this)
{
// Do any deep copying you want here. e.g.
arr = arr.dup;
}
But if a bitwise copy of your struct's elements works for your type, then you don't need a postblit constructor. But moving is built-in, so you don't need to declare a move constructor regardless (a move will just blit the struct's members). Rather, if what you want is to guarantee that the object is moved and never copied, then what you want to do is disable the struct's postblit constructor. e.g.
@disable this(this);
Then any and all times that you pass a UniquePtr!T anywhere, it's guaranteed to be a move or a compilation error. And while I would have thought that you might have to disable opAssign separately to disable assignment, from the looks of it (based on the code that I just tested), you don't even have to do that. Disabling the postblit constructor also disables the assignment operator. But if that weren't the case, then you'd just have to disable opAssign as well.

JMD's answer covers the theoretical part of move semantics; I can extend it with a very simplified example implementation:
struct UniquePtr(T)
{
private T* ptr;
@disable this(this);
UniquePtr release()
{
scope(exit) this.ptr = null;
return UniquePtr(this.ptr);
}
}
// some function that takes argument by value:
void foo ( UniquePtr!int ) { }
auto p = UniquePtr!int(new int);
// won't compile, postblit constructor is disabled
foo(p);
// ok, release() returns a new rvalue which is
// guaranteed to be moved without copying
foo(p.release());
// release also resets previous pointer:
assert(p.ptr is null);

I think I can answer it myself. Quoting "The D Programming Language":
All anonymous rvalues are moved, not copied. A call to this ( this ) is never inserted
when the source is an anonymous rvalue (i.e., a temporary as featured in
the function hun above).
If I understood it correctly, this means that this(Struct that) will never be a copy, because it only accepts rvalues in the first place.
