I have a function:
std::string makeMeat() { return "Pork"; }
And somewhere in code I use it this way:
std::string meat = makeMeat();
I want to know the exact sequence of operations performed by this line of code, under two different assumptions:
std::string has no move constructor (just for example)
std::string has move constructor
I guess makeMeat() creates temporary object of class std::string.
std::string temp("Pork");
After that std::string meat object is created and initialized with copy constructor by data from temp object?
std::string meat(temp);
Finally temp object is destroyed?
I think it happens this way if there was no return value optimization.
What happens if it was?
The string is directly constructed in meat. No temporaries with a distinct lifetime exist. This is known as elision.
This behaviour is mandated under C++17, and in practice any reasonably modern production-quality compiler also does it in C++03, C++11 and C++14, unless pathological build flags are set.
In C++14 and earlier, the class must have a move or copy ctor for the above to happen, or you get a build break. No code in said constructors will run.
Ancient or toy compilers, or compilers with pathological flags telling them not to elide, may make up to 2 temporary objects and mess around with copies. This case isn't interesting, as pathological compiler states are equally free to implement a+=b; (with a and b integral types) as for (i from 0 to b)++a;! You should honestly consider lack of elision as equally pathological.
Elision in C++ refers to the standard-permitted merging of object lifetime and identity. So although in some sense 3 strings exist (the temporary within the function, the return value, and the value constructed from the return value), their identities are merged into one object with a unified lifetime.
You can test this using a custom structure:
struct S {
S (const char *);
S (S const&) = default;
S (S&&) = default;
virtual ~S();
};
S get_s () { return "S"; }
int main () {
S s = get_s();
}
Without any options, g++ elides most constructor calls, and this code is equivalent to:
S s("S");
So only the constructor from const char * is called.
Now, if you tell g++ not to elide constructors (-fno-elide-constructors, compiling for C++14 or earlier), there are three constructor calls and three destructor calls:
The first constructor call creates a temporary, S("S");
The second creates a temporary inside get_s, S(S&&);
Then the destructor of the first temporary is called;
Then the move constructor is called inside main;
Then the destructor of the temporary returned by get_s is called;
Then the destructor of s is called.
If S does not have a move constructor, you can simply replace move constructors by copy constructors in the above list.
UPDATE: To be even more explicit, and avoid misunderstandings: what I am asking is, in the case of returning a named value, does the C++17 standard GUARANTEE that the move constructor will be invoked if I do std::move on the return value? I understand that if I do not use std::move, compilers are allowed, but not required, to entirely elide the copy and move constructors and just construct the return value in the calling function directly. That is not what I want to do in my example; I want guarantees.
Consider
class A; // Class with heap-allocated memory and 'sane' move constructor/move assignment.
A a_factory(/* some args to construct an A object */)
{
// Code to process args to be able to build an A object.
A a(/* args */); // A named instance of A, so would require non-guaranteed NRVO.
return std::move(a);
}
void foo()
{
A result = a_factory();
}
In this scenario, does the C++ standard guarantee that no copying will take place when constructing the result object, i.e. do we have guaranteed move construction?
I do understand the drawbacks of explicit std::move on a return value, e.g. in cases where class A is unmovable, we cannot do late materialization of temporaries and get 0 copy even without a move constructor in the class. But my specific question is this - I come from a hard real-time background and the current status of NRVO not being guaranteed by the standard is less than ideal. I do know the 2 specific cases where C++17 made (non-named) RVO mandatory, but this is not my question.
emplace_back(...) was introduced with C++11 to prevent the creation of temporary objects. Now, with C++17, pure rvalues (prvalues) are even purer, so they no longer lead to the creation of temporaries (see this question for more). I still do not fully understand the consequences of these changes: do we still need emplace_back(...), or can we just go back to using push_back(...)?
Both push_back and emplace_back member functions create a new object of its value_type T at some place of the pre-allocated buffer. This is accomplished by the vector's allocator, which, by default, uses the placement new mechanism for this construction (placement new is basically just a way of constructing an object at a specified place in memory).
However:
emplace_back perfect-forwards its arguments to the constructor of T, thus the constructor that is the best match for these arguments is selected.
push_back(T&&) internally uses the move constructor (if it exists and does not throw) to initialize the new element. This call of move constructor cannot be elided and is always used.
Consider the following situation:
std::vector<std::string> v;
v.push_back(std::string("hello"));
Here the converting constructor first creates a temporary std::string from the string literal, and then std::string's move constructor is always called to initialize the vector's element from that temporary. In this case:
v.emplace_back("hello");
there is no move constructor called and the vector's element is initialized by std::string's converting constructor directly.
This does not necessarily mean that push_back is less efficient. Compiler optimizations might eliminate all the additional instructions, and both cases might end up producing exactly the same assembly code. It is just not guaranteed.
By the way, if push_back passed arguments by value — void push_back(T param); — then this would be a case for the application of copy elision. Namely, in:
v.push_back(std::string("hello"));
the parameter param would be constructed by a move-constructor from the temporary. This move-construction would be a candidate for copy elision. However, this approach would not at all change anything about the mandatory move-construction for vector's element inside push_back body.
You may see here: std::vector::push_back that this method requires either CopyInsertable or MoveInsertable; it also takes either const T& value or T&& value, so I don't see how elision could be of use here.
The new rules of mandatory copy elision are of use in the following example:
struct Data {
Data() {}
Data(const Data&) = delete;
Data(Data&&) = delete;
};
Data create() {
return Data{}; // error before c++17
}
void foo(Data) {}
int main()
{
Data pf = create();
foo(Data{}); // error before c++17
}
So you have a class which does not support copy/move operations, perhaps because they would be too expensive. The example above is a kind of factory function that always works. With the new rules you don't need to worry about whether the compiler will actually use elision, even if your class supports copy/move.
I don't see how the new rules would make push_back faster. emplace_back is still more efficient, not because of copy elision but because it constructs the object in place, forwarding its arguments to the constructor.
I'm reading on copy elision (and how it's supposed to be guaranteed in C++17) and this got me a bit confused (I'm not sure I know things I thought I knew before). So here's a minimal test case:
std::string nameof(int param)
{
switch (param)
{
case 1:
return "1"; // A
case 2:
return "2"; // B
}
return std::string(); // C
}
The way I see it, cases A and B perform a direct construction on the return value so copy elision has no meaning here, while case C cannot perform copy elision because there are multiple return paths. Are these assumptions correct?
Also, I'd like to know if
there's a better way of writing the above (e.g. have a std::string retval; and always return that one or write cases A and B as return string("1") etc)
there's any move happening, for example "1" is a temporary but I'm assuming it's being used as a parameter for the constructor of std::string
there are optimization concerns I omitted (e.g. I believe C could be written as return {}; would that be a better choice?)
To make it NRVO-friendly, you should always return the same object. The value of the object might be different, but the object should be the same.
However, following above rule makes program harder to read, and often one should opt for readability over unnoticeable performance improvement. Since std::string has a move constructor defined, the difference between moving a pointer and a length and not doing so would be so tiny that I see no way of actually noticing this in the application.
As for your last question, return std::string() and return {} would be exactly the same.
There are also some incorrect statements in your question. For example, "1" is not a temporary. It's a string literal. A temporary is created from this literal.
Last, but not least, mandatory C++17 copy elision does not apply here. It is reserved for cases like
std::string x = "X";
which before the mandatory requirement could generate code to create a temporary std::string and initialize x with copy (or move) constructor.
In all cases, the copy might or might not be elided. Consider:
std::string j = nameof(whatever);
This could be implemented one of two ways:
Only one std::string object is ever constructed, j. (The copy is elided.)
A temporary std::string object is constructed, its value is copied to j, then the temporary is destroyed. (The function returns a temporary that is copied.)
At first, the code looked like this:
#include "stdafx.h"
#include<iostream>
using namespace std;
class Test{
public:
explicit Test(int);
~Test();
//Test(Test&);
int varInt;
};
Test::Test(int temp){
varInt = temp;
cout << "call Test::constructor\n";
}
Test::~Test(){
cout << "call Test::destructor\n";
}
/*Test::Test(Test&temp){
varInt = temp.varInt;
cout << "call Test::copy constructor\n";
}*/
void func(Test temp){
cout << "call func\n";
}
int _tmain(int argc, _TCHAR* argv[])
{
func(Test(1));
return 0;
}
output:
call Test::constructor
call func
call Test::destructor
call Test::destructor
This confuses me, because only one object was created (as the argument of func), but two destructors were called after the function ends.
I started to wonder: is this because the default copy constructor was called? So I wrote the definition of the copy constructor, which made things even stranger.
After I added the commented-out code shown above, namely the definition of the copy constructor, to the class, the output became like this:
output:
call Test::constructor
call func
call Test::destructor
Things became just right now.
Can someone explain this phenomenon to me? Thank you very much.
Your interpretation of your original code (that the implicitly-declared copy constructor is being called) is correct.
Depending on the version of the standard that your compiler is implementing, it may actually be using the implicitly-declared move constructor instead. But this amounts to the same thing.
Your modified code (where you've explicitly provided a copy constructor) happens to be triggering the copy elision optimization, where the compiler just constructs the object in the desired location to begin with. This is one of the few situations where the standard specifically allows an optimization even though it affects the observable behavior of the program (since you can tell whether your copy constructor was called).
Nothing about your modified code requires copy elision, and nothing about your original code forbids it; the two versions just happen to differ in whether they trigger the optimization in your compiler under your current settings.
Note: the situation here changes a bit in C++17, where this optimization does become mandatory in some cases. See my above link for details.
Edited to add: Incidentally, in your version with an explicit copy constructor, your constructor is unusual in taking a non-constant reference. This actually means that it can't be used anyway, since a non-constant reference can't bind to the temporary Test(1). I think this oddness may have to do with why your compiler is performing copy elision. If you change the constructor to take a constant reference, as the implicitly-declared copy constructor would, you may see the behavior you were expecting, with your explicit copy constructor being called and the destructor being called twice. (But that's just speculation on my part; you'll have to try it and see!)
You have two objects of class Test. Since you pass arguments by value, one is constructed explicitly in the main function, another one is constructed with default copy constructor, as your copy constructor is commented out. Both objects get destructed. One on the exit from func(), another at the exit from main(). Hence two destructor calls.
If a function returns a value like this:
std::string foo() {
std::string ret {"Test"};
return ret;
}
The compiler is allowed to move ret, since it is not used anymore. This doesn't hold for cases like this:
void foo (std::string str) {
// do sth. with str
}
int main() {
std::string a {"Test"};
foo(a);
}
Although a is obviously not needed anymore, since it is destroyed in the next step, you have to write:
int main() {
std::string a {"Test"};
foo(std::move(a));
}
Why? In my opinion, this is unnecessarily complicated, since rvalues and move semantics are hard to understand, especially for beginners. So it would be great if you didn't have to care in standard cases but benefited from move semantics anyway (as with return values and temporaries). It is also annoying to have to look at the class definition to discover whether a class is move-enabled and benefits from std::move at all (or to use std::move anyway in the hope that it will sometimes be helpful). It is also error-prone if you work on existing code:
int main() {
std::string a {"Test"};
foo(std::move(a));
// [...] 100 lines of code
// new line:
foo(a); // Oops! a has already been moved from
}
The compiler knows better whether an object is no longer used. std::move everywhere is also verbose and reduces readability.
It is not obvious that an object is not going to be used after a given point.
For instance, have a look at the following variant of your code:
struct Bar {
~Bar() { std::cout << str.size() << std::endl; }
std::string& str;
};
Bar make_bar(std::string& str) {
return Bar{ str };
}
void foo (std::string str) {
// do sth. with str
}
int main() {
std::string a {"Test"};
Bar b = make_bar(a);
foo(std::move(a));
}
This code would misbehave: the move operation leaves the string a in a valid but unspecified state (typically empty), yet Bar still holds a reference to it and will read it when b is destroyed, which happens after the foo call.
If make_bar is defined in an external assembly (e.g. a DLL/so), the compiler has no way, when compiling Bar b = make_bar(a);, of telling if b is holding a reference to a or not. So, even if foo(a) is the last usage of a, that doesn't mean it's safe to use move semantics, because some other object might be holding a reference to a as a consequence of previous instructions.
Only you can know if you can use move semantics or not, by looking at the specifications of the functions you call.
On the other side, you can always use move semantics in the return case, because that object will go out of scope anyway, which means any object holding a reference to it will result in undefined behaviour regardless of the move semantics.
By the way, you don't even need move semantics there, because of copy elision.
It all comes down to what you mean by "destroyed". std::string's destructor has no special effect beyond deallocating the char array hidden inside it.
But what if my destructor DOES something special, for example some important logging? Then by simply "moving it because it's not needed anymore", I would miss whatever special behavior the destructor performs.
Because compilers cannot do optimizations that change behavior of the program except when allowed by the standard. return optimization is allowed in certain cases but this optimization is not allowed for method calls. By changing the behavior, it would skip calling copy constructor and destructor which can have side effects (they are not required to be pure) but by skipping them, these side effects won't happen and therefore the behavior would be changed.
(Note that this highly depends on what you try to pass and, in this case, STL implementation. In cases where all code is available at the time of compilation, the compiler may determine both copy constructor and destructor are pure and optimize them out.)
While the compiler is allowed to move ret in your first snippet, it might also do a copy/move elision and construct it directly into the stack of the caller.
This is why it is not recommended to write the function like this:
std::string foo() {
auto ret = std::string("Test");
return std::move(ret);
}
Now for the second snippet: your string a is an lvalue. Move semantics only apply to rvalues, which are obtained by returning a temporary, unnamed object or by casting an lvalue. The latter is exactly what std::move does.
std::string GetString();
auto s = GetString();
// s is an lvalue, so this call copies it; write foo(std::move(s)) to cast it to an rvalue and force move semantics
foo(s);
// GetString returns a temporary object, which is an rvalue, so move semantics apply automatically
foo(GetString());