Will compilers apply move semantics automatically in a setter method? - c++

I want to know if the compiler is allowed to automatically use the move constructor for wstring in the following setter method (without an explicit call to std::move):
void SetString(std::wstring str)
{
m_str = str; // Will str be moved into m_str automatically or is std::move(str) needed?
}
From what I've read it sounds as though the compiler is not allowed to make this decision since str is an lvalue, but it seems pretty obvious that using move here would not change program behavior.
Barring move, will some other sort of copy elision be applied?

[is] the compiler [...] allowed to automatically use the move constructor
Yes, it would be nice. But this is not only an optimization, this has real impact on the language.
Consider a move-only type like unique_ptr:
std::unique_ptr<int> f()
{
std::unique_ptr<int> up;
return up; // this is ok although unique_ptr is non-copyable.
}
Let's assume your rule would be included into the C++ standard, called the rule of "argument's last occurence".
void SetString(std::unique_ptr<int> data)
{
m_data = data; // this must be ok because this is "argument's last occurence"
}
Checking if an identifier is used in a return is easy. Checking if it is "argument's last occurence" isn't.
void SetString(std::unique_ptr<int> data)
{
if (condition) {
m_data = data; // this is argument's last occurence
} else {
data.foo();
m_data = data; // this is argument's last occurence too
}
// many lines of code without access to data
}
This is valid code too. So each compiler would be required to check for "argument's last occurence", wich isn't an easy thing. To do so, he would have to scan the whole function just to decide if the first line is valid. It is also difficult to reason about as a human if you have to scroll 2 pages down to check this.
No, the compiler isn't allowed to in C++11. And he probably won't be allowed in future standards because this feature is very difficult to implement in compilers in general, and it is just a convenience for the user.

no, move semantics will not be used here, since str can be used in the next code, in fact even if it was rvalue youd still have to std::move force it.. if you want to use move semantics I would advise getting wstring&& str to the function and then using move..

No, the complier is not allowed. Due to some reasons, not only because it is difficult to do. I think copy and move can have side effects and you need to know when you can expect each will be used. For example it is well know that returning a local object will move it - you expect that, it is documented, is OK.
So, we have the following possibilities:
Your example:
void SetString(std::wstring str)
{
m_str = str;
}
For r-values: One r-ref in str, plus a copy into m_str. For l-values: a copy in str an a copy in m_str.
We can do it “better” manually:
void SetString( std::wstring str)
{
m_str = std::move(str);
}
For r-values: One r-ref in str, plus a move into m_str. For l-values: a copy in str an a move in m_str.
If for some reason (you want it to compile without C++11 without changes but automatically take advantages of C++11 when porting the code?) you don’t want “manually optimize” the code you can do:
void SetString(const std::wstring& str)
{
m_str = str;
}
For r-values: One ref in str, plus a copy into m_str. For l-values: a ref in str an a copy in m_str. Never 2 copy.

Related

Should I make my local variables const or movable?

My default behaviour for any objects in local scopes is to make it const. E.g.:
auto const cake = bake_cake(arguments);
I try to have as little non-functional code as I can as this increases readability (and offers some optimisation opportunities for the compiler). So it is logical to also reflect this in the type system.
However, with move semantics, this creates the problem: what if my cake is hard or impossible to copy and I want to pass it out after I'm done with it? E.g.:
if (tastes_fine(cake)) {
return serve_dish(cake);
}
As I understand copy elision rules it's not guaranteed that the cake copy will be elided (but I'm not sure on this).
So, I'd have to move cake out:
return serve_dish(std::move(cake)); // this will not work as intended
But that std::move will do nothing useful, as it (correctly) will not cast Cake const& to Cake&&. Even though the lifetime of the object is very near its end. We cannot steal resources from something we promised not to change. But this will weaken const-correctness.
So, how can I have my cake and eat it too?
(i.e. how can I have const-correctness and also benefit from move semantics.)
I believe it's not possible to move from a const object, at least with a standard move constructor and non-mutable members. However, it is possible to have a const automatic local object and apply copy elision (namely NRVO) for it. In your case, you can rewrite your original function as follows:
Cake helper(arguments)
{
const auto cake = bake_cake(arguments);
... // original code with const cake
return cake; // NRVO
}
Then, in your original function, you can just call:
return serve_dish(helper(arguments));
Since the object returned by helper is already a non-const rvalue, it may be moved-from (which may be, again, elided, if applicable).
Here is a live-demo that demonstrates this approach. Note that there are no copy/move constructors called in the generated assembly.
You should indeed continue to make your variables const as that is good practice (called const-correctness) and it also helps when reasoning about code - even while creating it. A const object cannot be moved from - this is a good thing - if you move from an object you are almost always modifying it to a large degree or at least that is implied (since basically a move implies stealing the resources owned by the original object) !
From the core guidelines:
You can’t have a race condition on a constant. It is easier to reason
about a program when many of the objects cannot change their values.
Interfaces that promises “no change” of objects passed as arguments
greatly increase readability.
and in particular this guideline :
Con.4: Use const to define objects with values that do not change
after construction
Moving on to the next, main part of the question:
Is there a solution that does not exploit NRVO?
If by NRVO you take to include guaranteed copy elision, then not really, or yes and no at the same. This is somewhat complicated. Trying to move the return value out of a return by value function doesn't necessarily do what you think or want it to. Also, a "no copy" is always better than a move performance-wise. Therefore, instead you should try to let the compiler do it's magic and rely in particular on guaranteed copy elision (since you use c++17). If you have what I would call a complex scenario where elision is not possible: you can then use a move combined with guaranteed copy elision/NRVO, so as to avoid a full copy.
So the answer to that question is something like: if you object is already declared as const, then you can almost always rely on copy-elision/return by value directly, so use that. Otherwise you have some other scenario and then use discretion as to the best approach - in rare cases a move could be in order(meaning it's combined with copy-elision).
Example of 'complex' scenario:
std::string f() {
std::string res("res");
return res.insert(0, "more: ");//'complex scenario': a reference gets returned here will usually mean a copy is invoked here.
}
Superior way to 'fix' is to use copy-elision i.e.:
return res;//just return res as we already had that thus avoiding copy altogether - it's possible that we can't use this solution for more *hairy/complex* scenarios.
Inferior way to 'fix' in this example would be;
return std::move(res.insert(0, "more: "));
Make them movable if you can.
It's time to change your "default behaviour" as it's anachronistic.
If move semantics were built into the language from inception then making automatic variables const would have quickly become established as poor programming practice.
const was never intended to be used for micro-optimisations. Micro-optimisations are best left to the compiler. const exists primarily for member variables and member functions. It's also helped clean up the language a little: e.g. "foo" is a const char[4] type whereas in C it's a char[4] type with the curious understanding that you're not allowed to modify the contents.
Now (since C++11) const for automatic variables can actually be harmful as you observe, the time has come to stop this practice. The same can be said for const parameter by-value types. Your code would be less verbose too.
Personally I prefer immutable objects to const objects.
It seems to me, that if you want to move, than it will be "const correct" to not declare it const, because you will(!) change it.
It's ideological contradiction. You cannot move something and leave in place at the same time.
You mean, that object will be const for a part of time, in some scope. In this case, you can declare const reference to it, but it seems to me, that this will complicate the code and will add no safety.
Even vice versa, if you accidentally use the const reference to object after std::move() there will be problems, despite it will look like work with const object.
A limited workaround would be const move constructor:
class Cake
{
public:
Cake(/**/) : resource(acquire_resource()) {}
~Cake() { if (owning) release_resource(resource); }
Cake(const Cake& rhs) : resource(rhs.owning ? copy_resource(rhs.resource) : nullptr) {}
// Cake(Cake&& rhs) // not needed, but same as const version should be ok.
Cake(const Cake&& rhs) : resource(rhs.resource) { rhs.owning = false; }
Cake& operator=(const Cake& rhs) {
if (this == &rhs) return *this;
if (owning) release_resource(resource);
resource = rhs.owning ? copy_resource(rhs.resource) : nullptr;
owning = rhs.owning;
}
// Cake& operator=(Cake&& rhs) // not needed, but same as const version should be ok.
Cake& operator=(const Cake&& rhs) {
if (this == &rhs) return *this;
if (owning) release_resource(resource);
resource = rhs.resource;
owning = rhs.owning;
rhs.owning = false;
}
// ...
private:
Resource* resource = nullptr;
// ...
mutable bool owning = true;
};
Require extra mutable member.
not compatible with std containers which will do copy instead of move (providing non const version will leverage copy in non const usage)
usage after move should be considered (we should be in valid state, normally). Either provide owning getter, or "protect" appropriate methods with owning check.
I would personally just drop the const when move is used.

What is the most efficient way to set class variable using rvalue in c++?

I just started working with c++11 r-values. I read some tutorials, but I haven't found the answer.
What is the best way (the most efficient way) to set a class variable? Is below code correct or not? (let's assume std::string has defined move constructor and assignment operator).
class StringWrapper
{
private:
std::string str_;
public:
StringWrapper() : str_("") {}
void setString1(std::string&& str) {
str_ = std::move(str);
}
void setString2(const std::string& str) {
str_ = std::move(str);
}
// other possibility?
};
int main() {
std::string myText("text");
StringWrapper x1, x2;
x1.setString?("text"); // I guess here should be setString1
x2.setString?(myText); // I guess here should be setString2
}
I know that compiler can optimize my code and/or I can use overload functions. I'd like to only know what is the best way.
Herb Sutter's advice on this is to start with the standard C++98 approach:
void setString(const std::string& str) {
str_ = str;
}
And if you need to optimize for rvalues add an overload that takes an rvalue reference:
void setString(std::string&& str) noexcept {
str_ = std::move(str);
}
Note that most implementations of std::string use the small string optimization so that if your strings are small a move is the same as a copy anyway and you wouldn't get any benefit.
It is tempting to use pass-by-value and then move (as in Adam Hunyadi's answer) to avoid having to write multiple overloads. But Herb pointed out that it does not re-use any existing capacity of str_. If you call it multiple times with lvalues it will allocate a new string each time. If you have a const std::string& overload then it can re-use existing capacity and avoid allocations.
If you are really clever you can use a templated setter that uses perfect forwarding but to get it completely correct is actually quite complicated.
Compiler designers are clever folk. Use the crystal clear and therefore maintainable
void setString(const std::string& str) {
str_ = str;
}
and let the compiler worry about optimisations. Pretty please, with sugar on top.
Better still, don't masquerade code as being encapsulated. If you intend to provide such a method, then why not simply make str_ public? (Unless you intend to make other adjustments to your object if the member changes.)
Finally, why don't you like the default constructor of std::string? Ditch str_("").
The version with rvalue reference would not normally bind to an lvalue (in your case, mytext), you would have to move it, and therefore construct the object twice, leaving you with a dangerous object. A const lvalue reference should be slower when constructing from an rvalue, because it would do the same thing again: construct -> move -> move construct.
The compiler could possibly optimize the overhead away though.
Your best bet would actually be:
void setString(std::string str)
{
str_ = std::move(str);
}
The compiler here is suprisingly guaranteed to deduce the type of the argument and call the copy constructor for lvalues and the move constructor for rvalues.
Update:
Chris Dew pointed out that constructing and move assigning a string is actually more expensive than copy constructing. I am now convinced that using a const& argument is the better option. :D
You might probably use templatized setString and forwarding references:
class StringWrapper
{
private:
std::string str_;
public:
template<typename T>
void setString(T&& str) {
str_ = std::forward<T>(str);
}
};

What's happening in this return statement?

I'm reading on copy elision (and how it's supposed to be guaranteed in C++17) and this got me a bit confused (I'm not sure I know things I thought I knew before). So here's a minimal test case:
std::string nameof(int param)
{
switch (param)
{
case 1:
return "1"; // A
case 2:
return "2" // B
}
return std::string(); // C
}
The way I see it, cases A and B perform a direct construction on the return value so copy elision has no meaning here, while case C cannot perform copy elision because there are multiple return paths. Are these assumptions correct?
Also, I'd like to know if
there's a better way of writing the above (e.g. have a std::string retval; and always return that one or write cases A and B as return string("1") etc)
there's any move happening, for example "1" is a temporary but I'm assuming it's being used as a parameter for the constructor of std::string
there are optimization concerns I ommited (e.g. I believe C could be written as return{}, would that be a better choice?)
To make it NRVO-friendly, you should always return the same object. The value of the object might be different, but the object should be the same.
However, following above rule makes program harder to read, and often one should opt for readability over unnoticeable performance improvement. Since std::string has a move constructor defined, the difference between moving a pointer and a length and not doing so would be so tiny that I see no way of actually noticing this in the application.
As for your last question, return std::string() and return {} would be exactly the same.
There are also some incorrect statements in your question. For example, "1" is not a temporary. It's a string literal. Temporary is created from this literal.
Last, but not least, mandatory C++17 copy elision does not apply here. It is reserved for cases like
std::string x = "X";
which before the mandatory requirement could generate code to create a temporary std::string and initialize x with copy (or move) constructor.
In all cases, the copy might or might not be elided. Consider:
std::string j = nameof(whatever);
This could be implemented one of two ways:
Only one std::string object is ever constructed, j. (The copy is elided.)
A temporary std::string object is constructed, its value is copied to j, then the temporary is destroyed. (The function returns a temporary that is copied.)

Why not auto move if object is destroyed in next step?

If a function return a value like this:
std::string foo() {
std::string ret {"Test"};
return ret;
}
The compiler is allowed to move ret, since it is not used anymore. This doesn't hold for cases like this:
void foo (std::string str) {
// do sth. with str
}
int main() {
std::string a {"Test"};
foo(a);
}
Although a is obviously not needed anymore since it is destroyed in the next step you have to do:
int main() {
std::string a {"Test"};
foo(std::move(a));
}
Why? In my opinion, this is unnecessarily complicated, since rvalues and move semantic are hard to understand especially for beginners. So it would be great if you wouldn't have to care in standard cases but benefit from move semantic anyway (like with return values and temporaries). It is also annoying to have to look at the class definition to discover if a class is move-enabled and benefits from std::move at all (or use std::move anyway in the hope that it will sometimes be helpfull. It is also error-prone if you work on existing code:
int main() {
std::string a {"Test"};
foo(std::move(a));
// [...] 100 lines of code
// new line:
foo(a); // Ups!
}
The compiler knows better if an object is no longer used used. std::move everywhere is also verbose and reduces readability.
It is not obvious that an object is not going to be used after a given point.
For instance, have a look at the following variant of your code:
struct Bar {
~Bar() { std::cout << str.size() << std::endl; }
std::string& str;
}
Bar make_bar(std::string& str) {
return Bar{ str };
}
void foo (std::string str) {
// do sth. with str
}
int main() {
std::string a {"Test"};
Bar b = make_bar(a);
foo(std::move(a));
}
This code would break, because the string a is put in an invalid state by the move operation, but Bar is holding a reference to it, and will try to use it when it's destroyed, which happens after the foo call.
If make_bar is defined in an external assembly (e.g. a DLL/so), the compiler has no way, when compiling Bar b = make_bar(a);, of telling if b is holding a reference to a or not. So, even if foo(a) is the last usage of a, that doesn't mean it's safe to use move semantics, because some other object might be holding a reference to a as a consequence of previous instructions.
Only you can know if you can use move semantics or not, by looking at the specifications of the functions you call.
On the other side, you can always use move semantics in the return case, because that object will go out of scope anyway, which means any object holding a reference to it will result in undefined behaviour regardless of the move semantics.
By the way, you don't even need move semantics there, because of copy elision.
Its all sums up on what you define by "Destroyed"? std::string has no special effect for self-destroying but deallocating the char array which hides inside.
what if my destructor DOES something special? for example - doing some important logging? then by simply "moving it because it's not needed anymore" I miss some special behavior that the destructor might do.
Because compilers cannot do optimizations that change behavior of the program except when allowed by the standard. return optimization is allowed in certain cases but this optimization is not allowed for method calls. By changing the behavior, it would skip calling copy constructor and destructor which can have side effects (they are not required to be pure) but by skipping them, these side effects won't happen and therefore the behavior would be changed.
(Note that this highly depends on what you try to pass and, in this case, STL implementation. In cases where all code is available at the time of compilation, the compiler may determine both copy constructor and destructor are pure and optimize them out.)
While the compiler is allowed to move ret in your first snippet, it might also do a copy/move elision and construct it directly into the stack of the caller.
This is why it is not recommended to write the function like this:
std::string foo() {
auto ret = std::string("Test");
return std::move(ret);
}
Now for the second snippet, your string a is a lvalue. Move semantics only apply to rvalue-references, which obtained by returning a temporary, unnamed object, or casting a lvalue. The latter is exactly what std::move does.
std::string GetString();
auto s = GetString();
// s is a lvalue, use std::move to cast it to rvalue-ref to force move semantics
foo(s);
// GetString returns a temporary object, which is a rvalue-ref and move semantics apply automatically
foo(GetString());

Should I always move on `sink` constructor or setter arguments?

struct TestConstRef {
std::string str;
Test(const std::string& mStr) : str{mStr} { }
};
struct TestMove {
std::string str;
Test(std::string mStr) : str{std::move(mStr)} { }
};
After watching GoingNative 2013, I understood that sink arguments should always be passed by value and moved with std::move. Is TestMove::ctor the correct way of applying this idiom? Is there any case where TestConstRef::ctor is better/more efficient?
What about trivial setters? Should I use the following idiom or pass a const std::string&?
struct TestSetter {
std::string str;
void setStr(std::string mStr) { str = std::move(str); }
};
The simple answer is: yes.
The reason is quite simple as well, if you store by value you might either need to move (from a temporary) or make a copy (from a l-value). Let us examine what happens in both situations, with both ways.
From a temporary
if you take the argument by const-ref, the temporary is bound to the const-ref and cannot be moved from again, thus you end up making a (useless) copy.
if you take the argument by value, the value is initialized from the temporary (moving), and then you yourself move from the argument, thus no copy is made.
One limitation: a class without an efficient move-constructor (such as std::array<T, N>) because then you did two copies instead of one.
From a l-value (or const temporary, but who would do that...)
if you take the argument by const-ref, nothing happens there, and then you copy it (cannot move from it), thus a single copy is made.
if you take the argument by value, you copy it in the argument and then move from it, thus a single copy is made.
One limitation: the same... classes for which moving is akin to copying.
So, the simple answer is that in most cases, by using a sink you avoid unnecessary copies (replacing them by moves).
The single limitation is classes for which the move constructor is as expensive (or near as expensive) as the copy constructor; in which case having two moves instead of one copy is "worst". Thankfully, such classes are rare (arrays are one case).
A bit late, as this question already has an accepted answer, but anyways... here's an alternative:
struct Test {
std::string str;
Test(std::string&& mStr) : str{std::move(mStr)} { } // 1
Test(const std::string& mStr) : str{mStr} { } // 2
};
Why would that be better? Consider the two cases:
From a temporary (case // 1)
Only one move-constructor is called for str.
From an l-value (case // 2)
Only one copy-constructor is called for str.
It probably can't get any better than that.
But wait, there is more:
No additional code is generated on the caller's side! The calling of the copy- or move-constructor (which might be inlined or not) can now live in the implementation of the called function (here: Test::Test) and therefore only a single copy of that code is required. If you use by-value parameter passing, the caller is responsible for producing the object that is passed to the function. This might add up in large projects and I try to avoid it if possible.