What does it take to use the move assignment operator of std::string (in VC11)?
I hoped it'd be used automatically as v isn't needed after the assignment anymore.
Is std::move required in this case? If so, I might as well use the non-C++11 swap.
#include <string>
struct user_t
{
void set_name(std::string v)
{
name_ = v;
// swap(name_, v);
// name_ = std::move(v);
}
std::string name_;
};
int main()
{
user_t u;
u.set_name("Olaf");
return 0;
}
I hoped it'd be used automatically as v isn't needed after the assignment anymore. Is std::move required in this case?
Moves must always be requested explicitly for lvalues, unless the lvalue is being returned (by value) from a function.
This prevents accidentally moving something. Remember: movement is a destructive act; you don't want it to just happen.
Also, it would be strange if the semantics of name_ = v; changed based on whether this was the last line in a function. After all, this is perfectly legal code:
name_ = v;
v[0] = 5; //Assuming v has at least one character.
Why should the first line execute a copy sometimes and a move other times?
If so, I might as well use the non-C++11 swap.
You can do as you like, but std::move is more obvious as to the intent. We know what it means and what you're doing with it.
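To make the difference between the two assignments visible, here is a minimal sketch. Tracked is a hypothetical stand-in for std::string that only counts which assignment operator ran, and set_name_copy / set_name_move are illustrative names, not part of the question's code.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for std::string: it only records which assignment ran.
struct Tracked
{
    int copy_assigns = 0;
    int move_assigns = 0;

    Tracked() = default;
    Tracked(const Tracked&) = default;
    Tracked(Tracked&&) = default;
    Tracked& operator=(const Tracked&) { ++copy_assigns; return *this; }
    Tracked& operator=(Tracked&&) { ++move_assigns; return *this; }
};

struct user_t
{
    // v has a name, so the expression `v` is an lvalue: copy assignment is selected.
    void set_name_copy(Tracked v) { name_ = v; }

    // std::move casts v to an rvalue: move assignment is selected.
    void set_name_move(Tracked v) { name_ = std::move(v); }

    Tracked name_;
};
```

Calling u.set_name_copy(Tracked()) bumps only copy_assigns on u.name_, while u.set_name_move(Tracked()) bumps only move_assigns, even though v is never used again in either function.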
The accepted answer is a good answer (and I've upvoted it). But I wanted to address this question in a little more detail:
The core of my question is: Why doesn't it pick the move assignment
operator automatically? The compiler knows v isn't used after the
assignment, doesn't it? Or does C++11 not require the compiler to be
that smart?
This possibility was looked at during the design of move semantics. At an extreme, you might want the compiler to do some static analysis and move from objects whenever possible:
void set_name(std::string v)
{
name_ = v; // move from v if it can be proven that some_event is false?
if (some_event)
f(v);
}
Ultimately, demanding this kind of analysis from the compiler is very tricky. Some compilers may be able to make the proof and others may not, which leads to code that isn't really portable.
Ok, so what about some simpler cases without if statements?
void foo()
{
X x;
Y y(x);
X x2 = x; // last use? move?
}
Well, it is difficult to know if y.~Y() will notice x has been moved from. And in general:
void foo()
{
X x;
// ...
// lots of code here
// ...
X x2 = x; // last use? move?
}
it is difficult for the compiler to analyze this to know if x is truly no longer used after the copy construction to x2.
So the original "move" proposal gave a rule for implicit moves that was really simple, and very conservative:
lvalues can only be implicitly moved from in cases where copy
elision is already permissible.
For example:
#include <cassert>
struct X
{
int i_;
X() : i_(1) {}
~X() {i_ = 0;}
};
struct Y
{
X* x_;
Y() : x_(0) {}
~Y() {assert(x_ != 0); assert(x_->i_ != 0);}
};
X foo(bool some_test)
{
Y y;
X x;
if (some_test)
{
X x2;
return x2;
}
y.x_ = &x;
return x;
}
int main()
{
X x = foo(false);
}
Here, by C++98/03 rules, this program may or may not assert, depending on whether or not copy elision at return x happens. If it does happen, the program runs fine. If it doesn't happen, the program asserts.
And so it was reasoned: When RVO is allowed, we are already in an area where there are no guarantees regarding the value of x. So we should be able to take advantage of this leeway and move from x. The risk looked small and the benefit looked huge. Not only would this mean that many existing programs would become much faster with a simple recompile, but it also meant that we could now return "move only" types from factory functions. This is a very large benefit to risk ratio.
Late in the standardization process, we got a little greedy and also said that implicit move happens when returning a by-value parameter (and the type matches the return type). The benefits seem relatively large here too, though the chance for code breakage is slightly larger since this is not a case where RVO was (or is) legal. But I don't have a demonstration of breaking code for this case.
So ultimately, the answer to your core question is that the original design of move semantics took a very conservative route with respect to breaking existing code. Had it not, it would surely have been shot down in committee. Late in the process, there were a few changes that made the design a bit more aggressive. But by this time the core proposal was firmly entrenched in the standard with a majority (but not unanimous) support.
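The two implicit-move cases described above can be sketched with a move-only type. The function names make_value and pass_through are made up for illustration; std::unique_ptr is used only because it cannot be copied, so the code compiles only if the returned lvalue is treated as an rvalue.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Case 1: returning a local variable. Copy elision is permitted here,
// so the implicit move rule applies and no std::move is needed.
std::unique_ptr<int> make_value(int n)
{
    std::unique_ptr<int> p(new int(n));
    return p;
}

// Case 2: returning a by-value parameter. Copy elision is not permitted
// for parameters, but the implicit move rule still applies.
std::unique_ptr<int> pass_through(std::unique_ptr<int> p)
{
    return p;
}
```

Note that the caller of pass_through still has to write std::move(a) to hand over a named unique_ptr: the implicit move only covers the return statement, not the call site.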
In your example, set_name takes the string by value. Inside set_name, however, v
is an lvalue. Let's treat these cases separately:
user_t u;
std::string str("Olaf"); // Creates string by copying a char const*.
u.set_name(std::move(str)); // Moves string.
Inside set_name you invoke the assignment operator of std::string,
which incurs an unnecessary copy. But there is also an rvalue
overload of operator=,
which makes more sense in your case:
void set_name(std::string v)
{
name_ = std::move(v);
}
This way, the only copying that takes place is the string construction
(std::string("Olaf")).
In C++11, we can write this code:
struct Cat {
Cat(){}
};
const Cat cat;
std::move(cat); //this is valid in C++11
when I call std::move, it means I want to move the object, i.e. I will change the object. To move a const object is unreasonable, so why does std::move not restrict this behaviour? It will be a trap in the future, right?
Here trap means as Brandon mentioned in the comment:
" I think he means it "traps" him sneaky sneaky because if he doesn't
realize, he ends up with a copy which is not what he intended."
In the book 'Effective Modern C++' by Scott Meyers, he gives an example:
class Annotation {
public:
explicit Annotation(const std::string text)
: value(std::move(text)) //here we want to call string(string&&),
//but because text is const,
//the return type of std::move(text) is const std::string&&
//so we actually called string(const string&)
//it is a bug which is very hard to find out
private:
std::string value;
};
If std::move was forbidden from operating on a const object, we could easily find out the bug, right?
There's a trick here you're overlooking, namely that std::move(cat) doesn't actually move anything. It merely tells the compiler to try to move. However, since your class has no constructor that accepts a const CAT&&, it will instead use the implicit const CAT& copy constructor, and safely copy. No danger, no trap. If the copy constructor is disabled for any reason, you'll get a compiler error.
#include <iostream>
#include <utility>
struct CAT
{
CAT(){}
CAT(const CAT&) {std::cout << "COPY";}
CAT(CAT&&) {std::cout << "MOVE";}
};
int main() {
const CAT cat;
CAT cat2 = std::move(cat);
}
prints COPY, not MOVE.
http://coliru.stacked-crooked.com/a/0dff72133dbf9d1f
Note that the bug in the code you mention is a performance issue, not a stability issue, so such a bug won't cause a crash, ever. It will just use a slower copy. Additionally, such a bug also occurs for non-const objects that don't have move constructors, so merely adding a const overload won't catch all of them. We could check for the ability to move construct or move assign from the parameter type, but that would interfere with generic template code that is supposed to fall back on the copy constructor.
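As a rough illustration of why the const case falls back to a copy, the following sketch checks the type that std::move produces and then records which constructor was picked. The Cat type with its two flags and the two helper functions are assumptions made for this example only.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical type that remembers how it was constructed.
struct Cat
{
    bool copied = false;
    bool moved = false;
    Cat() {}
    Cat(const Cat&) : copied(true) {}
    Cat(Cat&&) : moved(true) {}
};

// std::move is only a cast; it preserves const in the result type.
static_assert(std::is_same<decltype(std::move(std::declval<const Cat&>())),
                           const Cat&&>::value,
              "moving a const Cat yields const Cat&&");
static_assert(std::is_same<decltype(std::move(std::declval<Cat&>())),
                           Cat&&>::value,
              "moving a non-const Cat yields Cat&&");

// const Cat&& cannot bind to Cat(Cat&&), so Cat(const Cat&) is chosen.
inline bool const_source_is_copied()
{
    const Cat source;
    Cat target(std::move(source));
    return target.copied && !target.moved;
}

// Cat&& binds to Cat(Cat&&), so the move constructor is chosen.
inline bool non_const_source_is_moved()
{
    Cat source;
    Cat target(std::move(source));
    return target.moved && !target.copied;
}
```

Both helpers return true: the only thing that changes between them is the constness of the source, and overload resolution does the rest.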
And heck, maybe someone wants to be able to construct from const CAT&&, who am I to say he can't?
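#include <cstddef>
#include <utility>
struct strange {
mutable std::size_t count = 0;
strange() {} // user-provided, so a const strange can be default-initialized
strange( strange const&& o ):count(o.count) { o.count = 0; }
};
const strange s;
strange s2 = std::move(s);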
Here we see a use of std::move on a T const. It returns a T const&&. We have a move constructor for strange that takes exactly this type.
And it is called.
Now, it is true that this strange type is more rare than the bugs your proposal would fix.
But, on the other hand, the existing std::move works better in generic code, where you don't know if the type you are working with is a T or a T const.
One reason the rest of the answers have overlooked so far is the ability for generic code to be resilient in the face of move. For example, let's say that I wanted to write a generic function which moved all of the elements out of one kind of container to create another kind of container with the same values:
template <class C1, class C2>
C1
move_each(C2&& c2)
{
return C1(std::make_move_iterator(c2.begin()),
std::make_move_iterator(c2.end()));
}
Cool, now I can relatively efficiently create a vector<string> from a deque<string> and each individual string will be moved in the process.
But what if I want to move from a map?
int
main()
{
std::map<int, std::string> m{{1, "one"}, {2, "two"}, {3, "three"}};
auto v = move_each<std::vector<std::pair<int, std::string>>>(m);
for (auto const& p : v)
std::cout << "{" << p.first << ", " << p.second << "} ";
std::cout << '\n';
}
If std::move insisted on a non-const argument, the above instantiation of move_each would not compile because it is trying to move a const int (the key_type of the map). But this code doesn't care if it can't move the key_type. It wants to move the mapped_type (std::string) for performance reasons.
It is for this example, and countless other examples like it in generic coding that std::move is a request to move, not a demand to move.
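The map case boils down to moving a std::pair whose first member is const. A minimal sketch, assuming a hypothetical Payload type that records whether it was move-constructed (it stands in for std::string):

```cpp
#include <cassert>
#include <utility>

// Hypothetical mapped type that remembers whether it was move-constructed.
struct Payload
{
    bool moved_into = false;
    Payload() {}
    Payload(const Payload&) {}
    Payload(Payload&&) : moved_into(true) {}
};

// Same shape as std::map<int, Payload>::value_type: the key is const.
inline bool pair_with_const_key_still_moves_value()
{
    std::pair<const int, Payload> source(1, Payload());

    // The pair's move constructor works member by member: the const int
    // key is copied, the Payload is moved. Nothing fails to compile.
    std::pair<const int, Payload> target(std::move(source));

    return target.first == 1 && target.second.moved_into;
}
```

If std::move rejected const arguments, the member-wise "move" of the key would have to be special-cased in every piece of generic code like this.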
I have the same concern as the OP.
std::move does not move an object, nor does it guarantee that the object is movable. Then why is it called move?
I think being not movable can be one of following two scenarios:
1. The moving type is const.
The reason we have const keyword in the language is that we want the compiler to prevent any change to an object defined to be const. Given the example in Scott Meyers' book:
class Annotation {
public:
explicit Annotation(const std::string text)
: value(std::move(text)) // "move" text into value; this code
{ … } // doesn't do what it seems to!
…
private:
std::string value;
};
What does it literally mean? Move a const string into the value member. At least, that was my understanding before reading the explanation.
If the language intends std::move() neither to perform a move nor to guarantee that a move is applicable, then using the word move is literally misleading.
If the language encourages people to use std::move for better efficiency, it has to prevent traps like this as early as possible, especially for this kind of obvious literal contradiction.
I agree that people should be aware that moving a constant is impossible, but this obligation should not imply that the compiler can stay silent when an obvious contradiction happens.
2. The object has no move constructor
Personally, I think this is a separate story from OP's concern, as Chris Drew said
@hvd That seems like a bit of a non-argument to me. Just because OP's suggestion doesn't fix all bugs in the world doesn't necessarily mean it is a bad idea (it probably is, but not for the reason you give). – Chris Drew
I'm surprised nobody mentioned the backward compatibility aspect of this. I believe std::move was purposely designed to do this in C++11. Imagine you're working with a legacy codebase that relies heavily on C++98 libraries: without the fallback to copy assignment, moving would break things.
Fortunately you can use clang-tidy's check to find such issues:
https://clang.llvm.org/extra/clang-tidy/checks/performance/move-const-arg.html
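If a tool like clang-tidy is not available, one possible alternative is a stricter cast for non-generic code. The sketch below is an assumption for illustration, not a standard facility: strict_move is a made-up name, and as the other answers point out, it would be the wrong choice inside generic code that must fall back to copying.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: behaves like std::move, but refuses const arguments
// at compile time instead of silently degrading to a copy.
template <class T>
typename std::remove_reference<T>::type&& strict_move(T&& t) noexcept
{
    typedef typename std::remove_reference<T>::type value_type;
    static_assert(!std::is_const<value_type>::value,
                  "strict_move on a const object would copy, not move");
    return static_cast<value_type&&>(t);
}

// Sink-style function using the helper on a non-const parameter.
inline std::string take(std::string text)
{
    return std::string(strict_move(text));
}

// With `const std::string text` as the parameter, the call to strict_move
// above would trigger the static_assert instead of compiling to a copy.
```

With this helper, the Annotation constructor from the book would fail to compile as long as its parameter is declared const, which is the diagnostic the question asks for.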
I came across this answer to how to write C++ getters/setters and the author implies that when it comes to value-oriented properties, the setters in the standard library use the std::move() like this...
class Foo
{
X x_;
public:
X x() const { return x_; }
void x(X x) { x_ = std::move(x); }
};
(code taken directly from the aforementioned answer)
... to leverage the move operator, if it is specified, potentially improving performance.
That itself makes sense to me - the values are copied once when passed by value to the method, so there's no need to copy them a second time to the property if they can be moved instead. However, my knowledge of C++ isn't good enough to be certain that doing this is safe in all cases.
Does passing an argument by value always make a deep copy? Or does it depend on the object? Considering the std::move() supposedly "signals the compiler I don't care what happens to the moved object", this could have unexpected side effects if I intended the original object to stay.
Apologies if this is a well-known problem, I'm in the process of learning C++ and am really trying to get to the bottom of the language.
Yes, there is.
Receiving a parameter by value and moving it is okay if you always pass an rvalue to that parameter. Passing an lvalue is also okay, but it will be slower than receiving by const ref, especially in a loop.
Why? It seems that instead of making a copy you simply make a copy and then move, and the move is insignificant in terms of performance.
False.
This assumes that a copy assignment is as slow as a copy construction, which is false.
Consider this:
std::string str_target;
std::string long1 = "long_string_no_sso hello1234";
std::string long2 = "long_string_no_sso goobye123";
str_target = long1; // allocation to copy the buffer
str_target = long2; // no allocation, reuse space, simply copy bytes
This is why a setter function should, by default, receive by const ref, with an optional rvalue ref overload to optimize for rvalues:
class Foo
{
X x_;
public:
X x() const { return x_; }
// default setter, okay in most cases
void x(X const& x) { x_ = x; }
// optionally, define an overload that optimise for rvalues
void x(X&& x) noexcept { x_ = std::move(x); }
};
The only place where this does not apply is constructor parameters and other sink functions, since they always construct and there is no buffer to reuse:
struct my_type {
// No buffer to reuse, this->_str is a totally new object
explicit my_type(std::string str) noexcept : _str{std::move(str)} {}
private:
std::string _str;
};
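The cost difference between the two setter styles can be counted directly. This is a minimal sketch with assumed names: Counts and the global counts are bookkeeping invented for the example, and ByValue / Overloaded stand for the two versions of Foo discussed above.

```cpp
#include <cassert>
#include <utility>

// Hypothetical bookkeeping for the experiment; reset it between calls.
struct Counts { int copy_ctor = 0, move_ctor = 0, copy_assign = 0, move_assign = 0; };
static Counts counts;

struct X
{
    X() {}
    X(const X&) { ++counts.copy_ctor; }
    X(X&&) noexcept { ++counts.move_ctor; }
    X& operator=(const X&) { ++counts.copy_assign; return *this; }
    X& operator=(X&&) noexcept { ++counts.move_assign; return *this; }
};

// Setter taking by value: an lvalue argument costs a copy construction
// (always a fresh object) followed by a move assignment.
class ByValue
{
    X x_;
public:
    void x(X x) { x_ = std::move(x); }
};

// Overloaded setter: an lvalue argument costs a single copy assignment,
// which lets the target reuse whatever storage it already owns.
class Overloaded
{
    X x_;
public:
    void x(const X& x) { x_ = x; }
    void x(X&& x) noexcept { x_ = std::move(x); }
};
```

For a type like std::string the copy construction in ByValue is where the extra allocation comes from; the copy assignment in Overloaded may be able to reuse the existing buffer, as in the str_target example.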
Move semantics can be useful when the compiler cannot use RVO and NRVO. But in which case can't the compiler use these features?
The answer is that it is compiler and situation dependent. E.g. control flow branching might confuse optimizers. Wikipedia gives this example:
#include <string>
std::string f(bool cond = false) {
std::string first("first");
std::string second("second");
// the function may return one of two named objects
// depending on its argument. RVO might not be applied
return cond ? first : second;
}
int main() {
std::string result = f();
}
Well, it's not so much whether the compiler can use RVO, but whether it thereby can avoid a copy construction.
Consider:
struct Blah
{
int x;
Blah( int const _x ): x( _x ) { std::cout << "Hum de dum " << x << std::endl; }
};
Blah foo()
{
Blah const a( 1 );
if( fermatWasRight() ) { return Blah( 2 ); }
return a;
}
Getting the side effects (output from the constructor) right here is, at first glance, pretty incompatible with constructing a directly in storage provided by the caller. But if the compiler is smart enough then it can note that destroying this object is a null-operation. And more generally, for any particular situation, if the compiler is smart enough then maybe it can manage to avoid a copy operation no matter how sneakily we design the code.
I'm not sure of the formal rules, though, but the above, with more payload in the object so that copying would be more expensive, is one case where move semantics can help, so that the optimization is guaranteed no matter the smarts of the compiler (or lack of them).
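One detail of the Wikipedia example is worth spelling out: `cond ? first : second` is an lvalue expression rather than the name of a local, so it qualifies neither for NRVO nor for the implicit move on return. A small sketch, with Str as a hypothetical stand-in for std::string and Tally as made-up bookkeeping:

```cpp
#include <cassert>
#include <utility>

// Hypothetical bookkeeping; reset it before each experiment.
struct Tally { int copies = 0; int moves = 0; };
static Tally tally;

// Stand-in for std::string that counts copy and move constructions.
struct Str
{
    Str() {}
    Str(const Str&) { ++tally.copies; }
    Str(Str&&) noexcept { ++tally.moves; }
};

// The operand of return is a conditional expression, not a plain variable
// name, so the returned object is copy-constructed.
Str pick_ternary(bool cond)
{
    Str first, second;
    return cond ? first : second;
}

// Each return names a local directly. NRVO may not be possible with two
// candidates, but the fallback is a move, never a copy.
Str pick_branches(bool cond)
{
    Str first, second;
    if (cond) return first;
    return second;
}
```

So rewriting the ternary as two return statements is enough to let move semantics cover the case where the optimizer cannot elide.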
I've found recently that most of the errors in my C++ programs are of
a form like the following example:
#include <iostream>
class Z
{
public:
Z(int n) : n(n) {}
int n;
};
class Y
{
public:
Y(const Z& z) : z(z) {}
const Z& z;
};
class X
{
public:
X(const Y& y) : y(y) {}
Y y;
};
class Big
{
public:
Big()
{
for (int i = 0; i < 1000; ++i) { a[i] = i + 1000; }
}
int a[1000];
};
X get_x() { return X(Y(Z(123))); }
int main()
{
X x = get_x();
Big b;
std::cout << x.y.z.n << std::endl;
}
OUTPUT: 1000
I would expect this program to output 123 (the value of x.y.z.n set in
get_x()) but the creation of "Big b" overwrites the temporary Z. As a
result, the reference to the temporary Z in the object Y is now
overwritten with Big b, and hence the output is not what I would
expect.
When I compiled this program with gcc 4.5 with the option "-Wall", it
gave no warning.
The fix is obviously to remove the reference from the member Z in the
class Y. However, often class Y is part of a library which I have not
developed (boost::fusion most recently), and in addition the situation
is much more complicated than this example I've given.
Is there some sort of option to gcc, or any additional software that
would allow me to detect such issues preferably at compile time, but
even runtime would be better than nothing?
Thanks,
Clinton
I submitted such cases on the clang-dev mailing list a few months ago, but no one had the time to work on it back then (and neither did I, unfortunately).
Argyrios Kyrtzidis is currently working on it though, and here is his last update on the matter (30 Nov 23h04 GMT):
I reverted the previous commit, much
better fix in
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20101129/036875.html.
e.g. for
struct S { int x; };
int &get_ref() { S s; S &s2 = s; int &x2 = s2.x; return x2; }
we get
t3.cpp:9:10: warning: reference to stack memory associated with local variable 's' returned
return x2;
^~
t3.cpp:8:8: note: binding reference variable 'x2' here
int &x2 = s2.x;
^ ~~
t3.cpp:7:6: note: binding reference variable 's2' here
S &s2 = s;
^ ~
1 warning generated.
The previous attempt failed the self-hosting test, so I hope this attempt will pass. I'm pretty glad Argyrios is looking into it anyway :)
It isn't perfect yet, admittedly, as it's quite a complicated problem to tackle (reminds me of pointer aliasing in a way), but this is nonetheless a great step in the right direction.
Could you test your code against this version of Clang? I'm pretty sure Argyrios would appreciate the feedback (whether it is detected or not).
[Edited third bullet to demonstrate a technique that may help]
This is the rabbit hole you go down when a language permits passing arguments by value or reference with the same caller syntax. You have the following options:
Change the arguments to non-const references. A temporary value will not match a non-const reference type.
Drop the references altogether in cases where this is a possibility. If your const references don't indicate logically shared state between caller and callee (if they did this problem wouldn't occur very frequently), they were probably inserted in an attempt to avoid naive copying of complex types. Modern compilers have advanced copy elision optimizations that make pass-by-value as efficient as pass-by-reference in most cases; see http://cpp-next.com/archive/2009/08/want-speed-pass-by-value for a great explanation. Copy elision clearly won't be performed if you're passing the values on to external library functions that might modify the temporaries, but if that were the case then you're either not passing them in as const references or deliberately casting away the const-ness in the original version. This is my preferred solution as it lets the compiler worry about copy optimization and frees me to worry about other sources of error in the code.
If your compiler supports rvalue references, use them. If you can at least edit the parameter types of the functions where you worry about this problem, you can define a wrapper metaclass like so:
template < typename T > class need_ref {
T & ref_;
public:
need_ref(T &&x) { /* nothing */ }
need_ref(T &x) : ref_(x) { /* nothing */ }
operator T & () { return ref_; }
};
and then replace arguments of type T& with arguments of type need_ref<T>. For example, if you define the following
class user {
int &z;
public:
user(need_ref< int> arg) : z(arg) { /* nothing */ }
};
then you can safely initialize an object of type user with code of the form "int a = 1, b = 2; user ua(a);", but if you attempt to initialize as "user sum(a+b)" or "user five(5)" your compiler should generate an uninitialized reference error inside the first version of the need_ref() constructor. The technique is obviously not limited to constructors, and imposes no runtime overhead.
The problem here is the code
Y(const Z& z) : z(z) {}
as the member 'z' is bound to whatever object the formal parameter 'z' refers to. In get_x() that object is a temporary, so once the full expression ends the reference refers to an object which is no longer valid.
I don't think the compiler will or can in many cases detect such logical flaws. The fix then IMO is obviously to be aware of such classes and use them in a manner appropriate to their design. This should really be documented by the library vendor.
BTW, it is better to name the member 'Y::z' as 'Y::mz' if possible. The expression 'z(z)' is not very appealing.
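When the class can be edited and the compiler supports C++11, another option is to delete the rvalue overload of the constructor so that temporaries are rejected at compile time. This is only a sketch of that idea applied to the Z and Y classes from the question; it does not help when Y comes from a library you cannot change.

```cpp
#include <cassert>
#include <type_traits>

class Z
{
public:
    Z(int n) : n(n) {}
    int n;
};

class Y
{
public:
    Y(const Z& z) : z(z) {}   // stores a reference: the caller must keep z alive
    Y(const Z&&) = delete;    // a temporary Z prefers this overload and is rejected
    const Z& z;
};

// Named objects are still accepted; temporaries no longer are.
static_assert(std::is_constructible<Y, Z&>::value, "lvalue Z is accepted");
static_assert(!std::is_constructible<Y, Z>::value, "temporary Z is rejected");
```

With this change, Y(Z(123)) inside get_x() becomes a compile error instead of a dangling reference, while "Z z(123); Y y(z);" keeps working.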