I am currently learning more about all the C++11/14 features and wondering when to use std::move in function calls.
I know I should not use it when returning local variables, because this breaks return value optimisation, but I do not really understand where in function calls casting to a rvalue actually helps.
When a function accepts an rvalue reference, you have to provide an rvalue (either by already having a prvalue, or by using std::move to create an xvalue). E.g.
void foo(std::string&& s);
std::string s;
foo(s); // Compile-time error
foo(std::move(s)); // OK
foo(std::string{}); // OK
When a function accepts a value, you can use std::move to move-construct the function argument instead of copy-constructing. E.g.
void bar(std::string s);
std::string s;
bar(s); // Copies into `s`
bar(std::move(s)); // Moves into `s`
When a function accepts a forwarding reference, you can use std::move to allow the function to move the object further down the call stack. E.g.
template <typename T>
void pipe(T&& x)
{
sink(std::forward<T>(x));
}
std::string s;
pipe(s); // `std::forward` will do nothing
pipe(std::move(s)); // `std::forward` will move
pipe(std::string{}); // `std::forward` will move
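To make the forwarding case observable, here is a minimal sketch. The names relay and sink are hypothetical (relay plays the role of pipe above, renamed to avoid clashing with the POSIX function of that name), and the two sink overloads simply report which value category reached them.

```cpp
#include <string>
#include <utility>

// Hypothetical sinks: report which value category arrived.
inline const char* sink(const std::string&) { return "lvalue"; }
inline const char* sink(std::string&&)      { return "rvalue"; }

// Forwarding reference: T deduces to std::string& for lvalue arguments and
// to std::string for rvalue arguments, so std::forward restores the
// category the caller used.
template <typename T>
const char* relay(T&& x)
{
    return sink(std::forward<T>(x));
}
```

Calling relay(s) with a named string reaches the const& overload, while relay(std::move(s)) and relay(std::string{}) reach the && overload.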
When you have some substantial object that you're passing as an argument to a function (e.g. an API, or a container emplace operation), and you will no longer need it at the call site, you want to transfer ownership rather than copy it and then "immediately" lose the original. That's when you move it.
void StoreThing(std::vector<int> v);
int main()
{
std::vector<int> v{1,2,3,4,5,6/*,.....*/};
StoreThing(v);
}
// Copies `v` then lets it go out of scope. Pointless!
versus:
void StoreThing(std::vector<int> v);
int main()
{
std::vector<int> v{1,2,3,4,5,6/*,.....*/};
StoreThing(std::move(v));
}
// Better! We didn't need `v` in `main` any more...
This happens automatically when returning local variables, if RVO hasn't been applied (and note that eliding the copy of a returned prvalue is mandated since C++17, while NRVO for named locals remains optional; either way you're right to say that adding a "redundant" std::move in that case can actually be harmful).
Also it's pointless to std::move if you're passing something really small (particularly a non-class thing which cannot possibly have a move constructor, let alone a meaningful one!) or you know you're passing into a function that accepts its arguments const-ly; in that case it's up to you as to whether you want to save the added source code distraction of a std::move that won't do anything: on the surface it's wise, but in a template you may not be so sure.
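The "std::move that won't do anything" case can be sketched with a small probe type. Probe and was_moved are hypothetical names invented for this illustration; the point is only that std::move on a const object yields a const rvalue, which cannot bind to the move constructor and so falls back to a copy.

```cpp
#include <utility>

// Hypothetical probe type that remembers how it was constructed.
struct Probe {
    bool copied = false;
    bool moved = false;
    Probe() = default;
    Probe(const Probe&) : copied(true) {}
    Probe(Probe&&) noexcept : moved(true) {}
};

// Takes its argument by value and reports whether the parameter
// was move-constructed.
inline bool was_moved(Probe p) { return p.moved; }
```

With a non-const Probe a, was_moved(std::move(a)) returns true. With const Probe c{}, was_moved(std::move(c)) returns false, because the parameter is copy-constructed despite the std::move.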
I heard a recent talk by Herb Sutter who suggested that the reasons to pass std::vector and std::string by const & are largely gone. He suggested that writing a function such as the following is now preferable:
std::string do_something ( std::string inval )
{
std::string return_val;
// ... do stuff ...
return return_val;
}
I understand that the return_val will be an rvalue at the point the function returns and can therefore be returned using move semantics, which are very cheap. However, inval is still much larger than the size of a reference (which is usually implemented as a pointer). This is because a std::string has various components including a pointer into the heap and a member char[] for short string optimization. So it seems to me that passing by reference is still a good idea.
Can anyone explain why Herb might have said this?
The reason Herb said what he said is because of cases like this.
Let's say I have function A which calls function B, which calls function C. And A passes a string through B and into C. A does not know or care about C; all A knows about is B. That is, C is an implementation detail of B.
Let's say that A is defined as follows:
void A()
{
B("value");
}
If B and C take the string by const&, then it looks something like this:
void B(const std::string &str)
{
C(str);
}
void C(const std::string &str)
{
//Do something with `str`. Does not store it.
}
All well and good. You're just passing pointers around, no copying, no moving, everyone's happy. C takes a const& because it doesn't store the string. It simply uses it.
Now, I want to make one simple change: C needs to store the string somewhere.
void C(const std::string &str)
{
//Do something with `str`.
m_str = str;
}
Hello, copy constructor and potential memory allocation (ignore the Short String Optimization (SSO)). C++11's move semantics are supposed to make it possible to remove needless copy-constructing, right? And A passes a temporary; there's no reason why C should have to copy the data. It should just abscond with what was given to it.
Except it can't. Because it takes a const&.
If I change C to take its parameter by value, that just causes B to do the copy into that parameter; I gain nothing.
So if I had just passed str by value through all of the functions, relying on std::move to shuffle the data around, we wouldn't have this problem. If someone wants to hold on to it, they can. If they don't, oh well.
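As a rough sketch of that pass-by-value chain: Tracked below is a hypothetical stand-in for std::string that counts how it travelled, and Store is a hypothetical home for the m_str member from the example. Neither name comes from the original code.

```cpp
#include <utility>

// Hypothetical stand-in for std::string that records how it travelled.
struct Tracked {
    int copies = 0;
    int moves = 0;
    Tracked() = default;
    Tracked(const Tracked& o) : copies(o.copies + 1), moves(o.moves) {}
    Tracked(Tracked&& o) noexcept : copies(o.copies), moves(o.moves + 1) {}
    Tracked& operator=(const Tracked& o) { copies = o.copies + 1; moves = o.moves; return *this; }
    Tracked& operator=(Tracked&& o) noexcept { copies = o.copies; moves = o.moves + 1; return *this; }
};

// Hypothetical owner of the stored value.
struct Store { Tracked m_str; };

// C takes ownership of its parameter and stores it.
inline void C(Store& store, Tracked s) { store.m_str = std::move(s); }

// B just hands the value along.
inline void B(Store& store, Tracked s) { C(store, std::move(s)); }
```

Calling B(store, Tracked{}) stores the value after two moves and no copies (assuming the temporary is constructed directly in B's parameter, which C++17 guarantees). Calling B(store, named) with an lvalue costs one copy at the call site plus the same two moves.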
Is it more expensive? Yes; moving into a value is more expensive than using references. Is it less expensive than the copy? Not for small strings with SSO. Is it worth doing?
It depends on your use case. How much do you hate memory allocations?
Are the days of passing const std::string & as a parameter over?
No. Many people (including Dave Abrahams) take this advice beyond the domain it applies to and simplify it to cover all std::string parameters. Always passing std::string by value is not a "best practice" for arbitrary parameters and applications, because the optimizations these talks and articles focus on apply only to a restricted set of cases.
If you're returning a value, mutating the parameter, or taking the value, then passing by value could save expensive copying and offer syntactical convenience.
As ever, passing by const reference saves much copying when you don't need a copy.
Now to the specific example:
However inval is still quite a lot larger than the size of a reference (which is usually implemented as a pointer). This is because a std::string has various components including a pointer into the heap and a member char[] for short string optimization. So it seems to me that passing by reference is still a good idea. Can anyone explain why Herb might have said this?
If stack size is a concern (and assuming this is not inlined/optimized), return_val + inval > return_val -- IOW, peak stack usage can be reduced by passing by value here (note: oversimplification of ABIs). Meanwhile, passing by const reference can disable the optimizations. The primary reason here is not to avoid stack growth, but to ensure the optimization can be performed where it is applicable.
The days of passing by const reference aren't over -- the rules are just more complicated than they once were. If performance is important, you'll be wise to consider how you pass these types, based on the details you use in your implementations.
Short answer: NO! Long answer:
If you won't modify the string (treat it as read-only), pass it as a const reference (const std::string&). The referenced string obviously needs to stay in scope while the function that uses it executes.
If you plan to modify it, or you know it will go out of scope (threads), pass it by value; don't copy the const reference inside your function body.
There was a post on cpp-next.com called "Want speed, pass by value!". The TL;DR:
Guideline: Don’t copy your function arguments. Instead, pass them by value and let the compiler do the copying.
TRANSLATION of ^
Don’t copy your function arguments --- means: if you plan to modify the argument value by copying it to an internal variable, just use a value argument instead.
So, don't do this:
std::string function(const std::string& aString){
auto vString(aString);
vString.clear();
return vString;
}
do this:
std::string function(std::string aString){
aString.clear();
return aString;
}
When you need to modify the argument value in your function body.
You just need to be aware how you plan to use the argument in the function body. Read-only or NOT... and if it sticks within scope.
This highly depends on the compiler's implementation.
However, it also depends on what you use.
Let's consider the following functions:
bool foo1( const std::string v )
{
return v.empty();
}
bool foo2( const std::string & v )
{
return v.empty();
}
These functions are implemented in a separate compilation unit in order to avoid inlining. Then :
1. If you pass a literal to these two functions, you will not see much difference in performance. In both cases, a string object has to be created.
2. If you pass another std::string object, foo2 will outperform foo1, because foo1 will do a deep copy.
On my PC, using g++ 4.6.1, I got these results :
variable by reference: 1000000000 iterations -> time elapsed: 2.25912 sec
variable by value: 1000000000 iterations -> time elapsed: 27.2259 sec
literal by reference: 100000000 iterations -> time elapsed: 9.10319 sec
literal by value: 100000000 iterations -> time elapsed: 8.62659 sec
Unless you actually need a copy it's still reasonable to take const &. For example:
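// requires <algorithm>, <cctype> and <string>
bool isprint(std::string const &s) {
// std::isprint takes an int; going through unsigned char avoids undefined behaviour for negative char values
return std::all_of(s.begin(), s.end(), [](unsigned char c) { return std::isprint(c) != 0; });
}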
If you change this to take the string by value then you'll end up moving or copying the parameter, and there's no need for that. Not only is copy/move likely more expensive, but it also introduces a new potential failure; the copy/move could throw an exception (e.g., allocation during copy could fail) whereas taking a reference to an existing value can't.
If you do need a copy then passing and returning by value is usually (always?) the best option. In fact I generally wouldn't worry about it in C++03 unless you find that extra copies actually cause a performance problem. Copy elision seems pretty reliable on modern compilers. I think people's skepticism and insistence that you have to check your table of compiler support for RVO is mostly obsolete nowadays.
In short, C++11 doesn't really change anything in this regard except for people that didn't trust copy elision.
Almost.
In C++17, we have basic_string_view<?>, which brings us down to basically one narrow use case for std::string const& parameters.
The existence of move semantics has eliminated one use case for std::string const& -- if you are planning on storing the parameter, taking a std::string by value is more optimal, as you can move out of the parameter.
If someone called your function with a raw C "string" this means only one std::string buffer is ever allocated, as opposed to two in the std::string const& case.
However, if you don't intend to make a copy, taking by std::string const& is still useful in C++14.
With std::string_view, so long as you aren't passing said string to an API that expects C-style '\0'-terminated character buffers, you can more efficiently get std::string like functionality without risking any allocation. A raw C string can even be turned into a std::string_view without any allocation or character copying.
At that point, the use for std::string const& is when you aren't copying the data wholesale, and are going to pass it on to a C-style API that expects a null terminated buffer, and you need the higher level string functions that std::string provides. In practice, this is a rare set of requirements.
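A minimal sketch of the std::string_view case, assuming a C++17 compiler. count_char is a hypothetical helper written for this illustration; it accepts literals, std::string objects and sub-views without constructing a std::string.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: counts occurrences of c in text.
// The view is passed by value; it is just a pointer plus a length.
inline std::size_t count_char(std::string_view text, char c)
{
    std::size_t n = 0;
    for (char ch : text) {
        if (ch == c) ++n;
    }
    return n;
}
```

A call such as count_char("banana", 'a') builds the view straight from the literal, and count_char(s, 'l') with a std::string s uses the string's implicit conversion to std::string_view.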
std::string is not Plain Old Data(POD), and its raw size is not the most relevant thing ever. For example, if you pass in a string which is above the length of SSO and allocated on the heap, I would expect the copy constructor to not copy the SSO storage.
The reason this is recommended is because inval is constructed from the argument expression, and thus is always moved or copied as appropriate- there is no performance loss, assuming that you need ownership of the argument. If you don't, a const reference could still be the better way to go.
I've copy/pasted the answer from this question here, and changed the names and spelling to fit this question.
Here is code to measure what is being asked:
#include <iostream>
#include <utility> // std::move
struct string
{
string() {}
string(const string&) {std::cout << "string(const string&)\n";}
string& operator=(const string&) {std::cout << "string& operator=(const string&)\n";return *this;}
#if (__has_feature(cxx_rvalue_references))
string(string&&) {std::cout << "string(string&&)\n";}
string& operator=(string&&) {std::cout << "string& operator=(string&&)\n";return *this;}
#endif
};
#if PROCESS == 1
string
do_something(string inval)
{
// do stuff
return inval;
}
#elif PROCESS == 2
string
do_something(const string& inval)
{
string return_val = inval;
// do stuff
return return_val;
}
#if (__has_feature(cxx_rvalue_references))
string
do_something(string&& inval)
{
// do stuff
return std::move(inval);
}
#endif
#endif
string source() {return string();}
int main()
{
std::cout << "do_something with lvalue:\n\n";
string x;
string t = do_something(x);
#if (__has_feature(cxx_rvalue_references))
std::cout << "\ndo_something with xvalue:\n\n";
string u = do_something(std::move(x));
#endif
std::cout << "\ndo_something with prvalue:\n\n";
string v = do_something(source());
}
For me this outputs:
$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=1 test.cpp
$ a.out
do_something with lvalue:
string(const string&)
string(string&&)
do_something with xvalue:
string(string&&)
string(string&&)
do_something with prvalue:
string(string&&)
$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=2 test.cpp
$ a.out
do_something with lvalue:
string(const string&)
do_something with xvalue:
string(string&&)
do_something with prvalue:
string(string&&)
The table below summarizes my results (using clang -std=c++11). The first number is the number of copy constructions and the second number is the number of move constructions:
+----+--------+--------+---------+
| | lvalue | xvalue | prvalue |
+----+--------+--------+---------+
| p1 | 1/1 | 0/2 | 0/1 |
+----+--------+--------+---------+
| p2 | 1/0 | 0/1 | 0/1 |
+----+--------+--------+---------+
The pass-by-value solution requires only one overload but costs an extra move construction when passing lvalues and xvalues. This may or may not be acceptable for any given situation. Both solutions have advantages and disadvantages.
Herb Sutter is still on record, along with Bjarne Stroustrup, in recommending const std::string& as a parameter type; see https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rf-in .
There is a pitfall not mentioned in any of the other answers here: if you pass a string literal to a const std::string& parameter, it will pass a reference to a temporary string, created on-the-fly to hold the characters of the literal. If you then save that reference, it will be invalid once the temporary string is deallocated. To be safe, you must save a copy, not the reference. The problem stems from the fact that string literals are const char[N] types, requiring conversion to std::string.
The code below illustrates the pitfall and the workaround, along with a minor efficiency option -- overloading with a const char* method, as described at Is there a way to pass a string literal as reference in C++.
(Note: Sutter & Stroustrup advise that if you keep a copy of the string, also provide an overloaded function with a && parameter and std::move() it.)
#include <string>
#include <iostream>
class WidgetBadRef {
public:
WidgetBadRef(const std::string& s) : myStrRef(s) // copy the reference...
{}
const std::string& myStrRef; // might be a reference to a temporary (oops!)
};
class WidgetSafeCopy {
public:
WidgetSafeCopy(const std::string& s) : myStrCopy(s)
// constructor for string references; copy the string
{std::cout << "const std::string& constructor\n";}
WidgetSafeCopy(const char* cs) : myStrCopy(cs)
// constructor for string literals (and char arrays);
// for minor efficiency only;
// create the std::string directly from the chars
{std::cout << "const char * constructor\n";}
const std::string myStrCopy; // save a copy, not a reference!
};
int main() {
WidgetBadRef w1("First string");
WidgetSafeCopy w2("Second string"); // uses the const char* constructor, no temp string
WidgetSafeCopy w3(w2.myStrCopy); // uses the String reference constructor
std::cout << w1.myStrRef << "\n"; // garbage out
std::cout << w2.myStrCopy << "\n"; // OK
std::cout << w3.myStrCopy << "\n"; // OK
}
OUTPUT:
const char * constructor
const std::string& constructor
Second string
Second string
See Herb Sutter's "Back to the Basics! Essentials of Modern C++ Style". Among other topics, he reviews the parameter passing advice that’s been given in the past, and new ideas that come in with C++11 and specifically looks at the idea of passing strings by value.
The benchmarks show that passing std::strings by value, in cases where the function will copy it in anyway, can be significantly slower!
This is because you are forcing it to always make a full copy (and then move into place), while the const& version will update the old string which may reuse the already-allocated buffer.
See his slide 27: For “set” functions, option 1 is the same as it always was. Option 2 adds an overload for rvalue reference, but this gives a combinatorial explosion if there are multiple parameters.
It is only for “sink” parameters where a string must be created (not have its existing value changed) that the pass-by-value trick is valid. That is, constructors in which the parameter directly initializes the member of the matching type.
If you want to see how deep you can go in worrying about this, watch Nicolai Josuttis’s presentation and good luck with that (“Perfect — Done!” n times after finding fault with the previous version. Ever been there?)
This is also summarized as ⧺F.15 in the Standard Guidelines.
update
Generally, you want to declare "string" parameters as std::string_view (by value). This allows you to pass an existing std::string object as efficiently as with const std::string&, and also pass a lexical string literal (like "hello!") without copying it, and pass objects of type string_view which is necessary now that those are in the ecosystem too.
The exception is when the function needs an actual std::string instance, in order to pass to another function that's declared to take const std::string&.
IMO using the C++ reference for std::string is a quick and short local optimization, while using passing by value could be (or not) a better global optimization.
So the answer is: it depends on circumstances:
If you write all the code from the outside to the inside functions, you know what the code does, you can use the reference const std::string &.
If you write the library code or use heavily library code where strings are passed, you likely gain more in global sense by trusting std::string copy constructor behavior.
As @JDługosz points out in the comments, Herb gives other advice in another (later?) talk, see roughly from here: https://youtu.be/xnqTKD8uD64?t=54m50s.
His advice boils down to only using value parameters for a function f that takes so-called sink arguments, assuming you will move construct from these sink arguments.
This general approach only adds the overhead of a move constructor for both lvalue and rvalue arguments compared to an optimal implementation of f tailored to lvalue and rvalue arguments respectively. To see why this is the case, suppose f takes a value parameter, where T is some copy and move constructible type:
void f(T x) {
T y{std::move(x)};
}
Calling f with an lvalue argument will result in a copy constructor being called to construct x, and a move constructor being called to construct y. On the other hand, calling f with an rvalue argument will cause a move constructor to be called to construct x, and another move constructor to be called to construct y.
In general, the optimal implementation of f for lvalue arguments is as follows:
void f(const T& x) {
T y{x};
}
In this case, only one copy constructor is called to construct y. The optimal implementation of f for rvalue arguments is, again in general, as follows:
void f(T&& x) {
T y{std::move(x)};
}
In this case, only one move constructor is called to construct y.
So a sensible compromise is to take a value parameter and have one extra move constructor call for either lvalue or rvalue arguments with respect to the optimal implementation, which is also the advice given in Herb's talk.
As @JDługosz pointed out in the comments, passing by value only makes sense for functions that will construct some object from the sink argument. When you have a function f that copies its argument, the pass-by-value approach will have more overhead than a general pass-by-const-reference approach. The pass-by-value approach for a function f that retains a copy of its parameter will have the form:
void f(T x) {
T y{...};
...
y = std::move(x);
}
In this case, there is a copy construction and a move assignment for an lvalue argument, and a move construction and move assignment for an rvalue argument. The most optimal case for an lvalue argument is:
void f(const T& x) {
T y{...};
...
y = x;
}
This boils down to an assignment only, which is potentially much cheaper than the copy constructor plus move assignment required for the pass-by-value approach. The reason for this is that the assignment might reuse existing allocated memory in y, and therefore prevent (de)allocations, whereas the copy constructor will usually allocate memory.
For an rvalue argument the most optimal implementation for f that retains a copy has the form:
void f(T&& x) {
T y{...};
...
y = std::move(x);
}
So, only a move assignment in this case. Passing an rvalue to the version of f that takes a const reference only costs an assignment instead of a move assignment. So relatively speaking, the version of f taking a const reference in this case as the general implementation is preferable.
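The operation counts for the "retain a copy" case can be checked with a small sketch. Payload, Holder, Counts and counts() are hypothetical names introduced for this illustration; Payload merely counts which special member functions run.

```cpp
#include <utility>

// Hypothetical tally of special member function calls.
struct Counts { int copy_ctor = 0, copy_assign = 0, move_ctor = 0, move_assign = 0; };
inline Counts& counts() { static Counts c; return c; }

// Hypothetical payload type that records each operation.
struct Payload {
    Payload() = default;
    Payload(const Payload&) { ++counts().copy_ctor; }
    Payload(Payload&&) noexcept { ++counts().move_ctor; }
    Payload& operator=(const Payload&) { ++counts().copy_assign; return *this; }
    Payload& operator=(Payload&&) noexcept { ++counts().move_assign; return *this; }
};

// Holder keeps an existing y and overwrites it, like a "set" function.
struct Holder {
    Payload y;
    void set_by_value(Payload x)      { y = std::move(x); } // construct the parameter, then move-assign
    void set_by_ref(const Payload& x) { y = x; }            // a single copy assignment
};
```

For an lvalue argument, set_by_value records one copy construction and one move assignment, while set_by_ref records a single copy assignment. For a type like std::string, that copy assignment may be able to reuse the buffer y already owns.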
So in general, for the most optimal implementation, you will need to overload or do some kind of perfect forwarding as shown in the talk. The drawback is a combinatorial explosion in the number of overloads required, depending on the number of parameters for f in case you opt to overload on the value category of the argument. Perfect forwarding has the drawback that f becomes a template function, which prevents making it virtual, and results in significantly more complex code if you want to get it 100% right (see the talk for the gory details).
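A simplified sketch of the perfect-forwarding alternative, not the full version from the talk. Widget and set_name are hypothetical names, and the enable_if constraint is one possible way to keep the template from accepting unrelated types.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical class with a forwarding setter.
class Widget {
public:
    // Accepts anything assignable to std::string and forwards it with its
    // original value category: lvalues are copy-assigned, rvalues are
    // move-assigned, string literals are assigned directly.
    template <typename S,
              typename = typename std::enable_if<
                  std::is_assignable<std::string&, S&&>::value>::type>
    void set_name(S&& s) { name_ = std::forward<S>(s); }

    const std::string& name() const { return name_; }

private:
    std::string name_;
};
```

One template covers lvalue, rvalue and literal arguments, at the price of living in a header and not being usable as a virtual function.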
The problem is that "const" is a non-granular qualifier. What is usually meant by "const string ref" is "don't modify this string", not "don't modify the reference count". There is simply no way, in C++, to say which members are "const". They either all are, or none of them are.
In order to hack around this language issue, STL could allow "C()" in your example to make a move-semantic copy anyway, and dutifully ignore the "const" with regard to the reference count (mutable). As long as it was well-specified, this would be fine.
Since STL doesn't, I have a version of a string that const_casts<> away the reference counter (no way to retroactively make something mutable in a class hierarchy), and - lo and behold - you can freely pass cmstring's as const references, and make copies of them in deep functions, all day long, with no leaks or issues.
Since C++ offers no "derived class const granularity" here, writing up a good specification and making a shiny new "const movable string" (cmstring) object is the best solution I've seen.
If I have no use for a variable after I pass it to a function, does it matter whether I pass it a non-const lvalue reference or use std::move to pass it an rvalue reference? The assumption is that there are two different overloads. The only difference in the two cases is the lifetime of the passed object, which ends earlier if I pass by rvalue reference. Are there other factors to consider?
If I have a function foo overloaded like:
void foo(X& x);
void foo(X&& x);
X x;
foo(std::move(x)); // Does it matter if I called foo(x) instead?
// ... no further accesses to x
// end-of-scope
The lifetime of an object does not end when it is passed by rvalue reference. The rvalue reference merely gives foo permission to take ownership of its argument and potentially change its value to nonsense. This might involve deallocating its members, which is a kind of end of lifetime, but the argument itself lives to the end of the scope of its declaration.
Using std::move on the last access is idiomatic. There is no potential downside. Presumably if there are two overloads, the rvalue reference one has the same semantics but higher efficiency. Of course, they could do completely different things, just for the sake of insane sadism.
It depends on what you do in foo():
Inside foo(), if you store the argument in some internal storage, then yes it does matter, from readability point of view, because it is explicit at the call site that this particular argument is being moved and it should not be used here at call site, after the function call returns.
If you simply read/write its value, then it doesn't matter. Note that even if you pass by T&, the argument can still be moved to some internal storage, but that is less preferred approach — in fact it should be considered a dangerous approach.
Also note that std::move does NOT actually move the object. It simply makes the object moveable. An object is moved if it invokes the move-constructor or move-assignment:
void f(X && x) { return; }
void g(X x) { return; }
X x1,x2;
f(std::move(x1)); //x1 is NOT actually moved (no move constructor invocation).
g(std::move(x2)); //x2 is actually moved (by the move-constructor).
//here it is safe to use x1
//here it is unsafe to use x2
Alright it is more complex than this. Consider another example:
void f(X && x) { vec_storage.push_back(std::move(x)); return; }
void g(X x) { return; }
X x1,x2;
f(std::move(x1)); //x1 is actually moved (move-constructor invocation in push_back)
g(std::move(x2)); //x2 is actually moved (move-constructor invocation when passing argument by copy).
//here it is unsafe to use x1 and x2 both.
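The same distinction can be sketched with std::string. The function names inspect and keep are hypothetical: both take an rvalue reference, but only one of them actually moves from its argument.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Binds to an rvalue but never moves from it: the caller's object is untouched.
inline std::size_t inspect(std::string&& s) { return s.size(); }

// Actually moves: the new vector element is move-constructed from s.
inline void keep(std::vector<std::string>& storage, std::string&& s)
{
    storage.push_back(std::move(s));
}
```

After inspect(std::move(text)), text still holds its original value, because std::move is only a cast. After keep(storage, std::move(text)), the value lives in storage and text is left in a valid but unspecified state.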
Hope that helps.
Let's take the following method as an example:
void Asset::Load( const std::string& path )
{
// complicated method....
}
General use of this method would be as follows:
Asset exampleAsset;
exampleAsset.Load("image0.png");
Since we know most of the time the Path is a temporary rvalue, does it make sense to add an Rvalue version of this method? And if so, is this a correct implementation?
void Asset::Load( const std::string& path )
{
// complicated method....
}
void Asset::Load( std::string&& path )
{
Load(path); // call the above method
}
Is this a correct approach to writing rvalue versions of methods?
For your particular case, the second overload is useless.
With the original code, which has just one overload for Load, this function is called for lvalues and rvalues.
With the new code, the first overload is called for lvalues and the second is called for rvalues. However, the second overload calls the first one. At the end, the effect of calling one or the other implies that the same operation (whatever the first overload does) will be performed.
Therefore, the effects of the original code and the new code are the same but the first code is just simpler.
Deciding whether a function must take an argument by value, lvalue reference or rvalue reference depends very much on what it does. You should provide an overload taking rvalue references when you want to move the passed argument. There are several good references on move semantics out there, so I won't cover it here.
Bonus:
To help me make my point consider this simple probe class:
struct probe {
probe(const char* ) { std::cout << "ctr " << std::endl; }
probe(const probe& ) { std::cout << "copy" << std::endl; }
probe(probe&& ) { std::cout << "move" << std::endl; }
};
Now consider this function:
void f(const probe& p) {
probe q(p);
// use q;
}
Calling f("foo"); produces the following output:
ctr
copy
No surprises here: we create a temporary probe passing the const char* "foo". Hence the first output line. Then, this temporary is bound to p and a copy q of p is created inside f. Hence the second output line.
Now, consider taking p by value, that is, change f to:
void f(probe p) {
// use p;
}
The output of f("foo"); is now
ctr
Some will be surprised that in this case there's no copy! In general, if you take an argument by reference and copy it inside your function, then it's better to take the argument by value. In this case, instead of creating a temporary and copying it, the compiler can construct the argument (p in this case) directly from the input ("foo"). For more information, see Want Speed? Pass by Value. by Dave Abrahams.
There are two notable exceptions to this guideline: constructors and assignment operators.
Consider this class:
struct foo {
probe p;
foo(const probe& q) : p(q) { }
};
The constructor takes a probe by const reference and then copies it to p. In this case, following the guideline above doesn't bring any performance improvement and probe's copy constructor will be called anyway. However, taking q by value might create an overload resolution issue similar to the one with assignment operator that I shall cover now.
Suppose that our class probe has a non-throwing swap method. Then the suggested implementation of its assignment operator (thinking in C++03 terms for the time being) is
probe& operator =(const probe& other) {
probe tmp(other);
swap(tmp);
return *this;
}
Then, according to the guideline above, it's better to write it like this
probe& operator =(probe tmp) {
swap(tmp);
return *this;
}
Now enter C++11 with rvalue references and move semantics. You decided to add a move assignment operator:
probe& operator =(probe&&);
Now calling the assignment operator on a temporary creates an ambiguity because both overloads are viable and none is preferred over the other. To resolve this issue, use the original implementation of the assignment operator (taking the argument by const reference).
Actually, this issue is not particular to constructors and assignment operators and might happen with any function. (It's more likely that you will experience it with constructors and assignment operators though.) For instance, calling g("foo"); when g has the following two overloads raises the ambiguity:
void g(probe);
void g(probe&&);
Unless you're doing something other than calling the lvalue reference version of Load, you don't need the second function, as an rvalue will bind to a const lvalue reference.
Since we know most of the time the Path is a temporary rvalue, does it make sense to add an Rvalue version of this method?
Probably not... Unless you need to do something tricky inside Load() that requires a non-const parameter. For example, maybe you want to std::move(Path) into another thread. In that case it might make sense to use move semantics.
Is this a correct approach to writing rvalue versions of methods?
No, you should do it the other way around:
void Asset::load( const std::string& path )
{
auto path_copy = path;
load(std::move(path_copy)); // call the below method
}
void Asset::load( std::string&& path )
{
// complicated method....
}
It's generally a question of whether internally you will make a copy (explicitly, or implicitly) of the incoming object (provide T&& argument), or you will just use it (stick to [const] T&).
If your Load member function doesn't assign from the incoming string, you should simply provide void Asset::Load(const std::string& Path).
If you do assign from the incoming path, say to a member variable, then there's a scenario where it could be slightly more efficient to provide void Asset::Load(std::string&& Path) too, but you'd need a different implementation that assigns ala loaded_from_path_ = std::move(Path);.
The potential benefit is to the caller, in that with the && version they might receive the free-store region that had been owned by the member variable, avoiding a pessimistic delete[]ion of that buffer inside void Asset::Load(const std::string& Path) and possible re-allocation next time the caller's string is assigned to (assuming the buffer's large enough to fit its next value too).
In your stated scenario, you're usually passing in string literals; such callers will get no benefit from any && overload, as there's no caller-owned std::string instance to receive the existing data member's buffer.
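A sketch of the two-overload variant described above, assuming the member is called `loaded_from_path_` as in the answer (the `path()` accessor is hypothetical). Whether the caller's string really ends up with the member's old buffer after the move depends on the standard library implementation, so the code assumes nothing about the moved-from value.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Asset {
public:
    // Copy-assigns; may reuse the member's existing buffer.
    void Load(const std::string& path) { loaded_from_path_ = path; }

    // Move-assigns; typically takes over the argument's buffer instead of copying.
    void Load(std::string&& path) { loaded_from_path_ = std::move(path); }

    const std::string& path() const { return loaded_from_path_; }

private:
    std::string loaded_from_path_;
};

void asset_demo() {
    Asset a;
    std::string p = "assets/level1.dat";
    a.Load(p);                    // const& overload: p is unchanged
    assert(p == "assets/level1.dat");
    a.Load(std::move(p));         // && overload: p is left valid but unspecified
    assert(a.path() == "assets/level1.dat");
    a.Load("assets/level2.dat");  // literal: the temporary binds to &&
    assert(a.path() == "assets/level2.dat");
}
```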
Here's what I do when trying to decide on the function signature
(1) (const std::string& const_lvalue): the argument is read-only.
(2) (std::string& lvalue): I can modify the argument (usually to put something in), and the change is VISIBLE to the caller.
(3) (std::string&& rvalue): I can modify the argument (usually to steal something from it) with zero consequences, since the caller will no longer see or use it (consider it destroyed once the function returns). An rvalue reference binds to a temporary object, or to an object the caller has explicitly passed through std::move.
All three of them are "pass-by-reference", but they show different intentions. (2) and (3) are similar: both can modify the argument, but (2) wants the modification to be seen by the caller whereas (3) doesn't.
// (2) the caller sees the changed argument
void ModifyInPlace(Foo& lvalue){
    delete lvalue.data_pointer;
    lvalue.data_pointer = nullptr;
}
// (3) move constructor; the caller ignores the change to the argument
Foo(Foo&& rvalue)
{
    this->data_pointer = rvalue.data_pointer;
    rvalue.data_pointer = nullptr;
}
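The three intentions can also be sketched with plain std::string parameters. The function names below are hypothetical and chosen only to make each role obvious.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// (1) Read-only: the function only inspects the argument.
std::size_t count_spaces(const std::string& text) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') ++n;
    }
    return n;
}

// (2) In/out: the caller is meant to see the modification.
void append_newline(std::string& text) { text += '\n'; }

// (3) Sink: the function may take the contents; the caller gives them up.
std::string take(std::string&& text) { return std::move(text); }

void signature_demo() {
    std::string s = "a b c";
    assert(count_spaces(s) == 2);
    append_newline(s);
    assert(s == "a b c\n");
    std::string t = take(std::move(s));  // s is valid but unspecified afterwards
    assert(t == "a b c\n");
}
```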
I heard a recent talk by Herb Sutter who suggested that the reasons to pass std::vector and std::string by const & are largely gone. He suggested that writing a function such as the following is now preferable:
std::string do_something ( std::string inval )
{
std::string return_val;
// ... do stuff ...
return return_val;
}
I understand that the return_val will be an rvalue at the point the function returns and can therefore be returned using move semantics, which are very cheap. However, inval is still much larger than the size of a reference (which is usually implemented as a pointer). This is because a std::string has various components including a pointer into the heap and a member char[] for short string optimization. So it seems to me that passing by reference is still a good idea.
Can anyone explain why Herb might have said this?
The reason Herb said what he said is because of cases like this.
Let's say I have function A which calls function B, which calls function C. And A passes a string through B and into C. A does not know or care about C; all A knows about is B. That is, C is an implementation detail of B.
Let's say that A is defined as follows:
void A()
{
B("value");
}
If B and C take the string by const&, then it looks something like this:
void B(const std::string &str)
{
C(str);
}
void C(const std::string &str)
{
//Do something with `str`. Does not store it.
}
All well and good. You're just passing pointers around, no copying, no moving, everyone's happy. C takes a const& because it doesn't store the string. It simply uses it.
Now, I want to make one simple change: C needs to store the string somewhere.
void C(const std::string &str)
{
//Do something with `str`.
m_str = str;
}
Hello, copy constructor and potential memory allocation (ignore the Short String Optimization (SSO)). C++11's move semantics are supposed to make it possible to remove needless copy-constructing, right? And A passes a temporary; there's no reason why C should have to copy the data. It should just abscond with what was given to it.
Except it can't. Because it takes a const&.
If I change C to take its parameter by value, that just causes B to do the copy into that parameter; I gain nothing.
So if I had just passed str by value through all of the functions, relying on std::move to shuffle the data around, we wouldn't have this problem. If someone wants to hold on to it, they can. If they don't, oh well.
Is it more expensive? Yes; moving into a value is more expensive than using references. Is it less expensive than the copy? Not for small strings with SSO. Is it worth doing?
It depends on your use case. How much do you hate memory allocations?
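The by-value chain can be made concrete with a type that counts its copies and moves. This is only a sketch: `Tracked` is a hypothetical stand-in for std::string, and the global `m_str` stands in for wherever C stores its data.

```cpp
#include <cassert>
#include <utility>

// Hypothetical string-like type that counts copies and moves.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
    Tracked& operator=(const Tracked&) { ++copies; return *this; }
    Tracked& operator=(Tracked&&) noexcept { ++moves; return *this; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

Tracked m_str;  // stands in for the storage C writes to

void C(Tracked str) { m_str = std::move(str); }  // stores by moving
void B(Tracked str) { C(std::move(str)); }       // passes along by moving
void A() { B(Tracked{}); }                       // passes a temporary

void chain_demo() {
    A();
    // The temporary reaches m_str through moves only.
    assert(Tracked::copies == 0);
}
```

If B is instead called with an lvalue, exactly one copy is made (into B's parameter) and everything after that is a move, which is the same number of copies the const& chain needs once C stores the string.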
Are the days of passing const std::string & as a parameter over?
No. Many people (including Dave Abrahams) take this advice beyond the domain it applies to and simplify it to cover all std::string parameters. Always passing std::string by value is not a "best practice" for any and all arbitrary parameters and applications, because the optimizations these talks and articles focus on apply only to a restricted set of cases.
If you're returning a value, mutating the parameter, or taking the value, then passing by value could save expensive copying and offer syntactical convenience.
As ever, passing by const reference saves much copying when you don't need a copy.
Now to the specific example:
However inval is still quite a lot larger than the size of a reference (which is usually implemented as a pointer). This is because a std::string has various components including a pointer into the heap and a member char[] for short string optimization. So it seems to me that passing by reference is still a good idea. Can anyone explain why Herb might have said this?
If stack size is a concern (and assuming this is not inlined/optimized), return_val + inval > return_val -- IOW, peak stack usage can be reduced by passing by value here (note: oversimplification of ABIs). Meanwhile, passing by const reference can disable the optimizations. The primary reason here is not to avoid stack growth, but to ensure the optimization can be performed where it is applicable.
The days of passing by const reference aren't over; the rules are just more complicated than they once were. If performance is important, you'll be wise to consider how you pass these types, based on the details you use in your implementations.
Short answer: NO! Long answer:
If you won't modify the string (treat it as read-only), pass it as a const reference. (The referenced string obviously needs to stay alive while the function that uses it executes.)
If you plan to modify it, or you know it will go out of scope (threads), pass it by value; don't copy the const reference inside your function body.
There was a post on cpp-next.com called "Want speed, pass by value!". The TL;DR:
Guideline: Don’t copy your function arguments. Instead, pass them by value and let the compiler do the copying.
TRANSLATION of ^
Don’t copy your function arguments --- means: if you plan to modify the argument value by copying it to an internal variable, just use a value argument instead.
So, don't do this:
std::string function(const std::string& aString){
auto vString(aString);
vString.clear();
return vString;
}
do this:
std::string function(std::string aString){
aString.clear();
return aString;
}
When you need to modify the argument value in your function body.
You just need to be aware how you plan to use the argument in the function body. Read-only or NOT... and if it sticks within scope.
This highly depends on the compiler's implementation.
However, it also depends on what you use.
Let's consider the following functions:
bool foo1( const std::string v )
{
return v.empty();
}
bool foo2( const std::string & v )
{
return v.empty();
}
These functions are implemented in a separate compilation unit in order to avoid inlining. Then :
1. If you pass a literal to these two functions, you will not see much difference in performance. In both cases, a string object has to be created.
2. If you pass another std::string object, foo2 will outperform foo1, because foo1 will do a deep copy.
On my PC, using g++ 4.6.1, I got these results :
variable by reference: 1000000000 iterations -> time elapsed: 2.25912 sec
variable by value: 1000000000 iterations -> time elapsed: 27.2259 sec
literal by reference: 100000000 iterations -> time elapsed: 9.10319 sec
literal by value: 100000000 iterations -> time elapsed: 8.62659 sec
Unless you actually need a copy it's still reasonable to take const &. For example:
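// Needs <algorithm>, <cctype>, and <string>.
bool isprint(std::string const &s) {
    // Convert to unsigned char first: std::isprint is undefined for negative values.
    return std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isprint(c) != 0; });
}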
If you change this to take the string by value then you'll end up moving or copying the parameter, and there's no need for that. Not only is copy/move likely more expensive, but it also introduces a new potential failure; the copy/move could throw an exception (e.g., allocation during copy could fail) whereas taking a reference to an existing value can't.
If you do need a copy then passing and returning by value is usually (always?) the best option. In fact I generally wouldn't worry about it in C++03 unless you find that extra copies actually cause a performance problem. Copy elision seems pretty reliable on modern compilers. I think people's skepticism and insistence that you have to check your table of compiler support for RVO is mostly obsolete nowadays.
In short, C++11 doesn't really change anything in this regard except for people that didn't trust copy elision.
Almost.
In C++17, we have basic_string_view<?>, which brings us down to basically one narrow use case for std::string const& parameters.
The existence of move semantics has eliminated one use case for std::string const& -- if you are planning on storing the parameter, taking a std::string by value is more optimal, as you can move out of the parameter.
If someone called your function with a raw C "string" this means only one std::string buffer is ever allocated, as opposed to two in the std::string const& case.
However, if you don't intend to make a copy, taking by std::string const& is still useful in C++14.
With std::string_view, so long as you aren't passing said string to an API that expects C-style '\0'-terminated character buffers, you can more efficiently get std::string like functionality without risking any allocation. A raw C string can even be turned into a std::string_view without any allocation or character copying.
At that point, the use for std::string const& is when you aren't copying the data wholesale, and are going to pass it on to a C-style API that expects a null terminated buffer, and you need the higher level string functions that std::string provides. In practice, this is a rare set of requirements.
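A minimal sketch of a read-only parameter taken as std::string_view, assuming a C++17 compiler. The function name `count_char` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Read-only parameter taken as std::string_view, by value.
std::size_t count_char(std::string_view text, char wanted) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == wanted) ++n;
    }
    return n;
}

void view_demo() {
    std::string owned = "banana";
    assert(count_char(owned, 'a') == 3);      // from a std::string, no copy
    assert(count_char("bandana", 'n') == 2);  // from a literal, no std::string is built
    // A sub-range can be inspected without allocating: this views "nana".
    assert(count_char(std::string_view(owned).substr(2), 'a') == 2);
}
```

Note that a string_view is not guaranteed to be null-terminated, which is why it does not suit C-style APIs that expect a terminated buffer.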
std::string is not Plain Old Data(POD), and its raw size is not the most relevant thing ever. For example, if you pass in a string which is above the length of SSO and allocated on the heap, I would expect the copy constructor to not copy the SSO storage.
The reason this is recommended is because inval is constructed from the argument expression, and thus is always moved or copied as appropriate- there is no performance loss, assuming that you need ownership of the argument. If you don't, a const reference could still be the better way to go.
I've copy/pasted the answer from this question here, and changed the names and spelling to fit this question.
Here is code to measure what is being asked:
#include <iostream>
#include <utility> // for std::move
struct string
{
string() {}
string(const string&) {std::cout << "string(const string&)\n";}
string& operator=(const string&) {std::cout << "string& operator=(const string&)\n";return *this;}
#if (__has_feature(cxx_rvalue_references))
string(string&&) {std::cout << "string(string&&)\n";}
string& operator=(string&&) {std::cout << "string& operator=(string&&)\n";return *this;}
#endif
};
#if PROCESS == 1
string
do_something(string inval)
{
// do stuff
return inval;
}
#elif PROCESS == 2
string
do_something(const string& inval)
{
string return_val = inval;
// do stuff
return return_val;
}
#if (__has_feature(cxx_rvalue_references))
string
do_something(string&& inval)
{
// do stuff
return std::move(inval);
}
#endif
#endif
string source() {return string();}
int main()
{
std::cout << "do_something with lvalue:\n\n";
string x;
string t = do_something(x);
#if (__has_feature(cxx_rvalue_references))
std::cout << "\ndo_something with xvalue:\n\n";
string u = do_something(std::move(x));
#endif
std::cout << "\ndo_something with prvalue:\n\n";
string v = do_something(source());
}
For me this outputs:
$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=1 test.cpp
$ a.out
do_something with lvalue:
string(const string&)
string(string&&)
do_something with xvalue:
string(string&&)
string(string&&)
do_something with prvalue:
string(string&&)
$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=2 test.cpp
$ a.out
do_something with lvalue:
string(const string&)
do_something with xvalue:
string(string&&)
do_something with prvalue:
string(string&&)
The table below summarizes my results (using clang -std=c++11). The first number is the number of copy constructions and the second number is the number of move constructions:
+----+--------+--------+---------+
| | lvalue | xvalue | prvalue |
+----+--------+--------+---------+
| p1 | 1/1 | 0/2 | 0/1 |
+----+--------+--------+---------+
| p2 | 1/0 | 0/1 | 0/1 |
+----+--------+--------+---------+
The pass-by-value solution requires only one overload but costs an extra move construction when passing lvalues and xvalues. This may or may not be acceptable for any given situation. Both solutions have advantages and disadvantages.
Herb Sutter is still on record, along with Bjarne Stroustrup, in recommending const std::string& as a parameter type; see https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rf-in .
There is a pitfall not mentioned in any of the other answers here: if you pass a string literal to a const std::string& parameter, it will pass a reference to a temporary string, created on-the-fly to hold the characters of the literal. If you then save that reference, it will be invalid once the temporary string is destroyed. To be safe, you must save a copy, not the reference. The problem stems from the fact that string literals have type const char[N] and must be converted to std::string.
The code below illustrates the pitfall and the workaround, along with a minor efficiency option -- overloading with a const char* method, as described at Is there a way to pass a string literal as reference in C++.
(Note: Sutter & Stroustrup advise that if you keep a copy of the string, also provide an overloaded function with a && parameter and std::move() it.)
#include <string>
#include <iostream>
class WidgetBadRef {
public:
WidgetBadRef(const std::string& s) : myStrRef(s) // copy the reference...
{}
const std::string& myStrRef; // might be a reference to a temporary (oops!)
};
class WidgetSafeCopy {
public:
WidgetSafeCopy(const std::string& s) : myStrCopy(s)
// constructor for string references; copy the string
{std::cout << "const std::string& constructor\n";}
WidgetSafeCopy(const char* cs) : myStrCopy(cs)
// constructor for string literals (and char arrays);
// for minor efficiency only;
// create the std::string directly from the chars
{std::cout << "const char * constructor\n";}
const std::string myStrCopy; // save a copy, not a reference!
};
int main() {
WidgetBadRef w1("First string");
WidgetSafeCopy w2("Second string"); // uses the const char* constructor, no temp string
WidgetSafeCopy w3(w2.myStrCopy); // uses the String reference constructor
std::cout << w1.myStrRef << "\n"; // garbage out
std::cout << w2.myStrCopy << "\n"; // OK
std::cout << w3.myStrCopy << "\n"; // OK
}
OUTPUT:
const char * constructor
const std::string& constructor
Second string
Second string
See Herb Sutter's "Back to the Basics! Essentials of Modern C++ Style". Among other topics, he reviews the parameter passing advice that's been given in the past and the new ideas that come in with C++11, and specifically looks at the idea of passing strings by value.
The benchmarks show that passing std::strings by value, in cases where the function will copy it in anyway, can be significantly slower!
This is because you are forcing it to always make a full copy (and then move into place), while the const& version will update the old string which may reuse the already-allocated buffer.
See his slide 27: For “set” functions, option 1 is the same as it always was. Option 2 adds an overload for rvalue reference, but this gives a combinatorial explosion if there are multiple parameters.
It is only for “sink” parameters where a string must be created (not have its existing value changed) that the pass-by-value trick is valid. That is, constructors in which the parameter directly initializes the member of the matching type.
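The distinction between a "sink" constructor parameter and a "set" function can be sketched as follows. `Employee` and its members are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Employee {
public:
    // Sink parameter: `name` is a fresh std::string, copy-constructed from
    // lvalues and move-constructed from rvalues, then moved into the member.
    explicit Employee(std::string name) : name_(std::move(name)) {}

    // Setter: the member already holds a value, so const& is kept here;
    // copy assignment can reuse name_'s current buffer.
    void set_name(const std::string& name) { name_ = name; }

    const std::string& name() const { return name_; }

private:
    std::string name_;
};

void sink_demo() {
    std::string n = "Grace";
    Employee a(n);                   // one copy into the parameter, then a move
    Employee b{std::string("Ada")};  // moves only
    a.set_name("Linus");
    assert(n == "Grace");            // the caller's string is untouched
    assert(a.name() == "Linus");
    assert(b.name() == "Ada");
}
```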
If you want to see how deep you can go in worrying about this, watch Nicolai Josuttis’s presentation and good luck with that (“Perfect — Done!” n times after finding fault with the previous version. Ever been there?)
This is also summarized as F.15 in the C++ Core Guidelines.
update
Generally, you want to declare "string" parameters as std::string_view (by value). This allows you to pass an existing std::string object as efficiently as with const std::string&, and also pass a lexical string literal (like "hello!") without copying it, and pass objects of type string_view which is necessary now that those are in the ecosystem too.
The exception is when the function needs an actual std::string instance, in order to pass to another function that's declared to take const std::string&.
IMO using the C++ reference for std::string is a quick and short local optimization, while passing by value could be (or not) a better global optimization.
So the answer is: it depends on circumstances:
If you write all the code, from the outer functions to the inner ones, and you know what the code does, you can use the reference const std::string &.
If you write library code, or make heavy use of library code where strings are passed, you likely gain more in a global sense by trusting std::string copy constructor behavior.
As @JDługosz points out in the comments, Herb gives other advice in another (later?) talk, see roughly from here: https://youtu.be/xnqTKD8uD64?t=54m50s.
His advice boils down to only using value parameters for a function f that takes so-called sink arguments, assuming you will move construct from these sink arguments.
This general approach only adds the overhead of a move constructor for both lvalue and rvalue arguments compared to an optimal implementation of f tailored to lvalue and rvalue arguments respectively. To see why this is the case, suppose f takes a value parameter, where T is some copy and move constructible type:
void f(T x) {
T y{std::move(x)};
}
Calling f with an lvalue argument will result in a copy constructor being called to construct x, and a move constructor being called to construct y. On the other hand, calling f with an rvalue argument will cause a move constructor to be called to construct x, and another move constructor to be called to construct y.
In general, the optimal implementation of f for lvalue arguments is as follows:
void f(const T& x) {
T y{x};
}
In this case, only one copy constructor is called to construct y. The optimal implementation of f for rvalue arguments is, again in general, as follows:
void f(T&& x) {
T y{std::move(x)};
}
In this case, only one move constructor is called to construct y.
So a sensible compromise is to take a value parameter and have one extra move constructor call for either lvalue or rvalue arguments with respect to the optimal implementation, which is also the advice given in Herb's talk.
As @JDługosz pointed out in the comments, passing by value only makes sense for functions that will construct some object from the sink argument. When you have a function f that copies its argument, the pass-by-value approach will have more overhead than a general pass-by-const-reference approach. The pass-by-value approach for a function f that retains a copy of its parameter will have the form:
void f(T x) {
T y{...};
...
y = std::move(x);
}
In this case, there is a copy construction and a move assignment for an lvalue argument, and a move construction and move assignment for an rvalue argument. The most optimal case for an lvalue argument is:
void f(const T& x) {
T y{...};
...
y = x;
}
This boils down to an assignment only, which is potentially much cheaper than the copy constructor plus move assignment required for the pass-by-value approach. The reason for this is that the assignment might reuse existing allocated memory in y, and therefore prevent (de)allocations, whereas the copy constructor will usually allocate memory.
For an rvalue argument the most optimal implementation for f that retains a copy has the form:
void f(T&& x) {
T y{...};
...
y = std::move(x);
}
So, only a move assignment in this case. Passing an rvalue to the version of f that takes a const reference only costs an assignment instead of a move assignment. So relatively speaking, the version of f taking a const reference in this case as the general implementation is preferable.
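The operation counts for the "retain a copy" case can be checked with an instrumented type. This is a sketch only: `Tracked` stands in for T, `Counts` and the three function names are hypothetical, and the global `y` plays the role of the retained object.

```cpp
#include <cassert>
#include <utility>

struct Counts {
    int copy_ctor = 0, move_ctor = 0, copy_assign = 0, move_assign = 0;
};

// Hypothetical stand-in for T that records which special member ran.
struct Tracked {
    static Counts counts;
    Tracked() = default;
    Tracked(const Tracked&) { ++counts.copy_ctor; }
    Tracked(Tracked&&) noexcept { ++counts.move_ctor; }
    Tracked& operator=(const Tracked&) { ++counts.copy_assign; return *this; }
    Tracked& operator=(Tracked&&) noexcept { ++counts.move_assign; return *this; }
};
Counts Tracked::counts;

Tracked y;  // the retained object

void f_by_value(Tracked x)       { y = std::move(x); }
void f_by_cref(const Tracked& x) { y = x; }
void f_by_rref(Tracked&& x)      { y = std::move(x); }

void retain_demo() {
    Tracked a;

    Tracked::counts = Counts{};
    f_by_value(a);  // lvalue: copy construction plus move assignment
    assert(Tracked::counts.copy_ctor == 1 && Tracked::counts.move_assign == 1);

    Tracked::counts = Counts{};
    f_by_cref(a);   // lvalue: a single copy assignment
    assert(Tracked::counts.copy_assign == 1 && Tracked::counts.copy_ctor == 0);
}
```

The counts alone understate the difference for a type like std::string: the copy constructor usually allocates, whereas the copy assignment may reuse the destination's buffer.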
So in general, for the most optimal implementation, you will need to overload or do some kind of perfect forwarding as shown in the talk. The drawback is a combinatorial explosion in the number of overloads required, depending on the number of parameters for f in case you opt to overload on the value category of the argument. Perfect forwarding has the drawback that f becomes a template function, which prevents making it virtual, and results in significantly more complex code if you want to get it 100% right (see the talk for the gory details).
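For comparison, a perfect-forwarding setter might look roughly like the sketch below. `Holder` is a hypothetical class, and the `enable_if` constraint is only one simple way to restrict the template; the version shown in the talk may differ.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

class Holder {
public:
    // Forwarding-reference setter: copy-assigns from lvalues and
    // move-assigns from rvalues with a single definition.
    // The constraint rejects arguments that cannot be assigned to a std::string.
    template <typename U,
              typename = typename std::enable_if<
                  std::is_assignable<std::string&, U&&>::value>::type>
    void set(U&& value) { value_ = std::forward<U>(value); }

    const std::string& value() const { return value_; }

private:
    std::string value_;
};

void forward_demo() {
    Holder h;
    std::string s = "left";
    h.set(s);                     // U = std::string&: copy assignment
    assert(h.value() == "left" && s == "left");
    h.set(std::string("right"));  // U = std::string: move assignment
    assert(h.value() == "right");
    h.set("literal");             // assigned directly from the character array
    assert(h.value() == "literal");
}
```

Because `set` is a member template, it cannot be declared virtual, which is the drawback mentioned above.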
The problem is that "const" is a non-granular qualifier. What is usually meant by "const string ref" is "don't modify this string", not "don't modify the reference count". There is simply no way, in C++, to say which members are "const". They either all are, or none of them are.
In order to hack around this language issue, STL could allow "C()" in your example to make a move-semantic copy anyway, and dutifully ignore the "const" with regard to the reference count (mutable). As long as it was well-specified, this would be fine.
Since STL doesn't, I have a version of a string that const_casts away the constness of the reference counter (there is no way to retroactively make something mutable in a class hierarchy), and, lo and behold, you can freely pass cmstrings as const references and make copies of them in deep functions, all day long, with no leaks or issues.
Since C++ offers no "derived class const granularity" here, writing up a good specification and making a shiny new "const movable string" (cmstring) object is the best solution I've seen.