Pass by value to avoid interface duplication [duplicate] - c++

I'm learning C++ at the moment and trying to avoid picking up bad habits.
From what I understand, clang-tidy contains many "best practices" and I try to stick to them as best as possible (even though I don't necessarily understand why they are considered good yet), but I'm not sure if I understand what's recommended here.
I used this class from the tutorial:
class Creature
{
private:
std::string m_name;
public:
Creature(const std::string &name)
: m_name{name}
{
}
};
This leads to a suggestion from clang-tidy that I should pass by value instead of reference and use std::move.
If I do, I get the suggestion to make name a reference (to ensure it does not get copied every time) and the warning that std::move won't have any effect because name is a const so I should remove it.
The only way I don't get a warning is by removing const altogether:
Creature(std::string name)
: m_name{std::move(name)}
{
}
Which seems logical, as the only benefit of const was to prevent messing with the original string (which doesn't happen because I passed by value).
But I read on CPlusPlus.com:
Although note that -in the standard library- moving implies that the moved-from object is left in a valid but unspecified state. Which means that, after such an operation, the value of the moved-from object should only be destroyed or assigned a new value; accessing it otherwise yields an unspecified value.
Now imagine this code:
std::string nameString("Alex");
Creature c(nameString);
Because nameString gets passed by value, std::move will only invalidate name inside the constructor and not touch the original string. But what are the advantages of this? It seems like the content gets copied only once anyhow - if I pass by reference when I call m_name{name}, if I pass by value when I pass it (and then it gets moved). I understand that this is better than passing by value and not using std::move (because it gets copied twice).
So two questions:
Did I understand correctly what is happening here?
Is there any upside of using std::move over passing by reference and just calling m_name{name}?

/* (0) */
Creature(const std::string &name) : m_name{name} { }
A passed lvalue binds to name, then is copied into m_name.
A passed rvalue binds to name, then is copied into m_name.
/* (1) */
Creature(std::string name) : m_name{std::move(name)} { }
A passed lvalue is copied into name, then is moved into m_name.
A passed rvalue is moved into name, then is moved into m_name.
/* (2) */
Creature(const std::string &name) : m_name{name} { }
Creature(std::string &&rname) : m_name{std::move(rname)} { }
A passed lvalue binds to name, then is copied into m_name.
A passed rvalue binds to rname, then is moved into m_name.
As move operations are usually faster than copies, (1) is better than (0) if you pass a lot of temporaries. (2) is optimal in terms of copies/moves, but requires code repetition.
The code repetition can be avoided with perfect forwarding:
/* (3) */
template <typename T,
std::enable_if_t<
std::is_convertible_v<std::remove_cvref_t<T>, std::string>,
int> = 0
>
Creature(T&& name) : m_name{std::forward<T>(name)} { }
You might optionally want to constrain T in order to restrict the domain of types that this constructor can be instantiated with (as shown above). C++20 aims to simplify this with Concepts.
In C++17, prvalues are affected by guaranteed copy elision, which - when applicable - will reduce the number of copies/moves when passing arguments to functions.
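The copy/move counts listed for (0), (1) and (2) can be checked with a small experiment. This is only a sketch: Tracer is a hypothetical stand-in for std::string that counts its own copy and move constructions, and ByConstRef, ByValue, Overloaded, measure and check_counts are illustrative names that do not appear in the question. Rvalues are produced with std::move, so copy elision for prvalues does not affect the numbers.

```cpp
#include <cassert>
#include <utility>

struct Counts {
    int copies;
    int moves;
};

Counts g_counts = {0, 0};  // global tally, reset before each measurement

// Hypothetical stand-in for std::string that records how it is constructed.
struct Tracer {
    Tracer() {}
    Tracer(const Tracer&) { ++g_counts.copies; }
    Tracer(Tracer&&) noexcept { ++g_counts.moves; }
};

// (0) const reference only
struct ByConstRef {
    Tracer m_name;
    explicit ByConstRef(const Tracer& name) : m_name{name} {}
};

// (1) by value, then move
struct ByValue {
    Tracer m_name;
    explicit ByValue(Tracer name) : m_name{std::move(name)} {}
};

// (2) one overload per value category
struct Overloaded {
    Tracer m_name;
    explicit Overloaded(const Tracer& name) : m_name{name} {}
    explicit Overloaded(Tracer&& rname) : m_name{std::move(rname)} {}
};

// Builds a Holder from an lvalue, or from std::move(lvalue) when as_rvalue is
// true, and returns the copies/moves that the construction triggered.
template <typename Holder>
Counts measure(bool as_rvalue) {
    Tracer source;
    g_counts = {0, 0};
    if (as_rvalue) {
        Holder h(std::move(source));
        (void)h;
    } else {
        Holder h(source);
        (void)h;
    }
    return g_counts;
}

// Checks the counts stated for (0), (1) and (2).
void check_counts() {
    Counts c = measure<ByConstRef>(false);
    assert(c.copies == 1 && c.moves == 0);
    c = measure<ByConstRef>(true);
    assert(c.copies == 1 && c.moves == 0);  // the rvalue is still copied
    c = measure<ByValue>(false);
    assert(c.copies == 1 && c.moves == 1);
    c = measure<ByValue>(true);
    assert(c.copies == 0 && c.moves == 2);
    c = measure<Overloaded>(false);
    assert(c.copies == 1 && c.moves == 0);
    c = measure<Overloaded>(true);
    assert(c.copies == 0 && c.moves == 1);
}
```

Calling check_counts() from main runs all six cases. The only place (1) does worse than (2) is the one extra move.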

Did I understand correctly what is happening here?
Yes.
Is there any upside of using std::move over passing by reference and just calling m_name{name}?
An easy to grasp function signature without any additional overloads. The signature immediately reveals that the argument will be copied - this saves callers from wondering whether a const std::string& reference might be stored as a data member, possibly becoming a dangling reference later on. And there is no need to overload on std::string&& name and const std::string& arguments to avoid unnecessary copies when rvalues are passed to the function. Passing an lvalue
std::string nameString("Alex");
Creature c(nameString);
to the function that takes its argument by value causes one copy and one move construction. Passing an rvalue to the same function
std::string nameString("Alex");
Creature c(std::move(nameString));
causes two move constructions. In contrast, when the function parameter is const std::string&, there will always be a copy, even when passing an rvalue argument. Passing by value is therefore clearly an advantage for rvalue arguments, as long as the argument type is cheap to move-construct (this is the case for std::string).
But there is a downside to consider: the reasoning doesn't work for functions that assign the function argument to another variable (instead of initializing it):
void setName(std::string name)
{
m_name = std::move(name);
}
will cause a deallocation of the resource that m_name refers to before it's reassigned. I recommend reading Item 41 in Effective Modern C++ and also this question.
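To make the setter downside concrete, here is a sketch with a toy string type. Buf is hypothetical and heavily simplified (no SSO, no growth strategy); it exists only so that heap allocations can be counted. CreatureByValue, CreatureByConstRef, allocations_for_set and check_allocations are likewise made-up names. Real std::string implementations differ in detail, so treat the numbers as an illustration of the mechanism and not as a measurement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>

int g_allocations = 0;  // number of heap blocks requested by Buf

// Hypothetical, heavily simplified string-like type (no SSO), for illustration.
class Buf {
    std::unique_ptr<char[]> data_;
    std::size_t size_ = 0;
    std::size_t cap_ = 0;

    // Copies n characters in, allocating only if the current block is too small.
    void assign(const char* s, std::size_t n) {
        if (!data_ || n > cap_) {
            ++g_allocations;
            data_.reset(new char[n + 1]);
            cap_ = n;
        }
        std::memcpy(data_.get(), s, n);
        data_[n] = '\0';
        size_ = n;
    }

public:
    explicit Buf(const char* s) { assign(s, std::strlen(s)); }
    Buf(const Buf& o) { assign(o.c_str(), o.size_); }
    Buf(Buf&& o) noexcept
        : data_(std::move(o.data_)), size_(o.size_), cap_(o.cap_) {
        o.size_ = 0;
        o.cap_ = 0;
    }
    // Copy assignment can reuse the block this object already owns.
    Buf& operator=(const Buf& o) {
        if (this != &o) assign(o.c_str(), o.size_);
        return *this;
    }
    // Move assignment drops the old block and adopts the other one.
    Buf& operator=(Buf&& o) noexcept {
        if (this != &o) {
            data_ = std::move(o.data_);
            size_ = o.size_;
            cap_ = o.cap_;
            o.size_ = 0;
            o.cap_ = 0;
        }
        return *this;
    }
    const char* c_str() const { return data_ ? data_.get() : ""; }
};

struct CreatureByValue {
    Buf m_name{"a fairly long initial name"};
    void setName(Buf name) { m_name = std::move(name); }
};

struct CreatureByConstRef {
    Buf m_name{"a fairly long initial name"};
    void setName(const Buf& name) { m_name = name; }
};

// Returns how many allocations a single setName(lvalue) call performs.
template <typename Creature>
int allocations_for_set(const Buf& new_name) {
    Creature c;
    g_allocations = 0;
    c.setName(new_name);
    return g_allocations;
}

void check_allocations() {
    Buf alex("Alex");
    // By value: the parameter is copy-constructed, which allocates.
    assert(allocations_for_set<CreatureByValue>(alex) == 1);
    // By const&: the short name fits into m_name's existing block.
    assert(allocations_for_set<CreatureByConstRef>(alex) == 0);
}
```

If the new name were longer than the existing block, both variants would allocate once, so the by-value setter only loses when the old buffer could have been reused.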

How you pass is not the only variable here; what you pass makes the big difference between the two.
In C++, we have all kinds of value categories and this "idiom" exists for cases where you pass in an rvalue (such as "Alex-string-literal-that-constructs-temporary-std::string" or std::move(nameString)), which results in 0 copies of std::string being made (the type does not even have to be copy-constructible for rvalue arguments), and only uses std::string's move constructor.
Somewhat related Q&A.

There are several disadvantages of the pass-by-value-and-move approach compared to pass-by-(rvalue-)reference:
it causes 3 objects to be spawned instead of 2;
passing an object by value may lead to extra stack overhead, because even a regular string class is typically at least 3 or 4 times larger than a pointer;
construction of the argument objects is done on the caller side, causing code bloat.

Related

Is moving objects upon call by value a bad practice compared to passing by const reference?

I came across a C++17 code base where functions always accept parameters by value, even if a const reference would work, and no semantic reason for passing by value is apparent. The code then explicitly uses a std::move when calling functions. For instance:
A retrieveData(DataReader reader) // reader could be passed by const reference.
{
A a = { };
a.someField = reader.retrieveField(); // retrieveField is a const function.
return a;
}
auto someReader = constructDataReader();
auto data = retrieveData(std::move(someReader)); // Calls explicitly use move all the time.
Defining functions with value parameters by default and counting on move semantics like this seems like a bad practice, but is it? Is this really faster/better than simply passing lvalues by const reference, or perhaps creating a && overload for rvalues if needed?
I'm not sure how many copies modern compilers would do in the above example in case of a call without an explicit move on an lvalue, i.e. retrieveData(r).
I know a lot has been written on the subject of moving, but would really appreciate some clarification here.
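For reference, the number of copies and moves in the calls above can be counted directly. The sketch below assumes a hypothetical Reader type standing in for DataReader; readByValue, readByConstRef and check_reader_counts are illustrative names, not part of the code base in question.

```cpp
#include <cassert>
#include <utility>

int g_copies = 0;
int g_moves = 0;

// Hypothetical stand-in for DataReader that counts its copies and moves.
struct Reader {
    int field = 7;
    Reader() {}
    Reader(const Reader& o) : field(o.field) { ++g_copies; }
    Reader(Reader&& o) noexcept : field(o.field) { ++g_moves; }
    int retrieveField() const { return field; }
};

int readByValue(Reader reader) { return reader.retrieveField(); }
int readByConstRef(const Reader& reader) { return reader.retrieveField(); }

void check_reader_counts() {
    Reader r;

    g_copies = g_moves = 0;
    int v = readByValue(r);             // lvalue without std::move: one copy
    assert(v == 7);
    assert(g_copies == 1 && g_moves == 0);

    g_copies = g_moves = 0;
    v = readByValue(std::move(r));      // explicit move: one move, r is moved-from
    assert(v == 7);
    assert(g_copies == 0 && g_moves == 1);

    Reader r2;
    g_copies = g_moves = 0;
    v = readByConstRef(r2);             // reference binding only
    assert(v == 7);
    assert(g_copies == 0 && g_moves == 0);
}
```

So for a function that only reads its argument, the const reference version does strictly less work, and the caller's object stays usable afterwards.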

Passing by const reference or utilize move semantics [duplicate]

I heard a recent talk by Herb Sutter who suggested that the reasons to pass std::vector and std::string by const & are largely gone. He suggested that writing a function such as the following is now preferable:
std::string do_something ( std::string inval )
{
std::string return_val;
// ... do stuff ...
return return_val;
}
I understand that the return_val will be an rvalue at the point the function returns and can therefore be returned using move semantics, which are very cheap. However, inval is still much larger than the size of a reference (which is usually implemented as a pointer). This is because a std::string has various components including a pointer into the heap and a member char[] for short string optimization. So it seems to me that passing by reference is still a good idea.
Can anyone explain why Herb might have said this?
The reason Herb said what he said is because of cases like this.
Let's say I have function A which calls function B, which calls function C. And A passes a string through B and into C. A does not know or care about C; all A knows about is B. That is, C is an implementation detail of B.
Let's say that A is defined as follows:
void A()
{
B("value");
}
If B and C take the string by const&, then it looks something like this:
void B(const std::string &str)
{
C(str);
}
void C(const std::string &str)
{
//Do something with `str`. Does not store it.
}
All well and good. You're just passing pointers around, no copying, no moving, everyone's happy. C takes a const& because it doesn't store the string. It simply uses it.
Now, I want to make one simple change: C needs to store the string somewhere.
void C(const std::string &str)
{
//Do something with `str`.
m_str = str;
}
Hello, copy constructor and potential memory allocation (ignore the Short String Optimization (SSO)). C++11's move semantics are supposed to make it possible to remove needless copy-constructing, right? And A passes a temporary; there's no reason why C should have to copy the data. It should just abscond with what was given to it.
Except it can't. Because it takes a const&.
If I change C to take its parameter by value, that just causes B to do the copy into that parameter; I gain nothing.
So if I had just passed str by value through all of the functions, relying on std::move to shuffle the data around, we wouldn't have this problem. If someone wants to hold on to it, they can. If they don't, oh well.
Is it more expensive? Yes; moving into a value is more expensive than using references. Is it less expensive than the copy? Not for small strings with SSO. Is it worth doing?
It depends on your use case. How much do you hate memory allocations?
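The A, B, C scenario can be sketched with a counting type in place of std::string. Tracer, g_stored and the _ref/_val suffixes are hypothetical names chosen for this illustration; g_stored plays the role of m_str.

```cpp
#include <cassert>
#include <utility>

int g_copies = 0;
int g_moves = 0;

// Hypothetical stand-in for std::string that counts copies and moves.
struct Tracer {
    Tracer() {}
    Tracer(const Tracer&) { ++g_copies; }
    Tracer(Tracer&&) noexcept { ++g_moves; }
    Tracer& operator=(const Tracer&) { ++g_copies; return *this; }
    Tracer& operator=(Tracer&&) noexcept { ++g_moves; return *this; }
};

Tracer g_stored;  // plays the role of m_str

// const& all the way down: C can only copy.
void C_ref(const Tracer& str) { g_stored = str; }
void B_ref(const Tracer& str) { C_ref(str); }
void A_ref() { B_ref(Tracer()); }

// By value all the way down: every hop hands the value on with std::move.
void C_val(Tracer str) { g_stored = std::move(str); }
void B_val(Tracer str) { C_val(std::move(str)); }
void A_val() { B_val(Tracer()); }

void check_chain() {
    g_copies = g_moves = 0;
    A_ref();
    assert(g_copies == 1 && g_moves == 0);  // the temporary is copied in C

    g_copies = g_moves = 0;
    A_val();
    assert(g_copies == 0);  // nothing is ever copied
    // Two moves when the temporary is built directly in B's parameter
    // (guaranteed from C++17; earlier standards merely permit it).
    assert(g_moves >= 2);
}
```

The trade is one copy against a couple of moves, which pays off exactly when a copy means a memory allocation.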
Are the days of passing const std::string & as a parameter over?
No. Many people take this advice (including Dave Abrahams) beyond the domain it applies to, and simplify it to apply to all std::string parameters -- Always passing std::string by value is not a "best practice" for any and all arbitrary parameters and applications because the optimizations these talks/articles focus on apply only to a restricted set of cases.
If you're returning a value, mutating the parameter, or taking the value, then passing by value could save expensive copying and offer syntactical convenience.
As ever, passing by const reference saves much copying when you don't need a copy.
Now to the specific example:
However inval is still quite a lot larger than the size of a reference (which is usually implemented as a pointer). This is because a std::string has various components including a pointer into the heap and a member char[] for short string optimization. So it seems to me that passing by reference is still a good idea. Can anyone explain why Herb might have said this?
If stack size is a concern (and assuming this is not inlined/optimized), return_val + inval > return_val -- IOW, peak stack usage can be reduced by passing by value here (note: oversimplification of ABIs). Meanwhile, passing by const reference can disable the optimizations. The primary reason here is not to avoid stack growth, but to ensure the optimization can be performed where it is applicable.
The days of passing by const reference aren't over -- the rules are just more complicated than they once were. If performance is important, you'll be wise to consider how you pass these types, based on the details you use in your implementations.
Short answer: NO! Long answer:
If you won't modify the string (treat it as read-only), pass it as a const reference. (The referenced string obviously needs to stay alive while the function that uses it executes.)
If you plan to modify it, or you know it will go out of scope (threads), pass it by value; don't copy the const reference inside your function body.
There was a post on cpp-next.com called "Want speed, pass by value!". The TL;DR:
Guideline: Don’t copy your function arguments. Instead, pass them by value and let the compiler do the copying.
TRANSLATION of ^
Don’t copy your function arguments --- means: if you plan to modify the argument value by copying it to an internal variable, just use a value argument instead.
So, don't do this:
std::string function(const std::string& aString){
auto vString(aString);
vString.clear();
return vString;
}
do this:
std::string function(std::string aString){
aString.clear();
return aString;
}
When you need to modify the argument value in your function body.
You just need to be aware how you plan to use the argument in the function body. Read-only or NOT... and if it sticks within scope.
This highly depends on the compiler's implementation.
However, it also depends on what you use.
Let's consider the following functions:
bool foo1( const std::string v )
{
return v.empty();
}
bool foo2( const std::string & v )
{
return v.empty();
}
These functions are implemented in a separate compilation unit in order to avoid inlining. Then:
1. If you pass a literal to these two functions, you will not see much difference in performance. In both cases, a string object has to be created.
2. If you pass another std::string object, foo2 will outperform foo1, because foo1 will do a deep copy.
On my PC, using g++ 4.6.1, I got these results :
variable by reference: 1000000000 iterations -> time elapsed: 2.25912 sec
variable by value: 1000000000 iterations -> time elapsed: 27.2259 sec
literal by reference: 100000000 iterations -> time elapsed: 9.10319 sec
literal by value: 100000000 iterations -> time elapsed: 8.62659 sec
Unless you actually need a copy it's still reasonable to take const &. For example:
bool isprint(std::string const &s) {
return all_of(begin(s),end(s),(bool(*)(char))isprint);
}
If you change this to take the string by value then you'll end up moving or copying the parameter, and there's no need for that. Not only is copy/move likely more expensive, but it also introduces a new potential failure; the copy/move could throw an exception (e.g., allocation during copy could fail) whereas taking a reference to an existing value can't.
If you do need a copy then passing and returning by value is usually (always?) the best option. In fact I generally wouldn't worry about it in C++03 unless you find that extra copies actually cause a performance problem. Copy elision seems pretty reliable on modern compilers. I think people's skepticism and insistence that you have to check your table of compiler support for RVO is mostly obsolete nowadays.
In short, C++11 doesn't really change anything in this regard except for people that didn't trust copy elision.
Almost.
In C++17, we have basic_string_view<?>, which brings us down to basically one narrow use case for std::string const& parameters.
The existence of move semantics has eliminated one use case for std::string const& -- if you are planning on storing the parameter, taking a std::string by value is more optimal, as you can move out of the parameter.
If someone called your function with a raw C "string" this means only one std::string buffer is ever allocated, as opposed to two in the std::string const& case.
However, if you don't intend to make a copy, taking by std::string const& is still useful in C++14.
With std::string_view, so long as you aren't passing said string to an API that expects C-style '\0'-terminated character buffers, you can more efficiently get std::string like functionality without risking any allocation. A raw C string can even be turned into a std::string_view without any allocation or character copying.
At that point, the use for std::string const& is when you aren't copying the data wholesale, and are going to pass it on to a C-style API that expects a null terminated buffer, and you need the higher level string functions that std::string provides. In practice, this is a rare set of requirements.
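As a minimal sketch of the std::string_view case (C++17): count_char is a hypothetical read-only helper, and check_count_char shows it being called with a std::string, a string literal, and a sub-view without constructing any new std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical read-only helper: counts how often `wanted` occurs in `text`.
// The view is passed by value; it is just a pointer plus a length.
std::size_t count_char(std::string_view text, char wanted) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == wanted) ++n;
    }
    return n;
}

void check_count_char() {
    std::string owned = "banana";
    assert(count_char(owned, 'a') == 3);     // existing std::string, no copy

    assert(count_char("banana", 'n') == 2);  // literal, no temporary std::string

    std::string_view view = owned;
    assert(count_char(view.substr(0, 3), 'a') == 1);  // "ban", still no copy
}
```

The usual caveat applies: a string_view does not own its characters, so it must not outlive the buffer it points into.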
std::string is not Plain Old Data(POD), and its raw size is not the most relevant thing ever. For example, if you pass in a string which is above the length of SSO and allocated on the heap, I would expect the copy constructor to not copy the SSO storage.
The reason this is recommended is because inval is constructed from the argument expression, and thus is always moved or copied as appropriate- there is no performance loss, assuming that you need ownership of the argument. If you don't, a const reference could still be the better way to go.
I've copy/pasted the answer from this question here, and changed the names and spelling to fit this question.
Here is code to measure what is being asked:
#include <iostream>
#include <utility> // std::move
struct string
{
string() {}
string(const string&) {std::cout << "string(const string&)\n";}
string& operator=(const string&) {std::cout << "string& operator=(const string&)\n";return *this;}
#if (__has_feature(cxx_rvalue_references))
string(string&&) {std::cout << "string(string&&)\n";}
string& operator=(string&&) {std::cout << "string& operator=(string&&)\n";return *this;}
#endif
};
#if PROCESS == 1
string
do_something(string inval)
{
// do stuff
return inval;
}
#elif PROCESS == 2
string
do_something(const string& inval)
{
string return_val = inval;
// do stuff
return return_val;
}
#if (__has_feature(cxx_rvalue_references))
string
do_something(string&& inval)
{
// do stuff
return std::move(inval);
}
#endif
#endif
string source() {return string();}
int main()
{
std::cout << "do_something with lvalue:\n\n";
string x;
string t = do_something(x);
#if (__has_feature(cxx_rvalue_references))
std::cout << "\ndo_something with xvalue:\n\n";
string u = do_something(std::move(x));
#endif
std::cout << "\ndo_something with prvalue:\n\n";
string v = do_something(source());
}
For me this outputs:
$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=1 test.cpp
$ a.out
do_something with lvalue:
string(const string&)
string(string&&)
do_something with xvalue:
string(string&&)
string(string&&)
do_something with prvalue:
string(string&&)
$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=2 test.cpp
$ a.out
do_something with lvalue:
string(const string&)
do_something with xvalue:
string(string&&)
do_something with prvalue:
string(string&&)
The table below summarizes my results (using clang -std=c++11). The first number is the number of copy constructions and the second number is the number of move constructions:
+----+--------+--------+---------+
| | lvalue | xvalue | prvalue |
+----+--------+--------+---------+
| p1 | 1/1 | 0/2 | 0/1 |
+----+--------+--------+---------+
| p2 | 1/0 | 0/1 | 0/1 |
+----+--------+--------+---------+
The pass-by-value solution requires only one overload but costs an extra move construction when passing lvalues and xvalues. This may or may not be acceptable for any given situation. Both solutions have advantages and disadvantages.
Herb Sutter is still on record, along with Bjarne Stroustrup, in recommending const std::string& as a parameter type; see https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rf-in .
There is a pitfall not mentioned in any of the other answers here: if you pass a string literal to a const std::string& parameter, it will pass a reference to a temporary string, created on-the-fly to hold the characters of the literal. If you then save that reference, it will be invalid once the temporary string is deallocated. To be safe, you must save a copy, not the reference. The problem stems from the fact that string literals are const char[N] types, requiring conversion to std::string.
The code below illustrates the pitfall and the workaround, along with a minor efficiency option -- overloading with a const char* method, as described at Is there a way to pass a string literal as reference in C++.
(Note: Sutter & Stroustrup advise that if you keep a copy of the string, also provide an overloaded function with a && parameter and std::move() it.)
#include <string>
#include <iostream>
class WidgetBadRef {
public:
WidgetBadRef(const std::string& s) : myStrRef(s) // copy the reference...
{}
const std::string& myStrRef; // might be a reference to a temporary (oops!)
};
class WidgetSafeCopy {
public:
WidgetSafeCopy(const std::string& s) : myStrCopy(s)
// constructor for string references; copy the string
{std::cout << "const std::string& constructor\n";}
WidgetSafeCopy(const char* cs) : myStrCopy(cs)
// constructor for string literals (and char arrays);
// for minor efficiency only;
// create the std::string directly from the chars
{std::cout << "const char * constructor\n";}
const std::string myStrCopy; // save a copy, not a reference!
};
int main() {
WidgetBadRef w1("First string");
WidgetSafeCopy w2("Second string"); // uses the const char* constructor, no temp string
WidgetSafeCopy w3(w2.myStrCopy); // uses the String reference constructor
std::cout << w1.myStrRef << "\n"; // garbage out
std::cout << w2.myStrCopy << "\n"; // OK
std::cout << w3.myStrCopy << "\n"; // OK
}
OUTPUT:
const char * constructor
const std::string& constructor
Second string
Second string
See Herb Sutter's talk "Back to the Basics! Essentials of Modern C++ Style". Among other topics, he reviews the parameter-passing advice that's been given in the past and the new ideas that come in with C++11, and specifically looks at the idea of passing strings by value.
The benchmarks show that passing std::strings by value, in cases where the function will copy it in anyway, can be significantly slower!
This is because you are forcing it to always make a full copy (and then move into place), while the const& version will update the old string which may reuse the already-allocated buffer.
See his slide 27: For “set” functions, option 1 is the same as it always was. Option 2 adds an overload for rvalue reference, but this gives a combinatorial explosion if there are multiple parameters.
It is only for “sink” parameters where a string must be created (not have its existing value changed) that the pass-by-value trick is valid. That is, constructors in which the parameter directly initializes the member of the matching type.
If you want to see how deep you can go in worrying about this, watch Nicolai Josuttis’s presentation and good luck with that (“Perfect — Done!” n times after finding fault with the previous version. Ever been there?)
This is also summarized as F.15 in the C++ Core Guidelines.
update
Generally, you want to declare "string" parameters as std::string_view (by value). This allows you to pass an existing std::string object as efficiently as with const std::string&, and also pass a lexical string literal (like "hello!") without copying it, and pass objects of type string_view which is necessary now that those are in the ecosystem too.
The exception is when the function needs an actual std::string instance, in order to pass to another function that's declared to take const std::string&.
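A small sketch of that exception, assuming a hypothetical older function legacyLength that is declared to take const std::string&. viaView and viaString are illustrative names as well.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical older API that insists on a std::string.
std::size_t legacyLength(const std::string& s) { return s.size(); }

// With a string_view parameter, a std::string has to be built explicitly
// before calling legacyLength; string_view does not convert implicitly.
std::size_t viaView(std::string_view sv) {
    return legacyLength(std::string(sv));
}

// With a const std::string& parameter, an existing std::string is simply
// passed through without creating another one.
std::size_t viaString(const std::string& s) { return legacyLength(s); }

void check_legacy() {
    std::string owned = "hello";
    assert(viaView(owned) == 5);    // works, but constructs a second std::string
    assert(viaString(owned) == 5);  // no additional std::string
}
```

Whether the extra std::string in viaView matters depends on how hot the call is; the point is only that the conversion is no longer free.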
IMO using a C++ reference for std::string is a quick, local optimization, while passing by value could (or could not) be a better global optimization.
So the answer is: it depends on circumstances:
If you write all the code, from the outer functions down to the inner ones, you know what the code does and can use the reference const std::string &.
If you write library code, or make heavy use of library code where strings are passed, you likely gain more in a global sense by trusting std::string's copy constructor behavior.
As @JDługosz points out in the comments, Herb gives other advice in another (later?) talk, see roughly from here: https://youtu.be/xnqTKD8uD64?t=54m50s.
His advice boils down to only using value parameters for a function f that takes so-called sink arguments, assuming you will move construct from these sink arguments.
This general approach only adds the overhead of a move constructor for both lvalue and rvalue arguments compared to an optimal implementation of f tailored to lvalue and rvalue arguments respectively. To see why this is the case, suppose f takes a value parameter, where T is some copy and move constructible type:
void f(T x) {
T y{std::move(x)};
}
Calling f with an lvalue argument will result in a copy constructor being called to construct x, and a move constructor being called to construct y. On the other hand, calling f with an rvalue argument will cause a move constructor to be called to construct x, and another move constructor to be called to construct y.
In general, the optimal implementation of f for lvalue arguments is as follows:
void f(const T& x) {
T y{x};
}
In this case, only one copy constructor is called to construct y. The optimal implementation of f for rvalue arguments is, again in general, as follows:
void f(T&& x) {
T y{std::move(x)};
}
In this case, only one move constructor is called to construct y.
So a sensible compromise is to take a value parameter and have one extra move constructor call for either lvalue or rvalue arguments with respect to the optimal implementation, which is also the advice given in Herb's talk.
As @JDługosz pointed out in the comments, passing by value only makes sense for functions that will construct some object from the sink argument. When you have a function f that copies its argument, the pass-by-value approach will have more overhead than a general pass-by-const-reference approach. The pass-by-value approach for a function f that retains a copy of its parameter will have the form:
void f(T x) {
T y{...};
...
y = std::move(x);
}
In this case, there is a copy construction and a move assignment for an lvalue argument, and a move construction and move assignment for an rvalue argument. The most optimal case for an lvalue argument is:
void f(const T& x) {
T y{...};
...
y = x;
}
This boils down to an assignment only, which is potentially much cheaper than the copy constructor plus move assignment required for the pass-by-value approach. The reason for this is that the assignment might reuse existing allocated memory in y, and therefore prevent (de)allocations, whereas the copy constructor will usually allocate memory.
For an rvalue argument the most optimal implementation for f that retains a copy has the form:
void f(T&& x) {
T y{...};
...
y = std::move(x);
}
So, only a move assignment in this case. Passing an rvalue to the version of f that takes a const reference only costs an assignment instead of a move assignment. So relatively speaking, the version of f taking a const reference in this case as the general implementation is preferable.
So in general, for the most optimal implementation, you will need to overload or do some kind of perfect forwarding as shown in the talk. The drawback is a combinatorial explosion in the number of overloads required, depending on the number of parameters for f in case you opt to overload on the value category of the argument. Perfect forwarding has the drawback that f becomes a template function, which prevents making it virtual, and results in significantly more complex code if you want to get it 100% right (see the talk for the gory details).
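To illustrate the overload explosion: with two sink parameters, the overload approach needs four constructors (const&/const&, const&/&&, &&/const&, &&/&&), whereas a single forwarding template covers all of them. The following is only a sketch; Tracer is a hypothetical counting type and Pair a made-up class, and the constraint shown is one possible way to restrict the template, not the one from the talk.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

int g_copies = 0;
int g_moves = 0;

// Hypothetical stand-in for T that counts copy and move constructions.
struct Tracer {
    Tracer() {}
    Tracer(const Tracer&) { ++g_copies; }
    Tracer(Tracer&&) noexcept { ++g_moves; }
};

struct Pair {
    Tracer first;
    Tracer second;

    // One constructor template instead of four hand-written overloads.
    // The enable_if keeps it from accepting arguments that cannot
    // initialize a Tracer at all.
    template <typename A, typename B,
              typename = typename std::enable_if<
                  std::is_constructible<Tracer, A&&>::value &&
                  std::is_constructible<Tracer, B&&>::value>::type>
    Pair(A&& a, B&& b)
        : first(std::forward<A>(a)), second(std::forward<B>(b)) {}
};

void check_forwarding() {
    Tracer x, y;

    g_copies = g_moves = 0;
    Pair mixed(x, std::move(y));  // lvalue is copied, rvalue is moved
    (void)mixed;
    assert(g_copies == 1 && g_moves == 1);

    g_copies = g_moves = 0;
    Pair both_lvalues(x, x);      // two copies, no extra moves
    (void)both_lvalues;
    assert(g_copies == 2 && g_moves == 0);
}
```

Each argument costs exactly one copy or one move, which matches the optimal overload set, at the price of the constructor being a template.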
The problem is that "const" is a non-granular qualifier. What is usually meant by "const string ref" is "don't modify this string", not "don't modify the reference count". There is simply no way, in C++, to say which members are "const". They either all are, or none of them are.
In order to hack around this language issue, STL could allow "C()" in your example to make a move-semantic copy anyway, and dutifully ignore the "const" with regard to the reference count (mutable). As long as it was well-specified, this would be fine.
Since STL doesn't, I have a version of a string that const_casts<> away the reference counter (no way to retroactively make something mutable in a class hierarchy), and - lo and behold - you can freely pass cmstring's as const references, and make copies of them in deep functions, all day long, with no leaks or issues.
Since C++ offers no "derived class const granularity" here, writing up a good specification and making a shiny new "const movable string" (cmstring) object is the best solution I've seen.

C++: Overloading 'normal' functions with rvalue references - does it make sense? [duplicate]

I am a C++ beginner but not a programming beginner.
I'm trying to learn C++ (C++11), and the most important thing is still somewhat unclear to me: passing parameters.
I considered these simple examples:
A class that has all its members primitive types:
CreditCard(std::string number, int expMonth, int expYear,int pin):number(number), expMonth(expMonth), expYear(expYear), pin(pin)
A class that has as members primitive types + 1 complex type:
Account(std::string number, float amount, CreditCard creditCard) : number(number), amount(amount), creditCard(creditCard)
A class that has as members primitive types + 1 collection of some complex type:
Client(std::string firstName, std::string lastName, std::vector<Account> accounts):firstName(firstName), lastName(lastName), accounts(accounts)
When I create an account, I do this:
CreditCard cc("12345",2,2015,1001);
Account acc("asdasd",345, cc);
Obviously the credit card will be copied twice in this scenario.
If I rewrite that constructor as
Account(std::string number, float amount, CreditCard& creditCard)
: number(number)
, amount(amount)
, creditCard(creditCard)
there will be one copy.
If I rewrite it as
Account(std::string number, float amount, CreditCard&& creditCard)
: number(number)
, amount(amount)
, creditCard(std::forward<CreditCard>(creditCard))
There will be 2 moves and no copy.
I think sometimes you may want to copy some parameter, sometimes you don't want to copy when you create that object.
I come from C# and, being used to references, it's a bit strange to me and I think there should be 2 overloads for each parameter but I know I am wrong.
Are there any best practices of how to send parameters in C++ because I really find it, let's say, not trivial. How would you handle my examples presented above?
THE MOST IMPORTANT QUESTION FIRST:
Are there any best practices of how to send parameters in C++ because I really find it, let's say, not trivial
If your function needs to modify the original object being passed, so that after the call returns, modifications to that object will be visible to the caller, then you should pass by lvalue reference:
void foo(my_class& obj)
{
// Modify obj here...
}
If your function does not need to modify the original object, and does not need to create a copy of it (in other words, it only needs to observe its state), then you should pass by lvalue reference to const:
void foo(my_class const& obj)
{
// Observe obj here
}
This will allow you to call the function both with lvalues (lvalues are objects with a stable identity) and with rvalues (rvalues are, for instance temporaries, or objects you're about to move from as the result of calling std::move()).
One could also argue that for fundamental types or types for which copying is fast, such as int, bool, or char, there is no need to pass by reference if the function simply needs to observe the value, and passing by value should be favored. That is correct if reference semantics is not needed, but what if the function wanted to store a pointer to that very same input object somewhere, so that future reads through that pointer will see the value modifications that have been performed in some other part of the code? In this case, passing by reference is the correct solution.
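The difference between the two choices for a cheap type like int can be sketched as follows. Gauge and Snapshot are hypothetical classes invented for this illustration: the first takes its argument by reference and remembers where the value lives, the second takes it by value and keeps a snapshot.

```cpp
#include <cassert>

// Hypothetical observer: stores a pointer to the caller's object, so later
// reads see modifications made elsewhere.
class Gauge {
    const int* source_;
public:
    explicit Gauge(const int& value) : source_(&value) {}
    int read() const { return *source_; }
};

// For contrast: pass by value captures the value at the time of the call.
class Snapshot {
    int value_;
public:
    explicit Snapshot(int value) : value_(value) {}
    int read() const { return value_; }
};

void check_reference_semantics() {
    int level = 1;
    Gauge gauge(level);
    Snapshot snapshot(level);

    level = 42;  // modified "in some other part of the code"

    assert(gauge.read() == 42);    // sees the change through the pointer
    assert(snapshot.read() == 1);  // still holds the old value
}
```

Note that Gauge is only safe as long as the observed int outlives it; constructing it from a temporary would leave a dangling pointer.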
If your function does not need to modify the original object, but needs to store a copy of that object (possibly to return the result of a transformation of the input without altering the input), then you could consider taking by value:
void foo(my_class obj) // One copy or one move here, but not working on
// the original object...
{
// Working on obj...
// Possibly move from obj if the result has to be stored somewhere...
}
Invoking the above function will always result in one copy when passing lvalues, and in one move when passing rvalues. If your function needs to store this object somewhere, you could perform an additional move from it (for instance, in the case foo() is a member function that needs to store the value in a data member).
If moves are expensive for objects of type my_class, then you may consider overloading foo() and providing one version for lvalues (accepting an lvalue reference to const) and one version for rvalues (accepting an rvalue reference):
// Overload for lvalues
void foo(my_class const& obj) // No copy, no move (just reference binding)
{
my_class copyOfObj = obj; // Copy!
// Working on copyOfObj...
}
// Overload for rvalues
void foo(my_class&& obj) // No copy, no move (just reference binding)
{
my_class copyOfObj = std::move(obj); // Move!
// Notice, that invoking std::move() is
// necessary here, because obj is an
// *lvalue*, even though its type is
// "rvalue reference to my_class".
// Working on copyOfObj...
}
The above functions are so similar, in fact, that you could make one single function out of them: foo() could become a function template and you could use perfect forwarding for determining whether a move or a copy of the object being passed will be internally generated:
template<typename C>
void foo(C&& obj) // No copy, no move (just reference binding)
// ^^^
// Beware, this is not always an rvalue reference! This will "magically"
// resolve into my_class& if an lvalue is passed, and my_class&& if an
// rvalue is passed
{
my_class copyOfObj = std::forward<C>(obj); // Copy if lvalue, move if rvalue
// Working on copyOfObj...
}
You may want to learn more about this design by watching this talk by Scott Meyers (just mind the fact that the term "Universal References" that he is using is non-standard).
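The deduction described in the comments of the template above can be checked directly. This is only a sketch: deduced_as_lvalue() and the empty my_class are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct my_class {};  // hypothetical empty stand-in

// Reports whether C was deduced as an lvalue reference type.
template <typename C>
bool deduced_as_lvalue(C&&) {
    return std::is_lvalue_reference<C>::value;
}

void demo_deduction() {
    my_class obj;
    assert(deduced_as_lvalue(obj));              // lvalue:  C = my_class&
    assert(!deduced_as_lvalue(my_class{}));      // prvalue: C = my_class
    assert(!deduced_as_lvalue(std::move(obj)));  // xvalue:  C = my_class
}
```

Because C carries that information, std::forward<C>(obj) can turn the parameter back into an lvalue or an rvalue, whichever the caller originally passed.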
One thing to keep in mind is that std::forward will usually end up in a move for rvalues, so even though it looks relatively innocent, forwarding the same object multiple times may be a source of trouble, for instance moving from the same object twice! So be careful not to put this in a loop, and not to forward the same argument multiple times in a function call:
template<typename C>
void foo(C&& obj)
{
bar(std::forward<C>(obj), std::forward<C>(obj)); // Dangerous!
}
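Here is a small sketch of what goes wrong in the call above. Token, bar() and the husks_seen counter are invented for the demonstration; Token simply remembers whether it was built from a source that had already been moved from.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that tracks moved-from state.
struct Token {
    bool moved_from = false;       // set on the source when it is moved from
    bool built_from_husk = false;  // set when constructed from a moved-from source
    Token() = default;
    Token(const Token& other) : built_from_husk(other.moved_from) {}
    Token(Token&& other) noexcept : built_from_husk(other.moved_from) {
        other.moved_from = true;
    }
};

int husks_seen = 0;  // how many parameters of the last bar() call came from a moved-from source

void bar(Token a, Token b) {
    husks_seen = (a.built_from_husk ? 1 : 0) + (b.built_from_husk ? 1 : 0);
}

template <typename C>
void foo(C&& obj) {
    bar(std::forward<C>(obj), std::forward<C>(obj));  // same argument forwarded twice
}

void demo_double_forward() {
    Token t;
    foo(t);        // lvalue: both parameters are independent copies
    assert(husks_seen == 0);
    foo(Token{});  // rvalue: whichever parameter is built second gets a moved-from source
    assert(husks_seen == 1);
}
```

With a real type such as std::string, the second parameter would receive a string in a valid but unspecified state, which is very rarely what you want.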
Also notice, that you normally do not resort to the template-based solution unless you have a good reason for it, as it makes your code harder to read. Normally, you should focus on clarity and simplicity.
The above are just simple guidelines, but most of the time they will point you towards good design decisions.
CONCERNING THE REST OF YOUR POST:
If i rewrite it as [...] there will be 2 moves and no copy.
This is not correct. To begin with, an rvalue reference cannot bind to an lvalue, so this will only compile when you are passing an rvalue of type CreditCard to your constructor. For instance:
// Here you are passing a temporary (OK! temporaries are rvalues)
Account acc("asdasd",345, CreditCard("12345",2,2015,1001));
CreditCard cc("12345",2,2015,1001);
// Here you are passing the result of std::move (OK! that's also an rvalue)
Account acc("asdasd",345, std::move(cc));
But it won't work if you try to do this:
CreditCard cc("12345",2,2015,1001);
Account acc("asdasd",345, cc); // ERROR! cc is an lvalue
Because cc is an lvalue and rvalue references cannot bind to lvalues. Moreover, when binding a reference to an object, no move is performed: it's just a reference binding. Thus, there will only be one move.
So based on the guidelines provided in the first part of this answer, if you are concerned with the number of moves being generated when you take a CreditCard by value, you could define two constructor overloads, one taking an lvalue reference to const (CreditCard const&) and one taking an rvalue reference (CreditCard&&).
Overload resolution will select the former when passing an lvalue (in this case, one copy will be performed) and the latter when passing an rvalue (in this case, one move will be performed).
Account(std::string number, float amount, CreditCard const& creditCard)
: number(number), amount(amount), creditCard(creditCard) // copy here
{ }
Account(std::string number, float amount, CreditCard&& creditCard)
: number(number), amount(amount), creditCard(std::move(creditCard)) // move here
{ }
Your usage of std::forward<> is normally seen when you want to achieve perfect forwarding. In that case, your constructor would actually be a constructor template, and would look more or less as follows
template<typename C>
Account(std::string number, float amount, C&& creditCard)
: number(number), amount(amount), creditCard(std::forward<C>(creditCard)) { }
In a sense, this combines both the overloads I've shown previously into one single function: C will be deduced to be CreditCard& in case you are passing an lvalue, and due to the reference collapsing rules, it will cause this function to be instantiated:
Account(std::string number, float amount, CreditCard& creditCard) :
number(number), amount(amount), creditCard(std::forward<CreditCard&>(creditCard))
{ }
This will cause a copy-construction of creditCard, as you would wish. On the other hand, when an rvalue is passed, C will be deduced to be CreditCard, and this function will be instantiated instead:
Account(std::string number, float amount, CreditCard&& creditCard) :
number(number), amount(amount), creditCard(std::forward<CreditCard>(creditCard))
{ }
This will cause a move-construction of creditCard, which is what you want (because the value being passed is an rvalue, and that means we are authorized to move from it).
First, let me correct some details. When you say the following:
there will be 2 moves and no copy.
That is false. Binding to an rvalue reference is not a move. There is only one move.
Additionally, since CreditCard is not a template parameter, std::forward<CreditCard>(creditCard) is just a verbose way of saying std::move(creditCard).
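A quick way to convince yourself of that last point is the sketch below. The CreditCard here is a hypothetical stand-in that only counts move constructions, and demo_forward_is_move() is an invented helper.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

int moves = 0;  // counts move constructions

struct CreditCard {  // hypothetical stand-in
    CreditCard() = default;
    CreditCard(const CreditCard&) = default;
    CreditCard(CreditCard&&) noexcept { ++moves; }
};

// With a non-deduced type, std::forward<CreditCard> always yields CreditCard&&,
// which is the same type std::move yields.
static_assert(std::is_same<decltype(std::forward<CreditCard>(std::declval<CreditCard&>())),
                           decltype(std::move(std::declval<CreditCard&>()))>::value,
              "forward<CreditCard>(x) and move(x) have the same type");

void demo_forward_is_move() {
    CreditCard card;
    CreditCard stolen = std::forward<CreditCard>(card);  // moves, although card is an lvalue
    (void)stolen;
    assert(moves == 1);
}
```

So outside of a template with a deduced parameter, writing std::move says the same thing more honestly.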
Now...
If your types have "cheap" moves, you may want to just make your life easy and take everything by value and "std::move along".
Account(std::string number, float amount, CreditCard creditCard)
: number(std::move(number)),
amount(amount),
creditCard(std::move(creditCard)) {}
This approach will yield you two moves when it could yield only one, but if the moves are cheap, they may be acceptable.
While we are on this "cheap moves" matter, I should remind you that std::string is often implemented with the so-called small string optimisation, so its moves may not be as cheap as copying some pointers. As usual with optimisation issues, whether it matters or not is something to ask your profiler, not me.
What to do if you don't want to incur those extra moves? Maybe they prove too expensive, or worse, maybe the types cannot actually be moved and you might incur extra copies.
If there is only one problematic parameter, you can provide two overloads, with T const& and T&&. That will bind references all the time until the actual member initialisation, where a copy or move happens.
However, if you have more than one parameter, this leads to an exponential explosion in the number of overloads.
This is a problem that can be solved with perfect forwarding. That means you write a template instead, and use std::forward to carry along the value category of the arguments to their final destination as members.
template <typename TString, typename TCreditCard>
Account(TString&& number, float amount, TCreditCard&& creditCard)
: number(std::forward<TString>(number)),
amount(amount),
creditCard(std::forward<TCreditCard>(creditCard)) {}
First of all, std::string is quite a hefty class type just like std::vector. It's certainly not primitive.
If you're taking any large moveable types by value into a constructor, I would std::move them into the member:
Account(std::string number, float amount, CreditCard creditCard)
: number(std::move(number)), amount(amount), creditCard(std::move(creditCard))
{ }
This is exactly how I would recommend implementing the constructor. It causes the members number and creditCard to be move constructed, rather than copy constructed. When you use this constructor, there will be one copy (or move, if temporary) as the object is passed into the constructor and then one move when initialising the member.
Now let's consider this constructor:
Account(std::string number, float amount, CreditCard& creditCard)
: number(number), amount(amount), creditCard(creditCard)
{ }
You're right, this will involve one copy of creditCard, because it is first passed to the constructor by reference. But now you can't pass const objects to the constructor (because the reference is non-const) and you can't pass temporary objects. For example, you couldn't do this:
Account account("something", 10.0f, CreditCard("12345",2,2015,1001));
Now let's consider:
Account(std::string number, float amount, CreditCard&& creditCard)
: number(number), amount(amount), creditCard(std::forward<CreditCard>(creditCard))
{ }
Here you've shown a misunderstanding of rvalue references and std::forward. You should only really be using std::forward when the object you're forwarding is declared as T&& for some deduced type T. Here CreditCard is not deduced (I'm assuming), and so the std::forward is being used in error. Look up universal references.
I use a quite simple rule for general case:
Use copy for POD (int, bool, double,...)
and const & for everything else...
And whether you want a copy or not is not answered by the method signature but rather by what you do with the parameters.
struct A {
A(const std::string& aValue, const std::string& another)
: copiedValue(aValue), justARef(another) {}
std::string copiedValue;
const std::string& justARef;
};
A note on pointers: I almost never use them. Their only advantage over & is that they can be null or be re-assigned.
The most important thing seems unclear to me: passing parameters.
If you want to modify the variable passed inside the function / method
you pass it by reference
you pass it as a pointer (*)
If you want to read the value / variable passed inside the function / method
you pass it by const reference
If you want to modify the value passed inside the function / method
you pass it normally by copying the object (**)
(*) pointers may refer to dynamically allocated memory, therefore when possible you should prefer references over pointers even if references are, in the end, usually implemented as pointers.
(**) "normally" means by copy constructor (if you pass an object of the same type of the parameter) or by normal constructor (if you pass a compatible type for the class). When you pass an object as myMethod(std::string), for example, the copy constructor will be used if an std::string is passed to it, therefore you have to make sure that one exists.

Why pass a string by reference and make it a constant?

Just a general question say the signature of a function is:
void f( const string & s )...
Why is it necessary to pass this by reference if you are not actually changing the string (since it is constant)?
There are two alternatives to passing by reference - passing by pointer, and passing by value.
Passing by pointer is similar to passing by reference: the same argument of "why pass a pointer if you do not want to modify it" could be made for it.
Passing by value requires making a copy. Copying a string is usually more expensive, because dynamic memory needs to be allocated and de-allocated under the cover. That is why it is often a better idea to pass a const reference / pointer than passing a copy that you are not planning to change.
When you pass a variable by value, it makes a copy. Passing by reference avoids that copy. You want to mark it const for the same reason: Since you're not making a copy, you don't want to accidentally mess with the original. I think this could also potentially allow for compiler optimizations.
The reason you don't usually see this for int, char, float, and other primitive types is that they're relatively cheap to copy, and in some cases passing by reference is more expensive (for example, passing a char by reference could involve passing 64 bits of data (the pointer) instead of 8 bits). Passing by reference also adds some indirection, which isn't a big deal with a big type like a string, but is wasteful for something like an int.
It is not "necessary," but the common answer is "for performance reasons, to prevent copying," however that is a naive answer and the truth is a bit more complex.
In your example, assuming s really is immutable and something you don't "own or can't change," then the const decorator is appropriate for s. If the reference of s wasn't taken, then that would guarantee a copy (excluding compiler optimizations).
If f() is not going to use the copy of s after f() returns, then the effort of copying s was wasted. So, passing by reference prevents the copying and f() retains the ability to inspect the string s. Great. And again, that's the naive answer and pre-C++11, would be the mostly correct answer.
There are more scenarios worth considering in order to answer "is it necessary?" but I'll focus on just one:
If the caller of f() doesn't need the string s after invoking
f(), but f() needs to retain a copy of the data.
Suppose the code is:
void f(const std::string& arg1); // f()'s signature
void g(const std::string& arg1) {
std::string s(arg1);
s.append(" mutate s");
f(s);
}
In this case, you would have constructed the string s, passed it by const reference to f(), and everything is fine from a performance perspective if you assume f() is opaque and there are no further optimizations available.
Now, suppose f() needs a copy of the data in s, then what? Well, f() will call a copy constructor and copy s in to a local variable:
// Hypothetical f()
void f(const std::string& s) {
this->someString_ = s;
}
In this case, by the time the data is stored in someString_, the normal constructor will have been called in g(), and the copy ctor will have been called in f(), however the work done in g() will have been wasted. To improve performance, there are two things that can be done, pass by value and/or use move constructors.
// Explicitly move arg1 in to someString_
void f(std::string&& arg1) {
this->someString_ = std::move(arg1);
}
void g(const std::string& arg1) {
std::string s(arg1);
s.append(" mutate s");
f(std::move(s)); // s is an lvalue, so it must be cast to an rvalue to bind to std::string&&
}
The simpler alternative, available since C++11, is to take the parameter by value and move from it, so that the caller decides whether a copy or a move is made:
void f(std::string arg1) {
this->someString_ = std::move(arg1); // A named parameter is an lvalue, so the move must be explicit
}
void g(const std::string& arg1) {
std::string s(arg1);
s.append(" mutate s");
f(std::move(s)); // s is not needed afterwards, so hand its buffer over to f()
}
And in this case, the string is constructed in g() and no additional work was done anywhere. So in this case, the answer to,
Why is it necessary to pass this by reference if you are not actually
changing the string (since it is constant)?
The string wasn't changed, but it was copied, and therefore const reference was not necessary.
It's an exercise for the reader to list the optimizations the compiler can take when s is or isn't mutated after the call to f().
I can't recommend enough that people look into Dave Abrahams's post, 'Want Speed? Pass by Value', as well as Is it better in C++ to pass by value or pass by constant reference? and How to pass objects to functions in C++?.
In other words, you would get the speed of passing by reference (no extra copies are made)
and the integrity of passing by value (the original variable's value is not changed).
It isn't 'necessary', but it's a very good idea, for several reasons:
Efficiency. Not creating or destroying new strings is more efficient than creating and destroying them, especially as strings are arbitrary in length and therefore require dynamically allocated memory.
A string can be constructed from a string literal. If you specify const you allow the compiler to construct a temporary string from a literal so that the caller can just provide the literal rather than the string object. If you don't specify const the compiler can't do that, so the caller can't do that either.
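The second point can be sketched as follows. length_of() is a hypothetical helper, and the commented-out declaration only marks what would fail to compile.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: observes the string without modifying it.
std::size_t length_of(const std::string& s) { return s.size(); }

// A non-const reference parameter would reject literals and temporaries:
// std::size_t length_of_mutable(std::string& s);
// length_of_mutable("hello");  // error: cannot bind a temporary to std::string&

void demo_literal_binding() {
    std::string named = "hello";
    assert(length_of(named) == 5);    // binds directly to the existing object
    assert(length_of("hello") == 5);  // a temporary std::string is built from the literal
}
```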

Are the days of passing const std::string & as a parameter over?

I heard a recent talk by Herb Sutter who suggested that the reasons to pass std::vector and std::string by const & are largely gone. He suggested that writing a function such as the following is now preferable:
std::string do_something ( std::string inval )
{
std::string return_val;
// ... do stuff ...
return return_val;
}
I understand that the return_val will be an rvalue at the point the function returns and can therefore be returned using move semantics, which are very cheap. However, inval is still much larger than the size of a reference (which is usually implemented as a pointer). This is because a std::string has various components including a pointer into the heap and a member char[] for short string optimization. So it seems to me that passing by reference is still a good idea.
Can anyone explain why Herb might have said this?
The reason Herb said what he said is because of cases like this.
Let's say I have function A which calls function B, which calls function C. And A passes a string through B and into C. A does not know or care about C; all A knows about is B. That is, C is an implementation detail of B.
Let's say that A is defined as follows:
void A()
{
B("value");
}
If B and C take the string by const&, then it looks something like this:
void B(const std::string &str)
{
C(str);
}
void C(const std::string &str)
{
//Do something with `str`. Does not store it.
}
All well and good. You're just passing pointers around, no copying, no moving, everyone's happy. C takes a const& because it doesn't store the string. It simply uses it.
Now, I want to make one simple change: C needs to store the string somewhere.
void C(const std::string &str)
{
//Do something with `str`.
m_str = str;
}
Hello, copy constructor and potential memory allocation (ignore the Short String Optimization (SSO)). C++11's move semantics are supposed to make it possible to remove needless copy-constructing, right? And A passes a temporary; there's no reason why C should have to copy the data. It should just abscond with what was given to it.
Except it can't. Because it takes a const&.
If I change C to take its parameter by value, that just causes B to do the copy into that parameter; I gain nothing.
So if I had just passed str by value through all of the functions, relying on std::move to shuffle the data around, we wouldn't have this problem. If someone wants to hold on to it, they can. If they don't, oh well.
Is it more expensive? Yes; moving into a value is more expensive than using references. Is it less expensive than the copy? Not for small strings with SSO. Is it worth doing?
It depends on your use case. How much do you hate memory allocations?
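A minimal sketch of the by-value chain described above. Text is a hypothetical wrapper around std::string that counts copies and moves so the effect is visible; A, B, C and m_str follow the names used in this answer, and demo_chain() is invented for the check.

```cpp
#include <cassert>
#include <string>
#include <utility>

int copies = 0;  // copy constructions and copy assignments
int moves = 0;   // move constructions and move assignments

// Hypothetical string wrapper that counts how it is copied and moved.
struct Text {
    std::string value;
    Text(const char* s) : value(s) {}
    Text(const Text& o) : value(o.value) { ++copies; }
    Text(Text&& o) noexcept : value(std::move(o.value)) { ++moves; }
    Text& operator=(const Text& o) { value = o.value; ++copies; return *this; }
    Text& operator=(Text&& o) noexcept { value = std::move(o.value); ++moves; return *this; }
};

Text m_str("");  // stands in for wherever C stores the string

void C(Text str) { m_str = std::move(str); }  // keeps the value by moving it
void B(Text str) { C(std::move(str)); }       // passes the value along
void A() { B("value"); }                      // starts from a temporary

void demo_chain() {
    A();
    assert(m_str.value == "value");
    assert(copies == 0);  // nothing was copied on the way from A to C
    assert(moves >= 2);   // B -> C parameter, then parameter -> m_str
}
```

The moves are the price mentioned above: each hop costs a move instead of a reference binding, but no hop costs a copy.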
Are the days of passing const std::string & as a parameter over?
No. Many people take this advice (including Dave Abrahams) beyond the domain it applies to, and simplify it to apply to all std::string parameters -- Always passing std::string by value is not a "best practice" for any and all arbitrary parameters and applications because the optimizations these talks/articles focus on apply only to a restricted set of cases.
If you're returning a value, mutating the parameter, or taking the value, then passing by value could save expensive copying and offer syntactical convenience.
As ever, passing by const reference saves much copying when you don't need a copy.
Now to the specific example:
However inval is still quite a lot larger than the size of a reference (which is usually implemented as a pointer). This is because a std::string has various components including a pointer into the heap and a member char[] for short string optimization. So it seems to me that passing by reference is still a good idea. Can anyone explain why Herb might have said this?
If stack size is a concern (and assuming this is not inlined/optimized), return_val + inval > return_val -- IOW, peak stack usage can be reduced by passing by value here (note: oversimplification of ABIs). Meanwhile, passing by const reference can disable the optimizations. The primary reason here is not to avoid stack growth, but to ensure the optimization can be performed where it is applicable.
The days of passing by const reference aren't over -- the rules are just more complicated than they once were. If performance is important, you'll be wise to consider how you pass these types, based on the details you use in your implementations.
Short answer: NO! Long answer:
If you won't modify the string (treat it as read-only), pass it by const reference. (The referenced string obviously needs to stay alive while the function that uses it executes.)
If you plan to modify it, or you know it will go out of scope (threads), pass it by value; don't copy the const reference inside your function body.
There was a post on cpp-next.com called "Want speed, pass by value!". The TL;DR:
Guideline: Don’t copy your function arguments. Instead, pass them by value and let the compiler do the copying.
TRANSLATION of ^
Don’t copy your function arguments --- means: if you plan to modify the argument value by copying it to an internal variable, just use a value argument instead.
So, don't do this:
std::string function(const std::string& aString){
auto vString(aString);
vString.clear();
return vString;
}
do this:
std::string function(std::string aString){
aString.clear();
return aString;
}
When you need to modify the argument value in your function body.
You just need to be aware how you plan to use the argument in the function body. Read-only or NOT... and if it sticks within scope.
This highly depends on the compiler's implementation.
However, it also depends on what you use.
Let's consider the following functions:
bool foo1( const std::string v )
{
return v.empty();
}
bool foo2( const std::string & v )
{
return v.empty();
}
These functions are implemented in a separate compilation unit in order to avoid inlining. Then :
1. If you pass a literal to these two functions, you will not see much difference in performance. In both cases, a string object has to be created.
2. If you pass another std::string object, foo2 will outperform foo1, because foo1 will do a deep copy.
On my PC, using g++ 4.6.1, I got these results :
variable by reference: 1000000000 iterations -> time elapsed: 2.25912 sec
variable by value: 1000000000 iterations -> time elapsed: 27.2259 sec
literal by reference: 100000000 iterations -> time elapsed: 9.10319 sec
literal by value: 100000000 iterations -> time elapsed: 8.62659 sec
Unless you actually need a copy it's still reasonable to take const &. For example:
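#include <algorithm>
#include <cctype>
#include <string>
bool isprint(std::string const &s) { // true if every character is printable
return std::all_of(s.begin(), s.end(), [](unsigned char c) { return std::isprint(c) != 0; });
}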
If you change this to take the string by value then you'll end up moving or copying the parameter, and there's no need for that. Not only is copy/move likely more expensive, but it also introduces a new potential failure; the copy/move could throw an exception (e.g., allocation during copy could fail) whereas taking a reference to an existing value can't.
If you do need a copy then passing and returning by value is usually (always?) the best option. In fact I generally wouldn't worry about it in C++03 unless you find that extra copies actually cause a performance problem. Copy elision seems pretty reliable on modern compilers. I think people's skepticism and insistence that you have to check your table of compiler support for RVO is mostly obsolete nowadays.
In short, C++11 doesn't really change anything in this regard except for people that didn't trust copy elision.
Almost.
In C++17, we have basic_string_view<?>, which brings us down to basically one narrow use case for std::string const& parameters.
The existence of move semantics has eliminated one use case for std::string const& -- if you are planning on storing the parameter, taking a std::string by value is more optimal, as you can move out of the parameter.
If someone called your function with a raw C "string" this means only one std::string buffer is ever allocated, as opposed to two in the std::string const& case.
However, if you don't intend to make a copy, taking by std::string const& is still useful in C++14.
With std::string_view, so long as you aren't passing said string to an API that expects C-style '\0'-terminated character buffers, you can more efficiently get std::string like functionality without risking any allocation. A raw C string can even be turned into a std::string_view without any allocation or character copying.
At that point, the use for std::string const& is when you aren't copying the data wholesale, and are going to pass it on to a C-style API that expects a null terminated buffer, and you need the higher level string functions that std::string provides. In practice, this is a rare set of requirements.
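As a rough sketch of the std::string_view style described above (C++17 assumed): count_char() is a hypothetical read-only helper, not something from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical read-only helper: counts occurrences of a character.
std::size_t count_char(std::string_view text, char wanted) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == wanted) ++n;
    }
    return n;
}

void demo_string_view() {
    std::string owned = "banana";
    assert(count_char(owned, 'a') == 3);     // std::string converts to a view, no copy
    assert(count_char("banana", 'n') == 2);  // literal viewed in place, no std::string built
    // A sub-range can be inspected without allocating a new string: "ban"
    assert(count_char(std::string_view(owned).substr(0, 3), 'a') == 1);
}
```

The view does not own the characters, so this only works while the underlying buffer stays alive, and the view is not guaranteed to be '\0'-terminated.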
std::string is not Plain Old Data(POD), and its raw size is not the most relevant thing ever. For example, if you pass in a string which is above the length of SSO and allocated on the heap, I would expect the copy constructor to not copy the SSO storage.
The reason this is recommended is because inval is constructed from the argument expression, and thus is always moved or copied as appropriate- there is no performance loss, assuming that you need ownership of the argument. If you don't, a const reference could still be the better way to go.
I've copy/pasted the answer from this question here, and changed the names and spelling to fit this question.
Here is code to measure what is being asked:
#include <iostream>
#include <utility> // for std::move
struct string
{
string() {}
string(const string&) {std::cout << "string(const string&)\n";}
string& operator=(const string&) {std::cout << "string& operator=(const string&)\n";return *this;}
#if (__has_feature(cxx_rvalue_references))
string(string&&) {std::cout << "string(string&&)\n";}
string& operator=(string&&) {std::cout << "string& operator=(string&&)\n";return *this;}
#endif
};
#if PROCESS == 1
string
do_something(string inval)
{
// do stuff
return inval;
}
#elif PROCESS == 2
string
do_something(const string& inval)
{
string return_val = inval;
// do stuff
return return_val;
}
#if (__has_feature(cxx_rvalue_references))
string
do_something(string&& inval)
{
// do stuff
return std::move(inval);
}
#endif
#endif
string source() {return string();}
int main()
{
std::cout << "do_something with lvalue:\n\n";
string x;
string t = do_something(x);
#if (__has_feature(cxx_rvalue_references))
std::cout << "\ndo_something with xvalue:\n\n";
string u = do_something(std::move(x));
#endif
std::cout << "\ndo_something with prvalue:\n\n";
string v = do_something(source());
}
For me this outputs:
$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=1 test.cpp
$ a.out
do_something with lvalue:
string(const string&)
string(string&&)
do_something with xvalue:
string(string&&)
string(string&&)
do_something with prvalue:
string(string&&)
$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=2 test.cpp
$ a.out
do_something with lvalue:
string(const string&)
do_something with xvalue:
string(string&&)
do_something with prvalue:
string(string&&)
The table below summarizes my results (using clang -std=c++11). The first number is the number of copy constructions and the second number is the number of move constructions:
+----+--------+--------+---------+
| | lvalue | xvalue | prvalue |
+----+--------+--------+---------+
| p1 | 1/1 | 0/2 | 0/1 |
+----+--------+--------+---------+
| p2 | 1/0 | 0/1 | 0/1 |
+----+--------+--------+---------+
The pass-by-value solution requires only one overload but costs an extra move construction when passing lvalues and xvalues. This may or may not be acceptable for any given situation. Both solutions have advantages and disadvantages.
Herb Sutter is still on record, along with Bjarne Stroustrup, in recommending const std::string& as a parameter type; see https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rf-in .
There is a pitfall not mentioned in any of the other answers here: if you pass a string literal to a const std::string& parameter, it will pass a reference to a temporary string, created on-the-fly to hold the characters of the literal. If you then save that reference, it will be invalid once the temporary string is destroyed. To be safe, you must save a copy, not the reference. The problem stems from the fact that string literals are const char[N] types, requiring conversion to std::string.
The code below illustrates the pitfall and the workaround, along with a minor efficiency option -- overloading with a const char* method, as described at Is there a way to pass a string literal as reference in C++.
(Note: Sutter & Stroustrup advise that if you keep a copy of the string, also provide an overloaded function with a && parameter and std::move() it.)
#include <string>
#include <iostream>
class WidgetBadRef {
public:
WidgetBadRef(const std::string& s) : myStrRef(s) // copy the reference...
{}
const std::string& myStrRef; // might be a reference to a temporary (oops!)
};
class WidgetSafeCopy {
public:
WidgetSafeCopy(const std::string& s) : myStrCopy(s)
// constructor for string references; copy the string
{std::cout << "const std::string& constructor\n";}
WidgetSafeCopy(const char* cs) : myStrCopy(cs)
// constructor for string literals (and char arrays);
// for minor efficiency only;
// create the std::string directly from the chars
{std::cout << "const char * constructor\n";}
const std::string myStrCopy; // save a copy, not a reference!
};
int main() {
WidgetBadRef w1("First string");
WidgetSafeCopy w2("Second string"); // uses the const char* constructor, no temp string
WidgetSafeCopy w3(w2.myStrCopy); // uses the String reference constructor
std::cout << w1.myStrRef << "\n"; // garbage out
std::cout << w2.myStrCopy << "\n"; // OK
std::cout << w3.myStrCopy << "\n"; // OK
}
OUTPUT:
const char * constructor
const std::string& constructor
Second string
Second string
See Herb Sutter's “Back to the Basics! Essentials of Modern C++ Style”. Among other topics, he reviews the parameter passing advice that’s been given in the past, and new ideas that come in with C++11 and specifically looks at the idea of passing strings by value.
The benchmarks show that passing std::strings by value, in cases where the function will copy it in anyway, can be significantly slower!
This is because you are forcing it to always make a full copy (and then move into place), while the const& version will update the old string which may reuse the already-allocated buffer.
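The two setter styles being compared look roughly like this. Employee, set_name(), set_name_by_value() and demo_setters() are hypothetical names for the sketch; whether the existing buffer is actually reused depends on the standard library implementation, so the code only checks the resulting values.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class with the two setter styles being compared.
class Employee {
public:
    // const& setter: copy-assigns, so m_name may reuse its existing buffer
    // when that buffer is already large enough.
    void set_name(const std::string& name) { m_name = name; }

    // By-value setter: for an lvalue argument the parameter is always a fresh
    // copy, which is then move-assigned into m_name.
    void set_name_by_value(std::string name) { m_name = std::move(name); }

    const std::string& name() const { return m_name; }

private:
    std::string m_name;
};

void demo_setters() {
    Employee e;
    std::string first = "Alexander the Great";
    std::string second = "Bob";
    e.set_name(first);           // first assignment allocates as needed
    e.set_name(second);          // shorter value: the existing buffer can be reused
    assert(e.name() == "Bob");
    e.set_name_by_value(first);  // copies first into the parameter, then moves it in
    assert(e.name() == first);
}
```

Both setters end up with the same value; the difference the benchmarks measure is only in how many allocations happen along the way.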
See his slide 27: For “set” functions, option 1 is the same as it always was. Option 2 adds an overload for rvalue reference, but this gives a combinatorial explosion if there are multiple parameters.
It is only for “sink” parameters where a string must be created (not have its existing value changed) that the pass-by-value trick is valid. That is, constructors in which the parameter directly initializes the member of the matching type.
If you want to see how deep you can go in worrying about this, watch Nicolai Josuttis’s presentation and good luck with that (“Perfect — Done!” n times after finding fault with the previous version. Ever been there?)
This is also summarized as ⧺F.15 in the Standard Guidelines.
update
Generally, you want to declare "string" parameters as std::string_view (by value). This allows you to pass an existing std::string object as efficiently as with const std::string&, and also pass a lexical string literal (like "hello!") without copying it, and pass objects of type string_view which is necessary now that those are in the ecosystem too.
The exception is when the function needs an actual std::string instance, in order to pass to another function that's declared to take const std::string&.
IMO using the C++ reference for std::string is a quick and short local optimization, while using passing by value could be (or not) a better global optimization.
So the answer is: it depends on circumstances:
If you write all the code from the outside to the inside functions, you know what the code does, you can use the reference const std::string &.
If you write the library code or use heavily library code where strings are passed, you likely gain more in global sense by trusting std::string copy constructor behavior.
As @JDługosz points out in the comments, Herb gives other advice in another (later?) talk, see roughly from here: https://youtu.be/xnqTKD8uD64?t=54m50s.
His advice boils down to only using value parameters for a function f that takes so-called sink arguments, assuming you will move construct from these sink arguments.
This general approach adds only one extra move construction, for both lvalue and rvalue arguments, compared to an optimal implementation of f tailored to lvalue and rvalue arguments respectively. To see why this is the case, suppose f takes a value parameter of type T, where T is some copy- and move-constructible type:
void f(T x) {
T y{std::move(x)};
}
Calling f with an lvalue argument will result in a copy constructor being called to construct x, and a move constructor being called to construct y. On the other hand, calling f with an rvalue argument will cause a move constructor to be called to construct x, and another move constructor to be called to construct y.
In general, the optimal implementation of f for lvalue arguments is as follows:
void f(const T& x) {
T y{x};
}
In this case, only one copy constructor is called to construct y. The optimal implementation of f for rvalue arguments is, again in general, as follows:
void f(T&& x) {
T y{std::move(x)};
}
In this case, only one move constructor is called to construct y.
So a sensible compromise is to take a value parameter and have one extra move constructor call for either lvalue or rvalue arguments with respect to the optimal implementation, which is also the advice given in Herb's talk.
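The counts above can be checked with an instrumented type. This is only a sketch: Tracked and the three function names are hypothetical, and it assumes C++17 for the inline static members.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type that counts its copy and move constructions.
struct Tracked {
    static inline int copies = 0;
    static inline int moves = 0;

    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }

    static void reset() { copies = 0; moves = 0; }
};

// Pass by value, then move into place.
void by_value(Tracked x) { Tracked y{std::move(x)}; (void)y; }

// Tailored for lvalues: exactly one copy construction.
void by_const_ref(const Tracked& x) { Tracked y{x}; (void)y; }

// Tailored for rvalues: exactly one move construction.
void by_rvalue_ref(Tracked&& x) { Tracked y{std::move(x)}; (void)y; }
```

Calling by_value with an lvalue records one copy and one move, and calling it with std::move(a) records two moves. The tailored versions record one copy and one move respectively, so the by-value version costs one extra move either way. A prvalue argument such as Tracked{} can do better still, because the construction of the parameter is elided.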
As @JDługosz pointed out in the comments, passing by value only makes sense for functions that will construct some object from the sink argument. When a function f instead assigns its argument to an object that already exists, the pass-by-value approach has more overhead than a general pass-by-const-reference approach. The pass-by-value approach for a function f that retains a copy of its parameter has the form:
void f(T x) {
T y{...};
...
y = std::move(x);
}
In this case, there is a copy construction and a move assignment for an lvalue argument, and a move construction and a move assignment for an rvalue argument. The optimal case for an lvalue argument is:
void f(const T& x) {
T y{...};
...
y = x;
}
This boils down to an assignment only, which is potentially much cheaper than the copy constructor plus move assignment required for the pass-by-value approach. The reason for this is that the assignment might reuse existing allocated memory in y, and therefore prevent (de)allocations, whereas the copy constructor will usually allocate memory.
For an rvalue argument, the optimal implementation of f that retains a copy has the form:
void f(T&& x) {
T y{...};
...
y = std::move(x);
}
So, only a move assignment in this case. Passing an rvalue to the version of f that takes a const reference costs a copy assignment instead of a move assignment, with no extra construction. So relatively speaking, the const-reference version of f is the preferable general implementation in this case.
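The same kind of bookkeeping works for the "retain a copy" case. In this sketch the Counted type and the retain_* functions are hypothetical, and the retained object y is passed in as a parameter (rather than being a local as in the snippets above) so that the operations on it can be observed from outside. It assumes C++17.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type that counts constructions and assignments.
struct Counted {
    static inline int copy_ctor = 0;
    static inline int move_ctor = 0;
    static inline int copy_assign = 0;
    static inline int move_assign = 0;

    Counted() = default;
    Counted(const Counted&) { ++copy_ctor; }
    Counted(Counted&&) noexcept { ++move_ctor; }
    Counted& operator=(const Counted&) { ++copy_assign; return *this; }
    Counted& operator=(Counted&&) noexcept { ++move_assign; return *this; }

    static void reset() {
        copy_ctor = move_ctor = copy_assign = move_assign = 0;
    }
};

// y stands in for the already existing object that retains the value.
void retain_by_value(Counted& y, Counted x) { y = std::move(x); }
void retain_by_const_ref(Counted& y, const Counted& x) { y = x; }
void retain_by_rvalue_ref(Counted& y, Counted&& x) { y = std::move(x); }
```

For an lvalue argument, retain_by_value records a copy construction plus a move assignment, while retain_by_const_ref records a single copy assignment. For a type like std::string, the copy construction is where a fresh allocation typically happens, which is what the const-reference version can avoid.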
So in general, for the optimal implementation, you will need to overload or do some kind of perfect forwarding as shown in the talk. The drawback of overloading on the value category of each argument is a combinatorial explosion in the number of overloads required as the number of parameters of f grows. Perfect forwarding has the drawback that f becomes a function template, which prevents making it virtual, and results in significantly more complex code if you want to get it 100% right (see the talk for the gory details).
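For reference, a rough sketch of what the perfect-forwarding variant could look like for the Creature class from the question, assuming C++17. This is not the version from the talk: the is_constructible constraint is one possible way to keep the template from accepting unrelated types, and the name_ member and name() accessor are illustrative.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

class Creature {
    std::string name_;

public:
    // Forwarding constructor: one template covers lvalues (copied) and
    // rvalues (moved). The constraint rejects arguments from which a
    // std::string cannot be constructed, so copying a Creature still
    // selects the implicit copy constructor.
    template <typename S,
              typename = std::enable_if_t<
                  std::is_constructible_v<std::string, S&&>>>
    explicit Creature(S&& name) : name_(std::forward<S>(name)) {}

    const std::string& name() const { return name_; }
};
```

Even this small version shows the cost: the constructor is now a template, it cannot be virtual, and the constraint has to be written by hand.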
The problem is that "const" is a non-granular qualifier. What is usually meant by "const string ref" is "don't modify this string", not "don't modify the reference count". There is simply no way, in C++, to say which members are "const". They either all are, or none of them are.
In order to hack around this language issue, STL could allow "C()" in your example to make a move-semantic copy anyway, and dutifully ignore the "const" with regard to the reference count (mutable). As long as it was well-specified, this would be fine.
Since STL doesn't, I have a version of a string that const_casts away the constness of the reference counter (there is no way to retroactively make something mutable in a class hierarchy), and, lo and behold, you can freely pass cmstrings as const references, and make copies of them in deep functions, all day long, with no leaks or issues.
Since C++ offers no "derived class const granularity" here, writing up a good specification and making a shiny new "const movable string" (cmstring) object is the best solution I've seen.