Performance comparison: f(std::string&&) vs f(T&&) - c++

I'm trying to understand the performance implications of using WidgetURef::setName (URef being a Universal Reference, the term coined by Scott Meyers) vs WidgetRRef::setName (RRef being an R-value Reference):
#include <string>
class WidgetURef {
public:
template<typename T>
void setName(T&& newName)
{
name = std::move(newName);
}
private:
std::string name;
};
class WidgetRRef {
public:
void setName(std::string&& newName)
{
name = std::move(newName);
}
private:
std::string name;
};
int main() {
WidgetURef w_uref;
w_uref.setName("Adela Novak");
WidgetRRef w_rref;
w_rref.setName("Adela Novak");
}
I do appreciate that with universal references one should be using std::forward instead, but this is just an (imperfect) example to highlight the interesting bit.
Question
In this particular example, what are the performance implications of using one implementation vs the other? Although WidgetURef requires type deduction, it's otherwise identical to WidgetRRef, isn't it? At least in this particular scenario, in both cases the argument is an r-value reference, so no temporaries are created. Is this reasoning correct?
Context
The example was taken from Item25 of Scott Meyers' "Effective Modern C++" (p. 170). According to the book (provided that my understanding is correct!), the version taking a universal reference T&& doesn't require temporary objects and the other one, taking std::string&&, does. I don't really see why.

setName(T&& newName) with argument "Adela Novak" gets T deduced as const char (&)[12] which is then assigned to std::string.
setName(std::string&& newName) with argument "Adela Novak" creates a temporary std::string object which is then move assigned to std::string.
The first one is more efficient here because there is no moving involved.

In this particular example, what are the performance implications of using one implementation vs the other?
Universal references, as Scott Meyers calls them, are not primarily there for performance reasons, but, loosely speaking, to treat both L- and Rvalue references in the same manner to avoid countless overloads (and for being able to propagate all type information during forwarding).
[...] so no temporaries are created. Is this reasoning correct?
Rvalue references do not prevent temporaries from being created. Rvalue references are the kind of references that are able to be bound to temporaries (apart from const lvalue references)! Of course, in your example, there will be temporaries, but the rvalue reference can bind to it. The universal reference first has to undergo the reference collapsing but in the end, the behaviour will be identical in your case:
// explicitly created temporary
w_uref.setName(std::string("Adela Novak"));
// will create temporary of std::string --> uref collapses to rvalue ref
// so is effectively the same as
w_rref.setName("Adela Novak");
By using the rvalue reference on the other hand, you force a temporary implicitly as std::string&& cannot bind to that literal.
w_rref.setName("Adela Novak"); // need conversion
So the compiler will create a temporary std::string from the literal the rvalue reference then can bind to.
I don't really see why.
In this case, the template will be resolved to const char(&)[12] and thus, no std::string temporary will be created in contrast to the case above. This therefore is more efficient.
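To make the deduction visible, here is a minimal sketch that checks at compile time what T becomes for different arguments. The names deduced and probe are hypothetical helpers introduced only for this illustration; they are not part of the book's example.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: carries the deduced T out of a forwarding-reference call.
template <typename T>
struct deduced { using type = T; };

// Declared only; it is used solely inside decltype, so no definition is needed.
template <typename T>
deduced<T> probe(T&&);

// String literal: an lvalue of type const char[12] (11 characters plus '\0'),
// so T is a reference to that array and no std::string is involved yet.
static_assert(std::is_same<decltype(probe("Adela Novak"))::type,
                           const char (&)[12]>::value,
              "T is a reference to the array, not std::string");

// std::string lvalue: T is std::string&, and T&& collapses to std::string&.
static_assert(std::is_same<decltype(probe(std::declval<std::string&>()))::type,
                           std::string&>::value,
              "lvalue argument");

// std::string rvalue: T is std::string, so T&& is std::string&&.
static_assert(std::is_same<decltype(probe(std::declval<std::string>()))::type,
                           std::string>::value,
              "rvalue argument");
```

If any of these assumptions about deduction were wrong, the file would simply fail to compile.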

Scott himself says that WidgetURef "compiles, but is bad, bad, bad!" (verbatim). These two classes behave differently as you use std::move instead of std::forward: setName therefore can modify its argument:
#include <string>
#include <iostream>
class WidgetURef {
public:
template<typename T>
void setName(T&& newName)
{
name = std::move(newName);
}
private:
std::string name;
};
int main() {
WidgetURef w_uref;
std::string name = "Hello";
w_uref.setName(name);
std::cout << "name=" << name << "\n";
}
can easily print name=, meaning that the value of name was changed. And indeed it does on ideone at the very least.
On the other hand, WidgetRRef requires that the passed argument is a rvalue-reference, so the example above wouldn't compile without explicit setName(std::move(name)).
Neither WidgetURef nor WidgetRRef requires creating extra copies if you pass std::string as an argument. However, if you pass something which std::string can be assigned from (such as const char*), then the first example will pass that by reference and assign it to string (without any copies except for copying data from C-style string into std::string), and the second example will first create a temporary string, and then pass it as an rvalue reference to the method. These properties are preserved if you replace std::move(newName) with a correct std::forward<T>(newName).
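For comparison, a sketch of the std::forward variant. WidgetForward, getName and lvalue_survives_forwarding are hypothetical names added for this illustration. With forwarding, an lvalue argument is copied from rather than moved from, so the caller's string should stay intact.

```cpp
#include <string>
#include <utility>

// Hypothetical variant of the question's class that forwards instead of moving.
class WidgetForward {
public:
    template <typename T>
    void setName(T&& newName) {
        // Moves only when the caller passed an rvalue; copies from lvalues.
        name = std::forward<T>(newName);
    }
    const std::string& getName() const { return name; }
private:
    std::string name;
};

// Returns true if an lvalue argument is left untouched by setName.
bool lvalue_survives_forwarding() {
    WidgetForward w;
    std::string n = "Hello";
    w.setName(n);  // T = std::string&, so this is a copy assignment
    return n == "Hello" && w.getName() == "Hello";
}
```

A string literal still works with this version and is assigned directly, e.g. w.setName("Adela Novak").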

Assuming the arguments as stated in the question
template<typename T>
void setName(T&& newName)
{
name = std::forward<T>(newName);
}
Will invoke the std::string assignment operator for the data member name with a const char * argument
void setName(std::string&& newName)
{
name = std::move(newName);
}
Invokes the std::string constructor to create a temporary, to which the rvalue reference can bind.
Invokes the std::string move assignment operator for the data member name with a std::string&& argument.
Invokes std::string destructor to destroy the temporary, from which we moved the data.
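The sequence above can be observed with a stand-in type, since std::string itself cannot be instrumented. This is only a sketch: CountedName, Tally, CountingURef, CountingRRef, run_uref and run_rref are all hypothetical names, and CountedName merely imitates the two relevant std::string operations (construction from const char* and assignment from const char*).

```cpp
#include <string>
#include <utility>

// Hypothetical counters for the operations of interest.
struct Tally {
    int conversions = 0;   // CountedName(const char*)
    int move_assigns = 0;  // operator=(CountedName&&)
    int char_assigns = 0;  // operator=(const char*)
};
Tally tally;

// Hypothetical stand-in for std::string that records what is called on it.
struct CountedName {
    std::string text;
    CountedName() = default;
    CountedName(const char* s) : text(s) { ++tally.conversions; }
    CountedName& operator=(CountedName&& o) {
        text = std::move(o.text);
        ++tally.move_assigns;
        return *this;
    }
    CountedName& operator=(const char* s) {
        text = s;
        ++tally.char_assigns;
        return *this;
    }
};

struct CountingURef {
    CountedName name;
    template <typename T>
    void setName(T&& n) { name = std::forward<T>(n); }
};

struct CountingRRef {
    CountedName name;
    void setName(CountedName&& n) { name = std::move(n); }
};

// Universal reference: the literal reaches operator=(const char*) directly.
Tally run_uref() {
    tally = Tally{};
    CountingURef w;
    w.setName("Adela Novak");
    return tally;
}

// Rvalue reference: a temporary is constructed, moved from, then destroyed.
Tally run_rref() {
    tally = Tally{};
    CountingRRef w;
    w.setName("Adela Novak");
    return tally;
}
```

Under these assumptions run_uref records a single assignment from const char*, while run_rref records one converting construction plus one move assignment.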

Related

Understanding comment from the errata about Item 41 of EMC++

In Item 41, Scott Meyers writes the following two classes:
class Widget {
public:
void addName(const std::string& newName) // take lvalue;
{ names.push_back(newName); } // copy it
void addName(std::string&& newName) // take rvalue;
{ names.push_back(std::move(newName)); } // move it; ...
private:
std::vector<std::string> names;
};
class Widget {
public:
template<typename T> // take lvalues
void addName(T&& newName) // and rvalues;
{ // copy lvalues,
names.push_back(std::forward<T>(newName)); // move rvalues;
} // ...
private:
std::vector<std::string> names;
};
What's written in the comments is correct, even if it doesn't mean at all that the two solutions are equivalent, and some of the differences are indeed discussed in the book.
In the errata, however, the author comments on another difference not discussed in the book:
Another behavioral difference between (1) overloading for lvalues and rvalues and (2) a template taking a universal reference (uref) is that the lvalue overload declares its parameter const, while the uref approach doesn't. This means that functions invoked on the lvalue overload's parameter will always be the const versions, while functions invoked on the uref version's parameter will be the const versions only if the argument passed in is const. In other words, non-const lvalue arguments may yield different behavior in the overloading design vis-a-vis the uref design.
But I'm not sure I understand it.
Actually, writing this question I've probably understood, but I'm not writing an answer as I'm still not sure.
Probably the author is saying that when a non-const lvalue is passed to addName, newName is const in the first code, and non-const in the second code, which means that if newName was passed to another function (or a member function was called on it), then in the first code that function would be required to take a const parameter (or be a const member function).
Have I interpreted correctly?
However, I don't see how this makes a difference in the specific example, since no member function is called on newName, nor is it passed to a function which has different overloads for const and non-const parameters (not exactly: std::vector<T>::push_back has two overloads, for const T& arguments and T&& arguments, but an lvalue would still bind only to the former overload...).
In the second scenario, when a const std::string lvalue is passed to the template
template<typename T>
void addName(T&& newName)
{ names.push_back(std::forward<T>(newName)); }
the instantiation results in the following (where I've removed the std::forward call as it is, in practice, a no-op)
void addName(const std::string& newName)
{ names.push_back(newName); }
whereas if a std::string lvalue is passed, then the resulting instance of addName is
void addName(std::string& newName)
{ names.push_back(newName); }
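which means that newName is a non-const lvalue inside the function, so anything called on it or with it is resolved against a non-const std::string. (push_back itself still selects its const T& overload, because its only other overload takes T&&, to which an lvalue cannot bind.)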
In the first scenario, when a std::string lvalue is passed to addName, the first overload is picked regardless of the const-ness
void addName(const std::string& newName)
{ names.push_back(newName); }
which means that newName is const inside the function in both cases, so only const-compatible operations can be selected on it.
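Since push_back does not show the difference, here is a small sketch with a type that does have const and non-const overloads. Probe, viaOverload and viaURef are hypothetical names made up for this illustration; they stand in for std::string and the two addName designs.

```cpp
#include <string>
#include <utility>

// Hypothetical type whose member function is overloaded on constness.
struct Probe {
    const char* which()       { return "non-const"; }
    const char* which() const { return "const"; }
};

// Design (1): overloads for lvalues and rvalues.
// Every lvalue, const or not, lands in the const Probe& overload.
std::string viaOverload(const Probe& p) { return p.which(); }
std::string viaOverload(Probe&& p)      { return p.which(); }

// Design (2): a single template taking a universal reference.
// The constness of p follows the constness of the argument.
template <typename T>
std::string viaURef(T&& p) { return p.which(); }
```

With a non-const lvalue Probe p, viaOverload(p) calls the const version of which(), whereas viaURef(p) calls the non-const one; with a const lvalue both designs agree.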

Implementing rvalue references as parameters in function overloads

I've already asked on code review and software engineering but the topic didn't fit the site, so I'm asking here hoping this is not opinion-based. I am an "old school" C++ developer (I've stopped at C++ 2003) but now I've read a few books on modern C++ 11/17 and I'm rewriting some libraries of mine.
The first thing I've done is adding move constructor/assignment operator where needed ( = classes that already had destructor + copy constructor and copy assignment). Basically I'm using the rule of five.
Most of my functions are declared like
func(const std::string& s);
Which is the common way to pass a reference avoiding a copy. But there is also the new move semantics, and there's something that I wasn't able to find in my books/online. This code:
void fun(std::string& x) {
x.append(" world");
std::cout << x;
}
int main()
{
std::string s{"Hello "};
fun(s);
}
Can also be written as:
void fun(std::string&& x) {
x.append(" world");
std::cout << x;
}
int main()
{
std::string s{"Hello "};
fun(std::move(s));
//or fun("Hello ");
// or fun(std::string {"Hello" });
}
My question is: when should I declare functions that accept a parameter that is an rvalue reference?
I understand the usage of && semantics on constructors and assignment operators but not really on functions. In the example above (first function) I have a std::string& x, so the function cannot be called as fun("Hello "); of course, because for that I would have to declare the type as const std::string& x. But now the const doesn't allow me to change the string!
Yes, I could use a const cast but I rarely do casts (and if it's the case, they're dynamic casts). The power of the && is that I avoid copies, I don't have to do something like
std::string x = "...";
fun(x); //void fun(std::string& x) {}
and I can pass temporary values that will be moved. Should I declare functions with rvalue references when possible?
I have a library that I'm rewriting with modern C++ 17 and I have functions like:
//only const-ref
Type1 func(const type2& x);
Type3 function(const type4& x);
I am asking if it's worth rewriting all of them as
//const-ref AND rvalue reference
Type1 func(const type2& x);
Type3 function(const type4& x);
Type1 func(type2&& x);
Type3 function(type4&& x);
I don't want to create too many overloads that may be useless but if a user of my library wanted to use the move operation I should create the && param types. Of course I am not doing this for primitive types (int, double, char...) but for containers or classes. What do you suggest?
I am not sure if the latter scenario (with both versions) would be useful or not.
Let me comment on four scenarios in your question and examples.
std::string_view with pass-by-value is supposed to replace const std::string& parameters and whenever you can guarantee the necessary preconditions for a safe usage of std::string_view (lifetime, pointee doesn't change), it's a good candidate to start modernizing your function signatures.
const T& vs. T&& (where T is not subject to template type deduction) with known usage scenarios. The void fun function that appends to a given, modifiable string only makes sense as void fun(std::string&&) if calling code doesn't need the result after the call. In this case, the rvalue-reference signature documents this expectation nicely and is the way to go. But these cases are rather rare in my experience.
const T& vs. T&& (again, no type deduction) with unknown usage scenarios. A good reference here is std::vector::push_back, which is overloaded for both rvalue and lvalue references. The push_back operation is assumed to be cheap compared to move-constructing a T, which is why the overload makes sense. When a function is assumed to be more expensive than such a move-construction, passing the argument by value is a simplification that can make sense (see also Item 41 in EMC++).
const T& vs. T&& when type deduction takes place. Here, use universal references together with std::forward whenever the parameters can't be const-qualified. If they aren't modified in the function body, go with const T&.
You want to use rvalue references only if:
You might retain a copy and you need the extra performance (measure!)
Example for this would be writing a library type (e.g. std::vector) where performance matters to its users.
You want only temporaries to be passed to your function
Example for this is the move assignment operator: After the assignment, the original objects state will not exist anymore.
Forwarding references (T&& with T deduced) fall under the first option.
Rvalue reference (not to be confused with a forwarding reference!) in function arguments is used when there is a need to move ownership from one object to another.
It is true that it is often done in the context of move constructors/assignment operators, but this is not the only case. For example, a function accepting ownership of a std::unique_ptr could accept its argument by an rvalue reference.
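A minimal sketch of such an ownership-taking function might look like this. Holder, adopt, get and ownership_was_transferred are hypothetical names used only for illustration.

```cpp
#include <memory>
#include <utility>

// Hypothetical owner type used only for illustration.
class Holder {
public:
    // Accepts only rvalues: the caller must write std::move(p) explicitly,
    // which makes the ownership transfer visible at the call site.
    void adopt(std::unique_ptr<int>&& p) { held = std::move(p); }
    const int* get() const { return held.get(); }
private:
    std::unique_ptr<int> held;
};

// Returns true if ownership ended up in the Holder and the source is empty.
bool ownership_was_transferred() {
    std::unique_ptr<int> p(new int(42));
    Holder h;
    h.adopt(std::move(p));  // h.adopt(p) would not compile
    return p == nullptr && h.get() != nullptr && *h.get() == 42;
}
```

Note that the transfer happens in the move assignment inside adopt, not in the parameter binding itself; taking the std::unique_ptr by value is a common alternative design.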

Why the difference in the flow of universal reference and rvalue reference

Working from Effective Modern C++, Item 25, we have an example
Case 1
class Widget {
public:
template<typename T>
void setName(T&& newName)
{ name = std::forward<T>(newName); }
...
};
Case 2
class Widget {
public:
void setName(const std::string& newName)
{ name = newName; }
void setName(std::string&& newName)
{ name = std::move(newName); }
...
};
The call
Widget w;
w.setName("Adela Novak");
Now assuming case 1, the book states that the literal is conveyed to the assignment operator for the std::string inside w's name data member.
Assuming case 2, the book states that first a temporary is created from the literal, calling the string constructor, so the setName parameter can bind to it, and then this temporary is moved into w's name data member.
Question
Why does this difference in behavior come about and how am I to think about it?
Namely, why is there no need for a temporary in case 1? Why is there difference? Is T&& not deduced to be an rvalue reference to a string, thus arriving at the same behavior as case 2 (obviously not, as per the book, but why)?
In case 1, T is deduced to be const char (&)[12], not std::string. There is no reason for the compiler to convert the string literal to std::string yet. In case 2, every overload takes a reference to a std::string, which forces the creation of a temporary std::string (using the implicit const char* constructor) to which the reference can be bound.
Note that while an rvalue reference such as std::string && may only bind to an rvalue, the templated equivalent T && may bind to both rvalues and lvalues.

C++ reference for both LValue and Rvalue without type deduction

I was reading a good tutorial on lvalue/rvalue references. If I've understood correctly when there is type deduction something like T&& can accept both an lvalue and an rvalue.
But is there a way to achieve that without a generic class? I'd like to avoid duplicating all my methods for accepting both lvalues and rvalues. And of course avoid passing big objects by value.
r-value references are mostly used in move constructors and move assignment operators.
For regular method, you may stick with one reference type only:
For read only parameter (without copy), const reference is enough.
if you have to do a copy, you may take your argument by value and use std::move:
Example:
class Test
{
public:
void displayString(const std::string& s) const { std::cout << s << m_s; }
void setString(std::string s) { m_s = std::move(s); }
private:
std::string m_s;
};
If the function that you implement does not need rvalue semantic, then you can simply pass the argument by reference or by constant reference.
However, if you can take advantage of rvalues and do not want to duplicate your code, you can pass by value and move the result. That should be almost as efficient and can be more maintainable than code duplication or an implementation with universal references.
This answer shows the technique: Should all/most setter functions in C++11 be written as function templates accepting universal references?
// copy, then move
void set_a(A a_) { a = std::move(a_); }
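To put numbers on "copy, then move", here is a sketch with an instrumented type. Tracked, Owner and the two checking functions are hypothetical; the counts assume the argument is a named object (an lvalue, or an xvalue produced by std::move), so no copy elision is involved.

```cpp
#include <utility>

// Hypothetical payload type that counts how it is copied and moved.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
    Tracked& operator=(const Tracked&) { ++copies; return *this; }
    Tracked& operator=(Tracked&&) noexcept { ++moves; return *this; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

struct Owner {
    Tracked a;
    // copy (or move) into the parameter, then move into the member
    void set_a(Tracked a_) { a = std::move(a_); }
};

// Lvalue argument: one copy into the parameter plus one move into the member.
bool lvalue_costs_copy_plus_move() {
    Owner o;
    Tracked t;
    Tracked::copies = Tracked::moves = 0;
    o.set_a(t);
    return Tracked::copies == 1 && Tracked::moves == 1;
}

// std::move'd argument: two moves and no copy.
bool rvalue_costs_two_moves() {
    Owner o;
    Tracked t;
    Tracked::copies = Tracked::moves = 0;
    o.set_a(std::move(t));
    return Tracked::copies == 0 && Tracked::moves == 2;
}
```

Compared with a const&/&& overload pair, the by-value version costs one extra move in each case, which is usually cheap for types like std::string.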

C++11 move(x) actually means static_cast<X&&>(x)? [duplicate]

Just reading Stroustrup's C++ Programming Language 4th Ed and in chapter 7 he says:
move(x) means static_cast<X&&>(x) where X is the type of x
and
Since move(x) does not move x (it simply produces an rvalue reference
to x) it would have been better if move() had been called rval()
My question is, if move() just turns the variable in to an rval, what is the actual mechanism which achieves the "moving" of the reference to the variable (by updating the pointer)??
I thought move() is just like a move constructor except the client can use move() to force the compiler??
what is the actual mechanism which achieves the "moving" of the reference to the variable (by updating the pointer)??
Passing it to a function (or constructor) that takes an rvalue reference, and moves the value from that reference. Without the cast, variables cannot bind to rvalue references, and so can't be passed to such a function - this prevents variables from being accidentally moved from.
I thought move() is just like a move constructor except the client can use move() to force the compiler??
No; it's used to convert an lvalue into an rvalue in order to pass it to a move constructor (or other moving function) which requires an rvalue reference.
typedef std::unique_ptr<int> noncopyable; // Example of a noncopyable type
noncopyable x;
noncopyable y(x); // Error: no copy constructor, and can't implicitly move from x
noncopyable z(std::move(x)); // OK: convert to rvalue, then use move constructor
When you call move, you are just saying "Hey, I want to move this object". And when a constructor accepts an rvalue reference, it understands it as "Hmm, someone wants me to move the data from this object into myself. So, OK, I'll do it".
std::move does not move or change the object, it just "marks" it as "ready-for-moving". Only a function that accepts an rvalue reference should implement the actual moving of the object.
This is an example, that describes the text above:
#include <cstddef>
#include <iostream>
#include <utility>
class Foo
{
public:
Foo(std::size_t n): _array(new int[n])
{
}
Foo(Foo&& foo): _array(foo._array)
{
// Hmm, someone says that this object is no longer needed,
// so I will move its data into myself
foo._array = nullptr;
}
~Foo()
{
delete[] _array;
}
private:
int* _array;
};
int main()
{
Foo f1(5);
// Hey, constructor, I want you to move this object, please
Foo f2(std::move(f1));
return 0;
}
At Going Native 2013, Scott Meyers gave a talk about C++11 features, including move.
What std::move essentially does is "unconditionally cast to an rvalue".
My question is, if move() just turns the variable in to an rval, what is the actual mechanism which achieves the "moving" of the reference to the variable (by updating the pointer)??
move does the type casting, thus the compiler will know which ctor to use. The actual move operation is done by the move ctor. You can take it as a function overloading. (ctor overloads with the rvalue parameter type.)
rvalues are generally temporary values which are discarded and destroyed immediately after creation (with a few exceptions). std::string&& is a reference to a std::string that will only bind to an rvalue. Prior to C++11, temporaries would only bind to std::string const& -- after C++11, they also bind to std::string&&.
A variable of type std::string&& behaves much like a bog-standard reference. It is pretty much only in the binding of function signatures and initialization that std::string&& differs from std::string& variables. The other way it differs is when you decltype the reference. All other uses are unchanged.
On the other hand, if a function returns a std::string&&, it is very different than returning a std::string&, because the second kind of thing that can be bound to a std::string&& is the return value of a function returning std::string&&.
std::move is the most common example of such a function. In a sense, it lies to the context it is in and tells it "I am a temporary, do with me what you will". So std::move takes a reference to something, and does a cast that makes it pretend to be a temporary -- aka, rvalue.
Move constructors and move assignment and other move-aware functions take an rvalue reference to know when the data they are passed is "scratch" data that they can "damage" to some extent when using it. This is very useful because many types (from containers, to std::function, to anything that uses the pImpl pattern, to non-copyable resources) can have their internal state moved much easier than it can be copied. Such a move changes the state of the source object: but because the function is told it is scratch data, that isn't impolite.
So the move happens not in std::move, but in the function that understands that the return value of std::move implies that it is permitted to modify the data in a somewhat destructive manner if that would help it.
The other ways you can get an rvalue, or an indication that the source object is "scratch data", is when you have a true temporary (an anonymous object created as the return of some other function, or one created using function-style constructor syntax), or when you return from a function with a statement of the form return local_variable;. In both cases, the data binds to rvalue references.
The short version is that std::move does not move, and std::forward does not forward, it just indicates that such an action would be allowed at this point, and lets the function/constructor being called decide what to do with that information.
from http://en.cppreference.com/w/cpp/utility/move
std::move obtains an rvalue reference to its argument and converts it
to an xvalue.
Code that receives such an xvalue has the opportunity to
optimize away unnecessary overhead by moving data out of the argument,
leaving it in a valid but unspecified state.
Return value
static_cast<typename std::remove_reference<T>::type&&>(t)
You can see that move is just a static_cast.
Calling std::move on an object doesn't do anything by itself; it only signals that the object it returns a reference to may be modified and left in "a valid but unspecified state".
I thought move() is just like a move constructor except the client can
use move() to force the compiler??
By essentially casting the type to an r-value type, this allows the compiler to invoke the move constructor over the copy constructor.
For a std::string x, std::move(x) is equivalent to static_cast<std::string&&>(x).
In the standard, it is defined like this:
template <class T>
constexpr remove_reference_t<T>&& move(T&&) noexcept;
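A possible implementation matching that declaration is sketched below, written with C++11 spelling. my_move and cast_alone_does_not_move are hypothetical names for illustration; real code should simply use std::move.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical re-implementation for illustration only.
template <class T>
constexpr typename std::remove_reference<T>::type&& my_move(T&& t) noexcept {
    return static_cast<typename std::remove_reference<T>::type&&>(t);
}

// For an lvalue std::string the result type is std::string&&,
// both for the sketch and for the real std::move.
static_assert(std::is_same<decltype(my_move(std::declval<std::string&>())),
                           std::string&&>::value,
              "my_move yields an rvalue reference");
static_assert(std::is_same<decltype(std::move(std::declval<std::string&>())),
                           std::string&&>::value,
              "std::move yields an rvalue reference");

// Returns true if the cast alone leaves the source string untouched.
bool cast_alone_does_not_move() {
    std::string s = "still here";
    std::string&& r = my_move(s);  // binds a reference; no constructor runs
    return s == "still here" && &r == &s;
}
```

Nothing is moved until the result is handed to a move constructor or move assignment operator.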
Complementing the other answers, an example could help you better understand how rvalue references work. Take a look at the following code, which emulates rvalue references:
#include <iostream>
#include <memory>
template <class T>
struct rvalue_ref
{
rvalue_ref(T& obj) : obj_ptr{std::addressof(obj)} {}
T* operator->() //For simplicity, we'll use the reference as a pointer.
{ return obj_ptr; }
T* obj_ptr;
};
template <class T>
rvalue_ref<T> move(T& obj)
{
return rvalue_ref<T>(obj);
}
template <class T>
struct myvector
{
myvector(unsigned sz) : data{new T[sz]} {}
myvector(rvalue_ref<myvector> other) //Move constructor
{
this->data = other->data;
other->data = nullptr;
}
~myvector()
{
delete[] data;
}
T* data;
};
int main()
{
myvector<int> vec(5); //vector of five integers
std::cout << vec.data << '\n'; //Print address of data
myvector<int> vec2 = move(vec); //Move data from vec to vec2
std::cout << vec.data << '\n'; //Prints zero
//Prints address of moved data (same as first output line)
std::cout << vec2.data << '\n';
}
As we can see, "move" only generates the correct alias, to indicate to the compiler which constructor overload to use. The difference between this implementation and real rvalue references is of course that casting to an rvalue reference has zero overhead, since it's only a compile-time cast.