Why may this code disable move semantics and copy elision? - c++

Sometimes we may defer perfect returning like this:
template<typename Func, typename... Args>
decltype(auto) call(Func f, Args&&... args)
{
decltype(auto) ret{f(std::forward<Args>(args)...)};
// ...
return static_cast<decltype(ret)>(ret);
}
But in Josuttis's new book C++ Move Semantics - The Complete Guide, he says that the code below is better:
template<typename Func, typename... Args>
decltype(auto) call(Func f, Args&&... args)
{
decltype(auto) ret{f(std::forward<Args>(args)...)};
// ...
if constexpr (std::is_rvalue_reference_v<decltype(ret)>) {
return std::move(ret); // move xvalue returned by f() to the caller
}
else {
return ret; // return the plain value or the lvalue reference
}
}
Because the first piece of code "might disable move semantics and copy elision. For plain values, it is like having an unnecessary std::move() in the return statement." What's the difference between these two patterns? From my point of view, for plain values, decltype deduces just the type itself, so the cast is only a static_cast<Type>(ret) (i.e. no operation at all), and the returned type is the same as the declared type, so copy elision should be possible. Is there anything I'm getting wrong?

I don't know what edition of the book you have, but mine states explicitly:
perfect return but unnecessary copy
The problem does not manifest for references, however it does when returning by value comes into play.
Consider the following code:
#include <iostream>
struct S
{
S() { std::cout << "constr\n";}
S(const S& ) { std::cout << "copy constr\n"; }
S(S&& ) { std::cout << "move constr\n"; }
};
S createSNoElide()
{
S s;
return static_cast<decltype(s)>(s);
}
S createSElide()
{
S s;
return s;
}
int main(int, char*[])
{
std::cout << "Elision\n";
S s1 = createSElide();
std::cout << "No elision\n";
S s2 = createSNoElide();
}
https://godbolt.org/z/YqG54rM9E
The createSNoElide() will be forced to use a copy constructor. Standard-wise it is most likely due to the following part:
This elision of copy/move operations, called copy elision, is
permitted in the following circumstances (which may be combined to
eliminate multiple copies):
in a return statement in a function with a class return type, when the
expression is the name of a non-volatile object with automatic storage
duration (other than a function parameter or a variable introduced by
the exception-declaration of a handler ([except.handle])) with the
same type (ignoring cv-qualification) as the function return type, the
copy/move operation can be omitted by constructing the object directly
into the function call's return object
https://eel.is/c++draft/class.copy.elision
I.e. the only way for a named object to be elided is to return the name of a local variable. A cast is simply a different kind of expression, which effectively prevents elision, however counter-intuitive that might be.
Also, I should give extra credit to that post: https://stackoverflow.com/a/55491382/4885321 which guided me to the exact place in the standard.

Related

destruction order of promise and return object for C++ coroutine

#include <iostream>
#include <coroutine>
class eager {
public:
struct promise_type {
promise_type() { std::cout << "promise_type ctor" << std::endl; }
~promise_type() { std::cout << "~promise_type dtor" << std::endl; }
struct return_object {
return_object() { std::cout << "return_object ctor" << std::endl; }
~return_object() { std::cout << "~return_object dtor" << std::endl; }
operator eager() { return {}; }
};
auto get_return_object() noexcept { return return_object{}; }
constexpr auto initial_suspend() const noexcept { return std::suspend_never{}; }
constexpr auto final_suspend() const noexcept { return std::suspend_never{}; }
constexpr auto return_void() const noexcept {}
auto unhandled_exception() -> void { throw; }
};
};
auto coroutine() -> eager {
co_return;
}
auto main() -> int
{
coroutine();
return 0;
}
You can see the result for MSVC, clang and GCC here: https://godbolt.org/z/Yan9s9TPE
According to lots of articles about coroutine, the coroutine() will be transformed into...
auto coroutine() -> eager {
eager::promise_type promise;
auto res = promise.get_return_object();
// initial suspend
promise.return_void();
// final suspend
return res;
}
At a glance, since the promise object is constructed first, I thought it would be the last object to be destructed.
However, MSVC and GCC show reverse order:
// from MSVC/GCC
promise_type ctor
return_object ctor
~promise_type dtor
~return_object dtor
On the other hand, clang shows what I expected:
// from clang
promise_type ctor
return_object ctor
~return_object dtor
~promise_type dtor
Which one is right?
Or, is the destruction order of promise object and return object just unspecified by standard?
According to lots of articles about coroutine
Then "lots of articles about coroutine[sic]" are incorrect.
The result object is not on the coroutine stack. It cannot be on the coroutine stack, because it's the result object for the initial call to the coroutine.
All that the C++ standard says about get_return_object is this:
The expression promise.get_return_object() is used to initialize the glvalue result or prvalue result object of a call to a coroutine.
The call to get_return_object is sequenced before the call to initial_suspend and is invoked at most once.
It happens before initial_suspend and it gets called once. That's all it has to say. Therefore, everything else about result objects from functions works as normal; in this case, it simply gets initialized before the function properly starts instead of when the function is about to return.
By C++'s normal rules, a function's result object is on the caller's stack, not the stack of the function being called. So when promise.get_return_object() is evaluated, it is initializing storage provided by the caller.
main discards the result of the expression coroutine(). This means that it will manifest a temporary from the prvalue result object, and the type of this temporary will be eager. The temporary will be destroyed, but only after control returns to main.
Here's the tricky part: the return prvalue from get_return_object() is not eager. It's eager::promise_type::return_object. Initializing the return value requires performing an implicit conversion from return_object to eager. This requires manifesting a temporary of type return_object to perform that conversion on.
The standard rule for destroying temporaries is:
Temporary objects are destroyed as the last step in evaluating the full-expression ([intro.execution]) that (lexically) contains the point where they were created.
But... what is the "full-expression" here?
One would assume it would be the promise.get_return_object() expression. But is that a "full-expression" by C++'s rules? These rules are rather esoteric and technical. One could argue that promise.get_return_object() is being used to initialize an object and therefore it is de facto an "init-declarator". But "init-declarator" is a piece of grammar, and promise.get_return_object() is not stated by the text to be an "init-declarator".
An argument could be made that the only expression that is certainly a full expression is coroutine(). And therefore, one could make the argument that any temporaries used to initialize the return value object should persist until control returns to the caller.
I would argue that the standard wording is under-specified and therefore both versions are equally legitimate until clarification is provided. Clang's version makes more sense (but not for the reasons you claim), but the others are at least arguable.

How can I tell whether I'm forwarding to a copy constructor?

If I'm writing a generic function that forwards arguments to a constructor, is there a way to tell whether that is a copy constructor? Essentially I want to do:
template <typename T, typename... Args>
void CreateTAndDoSomething(Args&&... args) {
// Special case: if this is copy construction, do something different.
if constexpr (...) { ... }
// Otherwise do something else.
...
}
The best I've come up with is checking for sizeof...(args) == 1 and then looking at std::is_same_v<Args..., const T&> || std::is_same_v<Args..., T&>. But I think this misses edge cases like volatile-qualified inputs and things that are implicitly convertible to T.
To be honest I'm not entirely sure this question is well-defined, so feel free to tell me it isn't (and why) as well. If it helps you can assume that the only single-argument constructors for T are T(const T&) and T(T&&).
If I'm right that this isn't well-defined because a copy constructor isn't a Thing, then maybe this can be made more precise by asking "how can I tell whether the expression T(std::forward<Args>(args)...) selects an overload that accepts const T&?"
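You can use remove_cv_t, applied after remove_reference_t because remove_cv only strips top-level qualifiers and a reference has none. The fold expressions keep the condition well-formed for any number of arguments:
#include <type_traits>
template <typename T, typename... Args>
void CreateTAndDoSomething(Args&&... args) {
// Special case: if this is copy construction, do something different.
if constexpr (sizeof...(Args) == 1
&& (std::is_lvalue_reference_v<Args> && ...)
&& (std::is_same_v<T, std::remove_cv_t<std::remove_reference_t<Args>>> && ...)) { ... }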
// Otherwise do something else.
...
}
This covers all "copy constructors" as defined by the standard, not considering possible default arguments (it is hard to determine whether a given function parameter -- for the function that would be invoked given these parameters -- is defaulted or not).
You had the right idea. Everything that is needed is encoded in the deduced type of Args. Though, if you want to account for all of the cv-qualified cases, there will be a lot to go through. Let's first recognize the different cases that might arise:
1. Construction (implicit conversions are construction)
2. Copy construction (commonly T(const T&))
3. Move construction (commonly T(T&&))
4. Slicing (calling Base(const Base&) or Base(Base&&) with a Derived)
If weird move or copy constructors are not considered (ones with default parameters), cases 2-4 can only happen when a single argument is passed; everything else is construction. Hence, it is sensible to provide an overload for the single argument case. Trying to do all of these cases in the variadic template is going to be ugly, as you have to use fold expressions or something like std::conjunction/std::disjunction for the if statements to be valid.
We will also find out that recognizing move and copy separately in every single case is impossible. If there is no need to consider copies and moves separately, the solution is easy. But if these cases need to be separated, one can only make a good guess, which should work almost always.
When it comes to slicing, I would probably opt to disable it with a static_assert.
Move and copy combined
Here is the solution using a single argument overload. Let's go over it in detail next.
#include <utility>
#include <type_traits>
#include <iostream>
// Multi-argument case is almost always construction
template<typename T, typename... Args>
void CreateTAndDoSomething(Args&&... args)
{
std::cout << "Constructed" << '\n';
T val(std::forward<Args>(args)...);
}
template<typename T, typename U>
void CreateTAndDoSomething(U&& arg)
{
// U without references and cv-qualifiers
// std::remove_cvref_t in C++20
using StrippedU = std::remove_cv_t<std::remove_reference_t<U>>;
// Extra check is needed because T is a base for itself
static_assert(
std::is_same_v<StrippedU, T> || !std::is_base_of_v<T, StrippedU>,
"Attempting to slice"
);
if constexpr (std::is_same_v<StrippedU, T>)
{
std::cout << "Copied or moved" << '\n';
}
else
{
std::cout << "Constructed" << '\n';
}
T val(std::forward<U>(arg));
}
Here we make use of the fact that U&& (and Args&&) is a forwarding reference. With forwarding references the deduced template argument U is different depending on the value category of the passed arg. Given an arg of type T, the U is deduced such that:
If the arg was an lvalue, the deduced U is T& (cv-qualifiers included).
If the arg was an rvalue, the deduced U is T (cv-qualifiers included).
NOTE: U might deduce to a cv-qualified reference (e.g. const Foo&). std::remove_cv only removes top-level cv-qualifiers, and references can't have top-level cv-qualifiers. This is why std::remove_cv needs to be applied to a non-reference type. If only std::remove_cv was used, the template would fail to recognize cases where U would be const T&, volatile T& or const volatile T&.
Only copy
A copy constructor is called (usually, see note) when U is deduced to T&, const T&, volatile T& or const volatile T&. Because we have three cases where the deduced U is a cv-qualified reference and std::remove_cv doesn't work with these, we should just check these cases explicitly:
template<typename T, typename U>
void CreateTAndDoSomething(U&& arg)
{
// U without references and cv-qualifiers
// std::remove_cvref_t in C++20
using StrippedU = std::remove_cv_t<std::remove_reference_t<U>>;
// Extra check is needed because T is a base for itself
static_assert(
std::is_same_v<StrippedU, T> || !std::is_base_of_v<T, StrippedU>,
"Attempting to slice"
);
if constexpr (std::is_same_v<T&, U>
|| std::is_same_v<const T&, U>
|| std::is_same_v<volatile T&, U>
|| std::is_same_v<const volatile T&, U>)
{
std::cout << "Copied" << '\n';
}
else
{
std::cout << "Constructed" << '\n';
}
T val(std::forward<U>(arg));
}
NOTE: This does not recognize copy construction when a move constructor is not available, and the copy constructor with the signature T(const T&) is available. This is because the result of the call to std::forward with an rvalue arg is an xvalue, which can bind to const T&.
Move and copy separated
DISCLAIMER: this solution only works for the general case (see the pitfalls)
Let's assume that T has a copy constructor with the signature T(const T&) and move constructor with the signature T(T&&), which is really common. const-qualified move constructors do not really make sense, as the moved object needs to be modified almost always.
With this assumption the expression T val(std::forward<U>(arg)); move constructs val, if U was deduced to a non-const T (arg is a non-const rvalue). This gives us two cases:
U is deduced to T
U is deduced to volatile T
By first removing the volatile qualifier from U we can account for both of these cases. When move construction is recognized first, the rest are copy construction:
template<typename T, typename U>
void CreateTAndDoSomething(U&& arg)
{
// U without references and cv-qualifiers
using StrippedU = std::remove_cv_t<std::remove_reference_t<U>>;
// Extra check is needed because T is a base for itself
static_assert(
std::is_same_v<StrippedU, T> || !std::is_base_of_v<T, StrippedU>,
"Attempting to slice"
);
if constexpr (std::is_same_v<std::remove_volatile_t<U>, T>)
{
std::cout << "Moved (usually)" << '\n';
}
else if constexpr (std::is_same_v<StrippedU, T>)
{
std::cout << "Copied (usually)" << '\n';
}
else
{
std::cout << "Constructed" << '\n';
}
T val(std::forward<U>(arg));
}
If you want to play around with the solution, it's available in godbolt. I've also implemented a special class that hopefully helps to visualize the different constructor calls.
Pitfalls of the solution
When the assumption stated earlier is not true, it is impossible to determine exactly whether the copy or the move constructor is called. There are at least a few special cases that cause ambiguity:
If the move constructor for T is not available, arg is an rvalue of type T, and the copy constructor has the signature T(const T&):
The xvalue returned by std::forward<U>(arg) will bind to the const T&. This was also discussed in the "only copy" case.
Move recognized, but a copy happens.
If T has a move constructor with the signature T(const T&&) and arg is a const rvalue of type T:
Copy recognized, but a move happens. Similar case with T(const volatile T&&).
I've also decided not to account for the case, when the user explicitly specifies U (T&& and volatile T&& will compile but not recognize properly).

Is the behaviour of std::get on rvalue-reference tuples dangerous?

The following code:
#include <tuple>
int main ()
{
auto f = [] () -> decltype (auto)
{
return std::get<0> (std::make_tuple (0));
};
return f ();
}
(Silently) generates code with undefined behaviour - the temporary rvalue returned by make_tuple is propagated through the std::get<> and through the decltype(auto) onto the return type. So it ends up returning a reference to a temporary that has gone out of scope. See it here https://godbolt.org/g/X1UhSw.
Now, you could argue that my use of decltype(auto) is at fault. But in my generic code (where the type of the tuple might be std::tuple<Foo &>) I don't want to always make a copy. I really do want to extract the exact value or reference from the tuple.
My feeling is that this overload of std::get is dangerous:
template< std::size_t I, class... Types >
constexpr std::tuple_element_t<I, tuple<Types...> >&&
get( tuple<Types...>&& t ) noexcept;
Whilst propagating lvalue references onto tuple elements is probably sensible, I don't think that holds for rvalue references.
I'm sure the standards committee thought this through very carefully, but can anyone explain to me why this was considered the best option?
Consider the following example:
void consume(foo&&);
template <typename Tuple>
void consume_tuple_first(Tuple&& t)
{
consume(std::get<0>(std::forward<Tuple>(t)));
}
int main()
{
consume_tuple_first(std::tuple{foo{}});
}
In this case, we know that std::tuple{foo{}} is a temporary and that it will live for the entire duration of the consume_tuple_first(std::tuple{foo{}}) expression.
We want to avoid any unnecessary copy and move, but still propagate the temporariness of foo{} to consume.
The only way of doing that is by having std::get return an rvalue reference when invoked with a temporary std::tuple instance.
Changing std::get<0>(std::forward<Tuple>(t)) to std::get<0>(t) produces a compilation error (as expected) (on wandbox).
Having a get alternative that returns by value results in an additional unnecessary move:
template <typename Tuple>
auto myget(Tuple&& t)
{
return std::get<0>(std::forward<Tuple>(t));
}
template <typename Tuple>
void consume_tuple_first(Tuple&& t)
{
consume(myget(std::forward<Tuple>(t)));
}
but can anyone explain to me why this was considered the best option?
Because it enables optimal generic code that seamlessly propagates temporaries as rvalue references when accessing tuples. The alternative of returning by value might result in unnecessary move operations.
IMHO this is dangerous and quite sad since it defeats the purpose of the "most important const":
Normally, a temporary object lasts only until the end of the full expression in which it appears. However, C++ deliberately specifies that binding a temporary object to a reference to const on the stack lengthens the lifetime of the temporary to the lifetime of the reference itself, and thus avoids what would otherwise be a common dangling-reference error.
In light of the quote above, for many years C++ programmers have learned that this was OK:
X const& x = f( /* ... */ );
Now, consider this code:
#include <cstdio>
#include <variant>
struct X {
void hello() const { puts("hello"); }
~X() { puts("~X"); }
};
auto make() {
return std::variant<X>{};
}
int main() {
auto const& x = std::get<X>(make()); // #1
x.hello();
}
I believe anyone should be forgiven for thinking that line #1 is OK. However, since std::get returns a reference to an object that is going to be destroyed, x is a dangling reference. The code above outputs:
~X
hello
which shows that the object that x binds to is destroyed before hello() is called. Clang gives a warning about the issue but gcc and msvc don't. The same issue happens if (as in the OP) we use std::tuple instead of std::variant but, sadly enough, clang doesn't issue a warning for this case.
A similar issue happens with std::optional and this value overload:
constexpr T&& value() &&;
This code, which uses the same X above, illustrates the issue:
auto make() {
return std::optional{X{}};
}
int main() {
auto const& x = make().value(); // #2
x.hello();
}
The output is:
~X
~X
hello
Brace yourself for more of the same with C++23's std::expected and its methods value() and error():
constexpr T&& value() &&;
constexpr E&& error() && noexcept;
I'd rather pay the price of the move explained in Vittorio Romeo's post. Sure, I can avoid the issue by removing & from lines #1 and #2. My point is that the rule for the "most important const" just became more complicated and we need to consider if the expression involves std::get, std::optional::value, std::expected::value, std::expected::error, ...

When do we practically need 'explicit xvalues'?

The definition of xvalue is as follows:
— An xvalue (an “eXpiring” value) also refers to an object, usually near the end of its lifetime (so that its resources may be moved, for example). An xvalue is the result of certain kinds of expressions involving rvalue references (8.3.2). [ Example: The result of calling a function whose return type is an rvalue reference is an xvalue. —end example ]
Will we ever fall into where we practically need to use a function whose return type is an rvalue reference, which is an xvalue?
const int && Foo()
{
// ...
}
Move semantics take an rvalue reference as a parameter, not a return value. So I don't think that's the case.
Returning rvalue references can be of use for functions that already take rvalues as parameters. A simple example:
#include <iostream>
#include <string>
#include <utility>
struct X {
X() = default;
X(X&& other) { std::cout << "move ctor\n"; }
X(X const&) = delete;
void log(std::string const& s){ std::cout << "log: " << s << "\n"; }
};
void sink(X&& x) {
x.log("sink");
}
X&& passOn(X&& in) {
in.log("pass");
return std::move(in);
}
X moveOn(X&& in) {
in.log("move");
return std::move(in);
}
int main() {
sink(passOn(X()));
std::cout << "===============================\n";
sink(moveOn(X()));
}
The second function will call the move constructor to create the returned object, while the first will pass on the reference it already got. This is more useful if we don't return the original reference but instead a reference to a part of the referred object, e.g.
template<class T>
T&& getHead(std::vector<T>&& input) {
return std::move(input.front());
}
That's exactly what std::move is — the result of std::move execution is an xvalue. Other than that it is hard to tell since in the main returning a reference from the function is a bad thing most of the time. But maybe someone will come up with another clever usage of such a function.
Will we ever fall into where we practically need to use a function whose return type is an rvalue reference, which is an xvalue?
It is used in container classes; for instance, tuple has a get overload that looks like this:
template< std::size_t I, class... Types >
typename std::tuple_element<I, tuple<Types...> >::type&&
get( tuple<Types...>&& t );
I assume that std::optional and std::variant in C++17 will both have similar overloads.
Granted, the only point is to avoid typing std::move in some very specific situations, like:
auto x = std::get<1>( f() );
Where f returns a tuple by value.

Why use a perfectly forwarded value (a functor)?

C++11 (and C++14) introduces additional language constructs and improvements that target generic programming. These include features such as;
R-value references
Reference collapsing
Perfect forwarding
Move semantics, variadic templates and more
I was browsing an earlier draft of the C++14 specification (now with updated text) and the code in an example in §20.5.1, Compile-time integer sequences, that I found interesting and peculiar.
template<class F, class Tuple, std::size_t... I>
decltype(auto) apply_impl(F&& f, Tuple&& t, index_sequence<I...>) {
return std::forward<F>(f)(std::get<I>(std::forward<Tuple>(t))...);
}
template<class F, class Tuple>
decltype(auto) apply(F&& f, Tuple&& t) {
using Indices = make_index_sequence<std::tuple_size<Tuple>::value>;
return apply_impl(std::forward<F>(f), std::forward<Tuple>(t), Indices());
}
Online here [intseq.general]/2.
Question
Why was the function f in apply_impl being forwarded, i.e. why std::forward<F>(f)(std::get...?
Why not just apply the function as f(std::get...?
In Brief...
The TL;DR, you want to preserve the value category (r-value/l-value nature) of the functor because this can affect the overload resolution, in particular the ref-qualified members.
Function definition reduction
To focus on the issue of the function being forwarded, I've reduced the sample (and made it compile with a C++11 compiler) to;
template<class F, class... Args>
auto apply_impl(F&& func, Args&&... args) -> decltype(std::forward<F>(func)(std::forward<Args>(args)...)) {
return std::forward<F>(func)(std::forward<Args>(args)...);
}
And we create a second form, where we replace the std::forward<F>(func) with just func;
template<class F, class... Args>
auto apply_impl_2(F&& func, Args&&... args) -> decltype(func(std::forward<Args>(args)...)) {
return func(std::forward<Args>(args)...);
}
Sample evaluation
Evaluating some empirical evidence of how this behaves (with conforming compilers) is a neat starting point for evaluating why the code example was written as such. Hence, in addition we will define a general functor;
struct Functor1 {
int operator()(int id) const
{
std::cout << "Functor1 ... " << id << std::endl;
return id;
}
};
Initial sample
Run some sample code;
int main()
{
Functor1 func1;
apply_impl_2(func1, 1);
apply_impl_2(Functor1(), 2);
apply_impl(func1, 3);
apply_impl(Functor1(), 4);
}
The output is as expected: regardless of whether an r-value (Functor1()) or an l-value (func1) is passed to apply_impl and apply_impl_2, the same call operator is invoked. Under C++03 this was all you got; you could not overload member functions based on the "r-value-ness" or "l-value-ness" of the object.
Functor1 ... 1
Functor1 ... 2
Functor1 ... 3
Functor1 ... 4
Ref-qualified samples
We now need to overload that call operator to stretch this a little further...
struct Functor2 {
int operator()(int id) const &
{
std::cout << "Functor2 &... " << id << std::endl;
return id;
}
int operator()(int id) &&
{
std::cout << "Functor2 &&... " << id << std::endl;
return id;
}
};
We run another sample set;
int main()
{
Functor2 func2;
apply_impl_2(func2, 5);
apply_impl_2(Functor2(), 6);
apply_impl(func2, 7);
apply_impl(Functor2(), 8);
}
And the output is;
Functor2 &... 5
Functor2 &... 6
Functor2 &... 7
Functor2 &&... 8
Discussion
In the case of apply_impl_2 (id 5 and 6), the output is not what may initially have been expected. In both cases, the l-value qualified operator() is called (the r-value qualified one is not called at all). It may have been expected that since Functor2(), an r-value, is used to call apply_impl_2, the r-value qualified operator() would have been called. The func, as a named parameter to apply_impl_2, is an r-value reference, but since it is named, it is itself an l-value. Hence the l-value qualified operator()(int) const& is called both when the l-value func2 is the argument and when the r-value Functor2() is the argument.
In the case of apply_impl (id 7 and 8) the std::forward<F>(func) maintains or preserves the r-value/l-value nature of the argument provided for func. Hence the l-value qualified operator()(int) const& is called with the l-value func2 used as the argument and the r-value qualified operator()(int)&& when the r-value Functor2() is used as the argument. This behaviour is what would have been expected.
Conclusions
The use of std::forward, via perfect forwarding, ensures that we preserve the r-value/l-value nature of the original argument for func. It preserves their value category.
std::forward can and should be used for more than just forwarding arguments to functions; it is also required wherever an argument is used and its r-value/l-value nature must be preserved. Note: there are situations where the r-value/l-value nature cannot or should not be preserved; in these situations std::forward should not be used (see the converse below).
There are many examples popping up that inadvertently lose the r-value/l-value nature of the arguments via a seemingly innocent use of an r-value reference.
It has always been hard to write well defined and sound generic code. With the introduction of r-value references, and reference collapsing in particular, it has become possible to write better generic code, more concisely, but we need to be ever more aware of what the original nature of the arguments provided are and make sure that they are maintained when we use them in the generic code we write.
Full sample code can be found here
Corollary and converse
A corollary of the question would be; given reference collapsing in a templated function, how is the r-value/l-value nature of the argument maintained? The answer - use std::forward<T>(t).
Converse; does std::forward solve all your "universal reference" problems? No it doesn't, there are cases where it should not be used, such as forwarding the value more than once.
Brief background to perfect forwarding
Perfect forwarding may be unfamiliar to some, so what is perfect forwarding?
In brief, perfect forwarding is there to ensure that the argument provided to a function is forwarded (passed) to another function with the same value category (basically r-value vs. l-value) as originally provided. It is typically used with template functions where reference collapsing may have taken place.
Scott Meyers gives the following pseudo code in his Going Native 2013 presentation to explain the workings of std::forward (at approximately the 20 minute mark);
template <typename T>
T&& forward(T&& param) { // T&& here is formulated to disallow type deduction
if (is_lvalue_reference<T>::value) {
return param; // return type T&& collapses to T& in this case
}
else {
return move(param);
}
}
Perfect forwarding depends on a handful of fundamental language constructs new to C++11 that form the basis for much of what we now see in generic programming:
Reference collapsing
Rvalue references
Move semantics
The use of std::forward is currently intended in the formulaic std::forward<T>; understanding how std::forward works helps to understand why this is so, and also aids in identifying non-idiomatic or incorrect use of rvalues, reference collapsing and the like.
Thomas Becker provides a nice, but dense write up on the perfect forwarding problem and solution.
What are ref-qualifiers?
The ref-qualifiers (lvalue ref-qualifier & and rvalue ref-qualifier &&) are similar to the cv-qualifiers in that they (the ref-qualified members) are used during overload resolution to determine which method to call. They behave as you would expect them to; the & applies to lvalues and && to rvalues. Note: Unlike cv-qualification, *this remains an l-value expression.
Here is a practical example.
struct concat {
std::vector<int> state;
std::vector<int> const& operator()(int x)&{
state.push_back(x);
return state;
}
std::vector<int> operator()(int x)&&{
state.push_back(x);
return std::move(state);
}
std::vector<int> const& operator()()&{ return state; }
std::vector<int> operator()()&&{ return std::move(state); }
};
This function object takes an x, and concatenates it to an internal std::vector. It then returns that std::vector.
If evaluated in an rvalue context it moves to a temporary, otherwise it returns a const& to the internal vector.
Now we call apply:
auto result = apply( concat{}, std::make_tuple(2) );
Because we carefully forwarded our function object, only 1 std::vector buffer is allocated. It is simply moved out to result.
Without the careful forwarding, we end up creating an internal std::vector, and we copy it to result, then discard the internal std::vector.
Because the operator()&& knows that the function object should be treated as a rvalue about to be destroyed, it can rip the guts out of the function object while doing its operation. The operator()& cannot do this.
Careful use of perfect forwarding of function objects enables this optimization.
Note, however, that there is very little use of this technique "in the wild" at this point. Rvalue qualified overloading is obscure, and doing so to operator() more so.
I could easily see future versions of C++ automatically using the rvalue state of a lambda to implicitly move its captured-by-value data in certain contexts, however.