Does NRVO also apply to coroutines? - c++

In the following example, NRVO (Named Return Value Optimization) applies as per this article:
std::string f1()
{
std::string str;
return str; // NRVO applies here!
}
However, consider:
task<std::string> f2()
{
std::string str;
co_return str; // Does NRVO also apply here?
}

I know that NRVO (Named Return Value Optimization) is mandatory since C++17:
It is not. NRVO is still an optimisation.
The non-named Return Value Optimisation (RVO) is mandatory.
// Is NRVO also guaranteed here?
No, because NRVO is never guaranteed.

For completeness, guaranteed elision in C++17 only applies to returning a prvalue from a function directly. Returning a named variable is only subject to elision if the compiler feels like it.
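To make the distinction concrete, here is a minimal sketch (Pinned and make_pinned are made-up names, not from the question). Since C++17 guarantees elision when a prvalue is returned, a type with deleted copy and move constructors can still be returned by value. The commented-out named-variable version is ill-formed, because NRVO is only a permitted optimisation and a usable constructor is still required.

```cpp
#include <cassert>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int value;
    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

// Returning a prvalue: elision is guaranteed in C++17, so no
// copy/move constructor is needed at all.
Pinned make_pinned() {
    return Pinned{42};
}

// Returning a named local would NOT compile for this type, because
// NRVO is optional and the move/copy constructor must still be usable:
//
// Pinned make_named() {
//     Pinned p{42};
//     return p;   // error: use of deleted constructor
// }
```

This needs a C++17 (or later) compiler mode; before C++17 even make_pinned would be rejected.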
As for the meat of your question, co_return values are never subject to copy elision, guaranteed or otherwise. Elision for return values keys off of the return keyword, and coroutines aren't allowed to use return. They use co_return, which the elision logic in the standard doesn't key off of. So elision does not apply.
The reason for this lies in how coroutines work. A coroutine is a function that has a promise object in it. This promise object is how you shepherd the coroutine's co_return value (and other state) to the "future" object that the coroutine function returned.
Elision works in normal functions because calling conventions require the caller to pass the storage for a return value to the function. So the function's implementation can choose to just build the object in that storage rather than building a separate stack object and copying into it upon return.
In a coroutine, the return value lives inside the promise, so that can't really happen.

NRVO as defined by the article you linked (i.e. not even creating a temporary) isn't a thing for coroutines because how co_return works is up to the user-provided coroutine promise type: the expression in the co_return statement is fed to the promise's return_value method, which can decide what to do with it.
However there is a related optimization that still may be useful. [class.copy.elision]/3 says the following:
An implicitly movable entity is a variable of automatic storage duration that is either a non-volatile object or an rvalue reference to a non-volatile object type. In the following copy-initialization contexts, a move operation is first considered before attempting a copy operation:
If the expression in a return ([stmt.return]) or co_­return ([stmt.return.coroutine]) statement is a (possibly parenthesized) id-expression that names an implicitly movable entity declared in the body or parameter-declaration-clause of the innermost enclosing function or lambda-expression, or
[...]
overload resolution to select the constructor for the copy or the return_­value overload to call is first performed as if the expression or operand were an rvalue. If the first overload resolution fails or was not performed, overload resolution is performed again, considering the expression or operand as an lvalue.
This means that if you return a local variable by name from a coroutine it will be moved, not copied (as long as the promise type supports this). For example, clang accepts the following despite the fact that it's not possible to copy a std::unique_ptr<int>:
// Assume a coroutine task type called Task<T> whose associated promise has a
// return_value(T) method. The co_return here will successfully call that
// method.
Task<std::unique_ptr<int>> MakeInt() {
auto result = std::make_unique<int>(17);
co_return result;
}
So the optimization "value is fed to coroutine promise as an rvalue reference even though std::move wasn't used" does apply. But the standard doesn't say "move constructor isn't even called", and it can't because it's up to the promise what to do with the expression it's given.

Since the other answers are rather long, here is the short version. The standard says that the statement co_return <expr>; is essentially equivalent to:
P.return_value(<expr>);
where P is the coroutine's promise object. From that I suppose you can answer this question and many other questions.
If you are looking for coroutine documentation look here:
dcl.fct.def.coroutine
stmt.return.coroutine
expr.await
expr.yield
support.coroutine

Related

Initialization in return statements of functions that return by-value

My question originates from delving into std::move in return statements, such as in the following example:
struct A
{
A() { std::cout << "Constructed " << this << std::endl; }
A(A&&) noexcept { std::cout << "Moved " << this << std::endl; }
};
A nrvo()
{
A local;
return local;
}
A no_nrvo()
{
A local;
return std::move(local);
}
int main()
{
A a1(nrvo());
A a2(no_nrvo());
}
which prints (MSVC, /std:c++17, release)
Constructed 0000000C0BD4F990
Constructed 0000000C0BD4F991
Moved 0000000C0BD4F992
I am interested in the general initialization rules for return statements in functions that return by-value and which rules apply when returning a local variable with std::move as shown above.
The general case
Regarding return statements you can read
Evaluates the expression, terminates the current function and returns the result of the expression to the caller after implicit conversion to the function return type. [...]
on cppreference.com.
Amongst others Copy initialization happens
when returning from a function that returns by value
like so
return other;
Coming back to my example, according to my current knowledge - and in contrast to the above-named rule - A a1(nrvo()); is a statement that direct-initializes a1 with the prvalue nrvo(). So which object exactly is copy-initialized as described at cppreference.com for return statements?
The std::move case
For this case, I've referred to ipc's answer on Are returned locals automatically xvalues. I want to make sure that the following is correct: std::move(local) has the type A&& but no_nrvo() is declared to return the type A, so here the
returns the result of the expression to the caller after implicit conversion to the function return type
part should come into play. I think this should be an Lvalue to rvalue conversion:
A glvalue of any non-function, non-array type T can be implicitly converted to a prvalue of the same type. [...] For a class type, this conversion [...] converts the glvalue to a prvalue whose result object is copy-initialized by the glvalue.
To convert from A&& to A, A's move constructor is used, which is also why NRVO is disabled here. Are those the rules that apply in this case, and did I understand them correctly? Also, again they say copy-initialized by the glvalue but A a2(no_nrvo()); is a direct initialization. So this also touches on the first case.
You have to be careful with cppreference.com when diving into such nitty-gritty, as it's not an authoritative source.
So which object exactly is copy-initialized as described at cppreference.com for return statements?
In this case, none. That's what copy elision is: The copy that would normally happen is skipped. The cppreference (4) clause could be written as "when returning from a function that returns by value, and the copy is not elided", but that's kind of redundant. The standard: [stmt.return] is a lot clearer on the subject.
To convert from A&& to A, A's move constructor is used, which is also why NRVO is disabled here. Are those the rules that apply in this case, and did I understand them correctly?
That's not quite right. NRVO only applies to names of non-volatile objects. However, in return std::move(local);, it's not local that is being returned, it's the A&& that is the result of the call to std::move(). This has no name, thus NRVO does not apply.
I think this should be an Lvalue to rvalue conversion:
The A&& returned by std::move() is decidedly not an Lvalue. It's an xvalue, and thus an rvalue already. There is no Lvalue to rvalue conversion happening here.
but A a2(no_nrvo()); is a direct initialization. So this also touches on the first case.
Not really. Whether a function has to perform copy-initialization of its result as part of a return statement is not impacted in any way by how that function is invoked. Similarly, how a function's return value is used at the call site is not impacted by the function's definition.
In both cases, the variable (a1 or a2) is direct-initialized by the result of the function. In practice, this means that the compiler will use the same memory location for that variable as for the return value of the function.
In A a1(nrvo());, thanks to NRVO, the memory location assigned to local is the same as the function's result value, which happens to be a1 already. Effectively, local and a1 were the same object all along.
In A a2(no_nrvo());, local has its own storage, and the result of the function, aka a2, is move-constructed from it. Effectively, local is moved into a2.
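The same experiment can be sketched by counting move-constructor calls instead of printing addresses. All names below (Counted, g_moves, named_return and so on) are made up for illustration. Only the std::move case is deterministic; the count for the plain named return depends on whether the compiler actually performs NRVO.

```cpp
#include <cassert>
#include <utility>

int g_moves = 0;  // number of move-constructor calls observed

struct Counted {
    Counted() {}
    Counted(Counted&&) noexcept { ++g_moves; }
};

Counted named_return() {
    Counted local;
    return local;             // NRVO candidate: 0 moves if elided, 1 otherwise
}

Counted moved_return() {
    Counted local;
    return std::move(local);  // not a plain name: no NRVO, exactly 1 move
}

// Reset the counter, initialize a variable from the call, report the moves.
int moves_for_named() {
    g_moves = 0;
    Counted a1(named_return());
    (void)a1;
    return g_moves;
}

int moves_for_moved() {
    g_moves = 0;
    Counted a2(moved_return());
    (void)a2;
    return g_moves;
}
```

In C++17 mode moves_for_moved() should report exactly one move (from local into the return object, which is a2 itself), while moves_for_named() reports zero whenever NRVO kicks in. Some compilers warn that the std::move in moved_return is pessimizing, which is the point of the example.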

How to do type conversion on an "implicit" rvalue in the return statement

Is this invalid? gcc accepts it, clang and msvc don't.
#include <memory>
class Holder {
std::unique_ptr<int> data;
public:
operator std::unique_ptr<int>() && { return std::move(data); }
};
std::unique_ptr<int> test()
{
Holder val;
return val;
}
Assuming that I don't want to add something like std::unique_ptr<int> Holder::TakeData() { return std::move(data); }, the only other workaround I could think of is moving in the return value:
std::unique_ptr<int> test()
{
Holder val;
return std::move(val); // lets the conversion proceed
}
But then gcc 9.3+ has the gall to tell me that the std::move is redundant (with all warnings enabled). WTF? I mean yeah, gcc doesn't need the move, sure, but nothing else accepts the code then. And if it won't be gcc, then some humans inevitably will balk at it later.
What's the authoritative last word on whether it should compile as-is or not?
How should such code be written? Should I put in this seemingly noisy TakeData function and use it? Worse yet - should I maybe make the TakeData function limited to rvalue context, i.e. having to do return std::move(val).TakeData() ?
Adding operator std::unique_ptr<int>() & { return std::move(data); } is not an option, since it obviously leads to nasty bugs - it can be invoked in wrong context.
The "implicit" rvalue conversion is standard mandated. But depending on which standard version you are using, which compiler is "correct" varies.
In C++17
[class.copy.elision] (emphasis mine)
3 In the following copy-initialization contexts, a move operation
might be used instead of a copy operation:
If the expression in a return statement is a (possibly parenthesized) id-expression that names an object with automatic
storage duration declared in the body or parameter-declaration-clause
of the innermost enclosing function or lambda-expression, or
...
overload resolution to select the constructor for the copy is first
performed as if the object were designated by an rvalue. If the first
overload resolution fails or was not performed, or if the type of the
first parameter of the selected constructor is not an rvalue reference
to the object's type (possibly cv-qualified), overload resolution is
performed again, considering the object as an lvalue. [ Note: This
two-stage overload resolution must be performed regardless of whether
copy elision will occur. It determines the constructor to be called if
elision is not performed, and the selected constructor must be
accessible even if the call is elided.  — end note ]
Up to C++17, GCC is wrong. Using val implicitly as an rvalue should fail to initialize the return type on account of the clause I emphasized, "or if the type of the first parameter of the selected constructor is not an rvalue reference to the object's type (possibly cv-qualified)": the rvalue reference in the unique_ptr c'tor doesn't bind directly to val. But come C++20, that clause is no longer there.
C++20
3 An implicitly movable entity is a variable of automatic storage
duration that is either a non-volatile object or an rvalue reference
to a non-volatile object type. In the following copy-initialization
contexts, a move operation might be used instead of a copy operation:
If the expression in a return ([stmt.return]) or co_­return ([stmt.return.coroutine]) statement is a (possibly parenthesized)
id-expression that names an implicitly movable entity declared in the
body or parameter-declaration-clause of the innermost enclosing
function or lambda-expression, or
[...]
overload resolution to select the constructor for the copy or the
return_­value overload to call is first performed as if the expression
or operand were an rvalue. If the first overload resolution fails or
was not performed, overload resolution is performed again, considering
the expression or operand as an lvalue. [ Note: This two-stage
overload resolution must be performed regardless of whether copy
elision will occur. It determines the constructor or the return_­value
overload to be called if elision is not performed, and the selected
constructor or return_­value overload must be accessible even if the
call is elided. — end note ]
The correctness of the code is thus subject to the time travel properties of your compiler(s).
As for how code like that should be written: if you aren't getting consistent results, an option would be to use the exact return type of the function
std::unique_ptr<int> test()
{
Holder val;
std::unique_ptr<int> ret_val = std::move(val);
return ret_val;
}
I agree from the get go that this may not look as appealing as simply returning val, but at least it plays nice with NRVO. So we aren't likely to get more copies of unique_ptr than we desired originally.
If that is too unappealing still, then I find your idea of a resource stealing member function to be most to my liking. But no accounting for taste.
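Here is a sketch of both portable options side by side. The constructor taking an int, the rvalue-qualified TakeData and the function names via_local and via_member are my additions so that the result can be checked; they are not part of the original Holder.

```cpp
#include <cassert>
#include <memory>
#include <utility>

class Holder {
    std::unique_ptr<int> data;
public:
    // Added for the example so there is something to observe.
    explicit Holder(int v) : data(std::make_unique<int>(v)) {}

    // Conversion only allowed on rvalues, as in the question.
    operator std::unique_ptr<int>() && { return std::move(data); }

    // The "resource stealing" member function, limited to rvalues.
    std::unique_ptr<int> TakeData() && { return std::move(data); }
};

// Option 1: convert into a local of the exact return type, then return
// that local by name (eligible for NRVO, otherwise an implicit move).
std::unique_ptr<int> via_local(int v) {
    Holder val(v);
    std::unique_ptr<int> ret_val = std::move(val);
    return ret_val;
}

// Option 2: call the rvalue-qualified member; the result is a prvalue.
std::unique_ptr<int> via_member(int v) {
    Holder val(v);
    return std::move(val).TakeData();
}
```

Neither version relies on the implicit-rvalue rule selecting a conversion function, so they should not be sensitive to the C++17 versus C++20 wording difference discussed above.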

Can a C++ compiler perform RVO for a named const variable used for the return value?

This question is a slight variant on a related question shown here.
In C++17 I have a local variable that I want to be const to demonstrate it is unmodified once created, per the recommendation in Item 3 of Scott Meyers' Effective C++ to use const whenever possible:
#include <string>
std::string foo()
{
const std::string str = "bar";
return str;
}
int main()
{
std::string txt = foo();
}
Can a compiler perform (named) return-value optimization for txt, even though the type of str is different from the return type of foo due to the const-ness difference?
The named return value optimization is enabled by copy elision specified in C++17 in [class.copy.elision]. The relevant part here is [class.copy.elision]/1.1:
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the constructor selected for the copy/move operation and/or the destructor for the object have side effects. […]
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function parameter or a variable introduced by the exception-declaration of a handler ([except.handle])) with the same type (ignoring cv-qualification) as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function call's return object
[…]
emphasis mine. Thus, compilers are allowed to perform the optimization here. And a quick test would seem to verify that compilers will actually perform this optimization here…
Note that the const may be problematic nevertheless. If a compiler does not perform copy elision (it's only allowed, not guaranteed to happen here; even in C++17 since the expression in the return statement is not a prvalue), the const will generally prevent the object from being moved from instead (normally cannot move from const object)…
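The "cannot move from a const object" part can be made visible with a small sketch. Whether NRVO happens is up to the compiler, so this example deliberately uses return std::move(t) to force the non-elided path; that is a device for the demonstration, not a recommendation. Tracked, g_copies and g_moves are made-up names.

```cpp
#include <cassert>
#include <utility>

int g_copies = 0;  // copy-constructor calls
int g_moves = 0;   // move-constructor calls

struct Tracked {
    Tracked() {}
    Tracked(const Tracked&) { ++g_copies; }
    Tracked(Tracked&&) noexcept { ++g_moves; }
};

Tracked from_const() {
    const Tracked t;
    // std::move yields const Tracked&&, which cannot bind to Tracked&&,
    // so overload resolution falls back to the copy constructor.
    return std::move(t);
}

Tracked from_mutable() {
    Tracked t;
    // Tracked&& binds to the move constructor.
    return std::move(t);
}
```

So when elision does not happen, the const local costs a copy where the non-const one would have cost a move.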
Agree with Michael Kenzel's answer. But there are some peculiarities with compiler optimizations. If you look at a simple example without optimizations, you can see a feature of the msvc compiler - https://godbolt.org/z/b7z6n1YMf
Without optimizations, msvc will produce a copy of the object instead of NRVO.
This may be important in some cases.

Copy semantics for C++ unique pointer

Is there something wrong if I write something like this:
Try<std::unique_ptr<int> > some_function() {
std::unique_ptr<int> s(new int(2));
return s;
}
Is the copy constructor invoked? Should I use std::move?
std::unique_ptr doesn't have a copy constructor. What you're doing there is the same as assigning with a unique_ptr: the pointer is moved. (though in some situations you have to explicitly move() the pointer or else you'll get a compilation error; but if the compiler doesn't complain with an error, then it's quietly moving the pointer)
In a return statement, overload resolution can be performed as if the id-expression in the return statement designates an rvalue:
When the criteria for elision of a copy/move operation are met, [..], or when the expression in a return statement is a (possibly
parenthesized) id-expression that names an object with automatic storage duration declared in the body [..], overload resolution
to select the constructor for the copy is first performed as if the object were designated by an rvalue.
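So your case does indeed qualify for this rule since s is declared in the body of the function, thus std::move() is not required as overload resolution can treat s as an rvalue. (Strictly speaking this is the implicit move rather than NRVO proper, because the type of s differs from the function's return type.)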
Note that std::move() might still be needed if your compiler doesn't support the first phase of overload resolution treating s as an rvalue when the type of the return expression doesn't have the same cv-unqualified type as the function's return type. This seems to be the case with the trunk version of clang but not gcc. More info in this thread.
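A sketch of the situation, with a stand-in for Try. The Try<T> below is an assumption made up for illustration (a bare wrapper with a converting constructor taking T&&), not the real class from the question, whose constructors may look different.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the Try<T> wrapper.
template <typename T>
class Try {
    T value_;
public:
    // Converting constructor that takes ownership of an rvalue.
    Try(T&& v) : value_(std::move(v)) {}
    T& get() { return value_; }
};

Try<std::unique_ptr<int>> some_function() {
    std::unique_ptr<int> s(new int(2));
    // s names a local, so overload resolution first treats it as an
    // rvalue and selects Try(T&&); no copy of the unique_ptr is attempted.
    return s;
}
```

If a compiler rejects the plain return s; here, writing return std::move(s); is the fallback mentioned above.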

Result of ternary operator not an rvalue

If you compile this program with a C++11 compiler, the vector is not moved out of the function.
#include <vector>
using namespace std;
vector<int> create(bool cond) {
vector<int> a(1);
vector<int> b(2);
return cond ? a : b;
}
int main() {
vector<int> v = create(true);
return 0;
}
If you return the instance like this, it is moved.
if(cond) return a;
else return b;
Here is a demo on ideone.
I tried it with gcc 4.7.0 and MSVC10. Both behave the same way.
My guess why this happens is this:
The result of the ternary operator is an lvalue because it is evaluated before the return statement is executed. At this point a and b are not yet xvalues (soon to expire).
Is this explanation correct?
Is this a defect in the standard?
This is clearly not the intended behaviour and a very common case in my opinion.
Here are the relevant Standard quotes:
12.8 paragraph 32:
Copy elision is permitted in the following circumstances [...]
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function's return value
[when throwing, with conditions]
[when the source is a temporary, with conditions]
[when catching by value, with conditions]
paragraph 33:
When the criteria for elision of a copy operation are met or would be met save for the fact that the source object is a function parameter, and the object to be copied is designated by an lvalue, overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue. If overload resolution fails, or if the type of the first parameter of the selected constructor is not an rvalue reference to the object's type (possibly cv-qualified), overload resolution is performed again, considering the object as an lvalue. [Note: This two-stage overload resolution must be performed regardless of whether copy elision will occur. It determines the constructor to be called if elision is not performed, and the selected constructor must be accessible even if the call is elided. - end note]
Since the expression in return (cond ? a : b); is not a simple variable name, it's not eligible for copy elision or rvalue treatment. Maybe a bit unfortunate, but it's easy to imagine stretching the example a little bit further at a time until you create a headache of an expectation for compiler implementations.
You can of course get around all this by explicitly saying to std::move the return value when you know it's safe.
This will fix it
return cond ? std::move(a) : std::move(b);
Think of the ternary operator as a function, as if your code were
return ternary(cond, a, b);
The parameters won't be moved implicitly, you need to make it explicit.
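The difference is easy to observe with a type that counts its copies and moves. This is only a sketch; Tracked, pick_copy and pick_move are made-up names standing in for the vector example.

```cpp
#include <cassert>
#include <utility>

int g_copies = 0;  // copy-constructor calls
int g_moves = 0;   // move-constructor calls

struct Tracked {
    Tracked() {}
    Tracked(const Tracked&) { ++g_copies; }
    Tracked(Tracked&&) noexcept { ++g_moves; }
};

Tracked pick_copy(bool cond) {
    Tracked a, b;
    // The conditional expression is an lvalue, not the name of a local,
    // so neither copy elision nor the implicit move applies: one copy.
    return cond ? a : b;
}

Tracked pick_move(bool cond) {
    Tracked a, b;
    // Both operands are xvalues, so the result is an xvalue: one move.
    return cond ? std::move(a) : std::move(b);
}
```

With a std::vector the copy in pick_copy would duplicate the whole buffer, which is why the explicit std::move (or the if/else form) is worth it.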