Do rvalue references allow dangling references? - c++

Consider the below.
#include <string>
using std::string;
string middle_name () {
return "Jaan";
}
int main ()
{
string&& danger = middle_name(); // ?!
return 0;
}
This doesn't compute anything, but it compiles without error and demonstrates something that I find confusing: danger is a dangling reference, isn't it?

Do rvalue references allow dangling references?
If you meant "Is it possible to create dangling rvalue references" then the answer is yes. Your example, however,
string middle_name () {
return "Jaan";
}
int main()
{
string&& nodanger = middle_name(); // OK.
// The life-time of the temporary is extended
// to the life-time of the reference.
return 0;
}
is perfectly fine. The same rule applies here that makes this example (article by Herb Sutter) safe as well. If you initialize a reference with a prvalue (a "pure" rvalue), the lifetime of the temporary object is extended to the lifetime of the reference. You can still produce dangling references, though. For example, this is no longer safe:
int main()
{
string&& danger = std::move(middle_name()); // dangling reference !
return 0;
}
Because std::move returns a string&& (which is not a pure rvalue) the rule that extends the temporary's life-time doesn't apply. Here, std::move returns a so-called xvalue. An xvalue is just an unnamed rvalue reference. As such it could refer to anything and it is basically impossible to guess what a returned reference refers to without looking at the function's implementation.
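To make the difference observable without ever reading through a dangling reference, here is a minimal sketch that logs when the temporary is destroyed. Tracer, bind_prvalue and bind_through_move are hypothetical names made up for this illustration; Tracer(&log) stands in for the call to middle_name().

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: records its construction and destruction in a log.
struct Tracer {
    std::vector<std::string>* log;
    explicit Tracer(std::vector<std::string>* l) : log(l) { log->push_back("ctor"); }
    Tracer(const Tracer&) = delete;  // no copies, so each event is logged exactly once
    ~Tracer() { log->push_back("dtor"); }
};

// The reference binds directly to a prvalue: the temporary lives as long as ref.
std::vector<std::string> bind_prvalue() {
    std::vector<std::string> log;
    {
        Tracer&& ref = Tracer(&log);
        assert(ref.log == &log);  // safe: the temporary is still alive here
        log.push_back("next statement");
    }  // ref goes out of scope, and only now is the temporary destroyed
    return log;
}

// The reference binds to the xvalue returned by std::move: no lifetime extension,
// so the temporary is destroyed at the end of the initializing statement.
std::vector<std::string> bind_through_move() {
    std::vector<std::string> log;
    {
        Tracer&& ref = std::move(Tracer(&log));  // ref dangles from here on; it is never read
        log.push_back("next statement");
    }
    return log;
}
```

bind_prvalue() yields ctor, next statement, dtor, while bind_through_move() yields ctor, dtor, next statement: in the second case the object is already gone before the next statement runs.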

rvalue references bind to rvalues. An rvalue is either a prvalue or an xvalue [explanation]. Binding to the former never creates a dangling reference, binding to the latter might. That's why it's generally a bad idea to choose T&& as the return type of a function. std::move is an exception to this rule.
T& lvalue();
T prvalue();
T&& xvalue();
T&& does_not_compile = lvalue();
T&& well_behaved = prvalue();
T&& problematic = xvalue();
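The three cases above can be checked at compile time. This is only a sketch: it assumes T is std::string (any object type should behave the same way), and binds_to_rvalue_ref is a hypothetical helper introduced here, not a standard facility.

```cpp
#include <string>
#include <type_traits>

using T = std::string;  // assumption: any object type would do

T&  lvalue();   // declarations only: the calls below are never evaluated
T   prvalue();
T&& xvalue();

// decltype((expr)) encodes the value category of expr:
// T& for an lvalue, T&& for an xvalue, plain T for a prvalue.
static_assert(std::is_same<decltype((lvalue())),  T&>::value,  "lvalue");
static_assert(std::is_same<decltype((prvalue())), T>::value,   "prvalue");
static_assert(std::is_same<decltype((xvalue())),  T&&>::value, "xvalue");

// Hypothetical helper: can a T&& be initialized from an expression whose
// decltype((expr)) is From?
template <class From>
constexpr bool binds_to_rvalue_ref() {
    return std::is_constructible<T&&, From>::value;
}
```

The helper only tells you whether the binding compiles. Whether the result dangles still depends on what the xvalue refers to, which the type system cannot see.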

danger is a dangling reference, isn't it?
Not any more than if you had used a const &: binding danger to the returned temporary extends that temporary's lifetime to match the lifetime of the reference.

Of course, an rvalue reference is still a reference, so it can dangle as well. You just have to put the compiler in a situation where it has to carry the reference along while the referred-to value goes out of scope, like this:
#include <cstdio>
#include <tuple>
std::tuple<int&&> mytuple{ 2 };
auto pollute_stack()
{
printf("Dumdudelei!\n");
}
int main()
{
{
int a = 5;
mytuple = std::forward_as_tuple<int&&>(std::move(a));
}
pollute_stack();
int b = std::get<int&&>(mytuple);
printf("Hello b = %d!\n", b);
}
Output:
Dumdudelei!
Hello b = 0!
As you can see, b now has the wrong value. How come? We stuffed an rvalue reference to the automatic variable a into a global tuple. Then we left the scope of a and retrieved its value through std::get<int&&>, which evaluates to an rvalue reference. So the new object b is initialized from a, but a's lifetime has already ended and its stack slot has been reused. Here std::get<int&&> happened to yield 0, but reading through a dangling reference is undefined behavior, so it could yield anything.
Note that if we don't touch the stack, the rvalue reference may still find the original value of a even after its scope has ended and retrieve the "right" value (try commenting out the pollute_stack() call and see what happens). The pollute_stack() function just moves the stack pointer forward and back while writing values to the stack by doing some I/O through printf().
The compiler does not diagnose any of this, so be careful.
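For contrast, a minimal sketch of a variant that cannot dangle: store the value in the tuple instead of a reference to it. The names stored and store_and_read are made up for this illustration.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <utility>

std::tuple<int> stored{0};  // holds an int by value, not an int&&

int store_and_read() {
    {
        int a = 5;
        // forward_as_tuple produces a tuple of references, which is what
        // made the tuple of int&& dangle once a went out of scope.
        static_assert(std::is_same<decltype(std::forward_as_tuple(std::move(a))),
                                   std::tuple<int&&>>::value,
                      "forward_as_tuple keeps references");
        stored = std::make_tuple(a);  // make_tuple copies the value of a
    }
    // a is gone, but the tuple owns its own copy of the value.
    assert(std::get<0>(stored) == 5);
    return std::get<0>(stored);
}
```

Because the tuple owns its element, nothing that happens to the stack afterwards can change what store_and_read() returns.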

Related

C++: Does the following code lead to dangling reference?

I'm studying the rvalue reference concept in C++. I want to understand if the following code can create a dangling reference.
std::string&& s = std::move("test text");
std::cout << s << std::endl;
From my understanding, s should be a dangling reference because its initializer binds it to the return value of std::move. And the correct usage should be std::string&& s = "test text". But when I tried it at http://cpp.sh/, the program actually runs and prints "test text". Does this mean s is actually not a dangling reference?
I found a similar stack overflow question here:
int main()
{
string&& danger = std::move(middle_name()); // dangling reference !
return 0;
}
which confirms that this will lead to dangling reference. Can anyone give some hints? Thanks!
No it does not.
From my understanding, s should be a dangling reference because its initializer binds it to the return value of std::move
You have looked at the value categories, but not at the types.
The type of "test text" is char const[10]. This array resides in constant global data.
You then pass it to std::move, which returns an rvalue reference to the array. The type of this std::move expression is char const(&&)[10], an xvalue referring to a character array.
Then, this is used to initialize a std::string&&. But it is not a string; it is a character array. A prvalue of type std::string must therefore be constructed. This is done by calling the constructor std::string::string(char const*), since the array reference decays into a pointer when it is passed as an argument.
Since the reference is bound to a temporary materialized from that prvalue, lifetime extension applies.
The std::move is completely irrelevant and does absolutely nothing in this case.
So in the end your code is functionally equivalent to this:
std::string&& s = std::string{"test text"};
std::cout << s << std::endl;
The answer would be different if you had used a std::string literal:
// `"test text"s` is a `std::string` prvalue
std::string&& s = std::move("test text"s);
std::cout << s << std::endl;
In this case, you have a prvalue that you pass to std::move, which returns an xvalue of type std::string, so std::string&&.
Here, no temporary is materialized from the xvalue returned by std::move, and the reference simply binds to the result of std::move. No lifetime extension is applied.
This code would be UB, like the example you posted.
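The types involved can be checked with a few static_asserts. This is a sketch that assumes C++14 or later (for the s literal suffix); read_through_move is a hypothetical name used only here.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

using namespace std::string_literals;  // enables "..."s (C++14)

// A string literal is an lvalue of type const char[N], N including the '\0'.
static_assert(std::is_same<decltype("test text"), const char(&)[10]>::value, "");
// std::move only changes the value category; it is still an array, not a string.
static_assert(std::is_same<decltype(std::move("test text")), const char(&&)[10]>::value, "");
// With the s suffix we start from a std::string prvalue...
static_assert(std::is_same<decltype("test text"s), std::string>::value, "");
// ...and std::move turns it into an xvalue of type std::string.
static_assert(std::is_same<decltype(std::move("test text"s)), std::string&&>::value, "");

// The array case is safe: a temporary std::string is created from the array
// and bound to s, so its lifetime is extended to that of s.
std::string read_through_move() {
    std::string&& s = std::move("test text");
    assert(s.size() == 9);
    return s;  // copies the still-alive temporary
}
```

The version with "test text"s is deliberately not executed here, because reading s in that case would be undefined behavior.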
the program actually runs and prints "test text". Does this mean s is actually not a dangling reference?
No. Program running and printing "test text" does not mean that the program doesn't have a dangling reference. It is possible for a program to have that behaviour even when there is a dangling reference.
I found a similar stack overflow question here: ... which confirms that this will lead to dangling reference.
It confirms no such thing, because the linked question is different.
There is no dangling reference in the example. The string literal is an lvalue to an array of const char. The array has static storage duration, and a reference to the array will remain valid through the entire program.

Returning a reference to an implicitly constructed parameter vs an implicitly constructed internal object

C++11 question
Trying to understand an issue I came across in our code. I'm not looking for the "right" way to do this; I just want to know how this is supposed to work so I can figure out how to fix things in the future.
I think function f1() is fine returning a reference to the temp implicitly constructed on the same line as p1.
But what about p2? The temp is constructed implicitly in the body of f2() when calling f1(), but f2() returns the reference that f1() returned. I thought the lifetime of the temp is extended to match the lifetime of the reference.
I'm asking because with one of our compilers p2 is garbage on the next line, but with the others it is not.
struct PP
{
int i;
PP(int i_) : i(i_) {}
};
const PP &f1(const PP &p)
{
return p;
}
const PP &f2(int i)
{
return f1(i);
} // does the temp live here after return?
int main()
{
const PP &p1 = f1(1);
const PP &p2 = f2(2); // is p2 valid on the NEXT line
return 0;
}
I think this is undefined behaviour, and the fact that different compilers have different outputs would suggest that.
In f2, you're returning the reference returned by f1, which refers to the temporary PP implicitly constructed from the local i. That temporary is destroyed at the end of the return statement, which invalidates the reference.
In fact, clang-tidy detects this (godbolt example).
Also, the reference initialization page on cppreference states, as an exception to lifetime extension:
a temporary bound to a return value of a function in a return
statement is not extended: it is destroyed immediately at the end of
the return expression. Such return statement always returns a dangling
reference.
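Here is a sketch that makes the destruction point visible by adding logging to PP. The global g_log and the functions trace_passthrough and trace_direct are assumptions added for this illustration; neither trace function ever reads through a reference after its target has been destroyed.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> g_log;  // hypothetical global event log

struct PP {
    int i;
    PP(int i_) : i(i_) { g_log.push_back("PP(" + std::to_string(i) + ")"); }
    ~PP() { g_log.push_back("~PP(" + std::to_string(i) + ")"); }
};

const PP& f1(const PP& p) { return p; }

// The temporary bound to f1's parameter lives only until the end of the
// full-expression containing the call. Binding f1's result to another
// reference does not extend it.
std::vector<std::string> trace_passthrough() {
    g_log.clear();
    {
        const PP& p1 = f1(1);      // p1 dangles once this statement ends; it is never read
        assert(g_log.size() == 2); // constructed and already destroyed
        g_log.push_back("next line");
    }
    return g_log;
}

// Binding a reference directly to the temporary does extend its lifetime.
std::vector<std::string> trace_direct() {
    g_log.clear();
    {
        const PP& q = 2;           // temporary PP(2) lives as long as q
        assert(q.i == 2);
        g_log.push_back("next line");
    }
    return g_log;
}
```

trace_passthrough() yields PP(1), ~PP(1), next line, so a reference obtained through f1 is already dangling on the next line. That applies to p1 in the question as well as p2. trace_direct() yields PP(2), next line, ~PP(2).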

Reference initialization - temporary bound to return value

In an article about reference initialization at cppreference.com (Lifetime of a temporary), it says:
a temporary bound to a return value of a function in a return statement is not extended: it is destroyed immediately at the end of the return expression. Such function always returns a dangling reference.
This excerpt addresses the exceptions of extending the lifetime of a temporary by binding a reference to it. What do they actually mean by that? I've thought about something like
#include <iostream>
int&& func()
{
return 42;
}
int main()
{
int&& foo = func();
std::cout << foo << std::endl;
return 0;
}
So foo should be referencing the temporary 42. According to the excerpt, this should be a dangling reference - but this prints 42 instead of some random value, so it works perfectly fine.
I'm sure I'm getting something wrong here, and would appreciate if somebody could resolve my confusion.
Your example is very good, but your compiler is not.
A temporary is often a literal value, a function return value, but also an object passed to a function using the syntax "class_name(constructor_arguments)". For example, before lambda expressions were introduced to C++, to sort things one would define some struct X with an overloaded operator() and then make a call like this:
std::sort(v.begin(), v.end(), X());
In this case you expect that the lifetime of the temporary constructed with X() will end at the semicolon that ends the statement.
If you call a function that expects a const reference, say, void f(const int & n), with a temporary, e.g. f(2), the compiler creates a temporary int, initialises it with 2, and passes a reference to this temporary to the function. You expect this temporary to end its life at the semicolon in f(2);.
Now consider this:
int && ref = 2;
std::cout << ref;
This code is perfectly valid. Notice, however, that here the compiler also creates a temporary object of type int and initialises it with 2. It is this temporary that ref binds to. However, if the temporary's lifetime were limited to the statement it is created in, and ended at the semicolon that marks the end of that statement, the next statement would be a disaster, as cout would be using a dangling reference. Thus, references to temporaries like the one above would be rather impractical. This is what the "extension of the lifetime of a temporary" is needed for. I suspect that the compiler, upon seeing something like int && ref = 2, is allowed to transform it to something like this
int tmp = 2;
int && ref = std::move(tmp);
std::cout << ref; // equivalent to std::cout << tmp;
Without lifetime extension, this could look rather like this:
{
int tmp = 2;
int && ref = std::move(tmp);
}
std::cout << ref; // what is ref?
Doing such a trick in a return statement would be pointless. There's no reasonable, safe way to extend the lifetime of any object local to a function.
BTW. Most modern compilers issue a warning and reduce your function
int&& func()
{
return 42;
}
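to something that behaves like this (illustrative only; a reference cannot actually be initialized from nullptr)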
int&& func()
{
return nullptr;
}
with an immediate segfault upon any attempt to dereference the return value.
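If the goal is simply to hand 42 to the caller, a minimal sketch of the safe alternative is to return by value and let the caller bind a reference to the returned prvalue, where lifetime extension does apply. func_by_value and read_through_reference are hypothetical names for this illustration.

```cpp
#include <cassert>

// Returning by value yields a prvalue instead of a reference to a dead temporary.
int func_by_value() {
    return 42;
}

int read_through_reference() {
    int&& foo = func_by_value();  // the temporary lives as long as foo
    assert(foo == 42);
    foo += 1;                     // the extended temporary is an ordinary, modifiable int
    return foo;
}
```

Here foo refers to an object owned by the caller's scope, so nothing depends on the callee's stack frame.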

How does rvalue reference work here?

I am puzzled by the following code:
#include <iostream>
int main()
{
int x{};
int&& rvx = static_cast<int&&>(x);
++rvx;
std::cout << x << std::endl;
}
The output of it is 1. I don't understand how this works. The static_cast is supposed to cast the lvalue x into an xvalue, which is then assigned to rvx. Why does incrementing rvx lead to a change in x? Is this because the converted lvalue-to-rvalue is essentially sitting at the same memory location, but it is just considered now a rvalue? I was under the impression (which is probably false) that somehow the cast creates a temporary out of its argument.
An rvalue reference is a reference. In this respect, it works just like any other reference.
An rvalue reference can bind to a temporary. This is what you'd get, for instance, if you write
int x{};
int&& rvx = +x;
But it doesn't need to bind to a temporary. By casting x to int&&, you've indicated to the compiler that that's okay too, that it may treat x as an rvalue to bind directly to the reference.
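The two cases can be told apart by comparing addresses. A small sketch, with function names invented for the illustration:

```cpp
#include <cassert>

// static_cast<int&&>(x) does not create anything: rvx is another name for x.
bool cast_aliases_x() {
    int x{};
    int&& rvx = static_cast<int&&>(x);
    ++rvx;
    return &rvx == &x && x == 1;
}

// +x is a prvalue, so rvx binds to a fresh temporary and x stays untouched.
bool unary_plus_makes_temporary() {
    int x{};
    int&& rvx = +x;
    ++rvx;
    return &rvx != &x && x == 0 && rvx == 1;
}
```

Both functions return true, which matches the output of 1 in the question: the increment went through the reference straight into x.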

Why does "most important const" have to be const?

In http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/ it mentions the "most important const", whereby C++ deliberately specifies that binding a temporary object to a reference to const on the stack lengthens the lifetime of the temporary to the lifetime of the reference itself. I was wondering why C++ only allows the lifetime of the object to be lengthened when the reference is const and not when it isn't. What is the rationale behind the feature, and why does it have to be const?
Here's an example:
void square(int &x)
{
x = x * x;
}
int main()
{
float f = 3.0f;
square(f);
std::cout << f << '\n';
}
If temporaries could bind to non-const lvalue references, the above would happily compile, but produce rather surprising results (an output of 3 instead of 9).
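The hidden temporary can be observed with a const reference, which is allowed to bind here. A sketch, with read_after_change as a made-up name:

```cpp
#include <cassert>

int read_after_change() {
    float f = 3.0f;
    const int& r = f;  // binds to a temporary int converted from f, not to f itself
    f = 9.0f;          // changing f does not affect that temporary
    assert(r == 3);
    return r;
}
```

The function returns 3. A non-const int& bound the same way would let square() update that temporary int while f stayed at 3, which is exactly the surprise the rule prevents.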
Consider the following:
int& x = 5;
x = 6;
What should happen if this was allowed? By contrast, if you did
const int& x = 5;
there would be no legal way to modify x.
Note that const references can be bound to objects that don't even have an address normally. A const int & function parameter can take an argument formed by the literal constant expression 42. We cannot take the address of 42, so we cannot pass it to a function that takes a const int *.
const references are specially "blessed" to be able to bind to rvalues such as this.
Of course, for traditional rvalues like 2 + 2, lifetime isn't an issue. It's an issue for rvalues of class type.
If the binding of a reference is allowed to some object which, unlike 42, does not have a pervasive lifetime, that lifetime has to be extended, so that the reference remains sane throughout its scope.
It's not that the const causes a lifetime extension, it's that a non-const reference is not allowed. If that were allowed, it would also require a lifetime extension; there is no point in allowing some reference which then goes bad in some parts of its scope. That behavior undermines the concept that a reference is safer than a pointer.