I'm studying the rvalue reference concept in C++. I want to understand if the following code can create a dangling reference.
std::string&& s = std::move("test text");
std::cout << s << std::endl;
From my understanding, s should be a dangling reference because it binds to the return value of std::move, and the correct usage should be std::string&& s = "test text". But when I tried it on http://cpp.sh/, the program actually ran and printed "test text". Does this mean s is actually not a dangling reference?
I found a similar stack overflow question here:
int main()
{
string&& danger = std::move(middle_name()); // dangling reference !
return 0;
}
which confirms that this will lead to dangling reference. Can anyone give some hints? Thanks!
No it does not.
From my understanding, s should be a dangling reference because it binds to the return value of std::move
You have looked at the value categories, but not at the types.
The type of "test text" is char const[10]. This array resides in constant global data (it has static storage duration).
You then pass it to std::move, which returns an rvalue reference to the array. The type of this move cast expression is char const(&&)[10], an xvalue of character array type.
Then, this is used to initialize a std::string&&. But it is not a string, it is a character array, so a prvalue of type std::string must first be constructed. This is done by calling the constructor std::string::string(char const*), since the array decays into a pointer when it is passed around.
Since the reference is bound to a temporary materialized from that prvalue, lifetime extension applies.
The std::move is completely irrelevant and does absolutely nothing in this case.
So in the end your code is functionally equivalent to this:
std::string&& s = std::string{"test text"};
std::cout << s << std::endl;
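The types involved can be checked at compile time. The following is a minimal sketch, assuming a C++17 compiler; the helper name read_through_reference is made up for illustration and is not part of the question.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// The literal is an lvalue of type `const char[10]`
// (nine characters plus the terminating '\0').
static_assert(std::is_same_v<decltype(("test text")), const char (&)[10]>);

// std::move is only a cast: the result is an xvalue of the same array type,
// not a std::string.
static_assert(std::is_same_v<decltype(std::move("test text")), const char (&&)[10]>);

// Hypothetical helper: binds the reference exactly as in the question and
// returns the contents by value.
std::string read_through_reference() {
    std::string&& s = std::move("test text");  // a std::string temporary is created here
    assert(s.size() == 9);                     // still alive: its lifetime was extended
    return s;
}
```

Both static_asserts hold, which shows that the std::string only comes into existence when the reference is initialized.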
The answer would be different if you had used a std::string literal:
// `"test text"s` is a `std::string` prvalue
std::string&& s = std::move("test text"s);
std::cout << s << std::endl;
In this case, you have a prvalue that you pass to std::move, which returns an xvalue of type std::string, so std::string&&.
In this case, no temporary is materialized from the xvalue returned by std::move, and the reference simply binds to the result of std::move. No lifetime extension is applied here.
This code would be UB, like the example you posted.
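The difference between the two bindings can be observed without ever reading through the dangling reference. This is only a sketch: Tracker is a hypothetical helper type that counts live instances, and the code assumes C++17 (for the inline static member).

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper type: counts how many instances are currently alive.
struct Tracker {
    static inline int live = 0;
    Tracker() { ++live; }
    Tracker(const Tracker&) { ++live; }
    ~Tracker() { --live; }
};

// Number of live Trackers while a directly bound reference is in scope.
int live_after_direct_binding() {
    [[maybe_unused]] Tracker&& r = Tracker{};  // bound to a prvalue: lifetime extended
    return Tracker::live;                      // the temporary is still alive
}

// Number of live Trackers while a reference bound through std::move is in scope.
int live_after_move_binding() {
    [[maybe_unused]] Tracker&& r = std::move(Tracker{});  // bound to an xvalue: no extension
    return Tracker::live;  // the temporary died at the end of the declaration; r dangles
}

void check_lifetimes() {
    assert(live_after_direct_binding() == 1);
    assert(live_after_move_binding() == 0);
}
```

Calling check_lifetimes() from main passes both assertions. The second function never touches r, so the sketch itself stays free of undefined behaviour.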
the program actually runs and prints "test text". Does this mean s is actually not a dangling reference?
No. Program running and printing "test text" does not mean that the program doesn't have a dangling reference. It is possible for a program to have that behaviour even when there is a dangling reference.
I found a similar stack overflow question here: ... which confirms that this will lead to dangling reference.
It confirms no such thing, because the linked question is different.
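There is no dangling reference in the example. The string literal is an lvalue of type array of const char. The array has static storage duration, so a reference to the array remains valid for the entire program. The std::string&& itself binds to a temporary std::string constructed from that array, and binding directly to that prvalue extends the temporary's lifetime.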
Can someone tell me if this is safe, because I think it isn't:
class A
{
public:
A(int*& i) : m_i(i)
{}
int*& m_i;
};
class B
{
public:
B(int* const& i) : m_i(i)
{}
int* const & m_i;
};
int main()
{
int i = 1;
int *j = &i;
A a1(j); // this works (1)
A a2(&i); // compiler error (2)
B b(&i); // this works again (3)
}
I understand why (1) works. We are passing a pointer, the function accepts it as a reference.
But why doesn't (2) work? From my perspective, we are passing the same pointer, just without assigning it to a pointer variable first. My guess is that &i is an rvalue and has no memory of its own, so the reference cannot be valid. I can accept that explanation (if it's true).
But why the heck does (3) compile? Wouldn't that mean that we allow the invalid reference so b.m_i is essentially undefined?
Am I completely wrong in how this works? I am asking because I am getting weird unit test fails that I can only explain by pointers becoming invalid. They only happen for some compilers, so I was assuming this must be something outside the standard.
So my core question basically is: Is using int* const & in a function argument inherently dangerous and should be avoided, since an unsuspecting caller might always call it with &i like with a regular pointer argument?
Addendum: As @franji1 pointed out, the following is an interesting experiment for understanding what happens here. I modified main() to change the inner pointer and then print the members m_i:
int main()
{
int i = 1;
int *j = &i; // j points to 1
A a1(j);
B b(&i);
int re = 2;
j = &re; // j now points to 2
std::cout << *a1.m_i << "\n"; // output: 2
std::cout << *b.m_i << "\n"; // output: 1
}
So, clearly a1 works as intended.
However, since b cannot know that j has been modified, it seems to hold a reference to a "personal" pointer, but my worry is that it is not well defined in the standard, so there might be compilers for which this "personal" pointer is undefined. Can anyone confirm this?
A's constructor takes a non-const reference to an int* pointer. A a1(j); works, because j is an int* variable, so the reference is satisfied. And j outlives a1, so the A::m_i member is safe to use for the lifetime of a1.
A a2(&i); fails to compile, because although &i is an int*, operator& returns a temporary value, which cannot be bound to a non-const reference.
B b(&i); compiles, because B's constructor takes a const reference to an int* (an int* const&), which can be bound to a temporary. The temporary stays alive while it is bound to the constructor's i parameter, but it is destroyed at the end of the full-expression containing the constructor call. After that, the B::m_i member is a dangling reference and is not safe to use.
j is an lvalue and as such it can be bound to a non-const lvalue reference.
&i is a prvalue and it cannot be bound to a non-const lvalue reference. That's why (2) doesn't compile.
&i is a prvalue (a temporary) and it can be bound to a const lvalue reference. Binding a prvalue to a reference extends the lifetime of the temporary to the lifetime of that reference; here the reference is the constructor parameter i, so the temporary lives only until the full-expression containing the constructor call ends. You then initialize the member reference m_i from i (the constructor parameter, which refers to the temporary), but because i is an lvalue, the lifetime of the temporary is not extended any further. In the end, the reference member m_i is bound to an object that is no longer alive: you have a dangling reference. Accessing m_i from then on (after the declaration of b has completed) is undefined behavior.
Simple table of what references can bind to: C++11 rvalue reference vs const reference
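If the class really has to keep a reference to the caller's pointer, one common way to make the B b(&i) mistake a compile error is to delete the rvalue overload of the constructor. The sketch below is not from the question: SafeB, ByValueB and read_both are hypothetical names, and it assumes C++17.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical variant of B: still stores a reference to the caller's pointer,
// but rejects temporaries at compile time.
class SafeB {
public:
    explicit SafeB(int* const& p) : m_p(p) {}
    SafeB(int*&&) = delete;  // SafeB b(&i) would select this overload
    int* const& m_p;
};

// Hypothetical alternative: copy the pointer, so no reference can dangle.
class ByValueB {
public:
    explicit ByValueB(int* p) : m_p(p) {}
    int* m_p;
};

static_assert(std::is_constructible_v<SafeB, int*&>);    // named pointer: accepted
static_assert(!std::is_constructible_v<SafeB, int*>);    // prvalue such as &i: rejected
static_assert(std::is_constructible_v<ByValueB, int*>);  // a copy is always fine

int read_both() {
    int i = 1;
    int* j = &i;
    SafeB b(j);        // fine: j is a named pointer that outlives b
    // SafeB bad(&i);  // would not compile: the deleted overload is chosen
    ByValueB c(&i);    // fine: the address is copied into the member
    assert(b.m_p == c.m_p);
    return *b.m_p + *c.m_p;
}
```

Storing the pointer by value is usually the simpler choice; the reference member is only needed when the class must observe later changes to the caller's pointer, as a1 does in the addendum.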
A pointer is a memory address. For simplicity, think of a pointer as a uint64_t variable holding a number that represents the memory address of something. A reference is just an alias for some variable.
In example (1) you are passing a pointer to a constructor that expects a reference to a pointer. It works as intended: the compiler takes the address of the memory where the pointer's value is stored and passes it to the constructor, which uses it to create an alias. As a result you get an alias of j. If you modify j to point to something else, then m_i changes as well, and you can also modify m_i to make j point to something else.
In example (2) you are passing a bare address value to a constructor that expects a reference to a pointer. &i is only a value; there is no pointer variable behind it to alias, so the compiler has no way to satisfy the signature of the constructor.
In example (3) you are passing the same bare address value to a constructor that expects a const reference to a pointer. In this case the compiler is allowed to create a temporary pointer that holds the address and to bind the reference to it. That temporary is neither i nor j, and it only lives until the end of the statement, so m_i ends up as an alias of a pointer that no longer exists.
EDIT (for clarity): The difference between (2) and (3) is that &i cannot initialize an int*&, but it can initialize an int* const&.
#include <iostream>
using namespace std;
int main()
{
int i = 0;
cout << &i << endl;
const auto &ref = (short&&)i;
cout << &ref << endl;
return 0;
}
Why is &i different from &ref? (short&)i doesn't cause this problem. Does (short&&)i generate a temporary variable?
It's because you're doing a different kind of cast. A C-style explicit conversion always performs a static cast if it can be interpreted as one; otherwise it performs a reinterpret cast. Either is combined with a const cast as needed.
(short&&)i is a static cast because it can be interpreted as static_cast<short&&>(i). It creates a temporary short object, to which ref is bound. Being a different object, it has a different address.
(short&)i is a reinterpret cast because it cannot be interpreted as static_cast<short&>(i), which is ill-formed. It reinterprets the int reference as a short reference, and ref is bound to the same object. Note that accessing the object through this reference would have undefined behaviour.
This creates a lvalue reference to a thing that exists:
const auto& ref = i;
The expressions &ref and &i will therefore give the same result.
This is also true of:
const auto& ref = (int&)i;
which is basically the same thing.
However, casting to something that is not a lvalue reference to T (so, to a value, or to an rvalue reference of another type!) must create a temporary; this temporary undergoes lifetime extension when bound to ref. But now ref does not "refer to" i, so the address-of results will differ.
It's actually a little more complicated than that, but you get the idea. Besides, don't write code like this! An int is not a short and you can't pretend that it is.
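The address comparison can be turned into a small check. To stay within a case where lifetime extension is clearly guaranteed, this sketch casts to a value (static_cast<short>) rather than to short&&; the function names are made up for illustration.

```cpp
#include <cassert>

// Casting to a value creates a new short object; the reference binds to it.
bool value_cast_aliases_original() {
    int i = 0;
    const short& ref = static_cast<short>(i);  // temporary short, lifetime-extended
    return static_cast<const void*>(&ref) == static_cast<const void*>(&i);
}

// Casting to an lvalue reference of the same type creates nothing new.
bool same_type_reference_aliases_original() {
    int i = 0;
    const int& ref = static_cast<int&>(i);  // refers to i itself
    return &ref == &i;
}

void check_addresses() {
    assert(!value_cast_aliases_original());
    assert(same_type_reference_aliases_original());
}
```

Only addresses are compared here; nothing is read through a reference of the wrong type.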
Apparently it creates a temporary.
Actually the compiler will tell you itself.
Try this:
auto &ref = (short&&)i;
cout << &ref << endl;
The error says:
error: non-const lvalue reference to type 'short' cannot bind to a
temporary of type 'short'
(short&&)i creates a temporary, so you take the address of another object, and the address may differ.
When a function parameter is of type lvalue reference lref:
void PrintAddress(const std::string& lref) {
std::cout << &lref << std::endl;
}
and lref is bound to a prvalue:
PrintAddress(lref.substr() /* temporary of type std::string */)
what does the address represent? What lives there?
A prvalue cannot have its address taken. But an lvalue reference to a prvalue can have its address taken, which is curious to me.
Inside the function, lref is not a prvalue; it is an lvalue, and you can take its address.
There is a common misconception about rvalues vs. lvalues.
A named parameter is always an lvalue. No matter whether it is a reference type that is bound to an rvalue. Through a const & reference type you can't even tell which kind of value category the object actually has at the point where the function is called. Rvalue references and non-const Lvalue references give you that information:
void foo(std::string& L, std::string&& R)
{
// yeah i know L is already an lvalue at the point where foo is called
// R on the other hand is an rvalue at the point where we get called
// so we can 'safely' move from it or something...
}
The temporary string is a prvalue in the context of the caller (at the point PrintAddress is called). Within the context of the callee (in PrintAddress) lref is an lvalue reference because in this context it actually is an lvalue.
PrintAddress isn't aware of the limited lifetime of the passed argument and from PrintAddress' point of view the object is "always" there.
std::string q("abcd");
PrintAddress(q.substr(1)); // print address of temporary
is conceptually equivalent to:
std::string q("abcd");
{
const std::string& lref = q.substr(1);
std::cout << &lref << std::endl;
}
where the temporary's lifetime is prolonged to the end of the scope in which lref is defined. In the actual function call, the temporary lives until the end of the full-expression containing the call, which covers the whole execution of PrintAddress.
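How long the argument temporary lives can be made visible with an instance counter. This is a sketch under assumptions: Counted is a hypothetical helper type, the function names are invented, and C++17 is required for the inline static member.

```cpp
#include <cassert>

// Hypothetical helper type: tracks how many instances are alive.
struct Counted {
    static inline int live = 0;
    Counted() { ++live; }
    Counted(const Counted&) { ++live; }
    ~Counted() { --live; }
};

// Inside the callee the argument is an ordinary, addressable object.
int live_inside_callee(const Counted& lref) {
    const Counted* p = &lref;  // legal: the expression lref is an lvalue
    assert(p != nullptr);
    return Counted::live;
}

int live_after_call() {
    int during = live_inside_callee(Counted{});  // the temporary lives until the ';'
    assert(during == 1);
    return Counted::live;                        // already destroyed at this point
}
```

The temporary exists for the whole call and is gone as soon as the statement containing the call has finished.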
what does the address represent? What lives there?
A std::string object containing the passed content.
And is it legal (in C++, and with respect to memory) to write to that address?
No. It would be legal if you used an rvalue reference:
void PrintAddressR(std::string&& rref) {
rref += "Hello"; // writing possible
std::cout << &rref << std::endl; // taking the address possible
}
// ...
PrintAddressR(q.substr(1)); // yep, can do that...
The same applies here: rref is an lvalue (it has a name) so you can take its address plus it is mutable.
In short, because the temporary is now bound to a named reference. The expression lref is an lvalue that designates that object, and thus its address can be taken.
what does the address represent? What lives there?
The address represents an object, the object referenced by lref.
A prvalue is short-lived. In fact, the temporary it creates is destroyed at the end of the full-expression that creates it.
But when you bind a reference to a prvalue (either an rvalue reference or a const lvalue reference), its lifetime is extended. Reference:
An rvalue may be used to initialize a const lvalue [rvalue] reference, in which case the lifetime of the object identified by the rvalue is extended until the scope of the reference ends.
Now it actually makes sense to take its address: the reference gives the object a name, and the name is an lvalue for all intents and purposes.
Taking the address of a prvalue doesn't make sense, however, and that's probably why it is disallowed:
1. The value is destroyed at the end of the statement, so you can't do anything with the address, except maybe print it out.
2. If you take the address of something, the compiler is required to actually create the object. Sometimes the compiler will optimize out variables that are trivial, but if you take their address, the compiler isn't allowed to optimize them out. Taking the address of a prvalue would thus prevent the compiler from eliding the value completely, for no advantage whatsoever (see point 1).
In simple English:
void PrintAddress(const std::string& lref) {
std::cout << &lref << std::endl;
}
Any object that has a name is an lvalue, hence any use of lref within the scope of the function above is an lvalue use.
When you called the function with:
PrintAddress(lref.substr() /* temporary of type std::string */)
Of course, lref.substr() produces a temporary, which is an rvalue, but rvalues can bind to const lvalue references or rvalue references (and have their lifetime extended that way).
Even if you provide an rvalue overload, the parameter has a name, so it is an "lvalue of something" within its scope. Example:
#include <string>
#include <iostream>
void PrintAddress(const std::string& lref) {
std::cout << "LValue: " << &lref << std::endl;
}
void PrintAddress(std::string&& rref) {
std::cout << "RValue: " << &rref << std::endl; // You can take the address of rref
}
int main(){
std::string str = "Hahaha";
PrintAddress(str);
PrintAddress(str.substr(2));
}
Just remember:
In C++, any object (whether value type, reference type or pointer type) that has a name is an lvalue.
Also know that some expressions produce lvalues too.
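The rule "a name is an lvalue, even if its type is an rvalue reference" can be checked with a pair of overloads. The following is a minimal sketch; bound_as, pass_along and pass_along_moved are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair that reports which kind of reference the argument bound to.
std::string bound_as(const std::string&) { return "lvalue"; }
std::string bound_as(std::string&&)      { return "rvalue"; }

// rref has rvalue-reference type, yet the expression `rref` is an lvalue.
std::string pass_along(std::string&& rref) { return bound_as(rref); }

// std::move turns the name back into an rvalue (an xvalue) expression.
std::string pass_along_moved(std::string&& rref) { return bound_as(std::move(rref)); }

void check_named_references() {
    std::string str = "Hahaha";
    assert(bound_as(str) == "lvalue");                    // a named object
    assert(bound_as(str.substr(2)) == "rvalue");          // a temporary
    assert(pass_along(str.substr(2)) == "lvalue");        // named inside the callee
    assert(pass_along_moved(str.substr(2)) == "rvalue");  // explicitly moved
}
```

This is why forwarding an rvalue reference parameter to another function needs std::move (or std::forward in templates): without it, the lvalue overload is selected.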
I am just starting to learn C++. I found this advice on the Internet: "Learn with a good book, it is better than videos on youtube." So, as I am motivated and have the time, I am learning with C++ Primer, 5th Ed.
In this book, they say:
Note: "A reference is not an object. Instead, a reference is just another name for an already existing object."
and:
"a reference may be bound only to an object, not to a literal or to the result of a more general expression"
I understand:
int i = 3;
int &ri = i; // is valid: ri is a new name for i
int &ri2 = 2; // is not valid: 2 is not an object
Then I don't understand why:
const int &ri3 = 2; // is valid
They write: "It can be easier to understand complicated pointer or reference declarations if
you read them from right to left."
Ok, it is not very complicated. I understand:
I declare a variable named ri3,
it is a reference (a reference when & is after the type, an address when & is in an expression)
to an object of type int
and it is a constant.
I think it has already been explained many times but when I search on forums I find complicated (to me) answers to complicated problems, and I still don't understand.
Thank you for your help.
https://stackoverflow.com/a/7701261/1508519
You cannot bind a literal to a reference to non-const (because
modifying the value of a literal is not an operation that makes
sense). You can however bind a literal to a reference to const.
http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/
The "const" is important. The first line is an error and the code
won’t compile portably with this reference to non-const, because f()
returns a temporary object (i.e., rvalue) and only lvalues can be
bound to references to non-const.
For illustrative purposes see this answer.
A non-const reference cannot bind to a literal.
The following code will produce an error.
error: invalid initialization of non-const reference of type
'double&' from an rvalue of type 'double'
#include <iostream>
void foo(double& x) { // non-const lvalue reference parameter; nothing to return
x = 1;
}
int main () {
foo(5.0);
return 0;
}
Here's Lightness' comment.
[C++11: 5.1.1/1]: [..] A string literal is an lvalue; all other
literals are prvalues.
And cppreference (scroll down to rvalue (until C++11) / prvalue (since C++11)):
A prvalue ("pure" rvalue) is an expression that identifies a temporary
object (or a subobject thereof) or is a value not associated with any
object.
The following expressions are prvalues:
Literal (except string literal), such as 42 or true or nullptr.
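It is valid because numeric literals are constants (prvalues, not objects). The compiler creates a temporary int initialized from the literal and binds the reference to that temporary, which it only accepts if the reference is const.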
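The two binding rules can be written down as compile-time checks. This is a small sketch assuming C++17; read_bound_literal is a made-up helper name.

```cpp
#include <cassert>
#include <type_traits>

// int& r = 2;        ill-formed: a non-const lvalue reference cannot bind to a prvalue
static_assert(!std::is_constructible_v<int&, int>);

// const int& r = 2;  fine: a temporary int is created and the reference binds to it
static_assert(std::is_constructible_v<const int&, int>);

// Hypothetical helper: reads the value back through the reference.
int read_bound_literal() {
    const int& ri3 = 2;  // temporary int holding 2, kept alive as long as ri3
    assert(ri3 == 2);
    return ri3;
}
```

So ri3 does refer to an object after all: an unnamed temporary int that the compiler creates from the literal.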
Consider the below.
#include <string>
using std::string;
string middle_name () {
return "Jaan";
}
int main ()
{
string&& danger = middle_name(); // ?!
return 0;
}
This doesn't compute anything, but it compiles without error and demonstrates something that I find confusing: danger is a dangling reference, isn't it?
Do rvalue references allow dangling references?
If you meant "Is it possible to create dangling rvalue references" then the answer is yes. Your example, however,
string middle_name () {
return "Jaan";
}
int main()
{
string&& nodanger = middle_name(); // OK.
// The life-time of the temporary is extended
// to the life-time of the reference.
return 0;
}
is perfectly fine. The same rule applies here that makes this example (article by Herb Sutter) safe as well. If you initialize a reference with a pure rvalue, the lifetime of the temporary object gets extended to the lifetime of the reference. You can still produce dangling references, though. For example, this is not safe anymore:
int main()
{
string&& danger = std::move(middle_name()); // dangling reference !
return 0;
}
Because std::move returns a string&& (which is not a pure rvalue) the rule that extends the temporary's life-time doesn't apply. Here, std::move returns a so-called xvalue. An xvalue is just an unnamed rvalue reference. As such it could refer to anything and it is basically impossible to guess what a returned reference refers to without looking at the function's implementation.
rvalue references bind to rvalues. An rvalue is either a prvalue or an xvalue [explanation]. Binding to the former never creates a dangling reference, binding to the latter might. That's why it's generally a bad idea to choose T&& as the return type of a function. std::move is an exception to this rule.
T& lvalue();
T prvalue();
T&& xvalue();
T&& does_not_compile = lvalue();
T&& well_behaved = prvalue();
T&& problematic = xvalue();
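The three cases can be verified with decltype and type traits. In this sketch T is defined as an empty placeholder struct (an assumption made so the snippet compiles on its own), the functions are only declared and never called, and C++17 is assumed.

```cpp
#include <type_traits>

struct T {};   // placeholder type, only so that the snippet is self-contained

T&  lvalue();  // declarations only: used in unevaluated contexts, never called
T   prvalue();
T&& xvalue();

// decltype of a call expression mirrors the function's return type,
// which determines the value category of the call.
static_assert(std::is_same_v<decltype(lvalue()),  T&>);   // lvalue
static_assert(std::is_same_v<decltype(prvalue()), T>);    // prvalue
static_assert(std::is_same_v<decltype(xvalue()),  T&&>);  // xvalue

// What T&& can be initialized from:
static_assert(!std::is_constructible_v<T&&, decltype(lvalue())>);   // does not compile
static_assert(std::is_constructible_v<T&&, decltype(prvalue())>);   // well behaved
static_assert(std::is_constructible_v<T&&, decltype(xvalue())>);    // compiles, may dangle
```

The type system accepts both rvalue cases equally; only the prvalue case gets lifetime extension, which is why the xvalue case is the one to watch.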
danger is a dangling reference, isn't it?
Not any more than if you had used a const &: binding danger directly to the returned prvalue extends the temporary's lifetime to that of danger.
Of course, an rvalue reference is still a reference, so it can dangle as well. You just have to put the compiler in a situation where it has to carry the reference along while the referred-to value goes out of scope, like this:
#include <cstdio>
#include <tuple>
#include <utility> // std::move
std::tuple<int&&> mytuple{ 2 };
void pollute_stack()
{
printf("Dumdudelei!\n");
}
int main()
{
{
int a = 5;
mytuple = std::forward_as_tuple<int&&>(std::move(a));
}
pollute_stack();
int b = std::get<int&&>(mytuple);
printf("Hello b = %d!\n", b);
}
Output:
Dumdudelei!
Hello b = 0!
As you can see, b now has the wrong value. How come? The global tuple holds an rvalue reference, and by the time we read through it with std::get<int&&>, the object it refers to no longer exists. Note that the reference was bound when mytuple was constructed, to the temporary holding 2, and that temporary is destroyed at the end of that declaration. Assigning to the tuple does not rebind the reference to a; it assigns the value 5 through the reference that is already dangling. So b is initialized from an object whose lifetime has ended, and std::get<int&&> happened to evaluate to 0 here. This is undefined behaviour, so it could evaluate to anything.
Note that if we don't touch the stack, the read will often still find the value 5 and appear to work (just try it: comment out the call to pollute_stack() and see what happens). The pollute_stack() function just moves the stack pointer forward and back while writing values to the stack by doing some I/O-related work through printf().
The compiler does not diagnose any of this, so be careful.