In order to stem the argument going on in the comments of an answer I gave recently, I'd like some constructive answers to the following questions:
Is a reference's lifetime distinct from the object it refers to? Is a reference simply an alias for its target?
Can a reference outlive its target in a well-formed program without resulting in undefined behaviour?
Can a reference be made to refer to a new object if the storage allocated for the original object is reused?
Does the following code demonstrate the above points without invoking undefined behaviour?
Example code by Ben Voigt and simplified (run it on ideone.com):
#include <iostream>
#include <new>
struct something
{
int i;
};
int main(void)
{
char buffer[sizeof (something) + 40];
something* p = new (buffer) something;
p->i = 11;
int& outlives = p->i;
std::cout << outlives << "\n";
p->~something(); // p->i dies with its parent object
new (p) char[40]; // memory is reused, lifetime of *p (and p->i) is so done
new (&outlives) int(13);
std::cout << outlives << "\n"; // but reference is still alive and well
// and useful, because strict aliasing was respected
}
Is a reference's lifetime distinct from the object it refers to? Is a reference simply an alias for its target?
A reference has its own lifetime:
int x = 0;
{
int& r = x;
} // r dies now
x = 5; // x is still alive
A ref-to-const additionally may extend the lifetime of its referee:
int foo() { return 0; }
const int& r = foo(); // note, this is *not* a reference to a local variable
cout << r; // valid; the lifetime of the result of foo() is extended
though this is not without caveats:
A reference to const only extends the lifetime of a temporary object if the reference is a) local and b) bound to a prvalue whose evaluation creates said temporary object. (So it doesn't work for members, or local references which are bound to xvalues.) Also, non-const rvalue references extend the lifetime in the exact same fashion. [@FredOverflow]
Can a reference outlive its target in a well-formed program without resulting in undefined behaviour?
Sure, as long as you don't use it.
Can a reference be made to refer to a new object if the storage allocated for the original object is reused?
Yes, under some conditions:
[C++11: 3.8/7]: If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
the original object was a most derived object (1.8) of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
Does the following code demonstrate the above points without invoking undefined behaviour?
Tl;dr.
Yes. For example, local non-static references have automatic storage duration and a corresponding lifetime, and can refer to objects that have a longer lifetime.
Yes, dangling references are an example. As long as such references are not used in any expressions when they become dangling, they are fine.
There is a special rule in clause 3 about this case. Names of objects, pointers and references automatically refer to the new object that reuses the storage under restricted conditions. I believe it is at the end of 3.8. Someone who has the spec handy please fill in the correct ref here.
Related
The following code example is from cppreference on std::launder:
alignas(Y) std::byte s[sizeof(Y)];
Y* q = new(&s) Y{2};
const int f = reinterpret_cast<Y*>(&s)->z; // Class member access is undefined behavior
It seems to me that third line will result in undefined behaviour because of [basic.life]/6 in the standard:
Before the lifetime of an object has started but after the storage which the object will occupy has been allocated ... The program has undefined behavior if ... the pointer is used to access a non-static data member or call a non-static member function of the object.
Why didn't placement new start the lifetime of object Y?
The placement-new did start the lifetime of the Y object (and its subobjects).
But object lifetime is not what std::launder is about. std::launder can't be used to start the lifetime of objects.
std::launder is used when you have a pointer which points to an object of a type different than the pointer's type, which happens when you reinterpret_cast a pointer to a type for which there doesn't exist an object of the target type which is pointer-interconvertible with the former object.
std::launder can then (assuming its preconditions are met) be used to obtain a pointer to an object of the pointer's type (which must already be in its lifetime) located at the address to which the pointer refers.
Here &s is a pointer pointing to an array of sizeof(Y) std::bytes. There is also an explicitly created Y object sharing the address with that array and the array provides storage for the Y object. However, an array (or array element) is not pointer-interconvertible with an object for which it provides storage. Therefore the result of reinterpret_cast<Y*>(&s) will not point to the Y object, but will remain pointing to the array.
Accessing a member has undefined behavior if the glvalue used doesn't actually refer to an object of a type similar to the glvalue's type. That is the case here, because the lvalue refers to the array, not the Y object.
So, to get a pointer and lvalue to the Y object located at the same address as &s and already in its lifetime, you need to call std::launder first:
const int f = std::launder(reinterpret_cast<Y*>(&s))->z;
All of this complication can of course be avoided by just using the pointer returned by new directly. It already points to the newly-created object:
const int f = q->z;
Allow me to preface by saying that I don't recommend any of the practices below, for obvious reasons. However, I had a discussion today regarding it and some people were adamant about using a reference like this as being undefined behavior.
Here is a test case:
#include <new>    // placement new
#include <string>
struct my_object {
int a = 1;
int b = 2;
std::string hi = "hello";
};
// Using union purely to reserve uninitialized memory for a class.
union my_object_storage {
char dummy;
my_object memory;
// C++ will yell at you for doing this without some constructors.
my_object_storage() {}
~my_object_storage() {}
} my_object_storage_instance;
// This is so we can easily access the storage memory through "I"
constexpr my_object &I = my_object_storage_instance.memory;
//-------------------------------------------------------------
int main() {
// Initialize the object.
new (&I) my_object();
// Use the reference.
I.a = 1;
// Destroy the object (typically this should be done using RAII).
I.~my_object();
// Phase two, REINITIALIZE an object with the SAME reference.
// We still have the memory allocated which is static, so why not?
new (&I) my_object();
// Use the reference.
I.a = 1;
// Destroy the object again.
I.~my_object();
}
https://wandbox.org/permlink/YEp9aQUcWdA9YiBI
Basically, what the code does is reserve static memory for a struct and then initialize it in main(). Why would you want to do that? It isn't extremely useful and you should just use a pointer, but here is the question:
With this statement given,
constexpr my_object &I = my_object_storage_instance.memory;
is defining a reference to uninitialized memory undefined behavior? Other people have told me it is, but I'm trying to figure out concretely if that's the case. In the C++ standard we see this paragraph:
A reference shall be initialized to refer to a valid object or function. [ Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior.
Specifically "a valid object", which may boil down to: is an object that hasn't had its constructor called yet "valid"? What makes it invalid that it would cause undefined behavior? Are there actually real side effects that could arise?
My argument for this being labeled as undefined behavior is:
Compilers might be free to treat it like a valid object, because the standard states that it should be, especially during the assignment and especially if there are hidden debug instructions being inserted for diagnostics that assume such, which would certainly cause undefined behavior.
My arguments against it being undefined behavior are:
It's not dereferencing anything - the paragraph states that, during initialization of a reference, dereferencing nullptr is undefined. It doesn't specifically state undefined behavior if there isn't any dereferencing.
Dangling references are a thing, and appear in many cases in normal programs. They only cause undefined behavior IF they are used. This is similar to starting with a dangling reference.
Again, not very useful in practice because there are much better ways to spend your time, but what better place for odd questions and expert opinions than stackoverflow? :)
You're perfectly fine: your usage of the reference falls into the explicit exception to the rule that a live object is required. In [basic.life]:
Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways.
For an object under construction or destruction, see [class.cdtor]. Otherwise, such a glvalue refers to allocated storage ([basic.stc.dynamic.allocation]), and using the properties of the glvalue that do not depend on its value is well-defined. The program has undefined behavior if:
the glvalue is used to access the object, or
the glvalue is used to call a non-static member function of the object, or
the glvalue is bound to a reference to a virtual base class ([dcl.init.ref]), or
the glvalue is used as the operand of a dynamic_cast ([expr.dynamic.cast]) or as the operand of typeid.
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
neither the original object nor the new object is a potentially-overlapping subobject ([intro.object]).
Thus, your reference validly refers to allocated storage, which is exactly what you need to perform a placement-new and vivify the union member.
And since the dynamic (runtime) type of the object you create exactly matches the static type of the reference you hold, it can be used to access the new object after placement new (either the first or the second).
In 12.2 of C++11 standard:
The temporary to which the reference is bound or the temporary that is the complete object of a subobject to which the reference is bound persists for the lifetime of the reference except:
A temporary bound to a reference member in a constructor’s ctor-initializer (12.6.2) persists until the constructor exits.
A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full-expression containing the call.
The lifetime of a temporary bound to the returned value in a function return statement (6.6.3) is not extended; the temporary is destroyed at the end of the full-expression in the return statement.
A temporary bound to a reference in a new-initializer (5.3.4) persists until the completion of the full-expression containing the new-initializer.
And there is an example of the last case in the standard:
struct S {
int mi;
const std::pair<int,int>& mp;
};
S a { 1,{2,3} }; // No problem.
S* p = new S{ 1, {2,3} }; // Creates dangling reference
To me, 2. and 3. make sense and are easy to agree with. But what's the reason behind 1. and 4.? The example looks just evil to me.
As with many things in C and C++, I think this boils down to what can be reasonably (and efficiently) implemented.
Temporaries are generally allocated on the stack, and the code that calls their constructors and destructors is emitted into the function itself. So if we expand your first example into what the compiler is actually doing, it would look something like:
struct S {
int mi;
const std::pair<int,int>& mp;
};
// Case 1:
std::pair<int,int> tmp{ 2, 3 };
S a { 1, tmp };
The compiler can easily extend the life of the tmp temporary long enough to keep the S object "a" valid, because we know that "a" will be destroyed by the end of the function.
But this doesn't work in the "new S" case:
struct S {
int mi;
const std::pair<int,int>& mp;
};
// Case 2:
std::pair<int,int> tmp{ 2, 3 };
// Whoops, this heap object will outlive the stack-allocated
// temporary!
S* p = new S{ 1, tmp };
To avoid the dangling reference, we would need to allocate the temporary on the heap instead of the stack, something like:
// Case 2a -- compiler tries to be clever?
// Note that the compiler won't actually do this.
std::pair<int,int>* tmp = new std::pair<int,int>{ 2, 3 };
S* p = new S{ 1, *tmp };
But then a corresponding delete p would need to free this heap memory! This is quite contrary to the behavior of references, and would break anything that uses normal reference semantics:
// No way to implement this that satisfies case 2a but doesn't
// break normal reference semantics.
delete p;
So the answer to your question is: the rules are defined that way because it is more or less the only practical solution given C++'s semantics around the stack, heap, and object lifetimes.
WARNING: @Potatoswatter notes below that this doesn't seem to be implemented consistently across C++ compilers, and therefore is non-portable at best for now. See his example for how Clang doesn't do what the standard seems to mandate here. He also says that the situation "may be more dire than that" -- I don't know exactly what this means, but it appears that in practice this case in C++ has some uncertainty surrounding it.
The main thrust is that reference extension only occurs when the lifetime can be easily and deterministically determined, and this fact can be deduced as possible on the line of code where the temporary is created.
When you call a function, the temporary lasts until the end of the full-expression containing the call (roughly, the end of the current statement). That is long enough, and easy to determine.
When you create an automatic storage reference "on the stack", the scope of that automatic storage reference can be deterministically determined. The temporary can be cleaned up at that point. (Basically, create an anonymous automatic storage variable to store the temporary)
In a new expression, the point of destruction cannot be statically determined at the point of creation. It is whenever the delete occurs. If we wanted the delete to (sometimes) destroy the temporary, then our reference "binary" implementation would have to be more complicated than a pointer, instead of less or equal. It would sometimes own the referred to data, and sometimes not. So that is a pointer, plus a bool. And in C++ you don't pay for what you don't use.
The same holds in a constructor, because you cannot know if the constructor was in a new or a stack allocation. So any lifetime extension cannot be statically understood at the line in question.
How long do you want the temporary object to last? It has to be allocated somewhere.
It can't be on the heap because it would leak; there is no applicable automatic memory management. It can't be static because there can be more than one. It must be on the stack. Then it either lasts until the end of the expression or the end of the function.
Other temporaries in the expression, perhaps bound to function call parameters, are destroyed at the end of the expression, and persisting until the end of the function or "{}" scope would be an exception to the general rules. So by deduction and extrapolation of the other cases, the full-expression is the most reasonable lifetime.
I'm not sure why you say this is no problem:
S a { 1,{2,3} }; // No problem.
The dangling reference is the same whether or not you use new.
Instrumenting your program and running it in Clang produces these results:
#include <iostream>
struct noisy {
int n;
~noisy() { std::cout << "destroy " << n << "\n"; }
};
struct s {
noisy const & r;
};
int main() {
std::cout << "create 1 on stack\n";
s a {noisy{ 1 }}; // Temporary created and destroyed.
std::cout << "create 2 on heap\n";
s* p = new s{noisy{ 2 }}; // Creates dangling reference
}
create 1 on stack
destroy 1
create 2 on heap
destroy 2
The object bound to the class member reference does not have an extended lifetime.
Actually I'm sure this is the subject of a known defect in the standard, but I don't have time to delve in right now…
Imagine the following scenario:
#include <vector>
class ABC
{
public:
int abc;
};
ABC& modifyABC(ABC& foo)
{
foo.abc+=1337;
return foo;
}
void saveABC(ABC& bar, std::vector<ABC*>& list)
{
list.push_back(&modifyABC(bar));
}
int main()
{
ABC foobar;
std::vector<ABC*> ABCList;
saveABC(foobar,ABCList);
return 0;
}
modifyABC() returns a reference to ABC (which is internally some sort of pointer too, AFAIK). Does the "address of" & operator now return a pointer to the address of the reference, or to the actual object behind the reference?
modifyABC() returns a reference to ABC (which is internally some sort of pointer too, AFAIK)
Not exactly.
Pointers are objects (variables) that require some storage and hold in that storage the address in memory of another object. References are pure aliases, like alternative names. In theory, they do not require any storage at all.
Per Paragraph 8.3.2/4 of the C++11 Standard:
It is unspecified whether or not a reference requires storage (3.7).
So a pointer to a reference is actually a pointer to the referenced object, and any operation done on a reference (apart from the act of binding it to an object upon initialization) is actually done on the object for which the reference is an alias.
I am struggling with the last sentence of your question ("a pointer to the address of the reference"?)
What can be said is that modifyABC() takes a reference to an ABC, and returns exactly the same reference. No copy of the object is made.
The overall effect of your code is that the address of foobar is appended to ABCList.
Does the "address of" & operator now return a pointer to the address of the reference, or to the actual object behind the reference?
In C++, references, as such, do not have their own addresses. So address of a reference means address of the object the reference is referring to.
X x;
X &r = x; //reference
X *p = &r; //same as &x
Hope that helps.
Anything you do with a reference (including taking its address)
is the equivalent of doing it to the referred to object. In
C++, a reference itself is not an object, does not necessarily
occupy space in memory, and does not have an address.
A reference and a pointer are two different concepts. You may think of a reference as an alias for an existing object. Just as an alias of an alias is again an alias of the original, the return value of modifyABC() is again a reference to the original object. Taking the address of a reference always returns the address of the object the reference refers to.