According to the C++ standard, is it undefined behavior to copy a reference before initializing the object it refers to? This happens in the following example, where I pass a reference to a parent class and initialize the object itself only afterwards, because the parent class is always initialized before the members.
#include <iostream>
struct Object
{
int val;
Object(int i): val(i) {}
};
struct Parent
{
Object& ref;
Parent(Object& i): ref(i){}
};
struct Child : Parent
{
Object obj;
Child(int i): Parent(obj), obj(i) {}
};
int main()
{
std::cout << Child(3).ref.val;
}
Here when Parent is initialized with Parent(obj), the value of obj has not been initialized yet.
This compiles fine under GCC, and I do get the correct output, but I'm not sure whether the standard or good coding practice advises against it. Is it undefined behavior? And if not, is it bad practice that I should avoid?
Firstly, let me clarify one thing.
I am not sure if it is even possible to literally copy a reference.
int i = 10;
int& ref = i; // since this moment ref becomes "untouchable"
int& alt_ref = ref; // actually, means int& alt_ref = i;
I think the same happens if ref is a member of some class and you copy an instance of this class.
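A minimal sketch of the point above, assuming nothing beyond the snippet itself: a reference initialized from another reference simply names the original object. The function name references_alias_same_object is made up for illustration.

```cpp
#include <cassert>

// Hypothetical helper: shows that alt_ref is bound to i itself, not to ref.
bool references_alias_same_object() {
    int i = 10;
    int& ref = i;
    int& alt_ref = ref;   // binds to i; there is no "reference to a reference"
    alt_ref = 20;         // writes through to i
    assert(&ref == &i);   // taking the address of a reference yields the referent's address
    return &alt_ref == &i && i == 20;
}
```

Calling references_alias_same_object() should return true, since all three names denote the same int.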
Besides, if you look closer at your code, you don't even "copy a reference"; rather, you initialize a reference with an object that has not been initialized yet.
struct Parent
{
Object& ref;
Parent(Object& i): ref(i) { }
};
struct Child : Parent
{
Object obj;
Child(int i): Parent(obj), obj(i) { }
};
is physically equivalent to:
struct Child
{
Object& ref;
Object obj;
Child(int i): ref(obj), obj(i) { }
};
With that being said, your question actually means:
Is it undefined behavior to initialize a reference before
initializing the object it is about to refer to?
Here is a quote from C++ Standard (§3.8.6 [basic.life/6]) which possibly gives the answer:
Similarly, before the lifetime of an object has started but after the
storage which the object will occupy has been allocated or, after the
lifetime of an object has ended and before the storage which the
object occupied is reused or released, any glvalue that refers to the
original object may be used but only in limited ways. For an object
under construction or destruction, see 12.7. Otherwise, such a glvalue
refers to allocated storage (3.7.4.2), and using the properties of the
glvalue that do not depend on its value is well-defined.
And §12.7.1 [class.cdtor/1] just says:
...referring to any non-static member or base class of the object
before the constructor begins execution results in undefined behavior.
§12.7.1 mentions only referring to the object's members, hence referring to the object itself falls under §3.8.6.
Thus I conclude that referring to an uninitialized (but already allocated) object is well-defined.
If you see any mistakes, please let me know in the comments. Also feel free to edit this answer.
Edit:
I just want to say that this conclusion seems reasonable. Initialization of an object cannot change its location in memory. What is wrong with storing a reference to already allocated memory before it is initialized?
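To make the distinction concrete, here is a small sketch based on the code from the question. It assumes my reading of [basic.life] above is right: binding the reference is fine, while reading through it before obj is constructed would not be. The helper read_through_parent is a made-up name.

```cpp
#include <cassert>

struct Object {
    int val;
    explicit Object(int i) : val(i) {}
};

struct Parent {
    Object& ref;
    // Binding ref only needs the location of the storage, so this is fine.
    // Reading i.val in this constructor would touch an object whose
    // lifetime has not started yet when called from Child below.
    explicit Parent(Object& i) : ref(i) {}
};

struct Child : Parent {
    Object obj;
    explicit Child(int i) : Parent(obj), obj(i) {}
};

// Hypothetical helper: reads through the reference only after Child is complete.
int read_through_parent(int v) {
    Child c(v);
    assert(&c.ref == &c.obj);   // the reference names the member itself
    return c.ref.val;
}
```

By the time read_through_parent reads c.ref.val, obj has been fully constructed, so the access is an ordinary one.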
Related
In the new C++20 standard, cppreference says:
a temporary bound to a reference in a reference element of an aggregate initialized using direct-initialization syntax (parentheses) as opposed to list-initialization syntax (braces) exists until the end of the full expression containing the initializer. Example:
struct A {
int&& r;
};
A a1{7}; // OK, lifetime is extended
A a2(7); // well-formed, but dangling reference
NOTE: I am using GCC as this is the only compiler which supports this feature.
Having read this, and knowing that references are polymorphic, I decided to create a simple class:
template <typename Base>
struct Polymorphic {
Base&& obj;
};
So that the following code works:
struct Base {
inline virtual void print() const {
std::cout << "Base !\n";
}
};
struct Derived: public Base {
inline virtual void print() const {
std::cout << "Derived !" << x << "\n";
}
Derived() : x(5) {}
int x;
};
int main() {
Polymorphic<Base> base{Base()};
base.obj.print(); // Base !
Polymorphic<Base> base2{Derived()};
base2.obj.print(); // Derived !5
}
The problem I encountered is when changing the value of my polymorphic object (not the value of obj, but the value of a polymorphic object itself). Since I can't reassign to r-value references, I tried to do the following using placement new:
#include <new>
int main() {
Polymorphic<Base> base{Base()};
new (&base) Polymorphic<Base>{Derived()};
std::launder(&base)->obj.print(); // Segmentation fault... Why is that?
return 0;
}
I believe I get a segmentation fault because Polymorphic<Base> has a reference subobject. However, I am using std::launder - isn't that supposed to make it work? Or is this a bug in GCC? If std::launder does not work, how do I tell the compiler not to cache the references?
On a side note, please do not tell me "your code is stupid", "use unique pointers instead"... I know how normal polymorphism works; I asked this question to deepen my understanding of placement new and std::launder :)
[class.temporary]/6.12 states:
A temporary bound to a reference in a new-initializer persists until the completion of the full-expression containing the new-initializer.
It is not picky about how the reference is initialized; this applies to all ways such references get initialized. Indeed, there's even an example:
struct S { int mi; const std::pair<int,int>& mp; };
S a { 1, {2,3} };
S* p = new S{ 1, {2,3} }; // creates dangling reference
A placement new-expression has a new-initializer too, so this rule applies. Just like in the example above, base ends up containing a dangling reference. It doesn't matter how you access it after that point; the object it referred to has been destroyed, so you get UB.
If the lifetime of a temporary is not obvious to the reader of some code, you should not be using a temporary. Just give it a name, and all your problems go away.
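As a rough sketch of the "give it a name" advice, the variant below rebinds a holder to a named object instead of a temporary. It is not the asker's Polymorphic template: Holder, name() and rebind_to_named_object are invented for the example, the member is a const lvalue reference so that it can bind to named objects, and C++17 is assumed for std::launder.

```cpp
#include <cassert>
#include <new>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Hypothetical holder with a reference member (trivially destructible).
struct Holder {
    const Base& obj;
};

std::string rebind_to_named_object() {
    Base b;
    Derived d;                            // named, so it outlives every use of h below
    Holder h{b};
    Holder* fresh = new (&h) Holder{d};   // no temporary involved, nothing dangles
    assert(fresh == std::launder(&h));    // both designate the new Holder
    return std::launder(&h)->obj.name();
}
```

Because d is a named local declared before h, the reference stored by the placement new stays valid for as long as h is used.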
Look at the bullet point immediately above the one you quoted in the link you provided:
a temporary bound to a reference in the initializer used in a new-expression exists until the end of the full expression containing that new-expression, not as long as the initialized object. If the initialized object outlives the full expression, its reference member becomes a dangling reference.
In your crashing example, you are using a new expression, so the lifetime is not extended.
If memory is set aside for an object (e.g., through a union) but the constructor has not yet been called, is it legal to call one of the object's non-static methods, assuming the method does not depend on the value of any member variables?
I researched a bit and found some information about "variant members" but I couldn't find info pertaining to this example.
#include <cstdio>
class D {
public:
    D() { printf("D constructor!\n"); }
    int a = 123;
    void print() const {
        // %p expects a pointer to void
        printf("Pointer: %p\n", static_cast<const void*>(&a));
    }
};
class C {
public:
C() {};
union {
D memory;
};
};
int main() {
C c;
c.memory.print();
}
In this example, I'm calling print() without the constructor ever being called. The intent is to later call the constructor, but even before the constructor is called, we know where variable a will reside. Obviously the value of a is uninitialized at this point, but print() doesn't care about the value.
This seems to work as expected when compiling with gcc and clang for C++11. But I'm wondering if I'm invoking some illegal or undefined behavior here.
I believe this is undefined behavior. Your variant member C::memory has not been initialized because the constructor of C does not provide an initializer [class.base.init]/9.2. Therefore, the lifetime of c.memory has not begun at the point where you call the method print() [basic.life]/1. Based on [basic.life]/7.2:
Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. […] The program has undefined behavior if:
[…]
the glvalue is used to call a non-static member function of the object, or
[…]
emphasis mine
Note: I am referring to the current C++ standard draft above, however, the relevant wording is basically the same for C++11 except that, in C++11, the fact that D has non-trivial initialization is crucial as what you're doing may otherwise potentially be OK in C++11…
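If the goal is simply to reserve storage and use the object later, one way to stay within the rules quoted above is to start the member's lifetime with placement new before calling anything on it. The sketch below is only an illustration: the active flag and the activate, deactivate and get members are my additions, not part of the original code.

```cpp
#include <cassert>
#include <new>

struct D {
    int a = 123;
    int value() const { return a; }
};

class C {
public:
    C() {}                      // leaves the variant member without an active object
    union { D memory; };
    bool active = false;        // hypothetical bookkeeping flag

    void activate() { new (&memory) D(); active = true; }   // lifetime of memory starts here
    void deactivate() { memory.~D(); active = false; }      // and ends here
    D& get() { assert(active); return memory; }             // only hand out a live object
};

// Hypothetical helper: member functions are called only between activate and deactivate.
int use_after_activation() {
    C c;
    c.activate();
    int v = c.get().value();
    c.deactivate();
    return v;
}
```

With this arrangement every non-static member function call happens on an object whose lifetime has begun, so [basic.life]/7.2 is no longer a concern.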
Suppose that, during the execution of a constructor of a class S, it turns out that S could be constructed using another constructor. One solution could be to perform a placement new on this to reuse the storage:
struct S{
unsigned int j; // no const or reference non-static data members
S(unsigned int i){/*...*/}
S(int i){
if (i>=0) {
new (this) S(static_cast<unsigned int>(i));
return;}
/*...*/
}
};
int i=10;
S x{i};//is it UB?
Storage reuse is defined in [basic.life]. I don't know how to read this section when the storage is (re)used during constructor execution.
The standard is completely underspecified in this case, and I cannot find a relevant CWG issue.
In itself, your placement new is not UB. After all, you have storage without an object, so you can directly construct an object in it. As you correctly said, the lifetime of the first object hasn't started yet.
But now the problem is: What happens to the original object? Because normally, a constructor is only called on storage without an object and the end of constructor marks the start of the lifetime of the object. But now there is already another object. Is the new object destroyed? Does it have no effect?
The standard is missing a paragraph in [class.cdtor] that says what should happen if a new object is created in the storage of an object under construction or destruction.
You can construct even weirder code:
struct X {
X *object;
int var;
X() : object(new (this) X(4)), var(5) {} // ?!?
X(int x) : var(x) {}
} x;
is it UB?
No it is not. [basic.life]/5 says:
A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type with a non-trivial destructor. For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
Emphasis on the part relevant to your class which has a trivial destructor. About the specific new (this) T; form, I found no exception to this rule in [class.cdtor] nor [class.dtor].
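For the uncontroversial part of that rule, outside of any constructor, a minimal sketch could look like this. Trivial and reuse_storage are made-up names, and the example deliberately avoids the new (this) case that the other answer considers underspecified.

```cpp
#include <cassert>
#include <new>

struct Trivial {
    int v;   // trivially destructible, so no destructor call is needed before reuse
};

// Hypothetical helper: creates an object, then reuses its storage for a second one.
int reuse_storage() {
    alignas(Trivial) unsigned char storage[sizeof(Trivial)];
    Trivial* first = new (storage) Trivial{1};
    int before = first->v;
    Trivial* second = new (storage) Trivial{2};   // ends the first object's lifetime by reuse
    assert(static_cast<void*>(second) == static_cast<void*>(storage));
    return before * 10 + second->v;
}
```

Here reuse_storage() returns 12: the value 1 is read while the first object is alive, and 2 is read from the object that replaced it.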
Generally, this discussion is limited to local variables and function parameters:
void foo (const int &i)
{
// use i till foo() ends
}
foo(3);
But does this rule also apply to class members?
struct A {
const int &a;
A () : a(3) {} // version 1
A (const int &i) : a(i) {} // version 2
};
Now A used as,
{
return ()? new A : new A(3) : new A(some_local_variable);
}
Will the contents of a remain the same throughout the lifetime of all 3 newly allocated A objects?
The C++03 standard (Section "12.2/5 Temporary objects") answers your question aptly:
The temporary to which the reference is bound or the temporary that is the complete object to a subobject of which the temporary is bound persists for the lifetime of the reference except as specified below. A temporary bound to a reference member in a constructor’s ctor-initializer (12.6.2) persists until the constructor exits. A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full expression containing the call.
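The last sentence of that quote can be observed without ever touching a dangling reference. In the sketch below, Tracker is a hypothetical helper type that records in an external flag whether it is still alive; A mirrors "version 2" from the question.

```cpp
#include <cassert>

// Hypothetical helper: flips *alive to false when it is destroyed.
struct Tracker {
    bool* alive;
    explicit Tracker(bool* flag) : alive(flag) { *alive = true; }
    ~Tracker() { *alive = false; }
};

struct A {
    const Tracker& t;
    // The temporary bound to i is still alive while the constructor runs.
    explicit A(const Tracker& i) : t(i) { assert(*t.alive); }
};

bool temporary_survives_new_expression() {
    bool alive = false;
    A* p = new A(Tracker(&alive));   // temporary dies at the end of this full expression
    bool result = alive;             // inspect the flag only, never p->t
    delete p;
    return result;
}
```

The function returns false: the Tracker temporary is destroyed at the end of the full expression containing the call, so p->t dangles from then on even though the A object itself lives until delete.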
If you allocate an object using new, it will remain in memory forever - until you delete it. It's not a temporary object.
a is a member of A, and as such part of the allocation.
EDIT: Thanks for the comments. I would say - no, this is not correct. Consider this:
struct A {
const int &a;
A () : a(3) {} // version 1
A (const int &i) : a(i) {} // version 2
};
void foo() {
A *pA;
{
int x;
pA = new A(x);
}
// Now pA->a is pointing to the address where `x` used to be,
// but the compiler may very well put something else in this place now
// because x is out of scope.
}
The answer is more obvious if the lifetime of the A object spans across several functions.
Side note: I find the word "contents" a bit ambiguous here. Equate the reference with a pointer, so your a is basically pointing to an integer. const or not, if the integer no longer exists (because it was on the stack and has been removed), your a - while still pointing to the same address in memory - is referencing something else now. The GotW article appears to be talking about the compiler prolonging the lifetime of the object being pointed to by the reference. The reference itself, again, is just a pointer.
Given the following code:
class foo
{
};
class bar: public foo
{
public:
~bar() { printf("~bar()\n"); }
};
class zab: public foo
{
public:
~zab() { printf("~zab()\n"); }
};
struct foo_holder
{
const foo &f;
};
int main()
{
foo_holder holder[]= { {bar()}, {zab()} };
printf("done!\n");
return 0;
}
the output is:
~bar()
~zab()
done!
C++0x has a clause that dictates this can create dangling references when used as a new initializer, but it says nothing (at least nothing I can find) about aggregate initialization of const references with temporaries.
Is this unspecified behavior then?
It isn't mentioned in the list of exceptions, so the lifetime of the temporary should be extended to match the lifetime of the (array of) foo_holders. However, this looks like an oversight to me; submitting a Defect Report might be a good idea.
§12.2/5 states that when a reference is bound to a temporary, the lifetime of the temporary is extended to match the lifetime of the reference. Because const foo& f is a member of foo_holder, the lifetime of the reference matches the lifetime of foo_holder, according to §3.7.5/1:
The storage duration of member subobjects, base class subobjects and array elements is that of their complete object (1.8).
This might be a little tricky to interpret where references are concerned, because §3.8/1 states that the lifetime of an object ends when its storage is released or reused:
The lifetime of an object of type T ends when:
— if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
— the storage which the object occupies is reused or released.
however, it is left unspecified whether references use storage or not; §8.3.2/4 says
It is unspecified whether or not a reference requires storage (3.7).
Perhaps someone with better knowledge of standard would know this better.
I got an answer on comp.std.c++:
http://groups.google.com/group/comp.std.c++/msg/9e779c0154d2f21b
Basically, the standard does not explicitly address it; therefore, it should behave the same as a reference declared locally.
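A way to check that reading on a given compiler is sketched below. Tracker and foo_holder_like are hypothetical stand-ins for the types in the question; the output shown in the question suggests the compiler used there did not extend the lifetime, while on a conforming compiler I would expect the temporary to live as long as the holder.

```cpp
#include <cassert>

// Hypothetical helper: flips *alive to false when it is destroyed.
struct Tracker {
    bool* alive;
    explicit Tracker(bool* flag) : alive(flag) { *alive = true; }
    ~Tracker() { *alive = false; }
};

// Aggregate with a const reference member, like foo_holder in the question.
struct foo_holder_like {
    const Tracker& f;
};

bool temporary_lives_as_long_as_holder() {
    bool alive = false;
    bool alive_while_in_scope = false;
    {
        foo_holder_like holder{Tracker(&alive)};   // brace (aggregate) initialization
        (void)holder;
        alive_while_in_scope = alive;              // sampled after the declaration's full expression
    }
    assert(!alive);   // once the holder's scope ends, the temporary is gone either way
    return alive_while_in_scope;
}
```

If the function returns true, the temporary behaved like one bound to a locally declared reference, which matches the comp.std.c++ answer.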