C++ lifetime of union member

C++ lifetime of union member - c++

In the current version of the C++ standard draft, [basic.life]/1 states:
The lifetime of an object or reference is a runtime property of the
object or reference. A variable is said to have vacuous initialization
if it is default-initialized and, if it is of class type or a
(possibly multi-dimensional) array thereof, that class type has a
trivial default constructor. The lifetime of an object of type T
begins when:
storage with the proper alignment and size for type T is obtained, and
its initialization (if any) is complete (including vacuous
initialization) ([dcl.init]),
except that if the object is a union
member or subobject thereof, its lifetime only begins if that union
member is the initialized member in the union ([dcl.init.aggr],
[class.base.init]), or as described in [class.union]. [...]
From that paragraph I understand that the only way a member of a union begins its lifetime is if:
that member "is the initialized member in the union" (e.g. if it is referenced in a mem-initializer), or
some other way mentioned in [class.union]
However, the only normative paragraph in [class.union] that specifies how a union member can begin its lifetime is [class.union]/5 (but it only applies to specific types, i.e. either non-class, non-array, or class type with a trivial constructor that is not deleted, or array of such types).
The next paragraph, [class.union]/6 (comprising a note and an example, therefore it contains no normative text), describes a way to change the active member of a union, by using a placement new-expression, such as new (&u.n) N;, where
struct N { N() { /* non-trivial constructor */ } };
struct M { M() { /* non-trivial constructor */ } };
union
{
N n;
M m;
} u;
My question is where in the standard is it specified that new (&u.n) N; begins the lifetime of u.n?
Thank you!

An important rule regarding this is:
[class.union]/1
In a union, a non-static data member is active if its name refers to an object whose lifetime has begun and has not ended ([basic.life]). ...
As far as this rule is considered, the active member could change at any time a member object begins its lifetime. The rule [class.union]/5 further allows changing the active member also by assigning to a non-active member of a limited set of types. The lack of a separate rule for placement new by itself doesn't disallow changing the member. If it begins the lifetime of the member, then the member is the active member of the union.
So, [basic.life/1] says that the lifetime of the member begins only if [class.union] says so1, and [class.union/1] says that the member is active only if its lifetime has begun2. This does seem like a bit of a catch-22.
My best attempt at reading the rules in a way that makes sense is to interpret that placement-new begins the lifetime of the member, therefore [class.union/1] applies, and therefore "or as described in [class.union]" applies and therefore the highlighted exception doesn't apply. Next I would like to say therefore the lifetime begins, but that logic is circular.
The non-normative [class.union]/6 makes it quite clear that the placement new is intended to be allowed, but the normative rules are tangled. I would say that the wording could be improved.
1 (or when the union is initialised with that member, which isn't the case we are considering)
2 (or after assignment as per [class.union]/5, which isn't the case we are considering)

My question is where in the standard is it specified that new (&u.n) N; begins the lifetime of u.n?
Nowhere. Placement new creates a new object which becomes union member subobject per [intro.object]/2:
If an object is created in storage associated with a member subobject or array element e (which may or may not be within its lifetime), the created object is a subobject of e's containing object if:
— the lifetime of e's containing object has begun and not ended, and
— the storage for the new object exactly overlays the storage location associated
with e, and
— the new object is of the same type as e (ignoring cv-qualification).

C++ can't define unions at that point, because:
lvalues by definition must refer to an object
objects have a lifetime; it isn't clear what's a pre-lifetime object
in theory two unrelated objects can't be at the same address, but pre-lifetime objects can, so there is no such thing as pre-lifetime object
So mutable unions can't be well defined in C++, end of story.

Related

What exactly constitutes the "name" of an object for the purposes of object replacement?

According to [basic.life]/8,
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or
released, a new object is created at the storage location which the original object occupied, a pointer that
pointed to the original object, a reference that referred to the original object, or the name of the original
object will automatically refer to the new object and, once the lifetime of the new object has started, can be
used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static
data member whose type is const-qualified or a reference type, and
the original object was a most derived object (4.5) of type T and the new object is a most derived
object of type T (that is, they are not base class subobjects).
... [ Note: If these conditions are not met, a pointer to the new object can be obtained from a
pointer that represents the address of its storage by calling std::launder (21.6). — end note ]
The standard contains an example that demonstrates that when there is a const subobject, "the name of the original object" fails to refer to the new object, and use of that name causes UB. It is located in [intro.object]/2:
Objects can contain other objects, called subobjects. A subobject can be a member subobject (12.2), a base
class subobject (Clause 13), or an array element. An object that is not a subobject of any other object is
called a complete object. If an object is created in storage associated with a member subobject or array
element e (which may or may not be within its lifetime), the created object is a subobject of e’s containing
object if:
the lifetime of e’s containing object has begun and not ended, and
the storage for the new object exactly overlays the storage location associated with e, and
the new object is of the same type as e (ignoring cv-qualification).
[ Note: If the subobject contains a reference member or a const subobject, the name of the original subobject
cannot be used to access the new object (6.8). — end note ] [ Example:
struct X { const int n; };
union U { X x; float f; };
void tong() {
U u = {{ 1 }};
u.f = 5.f; // OK, creates new subobject of u (12.3)
X *p = new (&u.x) X {2}; // OK, creates new subobject of u
assert(p->n == 2); // OK
assert(*std::launder(&u.x.n) == 2); // OK
assert(u.x.n == 2); // undefined behavior, u.x does not name new subobject
}
However, it would seem to me that the fact that [basic.life]/8 doesn't give the lvalue-to-rvalue conversion on u.x.n defined behaviour is irrelevant, because it is given defined behaviour by [expr.ref]/4.2, which has the following to say about the class member access expression E1.E2:
If E2 is a non-static data member and the type of E1 is “cq1 vq1 X”, and the type of E2 is “cq2 vq2 T”,
the expression designates the named member of the object designated by the first expression. ...
My reading of this is that the expression u.x yields an lvalue referring to the current x subobject of whatever object u currently refers to. Since, according to [intro.object]/2, the creation of the new X object in the place of u.x causes the new X object to actually be a subobject of u, performing an lvalue-to-rvalue conversion on u.x.n should be well-defined.
If we assume that the UB in this example reflects the intent of the standard, it appears that we must read [basic.life]/8 as saying that notwithstanding the fact that certain expressions may appear to access the new object (in this case, due to [expr.ref]/4.2), they are nonetheless UB if they attempt the access using the original object's "name". (Or, in practical terms, that the compiler may assume that the "name" continues to refer to the original object, and thus not re-read the const member's value.)
Normally, though, I would not think that u.x counts as "naming" the X subobject of u, because I think that subobjects do not have names. Thus, [basic.life]/8 appears to be saying that UB occurs in some particular situations but without precisely explaining what those situations are.
Thus, my questions are:
Am I correct in saying that [basic.life]/8 causes this example to contain UB rather than simply failing to give it defined behaviour?
Is there a precise specification of which cases are given UB by [basic.life]/8?
Should the standard be reworded to be more clear on when [basic.life]/8 causes UB (i.e., when std::launder is needed)?

My reading of this is that the expression u.x yields an lvalue referring to the current x subobject of whatever object u currently refers to.
And that's true. What you misunderstand is what "the current x subobject" means.
When you do new (&u.x) X {2}, that creates a new object at the address of u.x. However, nowhere in the standard does it say that this object is named x or u.x. Yes, it is a subobject of u, but it has no name. Nowhere does the standard say that the newly created subobject has that name, or any name for that matter.
Indeed, if what you said was true, [basic.life]/8 and launder wouldn't need to exist at all, since an object in overlaid storage of another object would always be accessible through the name of the old object.
Normally, though, I would not think that u.x counts as "naming" the X subobject of u, because I think that subobjects do not have names.
I don't know how you came to that conclusion, since you quoted a part of the specification that clearly states that member subobjects can have names:
the expression designates the named member of the object
Emphasis added. That clearly suggests to me that the member subobject has a name, and the expression u.x designates the name of that particular member subobject (not merely any member subobject in that address of the appropriate type).
That is, just as u refers specifically to the object which was declared by the declaration of u, u.x refers specifically to the x-named member of the object declared by the declaration u.

Does copying an empty object involve accessing it

Inspired from this question.
struct E {};
E e;
E f(e); // Accesses e?
To access is to
read or modify the value of an object
The empty class has an implicitly defined copy constructor
The implicitly-defined copy/move constructor for a non-union class X performs a memberwise copy/move of its bases and members. [...] The order of initialization is the same as the order of initialization of bases and members in a user-defined constructor. Let x be either the parameter of the constructor or, for the move constructor, an xvalue referring to the parameter. Each base or non-static data member is copied/moved in the manner appropriate to its type:
[...] the base or member is direct-initialized with the corresponding base or member of x.

I think that the part of the standard that describes the most precisely what performs an access is [basic.life]. In this paragraph it is explained what can be done with a reference that refers to, or a pointer that point to, an object which is out of its lifetime period. Everything that is authorized to do with such entities does not perform an access to the object value since such value does not exist (otherwise the standard would be inconsistent).
So we can take a more drastic example, if this is not undefined behavior, so there are no access to e in your example code (accordingly to the reasonning above):
struct E{
E()=default;
E(const E&){}
};
E e;
e.~E();
E f(e);
Here e is an object whose lifetime has ended but whose storage is still allocated. What can be done with such a lvalue is described in [basic.life]/6
Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. For an object under construction or destruction, see [class.cdtor]. Otherwise, such a glvalue refers to allocated storage ([basic.stc.dynamic.deallocation]), and using the properties of the glvalue that do not depend on its value is well-defined. The program has undefined behavior if:
an lvalue-to-rvalue conversion ([conv.lval]) is applied to such a glvalue,
the glvalue is used to access a non-static data member or call a non-static member function of the object, or
the glvalue is implicitly converted ([conv.ptr]) to a reference to a base class type, or
the glvalue is used as the operand of a static_cast ([expr.static.cast]) except when the conversion is ultimately to cv char& or cv unsigned char&, or
the glvalue is used as the operand of a dynamic_cast ([expr.dynamic.cast]) or as the operand of typeid.
None of the cited point above does happen inside E copy constructor so the example code in this answer is well defined, which implies that there have been no access to the value of the destroyed object. So there is no access to e in your example code.

I think it does not access the object, though a valid object is required to be present.
E f(e);
This calls E's implicitly defined constructor E::E(const E&). Obviously the body of this constructor is empty (because there is nothing to do). So if anything happens, it must happen during argument passing, i.e. during the initialization of const E& from e.
It is self-evident that this initialization does not modify e. Now, to read the value of e, a lvalue-to-rvalue conversion must take place. However, the standard actually says that this conversion does not take place during direct reference binding1. That is to say, no read is performed.
However, the standard does require that a reference must be initialized to refer to a valid object or function2 (though this is subject to CWG 453), so things like E f(*reinterpret_cast<E*>(nullptr)); will be ill-formed.
1. This is done by not normatively requiring such conversion, and further strengthened by the non-normative note in [dcl.init.ref].
2. [dcl.ref].

How are contents of an object defined when using placement new on existing object

Looking at the following example. Does the C++ standard guarantee that the value of object.x will be equal to 1 at the end? What if I don't call the destructor object.~Class();?
#include <new>
class Class
{
public:
Class() {}
~Class() {}
int x;
};
int main()
{
Class object;
object.x = 1;
object.~Class();
new (&object) Class();
object.x == ?
Class object_2;
object_2.x = 1;
new (&object_2) Class();
object_2.x == ?
return 0;
}

No.
The object whose x is equal to 1 was destroyed.
You know that, because you're the one who destroyed it.
Your new x is uninitialised and has an unspecified value, which may be 1 in practice due to memory re-use. That's no different than for any other uninitialised value.
Update
Since there seems to be a lot of people throwing about assertions, here are some facts, direct from the standard.
There is no direct statement about this case because it follows from the general rules governing what objects are and what object lifetime means.
Generally, once you've grokked that C++ is an abstraction rather than a direct mapping of bytes, you can understand what's going on here, and why there is no such guarantee as that the OP seeks.
First, some background on object lifetime and destruction:
[C++14: 12.4/5]: A destructor is trivial if it is not user-provided and if:
the destructor is not virtual,
all of the direct base classes of its class have trivial destructors, and
for all of the non-static data members of its class that are of class type (or array thereof), each such class has a trivial destructor.
Otherwise, the destructor is non-trivial.
[C++14: 3.8]: [..] The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
the storage which the object occupies is reused or released.
[C++14: 3.8/3]: The properties ascribed to objects throughout this International Standard apply for a given object only during its lifetime. [..]
[C++14: 3.8/4]: A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type with a non-trivial destructor. For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
(Most of this passage is irrelevant for your first case, as it does have an explicit destructor call: your second violates the rules in this paragraph and thus has undefined behaviour; it shall not be discussed further.)
[C++14: 12.4/2]: A destructor is used to destroy objects of its class type. [..]
[C++14: 12.4/11]: [..] Destructors can also be invoked explicitly.
[C++14: 12.4/15]: Once a destructor is invoked for an object, the object no longer exists. [..]
Now, some specifics. What if we were to inspect object.x before the placement-new?
[C++14: 12.7/1]: [..] For an object with a non-trivial destructor, referring to any non-static member or base class of the object after the destructor finishes execution results in undefined behavior.
Wow, okay.
And what if we were to inspect it after the placement-new? i.e. what value does the x in the new object hold? Is it guaranteed, as the question asks, that it'll be 1? Bear in mind that Class's constructor includes no initialiser for x:
[C++14: 5.3.4/17]: A new-expression that creates an object of type T initializes that object as follows:
If the new-initializer is omitted, the object is default-initialized (8.5); if no initialization is performed, the object has indeterminate value.
Otherwise, the new-initializer is interpreted according to the initialization rules of 8.5 for direct-initialization.
[C++14: 8.5/16]: The initialization that occurs in the forms
T x(a);
T x{a};
as well as in new expressions (5.3.4), static_cast expressions (5.2.9), functional notation type conversions (5.2.3), and base and member initializers (12.6.2) is called direct-initialization.
[C++14: 8.5/17]: The semantics of initializers are as follows. The destination type is the type of the object or reference being initialized and the source type is the type of the initializer expression. If the initializer is not a single (possibly parenthesized) expression, the source type is not defined. [..]
If the initializer is (), the object is value-initialized.
[..]
[C++14: 8.5/8]: To value-initialize an object of type T means:
if T is a (possibly cv-qualified) class type (Clause 9) with either no default constructor (12.1) or a default constructor that is user-provided or deleted, then the object is default-initialized.
[..]
[C++14: 8.5/7]: To default-initialize an object of type T means:
if T is a (possibly cv-qualified) class type (Clause 9), the default constructor (12.1) for T is called (and the initialization is ill-formed if T has no default constructor or overload resolution (13.3) results in an
ambiguity or in a function that is deleted or inaccessible from the context of the initialization);
if T is an array type, each element is default-initialized;
— otherwise, no initialization is performed.
If a program calls for the default initialization of an object of a const-qualified type T, T shall be a class type with a user-provided default constructor.
[C++14: 12.6.2/8]: In a non-delegating constructor, if a given non-static data member or base class is not designated by a mem-initializer-id (including the case where there is no mem-initializer-list because the constructor has no
ctor-initializer) and the entity is not a virtual base class of an abstract class (10.4), then:
if the entity is a non-static data member that has a brace-or-equal-initializer, the entity is initialized as specified in 8.5;
otherwise, if the entity is an anonymous union or a variant member (9.5), no initialization is performed;
otherwise, the entity is default-initialized (8.5).
[C++14: 8.5/12]: If no initializer is specified for an object, the object is default-initialized; if no initialization is performed, an object with automatic or dynamic storage duration has indeterminate value.
So, does the standard guarantee that your replacement object's x has value 1? No. It does not.
In practice, why might it not be? Well, any number of reasons. Class's destructor is non-trivial so, per 3.8, the first object's lifetime has ended immediately after you called its destructor.
Technically, the compiler is then free to place an object at that location as long as it's destroyed by the time the placement-new takes effect. There's no reason for it to do so here, but there's nothing prohibiting it; more importantly, a simple { int x = 5; x = 42; } in between the destructor call and the placement-new would be more than entitled to re-use that memory. It's not being used to represent any object at that point!
More realistically, there are implementations (e.g. Microsoft Visual Studio) that, for debug-mode programs, write a recognisable bit-pattern into unused stack memory to aid in diagnosing program faults. There's no reason to think that such an implementation wouldn't hook into destructors in order to do that, and such an implementation would be overwriting your 1 value. There is nothing in the standard prohibiting this whatsoever.
Indeed, if I replace the ? lines in your code with std::cout statements, so that we actually inspect the values, I get a warning about an uninitialised variable and a value 0 in the case where you used a destructor:
main.cpp: In function 'int main()':
main.cpp:19:23: warning: 'object.Class::x' is used uninitialized in this function [-Wuninitialized]
std::cout << object.x << '\n';
^
0
1
Live demo
I'm unsure how much more proof you need that the standard does not guarantee a 1 value here.

Class that holds a reference to itself

Skimming through the standard draft (n3242) I found this sentence in Clause 9.2 (emphasis mine):
Non-static (9.4) data members shall not have incomplete types. In
particular, a class C shall not contain a non-static member of class
C, but it can contain a pointer or reference to an object of class
C.
From this I argue that is fine to define a class like this:
class A {
public:
A(A& a) : a_(a){
}
private:
A& a_;
};
Then in clause 8.3.2 I found the following:
A reference shall be initialized to refer to a valid object or
function
Question 1: Is it permitted to define an object of this type passing its name as a reference:
A a(a);
or will this trigger undefined behavior?
Question 2: If yes, what are the parts of the standard that permit the initialization of the reference from a still-to-be-constructed object?
Question 3: If no, does this mean the definition of class A is well formed but no first object can be created without triggering UB? In this case what is the rationale behind this?

"valid object" is not defined anywhere in the standard, but it is intented to mean a region of memory with appropriate size and alignment that can contain an object of the specified type. It just means to exclude references to such things as dereferenced null pointers, misaligned regions of memory, etc. An uninitialised object is valid.
There is an open issue to clear up the wording, CWG 453.

n3337 § 3.8/6
Similarly, before the lifetime of an object has started but after the
storage which the object will occupy has been allocated or, after the
lifetime of an object has ended and before the storage which the
object occupied is reused or released, any glvalue that refers to the
original object may be used but only in limited ways. For an object
under construction or destruction, see 12.7. Otherwise, such a glvalue
refers to allocated storage (3.7.4.2), and using the properties of the
glvalue that do not depend on its value is well-defined. The program
has undefined behavior if:
— an lvalue-to-rvalue conversion (4.1) is
applied to such a glvalue,
— the glvalue is used to access a
non-static data member or call a non-static member function of the
object, or
— the glvalue is implicitly converted (4.10) to a reference
to a base class type, or
— the glvalue is used as the operand of a
static_cast (5.2.9) except when the conversion is ultimately to cv
char& or cv unsigned char&, or
— the glvalue is used as the operand of
a dynamic_cast (5.2.7) or as the operand of typeid.
So, to answer your questions:
Question 1: Is it permitted to define an object of this type passing
its name as a reference?
Yes. Using just the address seems not to violate this (at least for a variable put on stack).
A a(a);
or will this trigger undefined behavior?
No.
Question 2: If yes, what are the parts of the standard that permit the
initialization of the reference from a still-to-be-constructed object?
§ 3.8/6 (above)
The only question that remains is how this correspond to
A reference shall be initialized to refer to a valid object or
function.
The problem is in term valid object. Because § 8.3.2/4 says that
It is unspecified whether or not a reference requires storage
it seems that § 8.3.2 is problematic and should be reworded. The confusion lead to change proposed in document C++ Standard Core Language Active Issues, Revision 87 dated on 20.01.2014:
A reference shall be initialized to refer to an object or function.
Change 8.3.2 [dcl.ref] paragraph 4 as follows:
If an lvalue to which a reference is directly bound designates neither
an existing object or function of an appropriate type (8.5.3
[dcl.init.ref]), nor a region of memory of suitable size and alignment
to contain an object of the reference's type (1.8 [intro.object], 3.8
[basic.life], 3.9 [basic.types]), the behavior is undefined.

From n1905, 3.3.1.1
The point of declaration for a name is immediately after its complete
declarator (clause 8 ) and before its initializer (if any), except as
noted below.
[ Example:
int x = 12;
{ int x = x; }
Here the second x
is initialized with its own (indeterminate) value.
—end example ]
My emphasis ( correct me if I am wrong ): In your example -
A a(a);
is equivalent to -
A a = a; // Copy initialization
So, according to standard a is initialized with it's own indeterminate value. And the member is holding reference to one such indeterminate value.

Are there any unintuitive side-effects of member subobjects inheriting storage duration?

I didn't know this before, but it turns out that:
[C++11: 3.7.5]: The storage duration of member subobjects, base class subobjects and array elements is that of their complete object (1.8).
That means that x->a in the example below has dynamic storage duration.
I'm wondering whether there are any elsewhere-defined semantics that make reference to storage duration that imbue member a with different behaviour between object *x and y? An example would be the rules governing object lifetime.
struct T
{
int a;
};
int main()
{
std::unique_ptr<T> x(new T);
T y;
}
And how about if T were non-POD (and other kinds of UDTs)?
In short, my lizard brain expects any declaration looking like int a; to have automatic (or static) storage duration, and I wonder whether any standard wording accidentally expects this too.
Update:
Here's an example:
[C++11: 3.7.4.3/4]: [..] Alternatively, an implementation may have strict pointer safety, in which case a pointer value that is not a safely-derived pointer value is an invalid pointer value unless the referenced complete object is of dynamic storage duration [..]
On the surface of it, I wouldn't expect the semantics to differ between my x->a and my y.a, but it's clear that there are areas, that are not obviously related to object lifetime, where they do.
I'm also concerned about lambda capture rules, which explicitly state "with automatic storage duration" in a number of places, e.g.:
[C++11: 5.1.2/11]: If a lambda-expression has an associated capture-default and its compound-statement odr-uses (3.2) this or a variable with automatic storage duration [..]
[C++11: 5.1.2/18]: Every occurrence of decltype((x)) where x is a possibly parenthesized id-expression that names an entity of automatic storage duration is treated as if x were transformed into an access to a corresponding data member of the closure type that would have been declared if x were an odr-use of the denoted entity.
and others.

No. This storage duration inheritance is what makes subobjects work. Doing anything else would simply be quite impossible. Else, you could not design any type that could be allocated both statically and dynamically.
Simply put, any violation of this rule would simply break everything.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js