Lifetime of object which has vacuous initialization - c++

Current draft standard says (previous standards have similar wording) in [basic.life/1]:
The lifetime of an object or reference is a runtime property of the
object or reference. An object is said to have non-vacuous
initialization if it is of a class or aggregate type and it or one of
its subobjects is initialized by a constructor other than a trivial
default constructor. [ Note: Initialization by a trivial copy/move
constructor is non-vacuous initialization. — end note ] The lifetime
of an object of type T begins when:
(1.1) storage with the proper alignment and size for type T is obtained, and
(1.2) if the object has non-vacuous initialization, its initialization is complete,
See this code:
alignas(int) char obj[sizeof(int)];
Does basic.life/1 mean that here an int (and several other types whose alignment and size requirements are no stricter than those of int) has begun its lifetime?
What does this even mean? If an object has begun its lifetime, is it created? [intro.object/1] says:
[...] An object is created by a definition ([basic.def]), by a new-expression, when implicitly changing the active member of a union ([class.union]), or when a temporary object is created ([conv.rval], [class.temporary]) [...]
So, according to this, my obj (as an int) is not created. But its lifetime as an int (and as an object of any of the other, possibly infinitely many, vacuously-initializable types) has started.
I'm confused, can you give a clarification on this?

You cannot begin the lifetime of an object unless the object has been created. And [intro.object]/1 defines the only ways in which objects can be created:
An object is created by a definition (6.1), by a new-expression (8.3.4), when implicitly changing the active member of a union (12.3), or when a temporary object is created (7.4, 15.2).
The object created by this definition is of type char[]. Therefore, that is the only object whose lifetime begins. And no other objects are created by this construct.
Lending credence to this interpretation, the C++20 proposal P0593 exists primarily to allow that very declaration to implicitly create other such objects.
Comments:
The condition in (1.2) still bothers me. Why is it there?
It is there because it cannot say "the initialization is complete" for an object that doesn't undergo initialization.
Suppose that I have a new(obj) int afterwards. That clearly creates an int object. But before that, obj has obtained the necessary storage.
No, the declaration of obj obtained storage for an object of type char[]. What obtains storage for the int object being created is new(obj). Yes, the placement-new expression obtains storage for the object that it creates. Just like a declaration of a variable obtains storage for the object it creates.
Just because that storage happens to exist already doesn't mean it isn't being obtained.
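The point made in these comments can be sketched in code. This is only an illustration: the function name create_int_in_buffer is made up for the example, and the buffer uses unsigned char rather than the char of the question, on the assumption that this matches the standard's "provides storage" wording more directly.

```cpp
#include <cassert>
#include <new>

// Hypothetical helper: creates an int inside a raw buffer and reads it back.
int create_int_in_buffer() {
    // This definition creates an array of unsigned char and its elements.
    // No int object has been created at this point.
    alignas(int) unsigned char obj[sizeof(int)];

    // The new-expression is what creates the int object. It "obtains" the
    // storage for that int, even though the bytes already existed.
    int* p = ::new (static_cast<void*>(obj)) int(42);

    // The int lives at the start of the buffer.
    assert(static_cast<void*>(p) == static_cast<void*>(obj));
    return *p;
}
```

Reading through the pointer returned by the new-expression avoids any question about whether a pointer derived from obj may be used to reach the int.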

I interpret
The lifetime of an object of type T begins when...
to mean
Given that a program creates an object of T, the following describes when that object's lifetime is said to begin...
and not
If the following conditions are satisfied, then an object of type T exists, and its lifetime begins when...
That is, there's an implicit additional condition that the object is "created" in some way described in [intro.object]/1. But the paragraph [basic.life]/1 is not meant to imply by itself that any object exists; it only describes one of the properties of objects that do exist.
So for your declaration, the text describes the beginning of the lifetimes of one object of type char[sizeof(int)] and one or more objects of type char (even if the declaration is a statement in a block scope and there is no initialization), but since there is no object of type int implied to exist, we won't say anything about the lifetime of such an object.

Because the Standard deliberately refrains from requiring that all implementations be suitable for all purposes, it will often be necessary for quality implementations intended for various purposes to guarantee the behavior of code over which the Standard itself would impose no requirements.
If some type T supports implicit object creation and a program converts the address of some object to a T*, a high-quality implementation which is intended to support low-level programming concepts without requiring special syntax will behave as though such conversion creates an object of type T in cases where that would allow the program to have defined behavior, but would not implicitly create such objects in cases where doing so would not be necessary but would instead result in Undefined Behavior by destroying other objects.
Thus, if float and uint32_t are the same size and have the same alignment requirements, then given e.g.
alignas(uint32_t) char obj[sizeof(uint32_t)];
float *fp = (float*)obj;
*fp = 1.0f;
uint32_t *up = (uint32_t*)obj;
The initialization of fp would create a float because that would be needed to make the assignment to *fp work. If up will be used in a fashion that would require a uint32_t to exist there, the assignment to up could create one while destroying the float that was there. If up isn't used in such a fashion, but fp is used in a way that would require that the float still exist, that float would still exist. If both pointers are used in ways that would require that the respective objects still exist, even a quality compiler intended for low-level programming might be incapable of handling that possibility.
Note that implementations which are not particularly suitable for low-level programming may not support the semantics described here. The authors of the Standard allow compiler writers to support such semantics or not, based upon whether they are necessary for their compilers' intended purposes; unfortunately, there is not as yet any standard way to distinguish compilers that are suitable for such purposes from those that aren't.
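For code that has to work on implementations that do not offer those guarantees, a different technique is commonly used: copying the bytes with std::memcpy instead of casting pointers. The sketch below is not what this answer describes; it is a swapped-in alternative, and the names float_bits and float_bits_demo are made up for the example. It assumes float is a 32-bit IEEE 754 type, which the static_asserts check.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: returns the object representation of a float as a
// 32-bit integer without reading the float through a uint32_t lvalue.
std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    static_assert(std::numeric_limits<float>::is_iec559,
                  "this sketch assumes IEEE 754 floats");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // copies bytes; no aliasing involved
    return bits;
}

// Usage sketch: IEEE 754 binary32 encodes 1.0f as 0x3F800000.
void float_bits_demo() {
    assert(float_bits(0.0f) == 0u);
    assert(float_bits(1.0f) == 0x3F800000u);
}
```

Both objects keep their own declared type here, so the question of which object "exists" in the buffer never comes up.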

Related

Does std::optional<>::emplace() invalidate references to the inner value?

Consider the following fragment (assume that T is trivially constructible and trivially destructible):
std::optional<T> opt;
opt.emplace();
T& ref = opt.value();
opt.emplace();
// is ref guaranteed to be valid here?
From the definition of std::optional we know that the contained instance is guaranteed to be allocated inside the std::optional container, hence we know that the reference ref will always be referring to the same memory location. Are there circumstances where said reference will not retain validity after the pointed-to object is destroyed and then constructed again?
C++20 has the following rule, [basic.life]/8:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if the original object is transparently replaceable (see below) by the new object. An object o1 is transparently replaceable by an object o2 if:
the storage that o2 occupies exactly overlays the storage that o1 occupied, and
o1 and o2 are of the same type (ignoring the top-level cv-qualifiers), and
o1 is not a complete const object, and
neither o1 nor o2 is a potentially-overlapping subobject (6.7.2), and
either o1 and o2 are both complete objects, or o1 and o2 are direct subobjects of objects p1 and p2, respectively, and p1 is transparently replaceable by p2.
This suggests that as long as T is not const-qualified, destroying the T inside an std::optional<T> and then recreating it should result in a reference to the old object automatically referring to the new object. As pointed out in the comments section, this is a change from the old behaviour, abolishing a requirement that T must not contain a non-static data member of const-qualified or reference type. (Edit: I previously asserted that the change was made retroactively, as I confused it with a different change in C++20. I am not sure whether the resolution to RU 007 and US 042 as indicated in N4858 were made retroactive, but I suspect the answer is yes, because the change was needed to fix code involving standard library templates that was probably not intended to be broken from C++11 through C++17.)
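The rule can be seen in isolation, without std::optional, in a small sketch. Widget and id_after_replacement are hypothetical names. The example assumes the simplest case: a complete, non-const automatic object of a type with no const or reference members, so it satisfies both the older and the C++20 wording.

```cpp
#include <cassert>
#include <new>

// Hypothetical type with no const or reference members.
struct Widget {
    int id;
    explicit Widget(int i) : id(i) {}
};

// Hypothetical helper: destroys an object, recreates one of the same type in
// the same storage, and reads it through a reference taken beforehand.
int id_after_replacement() {
    Widget w(1);
    Widget* old_ptr = &w;
    Widget& old_ref = w;

    w.~Widget();                               // lifetime of the first Widget ends
    ::new (static_cast<void*>(&w)) Widget(2);  // new Widget, same type, same storage

    // The old pointer, the old reference and the name w now all refer to the
    // new object, because the old one was transparently replaceable.
    assert(old_ptr->id == old_ref.id);
    return old_ref.id;
}
```

A new Widget exists by the time w goes out of scope, so the implicit destructor call at the end of the function also has a live object to act on.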
However, we are making the assumption that the new T object is being created "before the storage which the [old] object occupied is reused or released". If I were writing an "adversarial" implementation of the standard library, I could set it up so that the emplace call reuses the underlying storage prior to creating the new T object. This would prevent the old T object from being transparently replaced by the new one.
How might an implementation "reuse" the storage? Typically, the underlying storage might be declared like this:
union {
char no_object;
T object;
};
When the default constructor of optional is called, no_object is initialized (the value does not matter)1. An emplace() call checks whether there is a T object or not (by checking a flag that is not shown here). If a T object is present, then object.~T() is called. Finally, something similar to construct_at(addressof(object)) is called in order to construct the new T object.
Not that any implementation would ever do this, but you could imagine an implementation that, in between the calls to object.~T() and construct_at(addressof(object)), re-initializes the no_object member. This would be a "reuse" of the storage that was previously occupied by object. This would imply that the requirements of [basic.life]/8 are not met.
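The storage scheme described above might look roughly like the following. This is a sketch under stated assumptions, not how any real standard library implements std::optional: tiny_optional, engaged and reset are made-up names, and placement new stands in for construct_at so that the code does not require C++20.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <utility>

// Hypothetical, heavily simplified optional-like wrapper.
template <typename T>
class tiny_optional {
    union {
        char no_object;  // active member while no T is present
        T object;        // active member while a T is present
    };
    bool engaged = false;  // the "flag that is not shown" in the text

public:
    tiny_optional() : no_object(0) {}
    ~tiny_optional() { reset(); }
    tiny_optional(const tiny_optional&) = delete;
    tiny_optional& operator=(const tiny_optional&) = delete;

    void reset() {
        if (engaged) {
            object.~T();
            engaged = false;
        }
    }

    template <typename... Args>
    T& emplace(Args&&... args) {
        reset();
        // An "adversarial" implementation could re-initialize no_object
        // here, which would reuse the storage before the new T is created.
        ::new (static_cast<void*>(std::addressof(object)))
            T(std::forward<Args>(args)...);
        engaged = true;
        return object;
    }

    T& value() {
        assert(engaged);
        return object;
    }
};
```

With this straightforward version, nothing touches the storage between the destructor call and the placement new, which is the situation [basic.life]/8 is written for.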
Of course, the practical answer to your question is that (1) there is no reason for an implementation to do something like this, and (2) even if an implementation did it, the developers would ensure that your code still behaves as if the T object was transparently replaced. Your code is reasonable under the assumption that the standard library implementation is reasonable, and compiler developers do not like to break code with that property, since doing so would needlessly aggravate their users.
But if a compiler developer were inclined to break your code (based on the argument that the more undefined behaviour there is, the more the compiler can optimize) then they could break your code even without changing the <optional> header file. The user is required to treat the standard library like a "black box" that only guarantees what the standard explicitly guarantees. So under a pedantic reading of the standard, it's unspecified whether or not attempting to access ref after the second emplace call has undefined behaviour. If it's unspecified whether it's UB, then the compiler is allowed to start treating it as UB whenever it wants.
1 The reason for this is historical; C++17 requires that a constexpr constructor initialize exactly one variant member of a union. This rule was abolished in C++20, so a C++20 implementation could omit the no_object member.

Is it really possible to separate storage allocation from object initialization?

From [basic.life/1]:
The lifetime of an object or reference is a runtime property of the object or reference. A variable is said to have vacuous initialization if it is default-initialized and, if it is of class type or a (possibly multi-dimensional) array thereof, that class type has a trivial default constructor. The lifetime of an object of type T begins when:
storage with the proper alignment and size for type T is obtained, and
its initialization (if any) is complete (including vacuous initialization) ([dcl.init]),
except that if the object is a union member or subobject thereof, its lifetime only begins if that union member is the initialized member in the union ([dcl.init.aggr], [class.base.init]), or as described in [class.union] and [class.copy.ctor], and except as described in [allocator.members].
From [dcl.init.general/1]:
If no initializer is specified for an object, the object is default-initialized.
From [basic.indet/1]:
When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced ([expr.ass]).
[Note 1: Objects with static or thread storage duration are zero-initialized, see [basic.start.static]. — end note]
Consider this C++ program:
int main() {
int i;
i = 3;
return 0;
}
Is initialization performed in the first statement int i; or second statement i = 3; of the function main according to the C++ standard?
I think it is the former, which performs vacuous initialization to an indeterminate value and therefore begins the lifetime of the object (the latter does not perform initialization, it performs assignment to the value 3). If that is so, is it really possible to separate storage allocation from object initialization?
If that is so, is it really possible to separate storage allocation from object initialization?
Yes:
void *ptr = malloc(sizeof(int));
ptr points to allocated storage, but no objects live in that storage (C++20 says that some objects may be there, but never mind that now). Objects won't exist unless we create some there:
new(ptr) int;
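The two steps can be made observable with a type that counts its live instances. This is a minimal sketch; Tracked and live_counts_per_stage are hypothetical names. A class with a non-trivial constructor and destructor is used on purpose, on the assumption that such a type is not eligible for the C++20 implicit object creation mentioned above.

```cpp
#include <array>
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    static int live;
    int value;
    explicit Tracked(int v) : value(v) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Hypothetical helper: records the live count after allocation only, after
// construction, and after destruction.
std::array<int, 3> live_counts_per_stage() {
    std::array<int, 3> counts{};

    void* ptr = std::malloc(sizeof(Tracked));  // storage only
    if (ptr == nullptr) throw std::bad_alloc();
    counts[0] = Tracked::live;

    Tracked* t = ::new (ptr) Tracked(3);       // initialization happens here
    assert(t->value == 3);
    counts[1] = Tracked::live;

    t->~Tracked();                             // object gone, storage remains
    counts[2] = Tracked::live;

    std::free(ptr);                            // storage released
    return counts;
}
```

The count only changes at the new-expression and at the explicit destructor call; malloc and free leave it untouched.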
You are getting confused between allocating storage for an object and initializing an object, and they are definitely not the same thing.
In your example, the object i is never initialized. As a local value, it has space reserved for its storage, but it is not initialized with any value.
The second line’s statement assigns a value of 3 to it. This again is not initialization.
Objects with global storage are required by the standard to be both allocated and initialized (to zero or whatever the default initializer does). All other objects are only initialized if the written language construct can support it.
C++ allocators work on this same distinction. The new operator, behind the scenes, both allocates and initializes objects, after which you may assign the object a new value. If you need to, though, you can use the underlying language constructs to manage the two things separately.
For most purposes you do not need to care about the difference between object initialization and assignment in your code. If you get to the point where it matters, you either already know how the concepts differ or need to learn really quickly.
Is initialization performed in the first statement int i; or second statement i = 3; of the function main according to the C++ standard?
The first. The second statement is assignment, not initialization. The second statement marks the point "until that value is replaced ([expr.ass])" from your quotes of the standard.
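With int there is nothing to observe, but the same two statements can be traced with a class type that logs which operation runs. This is only an analogy under a stated assumption: Probe is a hypothetical type with a non-trivial default constructor, so its default-initialization is not vacuous the way int i; is.

```cpp
#include <cassert>
#include <string>

// Hypothetical type that records which operations were applied to it.
struct Probe {
    std::string log;
    Probe() { log += "default-init;"; }
    Probe(int) { log += "init-from-int;"; }
    Probe& operator=(int) { log += "assign;"; return *this; }
};

// Mirrors "int i; i = 3;": initialization first, then a separate assignment.
std::string declare_then_assign() {
    Probe p;
    p = 3;
    return p.log;
}

// Mirrors "int i = 3;": a single initialization, no assignment.
std::string initialize_directly() {
    Probe p = 3;
    return p.log;
}

// Usage sketch: the two spellings leave different traces.
void check_traces() {
    assert(declare_then_assign() == "default-init;assign;");
    assert(initialize_directly() == "init-from-int;");
}
```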
If [initialization is the first statement] is so, is it really possible to separate storage allocation from object initialization?
Yes, but not in such a simple example as yours. A common example that comes to mind is a std::vector. Reserving capacity allocates storage space, but that storage is not initialized until an object is added to the vector.
std::vector<int> v; // Allocates and initializes the vector object.
v.reserve(1); // Ensures space has been allocated for an int object.
/*
At this point, the first contained element has space allocated, but has
not yet been initialized. If you want to do nutty things between allocation
and object initialization, this is the place to do it. Note that you are
not allowed to access the allocated space since it belongs to the vector.
You'd have to replicate the inner workings of a vector to do that...
*/
v.emplace_back(3); // Initializes the first contained object.
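The same sequence can be instrumented to count constructor calls. This is a sketch with hypothetical names (Counted, constructions_caused_by_reserve); the result reflects what mainstream standard library implementations are expected to do, not something the standard spells out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts every construction.
struct Counted {
    static int constructed;
    int value;
    explicit Counted(int v) : value(v) { ++constructed; }
    Counted(Counted&& other) noexcept : value(other.value) { ++constructed; }
};
int Counted::constructed = 0;

// Hypothetical helper: returns how many Counted objects were constructed by
// the call to reserve alone.
int constructions_caused_by_reserve() {
    const int before = Counted::constructed;

    std::vector<Counted> v;
    v.reserve(1);                        // storage for one element, no element yet
    assert(v.size() == 0 && v.capacity() >= 1);
    const int after_reserve = Counted::constructed;

    v.emplace_back(3);                   // the element is initialized here
    assert(v.size() == 1 && v.front().value == 3);

    return after_reserve - before;
}
```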
Quoting the standard
Short version:
There is nothing to quote because the standard does not explicitly prohibit all spurious actions. Compilers avoid spurious actions by their own volition.
Long version:
Strictly speaking, the standard does not guarantee that reserve() does not initialize anything. The requirements imposed on reserve() in [vector.capacity] are more focused on what must be done than on prohibiting spurious activity. The closest it comes to this guarantee is the requirement that the time complexity of reserve() be linear in the size of the container (not in the capacity, but in the current size). This would make it impossible to always initialize everything that was reserved. However, a compiler could still choose to initialize a fixed number of reserved elements, say up to 10 million of them. As long as this limit is fixed, it counts as constant-time complexity, so is allowed by [vector.capacity].
Now let's get real. Compilers are designed to produce fast code, without introducing unnecessary, useless busywork. Compilers do not seek out the possibility of doing additional work simply because the standard does not prohibit it. Except for debug builds, no compiler is going to introduce an initialization when it is not required. The people who view the possibility as something worth considering are language lawyers who lose sight of the big picture. You don't pay for what you don't need. The question to ask here is not "Could you quote the standard supporting that no initialization happens?" but "Could you quote the standard supporting that no initialization is required?" Since the additional work of initialization is not required, it will not happen in practice.
Still, reality means little to some language lawyers, and this question does have that tag. To be thorough, I will demonstrate that it is "possible to separate storage allocation from object initialization" even if you happen to use a pathological, yet standards-compliant, compiler that was over-engineered by masochists. I need only one case to demonstrate "possible", so let's abandon int for a more bizarre, yet fully legal, type.
The sole precondition for reserve() is that the contained type can be move-inserted into the container. This precondition is satisfied by the following class.
class C {
// Default construction is not supported.
C() = delete;
public:
// Move construction is allowed, even outside this class.
C(C &&) = default;
};
I have designed this class to be rather hard to initialize. The only allowed construction is move-construction; in order to initialize an object of this type, you need to already have an object of this type. Who creates the first object? No one. No objects of this type can exist. However, one can still create a vector of these objects (an empty vector, but still a vector).
It is legal to define std::vector<C> v;, and to follow that by a call to v.reserve(1);. This allocates space (1 byte is needed on my system) for an object of type C, and yet there is no possible initialization of this object. QED.
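Put together as a compilable unit, the claim looks like this. The class C is copied from above; the function name reserved_capacity_without_objects is made up for the sketch, and the exact capacity returned is implementation-dependent.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class C {
    // Default construction is not supported.
    C() = delete;
public:
    // Move construction is allowed, even outside this class.
    C(C&&) = default;
};

// Hypothetical helper: reserves room for one C and reports the capacity.
std::size_t reserved_capacity_without_objects() {
    std::vector<C> v;
    v.reserve(1);       // allocates storage for at least one C
    assert(v.empty());  // but the vector still holds no C objects
    return v.capacity();
}
```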

Is replacing `this` with a different type allowed?

In the comments and answers to this question:
Virtual function compiler optimization c++
it is argued that a virtual function call in a loop cannot be devirtualized, because the virtual function might replace this by another object using placement new, e.g.:
void A::foo() { // virtual
static_assert(sizeof(A) == sizeof(Derived));
new(this) Derived;
}
The example is from an LLVM blog article about devirtualization.
Now my question is: is that allowed by the standard?
I could find this on cppreference about storage reuse: (emphasis mine)
A program is not required to call the destructor of an object to end its lifetime if the object is trivially-destructible or if the program does not rely on the side effects of the destructor. However, if a program ends the lifetime of an non-trivial object, it must ensure that a new object of the same type is constructed in-place (e.g. via placement new) before the destructor may be called implicitly
If the new object must have the same type, it must have the same virtual functions. So it is not possible to have a different virtual function, and thus, devirtualization is acceptable.
Or do I misunderstand something?
The quote you provided says:
If a program ends the lifetime of an non-trivial object, it must ensure that a new object of the same type is constructed in-place (e.g. via placement new) before the destructor may be called implicitly
The intent of this statement relates to something a bit different from what you are doing. The statement is meant to say that when you destroy an object without destroying its name, something still refers to that storage with the original type, so you need to construct a new object there so that when the implicit destruction occurs, there is a valid object to destroy. This is relevant for example if you have an automatic ("stack") variable, and you call its destructor--you need to construct a new instance there before the destructor is called when the variable goes out of scope.
The statement as a whole, and its "of the same type" clause in particular, has no bearing on the topic you're discussing, which is whether you are allowed to construct a different polymorphic type having the same storage requirements in place of an old one. I don't know of any reason why you shouldn't be allowed to do that.
Now, that being said, the question you linked to is doing something different: it is calling a function using implicit this in a loop, and the question is whether the compiler could assume that the vptr for this will not change in that loop. I believe the compiler could (and clang -fstrict-vtable-pointers does) assume this, because this is only valid if the type is the same after the placement new.
So while the quotes from the standard you have provided are not relevant to this issue, the end result is that it does seem possible for an optimizer to devirtualize function calls made in a loop under the assumption that the type of *this (or its vptr) cannot change. The type of an object stored at an address (and its vptr) can change, but if it does, the old this is no longer valid.
It appears that you intend to use the new object using handles (pointers, references, or the original variable name) that existed prior to its recreation. That's allowed only if the instance type is not changed, plus some other conditions excluding const objects and sub-objects:
From [basic.life]:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
the original object was a most derived object of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
Your quote from the Standard is merely a consequence of this one.
Your proposed "devirtualization counter-example" does not meet these requirements, therefore all attempts to access the object after it is replaced will cause undefined behavior.
The blog post even pointed this out, in the very next sentence after the example code you looked at.
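Replacing an object with one of a different type in the same storage is fine as long as the new object is reached through a pointer obtained from its own creation. A minimal sketch, with hypothetical types Shape, Triangle and Square and a raw buffer standing in for this:

```cpp
#include <cassert>
#include <new>
#include <utility>

// Hypothetical polymorphic hierarchy with two same-sized derived types.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};
struct Triangle final : Shape { int sides() const override { return 3; } };
struct Square final : Shape { int sides() const override { return 4; } };

// Hypothetical helper: builds a Triangle, replaces it with a Square in the
// same storage, and reports what each one answered.
std::pair<int, int> sides_before_and_after() {
    static_assert(sizeof(Triangle) == sizeof(Square), "same size assumed");
    static_assert(alignof(Triangle) == alignof(Square), "same alignment assumed");
    alignas(Triangle) unsigned char storage[sizeof(Triangle)];

    Shape* p = ::new (static_cast<void*>(storage)) Triangle;
    const int before = p->sides();
    p->~Shape();  // the Triangle's lifetime ends

    // A different type now occupies the storage. The old pointer value must
    // not be used; p is reassigned from the new-expression instead.
    p = ::new (static_cast<void*>(storage)) Square;
    const int after = p->sides();
    p->~Shape();

    return {before, after};
}
```

What the devirtualization argument relies on is the other direction: a pointer or this that predates the replacement does not automatically refer to the Square.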

Why are things like cin, cout, string, etc. considered objects?

From what I understand (and what my textbook says), an object is a programming element that is self-contained, which holds data and a procedure that performs an operation on that data. With this being said, why are things like cin, cout, string, etc. considered objects? Is cin an object, in the way that I defined? Is cin the name of a self-contained unit, which holds data and a procedure that performs operations on that data, found within the source code of the iostream header file?
cin and cout are variables, and as such they're objects.
An object, in C++, is a not-necessarily contiguous region of storage, with an associated content interpretation in the form of a type.
This is a term defined by the C++ standard.
C++11 §1.8/1
The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is a
region of storage. [Note: A function is not an object, regardless of whether or not it occupies storage in the
way that objects do. —end note ] An object is created by a definition (3.1), by a new-expression (5.3.4) or
by the implementation (12.2) when needed. The properties of an object are determined when the object is
created. An object can have a name (Clause 3). An object has a storage duration (3.7) which influences
its lifetime (3.8). An object has a type (3.9). The term object type refers to the type with which the object
is created. Some objects are polymorphic (10.3); the implementation generates information associated with
each such object that makes it possible to determine that object’s type during program execution. For other
objects, the interpretation of the values found therein is determined by the type of the expressions (Clause 5)
used to access them.
The non-contiguous thing was primarily in support of multiple inheritance, but at least one committee member argued strongly, in a discussion with me, that it was intended to support making objects in general non-contiguous. However, I know of no extant compiler that does that. It seems meaningless to me.
std::string is not an object, it's a type.
Note: with some other programming languages, and in computer science in general, the term “object” often denotes an instance of a class type. In C++ even instances of non-class types such as int, are objects.
They are considered objects, because they are "objects". They are not types, they are instances.
You can see how they are defined on cppreference.
Example:
extern std::istream cin;
extern std::wistream wcin;
As you can see, cin is a variable whose type is std::istream.
Regarding your assumption about std::string: again, cppreference is very helpful.
We can see that std::string is not a variable/object, but a type alias for std::basic_string<char> instead.
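The distinction can be checked at compile time. This is a small sketch; the function name cin_is_an_istream_object is made up for the example.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <type_traits>

// std::cin is a variable, and its declared type is std::istream.
static_assert(std::is_same<decltype(std::cin), std::istream>::value,
              "std::cin is declared as a std::istream");

// std::string names a type: an alias for std::basic_string<char>.
static_assert(std::is_same<std::string, std::basic_string<char>>::value,
              "std::string is an alias for std::basic_string<char>");

// Hypothetical helper: an object has an address; a type is used to declare one.
bool cin_is_an_istream_object() {
    std::istream* p = &std::cin;  // taking the address of the object std::cin
    assert(p != nullptr);
    std::string s = "text";       // s is an object whose type is std::string
    return s.size() == 4;
}
```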

Definition of object in C++

Could someone point me to the (official) definition of object in C++? In the current specification, the word "object" is used a few thousand times, but I can't seem to find a section or reference that explains what an object is.
The background to this somewhat basic question is a discussion I recently had with another user, who was surprised by my question of whether a pointer to a variable of a scoped enum type could be considered an object pointer.
According to what he says, in C++ each variable is an object, hence also the variable i in int i = 42;.
Anyway, I could find other sources stating that an object in C++ is an instance of a class (and this is surely what I was taught at school many years ago), which contradicts in my understanding the assumption above that any variable is an object. Or is there an explanation to this apparent contradiction?
References aren't objects. Instances of pretty much any other type are.
Here's the definition, found in section 1.8:
The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is a region of storage. [ Note: A function is not an object, regardless of whether or not it occupies storage in the way that objects do. — end note ] An object is created by a definition (3.1), by a new-expression (5.3.4) or by the implementation (12.2) when needed. The properties of an object are determined when the object is created. An object can have a name (Clause 3). An object has a storage duration (3.7) which influences its lifetime (3.8). An object has a type (3.9). The term object type refers to the type with which the object is created. Some objects are polymorphic (10.3); the implementation generates information associated with each such object that makes it possible to determine that object's type during program execution. For other objects, the interpretation of the values found therein is determined by the type of the expressions (Clause 5) used to access them.
More useful is the definition of object type in 3.9p8:
An object type is a (possibly cv-qualified) type that is not a function type, not a reference type, and not a void type.
Functions have function type but they aren't instances, and there never are instances of void.
To deal with your particular debate, you need the definition of object pointer, from 3.9.2p3:
The type of a pointer to void or a pointer to an object type is called an object pointer type.
As it turns out, the definition of object never mattered, only the definition of object type. A pointer to a scoped enum is certainly an object pointer (and it is itself also an object).
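These definitions map onto the standard type traits. In the sketch below, std::is_object is the real library trait, while Color, is_object_pointer and scoped_enum_pointer_is_object_pointer are hypothetical names written only to mirror the quoted wording.

```cpp
#include <cassert>
#include <type_traits>

enum class Color { red, green };  // hypothetical scoped enum

// Object types: not a function type, not a reference type, not void.
static_assert(std::is_object<int>::value, "int is an object type");
static_assert(std::is_object<Color>::value, "a scoped enum is an object type");
static_assert(std::is_object<Color*>::value, "a pointer is itself an object type");
static_assert(!std::is_object<int&>::value, "references are not object types");
static_assert(!std::is_object<void>::value, "void is not an object type");
static_assert(!std::is_object<void()>::value, "functions are not object types");

// Hypothetical trait following the quoted definition: an object pointer type
// is a pointer to void or a pointer to an object type.
template <typename T>
struct is_object_pointer
    : std::integral_constant<
          bool,
          std::is_pointer<T>::value &&
              (std::is_void<typename std::remove_pointer<T>::type>::value ||
               std::is_object<typename std::remove_pointer<T>::type>::value)> {};

// Hypothetical helper answering the question from the debate.
bool scoped_enum_pointer_is_object_pointer() {
    assert(!is_object_pointer<Color>::value);  // Color itself is not a pointer
    return is_object_pointer<Color*>::value;
}
```

A function pointer such as void (*)() is a pointer but not an object pointer under this definition, which the trait reflects.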
You'll find that the Standard uses the phrase class object when it means to restrict to instances of class, struct, or union type.