Does std::optional<>::emplace() invalidate references to the inner value?

Consider the following fragment (assume that T is trivially constructible and trivially destructible):
std::optional<T> opt;
opt.emplace();
T& ref = opt.value();
opt.emplace();
// is ref guaranteed to be valid here?
From the definition of std::optional we know that the contained instance is guaranteed to be allocated inside the std::optional container, hence we know that the reference ref will always be referring to the same memory location. Are there circumstances where said reference will not retain validity after the pointed-to object is destroyed and then constructed again?
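One way to sidestep the question entirely is to re-acquire the reference from the return value of emplace() (available since C++17) instead of reusing the old one. The following is a minimal sketch under that assumption; `Sample` and `reacquire_reference` are hypothetical names, not part of the question. The address check reflects what mainstream implementations do, since the contained value lives inside the optional itself.

```cpp
#include <cassert>
#include <optional>

struct Sample { int n; };  // hypothetical trivially constructible/destructible T

// Returns the value seen through the reference handed back by the second emplace().
int reacquire_reference() {
    std::optional<Sample> opt;
    Sample& first = opt.emplace();      // value-initialized, so first.n == 0
    first.n = 5;
    const Sample* old_address = &first;

    Sample& second = opt.emplace();     // destroys the old Sample, creates a new one
    // The contained value is stored inside the optional, so the address is
    // expected to be unchanged on mainstream implementations.
    assert(old_address == &second);

    second.n += 1;                      // the new object was value-initialized to 0
    return second.n;
}
```

Calling `reacquire_reference()` never touches `first` after the second emplace(), so it does not rely on [basic.life]/8 at all.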

C++20 has the following rule, [basic.life]/8:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if the original object is transparently replaceable (see below) by the new object. An object o1 is transparently replaceable by an object o2 if:
the storage that o2 occupies exactly overlays the storage that o1 occupied, and
o1 and o2 are of the same type (ignoring the top-level cv-qualifiers), and
o1 is not a complete const object, and
neither o1 nor o2 is a potentially-overlapping subobject (6.7.2), and
either o1 and o2 are both complete objects, or o1 and o2 are direct subobjects of objects p1 and p2 ,
respectively, and p1 is transparently replaceable by p2.
This suggests that as long as T is not const-qualified, destroying the T inside an std::optional<T> and then recreating it should result in a reference to the old object automatically referring to the new object. As pointed out in the comments section, this is a change from the old behaviour, abolishing a requirement that T must not contain a non-static data member of const-qualified or reference type. (Edit: I previously asserted that the change was made retroactively, as I confused it with a different change in C++20. I am not sure whether the resolution to RU 007 and US 042 as indicated in N4858 were made retroactive, but I suspect the answer is yes, because the change was needed to fix code involving standard library templates that was probably not intended to be broken from C++11 through C++17.)
However, we are making the assumption that the new T object is being created "before the storage which the [old] object occupied is reused or released". If I were writing an "adversarial" implementation of the standard library, I could set it up so that the emplace call reuses the underlying storage prior to creating the new T object. This would prevent the old T object from being transparently replaced by the new one.
How might an implementation "reuse" the storage? Typically, the underlying storage might be declared like this:
union {
char no_object;
T object;
};
When the default constructor of optional is called, no_object is initialized (the value does not matter; see footnote 1). An emplace() call checks whether there is a T object or not (by checking a flag that is not shown here). If a T object is present, then object.~T() is called. Finally, something similar to construct_at(addressof(object)) is called in order to construct the new T object.
Not that any implementation would ever do this, but you could imagine an implementation that, in between the calls to object.~T() and construct_at(addressof(object)), re-initializes the no_object member. This would be a "reuse" of the storage that was previously occupied by object. This would imply that the requirements of [basic.life]/8 are not met.
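The storage scheme described above can be sketched as a tiny class. This is an illustration only, not how any particular standard library implements std::optional; `TinyOptional`, `engaged`, and `tiny_optional_demo` are hypothetical names, and placement new stands in for construct_at so the sketch also compiles as C++17.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>
#include <utility>

template <class T>
class TinyOptional {
    union {
        char no_object;  // active member while no T exists
        T object;        // active member while a T exists
    };
    bool engaged = false;  // the flag "not shown" in the description above

public:
    TinyOptional() : no_object{} {}
    ~TinyOptional() { reset(); }
    TinyOptional(const TinyOptional&) = delete;
    TinyOptional& operator=(const TinyOptional&) = delete;

    void reset() {
        if (engaged) {
            object.~T();
            engaged = false;
        }
    }

    template <class... Args>
    T& emplace(Args&&... args) {
        reset();
        // An "adversarial" implementation could re-initialize no_object here,
        // which would count as reusing the storage before the new T is created.
        T* p = ::new (static_cast<void*>(std::addressof(object)))
            T(std::forward<Args>(args)...);
        engaged = true;
        return *p;
    }

    T& value() { return object; }  // precondition: a T object is present
};

int tiny_optional_demo() {
    TinyOptional<std::string> opt;
    opt.emplace("first");
    std::string& s = opt.emplace(3, 'x');  // replaces "first" with "xxx"
    return static_cast<int>(s.size());
}
```

As in the earlier discussion, the caller here uses the reference returned by emplace() rather than one obtained before the replacement.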
Of course, the practical answer to your question is that (1) there is no reason for an implementation to do something like this, and (2) even if an implementation did it, the developers would ensure that your code still behaves as if the T object was transparently replaced. Your code is reasonable under the assumption that the standard library implementation is reasonable, and compiler developers do not like to break code with that property, since doing so would needlessly aggravate their users.
But if a compiler developer were inclined to break your code (based on the argument that the more undefined behaviour there is, the more the compiler can optimize) then they could break your code even without changing the <optional> header file. The user is required to treat the standard library like a "black box" that only guarantees what the standard explicitly guarantees. So under a pedantic reading of the standard, it's unspecified whether or not attempting to access ref after the second emplace call has undefined behaviour. If it's unspecified whether it's UB, then the compiler is allowed to start treating it as UB whenever it wants.
Footnote 1: The reason for this is historical; C++17 requires that a constexpr constructor initialize exactly one variant member of a union. This rule was abolished in C++20, so a C++20 implementation could omit the no_object member.

Related

Clarification and reasons for object lifetime constraints change in C++20

As of C++20, the constraints on object lifetime changed significantly, from basic.life#8.3 to n4861/basic.life#8.3. The concrete change I want to focus on here is (C++20 draft):
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if the original object is transparently replaceable (see below) by the new object. An object o1 is transparently replaceable by an object o2 if
the storage that o2 occupies exactly overlays the storage that o1 occupied, and
o1 and o2 are of the same type (ignoring the top-level cv-qualifiers), and
o1 is not a complete const object, and
neither o1 nor o2 is a potentially-overlapping subobject, and
either o1 and o2 are both complete objects, or o1 and o2 are direct subobjects of objects p1 and p2, respectively, and p1 is
transparently replaceable by p2.
vs. the pre-C++20 one (not the only change, see the drafts for details)
the type of the original object is not const-qualified, and, if a
class type, does not contain any non-static data member whose type is
const-qualified or a reference type...
Question 1:
Maybe too obvious, but just to be sure: by "complete const object", the standard means a complete object that is const, right? (So it has nothing to do with the partially/fully const objects that the previous wording of this section dealt with?)
Question 2:
Can anyone explain the reasons behind these changes (aliasing?)?
Question 3:
Is my assumption valid that this is a substantial relaxation of the former rules, i.e. that many more object/memory reuse scenarios are now guaranteed not to be UB? If so, shouldn't this also heavily affect what optimizers are allowed to do (a shift in where efficiency can be gained)?
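To make the difference between the two wordings concrete, here is a minimal sketch. `Entry` and `reuse_through_original_name` are hypothetical names. Under the C++20 wording quoted above, the name `e` may be used after the replacement because `e` is a complete object that is not itself const, even though it has a const member; under the pre-C++20 wording, the const member alone would have ruled this out.

```cpp
#include <cassert>
#include <new>

struct Entry {
    const int key;  // a const member: this is what the pre-C++20 rule excluded
    int value;
    Entry(int k, int v) : key(k), value(v) {}
};

int reuse_through_original_name() {
    Entry e(1, 10);                               // e itself is not a const object
    e.~Entry();                                   // end the old object's lifetime
    ::new (static_cast<void*>(&e)) Entry(2, 20);  // new object, same type, same storage
    // Under the C++20 wording, e now names the new object.
    return e.key + e.value;
}

// By contrast, an object declared as `const Entry ce(1, 10);` is a complete
// const object and is not transparently replaceable.
```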

How risky is it to destroy_at/construct non-copyable vector elements?

I've searched through SO and have a question about whether the following is safe or not.
Consider this vector:
std::vector<std::pair<const key, value>> vec;
Imagine that I want to swap-and-pop an element at position i with the last one.
Ignore for a moment the fact that if something goes wrong the vector may be in an unsafe state. I perfectly know that but the question is more about the other operation around.
Since I can't swap the two elements, I'd be tempted to do this:
auto *elem = std::addressof(vec[i]);
std::destroy_at(elem);
std::allocator_traits<decltype(alloc)>::construct(alloc, elem, std::move(vec.back())); // alloc: the vector's allocator, e.g. vec.get_allocator()
vec.pop_back();
Again, we are in an unsafe state between line 2 and 3 but let's ignore it.
I'm curious to know if destroying and reconstructing the element at position i is safe for all i instead.
From my understanding, I've a couple of doubts about it:
When i is 0, we're somewhat playing with the same initial pointer as stored by the vector itself. Therefore, I wonder if in this case we should force the vector to std::launder it (yeah, it's not possible, I'm just entering the in-theory field here).
Since the pair has a const key, I guess destroying and reconstructing it can lead to UB. Though, I'm not that sure about this, just my gut feeling.

First, I'll address only your concern about const members, leaving aside the std::vector context and assuming std::construct_at (an allocator-based construct call would additionally need the allocator):
Since the pair has a const key, I guess destroying and reconstructing it can lead to UB. Though, I'm not that sure about this, just my gut feeling.
Before C++20 you're right: this would be UB if old references and pointers are still in use. But as of C++20, there is a relevant change here, from basic.life#8.3 to n4861/basic.life#8.3:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if the original object is transparently replaceable (see below) by the new object. An object o1 is transparently replaceable by an object o2 if
the storage that o2 occupies exactly overlays the storage that o1 occupied, and
o1 and o2 are of the same type (ignoring the top-level cv-qualifiers), and
o1 is not a complete const object, and
neither o1 nor o2 is a potentially-overlapping subobject, and
either o1 and o2 are both complete objects, or o1 and o2 are direct subobjects of objects p1 and p2, respectively, and p1 is
transparently replaceable by p2.
So as of these changes, you should actually be fine in general for your particular(!) example since vector elements have to be non-const, i.e. the referred objects cannot be const complete ones (subpoint 3). Keep in mind that these phrases are relevant for the case of further using old references and pointers to this reused location! A simple placement new after an explicit destruction (not de-allocation) is fine a priori, it's the further context, that's relevant for the question about UB.
Also see the discussion from
Is it possible now with the current C++ standard draft version to define a copy assignment operator for classes with const fields without UB
Taking the std::vector context into account, with std::construct_at, except for the position-zero case:
Your concerns about const members are actually no longer relevant here, since we have an in-between layer: the allocator (of std::vector), the promise of simple contiguous storage, and the fact that there are no "pending" references to the old memory (as long as you do not use your old elem pointer afterwards...). Otherwise, a lot of highly efficient operations on/inside the standard vector container would not be possible (emplacing data into pre-allocated ranges). The standard library itself uses similar schemes for moving and insertion in many container types. It's still very ugly to circumvent the internal memory "assumptions" of an external container this way, but it should be OK for all common library implementations of std::vector (but surely not for containers with complex inner logic like a map!).
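The swap-and-pop idea from the question can be sketched as a small helper. This is an illustration under stated assumptions, not a vetted implementation: `Item`, `swap_and_pop`, and `swap_and_pop_demo` are hypothetical names, placement new is used where C++20 code could use std::construct_at, and the sketch ignores allocators. It also guards the case where i is the last index, since destroying the last element and then moving from it would read a destroyed object.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <utility>
#include <vector>

using Item = std::pair<const int, std::string>;  // hypothetical key/value types

// Replaces vec[i] with the last element, then drops the last element.
// If Item's move constructor throws, vec[i] is left destroyed (not handled here).
void swap_and_pop(std::vector<Item>& vec, std::size_t i) {
    assert(i < vec.size());
    if (i + 1 != vec.size()) {  // nothing to move when i is already the last index
        Item* elem = std::addressof(vec[i]);
        std::destroy_at(elem);
        ::new (static_cast<void*>(elem)) Item(std::move(vec.back()));
    }
    vec.pop_back();
}

std::string swap_and_pop_demo() {
    std::vector<Item> vec;
    vec.emplace_back(1, "a");
    vec.emplace_back(2, "b");
    vec.emplace_back(3, "c");
    swap_and_pop(vec, 0);  // element 0 becomes {3, "c"}; size drops to 2
    return vec[0].second + vec[1].second;
}
```

The helper re-reads elements through the vector afterwards rather than through any pointer or reference taken before the replacement.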

This is not UB but it opens pitfalls. Before C++20 all pointers and references to vec[i] are gone at destroy time so using them after the operation would be UB.
And even in C++20 pointers or references to vec[i].first still go away, because vec[i].first is a complete const object, so it is not transparently replaceable.
It is not worse than using pop_back and then push_back or emplace_back to replace the last element of a vector. But it has the exact same caveats: you have a different object at the place where a former object existed. So using a reference or pointer to anything up to C++17, or to anything that is declared as const since C++20, is explicitly UB.
The underlying reason is that optimizing compilers are fond of caching anything that is declared as const. So any trick ending in replacing a const object is only safe if you are sure to never use a direct pointer, reference or variable that referred to that damned const thing.
It is rather evident when you play with the tail of a container, because future users of that code are aware that something has disappeared. I believe that doing that in the middle of a vector is more dangerous, because it could easily mislead future maintainers or users. At least it requires a warning comment in red flashing font.

Is it safe to call placement new on `this` for trivial object?

I know that this question was asked several times already but I couldn't find an answer for this particular case.
Let's say I have a trivial class that doesn't own any resources and has empty destructor and default constructor. It has a handful of member variables with in-class initialization; not one of them is const.
I want to re-initialize an object of such a class without writing a deInit method by hand. Is it safe to do it like this?
void A::deInit()
{
new (this)A{};
}
I can't see any problem with it - object is always in valid state, this still points to the same address; but it's C++ so I want to be sure.

Similarly to the legality of delete this, placement new to this is also allowed as far as I know. Also, regarding whether this, or other pre-existing pointers / references can be used afterwards, there are a few restrictions:
[basic.life]
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is
const-qualified or a reference type, and
neither the original object nor the new object is a potentially-overlapping subobject ([intro.object]).
The first two are satisfied in this example, but the last two will need to be taken into consideration.
Regarding the third point, given that the function is non-const-qualified, it should be fairly safe to assume that the original object is non-const. The fault is on the caller side if the constness has been cast away. Regarding const / reference member, I think that can be checked by asserting that this is assignable:
static_assert(std::is_trivial_v<A> && std::is_copy_assignable_v<A>);
Of course, since assignability is a requirement, you could instead simply use *this = {}; which I would expect to produce the same program. A perhaps more interesting use case might be to reuse memory of *this for an object of another type (which would fail the requirements for using this, at least without reinterpreting + laundering).
Similar to delete this, placement new to this could hardly be described as "safe".
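A minimal sketch of both variants discussed above, placement new on this and plain assignment, might look as follows. `Settings` is a hypothetical stand-in for the class A from the question. Note that a class with default member initializers does not satisfy std::is_trivial_v (its default constructor is not trivial), so this sketch checks trivial destructibility and copy assignability instead.

```cpp
#include <cassert>
#include <new>
#include <type_traits>

struct Settings {  // hypothetical stand-in for the class A in the question
    int width = 640;
    int height = 480;
    bool fullscreen = false;

    // Variant 1: re-create the object in place.
    void deInit() { ::new (static_cast<void*>(this)) Settings{}; }

    // Variant 2: assign a freshly initialized value.
    void reset() { *this = Settings{}; }
};

static_assert(std::is_trivially_destructible_v<Settings>);
static_assert(std::is_copy_assignable_v<Settings>);

int deinit_demo() {
    Settings s;
    s.width = 1920;
    s.fullscreen = true;
    s.deInit();
    int after_placement = s.width;  // back to the in-class default

    s.height = 1;
    s.reset();
    return after_placement + s.height;
}
```

Both variants restore the in-class defaults; the assignment form does so without ending and restarting the object's lifetime.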

The rules that cover this are in [basic.life]/5
A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type. For an object of a class type, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression is not used to release the storage, the destructor is not implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
and [basic.life]/8
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
neither the original object nor the new object is a potentially-overlapping subobject ([intro.object]).
Since your object is trivial you don't have to worry about [basic.life]/5 and as long as you satisfy the bullet points from [basic.life]/8, then it is safe.

Is replacing `this` with a different type allowed?

In the comments and answers to this question:
Virtual function compiler optimization c++
it is argued that a virtual function call in a loop cannot be devirtualized, because the virtual function might replace this by another object using placement new, e.g.:
void A::foo() { // virtual
static_assert(sizeof(A) == sizeof(Derived));
new(this) Derived;
}
The example is from a LLVM blog article about devirtualization
Now my question is: is that allowed by the standard?
I could find this on cppreference about storage reuse: (emphasis mine)
A program is not required to call the destructor of an object to end its lifetime if the object is trivially-destructible or if the program does not rely on the side effects of the destructor. However, if a program ends the lifetime of an non-trivial object, it must ensure that a new object of the same type is constructed in-place (e.g. via placement new) before the destructor may be called implicitly
If the new object must have the same type, it must have the same virtual functions. So it is not possible to have a different virtual function, and thus, devirtualization is acceptable.
Or do I misunderstand something?

The quote you provided says:
If a program ends the lifetime of an non-trivial object, it must ensure that a new object of the same type is constructed in-place (e.g. via placement new) before the destructor may be called implicitly
The intent of this statement relates to something a bit different to what you are doing. The statement is meant to say that when you destroy an object without destroying its name, something still refers to that storage with the original type, so you need to construct a new object there so that when the implicit destruction occurs, there is a valid object to destroy. This is relevant for example if you have an automatic ("stack") variable, and you call its destructor--you need to construct a new instance there before the destructor is called when the variable goes out of scope.
The statement as a whole, and its "of the same type" clause in particular, has no bearing on the topic you're discussing, which is whether you are allowed to construct a different polymorphic type having the same storage requirements in place of an old one. I don't know of any reason why you shouldn't be allowed to do that.
Now, that being said, the question you linked to is doing something different: it is calling a function using implicit this in a loop, and the question is whether the compiler could assume that the vptr for this will not change in that loop. I believe the compiler could (and clang -fstrict-vtable-pointers does) assume this, because this is only valid if the type is the same after the placement new.
So while the quotes from the standard you have provided are not relevant to this issue, the end result is that it does seem possible for an optimizer to devirtualize function calls made in a loop under the assumption that the type of *this (or its vptr) cannot change. The type of an object stored at an address (and its vptr) can change, but if it does, the old this is no longer valid.

It appears that you intend to use the new object using handles (pointers, references, or the original variable name) that existed prior to its recreation. That's allowed only if the instance type is not changed, plus some other conditions excluding const objects and sub-objects:
From [basic.life]:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
the original object was a most derived object of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
Your quote from the Standard is merely a consequence of this one.
Your proposed "devirtualization counter-example" does not meet these requirements, therefore all attempts to access the object after it is replaced will cause undefined behavior.
The blog post even pointed this out, in the very next sentence after the example code you looked at.
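The distinction between the old pointer and the pointer returned by placement new can be sketched as follows. `Shape`, `Triangle`, and `replace_with_other_type` are hypothetical names; a raw buffer is used instead of this so that the example stays within defined behavior.

```cpp
#include <cassert>
#include <new>

struct Shape {
    virtual int sides() const { return 0; }
    virtual ~Shape() = default;
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

int replace_with_other_type() {
    alignas(Triangle) unsigned char storage[sizeof(Triangle)];

    Shape* old_ptr = ::new (static_cast<void*>(storage)) Shape;
    old_ptr->~Shape();  // end the lifetime of the Shape object

    // A different type now lives in the same storage. Only the pointer
    // returned by placement new may be used to reach it; calling
    // old_ptr->sides() from here on would be undefined behavior.
    Shape* new_ptr = ::new (static_cast<void*>(storage)) Triangle;
    int result = new_ptr->sides();

    new_ptr->~Shape();  // virtual destructor, destroys the Triangle
    return result;
}
```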

Is it legal to use placement new on initialised memory?

I am exploring the possibility of implementing true (partially) immutable data structures in C++. As C++ does not seem to distinguish between a variable and the object that variable stores, the only way to truly replace the object (without assignment operation!) is to use placement new:
auto var = Immutable(state0);
// the following is illegal as it requires assignment to
// an immutable object
var = Immutable(state1);
// however, the following would work as it constructs a new object
// in place of the old one
new (&var) Immutable(state1);
Assuming that there is no non-trivial destructor to run, is this legal in C++ or should I expect undefined behaviour? If it's standard-dependent, which is the minimal/maximal standard version where I can expect this to work?
Addendum: since it seems people still read this in 2019, a quick note — this pattern is actually legally possible in modern (post 17) C++ using std::launder().

What you wrote is technically legal but almost certainly useless.
Suppose
struct Immutable {
const int x;
Immutable(int val):x(val) {}
};
for our really simple immutable type.
auto var = Immutable(0);
::new (&var) Immutable(1);
this is perfectly legal.
And useless, because you cannot use var to refer to the state of the Immutable(1) you stored within it after the placement new. Any such access is undefined behavior.
You can do this:
auto var = Immutable(0);
auto* pvar1 = ::new (&var) Immutable(1);
and access to *pvar1 is legal. You can even do:
auto var = Immutable(0);
auto& var1 = *(::new (&var) Immutable(1));
but under no circumstance may you ever refer to var after you placement new'd over it.
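A sketch of the two ways to reach the new object without going through the old name: the pointer returned by placement new, and (since C++17, as the question's addendum mentions) std::launder. `read_replaced_immutable` is a hypothetical name; `Immutable` is the type from this answer.

```cpp
#include <cassert>
#include <new>

struct Immutable {
    const int x;
    Immutable(int val) : x(val) {}
};

int read_replaced_immutable() {
    auto var = Immutable(0);

    // Option 1: keep the pointer that placement new returns.
    Immutable* fresh = ::new (static_cast<void*>(&var)) Immutable(1);
    int via_returned_pointer = fresh->x;

    // Option 2 (C++17): launder a pointer to the storage to obtain a
    // pointer to the object that currently lives there.
    int via_launder = std::launder(&var)->x;

    return via_returned_pointer + via_launder;
}
```

Neither read goes through the name var directly, which is the access this answer warns against under the wording it quotes.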
Actual const data in C++ is a promise to the compiler that you'll never, ever change the value. This is in comparison to references to const or pointers to const, which is just a suggestion that you won't modify the data.
Members of structures declared const are "actually const". The compiler will presume they are never modified, and won't bother to prove it.
You creating a new instance in the spot where an old one was in effect violates this assumption.
You are permitted to do this, but you cannot use the old names or pointers to refer to it. C++ lets you shoot yourself in the foot. Go right ahead, we dare you.
This is why this technique is legal, but almost completely useless. A good optimizer with static single assignment already knows that you would stop using var at that point, and creating
auto var1 = Immutable(1);
it could very well reuse the storage.
Calling placement new on top of another variable is usually defined behaviour. It is usually a bad idea, and it is fragile.
Doing so ends the lifetime of the old object without calling the destructor. References and pointers to and the name of the old object refer to the new one if some specific assumptions hold (exact same type, no const problems).
Modifying data declared const, or a class containing const fields, results in undefined behaviour at the drop of a pin. This includes ending the lifetime of an automatic storage field declared const and creating a new object at that location. The old names and pointers and references are not safe to use.
[Basic.life 3.8]/8:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
(8.1) the storage for the new object exactly overlays the storage location which the original object occupied, and
(8.2) the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
(8.3) the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
(8.4) the original object was a most derived object (1.8) of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
In short, if your immutability is encoded via const members, using the old name or pointers to the old content is undefined behavior.
You may use the return value of placement new to refer to the new object, and nothing else.
Exception possibilities make it extremely difficult to prevent code that executes undefined behaviour or has to summarily exit.
If you want reference semantics, either use a smart pointer to a const object or an optional const object. Both handle object lifetime. The first requires heap allocation but permits move (and possibly shared references), the second permits automatic storage. Both move manual object lifetime management out of business logic. Now, both are nullable, but avoiding that robustly is difficult doing it manually anyhow.
Also consider copy on write pointers that permit logically const data with mutation for efficiency purposes.

From the C++ standard draft N4296:
3.8 Object lifetime
[...]
The lifetime of an object of type T ends when:
(1.3) — if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
(1.4) — the storage which the object occupies is reused or released.
[...]
4 A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type with a non-trivial destructor. For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
So yes, you can end the lifetime of an object by reusing its memory, even of one with non-trivial destructor, as long as you don't depend on the side effects of the destructor call.
This applies when you have non-const instances of objects like struct ImmutableBounds { const void* start; const void* end; }

You've actually asked 3 different questions :)
1. The contract of immutability
It's just that - a contract, not a language construct.
In Java for instance, instances of String class are immutable. But that means that all methods of the class have been designed to return new instances of class rather than modifying the instance.
So if you would like to make Java's String into a mutable object, you couldn't, without having access to its source code.
Same applies to classes written in C++, or any other language. You have an option to create a wrapper (or use a Proxy pattern), but that's it.
2. Using a placement constructor and allocating into an initialized area of memory.
That's actually what they were created to do in the first place.
The most common use case for the placement constructor are memory pools - you preallocate a large memory buffer, and then you allocate your stuff into it.
So yes - it is legal, and nobody will mind.
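The memory-pool use case mentioned above can be sketched as a simple bump allocator. This is a minimal illustration, not a production pool: `BumpPool`, `Vec2`, and `pool_demo` are hypothetical names, the buffer size is arbitrary, and the sketch only accepts trivially destructible types because it never runs destructors.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

struct Vec2 { float x; float y; };

class BumpPool {
    alignas(std::max_align_t) unsigned char buffer_[256];  // preallocated storage
    std::size_t used_ = 0;

public:
    // Constructs a T inside the buffer; returns nullptr when the pool is full.
    template <class T, class... Args>
    T* create(Args&&... args) {
        static_assert(std::is_trivially_destructible_v<T>,
                      "this sketch never calls destructors");
        // Round the current offset up to the alignment T requires.
        const std::size_t start = (used_ + alignof(T) - 1) / alignof(T) * alignof(T);
        if (start + sizeof(T) > sizeof(buffer_)) return nullptr;
        used_ = start + sizeof(T);
        return ::new (static_cast<void*>(buffer_ + start)) T{std::forward<Args>(args)...};
    }
};

int pool_demo() {
    BumpPool pool;
    int* a = pool.create<int>(40);
    Vec2* v = pool.create<Vec2>(1.0f, 1.0f);
    assert(a != nullptr && v != nullptr);
    return *a + static_cast<int>(v->x + v->y);
}
```

Each object is reached only through the pointer that placement new returned, so none of the reference-validity questions discussed earlier arise.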
3. Overwriting class instance's contents using a placement allocator.
Don't do that.
There's a special construct that handles this type of operation, and it's called a copy constructor.
