Is this assignment invalid after explicit destructor call? - c++

I have come across some code that is intended to replace an object in-place without reallocation of memory:
static void move(void* const* src, void** dest) {
(*reinterpret_cast<T**>(dest))->~T();
**reinterpret_cast<T**>(dest) = **reinterpret_cast<T* const*>(src);
}
This looks like UB to me, since the object is destroyed and then assigned to without being constructed, i.e. it needs to either just copy-assign (the second line only) or explicitly destruct (the first line) followed by placement-new copy construction instead of the assignment.
I only ask because although this seems like a glaring bug to me, it has existed for some time in both boost::spirit::hold_any and the original cdiggins::any on which it is based. (I have asked about it on the Boost developers mailing list, but while awaiting responses wish to fix this locally if it is indeed incorrect.)

Assuming the reinterpret_casts are well-defined (that is, dest really is a pointer to pointer to T), the standard defines the end of an object's lifetime as:
The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
the storage which the object occupies is reused or released.
It then gives some restrictions over what can be done with the glvalue **reinterpret_cast<T**>(dest):
Similarly, [...] after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. [...] The program has undefined behavior if:
an lvalue-to-rvalue conversion (4.1) is applied to such a glvalue,
the glvalue is used to access a non-static data member or call a non-static member function of the object, or
the glvalue is implicitly converted (4.10) to a reference to a base class type, or
the glvalue is used as the operand of a static_cast (5.2.9) except when the conversion is ultimately to cv char& or cv unsigned char&, or
the glvalue is used as the operand of a dynamic_cast (5.2.7) or as the operand of typeid.
Emphasis added.
If the object doesn't end up in this after-life state because it has a trivial destructor, there is no problem. However, for any T which is a class type with non-trivial destructor, we know that the assignment operator is considered a member function operator= of that class. Calling a non-static member function of the object through this glvalue results in undefined behaviour.
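The two repairs proposed in the question can be sketched roughly as follows. This is only an illustration, not the Boost code: the helper names replace_by_assignment and replace_by_reconstruction are hypothetical, and the sketch assumes that *src and *dest were created from pointers to live objects of type T.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical helpers for illustration; not the boost::spirit::hold_any code.
// Assumption: *src and *dest were created from pointers to live T objects.

// Option 1: plain copy-assignment. The target object stays alive throughout.
template <typename T>
void replace_by_assignment(void* const* src, void** dest) {
    T* target = static_cast<T*>(*dest);
    const T* source = static_cast<const T*>(*src);
    *target = *source;
}

// Option 2: end the old object's lifetime, then start a new object in the
// same storage with placement new. Not safe if source and target are the
// same object, or if T's copy constructor throws.
template <typename T>
void replace_by_reconstruction(void* const* src, void** dest) {
    T* target = static_cast<T*>(*dest);
    const T* source = static_cast<const T*>(*src);
    target->~T();                                   // lifetime of the old object ends
    ::new (static_cast<void*>(target)) T(*source);  // lifetime of the new object begins
}
```

Either variant avoids calling operator= on an object whose lifetime has already ended; which one fits depends on whether T is required to be assignable or only copy-constructible.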

This looks like UB to me, since the object is destroyed and then
assigned to without being constructed, i.e. it needs to either just
copy-assign (the second line only) or explicitly destruct (the first
line) followed by placement-new copy construction instead of the
assignment.
There is no need to fix anything, although this code is certainly not safe without further qualification (it's certain to be safe in the context where it's used though).
The object at dest is destroyed, and then the memory backing the object at src is copied over to where the object at dest used to live. End result: you have destroyed one object and placed a shallow clone of another object where the first one used to live.
If you only do the copy assignment the first object will not have been destructed, resulting in resource leaks.
Using placement new to populate the memory at dest would be an option, but it has very different semantics than the existing code (creates a brand new object instead of making a shallow clone of an existing one). Placement new and using the copy constructor also has different semantics: the object needs to have an accessible copy constructor, and you are no longer in control of what the result will be (the copy constructor does whatever it wants).

Related

Is it safe to call placement new on `this` for trivial object?

I know that this question was asked several times already but I couldn't find an answer for this particular case.
Let's say I have a trivial class that doesn't own any resources and has empty destructor and default constructor. It has a handful of member variables with in-class initialization; not one of them is const.
I want to re-initialize an object of such a class without writing a deInit method by hand. Is it safe to do it like this?
void A::deInit()
{
new (this)A{};
}
I can't see any problem with it - object is always in valid state, this still points to the same address; but it's C++ so I want to be sure.
Similarly to the legality of delete this, placement new to this is also allowed as far as I know. Also, regarding whether this, or other pre-existing pointers / references can be used afterwards, there are a few restrictions:
[basic.life]
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
neither the original object nor the new object is a potentially-overlapping subobject ([intro.object]).
The first two are satisfied in this example, but the last two will need to be taken into consideration.
Regarding the third point, given that the function is non-const-qualified, it should be fairly safe to assume that the original object is non-const. The fault is on the caller side if the constness has been cast away. Regarding const / reference member, I think that can be checked by asserting that this is assignable:
static_assert(std::is_trivial_v<A> && std::is_copy_assignable_v<A>);
Of course, since assignability is a requirement, you could instead simply use *this = {}; which I would expect to produce the same program. A perhaps more interesting use case might be to reuse memory of *this for an object of another type (which would fail the requirements for using this, at least without reinterpreting + laundering).
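A minimal sketch of both variants, assuming a hypothetical class A with two members that have in-class initializers. It checks trivial destructibility rather than std::is_trivial, because default member initializers make the default constructor non-trivial, so std::is_trivial would reject such a class.

```cpp
#include <cassert>
#include <new>
#include <type_traits>

// Hypothetical example class; the member names are assumptions.
struct A {
    int count = 3;
    double scale = 1.5;

    // Starts a new A in the storage of *this. No destructor call is needed
    // because A is trivially destructible.
    void reinit_by_placement_new() { ::new (static_cast<void*>(this)) A{}; }

    // Same observable result through ordinary assignment.
    void reinit_by_assignment() { *this = A{}; }
};

static_assert(std::is_trivially_destructible<A>::value,
              "skipping the destructor must not lose side effects");
static_assert(std::is_copy_assignable<A>::value,
              "rules out const and reference members");
```

After either call, the members hold their in-class default values again, and the name of the object may still be used because the type is unchanged and has no const or reference members.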
Similar to delete this, placement new to this could hardly be described as "safe".
The rules that cover this are in [basic.life]/5
A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type. For an object of a class type, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression is not used to release the storage, the destructor is not implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
and [basic.life]/8
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
neither the original object nor the new object is a potentially-overlapping subobject ([intro.object]).
Since your object is trivial you don't have to worry about [basic.life]/5 and as long as you satisfy the bullet points from [basic.life]/8, then it is safe.

Is replacing `this` with a different type allowed?

In the comments and answers to this question:
Virtual function compiler optimization c++
it is argued that a virtual function call in a loop cannot be devirtualized, because the virtual function might replace this by another object using placement new, e.g.:
void A::foo() { // virtual
static_assert(sizeof(A) == sizeof(Derived));
new(this) Derived;
}
The example is from a LLVM blog article about devirtualization
Now my question is: is that allowed by the standard?
I could find this on cppreference about storage reuse: (emphasis mine)
A program is not required to call the destructor of an object to end its lifetime if the object is trivially-destructible or if the program does not rely on the side effects of the destructor. However, if a program ends the lifetime of an non-trivial object, it must ensure that a new object of the same type is constructed in-place (e.g. via placement new) before the destructor may be called implicitly
If the new object must have the same type, it must have the same virtual functions. So it is not possible to have a different virtual function, and thus, devirtualization is acceptable.
Or do I misunderstand something?
The quote you provided says:
If a program ends the lifetime of an non-trivial object, it must ensure that a new object of the same type is constructed in-place (e.g. via placement new) before the destructor may be called implicitly
The intent of this statement relates to something a bit different from what you are doing. The statement is meant to say that when you destroy an object without destroying its name, something still refers to that storage with the original type, so you need to construct a new object there so that when the implicit destruction occurs, there is a valid object to destroy. This is relevant, for example, if you have an automatic ("stack") variable and you call its destructor: you need to construct a new instance there before the destructor is called when the variable goes out of scope.
The statement as a whole, and its "of the same type" clause in particular, has no bearing on the topic you're discussing, which is whether you are allowed to construct a different polymorphic type having the same storage requirements in place of an old one. I don't know of any reason why you shouldn't be allowed to do that.
Now, that being said, the question you linked to is doing something different: it is calling a function using implicit this in a loop, and the question is whether the compiler could assume that the vptr for this will not change in that loop. I believe the compiler could (and clang -fstrict-vtable-pointers does) assume this, because this is only valid if the type is the same after the placement new.
So while the quotes from the standard you have provided are not relevant to this issue, the end result is that it does seem possible for an optimizer to devirtualize function calls made in a loop under the assumption that the type of *this (or its vptr) cannot change. The type of an object stored at an address (and its vptr) can change, but if it does, the old this is no longer valid.
It appears that you intend to use the new object using handles (pointers, references, or the original variable name) that existed prior to its recreation. That's allowed only if the instance type is not changed, plus some other conditions excluding const objects and sub-objects:
From [basic.life]:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
the original object was a most derived object of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
Your quote from the Standard is merely a consequence of this one.
Your proposed "devirtualization counter-example" does not meet these requirements, therefore all attempts to access the object after it is replaced will cause undefined behavior.
The blog post even pointed this out, in the very next sentence after the example code you looked at.
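A small sketch of what remains usable after such a replacement, assuming two hypothetical polymorphic types Shape and Triangle of equal size and alignment. Only the pointer returned by placement new is used afterwards; the old pointer is treated as dead.

```cpp
#include <cassert>
#include <new>

// Hypothetical types for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const { return 0; }
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

static_assert(sizeof(Shape) == sizeof(Triangle), "storage must fit exactly");
static_assert(alignof(Shape) == alignof(Triangle), "alignment must match");

// Destroys the object at old_obj and creates a Triangle in its storage.
// Assumption: old_obj points to a complete object whose storage can hold
// a Triangle. Callers must use only the returned pointer afterwards.
inline Shape* become_triangle(Shape* old_obj) {
    old_obj->~Shape();
    return ::new (static_cast<void*>(old_obj)) Triangle;
}
```

Because the new object has a different type, the conditions quoted from [basic.life] are not met for the old pointer, which is why the function hands back a fresh one.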

Is it legal to use placement new on initialised memory?

I am exploring the possibility of implementing true (partially) immutable data structures in C++. As C++ does not seem to distinguish between a variable and the object that variable stores, the only way to truly replace the object (without assignment operation!) is to use placement new:
auto var = Immutable(state0);
// the following is illegal as it requires assignment to
// an immutable object
var = Immutable(state1);
// however, the following would work as it constructs a new object
// in place of the old one
new (&var) Immutable(state1);
Assuming that there is no non-trivial destructor to run, is this legal in C++ or should I expect undefined behaviour? If it's standard-dependent, which is the minimal/maximal standard version where I can expect this to work?
Addendum: since it seems people still read this in 2019, a quick note — this pattern is actually legally possible in modern (post 17) C++ using std::launder().
What you wrote is technically legal but almost certainly useless.
Suppose
struct Immutable {
const int x;
Immutable(int val):x(val) {}
};
for our really simple immutable type.
auto var = Immutable(0);
::new (&var) Immutable(1);
this is perfectly legal.
And useless, because you cannot use var to refer to the state of the Immutable(1) you stored within it after the placement new. Any such access is undefined behavior.
You can do this:
auto var = Immutable(0);
auto* pvar1 = ::new (&var) Immutable(1);
and access to *pvar1 is legal. You can even do:
auto var = Immutable(0);
auto& var1 = *(::new (&var) Immutable(1));
but under no circumstance may you ever refer to var after you placement new'd over it.
Actual const data in C++ is a promise to the compiler that you'll never, ever change the value. This is in comparison to references to const or pointers to const, which is just a suggestion that you won't modify the data.
Members of structures declared const are "actually const". The compiler will presume they are never modified, and won't bother to prove it.
You creating a new instance in the spot where an old one was in effect violates this assumption.
You are permitted to do this, but you cannot use the old names or pointers to refer to it. C++ lets you shoot yourself in the foot. Go right ahead, we dare you.
This is why this technique is legal, but almost completely useless. A good optimizer with static single assignment already knows that you would stop using var at that point, and creating
auto var1 = Immutable(1);
it could very well reuse the storage.
Calling placement new on top of another variable is usually defined behaviour. It is usually a bad idea, and it is fragile.
Doing so ends the lifetime of the old object without calling the destructor. References and pointers to and the name of the old object refer to the new one if some specific assumptions hold (exact same type, no const problems).
Modifying data declared const, or a class containing const fields, results in undefined behaviour at the drop of a pin. This includes ending the lifetime of an automatic storage field declared const and creating a new object at that location. The old names and pointers and references are not safe to use.
[Basic.life 3.8]/8:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
(8.1) the storage for the new object exactly overlays the storage location which the original object occupied, and
(8.2) the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
(8.3) the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
(8.4) the original object was a most derived object (1.8) of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
In short, if your immutability is encoded via const members, using the old name or pointers to the old content is undefined behavior.
You may use the return value of placement new to refer to the new object, and nothing else.
Exception possibilities make it extremely difficult to prevent code that executes undefined behaviour or has to summarily exit.
If you want reference semantics, either use a smart pointer to a const object or an optional const object. Both handle object lifetime. The first requires heap allocation but permits move (and possibly shared references), the second permits automatic storage. Both move manual object lifetime management out of business logic. Now, both are nullable, but avoiding that robustly is difficult doing it manually anyhow.
Also consider copy on write pointers that permit logically const data with mutation for efficiency purposes.
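The smart-pointer option mentioned above might look roughly like this. It is a sketch under assumptions: ImmutableHolder is a hypothetical wrapper, and Immutable is the simple const-member struct used earlier in this answer.

```cpp
#include <cassert>
#include <memory>

struct Immutable {
    const int x;
    explicit Immutable(int val) : x(val) {}
};

// Hypothetical wrapper: the variable is rebindable, the object it owns is not.
class ImmutableHolder {
public:
    explicit ImmutableHolder(int v) : ptr_(new const Immutable(v)) {}

    // "Replacing" the value destroys the old object and allocates a new
    // one, so no storage with const members is ever reused in place.
    void replace(int v) { ptr_.reset(new const Immutable(v)); }

    int value() const { return ptr_->x; }

private:
    std::unique_ptr<const Immutable> ptr_;
};
```

The cost is a heap allocation per replacement; in exchange, every access goes through a pointer that is known to refer to the current object.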
From the C++ standard draft N4296:
3.8 Object lifetime
[...]
The lifetime of an object of type T ends when:
(1.3) if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
(1.4) the storage which the object occupies is reused or released.
[...]
4 A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type with a non-trivial destructor. For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
So yes, you can end the lifetime of an object by reusing its memory, even of one with non-trivial destructor, as long as you don't depend on the side effects of the destructor call.
This applies when you have non-const instances of objects like struct ImmutableBounds { const void* start; const void* end; }
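To make the quoted rule concrete, here is a small sketch, assuming a hypothetical Tracker type whose destructor only bumps a counter. Reusing the storage ends the old object's lifetime without running its destructor.

```cpp
#include <cassert>
#include <new>

// Hypothetical type for illustration: counts how often its destructor runs.
struct Tracker {
    static int destroyed;
    int id;
    explicit Tracker(int i) : id(i) {}
    ~Tracker() { ++destroyed; }
};
int Tracker::destroyed = 0;

// Reuses the storage of *old_obj for a new Tracker. The old object's
// lifetime ends here, but its destructor is NOT called.
inline Tracker* reuse_storage(Tracker* old_obj, int new_id) {
    return ::new (static_cast<void*>(old_obj)) Tracker(new_id);
}
```

This is well-defined only as long as nothing in the program depends on the skipped destructor's side effects. In this sketch the counter is merely observed, never relied on for correctness.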
You've actually asked 3 different questions :)
1. The contract of immutability
It's just that - a contract, not a language construct.
In Java for instance, instances of String class are immutable. But that means that all methods of the class have been designed to return new instances of class rather than modifying the instance.
So if you would like to make Java's String into a mutable object, you couldn't, without having access to its source code.
Same applies to classes written in C++, or any other language. You have an option to create a wrapper (or use a Proxy pattern), but that's it.
2. Using placement constructor and allocating into an initialized area of memory.
That's actually what they were created to do in the first place.
The most common use case for the placement constructor are memory pools - you preallocate a large memory buffer, and then you allocate your stuff into it.
So yes, it is legal, and nobody will mind.
3. Overwriting class instance's contents using a placement allocator.
Don't do that.
There's a special construct that handles this type of operation, and it's called a copy constructor.

How to initialize a std::shared_ptr from a function returning by value?

I am doing it like this:
class Something;
Something f();
...
std::shared_ptr<Something> ptr(new Something(f()));
but this doesn't feel right. Moreover it needs the copy constructor. Is there a better way?
Use std::make_shared to avoid explicitly calling new. Similarly, use std::make_unique.
make_shared might be more efficient because it can allocate the counters for the smart-pointer and the object in one block together.
Still, it does not come into its own until you have at least one more way for your statement to cause an exception after construction of the object but before it is safely ensconced in its smart-pointer. Said exceptions would otherwise cause a memory-leak.
Example for bad behaviour:
void f(std::shared_ptr<int> a, std::shared_ptr<int> b);
f(std::shared_ptr<int>(new int(0)), std::shared_ptr<int>(new int(4)));
And corrected:
f(std::make_shared<int>(0), std::make_shared<int>(4));
Now, someone advises you to return Something not by value but as a dynamically allocated pointer. For your use case, there's actually no difference with an acceptable compiler as long as Something is copyable, due to copy elision, i.e. directly constructing the returned value in the space allocated by new/make_shared/make_unique.
So, just do what you think best there.
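Applied to the question's case, the make_shared form could look like the sketch below. Something and f are stand-ins with an assumed member, and make_from_f is a hypothetical helper. Note that in this form the temporary returned by f() is moved (or copied) into the storage that make_shared allocates, so Something needs an accessible move or copy constructor.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in definitions; the member is an assumption for the example.
struct Something {
    std::string name;
};

Something f() { return Something{"made by f"}; }

// Hypothetical helper: wraps the by-value result in a shared_ptr without
// a naked new. Object and control block typically share one allocation.
std::shared_ptr<Something> make_from_f() {
    return std::make_shared<Something>(f());
}
```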
Copy elision is explicitly allowed by the standard. Just be aware the copy constructor must be accessible anyway:
12.8. Copying and moving class objects §32
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class
object, even if the copy/move constructor and/or destructor for the object have side effects. In such cases,
the implementation treats the source and target of the omitted copy/move operation as simply two different
ways of referring to the same object, and the destruction of that object occurs at the later of the times
when the two objects would have been destroyed without the optimization. This elision of copy/move
operations, called copy elision, is permitted in the following circumstances (which may be combined to
eliminate multiple copies):
— in a return statement in a function with a class return type, when the expression is the name of a
non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified
type as the function return type, the copy/move operation can be omitted by constructing
the automatic object directly into the function’s return value
— in a throw-expression, when the operand is the name of a non-volatile automatic object (other than
a function or catch-clause parameter) whose scope does not extend beyond the end of the innermost
enclosing try-block (if there is one), the copy/move operation from the operand to the exception
object (15.1) can be omitted by constructing the automatic object directly into the exception object
— when a temporary class object that has not been bound to a reference (12.2) would be copied/moved
to a class object with the same cv-unqualified type, the copy/move operation can be omitted by
constructing the temporary object directly into the target of the omitted copy/move
— when the exception-declaration of an exception handler (Clause 15) declares an object of the same type
(except for cv-qualification) as the exception object (15.1), the copy/move operation can be omitted
by treating the exception-declaration as an alias for the exception object if the meaning of the program
will be unchanged except for the execution of constructors and destructors for the object declared by
the exception-declaration.
You can use std::make_shared.
It is better to use it for the following reason:
This function typically allocates memory for the T object and for the shared_ptr's control block with a single memory allocation (it is a non-binding requirement in the Standard). In contrast, the declaration std::shared_ptr<T> p(new T(Args...)) performs at least two memory allocations, which may incur unnecessary overhead.
The better way would be to have f() return Something* (allocated with new) or shared_ptr<Something>. Otherwise, the Something returned by f() will have automatic storage and putting it in a shared_ptr doesn't make sense. You could, in theory, use a shared_ptr with a custom deleter, but that wouldn't change the storage class of the underlying object, and you'd most likely just end up with a wild pointer.
If you can't change f(), your solution of making a copy with dynamic storage is really all you can do. If you can give Something a move constructor, you could at least reduce the cost of making the copy (assuming it's expensive enough to be worth reducing).
But see this answer for why the copy isn't worth worrying about. Do whatever you think makes the code most readable.

Is invoking the destructor before the constructor has finished legal?

Suppose I have a class whose constructor spawns a thread that deletes the object:
class foo {
public:
foo()
: // initialize other data-members
, t(std::bind(&foo::self_destruct, this))
{}
private:
// other data-members
std::thread t;
// no more data-members declared after this
void self_destruct() {
// do some work, possibly involving other data-members
delete this;
}
};
The problem here is that the destructor might get invoked before the constructor has finished. Is this legal in this case? Since t is declared (and thus initialized) last, and there is no code in the constructor body, and I never intend to subclass this class, I assume that the object has been completely initialized when self_destruct is called. Is this assumption correct?
I know that the statement delete this; is legal in member-functions if this is not used after that statement. But constructors are special in several ways, so I am not sure if this works.
Also, if it is illegal, I am not sure how to work around it, other than spawning the thread in a special initialization function that must be called after construction of the object, which I really would like to avoid.
P.S.: I am looking for an answer for C++03 (I am restricted to an older compiler for this project). The std::thread in the example is just for illustration-purposes.
Firstly, we see that an object of type foo has non-trivial initialization because its constructor is non-trivial (§3.8/1):
An object is said to have non-trivial initialization if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial default constructor.
Now we see that an object of type foo's lifetime begins after the constructor ends (§3.8/1):
The lifetime of an object of type T begins when:
storage with the proper alignment and size for type T is obtained, and
if the object has non-trivial initialization, its initialization is complete.
Now, it is undefined behaviour if you do delete on the object before the end of the constructor if the type foo has a non-trivial destructor (§3.8/5):
Before the lifetime of an object has started but after the storage which the object will occupy has been allocated [...] any pointer that refers to the storage location where the object will be or was located may be used but only in limited ways. For an object under construction or destruction, see 12.7. Otherwise, [...]
So since our object is under construction, we take a look at §12.7:
Member functions, including virtual functions (10.3), can be called during construction or destruction (12.6.2).
That means that it's fine for self_destruct to be called while the object is being constructed. However, this section says nothing specifically about destroying an object while it is being constructed. So I suggest we look at the operation of the delete-expression.
First, it "will invoke the destructor (if any) for the object [...] being deleted." The destructor is a special case of member function, so it is fine to call it. However, §12.4 Destructors says nothing about whether it is well-defined when the destructor is called during construction. No luck here.
Second, "the delete-expression will call a deallocation function" and "the deallocation function shall deallocate the storage referenced by the pointer". Once again, nothing is said about doing this to storage that is currently being used by an object under construction.
So I argue that this is undefined behaviour by the fact that the standard hasn't defined it very precisely.
Just to note: the lifetime of an object of type foo ends when the destructor call starts, because it has a non-trivial destructor. So if delete this; occurs before the end of the object's construction, its lifetime ends before it starts. This is playing with fire.
I daresay it is well-defined to be illegal (though it might obviously still work with some compilers).
This is somewhat the same situation as "destructor not called when exception is thrown from constructor".
A delete-expression, according to the standard, destroys a most derived object (1.8) or array created by a new-expression (5.3.2). Before the end of the constructor, an object is not a most derived object, but an object of its direct ancestor's type.
Your class foo has no base class, so there is no ancestor, this therefore has no type and your object is not really an object at all at the time delete is called. But even if there was a base class, the object would be a not-most-derived object (still rendering it illegal), and the wrong constructor would be called.
Formally the object doesn't exist until the constructor has finished successfully. Part of the reason is that the constructor might be called from a derived class' constructor. In that case you certainly don't want to destroy the constructed sub-object via an explicit destructor call, and even less invoke UB by calling delete this on a (part of a) not completely constructed object.
Standardese about the object existence, emphasis added:
C++11 §3.8/1:
The lifetime of an object is a runtime property of the object. An object is said to have non-trivial initialization
if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial
default constructor. [Note: initialization by a trivial copy/move constructor is non-trivial initialization. —end note ] The lifetime of an object of type T begins when:
— storage with the proper alignment and size for type T is obtained, and
— if the object has non-trivial initialization, its initialization is complete.
The constructor in this case is non-trivial just by being user-provided.
delete this; works correctly in practice on most platforms; some may even guarantee correct behavior as a platform-specific extension. But IIRC it isn't well-defined according to the Standard.
The behavior you're relying on is that it's often possible to call a non-virtual non-static member function on a dead object, as long as that member function doesn't actually access this. But this behavior is not allowed by the Standard; it is at best non-portable.
Section 3.8p6 of the Standard makes it undefined behavior if an object isn't live during a call to a non-static member function:
Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. For an object under construction or destruction, see 12.7. Otherwise, such a glvalue refers to allocated storage, and using the properties of the glvalue that do not depend on its value is well-defined. The program has undefined behavior if:
an lvalue-to-rvalue conversion is applied to such a glvalue,
the glvalue is used to access a non-static data member or call a non-static member function of the object, or
the glvalue is implicitly converted to a reference to a base class type, or
the glvalue is used as the operand of a static_cast except when the conversion is ultimately to cv char& or cv unsigned char&, or
the glvalue is used as the operand of a dynamic_cast or as the operand of typeid.
For this specific case (deleting an object under construction), we find in section 5.3.5p2:
... In the first alternative (delete object), the value of the operand of delete may be a null pointer value, a pointer to a non-array object created by a previous new-expression, or a pointer to a subobject representing a base class of such an object (Clause 10). If not, the behavior is undefined. In the second alternative (delete array), the value of the operand of delete may be a null pointer value or a pointer value that resulted from a previous array new-expression. If not, the behavior is undefined.
This requirement is not met. *this is not an object created, past tense, by a new-expression. It is an object being created (present progressive). And this interpretation is supported by the array case, where the pointer must be the result of a previous new-expression... but the new-expression is not yet completely evaluated; it is not previous and it has no result yet.
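One way to sidestep the problem is the two-phase approach the question mentions, hidden behind a factory so callers cannot forget the second step. The sketch below is an assumption-laden illustration: SelfDestructing, launch and run are hypothetical names, and a direct call to run() stands in for starting the thread so that the example stays free of threading code.

```cpp
#include <cassert>

// Hypothetical class for illustration only.
class SelfDestructing {
public:
    static int destroyed;  // counts completed destructions

    // Factory: the new-expression has finished, so the object's lifetime
    // has begun, before any code that may delete the object is started.
    static void launch() {
        SelfDestructing* obj = new SelfDestructing;
        obj->run();  // stand-in for starting the worker thread
        // obj must not be touched past this point: it may be gone already.
    }

private:
    SelfDestructing() {}
    ~SelfDestructing() { ++destroyed; }

    void run() {
        // ... do some work with the fully constructed object ...
        delete this;  // the operand now points to an object created by a
                      // previous, completed new-expression
    }
};
int SelfDestructing::destroyed = 0;
```

Making the constructor and destructor private keeps callers from creating instances with automatic storage, for which delete this would be wrong.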