Undefined behaviour on reinitializing object via placement new on this pointer - c++

I saw a CppCon presentation by Piotr Padlewski saying that the following is undefined behaviour:
int test(Base* a){
int sum = 0;
sum += a->foo();
sum += a->foo();
return sum;
}
int Base::foo(){
new (this) Derived;
return 1;
}
Note: Assume sizeof(Base) == sizeof(Derived) and foo is virtual.
Obviously this is bad, but I'm interested in WHY it is UB. I do understand the UB on accessing a realloced pointer, but he says that this is the same.
Related questions: Is `new (this) MyClass();` undefined behaviour after directly calling the destructor? where it says "ok if no exceptions"
Is it valid to directly call a (virtual) destructor? Where it says new (this) MyClass(); results in UB. (contrary to the above question)
C++ Is constructing object twice using placement new undefined behaviour? it says:
A program may end the lifetime of any object by reusing the storage
which the object occupies or by explicitly calling the destructor for
an object of a class type with a non-trivial destructor. For an object
of a class type with a non-trivial destructor, the program is not
required to call the destructor explicitly before the storage which
the object occupies is reused or released; however, if there is no
explicit call to the destructor or if a delete-expression (5.3.5) is
not used to release the storage, the destructor shall not be
implicitly called and any program that depends on the side effects
produced by the destructor has undefined behavior.
which again sounds like it is ok.
I found another description of the placement new in Placement new and assignment of class with const member
If, after the lifetime of an object has ended and before the storage
which the object occupied is reused or released, a new object is
created at the storage location which the original object occupied, a
pointer that pointed to the original object, a reference that referred
to the original object, or the name of the original object will
automatically refer to the new object and, once the lifetime of the
new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is
const-qualified or a reference type, and
the original object was a most derived object of type T and the new object is a most derived object of type T (that is, they are not
base class subobjects).
This seems to explain the UB. But is it really true?
Doesn't this mean that I could not have a std::vector<Base>? Because I assume due to its pre-allocation std::vector must rely on placement-news and explicit ctors. And point 4 requires it to be the most-derived type which Base clearly isn't.

I believe Elizabeth Barrett Browning said it best. Let me count the ways.
If Base isn't trivially destructible, we're failing to clean up resources.
If sizeof(Derived) is larger than the size of the dynamic type of this, we're going to clobber other memory.
If Base isn't the first subobject of Derived, then the storage for the new object won't exactly overlay the original storage, and you'd also end up clobbering other memory.
If Derived is just a different type from the initial dynamic type, even if it's the same size, then the object that we're calling foo() on cannot be used to refer to the new object. The same is true if any of the members of Base or Derived are const qualified or are references. You'd need to std::launder any external pointers/references.
However, if sizeof(Base) == sizeof(Derived), and Derived is trivially destructible, Base is the first subobject of Derived, and you only actually have Derived objects... this is fine.
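A minimal sketch of one way to stay on the safe side when the new object has a different type: call through the pointer returned by placement new rather than through the old pointer. The `Base` and `Derived` definitions and the `replace_and_call` helper are hypothetical stand-ins, not the classes from the talk.

```cpp
#include <cassert>
#include <new>

struct Base {
    virtual int foo() { return 1; }
    virtual ~Base() = default;
};

struct Derived : Base {
    int foo() override { return 2; }
};

// Hypothetical helper: build a Base, replace it with a Derived in the same
// storage, and call foo() only through the pointer returned by placement new.
int replace_and_call() {
    // Storage that is large and aligned enough for either type.
    alignas(Derived) unsigned char storage[sizeof(Derived)];

    Base* old_ptr = new (storage) Base;
    int sum = old_ptr->foo();               // the Base object is alive here

    old_ptr->~Base();                       // end the Base object's lifetime
    Derived* fresh = new (storage) Derived; // an object of another type now lives here

    // old_ptr is deliberately not used from here on: the new object is not
    // of the same type as the original, so old_ptr does not refer to it.
    sum += fresh->foo();

    fresh->~Derived();
    return sum;
}
```

Here the call sequence yields 1 from the Base object and 2 from the Derived object, and no pointer to the dead object is ever dereferenced.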

Regarding your question
...Because I assume due to its pre-allocation std::vector must rely on
placement-news and explicit ctors. And point 4 requires it to be the
most-derived type which Base clearly isn't.
I think the misunderstanding comes from the term "most derived object" or "most derived type":
The "most derived type" of an object of class type is the class with which the object was instantiated, regardless of whether this class has further subclasses or not. Consider the following program:
#include <iostream>
struct A {
virtual void foo() { std::cout << "A" << std::endl; }
};
struct B : public A {
virtual void foo() { std::cout << "B" << std::endl; }
};
struct C : public B {
virtual void foo() { std::cout << "C" << std::endl; }
};
int main() {
B b; // b is-a B, but it also is-an A (A is a base class subobject of b).
// The most derived class of b is, however, B, and not A and not C.
}
When you now create a vector<B>, then the elements of this vector will be instances of class B, and so the most derived type of the elements will always be B, and not C (or Derived) in your case.
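To make this concrete, here is a small sketch (the `name()` member and the `element_name` helper are made up for illustration) showing that an element of a `std::vector<B>` has dynamic type B even when it was copied from a C:

```cpp
#include <cassert>
#include <string>
#include <typeinfo>
#include <vector>

struct A {
    virtual std::string name() const { return "A"; }
    virtual ~A() = default;
};
struct B : A {
    std::string name() const override { return "B"; }
};
struct C : B {
    std::string name() const override { return "C"; }
};

// Hypothetical helper: push a C into a vector<B> and report what the
// stored element says it is.
std::string element_name() {
    std::vector<B> v;
    C c;
    v.push_back(c);  // copy-constructs a B from the B subobject of c (slicing)

    // The element is a complete object of type B, so B is its most derived type.
    assert(typeid(v[0]) == typeid(B));
    return v[0].name();
}
```

The stored element is a complete B object, so the "most derived object of type T" condition holds with T = B.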
Hope this sheds some light on it.

Related

Reusing data member storage via placement new during enclosing object's lifetime

This is a follow-up to my previous question where I seem to have made the problem more involved than I had originally intended. (See discussions in question and answer comments there.)
This question is a slight modification of the original question removing the issue of special rules during construction/destruction of the enclosing object.
Is it allowed to reuse storage of a non-static data member during the lifetime of its enclosing object and if so under what conditions?
Consider the program
#include<new>
#include<type_traits>
using T = /*some type*/;
using U = /*some type*/;
static_assert(std::is_object_v<T>);
static_assert(std::is_object_v<U>);
static_assert(sizeof(U) <= sizeof(T));
static_assert(alignof(U) <= alignof(T));
struct A {
T t /*initializer*/;
U* u;
void construct() {
t.~T();
u = ::new(static_cast<void*>(&t)) U /*initializer*/;
}
void destruct() {
u->~U();
::new(static_cast<void*>(&t)) T /*initializer*/;
}
A() = default;
A(const A&) = delete;
A(A&&) = delete;
A& operator=(const A&) = delete;
A& operator=(A&&) = delete;
};
int main() {
auto a = new A;
a->construct();
*(a->u) = /*some assignment*/;
a->destruct(); /*optional*/
delete a; /*optional*/
A b; /*alternative*/
b.construct(); /*alternative*/
*(b.u) = /*some assignment*/; /*alternative*/
b.destruct(); /*alternative*/
}
Aside from the static_asserts assume that the initializers, destructors and assignments of T and U do not throw.
What conditions do object types T and U need to satisfy additionally, so that the program has defined behavior, if any?
Does it depend on the destructor of A actually being called (e.g. on whether the /*optional*/ or /*alternative*/ lines are present)?
Does it depend on the storage duration of A, e.g. whether /*alternative*/ lines in main are used instead?
Note that the program does not use the t member after the placement-new, except in the destructor and the destruct function. Of course using it while its storage is occupied by a different type is not allowed.
Also note that the program constructs an object of the original type in t before its destructor is called in all execution paths since I disallowed T and U to throw exceptions.
Please also note that I do not encourage anyone to write code like that. My intention is to understand details of the language better. In particular I did not find anything forbidding such placement-news as long as the destructor is not called, at least.
If a is destroyed (whether by delete or by falling out of scope), then t.~T() is called, which is UB if t isn't actually a T (by not calling destruct).
This doesn't apply if
the destructor of T is trivial, or
for delete U is derived from T, or
you're using a destroying delete
After destruct is called you are not allowed to use t if T has const or reference members (until C++20).
Apart from that there is no restriction on what you do with the class as written as far as I can see.
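As an illustration only, here is one concrete instantiation of the program from the question, assuming T = std::string (non-trivial destructor, no const or reference members in mainstream implementations) and U = int. The initial values and the `round_trip` helper are made up. It follows the reading given above: destruct() runs before the A object is destroyed, so a T occupies the storage when the destructor of A runs.

```cpp
#include <cassert>
#include <new>
#include <string>

using T = std::string;  // assumption: a type with a non-trivial destructor
using U = int;          // assumption: a small trivial type
static_assert(sizeof(U) <= sizeof(T), "U must fit into the storage of T");
static_assert(alignof(U) <= alignof(T), "U must not need stricter alignment");

struct A {
    T t{"hello"};
    U* u = nullptr;

    void construct() {
        t.~T();                                     // end the lifetime of the T
        u = ::new (static_cast<void*>(&t)) U(0);    // reuse its storage for a U
    }
    void destruct() {
        u->~U();                                    // pseudo-destructor call, no effect for int
        ::new (static_cast<void*>(&t)) T("restored");  // put a T back
        u = nullptr;
    }
};

// Hypothetical driver: use the U while it lives in t's storage, then restore
// a T before the automatic object `a` goes out of scope.
int round_trip() {
    A a;
    a.construct();
    *a.u = 42;
    int seen = *a.u;
    a.destruct();  // needed here because ~T is non-trivial and ~A will run it
    return seen;
}
```

Skipping the call to destruct() in this instantiation would let the destructor of A run the std::string destructor on storage that holds an int, which is the undefined behavior described above.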
This answer is based on the draft available at http://eel.is/c++draft/
We can try to apply (by checking each condition) what I've decided to call the "undead object" clause to any previous object that used to exist, here we apply it to the member t of type T:
Lifetime [basic.life]/8
If, after the lifetime of an object has ended and before the storage
which the object occupied is reused or released, a new object is
created at the storage location which the original object occupied, a
pointer that pointed to the original object, a reference that referred
to the original object, or the name of the original object will
automatically refer to the new object and, once the lifetime of the
new object has started, can be used to manipulate the new object, if:
(8.1) the storage for the new object exactly overlays the storage
location which the original object occupied, and
(8.2) the new object is of the same type as the original object
(ignoring the top-level cv-qualifiers), and
(8.3) the original object is neither a complete object that is
const-qualified nor a subobject of such an object, and
(8.4) neither the original object nor the new object is a
potentially-overlapping subobject ([intro.object]).
Conditions 1 and 2 are automatically guaranteed by the use of placement new on the old member:
struct A {
T t /*initializer*/; (...)
void destruct() { (...)
::new(static_cast<void*>(&t)) T /*initializer*/;
}
The location is the same and the type is the same. Both conditions are easily verified.
Neither A objects created:
auto a = new A;
...
A b; /*alternative*/
are const qualified complete objects so t isn't a member of a const qualified complete object. Condition 3 is met.
Now the definition of potentially-overlapping is in Object model [intro.object]/7:
A potentially-overlapping subobject is either:
(7.1) a base class subobject, or
(7.2) a non-static data member declared with the no_unique_address
attribute.
The t member is neither and condition 4 is met.
All 4 conditions are met so the member name t can be used to name the new object.
[Note that at no point did I even mention whether the subobject is a const member, or whether its own subobjects are. That isn't part of the latest draft.
It means that a const sub object can legally have its value changed, and a reference member can have its referent changed for an existing object. This is more than unsettling and probably not supported by many compilers. End note.]

Calling methods of unconstructed objects: Legal?

If memory is set aside for an object (e.g., through a union) but the constructor has not yet been called, is it legal to call one of the object's non-static methods, assuming the method does not depend on the value of any member variables?
I researched a bit and found some information about "variant members" but I couldn't find info pertaining to this example.
#include <cstdio>
class D {
public:
D() { printf("D constructor!\n"); }
int a = 123;
void print() const {
// %p expects a pointer to void, so cast explicitly.
printf("Pointer: %p\n", static_cast<const void*>(&a));
}
};
class C {
public:
C() {};
union {
D memory;
};
};
int main() {
C c;
c.memory.print();
}
In this example, I'm calling print() without the constructor ever being called. The intent is to later call the constructor, but even before the constructor is called, we know where variable a will reside. Obviously the value of a is uninitialized at this point, but print() doesn't care about the value.
This seems to work as expected when compiling with gcc and clang for c++11. But I'm wondering if I'm invoking some illegal or undefined behavior here.
I believe this is undefined behavior. Your variant member C::memory has not been initialized because the constructor of C does not provide an initializer [class.base.init]/9.2. Therefore, the lifetime of c.memory has not begun at the point where you call the method print() [basic.life]/1. Based on [basic.life]/7.2:
Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. […] The program has undefined behavior if:
[…]
the glvalue is used to call a non-static member function of the object, or
[…]
emphasis mine
Note: I am referring to the current C++ standard draft above, however, the relevant wording is basically the same for C++11 except that, in C++11, the fact that D has non-trivial initialization is crucial as what you're doing may otherwise potentially be OK in C++11…
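For contrast, a minimal sketch of the ordering that avoids the problem: start the lifetime of the union member with placement new first, and only then call member functions. The `value()` accessor and the `construct_then_call` helper are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <new>

struct D {
    int a = 123;                       // makes D's default constructor non-trivial
    int value() const { return a; }    // hypothetical accessor for the demo
};

struct C {
    C() {}                 // provides no initializer for `memory`
    union {
        D memory;          // storage exists, but no D object lives here yet
    };
};

// Hypothetical helper: construct the member first, then call into it.
int construct_then_call() {
    C c;
    // Taking the address is allowed before the lifetime starts; calling a
    // non-static member function is not.
    D* d = ::new (static_cast<void*>(&c.memory)) D;  // lifetime of the D starts here
    int v = d->value();                               // the D object is alive
    d->~D();                                          // trivial, shown for symmetry
    return v;
}
```

The member function call goes through the pointer returned by placement new, so it always refers to an object whose lifetime has begun.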

Manually constructing a trivial base class via placement-new

Beware, we're skirting the dragon's lair.
Consider the following two classes:
struct Base {
std::string const *str;
};
struct Foo : Base {
Foo() { std::cout << *str << "\n"; }
};
As you can see, I'm accessing an uninitialized pointer. Or am I?
Let's assume I'm only working with Base classes that are trivial, nothing more than (potentially nested) bags of pointers.
static_assert(std::is_trivial<Base>{}, "!");
I would like to construct Foo in three steps:
Allocate raw storage for a Foo
Initialize a suitably-placed Base subobject via placement-new
Construct Foo via placement-new.
My implementation is as follows:
std::unique_ptr<Foo> makeFooWithBase(std::string const &str) {
static_assert(std::is_trivial<Base>{}, "!");
// (1)
auto storage = std::make_unique<
std::aligned_storage_t<sizeof(Foo), alignof(Foo)>
>();
Foo * const object = reinterpret_cast<Foo *>(storage.get());
Base * const base = object;
// (2)
new (base) Base{&str};
// (3)
new (object) Foo();
storage.release();
return std::unique_ptr<Foo>{object};
}
Since Base is trivial, my understanding is that:
Skipping the trivial destructor of the Base constructed at (2) is fine;
The trivial default constructor of the Base subobject constructed as part of the Foo at (3) does nothing;
And so Foo receives an initialized pointer, and all is well.
Of course, this is what happens in practice, even at -O3 (see for yourself!).
But is it safe, or will the dragon snatch and eat me one day?
This seems to be explicitly disallowed by the standard.
Ending an object's lifetime, and starting a new object's
lifetime in the same location is explicitly allowed,
unless it's a base class:
§3.8 Object Lifetime
§3.8.7 - If, after the lifetime of an object has ended and before the storage
which the object occupied is reused or released, a new object is
created at the storage location which the original object occupied, a
pointer that pointed to the original object, a reference that referred
to the original object, or the name of the original object will
automatically refer to the new object and, once the lifetime of the
new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
[snip] and
the original object was a most derived object (1.8) of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
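One way to get the same effect without touching the lifetime rules is to hand the Base value to Foo through a constructor. This is only a sketch: the `Foo(Base const&)` constructor, the `text()` accessor and the `demo_text` helper are hypothetical and not part of the question's code, and it assumes Foo may be changed.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Base {
    std::string const* str;
};

struct Foo : Base {
    // Hypothetical constructor: initializes the Base subobject the regular way.
    explicit Foo(Base const& b) : Base(b) {}
    std::string const& text() const { return *str; }
};

// Same name as in the question, but no raw storage and no placement new.
std::unique_ptr<Foo> makeFooWithBase(std::string const& str) {
    return std::make_unique<Foo>(Base{&str});
}

// Hypothetical usage: the string must outlive the Foo that points at it.
std::string demo_text() {
    std::string s = "hello";
    std::unique_ptr<Foo> foo = makeFooWithBase(s);
    return foo->text();
}
```

The Base subobject is initialized as part of constructing Foo, so no object is ever created on top of a base class subobject.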

Is it allowed to call destructor explicitly followed by placement new on a variable with fixed lifetime?

I know that calling a destructor explicitly can lead to undefined behavior because the destructor ends up being called twice, like here:
#include <vector>
int main() {
std::vector<int> foo(10);
foo.~vector<int>();
return 0; // Oops, destructor will be called again on return, double-free.
}
But, what if we call placement new to "resurrect" the object?
#include <vector>
int main() {
std::vector<int> foo(10);
foo.~vector<int>();
new (&foo) std::vector<int>(5);
return 0;
}
More formally:
What will happen in C++ (I'm interested in both C++03 and C++11, if there is a difference) if I explicitly call a destructor on some object which was not constructed with placement new in the first place (e.g. it's either local/global variable or was allocated with new) and then, before this object is destructed, call placement new on it to "restore" it?
If it's ok, is it guaranteed that all non-const references to that object will also be ok, as long as I don't use them while the object is "dead"?
If so, is it ok to use one of non-const references for placement new to resurrect the object?
What about const references?
Example usecase (though this question is more about curiosity): I want to "re-assign" an object which does not have operator=.
I've seen this question which says that "overriding" object which has non-static const members is illegal. So, let's limit scope of this question to objects which do not have any const members.
First, [basic.life]/8 clearly states that any pointers or references to the original foo shall refer to the new object you construct at foo in your case. In addition, the name foo will refer to the new object constructed there (also [basic.life]/8).
Second, you must ensure that there is an object of the original type in the storage used for foo before exiting its scope; so if anything throws, you must catch it and terminate your program ([basic.life]/9).
Overall, this idea is often tempting, but almost always a horrible idea.
(8) If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
(8.1) the storage for the new object exactly overlays the storage location which the original object occupied, and
(8.2) the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
(8.3) the type of the original object is not const-qualified, and, if a class type, does not contain any non-static
data member whose type is const-qualified or a reference type, and
(8.4) the original object was a most derived object (1.8) of type
T and the new object is a most derived
object of type T (that is, they are not base class subobjects).
(9) If a program ends the lifetime of an object of type T with static (3.7.1), thread (3.7.2), or automatic (3.7.3) storage duration and if T has a non-trivial destructor, the program must ensure that an object of the
original type occupies that same storage location when the implicit destructor call takes place; otherwise the behavior of the program is undefined. This is true even if the block is exited with an exception.
There are reasons to manually run destructors and do placement new. Something as simple as operator= is not one of them, unless you are writing your own variant/any/vector or similar type.
If you really, really want to reassign an object, find a std::optional implementation, and create/destroy objects using that; it is careful, and you almost certainly won't be careful enough.
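A minimal sketch of that suggestion, assuming C++17's `std::optional` is available. The `NoAssign` type and the `reassign_with_optional` helper are made up for illustration.

```cpp
#include <cassert>
#include <optional>

// Hypothetical type that cannot be assigned.
struct NoAssign {
    int value;
    explicit NoAssign(int v) : value(v) {}
    NoAssign& operator=(const NoAssign&) = delete;
};

// "Re-assign" by letting std::optional destroy the old object and construct
// a new one in its own storage.
int reassign_with_optional() {
    std::optional<NoAssign> slot;
    slot.emplace(10);   // first object
    slot.emplace(5);    // destroys the first object, then constructs the second
    return slot->value;
}
```

If the constructor invoked by emplace throws, the optional is left empty, so its destructor will not destroy an object a second time.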
This is not a good idea, because you can still end up running the destructor twice if the constructor of the new object throws an exception. That is, the destructor will always run at the end of the scope, even if you leave the scope exceptionally.
Here is a sample program that exhibits this behavior:
#include <iostream>
#include <stdexcept>
using namespace std;
struct Foo
{
Foo(bool should_throw) {
if(should_throw)
throw std::logic_error("Constructor failed");
cout << "Constructed at " << this << endl;
}
~Foo() {
cout << "Destroyed at " << this << endl;
}
};
void double_free_anyway()
{
Foo f(false);
f.~Foo();
// This constructor will throw, so the object is not considered constructed.
new (&f) Foo(true);
// The compiler re-destroys the old value at the end of the scope.
}
int main() {
try {
double_free_anyway();
} catch(std::logic_error& e) {
cout << "Error: " << e.what();
}
}
This prints:
Constructed at 0x7fff41ebf03f
Destroyed at 0x7fff41ebf03f
Destroyed at 0x7fff41ebf03f
Error: Constructor failed
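If the pattern is used anyway, one possible guard is to reject at compile time any constructor that may throw. The sketch below is an assumption-laden illustration: `rebuild`, `Point` and `rebuild_demo` are hypothetical names, and the caller is still responsible for the other [basic.life]/8 conditions (same type, complete non-const object).

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical helper: destroy obj and construct a new T in the same storage,
// but only when the selected constructor is declared not to throw.
template <class T, class... Args>
void rebuild(T& obj, Args&&... args) noexcept {
    static_assert(std::is_nothrow_constructible<T, Args&&...>::value,
                  "rebuild requires a constructor that cannot throw");
    obj.~T();
    ::new (static_cast<void*>(std::addressof(obj))) T(std::forward<Args>(args)...);
}

// Hypothetical type with a noexcept constructor and no const members.
struct Point {
    int x;
    explicit Point(int x_) noexcept : x(x_) {}
};

int rebuild_demo() {
    Point p(1);
    rebuild(p, 2);   // same type, same storage, constructor cannot throw
    return p.x;      // the name p refers to the new object
}
```

With this guard, a call such as `rebuild(f, true)` on the Foo type above would fail to compile, because Foo's constructor is not noexcept.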

Does reuse storage start lifetime of a new object? [duplicate]

#include <cstdlib>
struct B {
virtual void f();
void mutate();
virtual ~B();
};
struct D1 : B { void f(); };
struct D2 : B { void f(); };
void B::mutate() {
new (this) D2; // reuses storage — ends the lifetime of *this
f(); // undefined behavior - WHY????
... = this; // OK, this points to valid memory
}
Can someone explain why the f() invocation is UB? new (this) D2; reuses the storage, but it also calls a constructor for D2 and therefore starts the lifetime of a new object. In that case f() is equivalent to this->f(), that is, we just call the f() member function of D2. Why is it UB?
The standard shows this example § 3.8 67 N3690:
struct C {
int i;
void f();
const C& operator=( const C& );
};
const C& C::operator=( const C& other) {
if ( this != &other ) {
this->~C(); // lifetime of *this ends
new (this) C(other); // new object of type C created
f(); // well-defined
}
return *this;
}
C c1;
C c2;
c1 = c2; // well-defined
c1.f(); // well-defined; c1 refers to a new object of type C
Notice that this example is terminating the lifetime of the object before constructing the new object in-place (compare to your code, which does not call the destructor).
But even if you did, the standard also says:
If, after the lifetime of an object has ended and before the storage
which the object occupied is reused or released, a new object is
created at the storage location which the original object occupied, a
pointer that pointed to the original object, a reference that referred
to the original object, or the name of the original object will
automatically refer to the new object and, once the lifetime of the
new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
the original object was a most derived object (1.8) of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
Notice the 'and's: all of the above conditions must be fulfilled.
Since you're not fulfilling all the conditions (you are constructing a derived object in place in the memory of a base class object), you have undefined behavior when referencing anything through an implicit or explicit use of the this pointer.
Depending on the compiler implementation this might or might not blow up. A base class object with virtual functions reserves some space for the vtable pointer, and in-place constructing an object of a derived type which overrides some of the virtual functions means the vtable might be different. Add alignment issues and other low-level internals, and a simple sizeof won't suffice to determine whether your code is right or not.
This construct is very interesting:
The placement-new is not guaranteed to call the destructor of the object. So this code will not properly ensure end of life of the object.
So in principle you should call the destructor before reusing the object. But then you would continue to execute a member function of an object that is dead. According to section 9.3.1/2 of the standard: if a non-static member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined.
If you don't explicitly destroy your object, as is the case in your code, you then create a new object on top of it (constructing a second B without destroying the first one, then D2 on top of this new B).
When the creation of your new object is finished, the identity of your current object has in fact changed while executing the function. You cannot be sure if the pointer to the virtual function that will be called was read before your placement-new (thus the old pointer to D1::f) or after (thus D2::f).
By the way, it's exactly for this reason that there are some constraints on what you can or can't do in a union, where the same memory location is shared by different objects (see point 9.5/2 and particularly point 9.5/4 in the standard).