Swapping storage buffers containing placement new created objects - c++

I recently saw a piece of code which used storage buffers to create objects and then simply swapped the buffers in order to avoid the copying overhead. Here is a simple example using integers:
std::aligned_storage_t<sizeof(int), alignof(int)> storage1;
std::aligned_storage_t<sizeof(int), alignof(int)> storage2;
new (&storage1) int(1);
new (&storage2) int(2);
std::swap(storage1, storage2);
int i1 = reinterpret_cast<int&>(storage1);
int i2 = reinterpret_cast<int&>(storage2);
//this prints 2 1
std::cout << i1 << " " << i2 << std::endl;
This feels like undefined behaviour in the general case (specifically swapping the buffers and then accessing the objects as if they were still there) but I am not sure what the standard says about such usage of storage and placement new. Any feedback is much appreciated!

I suspect there are a few factors rendering this undefined, but we only need one:
[C++11: 3.8/1]: [..] The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
the storage which the object occupies is reused or released.
All subsequent use is use after end-of-life, which is bad and wrong.
The key is that each buffer is being reused.
So, although I would expect this to work in practice at least for trivial types (and for some classes), it's undefined.
The following may have been able to save you:
[C++11: 3.8/7]: If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object [..]
…except that you are not creating a new object.
It may or may not be worth noting here that, surprisingly, the ensuing implicit destructor calls are both well-defined:
[C++11: 3.8/8]: If a program ends the lifetime of an object of type T with static (3.7.1), thread (3.7.2), or automatic (3.7.3) storage duration and if T has a non-trivial destructor, the program must ensure that an object of the original type occupies that same storage location when the implicit destructor call takes place; otherwise the behavior of the program is undefined.

Related

Is accessing memory after a destructor call undefined behavior?

I'm wondering if the following is undefined?
int main()
{
struct Doggy { int a; ~Doggy() {} };
Doggy* p = new Doggy[100];
p[50].~Doggy();
p[50].a = 3; // Is this not allowed? The destructor was called on an
// object occupying that area of memory.
// Can I access it safely?
if (p[50].a == 3);
}
I guess this is generally good to know, but the reason I'm specifically wanting to know is that I have a data structure consisting of an array, where the buckets can be nullable by setting a value, kind of like buckets in a hash table array. And when the bucket is emptied the destructor is called, but then checking and setting the null state after the destructor is called I'm wondering if it's illegal.
To elaborate a little, say I have an array of objects and each object can be made to represent null in each bucket, such as:
struct Handle
{
int value = 0; // Zero is null value
~Handle(){}
};
int main()
{
Handle* p = new Handle[100];
// Remove object 50
p[50].~Handle();
p[50].value = 0; // Set to null
if (p[50].value == 0) ; // Then it's null, can I count on this?
// Is this defined? I'm accessing memory that was occupied by
// object that was destroyed.
}
Yes it'll be UB:
[class.dtor/19]
Once a destructor is invoked for an object, the object's lifetime ends; the behavior is undefined if the destructor is invoked for an object whose lifetime has ended ([basic.life]).
[Example 2: If the destructor for an object with automatic storage duration is explicitly invoked, and the block is subsequently left in a manner that would ordinarily invoke implicit destruction of the object, the behavior is undefined. — end example]
p[50].~Handle(); and later delete[] p; will make it call the destructor for an object whose lifetime has ended.
For p[50].value = 0; after the lifetime of the object has ended, this applies:
[basic.life/6]
Before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any pointer that represents the address of the storage location where the object will be or was located may be used but only in limited ways. For an object under construction or destruction, see [class.cdtor]. Otherwise, such a pointer refers to allocated storage ([basic.stc.dynamic.allocation]), and using the pointer as if the pointer were of type void* is well-defined. Indirection through such a pointer is permitted but the resulting lvalue may only be used in limited ways, as described below. The program has undefined behavior if:
6.2 - the pointer is used to access a non-static data member or call a non-static member function of the object
Yes, it's mostly. Handle::value is just an offset to a pointer of type Handle, so it's just going to work wherever you point it to, even if the containing object isn't currently constructed. If you were to use anything with virtual keyword, this would end up broken though.
p[50].~Handle(); this however is a different beast. You should never invoke destructors manually unless you have also explicitly invoked the constructor with placement new. Still not illegal, but dangerous.
delete[] p; (omitted in your example!) is where you end up with double-destruction, at which point you are well beyond UB, straight up in the "it's broken" domain.

Storage reuse in C++

I have been trying to understand storage reuse in C++. Imagine we have an object a with a non-trivial destructor whose storage is reused with a placement new-expression:
struct A {
~A() { std::cout << "~A()" << std::endl; }
};
struct B: A {};
A* a = new A; // lifetime of *a begins
A* b = new(a) B; // storage reuse, lifetime of *b begins
[basic.life/8] specifies:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if the original object is transparently replaceable (see below) by the new object.
Since in my example the lifetime of *a has not ended when we reuse the storage it occupies, we cannot apply that rule. So what rule describes the behavior in my case?
The applicable rule for this is laid out in §3.8 [basic.life]/p1 and 4:
The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
the storage which the object occupies is reused or released.
4 A program may end the lifetime of any object by reusing the storage
which the object occupies or by explicitly calling the destructor for
an object of a class type with a non-trivial destructor. For an object
of a class type with a non-trivial destructor, the program is not
required to call the destructor explicitly before the storage which
the object occupies is reused or released; however, if there is no
explicit call to the destructor or if a delete-expression (5.3.5) is
not used to release the storage, the destructor shall not be
implicitly called and any program that depends on the side effects
produced by the destructor has undefined behavior.
So A *b = new (a) B; reuses the storage of the A object created in the previous statement, which is well-defined behavior provided that sizeof(A) >= sizeof(B)*. That A object's lifetime has ended by virtue of its storage being reused. A's destructor is not called for that object, and if your program depends on the side effect produced by that destructor, it has undefined behavior.
The paragraph you cited, §3.8 [basic.life]/p7, governs when a pointer/reference to the original object can be reused. Since this code doesn't satisfy the criteria listed in that paragraph, you may only use a only in the limited ways permitted by §3.8 [basic.life]/p5-6, or undefined behavior results (example and footnote omitted):
5 Before the lifetime of an object has started but after the storage
which the object will occupy has been allocated or, after the lifetime
of an object has ended and before the storage which the object
occupied is reused or released, any pointer that refers to the storage
location where the object will be or was located may be used but only
in limited ways. For an object under construction or destruction, see
12.7. Otherwise, such a pointer refers to allocated storage (3.7.4.2), and using the pointer as if the pointer were of type void*, is
well-defined. Such a pointer may be dereferenced but the resulting
lvalue may only be used in limited ways, as described below. The
program has undefined behavior if:
the object will be or was of a class type with a non-trivial destructor and the pointer is used as the operand of a
delete-expression,
the pointer is used to access a non-static data member or call a non-static member function of the object, or
the pointer is implicitly converted (4.10) to a pointer to a base class type, or
the pointer is used as the operand of a static_cast (5.2.9) (except when the conversion is to void*, or to void* and
subsequently to char*, or unsigned char*), or
the pointer is used as the operand of a dynamic_cast (5.2.7).
6 Similarly, before the lifetime of an object has started but after
the storage which the object will occupy has been allocated or, after
the lifetime of an object has ended and before the storage which the
object occupied is reused or released, any glvalue that refers to the
original object may be used but only in limited ways. For an object
under construction or destruction, see 12.7. Otherwise, such a glvalue
refers to allocated storage (3.7.4.2), and using the properties of the
glvalue that do not depend on its value is well-defined. The program
has undefined behavior if:
an lvalue-to-rvalue conversion (4.1) is applied to such a glvalue,
the glvalue is used to access a non-static data member or call a non-static member function of the object, or
the glvalue is implicitly converted (4.10) to a reference to a base class type, or
the glvalue is used as the operand of a static_cast (5.2.9) except when the conversion is ultimately to cv char& or cv unsigned char&, or
the glvalue is used as the operand of a dynamic_cast (5.2.7) or as the operand of typeid.
* To prevent UB from cases where sizeof(B) > sizeof(A), we can rewrite A *a = new A; as char c[sizeof(A) + sizeof(B)]; A* a = new (c) A;.
There are some potential problems with this:
If B is larger than A, it will overwrite bytes not allocated - which is undefined behaviour.
Destructor of A is not called for a (or b - your code doesn't show whether you delete a or delete b or neither). This is very important if either for A or B destructor is doing something like reference counting, locks, memory deallocation (including std:: containers such as std::vector or std::string), etc.
If a is not used again after you create b, you still need to call the A destructor to make sure it's lifetime is over - see the example in the third bulled after the section you quoted. So if your purpose was to avoid the "expensive" destructor call, then your code is failing to abide by the rules given in section 3.8/7 of the standard.
You are also breaching the bullet of:
The original object was a most derived object (1.8) of type T and the new object is a most derived object of type T.
as A is not the most derived type.
In summary, "broken". Even in cases where it does work (e.g. changing to A* a = new B;), it should be discouraged, as it can lead to subtle and difficult bugs.
As an addendum, in order to do this correctly you may call the destructor explicitly.
Note: the located memory is of size B to accommodate the potential size between A and B.
Note 2: with your implementation of class A this will not work. ~A() must be made virtual!!
A *b = new B; //Lifetime of b is starting. It is important that we use `new B` rather than `new A` so as to get the correct size.
b->~B(); //lifetime of b has ended. The memory still remain allocated however.
A *a = new (a) A; //lifetime of a is starting
a->~A(); // lifetime of a has ended
// a is still allocated but in an undefined state
::operator delete(b); // release the memory allocated without calling the destructor. This is different from calling 'delete b'
I believe that calling operator delete on a base pointer should be safe. Please do correct me if this is not the case.
Alternatively, if you allocate the memory for a as a char buffer, you can then use placement new to construct A and B objects, and safely call delete[] to deallocate the buffer (since char has a trivial destructor):
char* buf = new char[sizeof(B)];
A *a = new (a) A;
a->~();
A *b = new (a) B;
b->~B();
delete[] buf;

Assigning to a reference in C++ class operator=() [duplicate]

Is the following legal in C++?
As far as I can tell, Reference has a trivial destructor, so it should be legal.
But I thought references can't be rebound legally... can they?
template<class T>
struct Reference
{
T &r;
Reference(T &r) : r(r) { }
};
int main()
{
int x = 5, y = 6;
Reference<int> r(x);
new (&r) Reference<int>(y);
}
You aren't rebinding a reference, you're creating a new object in the memory of another one with a placement new. Since the destructor of the old object was never run I think this would be undefined behavior.
There is no reference being rebound in your example. The first reference (constructed on line two with the name r.r) is bound to the int denoted by x for the entire of its lifetime. This reference's lifetime is ended when the storage for its containing object is re-used by the placement new expression on line three. The replacement object contains a reference which is bound y for its entire lifetime which lasts until the end of its scope - the end of main.
I think I found the answer in a passage below the "quoted" one that talks about trivial dtor / dtor side effects, namely [basic.life]/7:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can
be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
the original object was a most derived object of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
By reusing the storage, we end the lifetime of original object [basic.life]/1
The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor, the destructor call starts, or
the storage which the object occupies is reused or released.
So I think [basic.life]/7 covers the situation
Reference<int> r(x);
new (&r) Reference<int>(y);
where we end the lifetime of the object denoted by r, and create a new object at the same location.
As Reference<int> is a class type with a reference data member, the requirements of [basic.life]/7 are not fulfilled. That is, r might not even refer to the new object, and we may not use it to "manipulate" this newly created object (I interpret this "manipulate" also as read-only accesses).

Can I set a member variable before constructor call?

I started to implement an ID based memory pool, where every element has an id, which is basically an index in a vector. In this special case I know the index before I construct the object itself so I thought I set the ID before I call the constructor.
Some details
Allocating an object from an ID based pool is the following:
allocate a free id from the pool
get a memory address based on the id value
construct the object on the memory address
set the ID member of the object
and the deallocation is based on that id
here is the code (thanks jrok):
#include <new>
#include <iostream>
struct X
{
X()
{
// id come from "nothing"
std::cout << "X constructed with id: " << id << std::endl;
}
int id;
};
int main()
{
void* buf = operator new(sizeof(X));
// can I set the ID before the constructor call
((X*)buf)->id = 42;
new (buf) X;
std::cout << ((X*)buf)->id;
}
EDIT
I found a stock solution for this in boost sandbox:
sandbox Boost.Tokenmap
Can I set a member variable before constructor call?
No, but you can make a base class with ID that sets ID within its constructor (and throws exception if ID can't be allocated, for example). Derive from that class, and at the moment derived class enter constructor, ID will be already set. You could also manage id generation within another class - either within some kind of global singleton, or you could pass id manager as a first parameter to constructor.
typedef int Id;
class IdObject{
public:
Id getId() const{
return id;
}
protected:
IdManager* getIdManager() ...
IdObject()
:id(0){
IdManager* manager = getIdManager();
id = manager->generateId();
if (!id)
throw IdException;
manager->registerId(id, this);
}
~IdObject(){
if (id)
getIdManager()->unregisterId(id, this);
}
private:
Id id;
IdObject& operator=(IdObject &other){
}
IdObject(IdObject &other)
:id(0){
}
};
class DerivedObject: public IdObject{
public:
DerivedObject(){
//at this point, id is set.
}
};
This kind of thing.
Yes, you can do what you're doing, but it's really not a good idea. According to the standard, your code invokes Undefined Behaviour:
3.8 Object lifetime [basic.life]
The lifetime of an object is a runtime property of the object. An object is said to have non-trivial initialization
if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial
default constructor. [ Note: initialization by a trivial copy/move constructor is non-trivial initialization. —
end note ] The lifetime of an object of type T begins when:
— storage with the proper alignment and size for type T is obtained, and
— if the object has non-trivial initialization, its initialization is complete.
The lifetime of an object of type T ends when:
— if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
— the storage which the object occupies is reused or released.
Before the lifetime of an object has started but after the storage which the object will occupy has been
allocated or, after the lifetime of an object has ended and before the storage which the object occupied is
reused or released, any pointer that refers to the storage location where the object will be or was located
may be used but only in limited ways. For an object under construction or destruction, see 12.7. Otherwise,
such a pointer refers to allocated storage (3.7.4.2), and using the pointer as if the pointer were of type void*,
is well-defined. Such a pointer may be dereferenced but the resulting lvalue may only be used in limited
ways, as described below. The program has undefined behavior if:
— the pointer is used to access a non-static data member or call a non-static member function of the
object
When your code invokes Undefined Behaviour, the implementation is allowed to do anything it wants to. In most cases nothing will happen - and if you're lucky your compiler will warn you - but occasionally the result will be unexpectedly catastrophic.
You describe a pool of N objects of the same type, using a contiguous array as the underlying storage. Note that in this scenario you do not need to store an integer ID for each allocated object - if you have a pointer to the allocated object, you can derive the ID from the offset of the object within the array like so:
struct Object
{
};
const int COUNT = 5; // allow enough storage for COUNT objects
char storage[sizeof(Object) * COUNT];
// interpret the storage as an array of Object
Object* pool = static_cast<Object*>(static_cast<void*>(storage));
Object* p = pool + 3; // get a pointer to the third slot in the pool
int id = p - pool; // find the ID '3' for the third slot
No, you cannot set anything in an object before its constructor is called. However, you have a couple of choices:
Pass the ID to the constructor itself, so it can store the ID in the object.
Allocate extra memory in front of the object being constructed, store the ID in that extra memory, then have the object access that memory when needed.
If you know the object's to-be address, which is the case for your scenario, then yes you can do that kind of thing. However, it is not well-defined behaviour, so it's most probably not a good idea (and in every case not good design). Although it will probably "work fine".
Using a std::map as suggested in a comment above is cleaner and has no "ifs" and "whens" of UB attached.
Despite writing to a known memory address will probably be "working fine", an object doesn't exist before the constructor is run, so using any of its members is bad mojo.
Anything is possible. No compiler will likely do any such thing, but the compiler might for example memset the object's storage with zero before running the constructor, so even if you don't set your ID field, it's still overwritten. You have no way of knowing, since what you're doing is undefined.
Is there a reason you want to do this before the constructor call?
Allocating an object from an ID based pool is the following:
1) allocate a free id from the pool
2) get a memory address based on the id value
3) construct the object on the memory address
4) set the ID member of the object and the deallocation is based on that id
According to your steps, you are setting the ID after the constructor.
so I thought I set the ID before I call the constructor.
I hate to be blunt, but you need to have a better reason than that to wade into the undefined behaviour territory. Remember, as programmers, there is a lot we're learning all the time and unless there is absolutely no way around it, we need to stay away from minefields, undefined behavior being one of them.
As other people have pointed out, yes you can do it, but that's like saying you can do rm -rf / as root. Doesn't mean you should :)
C makes it easy to shoot yourself in the foot. C++ makes it harder, but when you do, you blow away your whole leg! — Bjarne Stroustrup

Does destroying and recreating an object make all pointers to this object invalid?

This is a follow-up to this question. Suppose I have this code:
class Class {
public virtual method()
{
this->~Class();
new( this ) Class();
}
};
Class* object = new Class();
object->method();
delete object;
which is a simplified version of what this answer suggests.
Now once a destructor is invoked from within method() the object lifetime ends and the pointer variable object in the calling code becomes invalid. Then the new object gets created at the same location.
Does this make the pointer to the object in the calling valid again?
This is explicitly approved in 3.8:7:
3.8 Object lifetime [basic.life]
7 - If, after the lifetime of an object has ended [...], a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object [...] can be used to manipulate the new object, if: (various requirements which are satisfied in this case)
The example given is:
struct C {
int i;
void f();
const C& operator=( const C& );
};
const C& C::operator=( const C& other) {
if ( this != &other ) {
this->~C(); // lifetime of *this ends
new (this) C(other); // new object of type C created
f(); // well-defined
}
return *this;
}
Strictly, this is fine. However, without extreme care, it will become a hideous piece of UB. For example, any derived classes calling this method won't get the right type re-constructed- or what happens if Class() throws an exception. Furthermore, this doesn't really accomplish anything.
It's not strictly UB, but is a giant pile of crap and fail and should be burned on sight.
The object pointer doesn't become invalid at any time (assuming your destructor doesn't call delete this). Your object was never deallocated, it has only called it's destructor, i.e. it has cleaned up its internal state (with regard to implementation, please note that standard strictly defines that the object is destroyed after destructor call). As you have used placement new to instantiate the new object at the exactly same address, it is technically ok.
This exact scenario is covered by section 3.8.7 of C++ standard:
If, after the lifetime of an object has ended and before the storage
which the object occupied is reused or released, a new object is
created at the storage location which the original object occupied, a
pointer that pointed to the original object, a reference that referred
to the original object, or the name of the original object will
automatically refer to the new object and, once the lifetime of the
new object has started, can be used to manipulate the new object [...]
That said, this is interesting only as learning code, as production code, this is horrible :)
The pointer only knows its address and as soon as you can confirm that the address of the new object is the one the pointer is pointing to, the answer is yes.
There are some cases where people would believe the address does not change, but in some cases it does change, e.g. when using C's realloc(). But that's another story.
You may want to reconsider explicitly calling the destructor. If there's code you want executed that happens to be in the destructor, move that code to a new method and call that method from the destructor to preserve the current functionality. The destructor is really meant to be used for objects going out of scope.
Creating a new object at the location of a destroyed one does not make any pointers valid again. They may point to a valid new object, but not the object you were originally referencing.
You should guarantee that all references were removed or somehow marked as invalid before destroying the original object.
This would be a particularly difficult situation to debug.