Is accessing memory after a destructor call undefined behavior? - c++

I'm wondering if the following is undefined?
int main()
{
struct Doggy { int a; ~Doggy() {} };
Doggy* p = new Doggy[100];
p[50].~Doggy();
p[50].a = 3; // Is this not allowed? The destructor was called on an
// object occupying that area of memory.
// Can I access it safely?
if (p[50].a == 3);
}
I guess this is generally good to know, but the specific reason I'm asking is that I have a data structure built on an array whose buckets can be made null by setting a value, much like the buckets in a hash table array. When a bucket is emptied, the destructor is called, and I then check and set the null state. I'm wondering whether doing that after the destructor call is illegal.
To elaborate a little, say I have an array of objects and each object can be made to represent null in each bucket, such as:
struct Handle
{
int value = 0; // Zero is null value
~Handle(){}
};
int main()
{
Handle* p = new Handle[100];
// Remove object 50
p[50].~Handle();
p[50].value = 0; // Set to null
if (p[50].value == 0) ; // Then it's null, can I count on this?
// Is this defined? I'm accessing memory that was occupied by
// object that was destroyed.
}

Yes it'll be UB:
[class.dtor/19]
Once a destructor is invoked for an object, the object's lifetime ends; the behavior is undefined if the destructor is invoked for an object whose lifetime has ended ([basic.life]).
[Example 2: If the destructor for an object with automatic storage duration is explicitly invoked, and the block is subsequently left in a manner that would ordinarily invoke implicit destruction of the object, the behavior is undefined. — end example]
Calling p[50].~Handle(); and later delete[] p; invokes the destructor for an object whose lifetime has already ended.
For p[50].value = 0; after the lifetime of the object has ended, this applies:
[basic.life/6]
Before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any pointer that represents the address of the storage location where the object will be or was located may be used but only in limited ways. For an object under construction or destruction, see [class.cdtor]. Otherwise, such a pointer refers to allocated storage ([basic.stc.dynamic.allocation]), and using the pointer as if the pointer were of type void* is well-defined. Indirection through such a pointer is permitted but the resulting lvalue may only be used in limited ways, as described below. The program has undefined behavior if:
6.2 - the pointer is used to access a non-static data member or call a non-static member function of the object

Yes, in practice it mostly works. Handle::value is just an offset applied to a pointer of type Handle, so the access goes through wherever the pointer points, even if the containing object isn't currently constructed. If you used anything involving the virtual keyword, this would break, though.
p[50].~Handle(); this however is a different beast. You should never invoke destructors manually unless you have also explicitly invoked the constructor with placement new. Still not illegal, but dangerous.
delete[] p; (omitted in your example!) is where you end up with double-destruction, at which point you are well beyond UB, straight up in the "it's broken" domain.
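For the bucket use case in the question, one way to stay inside the lifetime rules is to start a new object in the slot right after destroying the old one, so that every later read or write touches a live object. The following is only a minimal sketch under that assumption; reset_bucket and demo_reset are hypothetical helper names, not part of the original code.

```cpp
#include <cassert>
#include <new>

// Stand-in for the question's Handle type.
struct Handle {
    int value = 0;   // zero is the "null" state
    ~Handle() {}
};

// Hypothetical helper: empties a bucket by ending the old object's lifetime
// and immediately starting a new one in the same storage.
inline void reset_bucket(Handle* bucket) {
    bucket->~Handle();
    ::new (static_cast<void*>(bucket)) Handle();  // value is 0 again
}

// Hypothetical demo: returns true if the emptied bucket reads as null.
inline bool demo_reset() {
    Handle* p = new Handle[100];
    p[50].value = 7;
    reset_bucket(&p[50]);
    bool is_null = (p[50].value == 0);  // reads a live object
    delete[] p;  // every element is alive here, so nothing is destroyed twice
    return is_null;
}
```

With this pattern the later delete[] p; destroys only live objects, which also avoids the double destruction mentioned above.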

Related

What kind of value does a pointer hold after using it to explicitly call the pointed object's destructor?

From https://timsong-cpp.github.io/cppwp/basic.compound#3 :
Every value of pointer type is one of the following:
a pointer to an object or function (the pointer is said to point to the object or function), or
a pointer past the end of an object ([expr.add]), or
the null pointer value for that type, or
an invalid pointer value.
After using a pointer to explicitly call an object's destructor, which of these four kinds of value does the pointer have? Example:
#include <vector>
struct foo {
std::vector<int> m;
};
int main()
{
auto f = new foo;
f->~foo();
// What is the value of `f` here?
}
I don't believe it can be a pointer to an object or function. There is no longer an object to point to and it is not a function pointer.
I don't believe it can be a pointer past the end of an object. There wasn't any sort of pointer arithmetic and no array is involved.
I don't believe it can be a null pointer value since the pointer is not nullptr. It still points to the storage the object had, you could use it to perform placement new.
I don't believe it can be an invalid pointer value. Invalid pointer values are associated with the end of storage duration, not object lifetime. "A pointer value becomes invalid when the storage it denotes reaches the end of its storage duration". The storage is still valid.
It seems to me like there is no pointer value the pointer could have. Where did I go wrong?
It's the pointer to the object, but the object is just not within its lifetime.
In [basic.compound], footnote 42):
For an object that is not within its lifetime, this is the first byte in memory that it will occupy or used to occupy.
It is a pointer that pointed to the object that no longer exists. Of the possible values that the standard lists, only pointer to an object can apply, but only if we consider objects outside of their lifetime to be included in that definition. Invalid could apply if that could include pointers to storage with no objects.
If neither of these can apply, then the list of all values would be defective.
If you were to create a new object into the storage, then this would apply:
[basic.life]
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if the original object is transparently replaceable (see below) by the new object.
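To illustrate the quoted rule, here is a small sketch that creates a new foo in the storage of the destroyed one and then keeps using the original pointer. It assumes the new object has exactly the same type as the old one; demo_recreate is a hypothetical name used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

struct foo {
    std::vector<int> m;
};

// Hypothetical demo: returns the size of the vector in the re-created object.
inline std::size_t demo_recreate() {
    foo* f = new foo;
    f->m.push_back(1);
    f->~foo();                            // lifetime ends, storage stays allocated
    ::new (static_cast<void*>(f)) foo();  // new object of the same type, same storage
    f->m.push_back(2);                    // f now refers to the new object
    std::size_t n = f->m.size();          // only the element pushed after re-creation
    delete f;
    return n;
}
```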

Swapping storage buffers containing placement new created objects

I recently saw a piece of code which used storage buffers to create objects and then simply swapped the buffers in order to avoid the copying overhead. Here is a simple example using integers:
std::aligned_storage_t<sizeof(int), alignof(int)> storage1;
std::aligned_storage_t<sizeof(int), alignof(int)> storage2;
new (&storage1) int(1);
new (&storage2) int(2);
std::swap(storage1, storage2);
int i1 = reinterpret_cast<int&>(storage1);
int i2 = reinterpret_cast<int&>(storage2);
//this prints 2 1
std::cout << i1 << " " << i2 << std::endl;
This feels like undefined behaviour in the general case (specifically swapping the buffers and then accessing the objects as if they were still there) but I am not sure what the standard says about such usage of storage and placement new. Any feedback is much appreciated!
I suspect there are a few factors rendering this undefined, but we only need one:
[C++11: 3.8/1]: [..] The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
the storage which the object occupies is reused or released.
All subsequent use is use after end-of-life, which is bad and wrong.
The key is that each buffer is being reused.
So, although I would expect this to work in practice at least for trivial types (and for some classes), it's undefined.
The following may have been able to save you:
[C++11: 3.8/7]: If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object [..]
…except that you are not creating a new object.
It may or may not be worth noting here that, surprisingly, the ensuing implicit destructor calls are both well-defined:
[C++11: 3.8/8]: If a program ends the lifetime of an object of type T with static (3.7.1), thread (3.7.2), or automatic (3.7.3) storage duration and if T has a non-trivial destructor, the program must ensure that an object of the original type occupies that same storage location when the implicit destructor call takes place; otherwise the behavior of the program is undefined.
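One way to sidestep the problem is to swap the objects rather than the raw buffers, using the pointers that placement new returns. This is a different technique from the one in the question, shown only as a sketch; demo_swap_objects is a hypothetical name, and for a non-trivial type std::swap would move the objects instead of copying bytes.

```cpp
#include <cassert>
#include <new>
#include <utility>

// Hypothetical demo: returns true if the two ints were exchanged.
inline bool demo_swap_objects() {
    alignas(int) unsigned char storage1[sizeof(int)];
    alignas(int) unsigned char storage2[sizeof(int)];

    // Keep the pointers returned by placement new; they point to the int objects.
    int* i1 = ::new (static_cast<void*>(storage1)) int(1);
    int* i2 = ::new (static_cast<void*>(storage2)) int(2);

    std::swap(*i1, *i2);  // swaps the objects, the storage is never reused

    return *i1 == 2 && *i2 == 1;
}
```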

Storage reuse in C++

I have been trying to understand storage reuse in C++. Imagine we have an object a with a non-trivial destructor whose storage is reused with a placement new-expression:
struct A {
~A() { std::cout << "~A()" << std::endl; }
};
struct B: A {};
A* a = new A; // lifetime of *a begins
A* b = new(a) B; // storage reuse, lifetime of *b begins
[basic.life/8] specifies:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if the original object is transparently replaceable (see below) by the new object.
Since in my example the lifetime of *a has not ended when we reuse the storage it occupies, we cannot apply that rule. So what rule describes the behavior in my case?
The applicable rule for this is laid out in §3.8 [basic.life]/p1 and 4:
The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
the storage which the object occupies is reused or released.
4 A program may end the lifetime of any object by reusing the storage
which the object occupies or by explicitly calling the destructor for
an object of a class type with a non-trivial destructor. For an object
of a class type with a non-trivial destructor, the program is not
required to call the destructor explicitly before the storage which
the object occupies is reused or released; however, if there is no
explicit call to the destructor or if a delete-expression (5.3.5) is
not used to release the storage, the destructor shall not be
implicitly called and any program that depends on the side effects
produced by the destructor has undefined behavior.
So A *b = new (a) B; reuses the storage of the A object created in the previous statement, which is well-defined behavior provided that sizeof(A) >= sizeof(B)*. That A object's lifetime has ended by virtue of its storage being reused. A's destructor is not called for that object, and if your program depends on the side effect produced by that destructor, it has undefined behavior.
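The effect described above can be observed with a counter. The sketch below is an assumption-laden illustration, not the asker's code: the destroyed counter and demo_reuse are hypothetical, the destructor counts instead of printing, and the storage comes from ::operator new so that it can be released without running any destructor.

```cpp
#include <cassert>
#include <new>

static int destroyed = 0;  // hypothetical counter of destructor runs

struct A {
    ~A() { ++destroyed; }
};
struct B : A {};

// Hypothetical demo: returns true if reuse skipped ~A() and the explicit
// destructor call later ran it exactly once.
inline bool demo_reuse() {
    static_assert(sizeof(B) <= sizeof(A), "B must fit into the storage of A");
    destroyed = 0;

    void* raw = ::operator new(sizeof(A));
    A* a = ::new (raw) A;   // lifetime of *a begins
    (void)a;                // a must not be used to reach the object after reuse
    B* b = ::new (raw) B;   // storage reused: lifetime of *a ends, no ~A() runs
    bool none_after_reuse = (destroyed == 0);

    b->~B();                // ~B() runs and then ~A() for the base subobject
    bool one_after_destroy = (destroyed == 1);

    ::operator delete(raw); // release the storage, no destructor involved
    return none_after_reuse && one_after_destroy;
}
```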
The paragraph you cited, §3.8 [basic.life]/p7, governs when a pointer/reference to the original object can be reused. Since this code doesn't satisfy the criteria listed in that paragraph, you may use a only in the limited ways permitted by §3.8 [basic.life]/p5-6, or undefined behavior results (example and footnote omitted):
5 Before the lifetime of an object has started but after the storage
which the object will occupy has been allocated or, after the lifetime
of an object has ended and before the storage which the object
occupied is reused or released, any pointer that refers to the storage
location where the object will be or was located may be used but only
in limited ways. For an object under construction or destruction, see
12.7. Otherwise, such a pointer refers to allocated storage (3.7.4.2), and using the pointer as if the pointer were of type void*, is
well-defined. Such a pointer may be dereferenced but the resulting
lvalue may only be used in limited ways, as described below. The
program has undefined behavior if:
the object will be or was of a class type with a non-trivial destructor and the pointer is used as the operand of a
delete-expression,
the pointer is used to access a non-static data member or call a non-static member function of the object, or
the pointer is implicitly converted (4.10) to a pointer to a base class type, or
the pointer is used as the operand of a static_cast (5.2.9) (except when the conversion is to void*, or to void* and
subsequently to char*, or unsigned char*), or
the pointer is used as the operand of a dynamic_cast (5.2.7).
6 Similarly, before the lifetime of an object has started but after
the storage which the object will occupy has been allocated or, after
the lifetime of an object has ended and before the storage which the
object occupied is reused or released, any glvalue that refers to the
original object may be used but only in limited ways. For an object
under construction or destruction, see 12.7. Otherwise, such a glvalue
refers to allocated storage (3.7.4.2), and using the properties of the
glvalue that do not depend on its value is well-defined. The program
has undefined behavior if:
an lvalue-to-rvalue conversion (4.1) is applied to such a glvalue,
the glvalue is used to access a non-static data member or call a non-static member function of the object, or
the glvalue is implicitly converted (4.10) to a reference to a base class type, or
the glvalue is used as the operand of a static_cast (5.2.9) except when the conversion is ultimately to cv char& or cv unsigned char&, or
the glvalue is used as the operand of a dynamic_cast (5.2.7) or as the operand of typeid.
* To prevent UB from cases where sizeof(B) > sizeof(A), we can rewrite A *a = new A; as char c[sizeof(A) + sizeof(B)]; A* a = new (c) A;.
There are some potential problems with this:
If B is larger than A, it will overwrite bytes not allocated - which is undefined behaviour.
The destructor of A is not called for a (or b - your code doesn't show whether you delete a or delete b or neither). This is very important if the destructor of either A or B does something like reference counting, releasing locks, memory deallocation (including std:: containers such as std::vector or std::string), etc.
If a is not used again after you create b, you still need to call the A destructor to make sure its lifetime is over - see the example in the third bullet after the section you quoted. So if your purpose was to avoid the "expensive" destructor call, then your code is failing to abide by the rules given in section 3.8/7 of the standard.
You are also breaching the bullet of:
The original object was a most derived object (1.8) of type T and the new object is a most derived object of type T.
as A is not the most derived type.
In summary, "broken". Even in cases where it does work (e.g. changing to A* a = new B;), it should be discouraged, as it can lead to subtle and difficult bugs.
As an addendum, in order to do this correctly you may call the destructor explicitly.
Note: the allocated memory is of size B to accommodate the potential size difference between A and B.
Note 2: with your implementation of class A this will not work. ~A() must be made virtual!!
A *b = new B; // Lifetime of *b begins. It is important to use `new B` rather than `new A` so that the allocation is large enough.
b->~A(); // Lifetime of *b has ended (~A() is virtual, so ~B() runs). The memory remains allocated, however.
A *a = new (b) A; // Lifetime of *a begins in the same storage.
a->~A(); // Lifetime of *a has ended.
// The storage is still allocated, but no object lives in it.
::operator delete(b); // Release the allocated memory without calling the destructor. This is different from calling 'delete b'.
I believe that calling operator delete on a base pointer should be safe. Please do correct me if this is not the case.
Alternatively, if you allocate the memory for a as a char buffer, you can then use placement new to construct A and B objects, and safely call delete[] to deallocate the buffer (since char has a trivial destructor):
char* buf = new char[sizeof(B)];
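A *a = new (buf) A; // construct an A in the buffer
a->~A(); // end its lifetime before reusing the storage
A *b = new (buf) B; // construct a B in the same buffer
b->~A(); // with virtual ~A(), as noted above, this runs ~B()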
delete[] buf;

Can I set a member variable before constructor call?

I started to implement an ID based memory pool, where every element has an id, which is basically an index in a vector. In this special case I know the index before I construct the object itself so I thought I set the ID before I call the constructor.
Some details
Allocating an object from an ID based pool is the following:
allocate a free id from the pool
get a memory address based on the id value
construct the object on the memory address
set the ID member of the object
and the deallocation is based on that id
here is the code (thanks jrok):
#include <new>
#include <iostream>
struct X
{
X()
{
// id come from "nothing"
std::cout << "X constructed with id: " << id << std::endl;
}
int id;
};
int main()
{
void* buf = operator new(sizeof(X));
// can I set the ID before the constructor call
((X*)buf)->id = 42;
new (buf) X;
std::cout << ((X*)buf)->id;
}
EDIT
I found a stock solution for this in boost sandbox:
sandbox Boost.Tokenmap
Can I set a member variable before constructor call?
No, but you can make a base class that holds the ID and sets it within its constructor (throwing an exception if the ID can't be allocated, for example). Derive from that class, and by the time the derived class enters its constructor, the ID will already be set. You could also manage ID generation within another class - either within some kind of global singleton, or you could pass the ID manager as the first parameter to the constructor.
typedef int Id;
class IdObject{
public:
Id getId() const{
return id;
}
protected:
IdManager* getIdManager() ...
IdObject()
:id(0){
IdManager* manager = getIdManager();
id = manager->generateId();
if (!id)
throw IdException();
manager->registerId(id, this);
}
~IdObject(){
if (id)
getIdManager()->unregisterId(id, this);
}
private:
Id id;
IdObject& operator=(IdObject &other){
return *this;
}
IdObject(IdObject &other)
:id(0){
}
};
class DerivedObject: public IdObject{
public:
DerivedObject(){
//at this point, id is set.
}
};
This kind of thing.
Yes, you can do what you're doing, but it's really not a good idea. According to the standard, your code invokes Undefined Behaviour:
3.8 Object lifetime [basic.life]
The lifetime of an object is a runtime property of the object. An object is said to have non-trivial initialization
if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial
default constructor. [ Note: initialization by a trivial copy/move constructor is non-trivial initialization. —
end note ] The lifetime of an object of type T begins when:
— storage with the proper alignment and size for type T is obtained, and
— if the object has non-trivial initialization, its initialization is complete.
The lifetime of an object of type T ends when:
— if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
— the storage which the object occupies is reused or released.
Before the lifetime of an object has started but after the storage which the object will occupy has been
allocated or, after the lifetime of an object has ended and before the storage which the object occupied is
reused or released, any pointer that refers to the storage location where the object will be or was located
may be used but only in limited ways. For an object under construction or destruction, see 12.7. Otherwise,
such a pointer refers to allocated storage (3.7.4.2), and using the pointer as if the pointer were of type void*,
is well-defined. Such a pointer may be dereferenced but the resulting lvalue may only be used in limited
ways, as described below. The program has undefined behavior if:
— the pointer is used to access a non-static data member or call a non-static member function of the
object
When your code invokes Undefined Behaviour, the implementation is allowed to do anything it wants to. In most cases nothing will happen - and if you're lucky your compiler will warn you - but occasionally the result will be unexpectedly catastrophic.
You describe a pool of N objects of the same type, using a contiguous array as the underlying storage. Note that in this scenario you do not need to store an integer ID for each allocated object - if you have a pointer to the allocated object, you can derive the ID from the offset of the object within the array like so:
struct Object
{
};
const int COUNT = 5; // allow enough storage for COUNT objects
alignas(Object) char storage[sizeof(Object) * COUNT]; // raw storage, suitably aligned for Object
// interpret the storage as an array of Object
Object* pool = static_cast<Object*>(static_cast<void*>(storage));
Object* p = pool + 3; // get a pointer to the slot at index 3 (the fourth slot)
int id = static_cast<int>(p - pool); // recover the ID 3 from the pointer
No, you cannot set anything in an object before its constructor is called. However, you have a couple of choices:
Pass the ID to the constructor itself, so it can store the ID in the object.
Allocate extra memory in front of the object being constructed, store the ID in that extra memory, then have the object access that memory when needed.
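The first option can be sketched as follows, assuming X is free to take the ID as a constructor argument. The constructor signature and demo_construct_with_id are hypothetical and only illustrate the idea; the ID is known before construction, as in the question, but it is written by the constructor rather than before it.

```cpp
#include <cassert>
#include <new>

struct X {
    explicit X(int id_) : id(id_) {}  // the ID arrives through the constructor
    int id;
};

// Hypothetical demo: returns the ID seen after construction.
inline int demo_construct_with_id() {
    void* buf = ::operator new(sizeof(X));  // raw storage, as in the question
    X* x = ::new (buf) X(42);               // construct in place with the known ID
    int seen = x->id;                       // reads a member of a live object
    x->~X();
    ::operator delete(buf);
    return seen;
}
```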
If you know the object's to-be address, which is the case for your scenario, then yes you can do that kind of thing. However, it is not well-defined behaviour, so it's most probably not a good idea (and in every case not good design). Although it will probably "work fine".
Using a std::map as suggested in a comment above is cleaner and has no "ifs" and "whens" of UB attached.
Although writing to a known memory address will probably "work fine", an object doesn't exist before its constructor has run, so using any of its members is bad mojo.
Anything is possible. No compiler will likely do any such thing, but the compiler might for example memset the object's storage with zero before running the constructor, so even if you don't set your ID field, it's still overwritten. You have no way of knowing, since what you're doing is undefined.
Is there a reason you want to do this before the constructor call?
Allocating an object from an ID based pool is the following:
1) allocate a free id from the pool
2) get a memory address based on the id value
3) construct the object on the memory address
4) set the ID member of the object and the deallocation is based on that id
According to your steps, you are setting the ID after the constructor.
so I thought I set the ID before I call the constructor.
I hate to be blunt, but you need to have a better reason than that to wade into the undefined behaviour territory. Remember, as programmers, there is a lot we're learning all the time and unless there is absolutely no way around it, we need to stay away from minefields, undefined behavior being one of them.
As other people have pointed out, yes you can do it, but that's like saying you can do rm -rf / as root. Doesn't mean you should :)
C makes it easy to shoot yourself in the foot. C++ makes it harder, but when you do, you blow away your whole leg! — Bjarne Stroustrup

Is there a destructor for a pointer in c++?

string * str=new string;
delete str;
when I delete 'str' which points to an object, do two destructors get called - one for the pointer itself, and one for the object it points to?
What would the pointer's destructor do?
delete just causes the object that the given pointer is pointing at to be destroyed (in this case, the string object). The pointer itself, denoted by str, has automatic storage duration and will be destroyed when it goes out of scope like any other local variable.
Note, however, that non-class types do not have destructors. So even when you use delete with non-class types, no destructor is called. When the pointer goes out of scope, it is destroyed just like any other automatic variable: the pointer simply reaches the end of its lifetime, while the memory it points to is not deallocated until you explicitly use delete.
The pointer itself isn't destroyed by the delete statement, but like any scoped variable it is destroyed when the scope ends.
Example:
void Function()
{
string * str=new string;
delete str; // <-- here the string is destructed
} // <-- here the pointer is "destructed", which means its memory is freed from the stack, but no actual destructor function is called.
The concept of a destructor applies only to objects of class types (i.e. types defined with class or struct), not to plain types such as pointers. A pointer lives just like an int variable does.
when I delete 'str' which points to an object, do two destructors get called - one for the pointer itself, and one for the object it points to?
No. delete takes a pointer argument. It destroys the object that's pointed to (using its destructor, if it has one, and doing nothing otherwise), and deallocates the memory that's pointed to. You must previously have used new to allocate the memory and create the object there.
The pointer itself is not affected; but it no longer points to a valid object, so you mustn't do anything with it. This is sometimes known as a "dangling pointer".
What would the pointer's destructor do?
Nothing. Only class types have destructors.
The destructor for a raw pointer, like your example of std::string*, is trivial (just like the destructors for other primitive types: int, double, etc.)
Smart pointer classes have non-trivial destructors that do things like free resources, adjust reference counts, etc.
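The difference can be made visible with a counter of live objects. This is only a sketch: Tracked, the live counter, and the two demo functions are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <memory>

static int live = 0;  // hypothetical counter of currently live Tracked objects

struct Tracked {
    Tracked() { ++live; }
    ~Tracked() { --live; }
};

// A raw pointer leaving scope runs no code; the object stays alive until delete.
inline bool demo_raw_pointer_scope() {
    live = 0;
    Tracked* keep = nullptr;
    {
        Tracked* raw = new Tracked;
        keep = raw;
    }                                   // raw goes out of scope here
    bool still_alive = (live == 1);     // the pointed-to object was not destroyed
    delete keep;                        // now the object's destructor runs
    return still_alive && live == 0;
}

// A smart pointer leaving scope runs its destructor, which deletes the object.
inline bool demo_unique_ptr_scope() {
    live = 0;
    {
        std::unique_ptr<Tracked> owner(new Tracked);
    }                                   // ~unique_ptr deletes the Tracked object
    return live == 0;
}
```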
I like the simplification you get from the notion that every type has a destructor. That way you don't have a mental glitch with a template that explicitly destroys a stored value, even if the type of that stored value is an int or a pointer:
template <class T> struct wrapper {
unsigned char data[sizeof(T)];
wrapper(T t) { new (data) T(t); } // construct a copy of t inside the buffer
~wrapper() { ((T*)data)->~T(); } // ignore possible alignment problem
};
wrapper<int> i(3);
However, the destructors for ints and pointers are utterly trivial: they don't do anything, and there is no place you can go to see the definition of the destructor, because the definition doesn't exist. So it's also reasonable to say that they don't have destructors.
Either way, when a pointer goes out of scope it simply disappears; no special code runs.
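Both views can be checked with the standard type traits. The sketch below assumes nothing beyond the standard library; destroy_object and demo_destroy_object are hypothetical names. The same p->~T() line compiles for int and for std::string, but only the latter runs any code.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <type_traits>

// Scalars and raw pointers are trivially destructible; std::string is not.
static_assert(std::is_trivially_destructible<int>::value, "int");
static_assert(std::is_trivially_destructible<std::string*>::value, "raw pointer");
static_assert(!std::is_trivially_destructible<std::string>::value, "std::string");

// Hypothetical helper: for a non-class T this is a pseudo-destructor call.
template <class T>
void destroy_object(T* p) {
    p->~T();
}

// Hypothetical demo: constructs and destroys an int and a std::string in place.
inline bool demo_destroy_object() {
    alignas(int) unsigned char int_buf[sizeof(int)];
    int* i = ::new (static_cast<void*>(int_buf)) int(3);
    bool ok = (*i == 3);
    destroy_object(i);   // compiles, but there is no destructor code to run

    alignas(std::string) unsigned char str_buf[sizeof(std::string)];
    std::string* s = ::new (static_cast<void*>(str_buf)) std::string("hello");
    ok = ok && (*s == "hello");
    destroy_object(s);   // runs the real std::string destructor
    return ok;
}
```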