Calling a member-function from within overloaded operator delete? - c++

This article on Binary-compatible C++ Interfaces contains the code:
class Window {
public:
// ...
virtual void destroy() = 0;
void operator delete(void* p) {
if (p) {
Window* w = static_cast<Window*>(p);
w->destroy(); // VERY BAD IDEA
}
}
};
To me, that seems wrong: operator delete() works on raw memory with the intended purpose to free it. The object's destructor has already been called, so calling destroy() works on a "phantom" object (if it works at all). Indeed: that's why operator delete() takes a void* and not a Window* (in this case).
So, the design is flawed, right? (If it is correct, why is it correct?)

I agree (assuming Window::destroy is not a static member function):
3.8p1: The lifetime of an object of type T ends when: if T is a class type with a non-trivial destructor, the destructor call starts, or [...]
3.8p5: [...] After the lifetime of an object has ended and before the storage which the object occupied is reused or released, any pointer that refers to the storage location where the object [...] was located may be used but only in limited ways. ... If the object will be or was of a non-POD class type, the program has undefined behavior if: the pointer is used to access a non-static data member or call a non-static member function of the object, or [...]
(emphasis mine)

Yes, that code (if labelled C++) is very wrong.
Operator new and delete as you say handle raw memory, not objects.
That article however deals with specific compilers and specific implementation issues so may be that in that restricted context the code works... moreover MS isn't exactly well renowned for how nicely they care about portable C++ (au contraire, indeed) so may be that kind of bad code is (or was in 2002) actually not only working but even a reasonable solution.

That is really nasty code, trying to be helpful.
The article says that in COM code should not delete the object, but call obj->destroy() instead. So to be "helpful" the author makes operator delete do the destroy call. Nasty!
Much better would have been to just declare a private operator delete, so the user cannot delete objects and must instead figure out the correct way.

Related

Missed Optimization: std::vector<T>::pop_back() not qualifying destructor call?

In an std::vector<T> the vector owns the allocated storage and it constructs Ts and destructs Ts. Regardless of T's class hierarchy, std::vector<T> knows that it has only created a T and thus when .pop_back() is called it only has to destroy a T (not some derived class of T). Take the following code:
#include <vector>
struct Bar {
virtual ~Bar() noexcept = default;
};
struct FooOpen : Bar {
int a;
};
struct FooFinal final : Bar {
int a;
};
void popEm(std::vector<FooOpen>& v) {
v.pop_back();
}
void popEm(std::vector<FooFinal>& v) {
v.pop_back();
}
https://godbolt.org/z/G5ceGe6rq
The PopEm for FooFinal simply just reduces the vector's size by 1 (element). This makes sense. But PopEm for FooOpen calls the virtual destructor that the class got by extending Bar. Given that FooOpen is not final, if a normal delete fooOpen was called on a FooOpen* pointer, it would need to do the virtual destructor, but in the case of std::vector it knows that it only made a FooOpen and no derived class of it was constructed. Therefore, couldn't std::vector<FooOpen> treat the class as final and omit the call to the virtual destructor on the pop_back()?
Long story short - compiler doesn't have enough context information to deduce it https://godbolt.org/z/roq7sYdvT
Boring part:
The results are similar for all 3: msvc, clang, and gcc, so I guess the problem is general.
I analysed the libstdc++ code just to find pop_back() runs like this:
void pop_back() // a bit more convoluted but boils-down to this
{
--back;
back->~T();
}
Not any surprise. It's like in C++ textbooks. But it shows the problem - virtual call to a destructor from a pointer.
What we're looking for is the 'devirtualisation' technique described here: Is final used for optimisation in C++ - it states devirtualisation is 'as-if' behaviour, so it looks like it is open for optimisation if the compiler has enough information to do it.
My opinion:
I meddled with the code a little and i think optimisation doesn't happen because the compiler cannot deduce the only objects pointed by "back" are FooOpen instances. We - humans - know it because we analyse the entire class, and see the overall concept of storing the elements in a vector. We know the pointer must point to FooOpen instance only, but compiler fails to see it - it only sees a pointer which can point anywhere (vector allocates uninitialized chunk of memory and its interpretation is a part of vector's logic, also the pointer is modified outside the scope of pop_back()). Without knowing the entire concept of vector<> i don't think of how it can be deduced (without analysing the entire class) that it won't point to any descendant of FooOpen which can be defined in other translation units.
FooFinal doesn't have this problem because it already guarantees no other class can inherit from it so devirtualisation is safe for objects pointed by FooFinal* or FooFinal&.
Update
I made several findings which may be useful:
https://godbolt.org/z/3a1bvax4o), devirtualisation can occur for non-final classes as long as there is no pointer arithmetic involved.
https://godbolt.org/z/xTdshfK7v std::array performs devirtualisation on non-final classes. std::vector fails to do it even if it is constructed and destroyed in the same scope.
https://godbolt.org/z/GvoaKc9Kz devirtualisation can be enabled using wrapper.
https://godbolt.org/z/bTosvG658 destructor devirtualisation can be enabled with allocator. Bit hacky, but is transparent to the user. Briefly tested.
Yes, this is a missed optimisation.
Remember that a compiler is a software project, where features have to be written to exist. It may be that the relative overhead of virtual destruction in cases like this is low enough that adding this in hasn't been a priority for the gcc team so far.
It is an open-source project, so you could submit a patch that adds this in.
It feels a lot like § 11.4.7 (14) gives some insight into this. As of latest working draft (N4910 Post-Winter 2022 C++ working draft, Mar. 2022):
After executing the body of the destructor and destroying any objects with automatic storage duration
allocated within the body, a destructor for class X calls the destructors for X’s direct non-variant non-static data
members, the destructors for X’s non-virtual direct base classes and, if X is the most derived class (11.9.3), its
destructor calls the destructors for X’s virtual base classes. All destructors are called as if they were referenced
with a qualified name, that is, ignoring any possible virtual overriding destructors in more derived classes.
Bases and members are destroyed in the reverse order of the completion of their constructor (see 11.9.3).
[Note 4 : A return statement (8.7.4) in a destructor might not directly return to the caller; before transferring control
to the caller, the destructors for the members and bases are called. — end note]
Destructors for elements of an array are called in reverse order of their construction (see 11.9).
Also interesting for this topic, § 11.4.6, (17):
In an explicit destructor call, the destructor is specified by a ~ followed by a type-name or decltype-specifier
that denotes the destructor’s class type. The invocation of a destructor is subject to the usual rules for
member functions (11.4.2); that is, if the object is not of the destructor’s class type and not of a class derived
from the destructor’s class type (including when the destructor is invoked via a null pointer value), the program has undefined behavior.
So, as far as the standard cares, the invocation of a destructor is subject to the usual rules for member functions.
This, to me, sounds a lot like destructor calls do so much that compilers are likely unable to determine, at compile-time, that a destructor call does "nothing" - as it also calls destructors of members, and std::vector doesn't know this.

Calling destructor in member function

If we are implementing, for example, smart pointers, and we want to do a = std::move(b) --- we need to delete memory to which a is pointing, but can we call destructor inside move assignment operator, instead of copy-pasting destructor's function body?
Is the behavior on calling destructor inside move-assignment defined?
If it's not, are there any better ways dealing with it rather than copy-pasting destructor's body?
You are allowed to call the destructor on this inside a member function and it has well-defined behavior.
However, that behavior involves ending the lifetime of the object *this. After the destructor call you are not allowed to access/use any members of the object anymore. This also has bad consequences in many situations, e.g. if the object has automatic storage duration the destructor will be called a second time on it during scope exit, which would cause undefined behavior.
So that isn't useful for what you want to do.
Although I strongly advice against it, in theory you could then follow the destructor call by construction of a new object at the same storage location via placement-new. However there are some preconditions outside the control of the class on when this is allowed without causing undefined behavior down the line. Under some conditions such a placement-new itself might be undefined behavior, under some conditions the names/references/pointers referring to the old object will not refer to the new one causing undefined behavior if used after the placement-new and under some conditions the lifetime of any parent object that *this is a subobject/member of will also be ended by such an operation, causing undefined behavior if used afterwards.
You can see a demonstration on how this would be implemented, against my advice and under certain (unstated) assumptions, in the standard (draft): https://timsong-cpp.github.io/cppwp/n4868/basic.life#example-2
The linked example and the paragraph preceding it don't spell out all the conditions and possible problems with the approach that I hinted at above. Only the very specific usage of the class/function shown in the example is definitively allowed.
If you just want to reuse code, move the body of the destructor into a new member function and call that one from both locations requiring the same behavior.
Explictly calling destuctor is technically available, you can use this->~Object() in non static method of the class Object.
However this is a bad practice. Consider use these instead.
class Test {
public:
Test() {}
~Test() {
Deallocate();
}
Test(const Test& other) = delete;
Test(Test&& other) {
Deallocate();
// Do sth
}
private:
void Deallocate() {
// Do sth
}
};

Self destruction: this->MyClass::~MyClass() vs. this->~MyClass()

In my quest to learn C++ I stumbled across the article Writing Copy Constructors and Assignment Operators which proposes a mechanism to avoid code duplication across copy constructors and assignment operators.
To summarise/duplicate the content of that link, the proposed mechanism is:
struct UtilityClass
{
...
UtilityClass(UtilityClass const &rhs)
: data_(new int(*rhs_.data_))
{
// nothing left to do here
}
UtilityClass &operator=(UtilityClass const &rhs)
{
//
// Leaves all the work to the copy constructor.
//
if(this != &rhs)
{
// deconstruct myself
this->UtilityClass::~UtilityClass();
// reconstruct myself by copying from the right hand side.
new(this) UtilityClass(rhs);
}
return *this;
}
...
};
This seems like a nice way to avoid code duplication whilst ensuring "programatic integrity" but needs weighed against the risk of wasting effort freeing-then-allocating nested memory that could, instead, be re-used (as its author points out).
But I'm not familiar with the syntax that lies at its core:
this->UtilityClass::~UtilityClass()
I assume that this is a way to call the object's destructor (to destroy the contents of the object structure) whilst keeping the structure itself. To a C++ newbie, the syntax looks like a strange mixture of an object method and a class method.
Could someone please explain this syntax to me, or point me to a resource which explains it?
How does that call differ from the following?
this->~UtilityClass()
Is this a legitimate call? Does this additionally destroy the object structure (free from heap; pop off the stack)?
TL;DR version: DO NOT FOLLOW ANY ADVICE GIVEN BY THE AUTHOR OF THAT LINK
The link suggests that this technique can be used in a base class as long as a virtual destructor call is not used, because doing so would destruct parts of the derived class, which isn't the responsibility of the base class operator=.
This line of reasoning totally fails. The technique can never ever be used in a base class. The reason is that the C++ Standard only allows in-place replacement of an object with another object of the exact same type (see section 3.8 of the Standard):
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
the original object was a most derived object (1.8) of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
In the original code, both return *this; and subsequent use of the object are undefined behavior; they access an object which has been destroyed, not the newly created object.
This is a problem in practice as well: the placement-new call will set up a v-table ptr corresponding to the base class, not the correct derived type of the object.
Even for leaf classes (non-base classes) the technique is highly questionable.
TL;DR DO NOT DO THIS.
To answer the specific question:
In this particular example, there's no difference. As explained in the article you link to, there would be a difference if this were a polymorphic base class, with a virtual destructor.
A qualified call:
this->UtilityClass::~UtilityClass()
would specifically call the destructor of this class, not that of the most derived class. So it only destroys the subobject being assigned to, not the entire object.
An unqualified call:
this->~UtilityClass()
would use virtual dispatch to call the most derived destructor, destroying the complete object.
The article writer claims that the first is what you want, so that you only assign to the base sub-object, leaving the derived parts intact. However, what you actually do overwrite part of the object with a new object of the base type; you've changed the dynamic type, and leaked whatever was in the derived parts of the old object. This is a bad thing to do in any circumstances. You've also introduced an exception issue: if the construction of the new object fails, then the old object is left in an invalid state, and can't even be safely destroyed.
UPDATE: you also have undefined behaviour since, as described in another answer, it's forbidden to use placement-new to create an object on top of (part of) a differently-typed object.
For non-polymorphic types, a good way to write a copy-assignment operator is with the copy-and-swap idiom. That both avoids duplication by reusing the copy-constructor, and provides a strong exception guarantee - if the assignment fails, then the original object is unmodified.
For polymorphic types, copying objects is more involved, and can't generally be done with a simple assignment operator. A common approach is a virtual clone function, which each type overrides to dynamically allocate a copy of itself with the correct type.
You can decide to how to call the destructor:
this->MyClass::~MyClass(); // Non-virtual call
this->~MyClass(); // Virtual call

Does C++ require a destructor call for each placement new?

I understand that placement new calls are usually matched with explicit calls to the destructor. My question is: if I have no need for a destructor (no code to put there, and no member variables that have destructors) can I safely skip the explicit destructor call?
Here is my use case: I want to write C++ bindings for a C API. In the C API many objects are accessible only by pointer. Instead of creating a wrapper object that contains a single pointer (which is wasteful and semantically confusing). I want
to use placement new to construct an object at the address of the C object. The C++ object will do nothing in its constructor or destructor, and its methods will do nothing but delegate to the C methods. The C++ object will contain no virtual methods.
I have two parts to this question.
Is there any reason why this idea will not work in practice on any production compiler?
Does this technically violate the C++ language spec?
If I understand your question correctly you have a C object in memory and you want to initialize a C++ object with the same layout "over the top" of the existing object.
CppObject* cppobject = new (cobject) CppObject;
While there is no problem with not calling a destructor for the old object - whether this causes resource leaks or other issues is entirely down to the type of the old object and is a user code issue, not a language conformance issue - the fact that you reuse the memory for a new object means that the old object is no longer accessible.
Although the placement form of operator new must just return the address that it was given, there is nothing to stop the new expression itself wiping the memory for the new object before any constructor (if any) is called. Members of the new object that are not initialized according to C++ language rules have unspecified contents which definitely does not mean the same as having the contents of any old object that once lived in the memory being reused.
If I understand you correctly, what you are trying to do is not guaranteed to work.
I think the answer is that if your class is POD (which it is, if it's true that it does nothing in the con/destructor, has no virtual member functions, and has no non-static data members with any of those things), then you don't need to call a constructor or a destructor, its lifetime is just the lifetime of the underlying memory. You can use it the same way that a struct is used in C, and you can call its member functions regardless of whether it has been constructed.
The purpose of placement new is to allow you to create object pools or align multiple objects together in contiguous memory space as with std::vector.
If the objects are C-structs then you do not need placement new to do this, you can simply use the C method of allocating the memory based on sizeof(struct Foo) where Foo is the struct name, and if you allocate multiple objects you may need to multiple the size up to a boundary for alignment.
However there is no need to placement-new the objects there, you can simply memcpy them in.
To answer your main question, yes you still need to call the destructor because other stuff has to happen.
Is there any reason why this idea will not work in practice on any production compiler?
You had damn well be sure your C++ object fits within the size of the C object.
Does this technically violate the C++ language spec?
No, but not everything that is to spec will work like you want.
I understand that placement new calls are usually matched with explicit calls to the destructor. If I have no need for a destructor (no code to put there, and no member variables that have destructors) can I safely skip the explicit destructor call?
Yes. If I don't need to fly to New York before posting this answer, can I safely skip the trip? :) However, if the destructor is truly unneeded because it does nothing, then what harm is there in calling it?
If the compiler can figure out a destructor should be a no-op, I'd expect it to eliminate that call. If you don't write an explicit dtor, remember that your class still has a dtor, and the interesting case – here – is whether it is what the language calls trivial.
Solution: destroy previously constructed objects before constructing over them, even when you believe the dtor is a no-op.
I want to write C++ bindings for a C API. In the C API many objects are accessible only by pointer. Instead of creating a wrapper object that contains a single pointer (which is wasteful and semantically confusing). I want to use placement new to construct an object at the address of the C object.
This is the purpose of layout-compatible classes and reinterpret_cast. Include a static assert (e.g. Boost's macro, 0x static_assert, etc.) checking size and/or alignment, as you wish, for a short sanity check, but ultimately you have to know a bit of how your implementation lays out the classes. Most provide pragmas (or other implementation-specific mechanisms) to control this, if needed.
The easiest way is to contain the C struct within the C++ type:
// in C header
typedef struct {
int n;
//...
} C_A;
C_A* C_get_a();
// your C++
struct A {
void blah(int n) {
_data.num += n;
}
// convenience functions
static A* from(C_A *p) {
return reinterpret_cast<A*>(p);
}
static A const* from(C_A const *p) {
return reinterpret_cast<A const*>(p);
}
private:
C_A _data; // the only data member
};
void example() {
A *p = A::from(C_get_a());
p->blah(42);
}
I like to keep such conversions encapsulated, rather than strewing reinterpret_casts throughout, and more uniform (i.e. compare call-site for const and non-const), hence the convenience functions. It's also a bit harder to modify the class without noticing this type of use must still be supported.
Depending on the exact class, you might make the data member public.
The first question is: why don't you just use a cast? Then there's no issue of the placement new doing anything, and clearly no issue of failing to use a destructor. The result will work if the C and C++ types are layout compatible.
The second question is: what is the point? If you have no virtual functions, you're not using the constructor or destructor, the C++ class doesn't seem to offer any advantages over just using the C type: any methods you write should have been global functions anyhow.
The only advantage I can imagine is if you want to hide the C representation, you can overlay the C object with a class with all private members and use methods for access. Is that your purpose? [That's a reasonable thing to do I think]

Will using delete with a base class pointer cause a memory leak?

Given two classes have only primitive data type and no custom destructor/deallocator.
Does C++ spec guarantee it will deallocate with correct size?
struct A { int foo; };
struct B: public A { int bar[100000]; };
A *a = (A*)new B;
delete a;
I want to know do I need to write an empty virtual dtor?
I have tried g++ and vc++2008 and they won't cause a leak. But I would like to know what is correct in C++ standard.
Unless the base class destructor is virtual, it's undefined behaviour. See 5.3.5/4:
If the static type of the operand [of the delete operator] is different from its dynamic type, the static type shall be a base class of the operand's dynamic type and the static type shall have a virtual destructor or the behaviour is undefined.
According to the C++ standard, what you have is undefined behaviour - this may manifest itself as a leak, it may not, For your code to be correct you need a virtual destructor.
Also, you do not need that (A*) cast. Whenever you find yourself using a C-style cast in C++, you can be fairly sure that either it is unecessary, or your code is wrong.
This is undefined behaviour - maybe everything's fine, maybe whetever goes wrong. Either don't do it or supply the base class with a virtual destructor.
In most implementations this will not leak - there're no heap-allocated member functions in the class, so the only thing needed when delete is done is to deallocate memory. Deallocating memory uses only the address of the object, nothing more, the heap does all the rest.
It will deallocate with correct size, because the size to be deallocated is a property of the heap memory region you obtained (there is no size passed to free()-like functions!).
However, no d'tor is called. If 'B' defines a destructor or contains any members with a non-trivial destructor they will not be called, causing a potential memory leak. This is not the case in your code sample, however.
For only primitive data I believe you're fine. You might legitimately not want to incur the cost of a v-table in this case.
Otherwise, a virtual d'tor is definitely preferred.