Is double-construction undefined behaviour? - c++

In our codebase, a pool of memory chunks is used, and upon 'allocation' an object is constructed in a chunk with placement new. What I'm missing is the corresponding destructor call: I find it odd to allow "double construction", and I wonder whether it is undefined behaviour to call the constructor a second time on the same object.
In C++11 3.8/4 [basic.life] it reads:
A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type with a non-trivial destructor. For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
Does this mean that omitting the destructor call is actually OK, as long as we are talking about types whose destructors have no side effects?
Edit: The context is embedded software without a heap; our own container implementations do placement new on C-array elements or byte arrays.

You can double construct without UB. The new object is a completely different one, and following pointers/references to the old one is UB.
It is really easy to trip over UB when doing placement new, let alone when double constructing.
Not calling a destructor just means the object is not cleaned up. If the destructor is trivial, this carries little risk. Constructing an object over another object ends the original object's lifetime, just as calling the destructor does.
If the object is in automatic storage of the proper type, constructing a different object over it and then exiting the scope (which destroys the no-longer-existing original object) is usually UB. You can avoid the UB here if the original object has a trivial destructor; this is why placement new into byte buffers is fine.
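For illustration, here is a minimal single-slot sketch of the pattern in question (Slot and its members are hypothetical names, not the asker's codebase):

#include <new>

template <typename T>
class Slot {
    alignas(T) unsigned char storage[sizeof(T)];
public:
    T* construct() {
        // Constructing over a previous, never-destroyed object is OK only
        // while T's destructor is trivial / nothing depends on its side effects.
        return new (storage) T();
    }
    void destroy(T* p) { p->~T(); }  // the explicit call the asker is missing
};

int main() {
    Slot<int> s;
    int* a = s.construct();
    int* b = s.construct();  // "double construction": fine for int
    s.destroy(b);
    (void)a;  // don't rely on pointers to the replaced object (see answer)
}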

This answer already answers the question well.
Although, I would add that using placement new in production code is usually bad practice, error-prone, and/or a mistake. Many excellent alternatives exist (including some libraries), so you should look to simpler ones first; it could be as simple as using a std::vector instead, calling reserve() if absolutely needed (see the sketch below). The introduction of placement new into a codebase in the first place is often misguided, legacy, or a case of premature optimization.
It's hard to give concrete advice, but one method is simply to remove placement new, or replace it with some kind of standard container, and then check whether any problems (usually performance) arise.
In my experience, just removing placement new from legacy code leads to no issues: the performance problem it (maybe) used to fix has largely been made obsolete by a wide range of 'system' optimizations, from the C++ level down to the hardware level.
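A sketch of that simplest replacement (Widget is just a stand-in element type):

#include <vector>

struct Widget { int id; };

int main() {
    std::vector<Widget> pool;
    pool.reserve(128);          // one up-front allocation, if that matters
    pool.push_back(Widget{1});  // the vector does the placement-construction
    pool.push_back(Widget{2});  // internally and destroys everything for you
}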

Related

Why isn't it undefined behaviour to destroy an object that was overwritten by placement new?

I'm trying to figure out whether the following is undefined behaviour. I have a feeling it's not UB, but my reading of the standard makes it look like it is UB:
#include <iostream>

struct A {
    A() { std::cout << "1"; }
    ~A() { std::cout << "2"; }
};

int main() {
    A a;
    new (&a) A;
}
Quoting the C++11 standard:
basic.life¶4 says "A program may end the lifetime of any object by reusing the storage which the object occupies"
So after new (&a) A, the original A object has ended its lifetime.
class.dtor¶11.3 says that "Destructors are invoked implicitly for constructed objects with automatic storage duration ([basic.stc.auto]) when the block in which an object is created exits ([stmt.dcl])"
So the destructor for the original A object is invoked implicitly when main exits.
class.dtor¶15 says "the behavior is undefined if the destructor is invoked for an object whose lifetime has ended ([basic.life])."
So this is undefined behaviour, since the original A no longer exists (even if the new a now exists in the same storage).
The question is whether the destructor for the original A is called, or whether the destructor for the object currently named a is called.
I am aware of basic.life¶7, which says that the name a refers to the new object after the placement new. But class.dtor¶11.3 explicitly says that the destructor of the object that exits scope is called, not the destructor of the object referred to by a name that exits scope.
Am I misreading the standard, or is this actually undefined behaviour?
Edit: Several people have told me not to do this. To clarify, I'm definitely not planning on doing this in production code! This is for a CppQuiz question, which is about corner cases rather than best practices.
You're misreading it.
"Destructors are invoked implicitly for constructed objects" … meaning those that exist and their existence has gone as far as complete construction. Although arguably not entirely spelled out, the original A does not meet this criterion as it is no longer "constructed": it does not exist at all! Only the new/replacement object is automatically destructed, then, at the end of main, as you'd expect.
Otherwise, this form of placement new would be pretty dangerous and of debatable value in the language. However, it's worth pointing out that re-using an actual A in this manner is a bit strange and unusual, if for no other reason than it leads to just this sort of question. Typically you'd placement-new into some bland buffer (like a char[N] or some aligned storage) and then later invoke the destructor yourself too.
Something resembling your example may actually be found at basic.life¶8: it's UB there, but only because someone constructed a T on top of a B; the wording suggests pretty clearly that this is the only problem with the code.
But here's the clincher:
The properties ascribed to objects throughout this International Standard apply for a given object only during its lifetime. [..] [basic.life¶3]
Am I misreading the standard, or is this actually undefined behaviour?
Neither. The standard is not unclear, but it could be clearer. The intent, though, is that the new object's destructor is called, as implied in [basic.life]p9.
[class.dtor]p12 isn't very accurate. I asked Core about it and Mike Miller (a very senior member) said:
I wouldn't say that it's a contradiction [[class.dtor]p12 vs [basic.life]p9], but clarification is certainly needed. The destructor description was written slightly naively, without taking into consideration that the original object occupying a bit of automatic storage might have been replaced by a different object occupying that same bit of automatic storage, but the intent was that if a constructor was invoked on that bit of automatic storage to create an object therein - i.e., if control flowed through that declaration - then the destructor will be invoked for the object presumed to occupy that bit of automatic storage when the block is exited - even if it's not the "same" object that was created by the constructor invocation.
I'll update this answer with the CWG issue as soon as it is published. So, your code does not have UB.
Too long for a comment.
Lightness' answer is correct and his link is the proper reference.
But let's examine terminology more precisely. There is
"Storage duration", concerning memory.
"Lifetime", concerning objects.
"Scope", concerning names.
For automatic variables all three coincide, which is why we often do not clearly distinguish them: a "variable goes out of scope". That is: the name goes out of scope; if it names an object with automatic storage duration, the destructor is called, ending the lifetime of the named object; and finally the memory is released.
In your example only name scope and storage duration coincide (at any point during its existence the name a refers to valid memory), while the object lifetime is split between two distinct objects at the same memory location and with the same name a.
And no, I think you cannot understand "constructed" in 11.3 as "fully constructed and not destroyed", because the destructor will (wrongly) be called again if the object's lifetime was ended prematurely by a preceding explicit destructor call.
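A minimal sketch of that premature-end case:

struct X { ~X() {} };  // non-trivial destructor

void g() {
    X x;
    x.~X();  // lifetime of x ends here
}            // scope exit invokes ~X() again on a dead object: UB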
In fact, that's one of the concerns with the concept of memory re-use: if construction of the new object fails with an exception, the scope is left and a destructor call is attempted on an incomplete object, or on the old object, which was already destroyed.
I suppose you can imagine the automatically allocated, typed memory marked with a tag "to be destroyed" which is evaluated when the stack is unwound. The C++ runtime does not really track individual objects or their state beyond this simple concept. Since variable names are basically constant addresses it is convenient to think of "the name going out of scope" triggering the destructor call on the named object of the supposed type supposedly present at that location. If one of these suppositions is wrong all bets are off.
Imagine using placement new to create a struct B in the storage where the A a object lives. At the end of the scope, the destructor of struct A will be called (because the variable a of type A goes out of scope), even though an object of type B is in reality living there right now.
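A sketch of that scenario (A and B here are illustrative types with non-trivial destructors):

#include <new>

struct A { ~A() {} };
struct B { ~B() {} };

void f() {
    A a;
    new (&a) B;  // reusing the storage ends a's lifetime; a B lives there now
}                // scope exit calls A::~A() on storage holding a B: UB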
As already cited:
"If a program ends the lifetime of an object of type T with static
([basic.stc.static]), thread ([basic.stc.thread]), or automatic
([basic.stc.auto]) storage duration and if T has a non-trivial
destructor,39 the program must ensure that an object of the original
type occupies that same storage location when the implicit destructor
call takes place;"
So after putting a B into a's storage, you would need to destroy the B and put an A there again, in order not to violate the rule above. The rule does not apply here directly, because you are putting an A over an A, but it shows the model, and it shows that this reasoning is wrong:
So the destructor for the original A object is invoked implicitly when main exits.
There is no "original" object any longer. There is just the object currently alive in the storage of a, and that's it. On the data currently sitting in a, a function is called, namely the destructor of A. That's what the program compiles to. If the program magically kept track of all "original" objects, you would have some kind of dynamic runtime behavior.
Additionally:
A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type with a non-trivial destructor. For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression ([expr.delete]) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
Since the destructor of A is not trivial and has side effects, (I think) it's undefined behavior. For built-in types this does not apply (hence you can use a char buffer as an object buffer without reconstructing the chars back into the buffer after using it), since they have trivial (no-op) destructors.

How bad is not freeing up memory right before the end of program?

As an example, let's talk about a singleton implementation using new (the one where you create the actual instance at the first call to the getInstance() method instead of using a static field). It dawned on me that it never frees that memory. But then again, it would have to do so right before the application closes, and the system will free that memory anyway.
Aside from bad design, what practical downsides does this approach have?
Edit: Re the comments - all valid points, thanks guys. So let me ask this instead: for a single-threaded app and a POD singleton class, are there any practical downsides? Just theoretically; I'm not going to actually do that.
for a single-threaded app and a POD singleton class, are there any practical downsides? Just theoretically; I'm not going to actually do that.
In standardese:
[c++14-Object lifetime-4]For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
where the dtor is implicitly called on automatic/static variables and such.
So (assuming a new-expression was used to construct the object), the runtime implementation of the invoked allocation function is free to release the memory and let the object decay into oblivion, as long as no observable effect depends on its destruction (which is trivially true for types with trivial dtors).
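For concreteness, the pattern under discussion looks like this (Config is a hypothetical trivially destructible type):

struct Config { int value; };  // trivial destructor, no side effects

Config& getInstance() {
    // Created on first call and never deleted; since ~Config() is trivial,
    // no observable effect depends on its destruction.
    static Config* instance = new Config();
    return *instance;
}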
Use a Schwartz counter for all singleton types. It's how std::cout is implemented.
Benefits:
thread safe
correct initialisation order guaranteed when singletons depend on each other
correct destruction order at program termination
does not use the heap
100% compliant with c++98, c++03, c++11, c++14, c++17...
no need for an ugly getInstance() function. Just use the global object.
https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Nifty_Counter
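A condensed sketch of the idiom from that page (the Stream type and the header/source split are illustrative):

// stream.h
#include <new>

struct Stream {
    Stream();
    ~Stream();
};
extern Stream& stream;  // the global object users refer to directly

// One of these lives in every translation unit that includes this header.
static struct StreamInitializer {
    StreamInitializer();
    ~StreamInitializer();
} streamInitializer;

// stream.cpp
static int niftyCounter;  // zero-initialized before any dynamic initialization
alignas(Stream) static unsigned char streamBuf[sizeof(Stream)];
Stream& stream = reinterpret_cast<Stream&>(streamBuf);

Stream::Stream() { /* acquire resources */ }
Stream::~Stream() { /* e.g. flush buffers */ }

StreamInitializer::StreamInitializer() {
    if (niftyCounter++ == 0) new (streamBuf) Stream;  // first TU constructs it
}
StreamInitializer::~StreamInitializer() {
    if (--niftyCounter == 0) stream.~Stream();        // last TU destroys it
}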
As for the headline question:
How bad is not freeing up memory right before the end of program?
Not freeing memory is not so bad when the program is running under an OS which manages process memory.
However, other things that singletons do could include flushing IO buffers (e.g. std::cout, std::cerr). That's probably something you want to avoid losing.

Are moved from string and vector required to not own any heap memory?

I know moved-from objects are in an unspecified but destructible state, and I know that generally means they can own memory, file handles...
But I do not know whether moved-from std::strings and std::vectors are allowed to own any memory.
So, for example, is the following function potentially leaking memory, or is it fine according to the C++ standard?
#include <new>
#include <string>
#include <type_traits>
#include <utility>

void f() {
    std::aligned_storage_t<sizeof(std::string), alignof(std::string)> memory;
    std::string& src = *new (&memory) std::string("98->03->11->14->17->20");
    std::string dest(std::move(src));
}
Notes:
I am interested in the ISO standard. I know that in the most obvious implementations src should not own any memory after the move; I am interested in the "legal" status of this code.
I know the code presented here is not the "proper" way to write C++; it is just an example to explain my question.
I am asking specifically about std::string and std::vector; I know this is not generally true.
No; a pathological implementation is free to move-construct any specific std::string as a copy, leaving the source alone, so long as the operation doesn't throw (there must be a length beyond which this does not happen, to obey the O(1) guarantee).
A std::vector's iterator invalidation rules are tighter; a move would have to be pathologically evil for the source to still own memory afterwards. Similarly, it may not throw, even if allocation fails.
Both of these are unreasonable possibilities; but so is skipping destruction.
There's nothing in the standard that requires a moved-from object to no longer own any resources. (Other than performance guarantees but I don't see them preventing such ownership in this case).
Regarding your program, see [basic.life/4]:
For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
The phrase "any program that depends on the side effects" is not as precise a wording as we like to see in a standards document, but it is usually interpreted to mean "anything other than a destructor with no observable behaviour". We don't know what the library implementation might have put in its destructors for vector and string (e.g. it could have tracking instrumentation in debug mode).
So I would say your program causes undefined behaviour by omitting the destructor call, although there is some room for debate.
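For completeness, here is the same snippet with the explicit destructor call added, which removes the doubt entirely:

#include <new>
#include <string>
#include <type_traits>
#include <utility>

void f() {
    std::aligned_storage_t<sizeof(std::string), alignof(std::string)> memory;
    std::string& src = *new (&memory) std::string("98->03->11->14->17->20");
    std::string dest(std::move(src));
    src.~basic_string();  // explicit destruction: no leak even if src still owns memory
}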

Garbage Collection in C++11

I have been looking through and playing with different features of C++11, specifically in Visual Studio 2010.
One of the things mentioned is minimal garbage collection:
According to this blog post, VC10 supports this feature.
My tests show that the destructor is not called on objects that are lost, so I am not sure whether their memory has been freed or is leaking.
I have no intention of depending on it, by any means, but couldn't find a straight, definitive answer on its behavior.
Minimal GC support (n2670) only means that functions like std::declare_reachable are included and that the meaning of a "safely-derived pointer" is defined, so certain operations, like XOR-ing pointer values, become undefined behavior and the GC doesn't need to worry about them. See also Bjarne Stroustrup's C++11 FAQ on the GC ABI, and n2585: Minimal Support for Garbage Collection and Reachability-Based Leak Detection.
The proposal allows a GC to be implemented within C++11's framework, but it does not by itself mean an implementation needs to support GC. Some libraries, e.g. libc++, simply implement the library functions as no-ops.
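For reference, a minimal sketch of how that API is meant to be used (on such implementations these calls are effectively no-ops):

#include <memory>

int main() {
    int* p = new int(42);
    std::declare_reachable(p);        // object stays reachable even if the
                                      // pointer value is later obfuscated
    // ... the pointer value could be XOR-ed or written to disk here ...
    p = std::undeclare_reachable(p);  // returns a safely-derived pointer again
    delete p;
}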
I'm pretty sure that, at this point, the memory in your case is just leaked away. But notice that the destructor is indeed not required to run when GC happens. Assuming "§3.8 Object lifetime" also applies to GC-ed pointers, we have (§3.8/4):
... For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
So it is also possible that the memory is freed without the destructor ever being called. In fact, earlier GC proposals such as n2310: Transparent Programmer-Directed Garbage Collection for C++ explicitly state (n2310 §7):
When an object is recycled by the garbage collector, its destructor is not invoked (of course, explicit deletion always invokes destructors).

Why exactly is calling the destructor for the second time undefined behavior in C++?

As mentioned in this answer, simply calling the destructor a second time is already undefined behavior, per 12.4/14 (3.8).
For example:
class Class {
public:
    ~Class() {}
};

// somewhere in code:
{
    Class* object = new Class();
    object->~Class();
    delete object; // UB because at this point the destructor call is attempted again
}
In this example the class is designed in such a way that the destructor could be called multiple times: nothing like a double deletion can happen. The memory is still allocated at the point where delete is called; the first destructor call doesn't call ::operator delete() to release the memory.
For example, in Visual C++ 9 the above code appears to work. Even the C++ definition of UB doesn't directly prohibit things qualified as UB from working. So for the code above to break, some implementation and/or platform specifics are required.
Why exactly would the above code break and under what conditions?
I think your question aims at the rationale behind the standard. Think about it the other way around:
Defining the behavior of calling a destructor twice creates work, possibly a lot of work.
Your example only shows that in some trivial cases it wouldn't be a problem to call the destructor twice. That's true but not very interesting.
You did not give a convincing use case (and I doubt you can) where calling the destructor twice is in any way a good idea / makes code easier / makes the language more powerful / cleans up semantics / or anything else.
So why again should this not cause undefined behavior?
The reason for the formulation in the standard is most probably that everything else would be vastly more complicated: it’d have to define when exactly double-deleting is possible (or the other way round) – i.e. either with a trivial destructor or with a destructor whose side-effect can be discarded.
On the other hand, there’s no benefit for this behaviour. In practice, you cannot profit from it because you can’t know in general whether a class destructor fits the above criteria or not. No general-purpose code could rely on this. It would be very easy to introduce bugs that way. And finally, how does it help? It just makes it possible to write sloppy code that doesn’t track life-time of its objects – under-specified code, in other words. Why should the standard support this?
Will existing compilers/runtimes break your particular code? Probably not – unless they have special run-time checks to prevent illegal access (to prevent what looks like malicious code, or simply leak protection).
The object no longer exists after you call the destructor.
So if you call it again, you're calling a method on an object that doesn't exist.
Why would this ever be defined behavior? The compiler may choose to zero out the memory of an object which has been destructed, for debugging/security/some reason, or recycle its memory with another object as an optimisation, or whatever. The implementation can do as it pleases. Calling the destructor again is essentially calling a method on arbitrary raw memory - a Bad Idea (tm).
When you use the facilities of C++ to create and destroy your objects, you agree to use its object model, however it's implemented.
Some implementations may be more sensitive than others. For example, an interactive interpreted environment or a debugger might try harder to be introspective. That might even include specifically alerting you to double destruction.
Some objects are more complicated than others. For example, virtual destructors with virtual base classes can be a bit hairy. The dynamic type of an object changes over the execution of a sequence of virtual destructors, if I recall correctly. That could easily lead to invalid state at the end.
It's easy enough to declare properly named functions to use instead of abusing the constructor and destructor. Object-oriented straight C is still possible in C++, and may be the right tool for some job… in any case, the destructor isn't the right construct for every destruction-related task.
Destructors are not regular functions. Calling one doesn't call one function; it calls many functions. It's the magic of destructors. While you have provided a trivial destructor with the sole intent of making it hard to show how it might break, you have failed to demonstrate what the other functions that get called do. And neither does the standard. It's in those functions that things can potentially fall apart.
As a trivial example, let's say the compiler inserts code to track object lifetimes for debugging purposes. The constructor [which is also a magic function that does all sorts of things you didn't ask it to] stores some data somewhere that says "Here I am." Before the destructor is called, it changes that data to say "There I go". After the destructor is called, it gets rid of the information it used to find that data. So the next time you call the destructor, you end up with an access violation.
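A toy, entirely hypothetical version of such a tracker (no real compiler is claimed to emit this):

#include <cassert>
#include <unordered_set>

static std::unordered_set<void*> live;  // the "Here I am" bookkeeping

struct Tracked {
    Tracked()  { live.insert(this); }
    ~Tracked() {
        bool wasLive = live.erase(this) == 1;
        assert(wasLive && "destructor invoked on an already-destroyed object");
    }
};

int main() {
    Tracked* t = new Tracked;
    t->~Tracked();
    t->~Tracked();  // second call trips the tracker's assert
}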
You could probably also come up with examples that involve virtual tables, but your sample code didn't include any virtual functions so that would be cheating.
The following Class will crash on Windows on my machine if you call the destructor twice:
class Class {
public:
    Class() {
        x = new int;
    }
    ~Class() {
        delete x;
        x = (int*)0xbaadf00d;
    }
    int* x;
};
I can imagine an implementation where it would crash even with a trivial destructor. For instance, such an implementation could remove destructed objects from physical memory, so that any access to them leads to a hardware fault. Visual C++ doesn't look like that sort of implementation, but who knows.
Standard 12.4/14:
Once a destructor is invoked for an object, the object no longer exists; the behavior is undefined if the destructor is invoked for an object whose lifetime has ended (3.8).
I think this section refers to invoking the destructor via delete. In other words, the gist of this paragraph is that "deleting an object twice is undefined behavior". So that's why your code example works fine.
Nevertheless, this question is rather academic. Destructors are meant to be invoked via delete (apart from the exception of objects allocated via placement new, as sharptooth correctly observed). If you want to share code between the destructor and a second function, simply extract the code into a separate function and call that from your destructor.
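That is, instead of invoking the destructor a second time, something like this (sketch):

class Widget {
public:
    ~Widget() { cleanup(); }
    void cleanup() { /* shared tear-down logic, safe to call directly */ }
};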
Since what you're really asking for is a plausible implementation in which your code would fail, suppose that your implementation provides a helpful debugging mode, in which it tracks all memory allocations and all calls to constructors and destructors. So after the explicit destructor call, it sets a flag to say that the object has been destructed. delete checks this flag and halts the program when it detects the evidence of a bug in your code.
To make your code "work" as you intended, this debugging implementation would have to special-case your do-nothing destructor, and skip setting that flag. That is, it would have to assume that you're deliberately destroying twice because (you think) the destructor does nothing, as opposed to assuming that you're accidentally destroying twice, but failed to spot the bug because the destructor happens to do nothing. Either you're careless or you're a rebel, and there's more mileage in debug implementations helping out people who are careless than there is in pandering to rebels ;-)
One important example of an implementation which could break:
A conforming C++ implementation can support Garbage Collection. This has been a longstanding design goal. A GC may assume that an object can be GC'ed immediately when its dtor is run. Thus each dtor call will update its internal GC bookkeeping. The second time the dtor is called for the same pointer, the GC data structures might very well become corrupted.
By definition, the destructor 'destroys' the object, and destroying an object twice makes no sense.
Your example works, but it's unlikely to work in general.
I guess it's been classified as undefined because most double deletes are dangerous, and the standards committee didn't want to add an exception to the standard for the relatively few cases where they don't have to be.
As for where your code could break: you might find it breaks in debug builds on some compilers; many compilers treat UB as 'do whatever doesn't cost performance for well-defined behaviour' in release mode, and 'insert checks to detect bad behaviour' in debug builds.
Basically, as already pointed out, calling the destructor a second time will fail for any class destructor that performs work.
It's undefined behavior because the standard made it clear what a destructor is used for, and didn't decide what should happen if you use it incorrectly. Undefined behavior doesn't necessarily mean "crashy smashy," it just means the standard didn't define it so it's left up to the implementation.
While I'm not too fluent in C++, my gut tells me that the implementation is welcome either to treat the destructor as just another member function, or to actually destroy the object when the destructor is called. So it might break in some implementations, but maybe it won't in others. Who knows; it's undefined (look out for demons flying out of your nose if you try).
It is undefined because if it weren't, every implementation would have to record, via some metadata, whether an object is still alive or not. You would have to pay that cost for every single object, which goes against basic C++ design rules.
The reason is that your class might be, for example, a reference-counted smart pointer, whose destructor decrements the reference counter; once that counter hits 0 the actual object is cleaned up.
But if you call the destructor twice, the count will be corrupted.
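A sketch of why (RefCounted and Handle are illustrative names):

struct RefCounted {
    int refs;
    // ... shared payload ...
};

struct Handle {
    RefCounted* p;
    ~Handle() {
        // A second ~Handle() on the same object decrements again:
        // either a double delete or a premature free for the other owners.
        if (--p->refs == 0) delete p;
    }
};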
Same idea for other situations too. Maybe the destructor writes 0s to a piece of memory and then deallocates it (so you don't accidentally leave a user's password in memory). If you try to write to that memory again - after it has been deallocated - you will get an access violation.
It just makes sense for objects to be constructed once and destructed once.
The reason is that, in the absence of that rule, your programs would become less strict. Being more strict, even when it's not enforced at compile time, is good, because in return you gain more predictability of how the program will behave. This is especially important when the source code of the classes is not under your control.
A lot of concepts rely on this rule: RAII, smart pointers, and plain generic allocation and freeing of memory. The number of times the destructor will be called (exactly one) is essential for them. So the documentation for such things usually promises: "Use our classes according to C++ language rules, and they will work correctly!"
If there weren't such a rule, it would instead read: "Use our classes according to C++ language rules, and yes, don't call the destructor twice; then they will work correctly." A lot of specifications would sound that way.
The concept is just too important for the language to skip it in the standard document.
This is the reason. Not anything related to binary internals (which are described in Potatoswatter's answer).