Use of std::uninitialized_copy on initialized memory - c++

If std::uninitialized_copy is used on memory that already contains objects, does this cause a memory leak or undefined behavior?
For example:
std::vector<std::string> u = {"1", "2", "3"};
std::vector<std::string> v = {"4", "5", "6"};
// What happens to the original elements in v?
std::uninitialized_copy(u.begin(), u.end(), v.begin());

TL;DR: Don't do this.
Assuming your std::string implementation uses SSO1 (all modern ones do, I believe), this avoids leaking only if the strings are short enough to be SSO'ed.
Any possible UB here is governed by [basic.life]/5:
A program may end the lifetime of an object of class type without invoking the destructor, by reusing or releasing the storage as described above.
[Note 3: A delete-expression ([expr.delete]) invokes the destructor prior to releasing the storage.
— end note]
In this case, the destructor is not implicitly invoked and any program that depends on the side effects produced by the destructor has undefined behavior.
It's not entirely clear to me what "depending on side effects" entails (can you solemnly declare that you don't mind the lack of side effects, and get rid of UB this way?), but destroying a SSO'ed string should have no side effects.
But! If you enable iterator debugging, then the destructor might have side effects regardless of SSO (to somehow notify the iterators that they should be invalidated). Then skipping the destructor might be problematic.
1 SSO = small (or short) string optimization = not allocating a short string on the heap, and instead embedding it directly into the std::string instance.
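
For contrast with the snippet in the question, here is a minimal sketch (not part of the original answer, and assuming C++17 for std::destroy) of the two patterns that avoid the problem: plain assignment into the live elements, and uninitialized_copy into genuinely raw storage:

#include <algorithm>
#include <memory>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> u = {"1", "2", "3"};
    std::vector<std::string> v = {"4", "5", "6"};

    // Safe: assignment reuses the live objects; each element's old
    // buffer is released by operator= as usual.
    std::copy(u.begin(), u.end(), v.begin());

    // Canonical use of uninitialized_copy: the destination is raw
    // storage that holds no objects yet.
    std::allocator<std::string> alloc;
    std::string* raw = alloc.allocate(u.size());
    std::string* end = std::uninitialized_copy(u.begin(), u.end(), raw);
    std::destroy(raw, end);              // end the lifetimes we began
    alloc.deallocate(raw, u.size());
}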

Based on [specialized.algorithms.general] and [uninitialized.copy], I don't find any formal requirement that the destination range must be uninitialized. As such, we can consider the effect of std::uninitialized_copy, which is defined as equivalent to the following (excluding the implied exception-safety boilerplate):
for (; first != last; ++result, (void) ++first)
    ::new (voidify(*result))
        typename iterator_traits<NoThrowForwardIterator>::value_type(*first);
We can conclude that std::uninitialized_copy does not call any destructor or otherwise cares about what was previously located in the destination range. It simply overwrites it, assuming it is uninitialized.
To figure out what this means in terms of correctness, we can refer to [basic.life]:
A program may end the lifetime of an object of class type without invoking the destructor, by reusing or releasing the storage as described above. In this case, the destructor is not implicitly invoked and any program that depends on the side effects produced by the destructor has undefined behavior.
This however uses the loosely defined notion of "any program that depends on the side effects of". What does it mean to "depend on the side effects of"?
If your question were about overwriting vectors of char or int, there would be no problem: these are not class types and have no destructors, so there can be no side effects to depend on. However, std::string's destructor may have the effect of releasing resources, and std::basic_string may have additional, more directly observable side effects if a user-defined allocator is used. Note that for a range of non-class-type elements std::uninitialized_copy is not required anyway: such elements allow for vacuous initialization, and simply copying them into uninitialized storage with std::copy is fine.
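As a small illustration of that last point, here is a sketch (mine, not the answer's), assuming C++20 so that malloc's implicit object creation makes the assignments formally well defined:

#include <algorithm>
#include <cstdlib>

int main()
{
    int src[3] = {1, 2, 3};
    // int allows vacuous initialization, so plain std::copy into raw
    // storage is fine; no destructors, no uninitialized_copy needed.
    int* raw = static_cast<int*>(std::malloc(sizeof src));
    if (raw) {
        std::copy(src, src + 3, raw);
        std::free(raw);
    }
}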
Since I don't believe it is possible for the behavior of a program to depend on the release of std::string's resources, I believe the code above is correct in terms of having well defined behavior, though it may leak resources. An argument could be made that the behavior might rely on std::bad_alloc eventually being thrown, but std::string isn't strictly speaking required to dynamically allocate. However, if the type of element used had side effects which could influence the behavior, and the program depended on those effects, then it would be UB.
In general, while it may be well defined in some cases, the code shown violates assumptions on which RAII is based, which is a fundamental feature most real programs depend on. On these grounds std::uninitialized_copy should not be used to copy to a range which already contains objects of class type.

Uninitialized memory in C++ is memory that contains no valid objects; it is obtained by a call to std::get_temporary_buffer, std::aligned_alloc, std::malloc, or similar functions, or set aside via std::aligned_storage.
There is no way to ask whether a given piece of memory contains initialized objects; developers must keep track of this themselves. Compiler sanitizers can track it, but their memory attributes are not available to the program.
std::uninitialized_copy expects uninitialized memory and makes no other assumptions. Giving it initialized memory may result in memory leaks and undefined behavior.
Here is the specification of the specialized memory algorithm uninitialized_copy from the Algorithms library of the Standard:
template<class InputIterator, class NoThrowForwardIterator>
  NoThrowForwardIterator uninitialized_copy(InputIterator first, InputIterator last,
                                            NoThrowForwardIterator result);
Preconditions: result + [0, (last - first)) does not overlap with [first, last).
Effects: Equivalent to:
for (; first != last; ++result, (void) ++first)
    ::new (voidify(*result))
        typename iterator_traits<NoThrowForwardIterator>::value_type(*first);
Returns: result.
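
For completeness, the exception-safety boilerplate that the Effects clause leaves implicit can be written out. Here is a sketch of a conforming implementation (the name uninitialized_copy_sketch is mine, and it assumes C++17 for std::destroy): if any constructor throws, everything constructed so far is destroyed before the exception propagates.

#include <iterator>
#include <memory>
#include <new>

template <class InputIt, class FwdIt>
FwdIt uninitialized_copy_sketch(InputIt first, InputIt last, FwdIt result)
{
    using T = typename std::iterator_traits<FwdIt>::value_type;
    FwdIt current = result;
    try {
        for (; first != last; ++current, (void)++first)
            ::new (static_cast<void*>(std::addressof(*current))) T(*first);
        return current;
    } catch (...) {
        std::destroy(result, current);   // roll back partial construction
        throw;
    }
}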

Related

Are moved-from string and vector required to not own any heap memory?

I know moved-from objects are in an unspecified but destructible state, and I know that generally means they can own memory, file handles...
But I do not know whether moved-from std::strings and std::vectors are allowed to own any memory.
So, for example, is the following function potentially leaking memory, or is it fine according to the C++ standard?
void f(){
    std::aligned_storage_t<sizeof(std::string), alignof(std::string)> memory;
    std::string& src = *new (&memory) std::string("98->03->11->14->17->20");
    std::string dest(std::move(src));
}
notes:
I am interested in the ISO standard; I know that in most obvious implementations src should not own any memory after the move. I am interested in the "legal" status of this code.
I know the code presented here is not the "proper" way to write C++; it is just an example to explain my question.
I am asking specifically about std::string and std::vector; I know this is not true in general.
No; pathological implementations are free to move-construct any specific std string as a copy, leaving the source alone, so long as the operation doesn't throw. (There must be a length beyond which this does not happen, to honor the O(1) complexity guarantee.)
A std vector's iterator invalidation rules are tighter; a move would have to be pathologically evil for the source to own memory afterwards. Similarly, it may not throw, even if allocation fails.
Both of these are unreasonable possibilities; but so is skipping destruction.
There's nothing in the standard that requires a moved-from object to no longer own any resources. (Other than performance guarantees but I don't see them preventing such ownership in this case).
Regarding your program, see [basic.life/4]:
For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
The part of this saying "any program that depends on the side effects" is not as precise a wording as we would like to see in a standards document, but it's usually interpreted to mean "anything other than a destructor that has no observable behaviour". We don't know what the library implementation might have put in its destructors for vector and string (e.g. it could have debug-mode tracking).
So I would say your program causes undefined behaviour by omitting the destructor call, although there is some room for debate.
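If you want the function to be leak-free regardless of what a moved-from string may still own, the fix is to end src's lifetime explicitly. A minimal sketch (mine), assuming C++17 for std::destroy_at:

#include <memory>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

void f()
{
    std::aligned_storage_t<sizeof(std::string), alignof(std::string)> memory;
    std::string& src = *new (&memory) std::string("98->03->11->14->17->20");
    std::string dest(std::move(src));
    std::destroy_at(&src);  // explicit destructor call: releases whatever
                            // the moved-from string may still own
}                           // dest is destroyed normally here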

Managing trivial types

I have found the intricacies of trivial types in C++ non-trivial to understand and hope someone can enlighten me on the following.
Given type T, storage for T allocated using ::operator new(std::size_t) or ::operator new[](std::size_t) or std::aligned_storage, and a void * p pointing to a location in that storage suitably aligned for T so that it may be constructed at p:
If std::is_trivially_default_constructible<T>::value holds, does code invoke undefined behavior when it skips the initialization of T at p (i.e., skips T * tPtr = new (p) T();) before otherwise accessing *p as T? Can one just use T * tPtr = static_cast<T *>(p); instead, without fear of undefined behavior, in this case?
If std::is_trivially_destructible<T>::value holds, does skipping the destruction of T at *p (i.e., never calling tPtr->~T();) cause undefined behavior?
For any type U for which std::is_trivially_assignable<T, U>::value holds, is std::memcpy(&t, &u, sizeof(U)); equivalent to t = std::forward<U>(u); (for any t of type T and u of type U), or will it cause undefined behavior?
No, you can't. There is no object of type T in that storage, and accessing the storage as if there was is undefined. See also T.C.'s answer here.
Just to clarify on the wording in [basic.life]/1, which says that objects with vacuous initialization are alive from the storage allocation onward: that wording obviously refers to an object's initialization. There is no object whose initialization is vacuous when allocating raw storage with operator new or malloc, hence we cannot consider "it" alive, because "it" does not exist. In fact, only objects created by a definition with vacuous initialization can be accessed after storage has been allocated but before the vacuous initialization occurs (i.e. their definition is encountered).
Omitting destructor calls never per se leads to undefined behavior. However, it's pointless to attempt any optimizations in this area in e.g. templates, since a trivial destructor is just optimized away.
Right now, the requirement is being trivially copyable, and the types have to match. However, this may be too strict. Dos Reis's N3751 at least proposes that distinct types work as well, and I could imagine this rule being extended to trivial copy assignment within one type in the future.
However, what you've specifically shown does not make a lot of sense (not least because you're asking for assignment to a scalar xvalue, which is ill-formed), since trivial assignment can hold between types whose assignment is not actually "trivial", that is, has the same semantics as memcpy. E.g. is_trivially_assignable<int&, double> does not imply that one can be "assigned" to the other by copying the object representation.
Technically, reinterpreting storage is not enough to introduce a new object. Look at the note for trivial default constructors, which states:
A trivial default constructor is a constructor that performs no action. All data types compatible with the C language (POD types) are trivially default-constructible. Unlike in C, however, objects with trivial default constructors cannot be created by simply reinterpreting suitably aligned storage, such as memory allocated with std::malloc: placement-new is required to formally introduce a new object and avoid potential undefined behavior.
But the note says it's a formal limitation, so probably it is safe in many cases. Not guaranteed though.
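In code, the difference the note describes looks like this (a sketch of mine; C++20's implicit-lifetime rules have since relaxed this for types like int, but the formally safe route is shown):

#include <cstdlib>
#include <new>

int main()
{
    void* p = std::malloc(sizeof(int));
    if (!p) return 1;
    // Formally safe: placement-new introduces the object.
    int* q = ::new (p) int();   // value-initialized to 0
    *q = 42;
    // Merely reinterpreting would skip object creation:
    // int* r = static_cast<int*>(p); *r = 42;  // formally questionable pre-C++20
    std::free(p);
}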
No. is_assignable does not even guarantee that the assignment will be legal under all conditions:
This trait does not check anything outside the immediate context of the assignment expression: if the use of T or U would trigger template specializations, generation of implicitly-defined special member functions etc, and those have errors, the actual assignment may not compile even if std::is_assignable::value compiles and evaluates to true.
What you describe looks more like is_trivially_copyable, which says:
Objects of trivially-copyable types are the only C++ objects that may be safely copied with std::memcpy or serialized to/from binary files with std::ofstream::write()/std::ifstream::read().
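To make the distinction concrete, here is a sketch (mine, with a hypothetical type Pod) of the case that is guaranteed: both sides are the same trivially copyable type, so copying the object representation has the same effect as assignment:

#include <cstring>
#include <type_traits>

struct Pod { int x; double y; };   // hypothetical trivially copyable type
static_assert(std::is_trivially_copyable<Pod>::value,
              "memcpy of the object representation is permitted");

int main()
{
    Pod a{1, 2.0};
    Pod b;
    std::memcpy(&b, &a, sizeof a);  // same effect as b = a;
    return b.x;
}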
I don't really know. I would trust KerrekSB's comments.

Does a memory leak cause undefined behaviour? [duplicate]

Turns out many innocent-looking things are undefined behavior in C++. For example, once a non-null pointer has been delete'd, even printing out that pointer value is undefined behavior.
Now memory leaks are definitely bad. But what class of situation are they - defined, undefined, or some other class of behavior?
Memory leaks.
There is no undefined behavior. It is perfectly legal to leak memory.
Undefined behavior covers actions the standard specifically does not want to define and leaves up to the implementation, so that implementations are flexible to perform certain types of optimizations without breaking the standard.
Memory management is well defined.
If you dynamically allocate memory and don't release it, then the memory remains the property of the application, to manage as it sees fit. The fact that you have lost all references to that portion of memory is neither here nor there.
Of course if you continue to leak then you will eventually run out of available memory and the application will start to throw bad_alloc exceptions. But that is another issue.
Memory leaks are definitely defined in C/C++.
If I do:
int *a = new int[10];
followed by
a = new int[10];
I'm definitely leaking memory as there is no way to access the 1st allocated array and this memory is not automatically freed as GC is not supported.
But the consequences of this leak are unpredictable and will vary from application to application, and from machine to machine for the same application. An application that crashes due to leaking on one machine might work just fine on another machine with more RAM. And for a given application on a given machine, the crash due to the leak can appear at different times during the run.
If you leak memory, execution proceeds as if nothing happened. This is defined behavior.
Down the track, you may find that a call to malloc fails due to there not being enough available memory. But this is a defined behavior of malloc, and the consequences are also well-defined: the malloc call returns NULL.
Now this may cause a program that doesn't check the result of malloc to fail with a segmentation violation. But that undefined behavior is (from the POV of the language specs) due to the program dereferencing an invalid pointer, not the earlier memory leak or the failed malloc call.
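A minimal sketch (not from the original answer) of that defined failure path:

#include <cstdio>
#include <cstdlib>

int main()
{
    void* p = std::malloc(1024u * 1024u);
    if (p == nullptr) {
        // Defined: malloc signals failure by returning a null pointer.
        std::puts("allocation failed");
        return 1;
    }
    // Dereferencing p without this check, after a failed malloc, would
    // be the program's UB, not malloc's.
    std::free(p);
}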
My interpretation of this statement:
For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
is as follows:
If you somehow manage to free the storage which the object occupies without calling the destructor on the object that occupied the memory, UB is the consequence, if the destructor is non-trivial and has side-effects.
If new allocates with malloc, the raw storage could be released with free(): the destructor would not run, and UB would result. Or if a pointer is cast to an unrelated type and deleted, the memory is freed but the wrong destructor runs: UB again.
This is not the same as an omitted delete, where the underlying memory is not freed. Omitting delete is not UB.
(Comment below "Heads-up: this answer has been moved here from Does a memory leak cause undefined behaviour?" - you'll probably have to read that question to get proper background for this answer O_o).
It seems to me that this part of the Standard explicitly permits:
having a custom memory pool that you placement-new objects into, then release/reuse the whole thing without spending time calling their destructors, as long as you don't depend on side effects of the object destructors (a sketch follows this list).
libraries that allocate a bit of memory and never release it, probably because their functions/objects could be used by destructors of static objects and registered on-exit handlers, and it's not worth buying into the whole orchestrated-order-of-destruction or transient "phoenix"-like rebirth each time those accesses happen.
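Here is a sketch of the first pattern (a hypothetical fixed-size arena of my own devising; only sound as long as nothing depends on the skipped destructors' side effects):

#include <cstddef>
#include <new>
#include <string>

class Arena {
    alignas(std::max_align_t) unsigned char buf_[4096];
    std::size_t used_ = 0;
public:
    void* allocate(std::size_t n, std::size_t align)
    {
        std::size_t p = (used_ + align - 1) & ~(align - 1);  // round up
        if (p + n > sizeof buf_) throw std::bad_alloc{};
        used_ = p + n;
        return buf_ + p;
    }
    void reset() { used_ = 0; }  // reuse storage; destructors never run
};

int main()
{
    Arena a;
    void* slot = a.allocate(sizeof(std::string), alignof(std::string));
    new (slot) std::string("short");  // SSO'ed: destructor has no side effects
    a.reset();                        // storage reused without destruction
}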
I can't understand why the Standard chooses to leave the behaviour undefined when there are dependencies on side effects - rather than simply say those side effects won't have happened and let the program have defined or undefined behaviour as you'd normally expect given that premise.
We can still consider what the Standard says is undefined behaviour. The crucial part is:
"depends on the side effects produced by the destructor has undefined behavior."
The Standard §1.9/12 explicitly defines side effects as follows (the italics below are the Standard's, indicating the introduction of a formal definition):
Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.
In your program, there's no dependency so no undefined behaviour.
One example of dependency arguably matching the scenario in §3.8 p4, where the need for or cause of undefined behaviour isn't apparent, is:
struct X
{
    ~X() { std::cout << "bye!\n"; }
};

int main()
{
    new X();
}
An issue people are debating is whether the X object above would be considered released for the purposes of 3.8 p4, given it's probably only released to the O.S. after program termination - it's not clear from reading the Standard whether that stage of a process's "lifetime" is in scope for the Standard's behavioural requirements (my quick search of the Standard didn't clarify this). I'd personally hazard that 3.8p4 applies here, partly because as long as it's ambiguous enough to be argued, a compiler writer may feel entitled to allow undefined behaviour in this scenario; but even if the above code doesn't constitute release, the scenario's easily amended, à la...
int main()
{
    X* p = new X();
    *(char*)p = 'x'; // token memory reuse...
}
Anyway, however main's implemented the destructor above has a side effect - per "calling a library I/O function"; further, the program's observable behaviour arguably "depends on" it in the sense that buffers that would be affected by the destructor were it to have run are flushed during termination. But is "depends on the side effects" only meant to allude to situations where the program would clearly have undefined behaviour if the destructor didn't run? I'd err on the side of the former, particularly as the latter case wouldn't need a dedicated paragraph in the Standard to document that the behaviour is undefined. Here's an example with obviously-undefined behaviour:
int* p_;

struct X
{
    ~X() { if (b_) p_ = 0; else delete p_; }
    bool b_;
};

X x{true};

int main()
{
    p_ = new int();
    delete p_;         // p_ now holds a freed pointer
    new (&x) X{false}; // reuse x without calling its destructor
}
When x's destructor is called during termination, b_ will be false and ~X() will therefore delete p_ on an already-freed pointer, creating undefined behaviour. If x.~X(); had been called before the reuse, p_ would have been set to 0 and the deletion would have been safe. In that sense, the program's correct behaviour could be said to depend on the destructor, and the behaviour is clearly undefined; but have we just crafted a program that matches 3.8p4's described behaviour in its own right, rather than having the behaviour be a consequence of 3.8p4...?
More sophisticated scenarios with issues - too long to provide code for - might include e.g. a weird C++ library with reference counters inside file stream objects that had to hit 0 to trigger some processing such as flushing I/O or joining of background threads etc. - where failure to do those things risked not only failing to perform output explicitly requested by the destructor, but also failing to output other buffered output from the stream, or on some OS with a transactional filesystem might result in a rollback of earlier I/O - such issues could change observable program behaviour or even leave the program hung.
Note: it's not necessary to prove that there's any actual code that behaves strangely on any existing compiler/system; the Standard clearly reserves the right for compilers to have undefined behaviour... that's all that matters. This is not something you can reason about and choose to ignore the Standard - it may be that C++14 or some other revision changes this stipulation, but as long as it's there, then if there's even arguably some "dependency" on side effects, there's the potential for undefined behaviour (which of course is itself allowed to be defined by a particular compiler/implementation, so it doesn't automatically mean that every compiler is obliged to do something bizarre).
The language specification says nothing about "memory leaks". From the language point of view, when you create an object in dynamic memory, you are doing just that: you are creating an anonymous object with unlimited lifetime/storage duration. "Unlimited" in this case means that the object can only end its lifetime/storage duration when you explicitly deallocate it, but otherwise it continues to live forever (as long as the program runs).
Now, we usually consider a dynamically allocated object to become a "memory leak" at the point in program execution when all references (generic "references", like pointers) to that object are lost beyond recovery. Note that even to a human the notion of "all references being lost" is not very precisely defined. What if we have a reference to some part of the object which can theoretically be "recalculated" into a reference to the entire object? Is that a memory leak or not? What if we have no references to the object whatsoever, but can somehow calculate such a reference using other information available to the program (like the precise sequence of allocations)?
The language specification doesn't concern itself with issues like that. Whatever you consider an appearance of a "memory leak" in your program, from the language point of view it is a non-event. A "leaked" dynamically allocated object just continues to live happily until the program ends. This is the only remaining point of concern: what happens when the program ends and some dynamic memory is still allocated?
If I remember correctly, the language does not specify what happens to dynamic memory that is still allocated at the moment of program termination. No attempt is made to automatically destruct/deallocate the objects you created in dynamic memory. But there's no formal undefined behavior in cases like that.
The burden of evidence is on those who would think a memory leak could be C++ UB.
Naturally no evidence has been presented.
In short, for anyone harboring any doubt: this question can never be clearly resolved, except by very credibly threatening the committee with e.g. loud Justin Bieber music, so that they add a C++14 statement clarifying that it's not UB.
At issue is C++11 §3.8/4:
For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
This passage had the exact same wording in C++98 and C++03. What does it mean?
the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released
– means that one can grab the memory of a variable and reuse that memory, without first destroying the existing object.
if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called
– means if one does not destroy the existing object before the memory reuse, then if the object is such that its destructor is automatically called (e.g. a local automatic variable) then the program has Undefined Behavior, because that destructor would then operate on a no longer existing object.
and any program that depends on the side effects produced by the destructor has undefined behavior
– can't mean literally what it says, because a program always depends on any side effects, by the definition of side effect. Or in other words, there is no way for the program not to depend on the side effects, because then they would not be side effects.
Most likely what was intended was not what finally made its way into C++98, so that what we have at hand is a defect.
From the context one can guess that if a program relies on the automatic destruction of an object of statically known type T, where the memory has been reused to create an object or objects that is not a T object, then that's Undefined Behavior.
Those who have followed the commentary may notice that the above explanation of the word "shall" is not the meaning I assumed earlier. As I see it now, the "shall" is not a requirement on the implementation and what it's allowed to do. It's a requirement on the program and what the code is allowed to do.
Thus, this is formally UB:
auto main() -> int
{
    string s( 666, '#' );
    new( &s ) string( 42, '-' ); // <- Storage reuse.
    cout << s << endl;
    // <- Formal UB, because the original object's destructor is implicitly invoked.
}
But this is OK with a literal interpretation:
auto main() -> int
{
    string s( 666, '#' );
    s.~string();
    new( &s ) string( 42, '-' ); // <- Storage reuse.
    cout << s << endl;
    // OK, because of the explicit destruction of the original object.
}
A major problem is that with a literal interpretation of the standard's paragraph above, it would still be formally OK if the placement new created an object of a different type there, just because of the explicit destruction of the original. But it would not be OK in practice in that case. Maybe this is covered by some other paragraph in the standard, so that it is also formally UB.
And this is also OK, using the placement new from <new>:
auto main() -> int
{
    char* storage = new char[sizeof( string )];
    new( storage ) string( 666, '#' );
    string const& s = *(
        new( storage ) string( 42, '-' ) // <- Storage reuse.
        );
    cout << s << endl;
    // OK, because no implicit call of the original object's destructor.
}
As I see it – now.
It's definitely defined behaviour.
Consider a case where a server keeps running and allocating heap memory, and no memory is released even when there is no more use for it.
The end result is that eventually the server will run out of memory, and a crash will definitely occur.
Adding to all the other answers, an entirely different approach. Looking at memory allocation in §5.3.4-18 we can see:
If any part of the object initialization described above terminates by throwing an exception and a suitable deallocation function can be found, the deallocation function is called to free the memory in which the object was being constructed, after which the exception continues to propagate in the context of the new-expression. If no unambiguous matching deallocation function can be found, propagating the exception does not cause the object's memory to be freed. [ Note: This is appropriate when the called allocation function does not allocate memory; otherwise, it is likely to result in a memory leak. —end note ]
If this caused UB, it would be mentioned here; so it is "just a memory leak".
In places like §20.6.4-10, a possible garbage collector and leak detector are mentioned. A lot of thought has been put into the concept of safely derived pointers et al., to make it possible to use C++ with a garbage collector (C.2.10 "Minimal support for garbage-collected regions").
Thus if it would be UB to just lose the last pointer to some object, all the effort would make no sense.
Regarding the claim "when the destructor has side effects, never running it is UB": I would say this is wrong; otherwise facilities such as std::quick_exit() would be inherently UB too.
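For reference, a minimal sketch (mine) of how std::quick_exit (C++11) deliberately skips destructors while remaining well defined:

#include <cstdio>
#include <cstdlib>
#include <string>

std::string global("its destructor would normally deallocate");

int main()
{
    std::at_quick_exit([] { std::puts("quick_exit handler ran"); });
    // Terminates after running the at_quick_exit handlers, without
    // invoking any destructors, including global's: defined behaviour.
    std::quick_exit(0);
}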
If the space shuttle must take off in two minutes, and I have a choice between putting it up with code that leaks memory and code that has undefined behavior, I'm putting in the code that leaks memory.
But most of us aren't usually in such a situation, and if we are, it's probably by a failure further up the line. Perhaps I'm wrong, but I'm reading this question as, "Which sin will get me into hell faster?"
Probably the undefined behavior, but in reality both.
Defined, since a memory leak is you forgetting to clean up after yourself.
Of course, a memory leak can probably cause undefined behaviour later.
Straightforward answer: the standard doesn't define what happens when you leak memory, thus it is "undefined". It's implicitly undefined, though, which is less interesting than the explicitly undefined things in the standard.
This obviously cannot be undefined behaviour. Simply because UB has to happen at some point in time, and forgetting to release memory or call a destructor does not happen at any point in time. What happens is just that the program terminates without ever having released memory or called the destructor; this does not make the behaviour of the program, or of its termination, undefined in any way.
This being said, in my opinion the standard is contradicting itself in this passage. On one hand it ensures that the destructor will not be called in this scenario, and on the other hand it says that if the program depends on the side effects produced by the destructor then it has undefined behaviour. Suppose the destructor calls exit, then no program that does anything can pretend to be independent of that, because the side effect of calling the destructor would prevent it from doing what it would otherwise do; but the text also assures that the destructor will not be called so that the program can go on with doing its stuff undisturbed. I think the only reasonable way to read the end of this passage is that if the proper behaviour of the program would require the destructor to be called, then behaviour is in fact not defined; this then is a superfluous remark, given that it has just been stipulated that the destructor will not be called.
Undefined behavior means what will happen has not been defined or is unknown. The behavior of memory leaks is definitely known in C/C++: they eat away at available memory. The resulting problems, however, cannot always be defined and vary as described by gameover.

How can an implementation guarantee that copy constructor of an iterator is no throw?

Clause 23.2.1.10 of the C++11 standard says that
"no copy ctor of a returned iterator throws an exception"
Does this basically state that it is possible for the copy ctor of an iterator not to throw anything, even a bad_alloc (leaving aside the case where the iterator is just a pointer, where there are no issues), presumably because it will use the information already constructed in the "returned iterator"? Because it is passed by value, will the stack space be allocated in the called function, hence guaranteeing no memory issues?
That paragraph talks about the iterators used by the containers in the standard library. These iterators are known to be implementable in ways such that they don't throw exceptions while being copied. For example, none of them has to use any dynamically allocated memory.
The guarantee is just for these iterators, not for iterators in general (even though it is a good idea to follow the example).
Legal answer: no. That's just your interpretation. It is technically correct, but it may not be the one and only technically correct interpretation.
Technical answer: the point here is to avoid a situation where an exception thrown by a mutating iterator (think of an inserter or an output iterator) causes an algorithm to be abandoned while leaving a container in an undefined and inconsistent state (think, for example, of a linked list with the links not yet completely re-linked).
It's not just a matter of bad_alloc for iterators that have dynamically allocated state, but also of an iterator that, during its own copy, tries to modify a referred-to item and fails (for example, because the item's assignment throws).
When such a case happens, the iterator is not required to "complete the algorithm" (that would be impossible), but it must leave the container in a consistent and still manageable state.
I think there is a misinterpretation about the scope of what a copy constructor means.
A copy constructor is not responsible for allocating the memory where the object itself will be built; that memory is provided externally, by the caller.
The requirement, therefore, is that the body of the copy constructor (be it written or generated) does not throw. It is known in C++ that built-in types (int, T*, ...) can be copied without throwing, and from there one can build types that are copyable without throwing exceptions (as long as one avoids dynamic resource allocation and/or IO, it's automatic).
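
A sketch of why this works: a typical container iterator is just a wrapped pointer, so its copy constructor copies a scalar and can be declared noexcept. (The name vec_iterator is hypothetical, not any library's actual iterator.)

#include <type_traits>

template <class T>
class vec_iterator {
    T* ptr_ = nullptr;
public:
    explicit vec_iterator(T* p) noexcept : ptr_(p) {}
    vec_iterator(const vec_iterator&) noexcept = default;  // copies one pointer
    T& operator*() const noexcept { return *ptr_; }
    vec_iterator& operator++() noexcept { ++ptr_; return *this; }
    bool operator!=(const vec_iterator& o) const noexcept { return ptr_ != o.ptr_; }
};

static_assert(std::is_nothrow_copy_constructible<vec_iterator<int>>::value,
              "copying allocates nothing and cannot throw");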
