TestObject getObject(){
TestObject a(5.0f);
return a;
}
int main(){
TestObject a = getObject();
}
Am I right in saying that in C++ a returned object will not have its destructor called as it is returned? Is the memory that the object took up in the function call simply deleted without running the destructor?
Ok a specific example..
#include <iostream>
class Test{
public:
Test(){};
~Test(){std::cout << "Goodbye cruel world\n";}
};
Test getAnObject(){
Test a;
return a;
}
int main(){
Test a = getAnObject();
}
If I run this the destructor is run just once (not for the local object in getAnObject()). Can I assume this will always be the case?
#include <iostream>
class Test{
public:
Test(){};
~Test(){std::cout << "Goodbye cruel world\n";}
};
Test getAnObject(){
Test a;
Test b;
int i = 0;
if (i){
return a;
}else{
return b;
}
}
int main(){
Test a = getAnObject();
}
Following the RVO guide, this test has the destructor run for both objects in getAnObject() and for the one in the main function. Is this a case where I should always implement the rule of three to ensure consistent behaviour?
If I run this the destructor is run just once (not for the local object in getAnObject()). Can I assume this will always be the case?
For correctness? No. For efficiency? Yes. -ish.
To elaborate: strictly speaking, the local object will be copied when returning from the function. The local storage will then be cleaned up by calling the local object’s destructor.
However, the compiler is free to generate different code that yields the same observable behaviour. In particular, the standard grants the compilers the right to elide the copying of the return value, and reuse the same storage location for both objects (the local object and the receiving object of the return value). In doing so, the compiler might not need to call the copy constructor, nor the destructor (since it’s reusing the same memory location).
However, this optimization (called “named return value optimization”, NRVO) is not guaranteed by the standard (and in fact it’s not possible to perform everywhere). You cannot assume that it will happen for correctness. In particular, your object still needs a well-defined copy constructor and destructor, otherwise the program is ill-formed.
On the other hand, you can reasonably expect all modern compilers to perform this optimization wherever it is possible. You can therefore (usually) rely on this optimization from a performance point of view.
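A small sketch of why correctness should not hinge on elision: the hypothetical `Counted` type below tracks how many instances are alive. Whether or not the compiler elides the copy, every constructed object is destroyed exactly once, so the live count returns to zero. The names `Counted`, `makeCounted` and `liveAfterReturn` are made up for illustration and are not from the question.

```cpp
#include <cassert>

// Hypothetical type that tracks how many instances currently exist.
struct Counted {
    static int live;     // objects constructed and not yet destroyed
    static int copies;   // copy constructions that were not elided
    Counted() { ++live; }
    Counted(const Counted&) { ++live; ++copies; }
    ~Counted() { --live; }
};
int Counted::live = 0;
int Counted::copies = 0;

// Returns a named local by value; the compiler may or may not apply NRVO.
Counted makeCounted() {
    Counted local;
    return local;
}

// Returns the number of live objects once the caller's object is gone.
int liveAfterReturn() {
    {
        Counted c = makeCounted();
        assert(Counted::live == 1);   // only the caller's object remains
    }
    return Counted::live;             // 0 with or without elision
}
```

Calling `liveAfterReturn()` from `main` should yield 0 on any conforming compiler. `Counted::copies` may differ between compilers and optimization settings, which is exactly the part you should not depend on.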
It is implementation-dependent. It is known as the Return Value Optimization technique. Check this out for more info:
http://en.wikipedia.org/wiki/Return_value_optimization
getObject() would return a copy of a. Here is what happens if the compiler does not do any optimization. A temporary copy of a will be created using the copy constructor of TestObject. Then the original a will be destroyed, and its destructor will be called, and then the temporary object will be copied into the local variable a in the main() function. The temporary will then also be destroyed, and its destructor will be called.
Since the return value of getObject() is immediately assigned to a variable in this particular case, a modern compiler will probably be able to optimize away at least one of the copy operations.
getObject() would return a copy of a, and the original object created in getObject is destroyed on exiting the function, but there can be Return Value Optimization (depends on the compiler you're using).
Besides the mismatched return types in the example, you are probably looking for return value optimization or, more generally, copy elision. If I remember correctly, the copy elision rules are even specified in the C++ standard, although somewhat vaguely.
Is the memory that the object took up in the function call simply deleted without running the destructor?
No. With optimizations disabled, the local object will be destructed (and the destructor invoked). If a copy-elision optimization takes place, the local object will really just be a "reference" (note the quotes) for the one in main - in that case the destructor will not be run within the function, but the memory will not be de-allocated either.
In getObject, you are creating TestObject on the stack, so the return value is invalid. To create an object on the heap, use "new". I don't believe the destructor is called when the method's scope is exited, I think the memory on the stack is simply reclaimed.
Refining from Why is the destructor implicitly called?
My understanding of calling convention is that functions construct their result where the caller asked them to (or in a conventional place?). With that in mind, this surprises me:
#include <memory>
struct X; // Incomplete type.
// Placement-new a null unique_ptr in-place:
void constructAt(std::unique_ptr<X>* ptr) { new (ptr) std::unique_ptr<X>{nullptr}; }
// Return a null unique_ptr:
std::unique_ptr<X> foo() { return std::unique_ptr<X>{nullptr}; }
https://godbolt.org/z/rqb1fKq3x
Whereas constructAt compiles, happily placement-newing a null unique_ptr<X>, foo() doesn't compile because the compiler wants to instantiate unique_ptr<X>::~unique_ptr(). I understand why it can't instantiate that destructor (because as far as the language is concerned, it needs to follow the non-nullptr branch of the d'tor that then deletes the memory [https://stackoverflow.com/questions/28521950/why-does-unique-ptrtunique-ptr-need-the-definition-of-t]). Basically without a complete X, the unique_ptr's destructor is SFINAE'd away (right?). But why does a function returning a value have to know how to destruct that value? Isn't the caller the one that will have to destruct it?
Clearly my constructAt and foo functions aren't morally equivalent. Is this language pedantry, or is there some code path (exceptions?) where foo() would have to destruct that value?
In your specific case there is no way that the destructor may be invoked. However, the standard specifies the situations in which the destructor is potentially invoked in more general terms. If a destructor is potentially invoked it requires a definition (even if there is no path that could call it) and this will therefore cause implicit instantiation which fails in your case since instantiation of the std::unique_ptr<X> destructor requires X to be complete.
In particular the destructor is potentially invoked for every result object in a return statement.
I think the reason for this choice is described in CWG issue 2176: In general there may be local variables and temporaries in the function which are destroyed after the result object of the return statement has been constructed. But if the destruction of one of these objects throws an exception, then the already constructed result object should also be destroyed. This requires the destructor to be defined.
CWG issue 2426 then made the destructor potentially invoked even if there is no actual invocation due to the above reasoning, in line with implementations. I assume this choice was made simply because it doesn't require any additional decision making on the compiler's part and was already implemented.
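One way to live with this rule, sketched under the assumption that X can be completed somewhere in the program: declare the function while X is still incomplete, and put its definition (the part containing the return statement) where X is complete. The definition of `X` and the helper `checkFoo` below are hypothetical and only there to make the sketch self-contained.

```cpp
#include <cassert>
#include <memory>

struct X;                        // incomplete: only a declaration so far

// Declaring the function is fine while X is incomplete, because a
// declaration contains no return statement and so needs no destructor.
std::unique_ptr<X> foo();

struct X { int value; };         // hypothetical definition; X is complete from here on

// The definition sits where X is complete, so ~unique_ptr<X>() can be
// instantiated even though foo() never actually runs it.
std::unique_ptr<X> foo() { return std::unique_ptr<X>{nullptr}; }

// Hypothetical helper: checks that foo() hands back a null pointer.
void checkFoo() {
    std::unique_ptr<X> p = foo();
    assert(p == nullptr);
}
```

In a real project the declaration would typically live in a header and the definition in the source file that includes the full definition of X.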
There's a project focusing on using C++98 without additional dependencies, but it needs to maintain dynamically allocated memory. Smart pointers are not available, so code to manually clean things up has been added. The approach is to explicitly set variables to NULL in the CTOR, read some data during which memory might be allocated dynamically, catch any occurring exception and clean memory up as necessary by manually calling the DTOR. The DTOR has to free the memory anyway when everything succeeds, and has simply been extended with safeguards that check whether memory has been allocated at all.
The following is the most relevant available code for this question:
default_endian_expr_exception_t::doc_t::doc_t(kaitai::kstream* p__io, default_endian_expr_exception_t* p__parent, default_endian_expr_exception_t* p__root) : kaitai::kstruct(p__io) {
m__parent = p__parent;
m__root = p__root;
m_main = 0;
try {
_read();
} catch(...) {
this->~doc_t();
throw;
}
}
void default_endian_expr_exception_t::doc_t::_read() {
m_indicator = m__io->read_bytes(2);
m_main = new main_obj_t(m__io, this, m__root);
}
default_endian_expr_exception_t::doc_t::~doc_t() {
if (m_main) {
delete m_main; m_main = 0;
}
}
The most relevant part of the header is the following:
class doc_t : public kaitai::kstruct {
public:
doc_t(kaitai::kstream* p__io, default_endian_expr_exception_t* p__parent = 0, default_endian_expr_exception_t* p__root = 0);
private:
void _read();
public:
~doc_t();
private:
std::string m_indicator;
main_obj_t* m_main;
default_endian_expr_exception_t* m__root;
default_endian_expr_exception_t* m__parent;
};
The code is tested in three different environments (clang3.5_linux, clang7.3_osx and msvc141_windows_x64) by explicitly throwing exceptions while reading data and checking whether memory leaks under those conditions. The problem is that this triggers SIGABRT on Clang 3.5 for Linux only. The most interesting stack frames are the following:
<frame>
<ip>0x577636E</ip>
<obj>/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19</obj>
<fn>std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()</fn>
</frame>
<frame>
<ip>0x5ECFB4</ip>
<obj>/home/travis/build/kaitai-io/ci_targets/compiled/cpp_stl_98/bin/ks_tests</obj>
<fn>default_endian_expr_exception_t::doc_t::doc_t(kaitai::kstream*, default_endian_expr_exception_t*, default_endian_expr_exception_t*)</fn>
<dir>/home/travis/build/kaitai-io/ci_targets/tests/compiled/cpp_stl_98</dir>
<file>default_endian_expr_exception.cpp</file>
<line>51</line>
</frame>
[...]
<frame>
<ip>0x577636E</ip>
<obj>/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19</obj>
<fn>std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()</fn>
</frame>
<frame>
<ip>0x5ED17E</ip>
<obj>/home/travis/build/kaitai-io/ci_targets/compiled/cpp_stl_98/bin/ks_tests</obj>
<fn>default_endian_expr_exception_t::doc_t::~doc_t()</fn>
<dir>/home/travis/build/kaitai-io/ci_targets/tests/compiled/cpp_stl_98</dir>
<file>default_endian_expr_exception.cpp</file>
<line>62</line>
</frame>
Lines 51 and 62 are the last lines of the CTOR and DTOR as provided above, so really just the closing braces. It looks like code added by the compiler is trying to free the std::string member twice: once in the DTOR and a second time in the CTOR, most likely only when an exception is thrown.
Is this analysis correct at all?
And if so, is this expected behaviour of C++ in general or of this concrete compiler only? I wonder because the other compilers don't SIGABRT, even though the code is the same for all. Does this mean that different compilers clean up non-pointers like std::string differently? How does one know how each compiler behaves?
Looking at what the C++-standard says, I would have expected that the std::string being freed only by the CTOR because of the exception:
C++11 15.2 Constructors and destructors (2)
An object of any storage duration whose initialization or destruction is terminated by an exception will have destructors executed for all of its fully constructed subobjects (excluding the variant members of a union-like class), that is, for subobjects for which the principal constructor (12.6.2) has completed execution and the destructor has not yet begun execution.
The destruction is NOT terminated by an exception in this case, only the construction. But because the DTOR is a DTOR, it's designed to automatically clean things up as well? And if so, in general with all compilers or only this one?
Is calling a DTOR manually reliable at all?
According to my research, calling a DTOR manually shouldn't be too bad. Is that the wrong impression, and is it a big no-go because of what I see right now? I had the impression that a DTOR called manually simply needs to be written so that it can be called this way, which the above should be, as far as I understand. It only fails because of auto-generated code by the compiler that I wasn't aware of.
How to fix this?
Instead of calling the DTOR manually and triggering the automatically generated code, should one simply use a custom cleanUp function that frees memory and sets pointers to NULL? It should be safe to call that in the CTOR in case of an exception and always in the DTOR, correct? Or is there some way to keep calling the DTOR that is compatible with all compilers?
Thanks!
Here's a simplified example that resembles your case, and makes the behavior obvious:
#include <iostream>
struct S {
S() { std::cout << "S constructed\n";}
~S() { std::cout << "S destroyed\n";}
};
class Throws {
S s;
public:
Throws() {
try {
throw 42;
} catch (int) {
this->~Throws();
throw;
}
}
};
int main() {
try {
Throws t;
} catch (int) {}
}
Output:
S constructed
S destroyed
S destroyed
Demo with clang, demo with gcc.
The example exhibits undefined behavior, by destroying the same S instance twice. Since the destructor doesn't do much, and in particular doesn't access this, the undefined behavior manifests itself by actually running the destructor twice successfully, so it can be easily observed in action.
Apparently, the OP has doubts that a destructor is supposed to actually destroy the object, together with all its members and base classes. To assuage those doubts, here's the relevant quote from the standard:
[class.dtor]/14 After executing the body of the destructor and destroying any objects with automatic storage duration allocated within the body, a destructor for class X calls the destructors for X’s direct non-variant non-static data members, the destructors for X’s non-virtual direct base classes and, if X is the most derived class (11.10.2), its destructor calls the destructors for X’s virtual base classes...
Once the destructor is called, the object ceases to be (leaving you with uninitialized memory). This means that destructors may omit "finalizing" memory writes, such as setting a pointer to zero (the object ceases to be, so its value cannot ever be read). It also means that basically any further operation on that object is UB.
People assume some leeway on destroying *this, if the this pointer is not used in any way anymore. This is not the case in your example, as the destructor is called twice.
I am aware of exactly one case in which calling the destructor manually is correct and one where it is mostly-correct: When the object was created with placement new (in which case there will be no operation that automatically calls the destructor). The mostly-correct case is when destroying the object is immediately followed by re-initializing the object via a call to placement-new at the very same location.
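The placement-new case can be sketched as follows. `Widget` and `placementNewDemo` are hypothetical names; the point is only that storage obtained as raw bytes has no automatic destructor call attached to it, so the explicit call is the single destruction of that object.

```cpp
#include <cassert>
#include <new>      // placement new

// Hypothetical type that counts destructor runs.
struct Widget {
    static int destroyed;
    int value;
    explicit Widget(int v) : value(v) {}
    ~Widget() { ++destroyed; }
};
int Widget::destroyed = 0;

// Builds a Widget in raw storage, then ends its lifetime by hand.
// Returns how many times the destructor ran during this call.
int placementNewDemo() {
    Widget::destroyed = 0;
    alignas(Widget) unsigned char storage[sizeof(Widget)];
    Widget* w = new (storage) Widget(7);   // construct in place
    assert(w->value == 7);
    w->~Widget();                          // nothing else will destroy it
    return Widget::destroyed;              // exactly one destructor run
}
```

Contrast this with the constructor in the question: there the members are also destroyed automatically during stack unwinding, so the manual call adds a second destruction.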
As to your second question: Why do you want to explicitly call the destructor anyway? As far as I can see, your code should work just fine without all the contortions:
default_endian_expr_exception_t::doc_t::doc_t(kaitai::kstream* p__io, default_endian_expr_exception_t* p__parent, default_endian_expr_exception_t* p__root)
: kaitai::kstruct(p__io), m__parent(p__parent), m__root(p__root), m_main() {
_read();
}
The object is initialized to a valid state before the user-provided constructor is run. If _read throws an exception that should still be the case (otherwise fix _read!) and therefore the implicit destructor call should clean up everything nicely.
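If a reading step could throw after an allocation has already succeeded, note that the class's own destructor does not run for an object whose constructor exits by exception; only the fully constructed members and bases are destroyed. A minimal sketch of the cleanUp-helper idea from the question, written in C++98 style with hypothetical names (`Doc`, `Payload`, `leakedAfterFailure`) rather than the real kaitai classes:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical payload that counts live instances.
struct Payload {
    static int live;
    Payload() { ++live; }
    ~Payload() { --live; }
};
int Payload::live = 0;

class Doc {
public:
    explicit Doc(bool failAfterAlloc) : m_main(0) {
        try {
            read(failAfterAlloc);
        } catch (...) {
            cleanUp();   // frees m_main only; members clean themselves up
            throw;
        }
    }
    ~Doc() { cleanUp(); }
private:
    Doc(const Doc&);              // non-copyable, C++98 style
    Doc& operator=(const Doc&);
    void read(bool failAfterAlloc) {
        m_main = new Payload();
        if (failAfterAlloc) throw std::runtime_error("read failed");
    }
    void cleanUp() {
        delete m_main;            // deleting a null pointer is a no-op
        m_main = 0;
    }
    Payload* m_main;
};

// Returns the number of leaked Payload objects after a failed construction.
int leakedAfterFailure() {
    bool threw = false;
    try {
        Doc d(true);
    } catch (const std::runtime_error&) {
        threw = true;
    }
    assert(threw);
    return Payload::live;
}
```

Because cleanUp only touches the raw pointer and never ends the lifetime of any member, it is safe to call from both the catch block and the destructor.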
I have read this question "Why doesn't C++ support functions returning arrays?". It is said that when we attempt to access the array from outside of this function (via the return value), we have a problem because we are attempting to access memory that is not in the scope with which you are working (the function call's stack).
Doesn't the same problem happen when we return a std::string or std::vector which is declared inside the function? Or does C++ make a copy of the string or vector and return the copy to the caller, so that the string or vector does not go out of scope?
#include <vector>
using std::vector;
vector<int> foo(const vector<int> a)
{
vector<int> b = a;
return b;
}
int main()
{
vector<int> a;
vector<int> c = foo(a);
}
It is making a copy of the std::vector object. The memory the std::vector uses for storing its data is allocated on the heap (and that is also copied).
(Some compiler optimizations mean the copies don't always happen, behind the scenes; e.g. in your sample code, I think most compilers will copy from a to b, inside foo(), but b will become the c in main() rather than be copied again.)
Further Reading: http://en.wikipedia.org/wiki/Copy_elision and http://en.wikipedia.org/wiki/Return_value_optimization (thanks to millsj for the suggestion)
Item 20 of More Effective C++, by Scott Meyers, also covers this.
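To make the "copy" part concrete, here is a small sketch with the same shape as foo() from the question. The function names `copyOf` and `returnedVectorIsIndependent` are hypothetical. It checks that the vector the caller receives owns its own heap buffer and can be modified without touching the original.

```cpp
#include <cassert>
#include <vector>

// Same shape as foo() in the question: copies its argument into a local
// and returns the local by value.
std::vector<int> copyOf(const std::vector<int>& a) {
    std::vector<int> b = a;
    return b;
}

// Returns true if the returned vector is independent of the original.
bool returnedVectorIsIndependent() {
    std::vector<int> a(3, 1);          // {1, 1, 1}
    std::vector<int> c = copyOf(a);
    c[0] = 42;                         // modify only the returned vector
    assert(&a[0] != &c[0]);            // separate heap buffers
    return a[0] == 1 && c[0] == 42 && c.size() == a.size();
}
```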
Yes, in your example the copy constructor is called to make a copy. The original goes out of scope, but the returned copy does not, and it can be used in main, for example by assigning it to other objects. Since compilers nowadays perform return value optimization (RVO) or named return value optimization (NRVO), this cost is minimized.
Adding to what others already mentioned.
When optimization is ON: NRVO or RVO ensures that a return value is computed in-place rather than copying back to the caller.
When all optimizations are OFF: returning a vector or a string is like returning an object (specifically, a container type object whose size is known to the compiler).
Since, the compiler knows about the size of the object being returned, it has enough information to allocate the required stack space for copy-by-value.
If you are attempting to return an array(of any type), how would the compiler know how much size to allocate on stack?
Should C++ be forced to return a fixed sized array?
Returning an object triggers the copy constructor, which creates a temporary vector object; that temporary is then used to copy-construct the receiving object. The source object is destroyed when it goes out of scope.
Most modern compilers have an optimization called "Return Value Optimization" (RVO for short). C++11 RValue references allow a vector implementation that guarantees RVO.
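A sketch of what this means in practice from C++11 onwards: when a local vector is returned, the copy is either elided or replaced by a move, and on mainstream standard library implementations a move hands over the heap buffer instead of duplicating it. The names `makeVector` and `bufferWasReused` are hypothetical, and the pointer comparison relies on the usual buffer-stealing behaviour of the vector move constructor with the default allocator.

```cpp
#include <cassert>
#include <vector>

// Records where the local vector's buffer lived before returning it.
std::vector<int> makeVector(const int** bufferBefore) {
    std::vector<int> v(1000, 7);
    *bufferBefore = v.data();
    return v;   // C++11 and later: elided or moved, not deep-copied
}

// Returns true if the caller's vector reuses the local's heap buffer.
bool bufferWasReused() {
    const int* before = nullptr;
    std::vector<int> result = makeVector(&before);
    assert(result.size() == 1000);
    return result.data() == before;
}
```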
If I move-construct a from b, is it still necessary to destruct b, or can I get away without doing so?
This question crossed my mind during the implementation of an optional<T> template. Excerpt:
~optional()
{
if (initialized)
{
reinterpret_cast<T*>(data)->~T();
}
}
optional(optional&& o) : initialized(o.initialized)
{
if (initialized)
{
new(data) T(std::move(*o)); // move from o.data
o.initialized = false; // o.data won't be destructed anymore!
}
}
Of course, I could just replace the bool initialized with a three-valued enumeration that distinguishes between initialized, non-initialized and moved-from. I just want to know if this is strictly necessary.
Yes, it is still necessary to destruct b. A moved from object is a valid, constructed object. In some cases, it may even hold resources that still need to be disposed of. In generic code such as you show, T may not even have a move constructor. You may invoke a copy constructor instead in this case. So you can definitely not assume that ~T() is a no-op and can be elided.
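A minimal sketch of that point, using a hypothetical `Resource` type that counts constructor and destructor runs: after the move, both objects are still alive, and both destructors run at the end of the scope.

```cpp
#include <cassert>
#include <utility>   // std::move

// Hypothetical resource holder that counts constructor and destructor runs.
struct Resource {
    static int constructed;
    static int destroyed;
    int* data;
    Resource() : data(new int(1)) { ++constructed; }
    Resource(Resource&& other) : data(other.data) {
        other.data = nullptr;   // leave the source valid but empty
        ++constructed;
    }
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
    ~Resource() { delete data; ++destroyed; }
};
int Resource::constructed = 0;
int Resource::destroyed = 0;

// Returns true if both the moved-from and the moved-to object were destroyed.
bool movedFromIsStillDestroyed() {
    Resource::constructed = 0;
    Resource::destroyed = 0;
    {
        Resource b;
        Resource a(std::move(b));   // b is moved-from but still alive
        assert(b.data == nullptr);
        assert(a.data != nullptr);
    }                               // destructors run for a, then for b
    return Resource::constructed == 2 && Resource::destroyed == 2;
}
```

In the optional<T> excerpt this corresponds to leaving `o.initialized` untouched, so that the destructor of `o` still runs `~T()` on the moved-from value.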
Yes, you do still have to destruct them. One of the designs that can show this flaw is for example, observer-based patterns where one object keeps lists of pointers to another. Not running the destructor won't remove the pointer and the code will crash when it attempts to access an object that no longer exists.
The easier thing to do in your example is just to not set initialized to false in the moved-from object. The value is still defined to be in a valid state after being moved from, and the destructor of the rvalue you're referring to would clean it up with no further intervention.
I'd like to answer 'No' to your question but I'm not sure it's even the right question to ask. Consider the following:
{ // start of scope
T maybe_moved;
if(some_condition) {
T(std::move(maybe_moved));
}
// end of scope
}
T::~T() should obviously be called only once for the maybe_moved object. If a move constructor would call it, how would you make such innocuous code work?
Several co-workers and I are having a debate about what happens when a local variable (allocated on the stack) is returned from a C++ method.
The following code works in a unit test, but I believe that is only because the unit test is lucky and doesn't attempt to reuse the memory on the stack used by obj.
Does this work?
static MyObject createMyObject() {
MyObject obj;
return obj;
}
What happens is that the copy constructor gets called to make a copy of the local object, and that is what the caller receives.
The compiler may eliminate the copy in a process called copy elision, but that's at the discretion of the compiler - you don't have much control over it.
This pattern is capable of producing the problems you're afraid of, but only if you're returning a pointer or reference to the local object.
obj is created, and then it is copied out of the method/function using the object's copy constructor.
You can also keep obj off the stack by declaring it static. Returning the object by value would still return a copy, but the object is no longer created each time the function is called. You can then return the object by reference:
static MyObject & createMyObject() {
static MyObject obj;
return obj;
}
(no copy here, and also obj is created only once, and its address remains constant at runtime).
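The difference between the two variants can be sketched like this. `MyObject` here is a hypothetical stand-in with a single member, and the function names are made up. Returning by value gives every caller its own object, while the function-local static is a single shared object, so a change made through one reference is visible through every other.

```cpp
#include <cassert>

struct MyObject { int value = 0; };   // hypothetical stand-in

// By value: each caller receives its own object.
MyObject createByValue() {
    MyObject obj;
    return obj;
}

// By reference to a function-local static: every caller shares one object.
MyObject& sharedObject() {
    static MyObject obj;
    return obj;
}

// Returns true if the static local behaves as one shared instance.
bool staticLocalIsShared() {
    MyObject a = createByValue();
    a.value = 1;                       // affects only a
    MyObject b = createByValue();
    assert(b.value == 0);

    sharedObject().value = 5;          // visible through every later call
    return &sharedObject() == &sharedObject() && sharedObject().value == 5;
}
```

The shared-state behaviour is a design trade-off: it avoids the copy, but callers can no longer treat the result as their own independent object.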
You return the object by value, so its copy constructor will be called, and a COPY of the original object will be returned and stored on the caller's stack. Should this method return a pointer (or reference) to a local variable, it would fail.
A copy of MyObject is returned, which should be fine if MyObject has a copy constructor that copies everything correctly. Note that it may have a copy constructor even without explicitly listing one: the compiler-defined default copy constructor (which copies everything memberwise) may work fine for your purposes.
In this example, MyObject is being returned by value. That means that a copy of it is made and passed back to the calling function. (In certain cases the compiler can optimize the spurious copy away, but only when that would be equivalent to calling the copy constructor on MyObject and placing the copy on the stack.)
Assuming everyone else simply missed an obvious source of confusion in this question -- static:
You're not declaring the MyObject instance created in and returned from createMyObject as having static storage duration; rather, you're declaring the function createMyObject as having internal linkage.
It's ok for a function to "return a local object", because the compiler will transform the function to not really return a value. Instead, it will accept a reference MyObject& __result, and use the local object which will be assigned the return value, i.e. obj, to copy construct the __result. In your case, the function will be rewritten to:
static void createMyObject(MyObject& __result) {
MyObject obj;
// .. process obj
// compiler generated invocation of copy constructor
__result.MyObject::MyObject( obj );
return;
}
and every invocation of createMyObject will also be transformed to bind the reference to an existing object. For example, an invocation of the form:
MyObject a = createMyObject();
will be transformed to:
MyObject a; // no default constructor called here
createMyObject(a);
However, if you return a reference or pointer to a local object, the compiler cannot perform this transformation. You will be returning a reference or pointer to an already-destroyed object.
This works just fine. It will create a temporary anonymous variable that returns MyObject:
Anonymous variables and objects
binding a temporary object to a reference to const