I have looked at different sources with regard to the internal implementation of auto_ptr<> and auto_ptr_ref<>. There is one thing I still can't figure out.
.... When returning a 'auto_ptr' from a function, the compiler finds that there is no suitable ctor to copy construct the returned object. But there is conversion to 'auto_ptr_ref' and ctor taking an 'auto_ptr_ref' constructing an 'auto_ptr'. Thus, the compiler creates an 'auto_ptr_ref' which basically just holds a reference to the original 'auto_ptr' and then constructs an 'auto_ptr' from this object. That's all (well, when returning an object, the compiler goes through this process normally twice because the returned value is copied somewhere but this does not change the process)....
(Reference http://www.josuttis.com/libbook/auto_ptr.html)
In this example, I emulate the implementation of auto_ptr<> and auto_ptr_ref<> and am able to generate results showing that the compiler indeed goes through the process twice.
The example has these basic traits:
1) A's copy constructor does NOT take const reference.
2) A has a conversion operator to B
3) A has constructor A(B);
#include <cstdio>
class B {
};
class A {
public:
A () {
printf("A default constructor() # %p\r\n", this);
}
A(B) {
printf("constructor(B) # %p\r\n", this);
}
A (A &a) {
printf("copy constructor(non-const) # %p\r\n", this);
}
operator B() {
printf("A convertion constructor(B) # %p\r\n", this);
return B();
}
};
A foo()
{
return A();
}
int main()
{
A a(foo());
}
So when A a(foo()) is executed:
1) foo() generates temporary object X
2) X is converted to type B
3) constructor A(B) is used to construct object a
This is output:
A default constructor() # 0xbfea340f
A convertion constructor(B) # 0xbfea340f
constructor(B) # 0xbfea344f
A convertion constructor(B) # 0xbfea344f
constructor(B) # 0xbfea344e
We can see that the compiler went through the conversion-construct steps 2 and 3 twice.
Why is that?
If you are calling a function which returns a class type, basically the following happens:
On the call, the calling function reserves some space for the temporary return value on the stack (it has to be done by the calling function because anything the called function allocates on the stack gets deallocated as soon as the function returns). It passes the address of that space to the called function.
On executing the return statement, the called function constructs the return value from the argument given to the return statement into the space provided by the calling function. In your case, that argument is the temporary A value.
After the function returns, the caller uses the temporary value the called function constructed to do whatever the code demands. In your case, you use it to construct a local variable from it.
So you construct a new object of type A from an existing one twice: first, to construct the temporary return value (whose lifetime persists after the function foo returns, until the end of the declaration) from the explicitly generated temporary in the return statement (whose lifetime ends at the end of the full expression, which in this case is equivalent to the return from foo); and second, to initialize the local variable a from the temporary returned by foo.
Of course, due to no appropriate copy constructor being available, in both cases you go through the B conversions. This works because the return does a direct initialization, and for a you explicitly coded such a direct initialization.
An object of class A is created three times: once as a temporary, and then copied twice. Other compilers might produce different results, depending on their optimization settings.
return A() creates a temporary A, and then copies it to the returned A. This first copy is the first time that you see steps 2 and 3.
Then A a(foo()); copies the return value of foo() into the variable in main, triggering steps 2 and 3 again.
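The conversion round trips described above can also be counted instead of being read off the printed addresses. The following is only a sketch of the same class layout with the printf calls replaced by counters; Tally, tally and count_round_trips are hypothetical names that do not appear in the question. How many round trips you actually observe depends on the language standard and on copy elision: an older compiler with elision disabled may report two, while a C++17 compiler is required to elide both copies here and reports none.

```cpp
#include <cassert>

// Hypothetical counters, introduced only for this sketch.
struct Tally {
    int default_ctor = 0;
    int to_B = 0;
    int from_B = 0;
};
Tally tally;

class B {};

class A {
public:
    A() { ++tally.default_ctor; }
    A(B) { ++tally.from_B; }                    // role of auto_ptr(auto_ptr_ref)
    A(A&) {}                                    // non-const copy: cannot bind to a temporary
    operator B() { ++tally.to_B; return B(); }  // role of operator auto_ptr_ref()
};

A foo() { return A(); }

// Runs the statement from the question and returns the counters.
Tally count_round_trips() {
    tally = Tally();
    A a(foo());
    (void)a;
    // Each construction from B must have been preceded by a conversion to B.
    assert(tally.to_B == tally.from_B);
    return tally;
}
```

Calling count_round_trips() and inspecting to_B shows how many of the two possible round trips your compiler really performed.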
Related
I have a user defined class Fixed, with a default constructor, a parameter constructor and an assignment operator.
When I declare an object and then assign it:
Fixed a;
a = Fixed( param );
I get:
call to default constructor (line 1)
call to parameter constructor (line 2)
call to assignment operator (line 2)
call to destructor (line 2)
Of course I could (and should) prefer initialization (Fixed a(param)) over assignment.
Yet I'm trying to understand what happens on line 2.
Is a temporary object created ?
Here is what I found about temporary objects.
In some cases, it is necessary for the compiler to create temporary
objects. These temporary objects can be created for the following
reasons:
...
To store the return value of a function that returns a user-defined
type. These temporaries are created only if your program does not
copy the return value to an object.
Here, the program does copy the return value of the object, so how come a temporary object is created ?
Is a temporary object created ?
Yes.
The expression Fixed( param ) will create a temporary object. This temporary object will then be passed to the assignment operator of the a object.
The statement
a = Fixed( param );
is somewhat equivalent to
{
Fixed temporary_object( param );
a.operator=( temporary_object );
}
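To check this sequence yourself, the class can record each special member call. This is a minimal sketch, assuming a Fixed that wraps a single int; the events log, the raw_ member and assign_from_temporary are hypothetical and not taken from the question.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> events;  // hypothetical log of special member calls

class Fixed {
public:
    Fixed() : raw_(0) { events.push_back("default constructor"); }
    explicit Fixed(int param) : raw_(param) { events.push_back("parameter constructor"); }
    Fixed& operator=(const Fixed& other) {
        raw_ = other.raw_;
        events.push_back("assignment operator");
        return *this;
    }
    ~Fixed() { events.push_back("destructor"); }
    int raw() const { return raw_; }

private:
    int raw_;
};

// Returns a snapshot of the log taken before a itself is destroyed.
std::vector<std::string> assign_from_temporary() {
    events.clear();
    Fixed a;        // default constructor
    a = Fixed(42);  // parameter constructor, assignment operator, destructor of the temporary
    assert(a.raw() == 42);
    return events;
}
```

The snapshot contains four entries in the order listed in the question. The temporary has to exist because operator= binds its reference parameter to it, so this is not a copy the compiler can elide.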
To store the return value of a function that returns a user-defined type. These temporaries are created only if your program does not copy the return value to an object.
The line you mention is irrelevant. Fixed(param) is not a function call.
The line refers to something like:
Fixed f(param) {
return Fixed(param);
}
...
Fixed a;
a = f(param);
In this case the line explains that you shouldn't get a temporary created to hold the result of f and then copy that to a. This would be in addition to what you've seen above.
Also experiment with optimization levels.
I have the following code:
#include <stdio.h>
class Foo {
public:
int a;
~Foo() { printf("Goodbye %d\n", a); }
};
Foo newObj() {
Foo obj;
return obj;
}
int main() {
Foo bar = newObj();
bar.a = 5;
bar = newObj();
}
When I compile with g++ and run it, I get:
Goodbye 32765
Goodbye 32765
The number printed seems to be random.
I have two questions:
1) Why is the destructor called twice?
2) Why isn't 5 printed the first time?
I'm coming from a C background, hence the printf, and I'm having trouble understanding destructors, when they are called and how a class should be returned from a function.
Let's see what happens in your main function :
int main() {
Foo bar = newObj();
Here we just instantiate a Foo and initialize it with the return value of newObj(). No destructor is called here because of copy elision: to sum up very quickly, instead of copying/moving obj into bar and then destructing obj, obj is directly constructed in bar's storage.
bar.a = 5;
Nothing to say here. We just change bar.a's value to 5.
bar = newObj();
Here bar is copy-assigned (1) the returned value of newObj(), then the temporary object created by this function call is destructed (2); this is the first Goodbye. At this point bar.a is no longer 5 but whatever was in the temporary object's a.
}
End of main(), local variables are destructed, including bar, this is the second Goodbye, which does not print 5 because of previous assignment.
1 No move assignment happens here because of the user-defined destructor, no move assignment operator is implicitly declared.
2 As mentioned by YSC in the comments, note that this destructor call has undefined behavior, because it is accessing a which is uninitialized at this point. The assignment of bar with the temporary object, and particularly the assignment of a as part of it, also has undefined behavior for the same reasons.
1) It's simple, there are two Foo objects in your code (in main and in newObj) so two destructor calls. Actually this is the minimum number of destructor calls you would see, the compiler might create an unnamed temporary object for the return value, and if it had done that you would see three destructor calls. The rules on return value optimization have changed over the history of C++ so you may or may not see this behaviour.
2) Because the value of Foo::a is never 5 when the destructor is called. It's never 5 in newObj, and though it was 5 in main, it isn't by the time you get to the end of main (which is when the destructor is called).
I'm guessing your misunderstanding is that you think that the assignment statement bar = newObj(); should call the destructor, but that's not the case. During assignment an object gets overwritten, it doesn't get destroyed.
I think one of the main confusions here is object identity.
bar is always the same object. When you assign a different object to bar, you don't destroy the first one - you call operator=(const Foo&) (the copy assignment operator). It is one of the special member functions that can be auto-generated by the compiler (which it is in this case) and just overwrites bar.a with whatever is in newObj().a. Provide your own operator= to see that/when this happens (and to confirm that a is indeed 5 before this happens).
bar's destructor is only called once - when bar goes out of scope at the end of the function. There is only one other destructor call - for the temporary returned by the second newObj(). The first temporary from newObj() is elided (the language allows it in this exact case because there is never really a point to creating and immediately destroying it) and initializes bar directly with the return value of newObj().
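A user-provided operator= makes this visible. The sketch below is one possible instrumentation, not the original code: it initializes a to 0 (the question leaves it uninitialized, which is what makes the printed numbers look random), and the Trace struct and run_example function are hypothetical additions.

```cpp
#include <cassert>

// Hypothetical bookkeeping for this sketch.
struct Trace {
    int assignments = 0;
    int destructions = 0;
    int value_before_assignment = -1;
};
Trace trace;

class Foo {
public:
    int a = 0;  // initialized here, unlike the original, to avoid reading an indeterminate value
    Foo() = default;
    Foo(const Foo&) = default;
    Foo& operator=(const Foo& other) {
        trace.value_before_assignment = a;  // what bar held just before being overwritten
        ++trace.assignments;
        a = other.a;
        return *this;
    }
    ~Foo() { ++trace.destructions; }
};

Foo newObj() {
    Foo obj;
    return obj;
}

// Same statements as main() in the question; returns bar.a after the assignment.
int run_example() {
    Foo bar = newObj();
    bar.a = 5;
    bar = newObj();  // overwrites bar, then destroys the temporary
    return bar.a;
}
```

After run_example() returns, trace.value_before_assignment is 5 and the returned value is 0, which shows that the 5 was overwritten rather than destroyed. The destructor count is at least two (the temporary and bar) and can be higher if the compiler does not elide the copies inside newObj.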
I am reading a Java to C++ crash course; among other things, it talks about memory management in C++. An example is given to show what must not be done:
Foo& FooFactory::createBadFoo(int a, int b)
{
Foo aLocalFooInstance(a, b); // creates a local instance of the class Foo
return aLocalFooInstance; // returns a reference to this instance
}
This would not work because aLocalFooInstance leaves scope and is destroyed. Fine, makes sense to me. Now as one solution to this problem the following code is given:
Foo FooFactory::createFoo(int a, int b)
{
return Foo(a, b); // returns an instance of Foo
}
What I don't understand: why is the second example valid C++ code? Is the basic issue not the same in both examples, that is, that an instance of Foo is created, which would go out of scope and is thus destroyed when we return from the method?
Say we have the code
Foo FooFactory::createFoo(int a, int b)
{
return Foo(a, b); // returns an instance of Foo
}
int main() {
Foo foo = FooFactory::createFoo(0, 0);
}
It is important to distinguish between the various Foo objects created.
Conceptually, execution proceeds as follows:
1) When the expression Foo(a, b) in the return statement is evaluated, a temporary object of type Foo is created.
2) After the expression in the return statement has been evaluated, the return value itself is initialized from that expression. This results in the creation of another temporary of type Foo.
3) The temporary created in step 1 is destroyed.
4) The temporary created in step 2 is the result of the function call expression FooFactory::createFoo(0, 0) in the calling function. That temporary is used to initialize the non-temporary object foo.
5) The temporary created in step 2 is destroyed.
In the presence of copy elision and return value optimization, it is possible for both temporaries to be elided.
Note that if the function returns by reference, then step 2 does not create a new object; it only creates a reference. Hence, after step 3, the object referred to does not exist anymore, and in step 4, the initialization will occur from a dangling reference.
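Whether or not the temporaries are elided, the by-value version always leaves exactly one valid object in the caller. A small sketch of that, assuming a free function createFoo instead of the FooFactory member and a hypothetical live counter that tracks how many Foo objects currently exist:

```cpp
#include <cassert>

int live = 0;  // hypothetical counter: number of Foo objects currently alive

class Foo {
public:
    int a, b;
    Foo(int a_, int b_) : a(a_), b(b_) { ++live; }
    Foo(const Foo& other) : a(other.a), b(other.b) { ++live; }
    ~Foo() { --live; }
};

// Free-function stand-in for FooFactory::createFoo.
Foo createFoo(int a, int b) {
    return Foo(a, b);  // returns an instance of Foo by value
}

int sum_from_factory() {
    Foo foo = createFoo(1, 2);
    // Any temporaries from the steps above are destroyed (or were never created) by now.
    assert(live == 1);
    return foo.a + foo.b;  // foo is a complete, valid object
}
```

The counter is 1 inside sum_from_factory regardless of how many copies the compiler made along the way, and drops back to 0 once foo goes out of scope.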
Because the second example returns the object by value, not by reference.
The first example returns a reference to an instance of an object, the second example returns an instance of an object.
The comments you showed even state that, explicitly.
In the first example, only a reference is returned, and the referenced object gets destroyed.
In the second example, the object itself gets returned. Which means that the object "continues to exist", in a manner of speaking, and it winds up wherever the code that calls this function puts that object.
When you return a reference, you do not extend the lifetime of the object, so it gets destroyed at the closing brace of the function and your reference now refers to nothing.
If you return an object by value, then that object gets copied (unless RVO applies, which is a different mechanism with effectively the same result) to outside the function before it is destroyed, so you now have a copy of the created object that is in perfect working order and can be used without problems.
In the second example you are (logically) returning a copy. Imagine if you were returning an int instead of a Foo:
int FooFactory::createInt(int a, int b)
{
return a+b;
}
Can you see that this would not be an issue?
In C++, a Foo will work for the same reason an int does, as long as Foo is movable or copyable.
Suppose I have a class Student with the method:
Student Student::method(Student x)
{
//nothing important
return x;
}
The copy constructor is called twice, once when the object x is sent as a parameter and a second time when x is returned from the function.
Why and when is the destructor for class Student called twice when I call this method?
The call is like this: a = b.method(c), where a, b and c are Student objects.
For your example, a = b.method(c);, there are three copies that may take place, save for copy elision. The first is when the c object is copied into the function parameter x. The second is when the x object is returned from the function. The third is when the return value is copied into the a object. The first two involve the copy constructor and the last involves the copy assignment operator, unless you change it to Student a = b.method(c);, in which case they all use the copy constructor.
a, b, and c will all be destroyed at the end of their scope. The object x will be destroyed at the end of the method function. The return value of the function will be destroyed at the end of the full expression that contains it - that is, once a = b.method(c); has finished.
However, not all of these copies must occur - the compiler is allowed to elide or omit the copy/move construction of a class under certain situations. The first copy into the function parameter will occur. The second copy out of the function will be treated as a move first, before attempting to copy it. This copy or move may be elided. The final copy, from temporary return value to a, will occur if you're using copy assignment, but may be elided if you use the copy constructor (as in Student a = b.method(c);).
If two Student objects are constructed, they must be destructed. The copies into the parameter and out of the return value need destructing.
The destructor for x is called when the function returns (after x has been copied in to the return value).
The destructor for the return value is called at the end of the full-expression containing the function call (unless the return value has its lifetime extended by being assigned to a reference).
Every object with automatic storage duration that is constructed will automatically be destructed (usually in reverse order of construction). You construct two objects (x and the return value) and so there are two destructor calls.
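The counts for a = b.method(c) can be checked with a class that tallies its special member calls. This is a sketch under the assumption that Student declares only a copy constructor and copy assignment (no move operations); Counts, counts and run_call are hypothetical names.

```cpp
#include <cassert>

// Hypothetical tallies for this sketch.
struct Counts {
    int copies = 0;
    int assigns = 0;
    int dtors = 0;
};
Counts counts;

class Student {
public:
    Student() = default;
    Student(const Student&) { ++counts.copies; }
    Student& operator=(const Student&) { ++counts.assigns; return *this; }
    ~Student() { ++counts.dtors; }

    Student method(Student x) {
        // nothing important
        return x;  // a function parameter is not eligible for copy elision here
    }
};

// Returns the tallies for the single statement a = b.method(c).
Counts run_call() {
    Student a, b, c;
    counts = Counts{};
    a = b.method(c);  // copy c into x, copy x into the return value, assign to a
    return counts;    // snapshot taken before a, b and c are destroyed
}
```

The snapshot shows two copy constructions, one assignment and two destructor calls: one for the parameter x and one for the temporary return value.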
I've been reading Meyers' book and came across the item on returning by reference/pointer vs. by value.
The point is, if our function for example is like this:
ClassA& AddSomething(ClassA classA)
{
ClassA tempClassA;
//... do something on tempClassA
return tempClassA;
}
This would not work because we are returning a reference to an object that was created on the stack and is dead once the function is done.
He gives two solutions:
1) Using a local static ClassA inside the function. This has its problems, but at least we can be sure that the object exists.
2) Return as an object:
ClassA AddSomething(ClassA classA)
{
ClassA tempClassA;
//... do something on tempClassA
return tempClassA;
}
Now if I'm to do:
ClassA obj1;
ClassA obj2 = AddSomething(obj1);
My confusion now is, when executing this line:
A 'copy' of tempClassA is made and passed to the copy constructor
of ClassA (to initialize obj2)? OR
tempClassA is passed itself to the copy constructor of ClassA,
because copy constructor takes a reference.
So basically, is what's passed to the copy constructor a reference to tempClassA (which was created on the stack inside the function) or a reference to a copy of tempClassA?
Also, another question I have is, I have read that if I get a reference of a function local variable, in that case the local variable will not be deleted.
For example,
ClassA & classRef = AddSomething(obj1);
In this case, if AddSomething() is returning a reference, then classRef will not be referring to a destroyed object, because the local variable will be retained. Have I understood this correctly?
At worst, you're right: a copy of tempClassA is passed to the copy constructor. But compilers are allowed to eliminate that copy and construct the result in place from tempClassA. This is known as the "Return Value Optimization", or RVO. I don't know of a compiler that doesn't do this.
When an object is returned by value, there are two copies taking place: one from the local variable into the return value, and one from the return value into the target object. However, the implementation is allowed to elide one or both of these copies; this is called return value optimisation (RVO) in the first case, and copy elision in the second.
Object some_function() {
return Object(); // copy Object() into return value; candidate for RVO
}
Object another_function() {
Object obj;
return obj; // copy obj into return value; candidate for NRVO
}
Object result = some_function(); // copy return value into result; candidate for copy elision
The second function above is a candidate for a refinement of RVO called named return value optimisation (NRVO); the simplest form of RVO applies only to return statements that construct the return value in place.
Regarding your second question, lifetime extension only applies to const references to objects returned by value; your code in the second question will not extend the lifetime of any object. See Returning temporary object and binding to const reference for more details.
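For contrast, here is a sketch of the case where lifetime extension does apply: the function returns by value and the result is bound to a const reference. The live counter, the value member and read_through_reference are hypothetical and only serve to make the lifetimes observable.

```cpp
#include <cassert>

int live = 0;  // hypothetical counter: number of ClassA objects currently alive

class ClassA {
public:
    int value = 0;
    ClassA() { ++live; }
    ClassA(const ClassA& other) : value(other.value) { ++live; }
    ~ClassA() { --live; }
};

// Returns by value, so the caller receives its own object.
ClassA AddSomething(ClassA classA) {
    ClassA tempClassA;
    tempClassA.value = classA.value + 1;
    return tempClassA;
}

int read_through_reference() {
    ClassA obj1;
    const ClassA& classRef = AddSomething(obj1);  // temporary lives as long as classRef
    assert(live == 2);                            // obj1 and the temporary bound to classRef
    return classRef.value;                        // safe: the temporary still exists
}
```

If AddSomething returned ClassA& to its local instead, there would be no temporary for the reference to keep alive, and reading through classRef would be undefined behavior.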
You can never return a function local variable by reference. It would NOT work even if you used const reference to capture the returned value like this:
const ClassA& classRef = AddSomething(obj1);
Because if AddSomething returns a local object by reference, it'll be a dangling reference to a non-existing object by the time classRef gets to reference it.