It is stated on this website: http://www.tutorialspoint.com/cplusplus/cpp_variable_scope.htm
Variables that are declared inside a function or block are local
variables. They can be used only by statements that are inside that
function or block of code. Local variables are not known to functions
outside their own.
Then, in the following example;
class foo {
    /*....*/
};
foo bar(){
    foo f;
    return f;
}
int main(){
    foo fooReturn = bar();
}
How come, when bar() returns, fooReturn contains a valid object? Is
foo f similar to foo *f = new foo()? Are both objects on the heap?
thanks
daniel
No, foo f; is very different from foo *f = new foo();. The former foo is built on the stack, and its destructor is called automatically when it goes out of scope.
The latter foo is built on the heap and must be destroyed manually by calling delete.
But, in your sample code, the returned foo f is copied or moved (if foo provides move semantics, e.g. move constructor), out of the function bar(). So you have a valid object returned to the caller.
Note
To be more precise, there is an optimization that the C++ compiler may apply, the RVO (Return Value Optimization), which can avoid the copy or move of the returned foo.
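As a rough illustration of the point above, here is a minimal sketch that counts how many objects are alive. The names Tracked, makeTracked and useReturned are made up for this example and are not from the question. Whether the compiler copies, moves or elides, the caller ends up with a valid object and every object that was created is destroyed again.

```cpp
#include <string>
#include <utility>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    std::string label;
    static int alive;

    explicit Tracked(std::string l) : label(std::move(l)) { ++alive; }
    Tracked(const Tracked& other) : label(other.label) { ++alive; }
    Tracked(Tracked&& other) noexcept : label(std::move(other.label)) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

Tracked makeTracked() {
    Tracked local("made in makeTracked");
    return local;  // copied, moved, or elided (NRVO); the caller cannot tell the difference
}

std::string useReturned() {
    Tracked result = makeTracked();  // result is a fully valid object
    return result.label;
}
```

After useReturned() has finished, Tracked::alive is back to zero, so nothing leaks and nothing is destroyed twice.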
How come, when bar() returns, fooReturn contains a valid object?
Because the value of the return expression (f) is used to initialise fooReturn before it's destroyed. As long as the type has correct copy/move semantics, or the copy/move is elided, the resulting object will be valid.
Is foo f similar to foo *f = new foo()? Are both objects on the heap?
No, the first is an automatic variable, stored in the function's stack frame and destroyed when it goes out of scope. The second is a dynamic object, stored on the heap, and not destroyed without an explicit delete.
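To make the difference concrete, here is a small sketch (Counter and destroyedAfterScope are hypothetical names used only for illustration). It counts destructor calls to show that the automatic object is destroyed when its block ends, while the dynamic one lives until delete is called.

```cpp
// Hypothetical type that counts destructor calls.
struct Counter {
    static int destroyed;
    ~Counter() { ++destroyed; }
};
int Counter::destroyed = 0;

// Returns how many Counter objects had been destroyed when the inner block ended.
int destroyedAfterScope() {
    Counter::destroyed = 0;
    Counter* heapObject = nullptr;
    {
        Counter automaticObject;     // lives in the stack frame
        heapObject = new Counter();  // lives on the heap
    }                                // only automaticObject is destroyed here
    int afterScope = Counter::destroyed;
    delete heapObject;               // the heap object is destroyed only now
    return afterScope;
}
```

In modern code one would normally hold the heap object in a std::unique_ptr so that the delete cannot be forgotten; the raw new/delete is kept here only to mirror the question.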
When you return a local object from a function by value, the caller's object is initialized from it, conceptually with the copy (or move) constructor. In your example, fooReturn contains a copy of the f object (local to bar). After f is copied, it is destroyed.
I have the following code:
#include <stdio.h>
class Foo {
public:
    int a;
    ~Foo() { printf("Goodbye %d\n", a); }
};
Foo newObj() {
    Foo obj;
    return obj;
}
int main() {
    Foo bar = newObj();
    bar.a = 5;
    bar = newObj();
}
When I compile with g++ and run it, I get:
Goodbye 32765
Goodbye 32765
The number printed seems to be random.
I have two questions:
Why is the destructor called twice?
Why isn't 5 printed the first time?
I'm coming from a C background, hence the printf, and I'm having trouble understanding destructors, when they are called and how a class should be returned from a function.
Let's see what happens in your main function :
int main() {
Foo bar = newObj();
Here we just instantiate a Foo and initialize it with the return value of newObj(). No destructor is called here because of copy elision: to sum up very quickly, instead of copying/moving obj into bar and then destructing obj, obj is directly constructed in bar's storage.
bar.a = 5;
Nothing to say here. We just change bar.a's value to 5.
bar = newObj();
Here bar is copy-assigned [1] the returned value of newObj(), then the temporary object created by this function call is destructed [2]; this is the first Goodbye. At this point bar.a is no longer 5 but whatever was in the temporary object's a.
}
At the end of main(), local variables are destructed, including bar. This is the second Goodbye, which does not print 5 because of the previous assignment.
[1] No move assignment happens here: because of the user-defined destructor, no move assignment operator is implicitly declared.
[2] As mentioned by YSC in the comments, note that this destructor call has undefined behavior, because it is accessing a, which is uninitialized at this point. The assignment of bar with the temporary object, and particularly the assignment of a as part of it, also has undefined behavior for the same reason.
1) It's simple: there are two Foo objects in your code (in main and in newObj), so there are two destructor calls. Actually, this is the minimum number of destructor calls you would see; the compiler might create an unnamed temporary object for the return value, and if it had done that you would see three destructor calls. The rules on return value optimization have changed over the history of C++, so you may or may not see this behaviour.
2) Because the value of Foo::a is never 5 when the destructor is called. It's never 5 in newObj, and though it was 5 in main, it isn't by the time you get to the end of main (which is when the destructor is called).
I'm guessing your misunderstanding is that you think that the assignment statement bar = newObj(); should call the destructor, but that's not the case. During assignment an object gets overwritten, it doesn't get destroyed.
I think one of the main confusions here is object identity.
bar is always the same object. When you assign a different object to bar, you don't destroy the first one; you call operator=(const Foo&) (the copy assignment operator). It is one of the special member functions that the compiler can generate automatically (which it does in this case), and it just overwrites bar.a with whatever is in newObj().a. Provide your own operator= to see that, and when, this happens (and to confirm that a is indeed 5 before this happens).
bar's destructor is only called once - when bar goes out of scope at the end of the function. There is only one other destructor call - for the temporary returned by the second newObj(). The first temporary from newObj() is elided (the language allows it in this exact case because there is never really a point to creating and immediately destroying it) and initializes bar directly with the return value of newObj().
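The suggestion to provide your own operator= could look roughly like the sketch below. The events log, run() and occurrences() are hypothetical helpers added for illustration, and a is given an initial value of 0 (unlike in the question) so that nothing reads an uninitialized int.

```cpp
#include <string>
#include <vector>

// Hypothetical event log, used only to make the order of calls visible.
std::vector<std::string> events;

struct Foo {
    int a = 0;  // initialised here, unlike the original, to avoid undefined behavior
    Foo() = default;
    Foo(const Foo&) = default;
    Foo& operator=(const Foo& other) {
        // bar is overwritten, not destroyed: record the value being replaced
        events.push_back("assign over a=" + std::to_string(a));
        a = other.a;
        return *this;
    }
    ~Foo() { events.push_back("destroy a=" + std::to_string(a)); }
};

Foo newObj() {
    Foo obj;
    return obj;
}

void run() {
    Foo bar = newObj();  // initialization, not assignment
    bar.a = 5;
    bar = newObj();      // copy assignment, then the temporary is destroyed
}                        // bar is destroyed here

// Counts how often a given event was recorded.
int occurrences(const std::string& event) {
    int n = 0;
    for (const std::string& e : events) {
        if (e == event) ++n;
    }
    return n;
}
```

After calling run(), the log contains one "assign over a=5" entry and no "destroy a=5" entry: the 5 is overwritten by the assignment before any destructor sees it. The exact number of "destroy a=0" entries may depend on whether the compiler elides the copy of obj.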
Not that I don't trust my compiler, but I like to know what's going on. Let's say I have
struct Foo {
    std::string s;
};
and I want to create one of those (on the stack), fill in the very long string, and return it from my function.
Foo f() {
    Foo foo {my_very_long_string};
    return foo;
    // OR: return Foo {my_very_long_string};
}
I know there's such things as RVO and move semantics; how do I know that they're being used, and at runtime it's not allocating a new string with data on the heap, copying it, and freeing the old one? (Other than my program will get slow.)
Is it using a move constructor to reuse the string data? Or is it using RVO to actually return the same string?
NRVO or move for named objects
In the function:
Foo f() {
    Foo foo{my_very_long_string};
    return foo;
}
The object foo has a name (i.e.: foo), it is a named object.
Named RVO (NRVO), which is an optional optimization, may occur. If no NRVO takes place, then foo is moved, since it is a local object and therefore treated as an rvalue in this context (i.e.: the return statement).
RVO/copy elision or move for unnamed objects
However, in the function:
Foo f() {
    return Foo{my_very_long_string};
}
An unnamed object, the one resulting from Foo{my_very_long_string}, is involved.
As of C++17, the copy must be elided (i.e.: same effect as RVO, although the semantics differ).
Before C++17, RVO was an optional optimization that might occur. If it doesn't, then the object is moved, since Foo{my_very_long_string} is already an rvalue.
No heap allocation for a new string will happen in any of the cases above.
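One rough way to check this yourself is to compare the address of the string's character buffer inside the function with the address seen by the caller. The sketch below assumes a string long enough to defeat the small-string optimization and a mainstream standard library whose move constructor takes over the buffer; the names bufferInsideFunction, makeFoo and bufferWasReused are hypothetical.

```cpp
#include <cstdint>
#include <string>

struct Foo {
    std::string s;
};

// Address of the character buffer as seen inside makeFoo (hypothetical helper state).
std::uintptr_t bufferInsideFunction = 0;

Foo makeFoo() {
    Foo foo{std::string(1000, 'x')};  // long enough to live on the heap
    bufferInsideFunction = reinterpret_cast<std::uintptr_t>(foo.s.data());
    return foo;                       // NRVO or move, never a deep copy
}

// True if the caller's string uses the very same heap buffer.
bool bufferWasReused() {
    Foo result = makeFoo();
    return reinterpret_cast<std::uintptr_t>(result.s.data()) == bufferInsideFunction;
}
```

If a deep copy had been made, the two addresses would differ. Looking at the generated assembly, or adding logging copy and move constructors to a test type, are other ways to see which path the compiler took.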
I am reading a Java to C++ crash course that, among other things, talks about memory management in C++. An example is given to show what must not be done:
Foo& FooFactory::createBadFoo(int a, int b)
{
    Foo aLocalFooInstance(a, b); // creates a local instance of the class Foo
    return aLocalFooInstance; // returns a reference to this instance
}
This would not work because aLocalFooInstance leaves scope and is destroyed. Fine, makes sense to me. Now as one solution to this problem the following code is given:
Foo FooFactory::createFoo(int a, int b)
{
    return Foo(a, b); // returns an instance of Foo
}
What I don't understand: why is the second example valid C++ code? Is the basic issue not the same in both examples, that is, that an instance of Foo is created, which would go out of scope and is thus destroyed when we return from the method?
Say we have the code
Foo FooFactory::createFoo(int a, int b)
{
    return Foo(a, b); // returns an instance of Foo
}
int main() {
    Foo foo = FooFactory::createFoo(0, 0);
}
It is important to distinguish between the various Foo objects created.
Conceptually, execution proceeds as follows:
1. When the expression Foo(a, b) in the return statement is evaluated, a temporary object of type Foo is created.
2. After the expression in the return statement has been evaluated, the return value itself is initialized from that expression. This results in the creation of another temporary of type Foo.
3. The temporary created in step 1 is destroyed.
4. The temporary created in step 2 is the result of the function call expression FooFactory::createFoo(0, 0) in the calling function. That temporary is used to initialize the non-temporary object foo.
5. The temporary created in step 2 is destroyed.
In the presence of copy elision and return value optimization, it is possible for both temporaries to be elided.
Note that if the function returns by reference, then step 2 does not create a new object; it only creates a reference. Hence, after step 3, the object referred to does not exist anymore, and in step 4, the initialization will occur from a dangling reference.
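The object lifetimes described above can be observed with a small counting sketch. The counters and the free functions createFoo and useFoo are assumptions made for this illustration (the original uses a FooFactory member function). Under C++17 rules both temporaries are elided, so only one Foo is ever constructed; older compilers may report more, but every constructed object is also destroyed.

```cpp
// Hypothetical instrumented Foo that counts constructions and destructions.
struct Foo {
    static int constructed;
    static int destroyed;
    int a;
    int b;

    Foo(int a_, int b_) : a(a_), b(b_) { ++constructed; }
    Foo(const Foo& other) : a(other.a), b(other.b) { ++constructed; }
    ~Foo() { ++destroyed; }
};
int Foo::constructed = 0;
int Foo::destroyed = 0;

Foo createFoo(int a, int b) {
    return Foo(a, b);  // returned by value: the caller gets its own object
}

int useFoo() {
    Foo foo = createFoo(3, 4);  // foo stays valid until the end of this function
    return foo.a + foo.b;
}
```

After useFoo() returns, Foo::constructed and Foo::destroyed are equal, which shows that no object is used after its destruction and none is leaked.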
Because the second example returns the object by value, not by reference.
The first example returns a reference to an instance of an object, the second example returns an instance of an object.
The comments you showed even state that, explicitly.
In the first example, only a reference is returned, and the referenced object gets destroyed.
In the second example, the object itself gets returned. Which means that the object "continues to exist", in a manner of speaking, and it winds up wherever the code that calls this function puts that object.
When you return a reference, you do not extend the lifetime of the object, so it gets destroyed at the closing brace of the function and your reference now refers to nothing.
If you return an object by value, then that object gets copied (unless RVO applies, which is a whole different thing that works effectively the same) to outside the function before it is destroyed, so you now have a copy of the created object that is in perfect working order and can be used without problems.
In the second example you are (logically) returning a copy. Imagine if you were returning an int instead of a Foo:
int FooFactory::createInt(int a, int b)
{
    return a+b;
}
Can you see that this would not be an issue?
In C++, a Foo will work for the same reason an int does, as long as Foo is movable or copyable.
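Put side by side, the two cases look like this. This is only a sketch: the Foo definition with public members a and b is an assumption, and free functions are used instead of FooFactory members to keep the example self-contained.

```cpp
// Assumed minimal definition of Foo; the original course does not show it.
struct Foo {
    int a;
    int b;
    Foo(int a_, int b_) : a(a_), b(b_) {}
};

// Returning an int by value: the caller receives the value, not the local.
int createInt(int a, int b) {
    return a + b;
}

// Returning a Foo by value works for exactly the same reason.
Foo createFoo(int a, int b) {
    return Foo(a, b);
}

int sumOfNewFoo(int a, int b) {
    Foo foo = createFoo(a, b);  // foo is the caller's own object
    return foo.a + foo.b;
}
```

Nobody worries that the result of a + b "goes out of scope" when createInt returns, and the returned Foo behaves the same way.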
I am reading S. Lippman's book "Inside the C++ Object Model", which contains the following code:
class Foo { public: int val; Foo *pnext; };
void foo_bar()
{
    // Oops: program needs bar's members zeroed out
    Foo bar;
    Foo* baz = new Foo(); // this line I added myself
    if ( bar.val || bar.pnext ) {
        // ... do something
    }
    // ...
}
and it says that
"A default constructor is not synthesized for this code fragment.
Global objects are guaranteed to have their associated memory "zeroed out" at program start-up. Local objects
allocated on the program stack and heap objects allocated on the free-store do not have their associated memory
zeroed out; rather, the memory retains the arbitrary bit pattern of its previous use."
In this code the baz object is created on the heap, and according to the passage above it is not global, so no default constructor will be called for it. Do I understand correctly?
The parentheses in new Foo() specify value initialisation; this basically means that each member is zero-initialised. If instead you said new Foo, then the members would be left uninitialised, as they are for your automatic variable.
Unfortunately, to value-initialise the automatic variable, you can't write Foo bar(), since that declares a function. You'll need
Foo bar{}; // C++11
Foo bar = Foo(); // Historical C++
When you do this:
Foo* baz = new Foo();
you are dynamically allocating a Foo instance and value-initializing it. For PODs, this means the members get zero-initialized. If you had said this (assuming non-global context):
Foo* baz = new Foo;
then the Foo instance would be default initialized, which would mean no initialization of its members is performed, since they are PODs.
This also applies to automatic storage instances:
Foo f0; // default initialization: members not zeroed out.
Foo f1 = Foo(); // value initialization: members zeroed out.
Foo f2{}; // C++11 value initialization: members zeroed out.
Foo f3(); // Ooops! Function declaration. Something completely different.
If a class has no user-declared constructors, the compiler will implicitly declare a default constructor for you. It has to, or you would not be able to create instances of the class. However, for a class like this one, the generated default constructor does not initialize the members.
What adding the empty set of parentheses in new Foo() does is value-initialize the allocated object, which means the members get initialized to their "default" values: zero for integer and floating point values, and nullptr for pointers.
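A way to see value-initialization at work without ever reading an uninitialized object is to construct a Foo on top of memory that was deliberately filled with garbage. The sketch below uses placement new for that; the function name valueInitOverwritesGarbage and the 0xFF fill pattern are choices made for this illustration only.

```cpp
#include <cstring>
#include <new>

struct Foo {
    int val;
    Foo* pnext;
};

bool valueInitOverwritesGarbage() {
    // Raw storage filled with a non-zero bit pattern.
    alignas(Foo) unsigned char storage[sizeof(Foo)];
    std::memset(storage, 0xFF, sizeof storage);

    Foo* p = new (storage) Foo();  // value-initialization: members are zeroed
    bool heapStyleZeroed = (p->val == 0) && (p->pnext == nullptr);
    p->~Foo();

    Foo a{};        // C++11 value-initialization of an automatic object
    Foo b = Foo();  // pre-C++11 spelling of the same thing
    bool automaticZeroed = (a.val == 0) && (a.pnext == nullptr) &&
                           (b.val == 0) && (b.pnext == nullptr);

    return heapStyleZeroed && automaticZeroed;
}
```

Writing new (storage) Foo without the parentheses would leave the members with indeterminate values, and reading them would be undefined behavior, which is why that variant is not exercised here.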
I'm relatively new to C++ and I'm wondering if structs are copied in the following case:
#include <vector>

struct foo {
    int i;
    std::vector<int> bar;
};

class Foobar {
    foo m_foo;
public:
    void store(foo& f) {
        this->m_foo = f;
    }
};

int main() {
    Foobar foobar;
    {
        foo f;
        f.i = 1;
        f.bar.push_back(2);
        foobar.store(f);
    }
    // will a copy of f still exist in foobar.m_foo, or am I storing a NULL-Pointer at this point?
}
The reason why I am asking this is that I am originally a .NET developer and in .NET structures will be copied if you pass them to a function (and classes are not).
I'm pretty sure it would be copied if store was not declared to take f by reference, but I cannot change this code.
Edit: Updated the code, because I didn't know that the vector.insert would affect my question. In my case I store the struct as a member in a class, not a vector.
So my question really was: will f be copied at this->m_foo = f;?
Short answer: Yes.
Long answer: You'd have to get a pointer to a stack allocated struct and then let that struct go out of scope in order to end up with a dangling reference in your vector... but even then, you wouldn't have stored a NULL. C and C++ pointers are simple things, and will continue to point at a memory location long after that memory location has become invalid, if your code doesn't overwrite them.
It might also be worth noting that std::vector has a decent set of copy and move functions associated with it that will be called implicitly in this case, so the bar vector inside the struct will also be copied along with the simple integer i. Standard library classes tend to be quite well written, but code by other folk has no such guarantee!
Now, as regards your edit:
class Foobar {
    foo m_foo;
    void store(foo& f) {
        this->m_foo = f;
    }
};
You will still not have any problems with the foo instance stored in m_foo. This is because this->m_foo = f invokes a copying operation, as m_foo is not a variable of a reference or pointer type. If you had this instead: foo& m_foo then you would run into difficulties because instead of copying a foo instance you are instead copying a reference to a foo instance, and when that instance goes out of scope, the reference is no longer valid.
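A minimal sketch of this behavior follows. The stored() accessor and the copyOutlivesOriginal() function are hypothetical additions, made only so the stored copy can be inspected from outside the class.

```cpp
#include <vector>

struct foo {
    int i = 0;
    std::vector<int> bar;
};

class Foobar {
public:
    void store(foo& f) {
        m_foo = f;  // copy assignment: i and the whole vector are copied
    }
    // Hypothetical accessor, added only so the copy can be inspected.
    const foo& stored() const { return m_foo; }

private:
    foo m_foo;  // a foo object, not a reference or pointer
};

bool copyOutlivesOriginal() {
    Foobar foobar;
    {
        foo f;
        f.i = 1;
        f.bar.push_back(2);
        foobar.store(f);
        // Changing f afterwards does not affect the stored copy.
        f.i = 99;
        f.bar.clear();
    }  // f is destroyed here
    const foo& copy = foobar.stored();
    return copy.i == 1 && copy.bar.size() == 1 && copy.bar[0] == 2;
}
```

Taking the parameter by reference only avoids a copy at the call; the assignment to the member is still a real copy, so the stored object is independent of f.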
Yes, the struct will be copied, in the following function:
foos.insert(f);
As a copy is made, you won't be storing a null pointer / null reference.
However, like you've said, it won't be copied when you call store(f); as the function accepts the argument as a reference.
Your edit will still make a copy of foo. You are assigning one instance of a variable to another instance of a variable. What you aren't doing is assigning one pointer (reference in C#) to another. It would be worth reading up on C++ object instances, pointers, and references.
A copy of f is made during foos.insert(f):
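// foos is the container from the original version of the question (its declaration is not shown here)
void store(foo& f) {
    foos.insert(f);
}
int main() {
    {
        foo f;
        f.i = 1;
        f.bar.push_back(2);
        store(f);
    }
    // at this point the local variable `f` has gone out of scope; it is destroyed and cleaned up
    // foos is holding the copy of `f`
}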