Returning value (reference, pointer and object) - c++

I have some difficulties with understanding what is really done behind returning values in C++.
Let's have following code:
class MyClass {
public:
int id;
MyClass(int id) {
this->id = id;
cout << "[" << id << "] MyClass::ctor\n";
}
MyClass(const MyClass& other) {
cout << "[" << id << "] MyClass::ctor&\n";
}
~MyClass() {
cout << "[" << id << "] MyClass::dtor\n";
}
MyClass& operator=(const MyClass& r) {
cout << "[" << id << "] MyClass::operator=\n";
return *this;
}
};
MyClass foo() {
MyClass c(111);
return c;
}
MyClass& bar() {
MyClass c(222);
return c;
}
MyClass* baz() {
MyClass* c = new MyClass(333);
return c;
}
I use gcc 4.7.3.
Case 1
When I call:
MyClass c1 = foo();
cout << c1.id << endl;
The output is:
[111] MyClass::ctor
111
[111] MyClass::dtor
My understanding is that in foo the object is created on the stack and then destroyed at the return statement, because that is the end of its scope. Returning is done by copying the object (copy constructor), and the copy is later assigned to c1 in main (assignment operator). If I'm right, why is there no output from the copy constructor or the assignment operator? Is this because of RVO?
Case 2
When I call:
MyClass c2 = bar();
cout << c2.id << endl;
The output is:
[222] MyClass::ctor
[222] MyClass::dtor
[4197488] MyClass::ctor&
4197488
[4197488] MyClass::dtor
What is going on here? I create a variable and return it, and the variable is destroyed because it is the end of its scope. The compiler tries to copy that variable with the copy constructor, but it is already destroyed, and that's why I get a random value? So what is actually in c2 in main?
Case 3
When I call:
MyClass* c3 = baz();
cout << c3->id << endl;
The output is:
[333] MyClass::ctor
333
This is the simplest case? I return a pointer to a dynamically created object that lives on the heap, so memory is allocated and not automatically freed. This is the case where the destructor isn't called and I have a memory leak. Am I right?
Are there any other cases or things that aren't obvious and I should know to fully master returning values in C++? ;) What is the recommended way to return an object from a function (if any)? Are there any rules of thumb?

May I just add that case #2 is one of the cases of undefined behavior in the C++ language: the code compiles, but using a reference to a local variable after the function has returned is undefined. A local variable has a precisely defined lifetime, and by returning it by reference you are returning a reference to a variable that no longer exists once the function returns. The value read through that reference is therefore practically random, and so is the behavior of the rest of your program, since anything at all can happen.
Most compilers will issue a warning when you try to do something like this (either return a local variable by reference, or by address) - gcc, for example, tells me something like this :
bla.cpp:37:13: warning: reference to local variable ‘c’ returned [-Wreturn-local-addr]
You should remember, however, that the compiler is not at all required to issue any kind of warning when a statement that may exhibit undefined behavior occurs. Situations such as this one, though, must be avoided at all costs, because they're practically never right.

Case 1:
MyClass foo() {
MyClass c(111);
return c;
}
...
MyClass c1 = foo();
is a typical case where RVO (here its named variant, NRVO) can be applied. This form is called copy-initialization, and the assignment operator is not used since the object is constructed in place, unlike in this situation:
MyClass c1(0); // MyClass as posted has no default constructor
c1 = foo();
where c1 is constructed, the local c in foo() is constructed, [ a copy of c is constructed ], c or the copy of c is assigned to c1, [ the copy of c is destructed ] and c is destructed. (What exactly happens depends on whether the compiler eliminates the redundant copy of c or not.)
Case 2:
MyClass& bar() {
MyClass c(222);
return c;
}
...
MyClass c2 = bar();
invokes undefined behavior since you are returning a reference to the local variable c, an object with automatic storage duration that is destroyed when bar() returns.
Case 3:
MyClass* baz() {
MyClass* c = new MyClass(333);
return c;
}
...
MyClass* c3 = baz();
is the most straightforward one since you control what happens, yet with a very unpleasant consequence: you are responsible for memory management. That is why you should avoid dynamic allocation of this kind whenever possible (and prefer Case 1).

1) Yes.
2) You have a random value because your copy c'tor and operator= don't copy the value of id. However, you are correct in assuming that you cannot rely on the value of an object after it has been destroyed.
3) Yes.

Related

passing value to reference works one way [duplicate]

struct A {
A(int) : i(new int(783)) {
std::cout << "a ctor" << std::endl;
}
A(const A& other) : i(new int(*(other.i))) {
std::cout << "a copy ctor" << std::endl;
}
~A() {
std::cout << "a dtor" << std::endl;
delete i;
}
void get() {
std::cout << *i << std::endl;
}
private:
int* i;
};
const A& foo() {
return A(32);
}
const A& foo_2() {
return 6;
}
int main()
{
A a = foo();
a.get();
}
I know that returning references to local values is bad. But, on the other hand, a const reference should extend a temporary object's lifetime.
This code produces UB output, so there is no lifetime extension.
Why? Can someone explain what is happening step by step?
Where is the fault in my reasoning chain?
foo():
A(32) - ctor
return A(32) - a const reference to local object is created and is returned
A a = foo(); - a is initialized by foo() returned value, returned value goes out of scope(out of expression) and is destroyed, but a is already initialized;
(But actually destructor is called before copy constructor)
foo_2():
return 6 - temp object of type A is created implicitly,a const reference to this object is created(extending its life) and is returned
A a = foo(); - a is initialized by foo() returned value, returned value goes out of scope(out of expression) and is destroyed, but a is already initialized;
(But actually destructor is called before copy constructor)
Rules of temporary lifetime extension for each specific context are explicitly spelled out in the language specification. And it says that
12.2 Temporary objects
5 The second context is when a reference is bound to a temporary. [...] A temporary bound to the returned value in a function return statement
(6.6.3) persists until the function exits. [...]
Your temporary object is destroyed at the moment of function exit. That happens before the initialization of the recipient object begins.
You seem to assume that your temporary should somehow live longer than that. Apparently you are trying to apply the rule that says that the temporary should survive until the end of the full expression. But that rule does not apply to temporaries created inside functions. Such temporaries' lifetimes are governed by their own, dedicated rules.
Both your foo and your foo_2 produce undefined behavior, if someone attempts to use the returned reference.
You are misinterpreting "until function exit". If you really want to use a const reference to extend the life of an object beyond foo, use
A foo() {
return A(32);
}
int main() {
const A& a = foo();
}
You must return from foo by value, and then use a const reference to reference the return value, if you wish to extend things in the way you expect.
As @AndreyT has said, the object is destroyed in the function that has the const &. You want your object to survive beyond foo, and hence you should not have const & (or &) anywhere in foo or in the return type of foo. The first mention of const & should be in main, as that is the function that should keep the object alive.
You might think this return-by-value code is slow as there appear to be copies of A made in the return, but this is incorrect. In most cases, the compiler can construct A only once, in its final location (i.e. on the stack of the calling function), and then set up the relevant reference.

Create a non-temporary object in a constructor in c++ [duplicate]

I know that a temporary cannot be bound to a non-const reference, but it can be bound to const reference. That is,
A & x = A(); //error
const A & y = A(); //ok
I also know that in the second case (above), the lifetime of the temporary created out of A() extends till the lifetime of const reference (i.e y).
But my question is:
Can the const reference which is bound to a temporary, be further bound to yet another const reference, extending the lifetime of the temporary till the lifetime of second object?
I tried this and it didn't work. I don't exactly understand this. I wrote this code:
struct A
{
A() { std::cout << " A()" << std::endl; }
~A() { std::cout << "~A()" << std::endl; }
};
struct B
{
const A & a;
B(const A & a) : a(a) { std::cout << " B()" << std::endl; }
~B() { std::cout << "~B()" << std::endl; }
};
int main()
{
{
A a;
B b(a);
}
std::cout << "-----" << std::endl;
{
B b((A())); //extra braces are needed!
}
}
Output (ideone):
A()
B()
~B()
~A()
-----
A()
B()
~A()
~B()
Why the difference in output? Why is the temporary object A() destroyed before the object b in the second case? Does the Standard (C++03) talk about this behavior?
The standard considers two circumstances under which the lifetime of a temporary is extended:
§12.2/4 There are two contexts in which temporaries are destroyed at a different point than the end of the full-expression. The first context is when an expression appears as an initializer for a declarator defining an object. In that context, the temporary that holds the result of the expression shall persist until the object’s initialization is complete. [...]
§12.2/5 The second context is when a reference is bound to a temporary. [...]
Neither of those two allows you to extend the lifetime of the temporary by a later binding of the reference to another const reference. But ignore the standardese and think of what is going on:
Temporaries are created in the stack. Well, technically, the calling convention might mean that a returned value (temporary) that fits in the registers might not even be created in the stack, but bear with me. When you bind a constant reference to a temporary the compiler semantically creates a hidden named variable (that is why the copy constructor needs to be accessible, even if it is not called) and binds the reference to that variable. Whether the copy is actually made or elided is a detail: what we have is an unnamed local variable and a reference to it.
If the standard allowed your use case, then it would mean that the lifetime of the temporary would have to be extended all the way until the last reference to that variable. Now consider this simple extension of your example:
B* f() {
B * bp = new B(A());
return bp;
}
void test() {
B* p = f();
delete p;
}
Now the problem is that the temporary (let's call it _T) is bound in f(), and it behaves like a local variable there. The reference is bound inside *bp. Now that object's lifetime extends beyond the function that created the temporary, but because _T was not dynamically allocated, that is impossible.
You can try and reason the effort that would be required to extend the lifetime of the temporary in this example, and the answer is that it cannot be done without some form of GC.
No, the extended lifetime is not further extended by passing the reference on.
In the second case, the temporary is bound to the parameter a, and destroyed at the end of the parameter's lifetime - the end of the constructor.
The standard explicitly says:
A temporary bound to a reference member in a constructor's ctor-initializer (12.6.2) persists until the constructor exits.
§12.2/5 says “The second context [when the lifetime of a temporary
is extended] is when a reference is bound to a temporary.” Taken
literally, this clearly says that the lifetime should be extended in
your case; your B::a is certainly bound to a temporary. (A reference
binds to an object, and I don't see any other object it could possibly
be bound to.) This is very poor wording, however; I'm sure that what is
meant is “The second context is when a temporary is used to
initialize a reference,” and the extended lifetime corresponds to
that of the reference initialized with the rvalue expression creating
the temporary, and not to that of any other references which may later
be bound to the object. As it stands, the wording requires something
that simply isn't implementable: consider:
void f(A const& a)
{
static A const& localA = a;
}
called with:
f(A());
Where should the compiler put A() (given that it generally cannot see
the code of f(), and doesn't know about the local static when
generating the call)?
I think, actually, that this is worth a DR.
I might add that there is text which strongly suggests that my
interpretation of the intent is correct. Imagine that you had a second
constructor for B:
B::B() : a(A()) {}
In this case, B::a would be directly initialized with a temporary; the
lifetime of this temporary should be extended even by my interpretation.
However, the standard makes a specific exception for this case; such a
temporary only persists until the constructor exits (which again would
leave you with a dangling reference). This exception provides a very
strong indication that the authors of the standard didn't intend for
member references in a class to extend the lifetime of any temporaries
they are bound to; again, the motivation is implementability. Imagine
that instead of
B b((A()));
you'd written:
B* b = new B(A());
Where should the compiler put the temporary A() so that its lifetime
would be that of the dynamically allocated B?
Your example doesn't perform nested lifetime extension
In the constructor
B(const A & a_) : a(a_) { std::cout << " B()" << std::endl; }
The a_ here (renamed for exposition) is not a temporary. Whether an expression is a temporary is a syntactic property of the expression, and an id-expression is never a temporary. So no lifetime extension occurs here.
Here's a case where lifetime-extension would occur:
B() : a(A()) { std::cout << " B()" << std::endl; }
However, because the reference is initialized in a ctor-initializer, the lifetime is only extended until the end of the function. Per [class.temporary]p5:
A temporary bound to a reference member in a constructor's ctor-initializer (12.6.2) persists until the constructor exits.
In the call to the constructor
B b((A())); //extra braces are needed!
Here, we are binding a reference to a temporary. [class.temporary]p5 says:
A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full-expression containing the call.
Therefore the A temporary is destroyed at the end of the statement. This happens before the B variable is destroyed at the end of the block, explaining your logging output.
Other cases do perform nested lifetime extension
Aggregate variable initialization
Aggregate initialization of a struct with a reference member can lifetime-extend:
struct X {
const A &a;
};
X x = { A() };
In this case, the A temporary is bound directly to a reference, and so the temporary is lifetime-extended to the lifetime of x.a, which is the same as the lifetime of x. (Warning: until recently, very few compilers got this right).
Aggregate temporary initialization
In C++11, you can use aggregate initialization to initialize a temporary, and thus get recursive lifetime extension:
struct A {
A() { std::cout << " A()" << std::endl; }
~A() { std::cout << "~A()" << std::endl; }
};
struct B {
const A &a;
~B() { std::cout << "~B()" << std::endl; }
};
int main() {
const B &b = B { A() };
std::cout << "-----" << std::endl;
}
With trunk Clang or g++, this produces the following output:
A()
-----
~B()
~A()
Note that both the A temporary and the B temporary are lifetime-extended. Because the construction of the A temporary completes first, it is destroyed last.
In std::initializer_list<T> initialization
C++11's std::initializer_list<T> performs lifetime-extension as if by binding a reference to the underlying array. Therefore we can perform nested lifetime extension using std::initializer_list. However, compiler bugs are common in this area:
struct C {
std::initializer_list<B> b;
~C() { std::cout << "~C()" << std::endl; }
};
int main() {
const C &c = C{ { { A() }, { A() } } };
std::cout << "-----" << std::endl;
}
Produces with Clang trunk:
A()
A()
-----
~C()
~B()
~B()
~A()
~A()
and with g++ trunk:
A()
A()
~A()
~A()
-----
~C()
~B()
~B()
These are both wrong; the correct output is:
A()
A()
-----
~C()
~B()
~A()
~B()
~A()
In your first run, the objects are destroyed in the order they were pushed on the stack -> that is push A, push B, pop B, pop A.
In the second run, A's lifetime ends with the construction of b. Therefore, it creates A, it creates B from A, A's lifetime finishes so it is destroyed, and then B is destroyed. Makes sense...
I don't know about standards, but can discuss some facts which I saw in few previous questions.
The 1st output is as is for obvious reasons that a and b are in the same scope. Also a is destroyed after b because it's constructed before b.
I assume that you should be more interested in 2nd output. Before I start, we should note that following kind of object creations (stand alone temporaries):
{
A();
}
last only till the next ; and not for the block surrounding it. Demo. In your 2nd case, when you do,
B b((A()));
Thus A() is destroyed as soon as the creation of the B object finishes. Since a const reference can be bound to a temporary, this will not give a compilation error. However, it will surely result in a logic error if you try to access B::a, which is now bound to an object that has already been destroyed.
§12.2/5 says
A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full expression containing the call.
Pretty cut and dried, really.

Difference calling virtual through named member versus address or reference

Updated below: In clang, using an lvalue of a polymorphic object through its name does not activate virtual dispatch, but it does through its address.
For the following base class B and derived D, virtual function something, union Space
#include <iostream>
using namespace std;
struct B {
void *address() { return this; }
virtual ~B() { cout << "~B at " << address() << endl; }
virtual void something() { cout << "B::something"; }
};
struct D: B {
~D() { cout << "~D at " << address() << endl; }
void something() override { cout << "D::something"; }
};
union Space {
B b;
Space(): b() {}
~Space() { b.~B(); }
};
If you have a value s of Space, in Clang++: (update: incorrectly claimed g++ had the same behavior)
If you call s.b.something(), B::something() is called, with no dynamic dispatch on s.b. However, (&s.b)->something() does dispatch dynamically to what b really contains (either a B or a D).
The complete code is this:
union SpaceV2 {
B b;
SpaceV2(): b() {}
~SpaceV2() { (&b)->~B(); }
};
static_assert(sizeof(D) == sizeof(B), "");
static_assert(alignof(D) == alignof(B), "");
#include <new>
int main(int argc, const char *argv[]) {
{
Space s;
cout << "Destroying the old B: ";
s.b.~B();
new(&s.b) D;
cout << "\"D::something\" expected, but \"";
s.b.something();
cout << "\" happened\n";
auto &br = s.b;
cout << "\"D::something\" expected, and \"";
br.something();
cout << "\" happened\n";
cout << "Destruction of D expected:\n";
}
cout << "But did not happen!\n";
SpaceV2 sv2;
new(&sv2.b) D;
cout << "Destruction of D expected again:\n";
return 0;
}
When I compile with -O2 optimization and run the program, this is the output:
$./a.out
Destroying the old B: ~B at 0x7fff4f890628
"D::something" expected, but "B::something" happened
"D::something" expected, and "D::something" happened
Destruction of D expected:
~B at 0x7fff4f890628
But did not happen!
Destruction of D expected again:
~D at 0x7fff4f890608
~B at 0x7fff4f890608
What surprises me is that setting the dynamic type of s.b using placement new leads to a difference calling something on the very same l-value through its name or through its address. The first question is essential, but I have not been able to find an answer:
Is doing placement new to a derived class, like new(&s.b) D undefined behavior according to the C++ standard?
If it is not undefined behavior, is this choice of not activating virtual dispatch through the l-value of the named member something specified in the standard or a choice in G++, Clang?
Thanks, my first question in S.O. ever.
UPDATE
The answer and the comment that refers to the standard are accurate: According to the standard, s.b will forever refer to an object of exact type B, the memory is allowed to change type, but then any use of that memory through s.b is "undefined behavior", that is, prohibited, or that the compiler can translate however it pleases. If Space was just a buffer of chars, it would be valid to in-place construct, destruct, change the type. Did exactly that in the code that led to this question and it works with standards-compliance AFAIK.
Thanks.
The expression new(&s.b) D; re-uses the storage named s.b, formerly occupied by a B, for storage of a new D.
However you then write s.b.something(); . This causes undefined behaviour because s.b denotes a B but the actual object stored in that location is a D. See C++14 [basic.life]/7:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
[...]
The second bullet point quoted above is not satisfied because the new type differs.
(There are other potential issues later in the code too but since undefined behaviour is caused here, they're moot; you'd need to have a major design change to avoid this problem).

Address of an object returned by value from a function

Considering the following minimal code:
class MyClass {
public:
MyClass() {}
};
MyClass myfunc() {
MyClass obj;
cout << "Address of obj in myFunc " << &obj << endl;
return obj;
}
int main() {
MyClass obj(myfunc());
cout << "Address of obj in main " << &obj << endl;
return 0;
}
I obtain the following output:
Address of obj in myFunc 0x7fff345037df
Address of obj in main 0x7fff3450380f
Now, just by adding a destructor in MyClass, I get the following output:
Address of obj in myFunc 0x7fffb6aed7ef
Address of obj in main 0x7fffb6aed7ef
Showing that both objects are now the same... Is this just a coincidence ?!
Also, what does exactly happen in:
MyClass obj(myfunc());
I have overloaded the copy constructor to print a message, but it never appears...
By adding a destructor (whatever it was that you actually did, you're not showing the code) the behavior changed to use Return Value Optimization, known as RVO.
Then a pointer to the caller's storage is passed to the function, and the function constructs the object directly in that storage, instead of e.g. copying a value in a processor register or set of registers.
The same calling convention, with a hidden result storage pointer, can also be used without RVO. Without RVO a copy or move is performed at the end of the function. The standard supports RVO optimization under certain conditions, but, while it can be reasonably expected, a compiler is not under any obligation to perform RVO.

const reference public member to private class member - why does it work?

Recently, I found an interesting discussion on how to allow read-only access to private members without obfuscating the design with multiple getters, and one of the suggestions was to do it this way:
#include <iostream>
class A {
public:
A() : _ro_val(_val) {}
void doSomething(int some_val) {
_val = 10*some_val;
}
const int& _ro_val;
private:
int _val;
};
int main() {
A a_instance;
std::cout << a_instance._ro_val << std::endl;
a_instance.doSomething(13);
std::cout << a_instance._ro_val << std::endl;
}
Output:
$ ./a.out
0
130
GotW#66 clearly states that object's lifetime starts
when its constructor completes successfully and returns normally. That is, control reaches the end of the constructor body or an earlier return statement.
If so, we have no guarantee that the _val member will have been properly created by the time we execute _ro_val(_val). So how come the above code works? Is it undefined behaviour? Or are primitive types granted some exception to the object's lifetime?
Can anyone point me to some reference which would explain those things?
Before the constructor is called, an appropriate amount of memory is reserved for the object, on the free store (if you use new) or on the stack if you create the object with automatic storage. This implies that the memory for _val is already allocated by the time you refer to it in the member initializer list; it is just not initialized yet.
_ro_val(_val)
Makes the reference member _ro_val refer to the memory allocated for _val, which might actually contain anything at this point of time.
There is still undefined behavior in your program, because _val is never initialized before it is read. You should explicitly initialize _val to 0 (or some other value you choose) in the member initializer list or the constructor body. The output 0 in this case is just luck; you might get some other value, since _val is left uninitialized. See the behavior here on gcc 4.3.4, which demonstrates the UB.
But as for the question itself: yes, binding the reference to _val in the initializer list is well-defined.
The object's address does not change.
I.e. it's well-defined.
However, the technique shown is just premature optimization. You don't save programmers' time. And with modern compiler you don't save execution time or machine code size. But you do make the objects un-assignable.
Cheers & hth.,
In my opinion, it is legal (well-defined) to initialize a reference with an uninitialized object. That is legal but standard (well, the latest C++11 draft, paragraph 8.5.3.3) recommends using a valid (fully constructed) object as an initializer:
A reference shall be initialized to refer to a valid object or function.
The next sentence from the same paragraph throws a bit more light at the reference creation:
[Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior.]
I understand that reference creation means binding reference to an object obtained by dereferencing its pointer and that probably explains that the minimal prerequisite for initialization of reference of type T& is having an address of the portion of the memory reserved for the object of type T (reserved, but not yet initialized).
Accessing uninitialized object through its reference can be dangerous.
I wrote a simple test application that demonstrates reference initialization with uninitialized object and consequences of accessing that object through it:
class C
{
public:
int _n;
C() : _n(123)
{
std::cout << "C::C(): _n = " << _n << " ...and blowing up now!" << std::endl;
throw 1;
}
};
class B
{
public:
// pC1- address of the reference is the address of the object it refers
// pC2- address of the object
B(const C* pC1, const C* pC2)
{
std::cout << "B::B(): &_ro_c = " << pC1 << "\n\t&_c = " << pC2 << "\n\t&_ro_c->_n = " << pC1->_n << "\n\t&_c->_n = " << pC2->_n << std::endl;
}
};
class A
{
const C& _ro_c;
B _b;
C _c;
public:
// Initializer list: members are initialized in the order how they are
// declared in class
//
// Initializes reference to _c
//
// Fully constructs object _b; its c-tor accesses uninitialized object
// _c through its reference and its pointer (valid but dangerous!)
//
// construction of _c fails!
A() : _ro_c(_c), _b(&_ro_c, &_c), _c()
{
// never executed
std::cout << "A::A()" << std::endl;
}
};
int main()
{
try
{
A a;
}
catch(...)
{
std::cout << "Failed to create object of type A" << std::endl;
}
return 0;
}
Output:
B::B(): &_ro_c = 001EFD70
&_c = 001EFD70
&_ro_c->_n = -858993460
&_c->_n = -858993460
C::C(): _n = 123 ...and blowing up now!
Failed to create object of type A