Does lifetime extension also construct an object on the stack? - c++

Given a function returning an object of class type T
T GetT()
{
T t;
...
return t;
}
To me, the only difference between lifetime extension
const T& t1 = GetT();
and normal construct
T t2 = GetT(); // copy elision
is that t1 is a const reference, while t2 is a normal lvalue. They have the same behavior other than types.
Does lifetime extension do anything different from simply creating an object on the stack but limit its access as const?
Does lifetime extension do more magic to reduce the cost of constructing a normal object?

Does lifetime extension do anything different from simply creating an object on the stack but limit its access as const?
No, it doesn't do more than binding to the object returned from the function, and yes that object has automatic storage duration. But lifetime extension doesn't have to make things const. A rvalue reference can do lifetime extension as well, without needing to make things const.
T&& t3 = GetT();
Does lifetime extension do more magic to reduce the cost of constructing a normal object?
No, the object is always going to need to be initialized and destroyed in the scope where GetT() appears. All the reference does is make those two events in the code be further apart. A local variable will behave the same.
Where references behave differently, is if you have inheritance thrown into the mix, along with overload resolution.
class Base{};
class D1 : public Base{};
class D2 : public Base{};
D1 get(int);
D2 get(char);
int main() {
Base&& b1 = get(0);
Base&& b2 = get('0');
}
In that example, b1 and b2 bind to objects of different types (and the compiler keeps track of the type, without slicing). That may not seem like a very useful feature in a world where auto&& exists, but it had its uses pre-c++11.

Related

Apparent bug in clang when assigning a r value containing a `std::string` from a constructor

While testing handling of const object members I ran into this apparent bug in clang. The code works in msvc and gcc. However, the bug only appears with non-consts which is certainly the most common use. Am I doing something wrong or is this a real bug?
https://godbolt.org/z/Gbxjo19Ez
#include <string>
#include <memory>
struct A
{
// const std::string s; // Oddly, declaring s const compiles
std::string s;
constexpr A() = default;
constexpr A(A&& rh) = default;
constexpr A& operator=(A&& rh) noexcept
{
std::destroy_at(this);
std::construct_at(this, std::move(rh));
return *this;
}
};
constexpr int foo()
{
A i0{}; // call ctor
// Fails with clang. OK msvc, gcc
// construction of subobject of member '_M_local_buf' of union with no active member is not allowed in a constant expression { return ::new((void*)__location) _Tp(std::forward<_Args>(__args)...); }
i0 = A{}; // call assign rctor
return 42;
}
int main() {
constexpr int i = foo();
return i;
}
For those interested, here's the full version that turns const objects into first class citizens (usable in vectors, sorting, and such). I really dislike adding getters to maintain immutability.
https://godbolt.org/z/hx7f9Krn8
Yes this is a libstdc++ or clang issue: std::string's move constructor cannot be used in a constant expression. The following gives the same error:
#include <string>
constexpr int f() {
std::string a;
std::string b(std::move(a));
return 42;
}
static_assert(f() == 42);
https://godbolt.org/z/3xWxYW717
https://en.cppreference.com/w/cpp/compiler_support does not show that clang supports constexpr std::string yet.
Your game of "construct a new object in place of the old one" is the problem.
It is completely forbidden if the object is const or contains any const member subobjects, due to the following rule in [basic.life] (note that a rewrite¹ of this rule is proposed in post-C++17 drafts):
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied,
and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers)
and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type
and
the original object was a most derived object of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
You have to abide by this rule for the purposes both of return *this; and also the implicit destructor call.
It also doesn't work during constexpr evaluation.
... this one is specific to the fact that std::string small-string optimization may be implemented using a union, and changing an active union member has been forbidden during constexpr evaluation, although this rule too seems to have changed post-C++17.
¹ I consider said change to be misguided (it doesn't even permit the pattern it was supposed to fix) and to break legitimate coding patterns. While it's true that a pointer to a const-qualified object only made my view read-only and did not let me assume that the object wasn't being changed by someone else holding a pointer/reference that wasn't so qualified, in the past if I was given a pointer (qualified or not) to an object with a const member, I was assured that no one was changing that member and I (or my optimizing compiler) could safely use a cached copy of that member (or data derived from that member value, such as a hash or a comparison result).
Apparently this is no longer true.
While changing the language rule may automatically remove all compiler optimizations that would have assumed immutability of a const member, there's no automatic patch to user-written code which was correct and bug-free under the old rules, for example std::map and std::unordered_map code using std::pair<const Key, Value>. Yet the DR doesn't appear to have considered this as a breaking change...
I was asked for a code snippet that illustrates a behavior change of existing valid code, here it is. This code was formerly illegal, under the new rules it's legal, and the map will fail to maintain its invariants.
std::map<int, T> m{data_source()};
/* new code, now legal */
for( auto& keyvalue : m ) {
int newkey = -keyvalue.first;
std::construct_at(&keyvalue.first, newkey);
// or new (&keyvalue.first) int(newkey);
}
/* existing valid code that breaks */
std::cout << m[some_key()];
Consider the new relaxed wording of the restriction
the original object is neither a complete object that is const-qualified nor a subobject of such an object
keyvalue.first is const-qualified, but it is not a complete object, and it is a subobject of a complete object (std::pair<const Key, Value>) that is not const-qualified. This code is now legal. It's not even against the spirit of the rule, the DR explicitly mentioned the intent to perform in-place replacement of container elements with const subobjects.
It's the implementation of std::map that breaks, along with all existing code that uses the map instance, an unfortunate action-at-a-distance resulting from addition of now-legal code.
Please note that the actual replacement of the key could take place in code that merely has the pointer &keyvalue (and needn't know that the std::pair instance is actually inside a std::map), so the stupidity of what's being done won't be so obvious.

What problems are solved by holding a Temporary using `const T &`?

Like, if I write code that looks like the following:
const int & const_value = 10;
What advantage am I gaining over code that would instead look like this:
int value = 10;
or
const int const_value = 10;
Even for a more complicated example, like
const enterprise::complicated_class_which_takes_time_to_construct & obj = factory.create_instance();
Thanks to copy elision, the reference version shouldn't be significantly faster or less memory-consuming than either of the following code snippets.
enterprise::complicated_class_which_takes_time_to_construct obj = factory.create_instance();
or
const enterprise::complicated_class_which_takes_time_to_construct obj = factory.create_instance();
Is there something I'm missing that explains the use of this construct, or a hidden benefit I'm not accounting for?
Edit:
The answer provided by @nwp supplied the example of a factory method that looks like the following:
using cc = enterprise::complicated_class_which_takes_time_to_construct;
struct Factory{
const cc &create_instance(){
static const cc instance;
return instance;
}
} factory;
While this is a good example of why you'd need the exact const reference construct in general, it doesn't answer why it would be needed for dealing with temporaries, or why this particular construct would be better than just seating the temporary in a proper stack-allocated object, which the compiler should still be able to copy-elide in normal situations.
Imagine the factory is "clever" and does something like
using cc = enterprise::complicated_class_which_takes_time_to_construct;
struct Factory{
const cc &create_instance(){
static const cc instance;
return instance;
}
} factory;
and further assume cc is not copyable. Then the const & version works while the other two don't. This is fairly common for classes that implement a cache and keep a local map of objects, where you get a reference to the object inside the map.
Edit:
It is possible a coding guideline would say Take return values from functions by const &, because that works universally with values and references.
cc foo();
cc &foo();
const cc &foo();
If you use const cc &var = foo(); it will work without an unnecessary copy in all the cases. Sometimes you get a temporary, sometimes you don't, but the code always does the right thing, freeing you to change the implementation and to not having to care about how the function you use returns its return-value. There is still the issue of having a const in there, so it is not the superior notation in all cases, but it is a viable design choice.
There's one "pathological" case that comes to mind: non-copyable and non-movable objects. Such an object cannot be stored by value, so holding it by a reference is your only chance:
#include <iostream>
struct Immovable
{
Immovable(int i) : i(i) {}
Immovable(const Immovable&) = delete;
Immovable(Immovable&&) = delete;
const int i;
};
Immovable factory()
{
return {42};
}
int main()
{
const Immovable &imm = factory();
std::cout << imm.i;
}
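On a less contrived note, you dismissed the "heavy object" optimisation by citing copy elision. Note, however, that copy elision was optional before C++17: you could never guarantee that the compiler would do it. (Since C++17, initializing an object from a function's prvalue result is guaranteed not to copy or move.) Storing the temporary in a const &, on the contrary, guarantees that no copying or moving will happen in the caller under any version of the standard.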
The rationale for the lifetime extension rule is known ¹(because Bjarne Stroustrup, the language creator, said so) to be to have uniform simple rules. In particular, the lifetime of a temporary used as actual argument in a function call, is extended to the end of the full-expression, covering the lifetime of any reference that it's bound to. And the rule for local reference to const, or local rvalue reference, makes that aspect of the behavior the same.
In C++03 one practical use case, as far as I know ²invented by Petru Marginean, was to create a local object of an unknown automatically deduced type, like this:
Base const& o = foo( arg1, arg2, arg3 );
This works when Base is an accessible base class of any possible foo result type, because the lifetime extension mechanism does not slice: it extends the lifetime of the full temporary object. When foo returns a T it's a complete T object that has its lifetime extended. The compiler knows T even though the programmer may not necessarily know T.
Of course, in C++11 and later one can instead, generally, use auto:
auto const o = foo( arg1, arg2, arg3 );
With return value optimization and/or moving this will be just as efficient as Marginean's trick, and it's more clear and simple.
However, when the type T is non-movable and non-copyable, then binding to a reference with lifetime extension of the temporary, appears to be the only way to hold on to it, which is a second use case.
Now, surprise?, the type T of the full temporary object needs not be derived from the statically known type. It's enough that compiler knows that you're holding a reference to a part of the temporary. Whether that part is a base class sub-object, or some other sub-object, doesn't matter, so you can do:
auto const& part = foo( arg1, arg2, arg3 ).a_part;
Which is a third use case, which is only about not introducing an otherwise never used name for the complete object, keeping the code simple.
Standardese about the hold-on-to-a-part (and thus the whole object) case:
C++15 §12.2/5 [class.temporary]
“The temporary to which the reference is bound or the temporary that is the complete object of a subobject to which the reference is bound persists for the lifetime of the reference except …”
The exceptions include that a temporary used as actual argument in a function call, persists till the end of the full-expression, i.e. a bit longer than any formal argument reference that it's bound to.
¹ In an exchange with me in the Usenet group comp.lang.c++.
² In the C++03 implementation of the ScopeGuard class, which was invented by Petru. In C++11 a ScopeGuard can be trivially implemented using std::function and auto in the client code's declaration.

Does reuse storage start lifetime of a new object? [duplicate]

This question already has answers here:
Is it allowed to write an instance of Derived over an instance of Base?
#include <cstdlib>
struct B {
virtual void f();
void mutate();
virtual ~B();
};
struct D1 : B { void f(); };
struct D2 : B { void f(); };
void B::mutate() {
new (this) D2; // reuses storage — ends the lifetime of *this
f(); // undefined behavior - WHY????
... = this; // OK, this points to valid memory
}
Could someone explain why the invocation of f() has UB? new (this) D2; reuses the storage, but it also calls a constructor for D2 and so starts the lifetime of a new object. In that case f() is equivalent to this->f(), that is, we just call the f() member function of D2. Why is it UB?
The standard shows this example (§3.8/7, N3690):
struct C {
int i;
void f();
const C& operator=( const C& );
};
const C& C::operator=( const C& other) {
if ( this != &other ) {
this->~C(); // lifetime of *this ends
new (this) C(other); // new object of type C created
f(); // well-defined
}
return *this;
}
C c1;
C c2;
c1 = c2; // well-defined
c1.f(); // well-defined; c1 refers to a new object of type C
Notice that this example is terminating the lifetime of the object before constructing the new object in-place (compare to your code, which does not call the destructor).
But even if you did, the standard also says:
If, after the lifetime of an object has ended and before the storage
which the object occupied is reused or released, a new object is
created at the storage location which the original object occupied, a
pointer that pointed to the original object, a reference that referred
to the original object, or the name of the original object will
automatically refer to the new object and, once the lifetime of the
new object has started, can be used to manipulate the new object, if:
- the storage for the new object exactly overlays the storage location
which the original object occupied, and
- the new object is of the same type as the original object (ignoring
the top-level cv-qualifiers), and
- the type of the original object is not const-qualified, and, if a
class type, does not contain any non-static data member whose type is
const-qualified or a reference type, and
- the original object was a most derived object (1.8) of type T and the
new object is a most derived object of type T (that is, they are not
base class subobjects).
notice the 'and' words, the above conditions must all be fulfilled.
Since you're not fulfilling all the conditions (you have a derived object in-placed into the memory space of a base class object), you have undefined behavior when referencing stuff with an implicit or explicit use of this pointer.
Depending on the compiler implementation this might or might not blow up. A base class object with virtual functions reserves some space for the vtable pointer, and in-place constructing an object of a derived type that overrides some of the virtual functions means the vtable might be different. Add alignment issues and other low-level internals, and a simple sizeof check won't suffice to determine whether your code is right or not.
This construct is very interesting:
Placement new does not call the destructor of the object that already occupies the storage. So this code will not properly end the life of the object.
So in principle you should call the destructor before reusing the storage. But then you would continue to execute a member function of an object that is dead. According to section 9.3.1/2 of the standard: "If a non-static member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined."
If you don't explicitly destroy your object, which is the case in your code, you then create a new object on top of it (constructing a second B without destroying the first one, then D2 on top of this new B).
When the creation of your new object is finished, the identity of your current object has in fact changed while executing the function. You cannot be sure whether the pointer to the virtual function that will be called was read before your placement new (thus the old pointer to D1::f) or after (thus D2::f).
By the way, it's exactly for this reason that there are some constraints on what you can or can't do in a union, where the same memory location is shared by different active objects (see point 9.5/2 and particularly point 9.5/4 in the standard).

why use a const non-reference when const reference lifetime is the length of the current scope

So in c++ if you assign the return value of a function to a const reference then the lifetime of that return value will be the scope of that reference. E.g.
MyClass GetMyClass()
{
return MyClass("some constructor");
}
void OtherFunction()
{
const MyClass& myClass = GetMyClass(); // lifetime of return value is until the end
// of scope due to magic const reference
doStuff(myClass);
doMoreStuff(myClass);
}//myClass is destructed
So it seems that wherever you would normally assign the return value from a function to a const object you could instead assign to a const reference. Is there ever a case in a function where you would want to not use a reference in the assignment and instead use an object? Why would you ever want to write the line:
const MyClass myClass = GetMyClass();
Edit: my question has confused a couple people so I have added a definition of the GetMyClass function
Edit 2: please don't try and answer the question if you haven't read this:
http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/
If the function returns an object (rather than a reference), making a copy in the calling function is necessary [although optimisation steps may be taken that means that the object is written directly into the resulting storage where the copy would end up, according to the "as-if" principle].
In the sample code const MyClass myClass = GetMyClass(); this "copy" object is named myclass, rather than a temporary object that exists, but isn't named (or visible unless you look at the machine-code). In other words, whether you declare a variable for it, or not, there will be a MyClass object inside the function calling GetMyClass - it's just a matter of whether you make it visible or not.
Edit2:
The const reference solution will appear similar to the following (not identical; this is really just written to explain what I mean, you can't actually do this):
MyClass __noname__ = GetMyClass();
const MyClass &myclass = __noname__;
It's just that the compiler generates the __noname__ variable behind the scenes, without actually telling you about it.
By making a const MyClass myclass the object is made visible and it's clear what is going on (and that the GetMyClass is returning a COPY of an object, not a reference to some already existing object).
On the other hand, if GetMyClass does indeed return a reference, then it is certainly the correct thing to do.
In some compilers, using a reference may even add an extra memory read when the object is being used, since the reference "is a pointer" [yes, I know, the standard doesn't say that, but please before complaining, do me a favour and show me a compiler that DOESN'T implement references as pointers with extra sugar to make them taste sweeter], so to use a reference, the compiler has to read the reference value (the pointer to the object) and then read the value inside the object through that pointer. In the case of the non-reference, the object itself is "known" to the compiler as a direct object, not a reference, saving that extra read. Sure, most compilers will optimise such an extra indirection away MOST of the time, but they can't always do that.
One reason would be that the reference may confuse other readers of your code. Not everybody is aware of the fact that the lifetime of the object is extended to the scope of the reference.
The semantics of:
MyClass const& var = GetMyClass();
and
MyClass const var = GetMyClass();
are very different. Generally speaking, you would only use the
first when the function itself returns a reference (and is
required to return a reference by its very semantics). And you
know that you need to pay attention to the lifetime of the
object (which is not under your control). You use the second
when you want to own (a copy of) the object. Using the first
in this case is misleading, can lead to surprises (if the
function also returns a reference to an object which is
destructed earlier) and is probably slightly less efficient
(although in practice, I would expect both to generate exactly
the same code if GetMyClass returns by value).
Performance
As most current compilers elide copies (and moves), both versions should have about the same efficiency:
const MyClass& rMyClass = GetMyClass();
const MyClass oMyClass = GetMyClass();
In the second case, either a copy or move is required semantically, but it can be elided per [class.copy]/31. A slight difference is that the first one works for non-copyable non-movable types.
It has been pointed out by Mats Petersson and James Kanze that accessing the reference might be slower for some compilers.
Lifetime
References should be valid during their entire scope just like objects with automatic storage are. This "should" of course is meant to be enforced by the programmer. So for the reader IMO there's no difference in the lifetimes implied by them. Although, if there was a bug, I'd probably look for dangling references (not trusting the original code / the lifetime claim for the reference).
In the case GetMyClass could ever be changed (reasonably) to return a reference, you'd have to make sure the lifetime of that object is sufficient, e.g.
SomeClass* p = /* ... */;
void some_function(const MyClass& a)
{
/* much code with many side-effects */
delete p;
a.do_something(); // oops!
}
const MyClass& r = p->get_reference();
some_function(r);
Ownership
A variable directly naming an object like const MyClass oMyClass; clearly states I own this object. Consider mutable members: if you change them later, it's not immediately clear to the reader that's ok (for all changes) if it has been declared as a reference.
Additionally, for a reference, it's not obvious that the object it's referring to does not change. A const reference only implies that you won't change the object, not that nobody will change the object(*). A programmer would have to know that this reference is the only way of referring to that object, by looking up the definition of that variable.
(*) Disclaimer: try to avoid unapparent side effects
I don't understand what you want to achieve. The reason that T const& can be bound (on the stack) to a T (by value) which is returned from a function is to make it possible for another function to take this temporary as a T const& argument. This spares you from having to create overloads. But the returned value has to be constructed anyway.
But today (with C++11) you can use const auto myClass = GetMyClass();.
Edit:
As an example of what can happen I will present something:
MyClass version_a();
MyClass const& version_b();
const MyClass var1 =version_a();
const MyClass var2 =version_b();
const MyClass& var3 =version_a();
const MyClass& var4 =version_b();
const auto var5 =version_a();
const auto var6 =version_b();
var1 is initialised with the result of version_a()
var2 is initialised with a copy of the object to which the reference returned by version_b() refers
var3 holds a const reference to the temporary which is returned and extends its lifetime
var4 is initialised with the reference returned from version_b()
var5 same as var1
var6 same as var2 (auto drops the reference, so the object is copied)
They are all semantically different. var3 works for the reason I gave above. var5 and var6 deduce the type of what is returned automatically, but auto on its own always declares an object, never a reference.
There is a major implication regarding the destructor actually being called. Check GotW #88, Q3 and A3. I put everything in a small test program (Visual C++, so forgive the stdafx.h)
// Gotw88.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
class A
{
protected:
bool m_destroyed;
public:
A() : m_destroyed(false) {}
~A()
{
if (!m_destroyed)
{
std::cout<<"A destroyed"<<std::endl;
m_destroyed=true;
}
}
};
class B : public A
{
public:
~B()
{
if (!m_destroyed)
{
std::cout<<"B destroyed"<<std::endl;
m_destroyed=true;
}
}
};
B CreateB()
{
return B();
}
int _tmain(int argc, _TCHAR* argv[])
{
std::cout<<"Reference"<<std::endl;
{
const A& tmpRef = CreateB();
}
std::cout<<"Value"<<std::endl;
{
A tmpVal = CreateB();
}
return 0;
}
The output of this little program is the following:
Reference
B destroyed
Value
B destroyed
A destroyed
Here a small explanation for the setup. B is derived from A, but both have no virtual destructor (I know this is a WTF, but here it's important). CreateB() returns B by value. Main now calls CreateB and first stores the result of this call in a const reference of type A. Then CreateB is called and the result is stored in a value of type A.
The result is interesting. First - if you store by reference, the correct destructor (B) is called; if you store by value, the B is sliced into a separate A object, so the temporary B is destroyed first and the A copy afterwards. Second - if you store in a reference, the destructor is called only once, which means there is only one object. By value results in 2 calls (to different destructors), which means there are 2 objects.
My advice - use the const reference. At least on Visual C++ it results in less copying. If you are unsure about your compiler, use and adapt this test program to check the compiler. How to adapt? Add copy / move constructor and copy-assignment operator.
I quickly added copy & assignment operators for class A & B
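A(const A& rhs)
{
m_destroyed = false; // the flag must be initialised in every constructor
std::cout<<"A copy constructed"<<std::endl;
}
A& operator=(const A& rhs)
{
std::cout<<"A copy assigned"<<std::endl;
return *this; // required: flowing off the end of a non-void function is undefined behaviour
}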
(same for B, just replace every capital A with B)
this results in the following output:
Reference
A constructed
B constructed
B destroyed
Value
A constructed
B constructed
A copy constructed
B destroyed
A destroyed
This confirms the results from above (please note, the A constructed results from B being constructed as B is derived from A and thus As constructor is called whenever Bs constructor is called).
Additional tests: Visual C++ accepts also the non-const reference with the same result (in this example) as the const reference. Additionally, if you use auto as type, the correct destructor is called (of course) and the return value optimization kicks in and in the end it's the same result as the const reference (but of course, auto has type B and not A).

What is the rationale for not allowing temporaries bound to references in initialization lists to live past end of ctor?

Simple example:
struct A
{
A() : i(int()) {}
const int& i;
};
Error from gcc:
a temporary bound to 'A::i' only persists until the constructor exits
Rule from 12.2p5:
A temporary bound to a reference member in a constructor’s
ctor-initializer (12.6.2) persists until the constructor exits.
Question
Does anybody know the rationale for this rule? It would seem to me that allowing the temporary to live until reference dies would be better.
I don't think that not extending the temporary's lifetime to the lifetime of the object needs justification. The opposite would!
The lifetime extension of a temporary extends merely to the enclosing scope, which is both natural and useful. This is because we have tight control on the lifetime of the receiving reference variable. By contrast, the class member isn't really "in scope" at all. Imagine this:
int foo();
struct Bar
{
Bar(int const &, int const &, int const &) : /* bind */ { }
int const & a, & b, & c;
};
Bar * p = new Bar(foo(), foo(), foo());
It'd be nigh impossible to define a meaningful lifetime extension for the foo() temporaries.
Instead, we have the default behaviour that the lifetime of the foo() temporary extends to the end of the full-expression in which it is contained, and no further.
The int() in your constructor initialization list is on the stack.
Once that value is set, int() goes out of the scope and the reference becomes invalid.
What memory would it live in?
For it to work the way you propose, it can't be on the stack since it has to live longer than any single function call. It can't be put after struct A in memory, because it would have no way to know how A was allocated.
At best, it would have to be turned into a secret heap allocation. That would then require a corresponding secret deallocation when the destructor runs. The copy constructor and assignment operator would need secret behavior too.
I don't know if anyone ever actually considered such a way of doing it. But personally, it sounds like too much extra complexity for a rather obscure feature.