Names have scope (a compile-time property), while objects have lifetimes (a runtime property). Right?
I often see people talking about temporary objects "going out of scope". But since a temporary object does not have a name, I think it does not make sense to talk about "scope" in this context. The lifetime of a temporary object is very clearly defined and has nothing to do with scope. Would you agree?
Names have scope (a compile-time property),
Yes. I would not call it a property, though. But basically yes.
while objects have lifetimes (a runtime property). Right?
There are three kinds of storage duration for variables. Each relates to lifetime differently.
Automatic storage duration
Static storage duration
Dynamic storage duration
Note: automatic storage duration objects have a lifetime that is bound to the scope of the variable.
I often see people talking about temporary objects "going out of scope".
Unless it is bound to a reference, a temporary is destroyed at the end of the full expression in which it was created. If it is bound to a reference (a const reference), it has the same lifespan as that reference. Sometimes it is just easier to refer to this as the scope, but technically you are correct.
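To make the two cases concrete, here is a minimal sketch. The Tracer type and the two function names are hypothetical helpers introduced only for this illustration; they record when the temporary is constructed and destroyed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: logs its own construction and destruction.
struct Tracer {
    std::vector<std::string>* log;
    explicit Tracer(std::vector<std::string>* l) : log(l) { log->push_back("ctor"); }
    ~Tracer() { log->push_back("dtor"); }
};

// Unbound temporary: destroyed at the end of the full expression.
std::vector<std::string> unbound_temporary() {
    std::vector<std::string> log;
    (void)Tracer(&log);                  // the temporary lives only for this statement
    log.push_back("next statement");
    return log;
}

// Temporary bound to a const reference: lives as long as the reference.
std::vector<std::string> bound_temporary() {
    std::vector<std::string> log;
    {
        const Tracer& ref = Tracer(&log);  // lifetime extended to that of ref
        (void)ref;
        log.push_back("next statement");
    }                                      // ref goes away here, and so does the temporary
    return log;
}

// Usage: the order of the log entries shows when each temporary died.
void check_temporary_lifetimes() {
    assert((unbound_temporary() == std::vector<std::string>{"ctor", "dtor", "next statement"}));
    assert((bound_temporary() == std::vector<std::string>{"ctor", "next statement", "dtor"}));
}
```

In the second function the destruction point happens to coincide with the end of the reference's scope, which is probably why people reach for the word "scope" here.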
But since a temporary object does not have a name, I think it does not make sense to talk about "scope" in this context.
Technically yes. But I think it just makes talking about it easier. To me (though not technically correct) the scope of a temporary (not bound) is the expression. It's easier to say than "the lifespan of the temporary variable".
The lifetime of a temporary object is very clearly defined and has nothing to do with scope. Would you agree?
Yes. But it still feels more natural to talk about scope (even if it is not technically correct), as most people understand what you are trying to imply. When you get down to the very technical stuff, though, you should use the correct terminology, and "scope" in this context is not correct.
The lifetime of temporaries has very little to do with syntactical blocks, but "scope" — as a word rather than a technical term — can be used in other ways. The important question is whether you are confused when people use "scope" to refer to temporaries. (It doesn't appear that you are, from my POV.)
Since you're talking about using the term to communicate with others, that communication is what's really important. If you were defining terms by writing a standardese document or trying to interpret such a document in the context of defined terms, the situation would be different. Interpreting ISO 14882 will, of course, involve communicating with others, so you would just have to ask for clarification if necessary, in that case.
It's counter-productive to make all non-standardese communication be standardese, and it's often better to use code in either case when high precision is required. The C++ standard extensively uses examples for this reason.
For another example, "calling a constructor" is often used, yet technically you can't call a ctor directly; instead, ctors are part of object initialization. This is why there's an explicit form of new solely to construct an object. (Interestingly, you can call a destructor directly.) However, I would expect that phrase to be understood in most contexts, though I wouldn't advocate using it in standardese contexts.
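As a rough sketch of that asymmetry: the "explicit form of new" is placement new, which only initializes an object in storage you already have, while the destructor can be named and called directly. The Counter type and the function name below are hypothetical, introduced only for this example.

```cpp
#include <cassert>
#include <new>   // placement new

// Hypothetical type that counts how many instances are currently alive.
struct Counter {
    static int alive;
    Counter() { ++alive; }
    ~Counter() { --alive; }
};
int Counter::alive = 0;

// Returns the number of live Counter objects while the in-place object exists.
int construct_and_destroy_in_place() {
    alignas(Counter) unsigned char buffer[sizeof(Counter)];
    Counter* p = new (buffer) Counter;   // placement new: initialization only, no allocation
    int during = Counter::alive;
    p->~Counter();                       // a destructor, unlike a constructor, can be called directly
    return during;
}

// Usage: one object was alive in between, none afterwards.
void check_in_place_construction() {
    assert(construct_and_destroy_in_place() == 1);
    assert(Counter::alive == 0);
}
```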
I've seen people say that "an object went out of scope" when it meant (in your parlance) "the lifetime of the object ended when the object's name went out of scope". If you use that short form, it's natural to say that temporary objects go out of scope, too.
Temporary objects do have names, albeit referable by the compiler only. Otherwise how would the compiler refer to them? Just because you can't refer to a temporary once it's instantiated doesn't mean the compiler can't refer to it.
f(Foo(), Bar());
The compiler has to refer to at least one of the temporaries even though you as a programmer can't refer to either of them. Temporary objects do have a scope.
Binding to a const reference extends the lifetime of a temporary to the lifetime of the reference, so in a sense, it does have something to do with scope in this particular case:
#include <string>

std::string foo() { return "hello"; }

int main()
{
    // The temporary returned by foo lives as long as bar, i.e. until bar's scope ends
    const std::string &bar = foo();
}
See this article from Herb Sutter:
Normally, a temporary object lasts
only until the end of the full
expression in which it appears.
However, C++ deliberately specifies
that binding a temporary object to a
reference to const on the stack
lengthens the lifetime of the
temporary to the lifetime of the
reference itself, and thus avoids what
would otherwise be a common
dangling-reference error.
After learning some Rust and its lifetime specifiers, borrowing semantics, etc., I came across a Rust sample that does not allow something that is allowed in C++. Why?
struct S {
std::string& str;
S(std::string&& value) : str(value) {}
};
It is not allowed and would cause an error if you actually did try that.
However, the name of a variable is always an lvalue, not an rvalue. What type the variable has doesn't matter at all. You need to call std::move on it to turn it into an rvalue. That's what std::move does.
When you use an rvalue reference variable, it behaves exactly like an lvalue reference variable. Both refer directly to the bound object as an lvalue. They differ only in how they can or cannot be initialized and in how they affect overload resolution and template argument deduction.
The point is to make it explicit when you potentially move from an object. Even if value is an rvalue reference, you may still use it multiple times in the function. But usually, once you move from the referenced object, you can't use it anymore, or at least it will have lost its state. Therefore it must be clear where a potential move can happen, while still allowing non-move usage. So it makes sense for the rule to force you to write std::move explicitly everywhere a move might happen.
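A small sketch of that rule, using a hypothetical overload set named which and a hypothetical function probe that are not part of the question's code. Nothing is actually moved here, because which only inspects the reference kind it was called with.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overloads that report which reference kind was selected.
inline const char* which(std::string&)  { return "lvalue"; }
inline const char* which(std::string&&) { return "rvalue"; }

inline std::string probe(std::string&& value) {
    std::string result = which(value);    // value is a name, hence an lvalue expression
    result += ",";
    result += which(std::move(value));    // std::move casts it back to an rvalue
    return result;
}

// Usage: the same parameter selects different overloads with and without std::move.
inline void check_probe() {
    assert(probe(std::string("x")) == "lvalue,rvalue");
}
```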
If the goal is to prevent a user from constructing S from a temporary, which is reasonable, then you should delete the constructor completely:
S(std::string&& value) = delete;
(There are technically some issues with this. A better, but complex approach is to follow what std::reference_wrapper does.)
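As a sketch of the simple variant, assuming S also gets a constructor taking an lvalue reference (an addition for illustration that is not in the question), the effect of the deleted overload can be checked at compile time:

```cpp
#include <string>
#include <type_traits>

struct S {
    std::string& str;
    S(std::string& value) : str(value) {}   // assumed extra overload: accept caller-owned lvalues
    S(std::string&& value) = delete;        // reject temporaries and other rvalues
};

// The deleted overload still takes part in overload resolution, so rvalues
// select it and construction fails at compile time.
static_assert(std::is_constructible<S, std::string&>::value, "lvalues are accepted");
static_assert(!std::is_constructible<S, std::string>::value, "rvalues are rejected");
```

This does not cover the issues hinted at above, and it does not reproduce what std::reference_wrapper does; it only shows the basic mechanism.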
However, the user already has to think about lifetime of the argument they pass to S. So you only really catch a small subset of potential mistakes with this. There is no way to protect the user from not keeping the passed object alive long enough. If you need to ensure this through S, then S must take (shared) ownership of the object, e.g. by using std::unique_ptr or std::shared_ptr instead of a reference. There is no borrow checker in C++ like there is in Rust to verify that the borrowed/referenced object is kept alive long enough.
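For comparison, a minimal sketch of the ownership-based alternative. OwningS and make_from_short_lived_owner are hypothetical names; this is one possible shape under the assumption that shared ownership is acceptable, not a drop-in replacement for S.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical owning variant of S: it shares ownership instead of borrowing.
struct OwningS {
    std::shared_ptr<const std::string> str;
    explicit OwningS(std::shared_ptr<const std::string> value) : str(std::move(value)) {}
};

inline OwningS make_from_short_lived_owner() {
    auto text = std::make_shared<const std::string>("hello");
    return OwningS(text);   // the local handle dies here, the string itself does not
}

// Usage: the string is still alive, and OwningS is now its only owner.
inline void check_owning_s() {
    OwningS s = make_from_short_lived_owner();
    assert(*s.str == "hello");
    assert(s.str.use_count() == 1);
}
```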
C++ lifetime and ownership management is mostly based on convention together with some utilities like the smart pointers. There is no intrinsic core language enforcement aside from automatic storage duration, although core language features like rvalue references are designed to support the conventions, e.g. as I described above.
Is it true that temporary objects are stored in dynamic (heap) memory?
The standard does not mandate any memory area (heap/stack) for them, but they behave just like local variables in "automatic storage": they are destructed at the end of the expression (or later when bound to a ref-to-const).
Most implementations will store them on the stack just like local variables.
edit:
As James Kanze pointed out: when the lifetime of a temporary is extended via a ref-to-const, its storage location is, on most implementations, somewhat determined by the storage location of that reference. That is, if the reference is in static storage, the temporary will be too (just confirmed on gcc). (Although IMHO, while this is still a temporary in the standard's sense, it is arguable whether it is a temporary in the intuitive English sense of the word.)
It depends on their lifetime. Temporaries you create inside a function that you don't bind to a local static reference to lengthen their lifetime will most likely be created on the stack. Temporaries you bind to local static references will most likely be stored in the .data section of your program binary. The same holds for temporaries you bind to nonlocal references. Temporaries that are created during initialization of a nonlocal variable, other than the one bound to by a reference, should be on the stack of the function that produces the value of that nonlocal variable.
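Where the object ends up in memory is not observable in portable code, but the lifetime side of the local static reference case can be sketched. Probe and the function name are hypothetical; the example only shows that the temporary is not destroyed when the function returns.

```cpp
#include <cassert>

// Hypothetical type that counts destructor calls.
struct Probe {
    static int destroyed;
    ~Probe() { ++destroyed; }
};
int Probe::destroyed = 0;

const Probe& extended_by_static_reference() {
    // The temporary takes on the lifetime of the static reference,
    // so it survives until program termination.
    static const Probe& ref = Probe();
    return ref;
}

// Usage: after the call has returned, no Probe has been destroyed yet.
void check_static_extension() {
    extended_by_static_reference();
    assert(Probe::destroyed == 0);
}
```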
Exception objects that represent the thrown object during unwinding are temporaries too. Those usually reside on the heap.
This is highly implementation dependent, but they probably reside in automatic storage.
Note that the scope can be counter-intuitive, because of optimizations.
The following:
class A
{
//...
};
//....
A foo()
{
A a;
return a;
}
Here, the object a doesn't necessarily reside only inside the function's scope, because RVO (in its named form, NRVO) can occur and construct it directly in the caller's storage.
Also, when passing a temporary object by value, it might not get destructed immediately.
void foo(A a);
//...
foo( A() );
Here, a temporary is not necessarily only alive in that line, but can be constructed directly inside the method's argument stack.
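A sketch that makes the elision visible by counting copy and move constructor calls. The counters, the name take, and run are hypothetical additions. How many moves happen depends on the compiler, the language standard, and the optimization flags, so the example only relies on the fact that no copy is needed.

```cpp
#include <cassert>

struct A {
    static int copies;
    static int moves;
    A() = default;
    A(const A&) { ++copies; }
    A(A&&) noexcept { ++moves; }
};
int A::copies = 0;
int A::moves = 0;

A foo() {
    A a;
    return a;        // NRVO candidate; if not elided, a is moved, not copied
}

void take(A) {}      // by-value parameter

void run() {
    A result = foo();   // may be constructed directly in result's storage
    take(A());          // the temporary may be built directly as the parameter
    (void)result;
}

// Usage: no copies in any case; the number of moves is implementation dependent.
void check_elision() {
    run();
    assert(A::copies == 0);
}
```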
Most (if not all) implementations store them on the stack (i.e. automatic storage), although I don't think the standard mandates it anywhere. It's certainly easier to do it like that, as the compiler has to guarantee the temporary variable's lifetime, and it is possible that said lifetime will encompass a recursive call of the same function, creating another instance of the temporary variable.
With GC on, is it possible for destructors to be called right after going out of scope?
Is it possible for destructors to be called on going out of scope for all objects [of any type]?
Why aren't destructors called when going out of scope anyway?
In this post "scope on a local variable" is said to be "unsafe". Why is it considered unsafe?
And the rationale for deprecating the feature is
scope as a type constraint was a quirk in the language without a compelling use case.
No compelling use case? As if placing objects on the stack (this feature does that, right?) isn't faster than placing them on the heap.
Generally, the D GC does not call destructors when returning from a function. The GC is triggered when an allocation occurs, and when GC.collect is called.
D has scoped, which wraps a non-RAII type in a struct, which has RAII behavior. This way, the struct's destructor can take care of cleaning up the memory. Note that, while this generally works, there are some corner cases and things to be aware of that a GC will handle automatically. This allows destructors to be called on any object when leaving a scope.
Destructors are not called on all objects when they leave scope because there may be other references to the objects. Consider this code:
int* global;
void func() {
int* p = new int;
global = p;
}
If the int pointed to by p was destructed when func() returned, then global would point to destructed memory.
The article you link is almost ten years old, and D has changed a bit in the meantime. scope now has better semantics, which meaningfully limit what you can do with variables marked such:
return ref
return scope
ref return scope
scope local variables may still be assigned to globals, which seems like an oversight. I'll file a bug if I can't find an existing one.
With GC on, is it possible for destructors to be called right after going out of scope?
Is it possible for destructors to be called on going out of scope for all objects [of any type]?
Destructors are called for any object on the stack; this applies to structs and to classes allocated as scope Foo f = new Foo().
In this post "scope on a local variable" is said to be "unsafe". Why is it considered unsafe?
Because one could escape a reference to the stack-allocated instance, which might then outlive the current function call. (But DIP 25 / 1000 detect most of these issues.)
And the rationale for deprecating the feature is [...]
No compelling use case? As if placing objects on the stack (this feature does that, right?) isn't faster than placing them on the heap.
Only scope attached to the type declaration is deprecated.
But as a general recommendation, use structs if your object requires deterministic destruction (e.g. File handles).
rvalue references: what exactly are "temporary" objects, what is their scope, and where are they stored?
Reading some articles, rvalues are always defined as "temporary" objects like Animal(), where Animal is a class, or some literal e.g. 10.
However, what is the formal definition of rvalues/"temporary" objects?
Is new Animal() also considered a "temporary" object? Or is it only values on the stack, like Animal() and literals stored in code?
Also, where are these "temporary" objects stored, what is their scope, and how long are rvalue references to these values valid?
Firstly it is important not to conflate the terms "rvalue" and "temporary object". They have very different meanings.
Temporary objects do not have a storage duration. Instead, they have lifetime rules that are specific to temporary objects. These can be found in section [class.temporary] of the C++ Standard; there is a summary on cppreference, which also includes a list of which expressions create temporary objects.
In practice I'd expect that a compiler would either optimize the object out, or store it in the same location as automatic objects are stored.
Note that "temporary object" only refers to objects of class type. The equivalent for built-in types are called values. (Not "temporary values"). In fact the term "values" includes both values of built-in type, and temporary objects.
A "value" is a completely separate idea to prvalue, xvalue, rvalue. The similarity in spelling is unfortunate.
Values don't have scope. Scope is a property of a name. In many cases the scope of a name coincides with the lifetime of the object or value it names, but not always.
The terms rvalue, lvalue etc. are value categories of an expression. These describe expressions, not values or objects.
Every expression has a value category. Also, every expression has a value, except expressions of void type. These are two different things. (The value of an expression has a non-reference type.)
An expression of value category rvalue may designate a temporary object, or a non-temporary object, or a value of built-in type.
The expressions which create a temporary object all have value category prvalue, however it is then possible to form expressions with category lvalue which designate that same temporary object. For example:
const std::string &v = std::string("hello");
In this case v is an lvalue, but it designates a temporary object. The lifetime of this temporary object matches the lifetime of v, as described in the earlier cppreference link.
Link to further reading about value categories
An rvalue reference is a reference that can only bind to an expression of value category rvalue. (This includes prvalue and xvalue). The word rvalue in its name refers to what it binds to, not its own value category.
All named references in fact have category lvalue. Once bound, there is no difference in behaviour between an rvalue reference and an lvalue reference.
std::string&& rref = std::string("hello");
rref has value category lvalue, and it designates a temporary object. This example is very similar to the previous one, except the temporary object is non-const this time.
Another example:
std::string s1("hello");
std::string&& rref = std::move(s1);
std::string& lref = s1;
In this case, rref is an lvalue, and it designates a non-temporary object. Further, lref and rref (and even s1!) are all indistinguishable from here on in, except specifically for the result of decltype.
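These claims can be checked at compile time. The sketch below relies on the fact that decltype((expr)), with the extra parentheses, yields T& for an lvalue, T&& for an xvalue, and plain T for a prvalue, while decltype(name) without them yields the declared type. The function name is hypothetical.

```cpp
#include <string>
#include <type_traits>
#include <utility>

inline void value_category_checks() {
    std::string s1("hello");
    std::string&& rref = std::move(s1);
    std::string& lref = s1;
    (void)rref;
    (void)lref;

    // Declared types differ...
    static_assert(std::is_same<decltype(rref), std::string&&>::value, "declared type of rref");
    static_assert(std::is_same<decltype(lref), std::string&>::value, "declared type of lref");

    // ...but as expressions all three names are lvalues.
    static_assert(std::is_same<decltype((rref)), std::string&>::value, "rref is an lvalue");
    static_assert(std::is_same<decltype((lref)), std::string&>::value, "lref is an lvalue");
    static_assert(std::is_same<decltype((s1)), std::string&>::value, "s1 is an lvalue");

    // For contrast: an xvalue and a prvalue.
    static_assert(std::is_same<decltype((std::move(s1))), std::string&&>::value, "xvalue");
    static_assert(std::is_same<decltype((std::string("hello"))), std::string>::value, "prvalue");
}
```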
There are two different things to consider. First of all, there's the language's point of view. Language specifications, such as the C++ standard(s), don't talk about things such as CPU registers, cache coherence, stacks (in the assembly sense), etc. Then, there's a real machine's point of view. Instruction set architectures (ISAs), such as the one(s) defined by the Intel manuals, do concern themselves with this stuff. This split exists, of course, because of portability and abstraction. There's no good reason for C++ to depend on x86-specific details, but a lot of bad ones. I mean, imagine if HelloWorld.cpp would only compile for your specific Core i7 model for no good reason at all! At the same time, you need CPU-specific stuff sometimes. For instance, how would you issue a CLI instruction in a portable way? We have different languages because we need to solve different tasks, and we have different ISAs because we need different means to solve them. There's a good reason why your smartphone doesn't use an Intel CPU, or why the Linux kernel is written in C and not, ahem... Brainfuck.
Now, from the language's point of view, an "rvalue" is a temporary value whose lifetime ends at the expression it is evaluated in.
In practice, rvalues are implemented the same way as automatic variables, that is, by storing their value on the stack, or in a register if the compiler sees fit. In the case of an automatic variable, the compiler can't store it in a register if its address is taken somewhere in the program, because registers have no "address". However, if its address is never taken, and no volatile stuff is involved, then the compiler's optimizer can place that variable into a register for optimization's sake. For rvalues, this is always the case, as you can't take an rvalue's address. From the language's point of view, they don't have one (Note: I'm using oldish C terminology here; see the comments for details, as there are way too many C++11 pitfalls to annotate here). This is necessary for some things to work properly. For instance, cdecl requires that small values be returned in the EAX register. Thus, all function calls must evaluate to an rvalue (consider references as pointers for simplicity's sake), because you can't take a register's address, as registers don't have one!
There's also the concept of "lifetime". From the language's perspective, once some object's lifetime "ends", it ceases to be, period. When it "begins" and "ends" depends on how the object is allocated:
For objects with dynamic storage, their lifetimes explicitly start by means of new expressions and explicitly end by means of delete expressions. This mechanism allows them to survive their original scope (e.g.: return new int;).
For objects with automatic storage, their lifetimes start when their scope is reached in the program flow, and end when their scope is exited.
For objects with static storage, their lifetimes start before main() is called and end once main() exits.
For objects with thread-local storage, their lifetimes start when their respective thread starts, and end when their respective thread exits.
Construction and destruction are respectively involved in an object's lifetime "start" and "end".
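The automatic and dynamic cases from the list above can be sketched as follows. Noisy and lifetimes_demo are hypothetical names; the log simply records when each object's lifetime begins and ends.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type that logs the start and end of its lifetime.
struct Noisy {
    std::vector<std::string>& log;
    std::string name;
    Noisy(std::vector<std::string>& l, std::string n) : log(l), name(std::move(n)) {
        log.push_back("begin " + name);
    }
    ~Noisy() { log.push_back("end " + name); }
};

std::vector<std::string> lifetimes_demo() {
    std::vector<std::string> log;
    Noisy* heap = nullptr;
    {
        Noisy local(log, "automatic");       // begins when its declaration is reached
        heap = new Noisy(log, "dynamic");    // begins at the new expression
    }                                        // "automatic" ends with its block
    log.push_back("block exited");
    delete heap;                             // "dynamic" ends only at the delete expression
    return log;
}

// Usage: the dynamic object outlives the block that created it.
void check_lifetimes() {
    assert((lifetimes_demo() == std::vector<std::string>{
        "begin automatic", "begin dynamic", "end automatic", "block exited", "end dynamic"}));
}
```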
From a real machine's point of view, bits are just bits, period. There are no "objects" but bits and bytes in memory cells and CPU registers. For things like an int, that is, a POD type, "ending its lifetime" translates into doing nothing at all. For non-trivially destructible non-POD types, a destructor must be called at the right moment. However, the memory/register that once contained the "object" is still there. It just happens that it can now be reused by something else.
Is new Animal() also considered a "temporary" object? Or is it only values on the stack, like Animal() and literals stored in code?
new Animal() allocates memory on the heap for an Animal, constructs it, and the whole expression evaluates to an Animal*. Such an expression is an rvalue itself, as you can't say something like &(new Animal()). However, the expression evaluates to a pointer, no? Such a pointer points to an lvalue, as you can say things such as &(*(new Animal())) (which will leak, though). I mean, if there's a pointer containing its address, it has an address, no?
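The two value categories involved can be confirmed at compile time. Animal below is a hypothetical empty class standing in for the one from the question; because decltype does not evaluate its operand, nothing is allocated and nothing leaks.

```cpp
#include <type_traits>

struct Animal {};   // hypothetical stand-in for the question's class

// The new expression itself is a prvalue of pointer type...
static_assert(std::is_same<decltype((new Animal())), Animal*>::value,
              "new Animal() is a prvalue of type Animal*");

// ...while dereferencing that pointer designates the object as an lvalue.
static_assert(std::is_same<decltype((*new Animal())), Animal&>::value,
              "*new Animal() is an lvalue of type Animal");
```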
Also, where are these "temporary" objects stored, what is their scope, and how long are rvalue references to these values valid?
As explained above, a "temporary object"'s scope is that of the expression that encloses it. For example, in the expression a(b * c) (assuming a is a function taking an rvalue reference as its single argument), b * c is an rvalue whose scope ends after the expression enclosing it, that is, a(...), is evaluated. After that, all remaining rvalue references to it that the function a may have somehow created out of its parameter are dangling and will cause your program to do funny things. In other words, as long as you don't abuse std::move or do other voodoo with rvalues, rvalue references are valid in the circumstances you'd expect them to be.