It may sound philosophical, but it isn't: in C++, can various (classes, scalars) objects exist outside of their lifetime? What is the existence of an object? What is the creation of an object? Is an object created when its lifetime starts?
(Edited for clarity: the question is not specifically about class types.)
I'm extremely confused and need someone to explain the terminology and basic concepts to me.
Note: Existing is the fact of being a thing. It's the most fundamental philosophical concept. It isn't an attribute of an object and I don't know nor do I care about the number of occurrences of the word "exist" in the standard text. Textbooks probably very rarely say that things "exist". I don't remember ever reading that numbers "exist" in registers or that expressions "exist" in the source code. Numbers fit in registers and the source code has expressions in it.
If we can refer to an object, it means it exists. If a pointer points to an object, that object exists.
This is best understood with [intro.object]/1:
An object is created by a definition, by a new-expression, when implicitly changing the active member of a union, or when a temporary object is created ([conv.rval], [class.temporary]). An object occupies a region of storage in its period of construction ([class.cdtor]), throughout its lifetime, and in its period of destruction ([class.cdtor]).
The second sentence there answers your question. If an object is able to occupy a region of storage outside of the period described as "its lifetime" (like during its construction and destruction), then clearly an object must be able to exist outside of its lifetime.
An object in C++ is an abstract concept. To machine code, it's just a big virtual array of bytes.
The physical "object" is a bunch of bytes in memory that get allocated for some purpose, and are associated some functions that act on those bytes. The lifetime of that object is the time when those bytes are used for that object's purpose, to when they are no longer used. They still exist afterwards, but they're available to be used by some other object.
Related
Assuming X and Y are suitable types for such usage, is it UB to use std::start_lifetime_as<X> on an area of memory in one thread as one type and use std::start_lifetime_as<Y> on the exact same memory in another thread? Does the standard say anything about this? If it doesn't, what is the correct interpretation?
There is no data race from such calls, since none of them access any memory locations, but since (without synchronization) neither thread can know that the other has not ended the lifetime of its desired object by reusing its storage for an object of the other type, the objects created cannot be used. (There are not “even odds” that one thread can use them because it “went last”: there is an execution where it didn’t, so relying on that would have undefined behavior.)
Object lifetime is actually one of the more underspecified parts of the standard, especially when it comes to concurrency (and in some places the wording is outright defective IMO), but I think this specific question is answerable with what's there.
First, let's get data races out of the way.
[intro.races]/21:
The execution of a program contains a data race if it contains two potentially concurrent conflicting actions [...]
[intro.races]/2:
Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.
[intro.memory]/3:
A memory location is either an object of scalar type that is not a bit-field or a maximal sequence of adjacent bit-fields all having nonzero width.
Two unrelated objects are definitely not the same 'memory location', so [intro.races]/21 doesn't apply.
However, [intro.object]/9 says:
Two objects with overlapping lifetimes that are not bit-fields may have the same address if one is nested within the other, or if at least one is a subobject of zero size and they are of different types; otherwise, they have distinct addresses and occupy disjoint bytes of storage.
This means that out of any two (unrelated) objects with overlapping storage, at most one can be within lifetime at any given point. [basic.life]/1.5 ensures this:
The lifetime of an object o of type T ends when: [...]
the storage which the object occupies is released, or is reused by an object that is not nested within o.
Accessing (reading or writing) an object outside its lifetime is not allowed ([basic.life]/4), and we've just established that X and Y can't both be within lifetime at the same time. So, if both threads proceed to access the created objects, the behavior is undefined: at least one will be accessing an object whose lifetime has ended.
Picture of text
To read something, we need somewhere to read into; that is, we need somewhere in the computer's memory to place what we read. We call such a "place" an object. An object is a region of memory with a type that specifies what kind of information can be placed in it. A named object is called a variable. For example, character strings are put into string variables and integers are put into int variables. You can think of an object as a "box" into which you can put a value of the object's type
An object is a region of memory with a type that specifies what kind of information can be placed in it? Here they give an example with ints and strings which are like built in datatypes, I already know some oop so I was kinda confused because like I thought an object was kinda its own separate entity with its own user created datatype? I get that all objects have their own region of memory but they're basically calling a variable of type int, age, with the value 42 an object. Is this correct?
This is the wording from the C++ standard [intro.object]:
The constructs in a C++ program create, destroy, refer to, access, and
manipulate objects. An object is created by a definition, by a
new-expression ([expr.new]), by an operation that implicitly creates
objects (see below), when implicitly changing the active member of a
union, or when a temporary object is created ([conv.rval],
[class.temporary]). An object occupies a region of storage in its
period of construction ([class.cdtor]), throughout its lifetime, and
in its period of destruction ([class.cdtor]).
There is more to read about objects, I just picked this paragraph from the intro so you can see that the term "object" in C++ is more general than "instance of a user defined class". Note how the last sentence is quite similar with what Bjarne wrote in the book. What you refer to (instance of a class) is called "class object".
A perhaps more readable introduction to objects in C++ can be found here: https://en.cppreference.com/w/cpp/language/object.
PS: I am not a historian, but I suppose that the term "object" was already in wide-spread use before Object Orientation came into existance. There have been objects before. They were not invented by OOP, just put into the focus. Also note that C++ is not a (pure) OOP language, but rather multi-paradigm.
but they're basically calling a variable of type int, age, with the value 42 an object. Is this correct?
They're correct.
I already know some oop
Modern OOP nomenclature is different from nomenclature of C++.
and strings which are like built in datatypes
The std::string type is a class i.e. a "user defined" data type (although it is provided by the standard library).
I'm trying to figure out whether the following is undefined behaviour. I have a feeling it's not UB, but my reading of the standard makes it look like it is UB:
#include <iostream>
struct A {
A() { std::cout << "1"; }
~A() { std::cout << "2"; }
};
int main() {
A a;
new (&a) A;
}
Quoting the C++11 standard:
basic.life¶4 says "A program may end the lifetime of any object by reusing the storage which the object occupies"
So after new (&a) A, the original A object has ended its lifetime.
class.dtor¶11.3 says that "Destructors are invoked implicitly for constructed objects with automatic storage duration ([basic.stc.auto]) when the block in which an object is created exits ([stmt.dcl])"
So the destructor for the original A object is invoked implicitly when main exits.
class.dtor¶15 says "the behavior is undefined if the destructor is invoked for an object whose lifetime has ended ([basic.life])."
So this is undefined behaviour, since the original A no longer exists (even if the new a now exists in the same storage).
The question is whether the destructor for the original A is called, or whether the destructor for the object currently named a is called.
I am aware of basic.life¶7, which says that the name a refers to the new object after the placement new. But class.dtor¶11.3 explicitly says that it's the destructor of the object which exits scope which is called, not the destructor of the object referred to by a name that exits scope.
Am I misreading the standard, or is this actually undefined behaviour?
Edit: Several people have told me not to do this. To clarify, I'm definitely not planning on doing this in production code! This is for a CppQuiz question, which is about corner cases rather than best practices.
You're misreading it.
"Destructors are invoked implicitly for constructed objects" … meaning those that exist and their existence has gone as far as complete construction. Although arguably not entirely spelled out, the original A does not meet this criterion as it is no longer "constructed": it does not exist at all! Only the new/replacement object is automatically destructed, then, at the end of main, as you'd expect.
Otherwise, this form of placement new would be pretty dangerous and of debatable value in the language. However, it's worth pointing out that re-using an actual A in this manner is a bit strange and unusual, if for no other reason than it leads to just this sort of question. Typically you'd placement-new into some bland buffer (like a char[N] or some aligned storage) and then later invoke the destructor yourself too.
Something resembling your example may actually be found at basic.life¶8 — it's UB, but only because someone constructed a T on top of an B; the wording suggests pretty clearly that this is the only problem with the code.
But here's the clincher:
The properties ascribed to objects throughout this International Standard apply for a given object only during its lifetime. [..] [basic.life¶3]
Am I misreading the standard, or is this actually undefined behaviour?
None of those. The standard is not unclear but it could be clearer. The intent though is that the new object's destructor is called, as implied in [basic.life]p9.
[class.dtor]p12 isn't very accurate. I asked Core about it and Mike Miller (a very senior member) said:
I wouldn't say that it's a contradiction [[class.dtor]p12 vs [basic.life]p9], but clarification is certainly needed. The destructor description was written slightly naively, without taking into consideration that the original object occupying a bit of automatic storage might have been replaced by a different object occupying that same bit of automatic storage, but the intent was that if a constructor was invoked on that bit of automatic storage to create an object therein - i.e., if control flowed through that declaration - then the destructor will be invoked for the object presumed to occupy that bit of automatic storage when the block is exited - even it it's not the "same" object that was created by the constructor invocation.
I'll update this answer with the CWG issue as soon as it is published. So, your code does not have UB.
Too long for a comment.
Lightness' answer is correct and his link is the proper reference.
But let's examine terminology more precisely. There is
"Storage duration", concerning memory.
"Lifetime", concerning objects.
"Scope", concerning names.
For automatic variables all three coincide, which is why we often do not clearly distinguish: A "variable goes out of scope". That is: The name goes out of scope; if it is an object with automatic storage duration, the destructor is called, ending the lifetime of the named object; and finally the memory is released.
In your example only name scope and storage duration coincide — at any point during its existence the name a refers to valid memory — , while object lifetime is split between two distinct objects at the same memory location and with the same name a.
And no, I think you cannot understand "constructed" in 11.3 as "fully constructed and not destroyed" because the dtor will be called again (wrongly) if the object's lifetime was ended prematurely by a preceding explicit destructor call.
In fact, that's one of the concerns with the concept of memory re-use: If construction of the new object fails with an exception the scope will be left and a destructor call will be attempted on an incomplete object, or on the old object which was deleted already.
I suppose you can imagine the automatically allocated, typed memory marked with a tag "to be destroyed" which is evaluated when the stack is unwound. The C++ runtime does not really track individual objects or their state beyond this simple concept. Since variable names are basically constant addresses it is convenient to think of "the name going out of scope" triggering the destructor call on the named object of the supposed type supposedly present at that location. If one of these suppositions is wrong all bets are off.
Imagine using placement new to create a struct B to the storage where the A a objects lives. In the end of the scope, the destructor of the struct A will be called (because the variable a of type A goes out of scope), even if an object of type B is in reallty living there right now.
As already cited:
"If a program ends the lifetime of an object of type T with static
([basic.stc.static]), thread ([basic.stc.thread]), or automatic
([basic.stc.auto]) storage duration and if T has a non-trivial
destructor,39 the program must ensure that an object of the original
type occupies that same storage location when the implicit destructor
call takes place;"
So after putting B into the a storage, you need to destroy B and put an A there again, to not violate the rule above. This somehow not apply here directly, because you are putting an A to an A, but it shows the behavior. It shows, that this thinking is wrong:
So the destructor for the original A object is invoked implicitly when
main exits.
There is no "original" object any longer. There is just an object currently alive in the storage of a. And thats it. And on the data currently sitting in a, a function is called, namely the destructor of A. Thats what the program compiles to. If it would magically kept track of all "original" objects you would somehow have a dynamic runtime behavior.
Additionally:
A program may end the lifetime of any object by reusing the storage
which the object occupies or by explicitly calling the destructor for
an object of a class type with a non-trivial destructor. For an object
of a class type with a non-trivial destructor, the program is not
required to call the destructor explicitly before the storage which
the object occupies is reused or released; however, if there is no
explicit call to the destructor or if a delete-expression
([expr.delete]) is not used to release the storage, the destructor
shall not be implicitly called and any program that depends on the
side effects produced by the destructor has undefined behavior.
Since the destructor of A is not trivial and has side effects, (i think) its undefined behavior. For build in types, this does not apply (hence you can use a char buffer as object buffer without reconstructing the chars back into the buffer after using it) since they have a trivial (no-op) destructor.
In the comments and answers to this question:
Virtual function compiler optimization c++
it is argued that a virtual function call in a loop cannot be devirtualized, because the virtual function might replace this by another object using placement new, e.g.:
void A::foo() { // virtual
static_assert(sizeof(A) == sizeof(Derived));
new(this) Derived;
}
The example is from a LLVM blog article about devirtualization
Now my question is: is that allowed by the standard?
I could find this on cppreference about storage reuse: (emphasis mine)
A program is not required to call the destructor of an object to end its lifetime if the object is trivially-destructible or if the program does not rely on the side effects of the destructor. However, if a program ends the lifetime of an non-trivial object, it must ensure that a new object of the same type is constructed in-place (e.g. via placement new) before the destructor may be called implicitly
If the new object must have the same type, it must have the same virtual functions. So it is not possible to have a different virtual function, and thus, devirtualization is acceptable.
Or do I misunderstand something?
The quote you provided says:
If a program ends the lifetime of an non-trivial object, it must ensure that a new object of the same type is constructed in-place (e.g. via placement new) before the destructor may be called implicitly
The intent of this statement relates to something a bit different to what you are doing. The statement is meant to say that when you destroy an object without destroying its name, something still refers to that storage with the original type, o you need to construct a new object there so that when the implicit destruction occurs, there is a valid object to destroy. This is relevant for example if you have an automatic ("stack") variable, and you call its destructor--you need to construct a new instance there before the destructor is called when the variable goes out of scope.
The statement as a whole, and its "of the same type" clause in particular, has no bearing on the topic you're discussing, which is whether you are allowed to construct a different polymorphic type having the same storage requirements in place of an old one. I don't know of any reason why you shouldn't be allowed to do that.
Now, that being said, the question you linked to is doing something different: it is calling a function using implicit this in a loop, and the question is whether the compiler could assume that the vptr for this will not change in that loop. I believe the compiler could (and clang -fstrict-vtable-pointers does) assume this, because this is only valid if the type is the same after the placement new.
So while the quotes from the standard you have provided are not relevant to this issue, the end result is that it does seem possible for an optimizer to devirtualize function calls made in a loop under the assumption that the type of *this (or its vptr) cannot change. The type of an object stored at an address (and its vptr) can change, but if it does, the old this is no longer valid.
It appears that you intend to use the new object using handles (pointers, references, or the original variable name) that existed prior to its recreation. That's allowed only if the instance type is not changed, plus some other conditions excluding const objects and sub-objects:
From [basic.life]:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied,
and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
the original object was a most derived object of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
Your quote from the Standard is merely a consequence of this one.
Your proposed "devirtualization counter-example" does not meet these requirements, therefore all attempts to access the object after it is replaced will cause undefined behavior.
The blog post even pointed this out, in the very next sentence after the example code you looked at.
Names have scope (a compile-time property), while objects have lifetimes (a runtime property). Right?
I often see people talking about temporary objects "going out of scope". But since a temporary object does not have a name, I think it does not make sense to talk about "scope" in this context. The lifetime of a temporary object is very clearly defined and has nothing to do with scope. Would you agree?
Names have scope (a compile-time property),
Yes. I would not call it a property thought. But basically yes.
while objects have lifetimes (a runtime property). Right?
There are three types of variables. Each type has different properties in relation to lifetimes.
Automatic storage duration:
Static storage duration
Dynamic storage duration
Note: automatic storage duration objects have a lifetime that is bound to the scope of the variable.
I often see people talking about temporary objects "going out of scope".
Unless bound to a variable a temporary is destroyed at the end of an expression. If they are bound to a variable (a const reference) then they have the same lifespan as the variable. Sometimes it is just easier to refer to this as the scope, but technically you are correct.
But since a temporary object does not have a name, I think it does not make sense to talk about "scope" in this context.
Technically yes. But I think it just makes talking about it easier. To me (though not technically correct) the scope of a temporary (not bound) is the expression. Its easier to say than the lifespan of the temporary variable.
The lifetime of a temporary object is very clearly defined and has nothing to do with scope. Would you agree?
Yes. But it still feels more natural to talk about scope (even if it is not technically correct). As most people understand what you are trying to imply. But when you get down and talk about the very technical stuff you should use the correct terminology and scope in this context is not correct.
The lifetime of temporaries has very little to do with syntactical blocks, but "scope" — as a word rather than a technical term — can be used in other ways. The important question is whether you are confused when people use "scope" to refer to temporaries. (It doesn't appear that you are, from my POV.)
Since you're talking about using the term to communicate with others, that communication is what's really important. If you were defining terms by writing a standardese document or trying to interpret such a document in the context of defined terms, the situation would be different. Interpreting ISO 14882 will, of course, involve communicating with others, so you would just have to ask for clarification if necessary, in that case.
It's counter-productive to make all non-standardese communication be standardese, and it's often better to use code in either case when high precision is required. The C++ standard extensively uses examples for this reason.
For another example, "calling a constructor" is often used, yet technically you can't call a ctor directly; instead, ctors are part of object initialization. This is why there's an explicit form of new solely to construct an object. (Interestingly, you can call a destructor directly.) However, I would expect that phrase to be understood in most contexts, though I wouldn't advocate using it in standardese contexts.
I've seen people say that "an object went out of scope" when it meant (in your parlance) "the lifetime of the object ended when the object's name went out of scope". If you use that short form, it's natural to say that temporay objects go out of scope, too.
Temporary objects do have names, albeit referable by the compiler only. Otherwise how would the compiler refer to them? Just because you can't refer to a temporary once it's instantiated doesn't mean the compiler can't refer to it.
f(Foo(), Bar());
The compiler has to refer to at least one of the temporaries even though you as a programmer can't refer to either of them. Temporary objects do have a scope.
Binding to a const reference extends the lifetime of a temporary to the lifetime of the reference, so in a sense, it does have something to do with scope in this particular case :
std::string foo();
int main()
{
// Lifetime of the temporary returned by foo is indeed the scope of bar
const std::string &bar = foo();
}
See this article from Herb Sutter :
Normally, a temporary object lasts
only until the end of the full
expression in which it appears.
However, C++ deliberately specifies
that binding a temporary object to a
reference to const on the stack
lengthens the lifetime of the
temporary to the lifetime of the
reference itself, and thus avoids what
would otherwise be a common
dangling-reference error.