Herb Sutter mentions in one of his http://www.gotw.ca articles that an object is constructed (has a valid existence) only if its constructor runs to completion, i.e., to put it crudely, control passes beyond its final brace.
Now consider the following code:
#include <iostream>

class A
{
public:
    A()
    {
        f(); // called while the constructor body is still running
    }
    void f()
    {
        std::cout << "hello, world";
    }
};

int main()
{
    A a;
}
Going by what Herb says, can't we say that, since A is not completely constructed inside its constructor, calling f() inside the constructor is invalid because the "this" pointer is not ready yet?
Still, there is indeed a valid "this" inside the constructor, and f() does get called.
I don't think Herb is saying something incorrect, so I guess I am interpreting it incorrectly. Can someone explain to me what exactly he means?
Here is the link to the article : http://www.gotw.ca/gotw/066.htm
It talks about exceptions from constructors. Specifically here is the extract from it on which my question is based:
-When does an object's lifetime begin?
When its constructor completes successfully and returns normally. That is, control reaches the end of the constructor body or an earlier return statement.
-When does an object's lifetime end?
When its destructor begins. That is, control reaches the beginning of the destructor body.
Important point here is that the state of the object before its lifetime begins is exactly the same as after its lifetime ends -- there is no object, period. This observation brings us to the key question:
We might summarize the C++ constructor model as follows:
Either:
(a) The constructor returns normally by reaching its end or a return statement, and the object exists.
Or:
(b) The constructor exits by emitting an exception, and the object not only does not now exist, but never existed.
Now from what Herb says, can't we say that since A is not completely constructed inside its constructor Calling f() inside the constructor is invalid as the "this" ptr is not ready yet.
That concern applies only when f() is a virtual function of class A or of a class in its inheritance hierarchy, and you expect the call to be resolved at run time to the override of the most derived object. In simple words, a virtual call made inside a constructor is not dispatched to overrides in derived classes; it resolves to the version belonging to the class whose constructor is currently running.
If f() is not a virtual function, there is no harm in calling it from a constructor, provided you know exactly what f() does (for example, that it does not read members that are not initialized yet). Programmers often call member functions such as initialize() from constructors.
Can you give me the link to the Herb Sutter's article?
By the time program flow enters your constructor, the object's memory has been allocated and the this pointer is indeed valid.
What Herb means is that the object's state may not have been entirely initialized. In particular, if you are constructing a class derived from A, then that class's constructor body will not have run yet while you are still inside A's constructor.
This is important if you have virtual member functions, since an override in the derived class will not be run if the virtual function is called from within A's constructor.
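To make the point about virtual functions concrete, here is a minimal sketch. The names Base, Derived, name() and which_override_runs() are made up for illustration and do not come from Herb's article. The base constructor records which version of a virtual function it reaches, and the same function is then called again on the finished object.

```cpp
#include <cassert>
#include <string>

// Hypothetical classes, for illustration only.
struct Base {
    std::string seen;                 // which override the constructor reached
    Base() { seen = name(); }         // virtual call while only the Base part exists
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Returns what Base's constructor saw, then what the finished object reports.
std::string which_override_runs() {
    Derived d;
    assert(d.seen == "Base");         // the Derived override was not used during construction
    return d.seen + "/" + d.name();
}
```

The this pointer is perfectly usable inside Base(); the call simply resolves to Base::name() because the Derived part has not been constructed yet.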
Note: it would have been easier with the exact article, so that we could have some context
Lifetime considerations are actually pretty complicated.
Considering the constructor of an object, there are two different points of view:
external: i.e., that of the user of an object
internal: i.e., yours when writing constructors and destructors (notably)
From the external point of view, the lifetime of an object:
begins once the constructor successfully completed
ends when the destructor begins to run
It means that if you attempt to access an object mid-construction or mid-destruction Bad Things Happen (tm). This is mostly relevant to multi-threaded programs, but may happen if you pass pointers to your object to base classes... which leads to...
...the internal point of view. It's more complicated. One thing you are sure of is that the required memory has been allocated, however parts of the objects may not be fully initialized yet (after all, you are constructing it).
in the body of the constructor, you can use the attributes and bases of the class (they are initialized), and call functions normally (virtual calls should be avoided).
if it's a base class, the derived object is not initialized yet (thus the restriction on virtual calls)
The implication from the lifetime not having started yet is mainly that, should the constructor throw an exception, the destructor will not be run.
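A small sketch of that implication, using made-up types Member and Widget and a global trace string that exists only for this illustration: the constructor throws, so the object's own destructor never runs, while the member that had already been fully constructed is still destroyed during unwinding.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

std::string trace;  // hypothetical global log, for illustration only

struct Member {
    Member()  { trace += "Member() "; }
    ~Member() { trace += "~Member() "; }
};

struct Widget {
    Member m;  // fully constructed before Widget's constructor body runs
    Widget() {
        trace += "Widget() ";
        throw std::runtime_error("construction failed");
    }
    ~Widget() { trace += "~Widget() "; }  // not run when construction fails
};

std::string failed_construction_trace() {
    trace.clear();
    try {
        Widget w;
    } catch (const std::runtime_error&) {
        trace += "caught";
    }
    // The Widget never existed, so its destructor must not appear in the log.
    assert(trace.find("~Widget()") == std::string::npos);
    return trace;
}
```

This is why resources acquired in a constructor body are usually held by members with their own destructors: those are cleaned up even though the enclosing destructor is skipped.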
Beware of member variables that are not yet initialized. Beware of virtual functions: the function that gets called might not be the one you expect if the function is virtual and a derived object is being created. Other than that, I do not see any problem with calling methods from the constructor. In particular, the memory for the object has already been allocated.
In an std::vector<T> the vector owns the allocated storage and it constructs Ts and destructs Ts. Regardless of T's class hierarchy, std::vector<T> knows that it has only created a T and thus when .pop_back() is called it only has to destroy a T (not some derived class of T). Take the following code:
#include <vector>
struct Bar {
virtual ~Bar() noexcept = default;
};
struct FooOpen : Bar {
int a;
};
struct FooFinal final : Bar {
int a;
};
void popEm(std::vector<FooOpen>& v) {
v.pop_back();
}
void popEm(std::vector<FooFinal>& v) {
v.pop_back();
}
https://godbolt.org/z/G5ceGe6rq
The popEm overload for FooFinal simply reduces the vector's size by one element. This makes sense. But the popEm overload for FooOpen calls the virtual destructor that the class inherited from Bar. Given that FooOpen is not final, a plain delete on a FooOpen* pointer would have to call the destructor virtually, but std::vector knows that it only ever constructed FooOpen objects and never a class derived from it. Therefore, couldn't std::vector<FooOpen> treat the class as final and omit the virtual destructor call in pop_back()?
Long story short: the compiler doesn't have enough context information to deduce it. https://godbolt.org/z/roq7sYdvT
Boring part:
The results are similar for all 3: msvc, clang, and gcc, so I guess the problem is general.
I analysed the libstdc++ code just to find pop_back() runs like this:
void pop_back() // a bit more convoluted but boils-down to this
{
--back;
back->~T();
}
No surprise here; it's like in the C++ textbooks. But it shows the problem: a virtual call to a destructor through a pointer.
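To see why that call has to be treated as virtual in general, here is a sketch with made-up types Shape and Circle and a hypothetical helper destroy_like_pop_back(). It is not the libstdc++ code. Outside of a vector, a T* may legitimately point at a derived object, and then the explicit destructor call must reach the derived destructor.

```cpp
#include <cassert>
#include <new>
#include <string>

std::string destroyed;  // hypothetical log, for illustration only

struct Shape {
    virtual ~Shape() { destroyed += "~Shape "; }
};

struct Circle : Shape {
    ~Circle() override { destroyed += "~Circle "; }
};

// Mimics the last step of pop_back(): destroy through a pointer to the element type.
template <class T>
void destroy_like_pop_back(T* back) {
    back->~T();  // dispatched virtually when ~T is virtual
}

std::string destroy_through_base_pointer() {
    destroyed.clear();
    alignas(Circle) unsigned char storage[sizeof(Circle)];
    Shape* p = new (storage) Circle;  // static type Shape, dynamic type Circle
    destroy_like_pop_back(p);         // T is deduced as Shape
    assert(destroyed.rfind("~Circle", 0) == 0);  // the derived destructor ran first
    return destroyed;
}
```

Inside std::vector<T> this situation cannot arise, but the compiler only sees the pointer, so it has to prove the dynamic type before it can skip the dispatch.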
What we're looking for is the 'devirtualisation' technique described here: Is final used for optimisation in C++ - it states devirtualisation is 'as-if' behaviour, so it looks like it is open for optimisation if the compiler has enough information to do it.
My opinion:
I experimented with the code a little, and I think the optimisation doesn't happen because the compiler cannot deduce that the only objects pointed to by "back" are FooOpen instances. We humans know it because we analyse the entire class and see the overall concept of storing the elements in a vector. We know the pointer must point to a FooOpen instance only, but the compiler fails to see it: it only sees a pointer that can point anywhere (the vector allocates an uninitialized chunk of memory whose interpretation is part of the vector's logic, and the pointer is also modified outside the scope of pop_back()). Without knowing the entire concept of vector<>, I don't see how it could be deduced (short of analysing the entire class) that the pointer won't point to some descendant of FooOpen, which could be defined in another translation unit.
FooFinal doesn't have this problem because it already guarantees no other class can inherit from it so devirtualisation is safe for objects pointed by FooFinal* or FooFinal&.
Update
I made several findings which may be useful:
https://godbolt.org/z/3a1bvax4o devirtualisation can occur for non-final classes as long as there is no pointer arithmetic involved.
https://godbolt.org/z/xTdshfK7v std::array performs devirtualisation on non-final classes. std::vector fails to do it even if it is constructed and destroyed in the same scope.
https://godbolt.org/z/GvoaKc9Kz devirtualisation can be enabled using wrapper.
https://godbolt.org/z/bTosvG658 destructor devirtualisation can be enabled with allocator. Bit hacky, but is transparent to the user. Briefly tested.
Yes, this is a missed optimisation.
Remember that a compiler is a software project, where features have to be written to exist. It may be that the relative overhead of virtual destruction in cases like this is low enough that adding this in hasn't been a priority for the gcc team so far.
It is an open-source project, so you could submit a patch that adds this in.
It feels a lot like § 11.4.7 (14) gives some insight into this. As of latest working draft (N4910 Post-Winter 2022 C++ working draft, Mar. 2022):
After executing the body of the destructor and destroying any objects with automatic storage duration
allocated within the body, a destructor for class X calls the destructors for X’s direct non-variant non-static data
members, the destructors for X’s non-virtual direct base classes and, if X is the most derived class (11.9.3), its
destructor calls the destructors for X’s virtual base classes. All destructors are called as if they were referenced
with a qualified name, that is, ignoring any possible virtual overriding destructors in more derived classes.
Bases and members are destroyed in the reverse order of the completion of their constructor (see 11.9.3).
[Note 4 : A return statement (8.7.4) in a destructor might not directly return to the caller; before transferring control
to the caller, the destructors for the members and bases are called. — end note]
Destructors for elements of an array are called in reverse order of their construction (see 11.9).
Also interesting for this topic, § 11.4.6, (17):
In an explicit destructor call, the destructor is specified by a ~ followed by a type-name or decltype-specifier
that denotes the destructor’s class type. The invocation of a destructor is subject to the usual rules for
member functions (11.4.2); that is, if the object is not of the destructor’s class type and not of a class derived
from the destructor’s class type (including when the destructor is invoked via a null pointer value), the program has undefined behavior.
So, as far as the standard cares, the invocation of a destructor is subject to the usual rules for member functions.
This, to me, sounds a lot like destructor calls do so much that compilers are likely unable to determine, at compile-time, that a destructor call does "nothing" - as it also calls destructors of members, and std::vector doesn't know this.
I am learning C++ using the books listed here. Now I came across the following statement from C++ Primer:
When we allocate a block of memory, we often plan to construct objects in that
memory as needed. In this case, we’d like to decouple memory allocation from object
construction.
Combining initialization with allocation is usually what we want when we
allocate a single object. In that case, we almost certainly know the value the object
should have.
(emphasis mine)
The important thing to note here is that C++ Primer seems to suggest that construction is the same as initialization, and that both are different from allocation, which makes sense to me.
Note that I've just quoted selected parts from the complete paragraph to keep the discussion concise and get my point across. You can read the complete para if you want here.
Now, I came across the following statement from [class.cdtor]:
For an object with a non-trivial constructor, referring to any non-static member or base class of the object before the constructor begins execution results in undefined behavior. For an object with a non-trivial destructor, referring to any non-static member or base class of the object after the destructor finishes execution results in undefined behavior.
(emphasis mine)
Now, does the standard specify exactly when (at what point) the constructor execution begins?
To give you some more context consider the example:
class A {
public:
A(int)
{
}
};
class B : public A {
int j;
public:
int f()
{
return 4;
}
//------v-----------------> #2
B() : A(f()),
//------------^-----------> #3
j(f())
//------------^-----------> #4
{ //<---------------#5
}
};
int main()
{
B b; // #1
return 0;
}
My questions are:
At what point does the derived class' constructor B::B() start executing according to the standard? Note that I know that A(f()) is undefined behavior. Does B::B() start executing at point #1, #2, #3, #4 or #5? In other words, does the standard specify exactly when (at what point) the constructor execution begins?
Are construction and initialization the same in this example? I understand that in the member initializer list, where we have j(f()), we're initializing data member j, but does this initialization also imply that the constructor B::B() has begun executing at point #4?
I read in a recent SO post that execution of derived ctor begins at point #4, so that post also seems to suggest that initialization and construction are the same.
I read many posts before asking this question but I wasn't able to come up with an answer that is right according to the C++ standard.
I also read this which seems to suggest that allocation, initialization and construction are all different:
Allocation
This is the step where memory is allocated for the object.
Initialization
This is the step where the language related object properties are "set". The vTable and any other "language implementation" related operations are done.
Construction
Now that an object is allocated and initialized, the constructor is being executed. Whether the default constructor is used or not depends on how the object was created.
As you can see above, the user claims that all of the mentioned terms are different, as opposed to what is suggested by C++ Primer. So which claim is correct according to the standard: C++ Primer (which says that construction and initialization are the same) or the answer quoted above (which says that construction and initialization are different)?
Initialization and construction are somewhat similar, but not the same.
For objects of class types, initialization is done by calling a constructor (there are exceptions, e.g. aggregate initialization doesn't use a constructor, and value-initialization can zero the members before calling a constructor).
Non-class types don't have constructors, so they are initialized without them.
Your C++ Primer quote uses "initialization" and "construction" to refer to the same thing. It would be more correct to call it "initialization" to not limit yourself to class types, and to include other parts of initialization other than a constructor call.
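As a sketch of the decoupling the C++ Primer quote talks about, the following uses std::allocator to obtain storage first and placement new to construct (initialize) an object in it afterwards. The function name allocate_then_construct() and the alias String are made up for this example.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>

std::string allocate_then_construct() {
    using String = std::string;
    std::allocator<String> alloc;

    String* p = alloc.allocate(1);                  // allocation: raw storage, no String exists yet
    ::new (static_cast<void*>(p)) String("hello");  // construction: the String is initialized here

    String copy = *p;

    p->~String();                                   // destruction: the String is gone, the storage remains
    alloc.deallocate(p, 1);                         // deallocation: the storage is returned

    assert(copy == "hello");
    return copy;
}
```

In this sketch "construction" and "initialization" name the same step (the placement new), while allocation and deallocation are clearly separate calls.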
does the standard specify exactly when (at what point) the constructor execution begins?
Yes. B b; is default-initialization, which, for classes, calls the default constructor and does nothing else.
So the default-constructor B() is the first thing that's executed here.
As described here, B() calls A's constructor after evaluating its argument f(), then initializes j after evaluating its initializer f(). Finally, it executes its own body, which is empty in your case.
Since the body is executed last, it's a common misunderstanding to think that B() itself is executed after A() and/or after initializing j.
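The sequence described above can be traced with a small sketch. To stay clear of the undefined behavior in the question's A(f()), it uses a free function note() instead of a member function; note(), the steps log and construction_steps() are made up for this illustration.

```cpp
#include <cassert>
#include <string>

std::string steps;  // hypothetical log, for illustration only

// A free function, so calling it before the base class is initialized is well defined
// (unlike calling the member function f() in the question).
int note(const char* what) {
    steps += what;
    steps += ' ';
    return 4;
}

struct A {
    explicit A(int) { note("A-body"); }
};

struct B : A {
    int j;
    B() : A(note("A-arg")), j(note("j-init")) { note("B-body"); }
};

std::string construction_steps() {
    steps.clear();
    B b;
    assert(b.j == 4);
    return steps;
}
```

Everything in the log happens as part of B's constructor; B's body is merely the last entry.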
I read in a recent SO post that execution of derived ctor begins at point #4
You should also read my comments to that post, challenging this statement.
There is no distinction between point 1 and 2 in your example. It's the same point. An object's constructor gets invoked when the object gets instantiated, when it comes into existence. Whether it's point 1 or point 2, that's immaterial.
What you are separating out here is the allocation of the underlying memory for the object (point 1) and when the new object's constructor begins executing (point 2).
But that's a distinction without a difference. These are not two discrete events in C++ that somehow can be carried out as discrete, separate steps. They are indivisible, they are one and the same. You cannot allocate memory for a C++ object but somehow avoid constructing it. Similarly you cannot construct some object without allocating memory for it first. It's the same thing.
Now, you do have other distractions that can happen here, like employing the services of the placement-new operator (and manually invoking the object's destructor at some point later down the road). That seems to suggest that allocation is something that's separate, but it's really not. The invocation of the placement new operator is, effectively, implicitly allocating the memory for the new object from the pointer you hand over to the placement new operator, then this is immediately followed by the object's construction. That's the way to think about it. So allocation+construction is still an indivisible step.
Another important point to understand is that the first thing that every object's constructor does is call the constructors of all objects that it's derived from. So, from a practical aspect, the very first thing that actually happens in the shown code is that A's constructor gets called first, to construct it, then, once its construction is finished, then B's "real" construction takes place.
B() : A(f()),
That's exactly what this reads. B's constructor starts executing, and its first order of business is to call A's constructor. B does nothing else, until A's constructor finishes. So, technically, B's constructor starts executing first, then A's constructor. But, B's constructor does nothing until A's constructor handles its business.
The first thing that B's constructor does is call A's constructor. This is specified in the C++ standard.
B() : j(f()), A(f())
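If you try to do this, your compiler will typically warn you (GCC and Clang have -Wreorder for this), and the base class A is still initialized first regardless of the order written in the list.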
So, to summarize: when you instantiate an object, any object, the very first thing that happens is its constructor gets called. That's it. End of story. Every possible variation here (placement new, PODs with no constructors, the additional song-and-dance routine with virtual dispatch) is all downstream of this, and can be defined as special, and specific, implementations of constructors.
In C++, when an object's destructor is called, the child class's destructor runs first and then the parent's, which is the opposite of the construction order.
But why? It seems to be a simple question, but I haven't found a satisfying answer on the internet. Could someone explain the necessity of destroying in this sequence?
Since this was tagged [language-lawyer], this is the rule that says what a destructor does:
[class.dtor]
After executing the body of the destructor and destroying any objects with automatic storage duration allocated within the body, a destructor for class X calls the destructors for X's direct non-variant non-static data members, the destructors for X's non-virtual direct base classes and, if X is the most derived class ([class.base.init]), its destructor calls the destructors for X's virtual base classes.
All destructors are called as if they were referenced with a qualified name, that is, ignoring any possible virtual overriding destructors in more derived classes.
Bases and members are destroyed in the reverse order of the completion of their constructor (see [class.base.init]).
Why?... It just makes sense. Things that are constructed in one order are generally destroyed in the opposite order. This convention applies to sub objects of classes, as well as elements of arrays, local variables in a function as well as objects with static storage duration destroyed at the end of the program. I cannot think of anything where this isn't the case.
Although dependencies between objects within their lifetime can be two-directional, during the construction and destruction the dependency can only be in one direction. Object that is created before cannot depend - within its constructor - on an object that is created later (because that object doesn't exist yet). Same applies to destruction but in reverse: An object that is destroyed later cannot depend - within its destructor - on an object destroyed before (because that object doesn't exist anymore).
If you can depend on something within your construction, then you will typically expect to be able to depend on that something within the destruction. In other words, you would expect the direction of dependency to remain in the same direction. The inverse destruction order satisfies this expectation.
To return to the c/d-tors, constructor body is the last thing to execute in the constructor, so the destructor body should be the first thing to execute in the destructor. And it will be useful in some cases to be able to use the base sub object within destructor body, which wouldn't be possible if it had already been destroyed.
Why does the base object need to exist in a valid state during destruction?
To understand this, it should help to think about why the destructor body exists in the first place. Typically, it is to change the internal state of the class in preparation for ending the lifetime. A typically familiar action is deletion of an owning pointer stored as a member (this is just an example: use smart pointers instead). If the object no longer exists in a valid state, then it would be too late to touch that now-invalid state.
But any base class function could be a virtual function, which could also access a child member during destruction.
Virtual function calls resolve to the current class during destruction (just like during construction).
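A minimal sketch of that rule, with made-up types Base and Derived and a virtual function who(): each destructor calls the same virtual function, and each call lands in the class whose destructor is currently running.

```cpp
#include <cassert>
#include <string>

std::string calls;  // hypothetical log, for illustration only

struct Base {
    virtual ~Base() { calls += who(); }  // resolves to Base::who() here
    virtual std::string who() const { return "Base"; }
};

struct Derived : Base {
    ~Derived() override { calls += who(); calls += ' '; }  // resolves to Derived::who()
    std::string who() const override { return "Derived"; }
};

std::string who_answers_during_destruction() {
    calls.clear();
    Base* p = new Derived;
    assert(p->who() == "Derived");  // normal virtual dispatch while the object is alive
    delete p;                       // runs ~Derived(), then ~Base()
    return calls;
}
```

So by the time the base destructor runs, a virtual call can no longer reach code that touches the already destroyed derived members.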
Assuming you're talking about parent/child meaning inheritance consider that
struct Car : Vehicle { ... };
is not really much different than
struct Car { Vehicle _base; ... };
except that when you refer to a Vehicle property in a Car method, the compiler implicitly adds _base. for you.
The standard even calls this the "base class sub-object"... it's an object that has no name and that you use implicitly, but it's still there.
Now in C++ members of a class are constructed before the class itself, and destroyed after. The same happens to the implicit "base class sub-object". So when building a car you need first to build the vehicle... and after destroying the car all individual members are destroyed, including the vehicle.
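The order can be observed with a small sketch. Engine is a made-up member type added for illustration, and every constructor and destructor appends to a log.

```cpp
#include <cassert>
#include <string>

std::string order;  // hypothetical log, for illustration only

struct Engine {
    Engine()  { order += "Engine "; }
    ~Engine() { order += "~Engine "; }
};

struct Vehicle {
    Vehicle()  { order += "Vehicle "; }
    ~Vehicle() { order += "~Vehicle "; }
};

struct Car : Vehicle {
    Engine engine;  // an ordinary member next to the base class sub-object
    Car()  { order += "Car "; }
    ~Car() { order += "~Car "; }
};

std::string lifetime_order() {
    order.clear();
    {
        Car c;
        assert(order == "Vehicle Engine Car ");  // base, then members, then the body
    }  // c is destroyed here
    return order;
}
```

The second half of the log is the mirror image of the first: the Car body runs first, then the member, then the base.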
Another question cites the C++ standard:
3.8/1 "The lifetime of an object of type T ends when: — if T is a class type with a non-trivial destructor (12.4), the destructor call
starts, or — the storage which the object occupies is reused or
released."
It would seem that this means accessing members of an object from a destructor is not allowed. However this seems to be wrong, and the truth is something more like what's explained in Kerrek SB's answer:
Member objects come alive before a constructor body runs, and they
stay alive until after the destructor finishes. Therefore, you can
refer to member objects in the constructor and the destructor.
The object itself doesn't come alive until after its own constructor
finishes, and it dies as soon as its destructor starts execution. But
that's only as far as the outside world is concerned. Constructors and
destructors may still refer to member objects.
I'm wondering if in the destructor I can pass the object's address to an outside class, like:
struct Person;
struct Organizer
{
static void removeFromGuestList(const Person& person); // This then accesses Person members
};
struct Person
{
~Person() {
// I'm about to die, I won't make it to the party
Organizer::removeFromGuestList(*this);
}
};
This seems OK to me as I think an object's lifetime lasts until after the destructor finishes, however this part of the above answer has me doubting:
The object itself doesn't come alive until after its own constructor
finishes, and it dies as soon as its destructor starts execution. But
that's only as far as the outside world is concerned. Constructors and
destructors may still refer to member objects.
The C++ Standard does seem to be just a little bit self-contradictory regarding the exact status of class members during execution of the destructor.
However, the following excerpt from this Draft C++ Standard may give some reassurance that your call to the removeFromGuestList function should be safe (bold italics formatting added by me):
15.7 Construction and destruction
1 For an object with a non-trivial constructor, referring to any non-static member or base
class of the object before the constructor begins execution results in
undefined behavior. For an object with a non-trivial destructor,
referring to any non-static member or base class of the object after
the destructor finishes execution results in undefined behavior.
What remains unclear (at least, to me) is whether or not referring to those class members via a reference to the object being destroyed is valid once the destructor has started execution. That is, assuming your Person class has a member, ObjectType a, is referring to person.a in your removeFromGuestList function valid?
On the other hand, rather than passing *this as its argument, if you were to pass each required member as a 'distinct object', then you would be safe; so, redefining that function as, say, removeFromGuestList(const ObjectType& a) (with possible additional arguments) would be completely safe.
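For experimenting, here is a compilable sketch of the question's code. The guests list, the name member and guests_after_one_leaves() are assumptions added so that the example does something observable. My reading of the quoted paragraph is that reading person.name is fine because the destructor has not finished yet, but treat this as a sketch rather than a ruling on the standard.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Person;

struct Organizer {
    static std::vector<std::string> guests;  // hypothetical guest list, keyed by name
    static void removeFromGuestList(const Person& person);
};
std::vector<std::string> Organizer::guests;

struct Person {
    std::string name;
    explicit Person(std::string n) : name(std::move(n)) {
        Organizer::guests.push_back(name);
    }
    ~Person() {
        // The member name is destroyed only after this body has finished.
        Organizer::removeFromGuestList(*this);
    }
};

void Organizer::removeFromGuestList(const Person& person) {
    guests.erase(std::remove(guests.begin(), guests.end(), person.name), guests.end());
}

std::size_t guests_after_one_leaves() {
    Organizer::guests.clear();
    Person alice("Alice");
    {
        Person bob("Bob");
        assert(Organizer::guests.size() == 2);
    }  // bob's destructor removes "Bob"
    return Organizer::guests.size();
}
```

What would clearly be a problem is the organizer keeping the reference and using it after ~Person() has returned.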
my code :
Scene::Scene(const std::string &scene_file) : ambient_light(0, 0, 0), background(0, 0, 0){
scene_parser parser(*this);
parser.parse(scene_file);
}
scene_parser is a friend of Scene, and in the parse method it accesses(r/w) the members of Scene. Is this going to cause any problems?
Yes, it's ok to give out a reference to this. However, you usually want to do that when the other object will use the pointer later. Your use case looks like it will use the Scene immediately, before the constructor completes, which is a very slippery slope.
Right now, you're not establishing any invariants after calling parse, so it should be ok, but it's also fragile and easy for future changes to introduce breakage.
In your particular example, no problems should arise.
Generally the problem with giving out this references is that the lifetimes of the two objects don't exactly align and the other object could try to access the referred-to object after it has already been destroyed.
In your example, the scene_parser object is on the stack, so its lifetime ends at the end of the Scene constructor. There is no possible way it could attempt to access a non-existent object via the this reference you provided, so no problems can arise.
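A reduced sketch of that situation follows. Color, SceneParser, the toy parsing rule and background_red() are stand-ins invented for illustration, and the members are public here instead of using a friend declaration. The parser lives entirely inside the constructor body, after the members have been initialized.

```cpp
#include <cassert>
#include <string>

struct Color { int r, g, b; };

class Scene;

// Hypothetical stand-in for scene_parser.
class SceneParser {
public:
    explicit SceneParser(Scene& scene) : scene_(scene) {}
    void parse(const std::string& text);
private:
    Scene& scene_;
};

class Scene {
public:
    explicit Scene(const std::string& text)
        : ambient_light{0, 0, 0}, background{0, 0, 0} {
        assert(background.r == 0);  // members are already initialized at this point
        SceneParser parser(*this);
        parser.parse(text);
    }  // parser is destroyed here, before the constructor returns

    Color ambient_light;
    Color background;
};

void SceneParser::parse(const std::string& text) {
    // Toy rule: any non-empty description yields a white background.
    if (!text.empty()) scene_.background = Color{255, 255, 255};
}

int background_red(const std::string& text) {
    return Scene(text).background.r;
}
```

The fragile part is any future invariant that the Scene constructor establishes after parse() returns, since the parser would be working on a Scene that does not satisfy it yet.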
It depends.
Inside the constructor body (i.e. once the initializer list is executed), the object is considered "fully constructed" up to the current type. Therefore, you can reference *this, but any virtual function calls will not use overridden functions in derived classes.
All of your subobjects (members and bases) are constructed by the first statement in the body of the constructor. If your object is in a "valid state" (which is part of the definition of your class, sometimes called the "class invariant") at this point, you can treat it as a fully constructed object and do anything with it. However, virtual lookup does work slightly differently than you may expect or require: if this is a base class (and thus this object is a subobject of something else), the final type hasn't been "assigned" yet. For example, this is one way to call pure-virtual methods and get a runtime error (if those methods don't have definitions, anyway).
A more interesting situation is using this in the constructor initializer; that does have some caveats, but that is also before the constructor body.