Construction vs Initialisation Formal difference

Construction vs Initialisation Formal difference - c++

I am learning C++ using the books listed here. Now I came across the following statement from C++ Primer:
When we allocate a block of memory, we often plan to construct objects in that
memory as needed. In this case, we’d like to decouple memory allocation from object
construction.
Combining initialization with allocation is usually what we want when we
allocate a single object. In that case, we almost certainly know the value the object
should have.
(emphasis mine)
The important thing to note here is that C++ primer seems to suggest that construction is the same as initialization and that they are different from allocation which makes sense to me.
Note that I've just quoted selected parts from the complete paragraph to keep the discussion concise and get my point across. You can read the complete para if you want here.
Now, I came across the following statement from class.dtor:
For an object with a non-trivial constructor, referring to any non-static member or base class of the object before the constructor begins execution results in undefined behavior. For an object with a non-trivial destructor, referring to any non-static member or base class of the object after the destructor finishes execution results in undefined behavior.
(emphasis mine)
Now does the standard specifies exactly when(at what point) the constructor execution begins?
To give you some more context consider the example:
class A {
public:
A(int)
{
}
};
class B : public A {
int j;
public:
int f()
{
return 4;
}
//------v-----------------> #2
B() : A(f()),
//------------^-----------> #3
j(f())
//------------^-----------> #4
{ //<---------------#5
}
};
int main()
{
B b; #1
return 0;
}
My questions are:
At what point does the derived class' constructor B::B() start executing according to the standard? Note that I know that A(f()) is undefined behavior. Does B::B() starts executing at point #1, #2, #3, #4 or #5. In other words, does the standard specifies exactly when(at what point) the constructor execution begins?
Is construction and initialization the same in this given example. I mean I understand that in the member initializer list where we have j(f()), we're initializing data member j but does this initialization also implies that the construction B::B() has begun executing at point #4?
I read in a recent SO post that execution of derived ctor begins at point #4 and so that post also seem to suggest that Initialisation and construction is the same.
I read many posts before asking this question but I wasn't able to come up with an answer that is right according to the C++ standard.
I also read this which seems to suggest that allocation, initialization and construction are all different:
Allocation
This is the step where memory is allocated for the object.
Initialization
This is the step where the language related object properties are "set". The vTable and any other "language implementation" related operations are done.
Construction
Now that an object is allocated and initialized, the constructor is being executed. Whether the default constructor is used or not depends on how the object was created.
As you can see above, the user claims that all of the mentioned terms are different as opposed to what is suggested by C++ primer. So which claim is correct here according to the standard, C++ Primer(which says that construction and Initialisation is same) or the above quoted quoted answer(what says that construction and Initialisation are different).

Initialization and construction are somewhat similar, but not the same.
For objects of class types, initialization is done by calling a constructor (there are exceptions, e.g. aggregate initialization doesn't use a constructor; and value-initialization zeros the members in before calling a constructor).
Non-class types don't have constructors, so they are initialized without them.
Your C++ Primer quote uses "initialization" and "construction" to refer to the same thing. It would be more correct to call it "initialization" to not limit yourself to class types, and to include other parts of initialization other than a constructor call.
does the standard specifies exactly when(at what point) the constructor execution begins?
Yes. B b; is default-initialization, which, for classes, calls the default constructor and does nothing else.
So the default-constructor B() is the first thing that's executed here.
As described here, B() calls A() after evaluating its argument f(), then initializes j after evaluating its initializer f(). Finally, it executes its own body, which is empty in your case.
Since the body is executed last, it's a common misunderstanding to think that B() itself is executed after A() and/or after initializing j.
I read in a recent SO post that execution of derived ctor begins at point #4
You should also read my comments to that post, challenging this statement.

There is no distinction between point 1 and 2 in your example. It's the same point. An object's constructor gets invoked when the object gets instantiated, when it comes into existence. Whether it's point 1 or point 2, that's immaterial.
What you are separating out here is the allocation of the underlying memory for the object (point 1) and when the new object's constructor begins executing (point 2).
But that's a distinction without a difference. These are not two discrete events in C++ that somehow can be carried out as discrete, separate steps. They are indivisible, they are one and the same. You cannot allocate memory for a C++ object but somehow avoid constructing it. Similarly you cannot construct some object without allocating memory for it first. It's the same thing.
Now, you do have other distractions that can happen here, like employing the services of the placement-new operator (and manually invoking the object's destructor at some point later down the road). That seems to suggest that allocation is something that's separate, but its really not. The invocation of the placement new operator is, effectively, implicitly allocating the memory for the new object from the pointer you hand over to the placement new operator, then this is immediately followed by the object's construction. That's the way to think about it. So allocation+construction is still an indivisible step.
Another important point to understand is that the first thing that every object's constructor does is call the constructors of all objects that it's derived from. So, from a practical aspect, the very first thing that actually happens in the shown code is that A's constructor gets called first, to construct it, then, once its construction is finished, then B's "real" construction takes place.
B() : A(f()),
That's exactly what this reads. B's constructor starts executing, and its first order of business is to call A's constructor. B does nothing else, until A's constructor finishes. So, technically, B's constructor starts executing first, then A's constructor. But, B's constructor does nothing until A's constructor handles its business.
The first thing that B's constructor does is call A's constructor. This is specified in the C++ standard.
B() : j(f()), A(f())
If you try to do this your compiler will yell at you.
So, to summarize: when you instantiate an object, any object, the very first thing that happens is its constructor gets called. That's it. End of story. Every possible variation here (placement new, PODs with no constructors, the additional song-and-dance routine with virtual dispatch) is all downstream of this, and can be defined as special, and specific, implementations of constructors.

Related

Missed Optimization: std::vector<T>::pop_back() not qualifying destructor call?

In an std::vector<T> the vector owns the allocated storage and it constructs Ts and destructs Ts. Regardless of T's class hierarchy, std::vector<T> knows that it has only created a T and thus when .pop_back() is called it only has to destroy a T (not some derived class of T). Take the following code:
#include <vector>
struct Bar {
virtual ~Bar() noexcept = default;
};
struct FooOpen : Bar {
int a;
};
struct FooFinal final : Bar {
int a;
};
void popEm(std::vector<FooOpen>& v) {
v.pop_back();
}
void popEm(std::vector<FooFinal>& v) {
v.pop_back();
}
https://godbolt.org/z/G5ceGe6rq
The PopEm for FooFinal simply just reduces the vector's size by 1 (element). This makes sense. But PopEm for FooOpen calls the virtual destructor that the class got by extending Bar. Given that FooOpen is not final, if a normal delete fooOpen was called on a FooOpen* pointer, it would need to do the virtual destructor, but in the case of std::vector it knows that it only made a FooOpen and no derived class of it was constructed. Therefore, couldn't std::vector<FooOpen> treat the class as final and omit the call to the virtual destructor on the pop_back()?

Long story short - compiler doesn't have enough context information to deduce it https://godbolt.org/z/roq7sYdvT
Boring part:
The results are similar for all 3: msvc, clang, and gcc, so I guess the problem is general.
I analysed the libstdc++ code just to find pop_back() runs like this:
void pop_back() // a bit more convoluted but boils-down to this
{
--back;
back->~T();
}
Not any surprise. It's like in C++ textbooks. But it shows the problem - virtual call to a destructor from a pointer.
What we're looking for is the 'devirtualisation' technique described here: Is final used for optimisation in C++ - it states devirtualisation is 'as-if' behaviour, so it looks like it is open for optimisation if the compiler has enough information to do it.
My opinion:
I meddled with the code a little and i think optimisation doesn't happen because the compiler cannot deduce the only objects pointed by "back" are FooOpen instances. We - humans - know it because we analyse the entire class, and see the overall concept of storing the elements in a vector. We know the pointer must point to FooOpen instance only, but compiler fails to see it - it only sees a pointer which can point anywhere (vector allocates uninitialized chunk of memory and its interpretation is a part of vector's logic, also the pointer is modified outside the scope of pop_back()). Without knowing the entire concept of vector<> i don't think of how it can be deduced (without analysing the entire class) that it won't point to any descendant of FooOpen which can be defined in other translation units.
FooFinal doesn't have this problem because it already guarantees no other class can inherit from it so devirtualisation is safe for objects pointed by FooFinal* or FooFinal&.
Update
I made several findings which may be useful:
https://godbolt.org/z/3a1bvax4o), devirtualisation can occur for non-final classes as long as there is no pointer arithmetic involved.
https://godbolt.org/z/xTdshfK7v std::array performs devirtualisation on non-final classes. std::vector fails to do it even if it is constructed and destroyed in the same scope.
https://godbolt.org/z/GvoaKc9Kz devirtualisation can be enabled using wrapper.
https://godbolt.org/z/bTosvG658 destructor devirtualisation can be enabled with allocator. Bit hacky, but is transparent to the user. Briefly tested.

Yes, this is a missed optimisation.
Remember that a compiler is a software project, where features have to be written to exist. It may be that the relative overhead of virtual destruction in cases like this is low enough that adding this in hasn't been a priority for the gcc team so far.
It is an open-source project, so you could submit a patch that adds this in.

It feels a lot like § 11.4.7 (14) gives some insight into this. As of latest working draft (N4910 Post-Winter 2022 C++ working draft, Mar. 2022):
After executing the body of the destructor and destroying any objects with automatic storage duration
allocated within the body, a destructor for class X calls the destructors for X’s direct non-variant non-static data
members, the destructors for X’s non-virtual direct base classes and, if X is the most derived class (11.9.3), its
destructor calls the destructors for X’s virtual base classes. All destructors are called as if they were referenced
with a qualified name, that is, ignoring any possible virtual overriding destructors in more derived classes.
Bases and members are destroyed in the reverse order of the completion of their constructor (see 11.9.3).
[Note 4 : A return statement (8.7.4) in a destructor might not directly return to the caller; before transferring control
to the caller, the destructors for the members and bases are called. — end note]
Destructors for elements of an array are called in reverse order of their construction (see 11.9).
Also interesting for this topic, § 11.4.6, (17):
In an explicit destructor call, the destructor is specified by a ~ followed by a type-name or decltype-specifier
that denotes the destructor’s class type. The invocation of a destructor is subject to the usual rules for
member functions (11.4.2); that is, if the object is not of the destructor’s class type and not of a class derived
from the destructor’s class type (including when the destructor is invoked via a null pointer value), the program has undefined behavior.
So, as far as the standard cares, the invocation of a destructor is subject to the usual rules for member functions.
This, to me, sounds a lot like destructor calls do so much that compilers are likely unable to determine, at compile-time, that a destructor call does "nothing" - as it also calls destructors of members, and std::vector doesn't know this.

Clarification about object lifetime in destructor

Another question cites the C++ standard:
3.8/1 "The lifetime of an object of type T ends when: — if T is a class type with a non-trivial destructor (12.4), the destructor call
starts, or — the storage which the object occupies is reused or
released."
It would seem that this means accessing members of an object from a destructor is not allowed. However this seems to be wrong, and the truth is something more like what's explained in Kerrek SB's answer:
Member objects come alive before a constructor body runs, and they
stay alive until after the destructor finishes. Therefore, you can
refer to member objects in the constructor and the destructor.
The object itself doesn't come alive until after its own constructor
finishes, and it dies as soon as its destructor starts execution. But
that's only as far as the outside world is concerned. Constructors and
destructors may still refer to member objects.
I'm wondering if in the destructor I can pass the object's address to an outside class, like:
struct Person;
struct Organizer
{
static void removeFromGuestList(const Person& person); // This then accesses Person members
}
struct Person
{
~Person() {
// I'm about to die, I won't make it to the party
Organizer::removeFromGuestList(*this);
}
};
This seems OK to me as I think an object's lifetime lasts until after the destructor finishes, however this part of the above answer has me doubting:
The object itself doesn't come alive until after its own constructor
finishes, and it dies as soon as its destructor starts execution. But
that's only as far as the outside world is concerned. Constructors and
destructors may still refer to member objects.

The C++ Standard does seem to be just a little bit self-contradictory regarding the exact status of class members during execution of the destructor.
However, the following excerpt from this Draft C++ Standard may give some reassurance that your call to the removeFromGuestList function should be safe (bold italics formatting added by me):
15.7 Construction and destruction
1   For an object with a non-trivial constructor, referring to any non-static member or base
class of the object before the constructor begins execution results in
undefined behavior. For an object with a non-trivial destructor,
referring to any non-static member or base class of the object after
the destructor finishes execution results in undefined behavior.
What remains unclear (at least, to me) is whether or not referring to those class members via a reference to the object being destroyed is valid once the destructor has started execution. That is, assuming your Person class has a member, ObjectType a, is referring to person.a in your removeFromGuestList function valid?
On the other hand, rather than passing *this as its argument, if you were to pass each required member as a 'distinct object', then you would be safe; so, redefining that function as, say, removeFromGuestList(const ObjectType& a) (with possible additional arguments) would be completely safe.

Dynamic allocation of class array with protected destructor

If I have a class defined like
class A {
protected:
~A(){ }
};
then I can dynamically allocate the individual as well as array of objects like
A* ptr1 = new A;
A* ptr2 = new A[10];
However when I define the constructor for this class
class A {
public:
A(){}
protected:
~A(){ }
};
then I can create individual objects with
A* ptr = new A;
but when I try to dynamically allocate the array of object with
A* ptr = new A[10];
compiler(gcc-5.1 and Visual Studio 2015) starts complaining that A::~A() is inaccessible.
Can anyone explain about:-
1- Why is the difference in behavior with constructor being defined and not defined.
2- When the constructor is defined why I am allowed to create individual object and not array of object.

Rejecting an array-new with a protected destructor is correct, as per C++11, §5.3.4 ¶17:
If the new-expression creates an object or an array of objects of class type, access and ambiguity control are done for the allocation function, the deallocation function (12.5), and the constructor (12.1). If the new expression creates an array of objects of class type, access and ambiguity control are done for the destructor (12.4).
(emphasis added; almost exactly the same wording is used in C++03, §5.3.4 ¶16; C++14 moves some stuff around, but that doesn't seem to change the core of the issue - see #Baum mit Augen's answer)
That comes from the fact that new[] succeeds only if only all the elements have been constructed, and wants to avoid leaking objects in case one of the costructor calls fails; thus, if it manages to construct - say - the first 9 objects but the 10th fails with an exception, it has to destruct the first 9 before propagating the exception.
Notice that this restriction logically wouldn't be required if the constructor was declared as noexcept, but still the standard doesn't seem to have any exception in this regard.
So, here gcc is technically wrong in the first case, which, as far as the standard is concerned, should be rejected as well, although I'd argue that "morally" gcc does the right thing (as in practice there's no way that the default constructor of A can ever throw).

As it turns out, gcc is not correct here. In N4141 (C++14), we have:
If the
new-expression creates an array of objects of class type, the destructor is potentially invoked (12.4).
(5.3.4/19 [expr.new]) and
A
program is ill-formed if a destructor that is potentially invoked is deleted or not accessible from the context
of the invocation.
(12.4/11 [class.dtor]). So both array cases should be rejected. (Clang does get that right, live.)
The reason for that is, as mentioned by others and by my old, incorrect answer, that the construction of elements of class type can potentially fail with an exception. When that happens, the destructor of all fully constructed elements must be invoked, and thus the destructor must be accessible.
That limitation does not apply when allocating a single element with operator new (without the []), because there can be no fully constructed instance of the class if the single constructor call fails.

I'm not a language lawyer (very familiar with the standard), but suspect the answer is along the lines of that given earlier by Baum mit Augen (deleted, so only those with sufficient reputation can see it).
If the construction of subsequent array elements fails and throws an exception, then the already constructed elements will need to be deleted, requiring access to the destructor.
However, if the constructor is noexcept, this can be ruled out and access to the destructor is not required. The fact that gcc and clang both still complain even in this case, may well be a compiler bug. That is, the compiler fails to take into account that the constructor is noexcept. Alternatively, the compilers may be within the standard, in which case, this smells like a defect in the standard.

In C++ are constructors called before or after object creation?

I found some answers for this question regarding java but nothing specifically regarding c++. So I've read in Java the object is first created and then the constructor is called. I was wondering if this was the same process for c++? Also, if this is the case, then what's the point of having a default constructor at all? Is it for inheritance purposes?

"Object creation" means different things in different languages. But in C++ the most salient question is "when does the object lifetime begin". When an object's lifetime has begun, that means that when it later ends (you delete it, or if it is a stack object, then when it goes out of scope), the destructor will be called.
If the object's lifetime did not formally begin, then if it goes out of scope later, the destructor will not be called.
C++ resolves this as follows:
When you make an object, say of class type, by invoking a constructor, first the memory is allocated, then the constructor runs.
When the constructor runs to completion, then the lifetime has begun, and the destructor will be called when it ends. After the destructor finishes, the memory will be freed.
If the constructor aborts, say, by throwing an exception, then the destructor for that object will not be called. The memory will still be freed, however.
For more info on object lifetimes you might want to look at e.g. this question, or better, at the standard / a good text book.
The basic idea is that, in C++, we try to minimize the window of time between when the memory has been allocated and when it is initialized -- or rather, the language itself promotes the idea that "resource acquisition is initialization" and makes it un-idiomatic to acquire memory without also giving it a type and initializing it. When typically writing code, e.g. if you have a variable of type A, you can think of it as "This refers to a block of memory for an A, where a constructor for A successfully ran to completion." You don't normally have to consider the possibility that "this is a block of memory the size of an A, but the constructor failed and now its an uninitialized / partially initialized headless blob."

It depends on what you mean by "created". Obviously memory should be allocated before object can be created.
But officially, object is not created (its lifetime is not started) until after constructor finishes execution. If constructor did not execute completely (exception happened, for example), object is considered never existing in first place. For example destructor for this object will not be called (as it would for any existing object)
When you enter constructor body, you can be sure that members are create (either by default constructors or whatever was passed in constructor initializer list; usual language rules for variable initialization applies: initial values of scalar variables are undefined).
When you consider inheritance, it is even more complex. When you enter constructor body, ancestors parts are already existing objects: their constructor had finished, and even if children would not be able to construct itself properly, their destructors will be called.

Neither.
The process of object creation includes a constructor call.
The point of having a default constructor is to allow this part of the object creation process to be a no-op. What else would it be?

Here's the sequence: first, memory is allocated. Then any base class constructors are called. Finally the class constructor is called.
You don't actually need a default constructor if you'll always create your objects with a parameter list.

You seem to be confusing terms but I'll try to define some (unofficial) terms that should clarify this issue:
Allocation
This is the step where memory is allocated for the object.
Initialization
This is the step where the language related object properties are "set". The vTable and any other "language implementation" related operations are done.
Construction
Now that an object is allocated and initialized, the constructor is being executed. Whether the default constructor is used or not depends on how the object was created.
You can consider an object created after the 3rd step.

Also, if this is the case, then what's the point of having a default
constructor at all?
The point of "Default Constructor" is to tell the program how objects with no parameters should be built, in other words - what is the "default" state of an Object.
for example, the default state of std::unique_ptr is to point to null with no costume deleter.
The default state of a string is an empty string with the size of 0.
The default state of a vector is an empty vector with the size of 0.
The default state of T is the one specified by the constructor T().

Think that constructor has to be called after object's creation due to resource allocation issues. An object is no more than a structure with functions pointers (members function) and member variables (attributes). An error would occur if the constructor sets up any of these values before they are allocated in memory.
For example, your constructor stores an int value in a member variable of your object, but your object's member variables haven't been allocated, so the value cannot be stored successfully.
Regards!

Yes, the object's memory is first allocated, then the constructor called to actually construct the content. A bit like building a house, you first purchase [or otherwise legally get permission] the land (memory) to build it on, then start building. If you do it the other way around, you're likely to get into trouble.
A default constructor is used when your object needs to be constructed with no parameters. In some cases, this doesn't make any sense at all, but default constructors are used for example in the std::vector - since a std::vector<myclass> will be implemented as an array of objects, and if you grow it [using push_back], the size will double (or something like it), and the objects at the back of the vector that hasn't been used will be default constructed.
All objects need to be constructed after creation (even if you don't declare one, in which case you get an empty constructor, and if the compiler is clever it will not call the constructor because it doesn't do anything)

The term "default constructor" just means a constructor that needs no parameters, such that it can be used by-default. For example:
struct MyObject {
MyObject() { ... } // default constructor
MyObject(int) { ... } // some other non-default constructor
};
int main()
{
MyObject x; // default constructor is called since you didn't give
// any explicit parameters.
MyObject x2(5); // A non-default constructor is called.
}
Note that the term "default" doesn't mean that you haven't defined it. The default constructor may be provided by you or it may be automatically generated in certain situations. You even have the case of the defaulted default constructor:
struct MyObject {
MyObject() = default; // defaulted default constructor
};
Here, you've told the compiler to generate a default constructor using the default implementation.
Regardless of whether it is a default constructor or not, the object has been constructed as much as automatically possible before the constructor body is executed. This means the memory has been allocated to store the object, any base classes have been constructed, and any members have been constructed. The object also has taken the type identity of the class being constructed for purposes of RTTI and virtual function calls.

Calling a Method in Constructor

Herb Sutter mentions in one of his http://www.gotw.ca articles that an object is constructed(has valid existence) only if the constructor executes completes.ie to put it in a crude way control passes beyond its final brace.
Now consider the following code
class A
{
public:
A()
{
f();
}
void f()
{
cout << "hello, world";
}
};
int main()
{
A a;
}
Now from what Herb says, can't we say that since A is not completely constructed inside its constructor Calling f() inside the constructor is invalid as the "this" ptr is not ready yet.
Still there is indeed a valid "this" inside the constructor and f() does get called.
I don't think Herb is saying something incorrect... but guess i am interpreting it incorrectly....can some explain to me what exactly that is?
Here is the link to the article : http://www.gotw.ca/gotw/066.htm
It talks about exceptions from constructors. Specifically here is the extract from it on which my question is based:
-When does an object's lifetime begin?
When its constructor completes successfully and returns normally. That is, control reaches the end of the constructor body or an earlier return statement.
-When does an object's lifetime end?
When its destructor begins. That is, control reaches the beginning of the destructor body.
Important point here is that the state of the object before its lifetime begins is exactly the same as after its lifetime ends -- there is no object, period. This observation brings us to the key question:
We might summarize the C++ constructor model as follows:
Either:
(a) The constructor returns normally by reaching its end or a return statement, and the object exists.
Or:
(b) The constructor exits by emitting an exception, and the object not only does not now exist, but never existed.

Now from what Herb says, can't we say
that since A is not completely
constructed inside its constructor
Calling f() inside the constructor is
invalid as the "this" ptr is not ready
yet.
That is only when f() is a virtual method of class A or its inheritance hierarchy and you expect the runtime resolution for f() according to the right object. In simple words, virtual mechanism doesn't kick in if the method is invoked inside constructor.
If f() is not a virtual function, there is no harm in calling it from constructor(s) provided you know what exactly f() does. Programmers usually call class methods like initialize() from constructor(s).
Can you give me the link to the Herb Sutter's article?

By the time program flow enters your constructor, the object's memory has been allocated and the this pointer is indeed valid.
What Herb means, is that the object's state may not have entirely initialized. In particular, if you are constructing a class derived from A, then that class' constructor will not have been called while you are still inside A's constructor.
This is important if you have virtual member functions, since any virtual function in the derived class will not be run if called from within A's constructor.

Note: it would have been easier with the exact article, so that we could have some context
Lifetime considerations are actually pretty complicated.
Considering the constructor of an object, there are two different point of views:
external: ie the user of an object
internal: ie, you when writing constructors and destructors (notably)
From the external point of view, the lifetime of an object:
begins once the constructor successfully completed
ends when the destructor begins to run
It means that if you attempt to access an object mid-construction or mid-destruction Bad Things Happen (tm). This is mostly relevant to multi-threaded programs, but may happen if you pass pointers to your object to base classes... which leads to...
...the internal point of view. It's more complicated. One thing you are sure of is that the required memory has been allocated, however parts of the objects may not be fully initialized yet (after all, you are constructing it).
in the body of the constructor, you can use the attributes and bases of the class (they are initialized), and call functions normally (virtual calls should be avoided).
if it's a base class, the derived object is not initialized yet (thus the restriction on virtual calls)

The implication from the lifetime not having started yet is mainly that, should the constructor throw an exception, the destructor will not be run.

Beware of member variables that are not yet initialized. Beware of virtual functions: the function that you call might not be the one that you expect if the function is virtual and a derived object is created. Other than that, I do not see any problem calling methods from the constructor. Especially the memory for the object has already been allocated.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js