Trying to understand default constructors and member initialisation - c++

I am used to initialising member variables in class constructors, but I thought I'd check out if default values are set by default constructors. My tests were with Visual Studio 2022 using the C++ 20 language standard. The results confused me:
#include <iostream>
class A
{
public:
double r;
};
class B
{
public:
B() = default;
double r;
};
class C
{
public:
C() {}
double r;
};
int main()
{
A a1;
std::cout << a1.r << std::endl; // ERROR: uninitialized local variable 'a1' used
A a2();
std::cout << a2.r << std::endl; // ERROR: left of '.r' must have class/struct/union
A* pa1 = new A;
std::cout << pa1->r << std::endl; // output: -6.27744e+66
A* pa2 = new A();
std::cout << pa2->r << std::endl; // output: 0
B b1;
std::cout << b1.r << std::endl; // ERROR: uninitialized local variable 'b1' used
B b2();
std::cout << b2.r << std::endl; // ERROR: left of '.r' must have class/struct/union
B* pb1 = new B;
std::cout << pb1->r << std::endl; // output: -6.27744e+66
B* pb2 = new B();
std::cout << pb2->r << std::endl; // output: 0
C c1;
std::cout << c1.r << std::endl; // output: -9.25596e+61
C c2();
std::cout << c2.r << std::endl; // ERROR: left of '.r' must have class/struct/union
C* pc1 = new C;
std::cout << pc1->r << std::endl; // output: -6.27744e+66
C* pc2 = new C();
std::cout << pc2->r << std::endl; // output: -6.27744e+66
}
Thanks to anyone who can enlighten me.

Let's see what is happening on a case-by-case basis in your example.
Case 1
Here we consider the statements:
A a1; //this creates a variable named a1 of type A using the default constructor
std::cout << a1.r << std::endl; //this uses the uninitialized data member r which leads to undefined behavior
In the above snippet, the first statement creates a variable named a1 of type A using the default constructor A::A() synthesized by the compiler. This means that the data member r is default-initialized. Since r is of a built-in type, it has an indeterminate value. Reading this uninitialized variable, as the second statement does, is undefined behavior.
Case 2
Here we consider the statements:
A a2(); //this is a function declaration
std::cout << a2.r << std::endl; //this is not valid since a2 is the name of a function
The first statement in the above snippet, declares a function named a2 that takes no parameters and has the return type of A. That is, the first statement is actually a function declaration. Now, in the second statement you're trying to access a data member r of the function named a2 which doesn't make any sense and hence you get the mentioned error.
Case 3
Here we consider the statements:
A* pa1 = new A;
std::cout << pa1->r << std::endl; // output: -6.27744e+66
The first statement in the above snippet has the following effects:
1. An unnamed object of type A is created on the heap by new A, using the default constructor A::A() synthesized by the compiler. The expression also yields a pointer to this unnamed object.
2. Next, the pointer from step 1 is used as the initializer for pa1. That is, a pointer to A named pa1 is created and initialized with the pointer to the unnamed object.
Since the default constructor was used (see step 1), the data member r of the unnamed object is default-initialized. Since r is of a built-in type, it has an indeterminate value. Reading this uninitialized data member, as the second statement of the snippet does, is undefined behavior. This is why you get some garbage value as output.
Case 4
Here we consider the statements:
A* pa2 = new A();
std::cout << pa2->r << std::endl;
The first statement of the above snippet has the following effects:
1. An unnamed object of type A is created by the expression new A(). This time you have used parentheses (), and class A does not have a user-provided default constructor, so value-initialization happens. For such a class this means the object, and therefore the data member r, is zero-initialized. This is why you get 0 as the output of the second statement. The expression also yields a pointer to this unnamed object.
2. Next, a pointer to A named pa2 is created and initialized with the pointer to the unnamed object from step 1.
Exactly the same thing happens with the next 4 statements, which use class B, so I am not discussing them separately; we would learn nothing new. They behave the same as the 4 statements described above.
Now we will consider the statements related to class C. We're not skipping over these 4 statements because class C has a user-provided default constructor.
Case 5
Here we consider the statements:
C c1;
std::cout << c1.r << std::endl;
The first statement of the above snippet creates a variable named c1 of type C using the user-provided default constructor C::C(). Since this user-provided default constructor doesn't do anything, the data member r is left uninitialized and we get the same behavior as we discussed for A a1;. That is, reading this uninitialized variable, as the second statement does, is undefined behavior.
Case 6
Here we consider the statements:
C c2();
std::cout << c2.r << std::endl;
The first statement in the above snippet is a function declaration. Thus you'll get the same behavior/error that we got for class A.
Case 7
Here we consider the statements:
C* pc1 = new C;
std::cout << pc1->r << std::endl;
The first statement in the above snippet has the following effects:
1. An unnamed object of type C is created on the heap by the expression new C, using the user-provided default constructor C::C(). Since the user-provided default constructor does nothing, the data member r is left uninitialized. The expression also yields a pointer to this unnamed object.
2. Next, a pointer to C named pc1 is created and initialized with the pointer to the unnamed object from step 1.
The second statement in the above snippet then reads the uninitialized data member r, which is undefined behavior and explains why you are getting some garbage value as output.
Case 8
Here we consider the statements:
C* pc2 = new C();
std::cout << pc2->r << std::endl;
The first statement of the above snippet has the following effects:
1. An unnamed object of type C is created on the heap by new C(). Since you have specified parentheses (), this performs value-initialization. But because this time we have a user-provided default constructor, value-initialization is the same as default-initialization, which is done by calling that constructor. Since the user-provided default constructor does nothing, the data member r is left uninitialized. The expression also yields a pointer to the unnamed object.
2. Next, a pointer to C named pc2 is created and initialized with the pointer to the unnamed object from step 1.
The second statement in the above snippet then reads the uninitialized data member r, which is undefined behavior and explains why you are getting some garbage value as output.

A a2();, B b2(); and C c2(); could also be parsed as declarations of functions returning A/B/C with empty parameter list. This interpretation is preferred and so you are declaring functions, not variables. This is also known as the "most vexing parse" issue.
None of the default constructors (including the implicit one of A) are initializing r, so it will have an indeterminate value. Reading that value will cause undefined behavior.
The exceptions are A* pa2 = new A(); and B* pb2 = new B();. The () initializer does value-initialization. In these two cases the effect of value-initialization is that the whole object is zero-initialized, because both A and B have a default constructor that is not user-provided. (Defaulted on first declaration doesn't count as user-provided.)
In case of C this doesn't apply, because C's default constructor is user-provided and therefore value-initialization will only result in default-initialization, calling the default constructor, which doesn't initialize r.

MyType name(); // This is treated as a function declaration
MyType name{}; // This is the correct way

Related

How unused default member initializer can change program behavior in C++?

Please consider this short code example:
#include <iostream>
struct A
{
A() { std::cout << "A() "; }
~A() { std::cout << "~A() "; }
};
struct B { const A &a; };
struct C { const A &a = {}; };
int main()
{
B b({});
std::cout << ". ";
C c({});
std::cout << ". ";
}
GCC prints here ( https://gcc.godbolt.org/z/czWrq8G5j )
A() ~A() . A() . ~A()
meaning that the lifetime of A-object initializing reference in b is short, but in c the lifetime is prolonged till the end of the scope.
The only difference between structs B and C is in default member initializer, which is unused in main(), still the behavior is distinct. Could you please explain why?
C c(...); is syntax for direct initialisation. Overload resolution would find a match from the constructors of C: The move constructor can be called by temporary materialisation of a C from {}. {} is value initialisation which will use the default member initialiser. Thus, the default member initialiser isn't unused. Since C++17, the move constructor isn't necessary and {} initialises variable c directly; In this case c.a is bound directly to the temporary A and the lifetime is extended until destruction of C.
B isn't default constructible, so the overload resolution won't find a match. Instead, aggregate initialisation is used since C++20 - prior to that it would be ill-formed. The design of the C++20 feature is to not change behaviour of previously valid programs, so aggregate initialisation has lower priority than the move constructor.
Unlike in the case of C, the lifetime of the temporary A isn't extended because parenthesised initialisation list is an exceptional case. It would be extended if you used curly braces:
B b{{}};

C++: Initializing with a temporary object that uses its default constructor declares a function prototype instead of a variable [duplicate]

#include <iostream>
struct A
{
// constructor 1
A() { std::cout << "A() called" << std::endl; }
// constructor 2
A(const A&) { std::cout << "A(const A&) called" << std::endl; }
};
int main()
{
// statement 1
A x1;
// statement 2
A x2( A() );
return 0;
}
I've compiled it with GCC 4.7, 5.1 and 8.1. I got the same results from C++98 to C++17.
Statement 1 creates an A variable that calls constructor 1.
Statement 2 declares a prototype of a function that takes a function pointer as its argument and returns a new A object:
A ( A (*)() )
I was expecting that statement 2 would create an A variable that calls constructor 2 since I am initializing with a temporary A object.
I'm not sure what I'm missing here. Can someone help me understand this unexpected situation?
In what is labelled statement 2 in your code, A() is not parsed as a temporary object but as a function type (a function taking no arguments and returning A), which as a parameter is adjusted to a function pointer. So, as you mention, x2 is declared as a function, and no A object is constructed by that line at all.
Passing an argument that can only be read as an expression compiles as intended, for example by dereferencing a newly allocated object:
// statement 2
A x2( *(new A()) );
[Since this is just for the sake of shedding light on this particular issue, we'll overlook that we are more than likely to create a memory leak if the line above finds its way in a real code base, because we have no means of control of the new A object that gets copied in x2.]
The output will be
A() called
A() called
A(const A&) called
Basically, yours was a good try, but with this approach you cannot get away from creating another A object if you don't want to copy x2 from x1.
In order to save one constructor call, we could construct x2 from x1. Going through a function pointer is possible, but fcnptr must first be made to point at a real function that takes and returns an A; calling through the uninitialized pointer declared below is undefined behavior:
// statement 2
A (*fcnptr)(A);
A x2( (*fcnptr)(x1) );
The simpler, well-defined way to express the same intent is
A x2(x1);
without using function pointers.
The output:
A() called
A(const A&) called

Do member variables die off before the destructor is called?

I was running this code to figure out whether an object is destructed after or before it is reassigned to. But I didn't get the expected output. The variable id is printed correctly by the included print function, but it fails with some other number when printed by the destructor, and the two are the same for both objects. Why does this happen?
#include <iostream>
#include <string>
class A {
static int _idx;
int id;
public:
A()
{
std::cout << "Default constructor" << std::endl;
id = _idx;
_idx++;
}
A(std::string&& str)
{
id = _idx;
_idx++;
std::cout << str << std::endl;
}
void print()
{
std::cout << id << std::endl;
}
~A()
{
std::cout << id << std::endl;
}
};
int A::_idx = 0;
int main(void)
{
A a;
a.print();
a = std::string("World");
}
Output:
Default constructor
0
World
1
1
EDIT: Removed sloppy code and added example output for clarification
Don't call the destructor explicitly. Simply allow the variable to go out of scope.
When A does go out of scope, the first thing to run is the destructor body, followed by the destructors of the class's member variables in the reverse of the order in which they were constructed; then the process continues for any base classes.
(Member variables are constructed in the order that they appear in the class declaration.)
In your code, the destructor is called twice since you are constructing 2 instances of A; note that the compiler-generated assignment operator is effectively what causes the value of id to be 1 in both at the point of their destruction.
The title question:
Do member variables die off before the destructor is called?
The answer is No. All member variables are alive in the body of the destructor.
Question in the post:
The variable id is printed correctly by the included print function, but it fails with some other number when printed by the destructor, and the two are the same for both objects. Why does this happen?
That problem can be traced to the line:
a = std::string("World");
This calls the overloaded constructor of the class to create a temporary object and the temporary object is assigned to a. The problem here is the implementation of the constructor.
It used to be
A(std::string&& str)
{
std::cout << str << std::endl;
}
The constructor left the member variable uninitialized. Hence, the value of member variable could be anything.
Now that you changed it to
A(std::string&& str)
{
id = _idx;
_idx++;
std::cout << str << std::endl;
}
your program will work in a predictable manner.
Why are your last lines of output the way they are?
Consider the line
a = std::string("World");
It is equivalent to:
Construct a temporary object. (Its id is 1.)
Assign the temporary object to a. Now a.id is 1.
Destroy the temporary object. The destructor gets called on the temporary. You get the output 1.
When the function returns, a is destructed. The destructor gets called on a. Since a.id is set to 1, you get the output 1.
The destructor is called when the object needs to die. After the destructor runs, member variables are destructed in reverse order of declaration. Destructor runs first.
the two are the same for both objects. Why does this happen?
Because of this line.
a = std::string("World");
This constructs a second A, with id = 1 and assigns it to the first A making that id also 1.
That = is an assignment operator. It means "make the left thing like the right thing".

Returning value (reference, pointer and object)

I have some difficulties with understanding what is really done behind returning values in C++.
Consider the following code:
class MyClass {
public:
int id;
MyClass(int id) {
this->id = id;
cout << "[" << id << "] MyClass::ctor\n";
}
MyClass(const MyClass& other) {
cout << "[" << id << "] MyClass::ctor&\n";
}
~MyClass() {
cout << "[" << id << "] MyClass::dtor\n";
}
MyClass& operator=(const MyClass& r) {
cout << "[" << id << "] MyClass::operator=\n";
return *this;
}
};
MyClass foo() {
MyClass c(111);
return c;
}
MyClass& bar() {
MyClass c(222);
return c;
}
MyClass* baz() {
MyClass* c = new MyClass(333);
return c;
}
I use gcc 4.7.3.
Case 1
When I call:
MyClass c1 = foo();
cout << c1.id << endl;
The output is:
[111] MyClass::ctor
111
[111] MyClass::dtor
My understanding is that in foo the object is created on the stack and then destroyed at the return statement because that is the end of its scope. Returning is done by copying the object (copy constructor), and the copy is later assigned to c1 in main (assignment operator). If I'm right, why is there no output from the copy constructor or the assignment operator? Is this because of RVO?
Case 2
When I call:
MyClass c2 = bar();
cout << c2.id << endl;
The output is:
[222] MyClass::ctor
[222] MyClass::dtor
[4197488] MyClass::ctor&
4197488
[4197488] MyClass::dtor
What is going on here? I create a variable and return it, and the variable is destroyed because it is the end of its scope. The compiler tries to copy that variable with the copy constructor, but it is already destroyed, and that's why I get a random value? So what is actually in c2 in main?
Case 3
When I call:
MyClass* c3 = baz();
cout << c3->id << endl;
The output is:
[333] MyClass::ctor
333
This is the simplest case? I return a pointer to a dynamically created object that lives on the heap, so memory is allocated and not automatically freed. This is the case where the destructor isn't called and I have a memory leak. Am I right?
Are there any other cases or things that aren't obvious and that I should know to fully master returning values in C++? ;) What is the recommended way to return an object from a function (if any)? Are there any rules of thumb?
May I just add that case #2 is one of the cases of undefined behavior in the C++ language: the function returns a reference to a local variable, and that reference is then used. A local variable has a precisely defined lifetime, and by returning it by reference you hand back a reference to an object that no longer exists once the function returns. Using it is undefined behavior, so the value you see is practically random, as is the behavior of the rest of your program, since anything at all can happen.
Most compilers will issue a warning when you try to do something like this (return a local variable either by reference or by address). gcc, for example, tells me something like this:
bla.cpp:37:13: warning: reference to local variable ‘c’ returned [-Wreturn-local-addr]
You should remember, however, that the compiler is not at all required to issue any kind of warning when a statement that may exhibit undefined behavior occurs. Situations such as this one, though, must be avoided at all costs, because they're practically never right.
Case 1:
MyClass foo() {
MyClass c(111);
return c;
}
...
MyClass c1 = foo();
is a typical case when RVO can be applied. This is called copy-initialization and the assignment operator is not used since the object is created in place, unlike the situation:
MyClass c1;
c1 = foo();
where c1 is constructed, temporary c in foo() is constructed, [ copy of c is constructed ], c or copy of c is assigned to c1, [ copy of c is destructed] and c is destructed. (what exactly happens depends on whether the compiler eliminates the redundant copy of c being created or not).
Case 2:
MyClass& bar() {
MyClass c(222);
return c;
}
...
MyClass c2 = bar();
invokes undefined behavior since you are returning a reference to the local variable c, an object with automatic storage duration that is destroyed when bar() returns.
Case 3:
MyClass* baz() {
MyClass* c = new MyClass(333);
return c;
}
...
MyClass* c3 = baz();
is the most straightforward one since you control what happens, yet with a very unpleasant consequence: you are responsible for memory management, which is the reason why you should avoid dynamic allocation of this kind whenever possible (and prefer Case 1).
1) Yes.
2) You have a random value because your copy c'tor and operator= don't copy the value of id. However, you are correct in assuming that you cannot rely on the value of an object after it has been destroyed.
3) Yes.

const reference public member to private class member - why does it work?

Recently, I found an interesting discussion on how to allow read-only access to private members without obfuscating the design with multiple getters, and one of the suggestions was to do it this way:
#include <iostream>
class A {
public:
A() : _ro_val(_val) {}
void doSomething(int some_val) {
_val = 10*some_val;
}
const int& _ro_val;
private:
int _val;
};
int main() {
A a_instance;
std::cout << a_instance._ro_val << std::endl;
a_instance.doSomething(13);
std::cout << a_instance._ro_val << std::endl;
}
Output:
$ ./a.out
0
130
GotW#66 clearly states that object's lifetime starts
when its constructor completes successfully and returns normally. That is, control reaches the end of the constructor body or an earlier return statement.
If so, we have no guarantee that the _val member will have been properly created by the time we execute _ro_val(_val). So how come the above code works? Is it undefined behaviour? Or are primitive types granted some exception to the object's lifetime?
Can anyone point me to some reference which would explain those things?
Before the constructor is called, an appropriate amount of memory is reserved for the object, on the free store if you use new or on the stack if you create the object locally. This implies that the memory for _val is already allocated by the time you refer to it in the member initializer list; it is just not yet initialized.
_ro_val(_val)
makes the reference member _ro_val refer to the memory allocated for _val, which might contain anything at this point.
There is still undefined behavior in your program, because _val is never initialized: you should explicitly initialize it to 0 (or some value of your choosing) in the member initializer list or the constructor body. The output 0 in this case is just luck; it might give you some other value since _val is left uninitialized. See the behavior here on gcc 4.3.4, which demonstrates the UB.
But as for the question: yes, binding the reference this way is well-defined.
The object's address does not change.
I.e. it's well-defined.
However, the technique shown is just premature optimization. You don't save programmers' time. And with modern compiler you don't save execution time or machine code size. But you do make the objects un-assignable.
Cheers & hth.,
In my opinion, it is legal (well-defined) to initialize a reference with an uninitialized object. That is legal, but the standard (well, the latest C++11 draft, paragraph 8.5.3.3) states that the initializer should be a valid (fully constructed) object:
A reference shall be initialized to refer to a valid object or function.
The next sentence from the same paragraph sheds a bit more light on reference creation:
[Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior.]
I understand that reference creation means binding reference to an object obtained by dereferencing its pointer and that probably explains that the minimal prerequisite for initialization of reference of type T& is having an address of the portion of the memory reserved for the object of type T (reserved, but not yet initialized).
Accessing uninitialized object through its reference can be dangerous.
I wrote a simple test application that demonstrates reference initialization with uninitialized object and consequences of accessing that object through it:
#include <iostream>
class C
{
public:
int _n;
C() : _n(123)
{
std::cout << "C::C(): _n = " << _n << " ...and blowing up now!" << std::endl;
throw 1;
}
};
class B
{
public:
// pC1- address of the reference is the address of the object it refers
// pC2- address of the object
B(const C* pC1, const C* pC2)
{
std::cout << "B::B(): &_ro_c = " << pC1 << "\n\t&_c = " << pC2 << "\n\t&_ro_c->_n = " << pC1->_n << "\n\t&_c->_n = " << pC2->_n << std::endl;
}
};
class A
{
const C& _ro_c;
B _b;
C _c;
public:
// Initializer list: members are initialized in the order how they are
// declared in class
//
// Initializes reference to _c
//
// Fully constructs object _b; its c-tor accesses uninitialized object
// _c through its reference and its pointer (valid but dangerous!)
//
// construction of _c fails!
A() : _ro_c(_c), _b(&_ro_c, &_c), _c()
{
// never executed
std::cout << "A::A()" << std::endl;
}
};
int main()
{
try
{
A a;
}
catch(...)
{
std::cout << "Failed to create object of type A" << std::endl;
}
return 0;
}
Output:
B::B(): &_ro_c = 001EFD70
&_c = 001EFD70
&_ro_c->_n = -858993460
&_c->_n = -858993460
C::C(): _n = 123 ...and blowing up now!
Failed to create object of type A