C++ anonymous constructor doing weird things

This sample program shows how a different constructor will be called depending on whether you pass in a local variable, a global variable, or an anonymous variable. What is going on here?
#include <iostream>
#include <string>

std::string globalStr;

class aClass {
public:
    aClass(std::string s) {
        std::cout << "1-arg constructor" << std::endl;
    }
    aClass() {
        std::cout << "default constructor" << std::endl;
    }
    void puke() {
        std::cout << "puke" << std::endl;
    }
};

int main(int argc, char** argv) {
    std::string localStr;
    //aClass(localStr); //this line does not compile
    aClass(globalStr); //prints "default constructor"
    aClass(""); //prints "1-arg constructor"
    aClass(std::string("")); //also prints "1-arg constructor"
    globalStr.puke(); //compiles, even though std::string can't puke.
}
Given that I can call globalStr.puke(), I'm guessing that by calling aClass(globalStr);, it is creating a local variable named globalStr of type aClass that is being used instead of the global globalStr. Calling aClass(localStr); tries to do the same thing, but fails to compile because localStr is already declared as a std::string. Is it possible to create an anonymous instance of a class by calling its 1-arg constructor with a non-constant expression? Who decided that type(variableName); should be an acceptable way to define a variable named variableName?

aClass(localStr); //this line does not compile
This tries to declare a variable of type aClass named localStr. The syntax is terrible, I agree, but it's way too late for that [changing the standard] now.
aClass(globalStr); //prints "default constructor"
This declares a default-constructed variable of type aClass called globalStr. Inside main, this local globalStr hides the global one.
aClass(""); //prints "1-arg constructor"
This creates a temporary object of type aClass.
aClass(std::string("")); //also prints "1-arg constructor"
This also creates a temporary.
globalStr.puke(); //compiles, even though std::string cant puke.
This uses the globalStr in main, which is consistent with every other instance of shadowing.
Is it possible to create an anonymous instance of a class by calling its 1-arg constructor with a non-constant expression?
Yes, I can think of four ways:
aClass{localStr}; // C++11 list-initialization, often called "uniform initialization"
(void)aClass(localStr); // The regular "discard this result" syntax from C.
void(aClass(localStr)); // Another way of writing the second line with C++.
(aClass(localStr)); // The parentheses prevent this from being a valid declaration.
As a side note, this syntax can often be the cause of the Most Vexing Parse. For example, the following declares a function foo that returns aClass, with one parameter localStr of type std::string:
aClass foo(std::string(localStr));
Indeed it's the same rule that's responsible for your problems - If something can be parsed as a valid declaration, it must be. That is why aClass(localStr); is a declaration and not a statement consisting of a lone expression.

Related

Trying to understand default constructors and member initialisation

I am used to initialising member variables in class constructors, but I thought I'd check out if default values are set by default constructors. My tests were with Visual Studio 2022 using the C++ 20 language standard. The results confused me:
#include <iostream>
class A
{
public:
double r;
};
class B
{
public:
B() = default;
double r;
};
class C
{
public:
C() {}
double r;
};
int main()
{
A a1;
std::cout << a1.r << std::endl; // ERROR: uninitialized local variable 'a1' used
A a2();
std::cout << a2.r << std::endl; // ERROR: left of '.r' must have class/struct/union
A* pa1 = new A;
std::cout << pa1->r << std::endl; // output: -6.27744e+66
A* pa2 = new A();
std::cout << pa2->r << std::endl; // output: 0
B b1;
std::cout << b1.r << std::endl; // ERROR: uninitialized local variable 'b1' used
B b2();
std::cout << b2.r << std::endl; // ERROR: left of '.r' must have class/struct/union
B* pb1 = new B;
std::cout << pb1->r << std::endl; // output: -6.27744e+66
B* pb2 = new B();
std::cout << pb2->r << std::endl; // output: 0
C c1;
std::cout << c1.r << std::endl; // output: -9.25596e+61
C c2();
std::cout << c2.r << std::endl; // ERROR: left of '.r' must have class/struct/union
C* pc1 = new C;
std::cout << pc1->r << std::endl; // output: -6.27744e+66
C* pc2 = new C();
std::cout << pc2->r << std::endl; // output: -6.27744e+66
}
Thanks to anyone who can enlighten me.
Let's look at what is happening, case by case, in your example.
Case 1
Here we consider the statements:
A a1; //this creates a variable named a1 of type A using the default constructor
std::cout << a1.r << std::endl; //this uses the uninitialized data member r which leads to undefined behavior
In the above snippet, the first statement creates a variable named a1 of type A using the default constructor A::A() synthesized by the compiler. This means that the data member r will be default-initialized. Since r is of built-in type, it will have an indeterminate value. Using this uninitialized variable, as you do in the second statement, is undefined behavior.
Case 2
Here we consider the statements:
A a2(); //this is a function declaration
std::cout << a2.r << std::endl; //this is not valid since a2 is the name of a function
The first statement in the above snippet, declares a function named a2 that takes no parameters and has the return type of A. That is, the first statement is actually a function declaration. Now, in the second statement you're trying to access a data member r of the function named a2 which doesn't make any sense and hence you get the mentioned error.
Case 3
Here we consider the statements:
A* pa1 = new A;
std::cout << pa1->r << std::endl; // output: -6.27744e+66
The first statement in the above snippet has the following effects:
1. An unnamed object of type A is created on the heap by new A, using the default constructor A::A() synthesized by the compiler. We also get a pointer to this unnamed object as the result.
2. Next, the pointer from step 1 is used as the initializer for pa1. That is, a pointer to A named pa1 is created and initialized with the pointer to the unnamed object.
Since the default constructor was used (see step 1), the data member r of the unnamed object is default-initialized. Because r is of built-in type, it has an indeterminate value. Reading this uninitialized data member, as the second statement of the snippet does, is undefined behavior. This is why you get some garbage value as output.
Case 4
Here we consider the statements:
A* pa2 = new A();
std::cout << pa2->r << std::endl;
The first statement of the above snippet has the following effects:
1. An unnamed object of type A is created by the expression new A(). This time you used parentheses (), and class A does not have a user-provided default constructor, so value-initialization happens. For such a class this means the object, including its data member r, is zero-initialized. This is why you get 0 as output in the second statement. A pointer to this unnamed object is returned as the result.
2. Next, a pointer to A named pa2 is created and initialized with the pointer to the unnamed object from step 1.
Exactly the same thing happens with the next four statements, which concern class B, so I am not discussing them separately; we would learn nothing new from them.
Now we will consider the statements related to class C. We are not skipping these four statements, because class C has a user-provided default constructor.
Statement 5
Here we consider the statements:
C c1;
std::cout << c1.r << std::endl;
The first statement of the above snippet creates a variable named c1 of type C using the user-provided default constructor C::C(). Since this constructor doesn't do anything, the data member r is left uninitialized and we get the same behavior as we discussed for A a1;. That is, using this uninitialized variable, as the second statement does, is undefined behavior.
Statement 6
Here we consider the statements:
C c2();
std::cout << c2.r << std::endl;
The first statement in the above snippet is a function declaration. Thus you'll get the same behavior/error that we got for class A.
Statement 7
Here we consider the statements:
C* pc1 = new C;
std::cout << pc1->r << std::endl;
The first statement in the above snippet has the following effects:
1. An unnamed object of type C is created on the heap by the expression new C, using the user-provided default constructor C::C(). Since that constructor does nothing, the data member r is left uninitialized. We also get a pointer to this unnamed object as the result.
2. Next, a pointer to C named pc1 is created and initialized with the pointer to the unnamed object from step 1.
Now the second statement in the above snippet, uses uninitialized data member r which is undefined behavior and explains why you are getting some garbage value as output.
Statement 8
Here we consider the statements:
C* pc2 = new C();
std::cout << pc2->r << std::endl;
The first statement of the above snippet has the following effects:
1. An unnamed object of type C is created on the heap by new C(). Since you have specified parentheses (), this requests value-initialization. But because this time we have a user-provided default constructor, value-initialization is the same as default-initialization, which is done by calling that constructor. Since the user-provided default constructor does nothing, the data member r is left uninitialized. We also get a pointer to the unnamed object as the result.
2. Next, a pointer to C named pc2 is created and initialized with the pointer to the unnamed object from step 1.
Now the second statement in the above snippet, uses uninitialized data member r which is undefined behavior and explains why you are getting some garbage value as output.
A a2();, B b2(); and C c2(); could also be parsed as declarations of functions returning A/B/C with empty parameter list. This interpretation is preferred and so you are declaring functions, not variables. This is also known as the "most vexing parse" issue.
None of the default constructors (including the implicit one of A) initializes r, so after default-initialization it has an indeterminate value. Reading that value causes undefined behavior.
The exceptions are A* pa2 = new A(); and B* pb2 = new B();. The () initializer does value-initialization. In these two cases the effect of value-initialization is that the whole object is zero-initialized, because both A and B have a default constructor that is not user-provided. (A constructor defaulted on its first declaration doesn't count as user-provided.)
In case of C this doesn't apply, because C's default constructor is user-provided and therefore value-initialization will only result in default-initialization, calling the default constructor, which doesn't initialize r.
MyType name(); // This is treated as a function declaration
MyType name{}; // This is the correct way

C++ what is this little used constructor syntax?

EDIT: I don't think this a duplicate of this other question, because the other question simply transposes () for {} in the constructors. Whereas I note different behavior when a constructor is defined in a struct, but not in a class. (And, as pointed out in the comments, this is about using constructors not writing them.) But I've been wrong before.
I came across this strange (to me) syntax for a constructor while tutoring:
Foo obj {i, j};
At first I thought it wouldn't work, and told the student to rewrite it – however they were adamant it worked, and informed me they pulled the example from cplusplus.com, to which I've not been able to find a reference, so I tried it anyway... And it worked. So I experimented with it.
I also researched a bit on my own, and found no reference to that kind of constructor syntax on cplusplus.com. (Maybe it has a specific name?)
Here's what I did to experiment with it.
struct Note { //A musical note.
std::string name;
double freq;
//Note(std::string s, double f): name(s), freq(f){}
//Uncomment the constructor in order to use normal constructor syntax.
};
class Journal {
public:
std::string title;
std::string message;
int idNum;
};
int main() {
Note a { "A", 440.0}; //Works with or without a constructor.
//Note a("A",440.0); //Works ***only*** with a defined constructor.
//Journal journal("hello, world", "just me, a class", 002); //Works regardless of constructor definition.
Journal journal {"hello, world", "just me, a class", 003}; //Works regardless of constructor definition.
std::cout << a.name << " " << a.freq << std::endl;
std::cout << journal.title << " " << journal.message << " " << journal.idNum << std::endl;
return 0;
}
I found that it works with structs and classes regardless if they have a defined constructor.
Obviously default constructors are at work, but this confused me because of a few reasons:
I've never seen this syntax before (not surprising, C++ is huge)
It actually works
Works regardless of a constructor being defined, so may be default behavior
My question is:
Is there a name and specific purpose for this syntax which sets it apart from regular constructor behavior, and if so, why use it (or not)?
Things have changed since the introduction of C++11, and you should probably read about list initialization.
int x (0); // Direct initialization with parentheses
int x {0}; // List initialization (often called uniform initialization)
Before its introduction you had various initialization cases:
objects by calling the usual () constructor (and watch out for most vexing parse if no arguments are present)
aggregate classes or arrays with {}
default constructing with no braces
Now you can use uniform initialization in all of them.
It has to be noted that the two aren't really interchangeable since
std::vector<int> v (100); // 100 elements
std::vector<int> v {100}; // one element of value 100
and you have to pay attention to the fact that if the type has an initializer list constructor, it will take precedence in overload resolution.
That said, uniform initialization can be quite handy and safe (preventing narrowing conversions).
That syntax is called "list initialization". You can read more about it in Section 8.5.4 of C++14.
I would guess that this syntax mainly exists to be backwards compatible with C, which had syntax for initializing structs and arrays that looked very similar.

can a C++ function return an object with a constructor and a destructor

I'm trying to establish whether it is safe for a C++ function to return an object that has a constructor and a destructor. My understanding of the standard is that it ought to be possible, but my tests with simple examples show that it can be problematic. For example the following program:
#include <iostream>
using namespace std;
struct My
{ My() { cout << "My constructor " << endl; }
~My() { cout << "My destructor " << endl; }
};
My function() { My my; cout << "My function" << endl; return my; }
int main()
{ My my = function();
return 0;
}
gives the output:
My constructor
My function
My destructor
My destructor
when compiled on MSVC++, but when compiled with gcc gives the following output:
My constructor
My function
My destructor
Is this a case of "undefined behavior", or is one of the compilers not behaving in a standard way? If the latter, which ? The gcc output is closer to what I would have expected.
To date, I have been designing my classes on the assumption that for each constructor call there will be at most one destructor call, but this example seems to show that this assumption does not always hold, and can be compiler-dependent. Is there anything in the standard that specifies what should happen here, or is it better to avoid having functions return non-trivial objects ? Apologies if this question is a duplicate.
In both cases, the compiler generates a copy constructor for you, that has no output so you won't know if it is called: See this question.
In the first case the compiler generated copy constructor is used, which matches the second destructor call. The line return my; calls the copy constructor, giving it the variable my to be used to construct the return value. This doesn't generate any output.
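The local my is then destroyed. The return value outlives the call: in the code shown it initializes my in main and is destroyed when main ends (had the result been discarded, as in a bare function(); statement, it would be destroyed at the end of that statement).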
In the second case, the copy for the return is elided completely (the compiler is allowed to do this as an optimisation). You only ever have one My instance. (Yes, it is allowed to do this even though it changes the observable behaviour of your program!)
Both behaviours are OK. As a general rule, though, if you define your own constructor and destructor, you should also define your own copy constructor (and assignment operator, and possibly move constructor and move assignment if you have C++11).
Try adding your own copy constructor and see what you get. Something like
My (const My& otherMy) { cout << "My copy constructor\n"; }
The problem is that your class My violates the Rule of Three; if you write a custom destructor then you should also write a custom copy constructor (and copy assignment operator, but that's not relevant here).
With:
struct My
{ My() { cout << "My constructor " << endl; }
My(const My &) { cout << "My copy constructor " << endl; }
~My() { cout << "My destructor " << endl; }
};
the output for MSVC is:
My constructor
My function
My copy constructor
My destructor
My destructor
As you can see, (copy) constructors match with destructors correctly.
The output under gcc is unchanged, because gcc is performing copy elision as allowed (but not required) by the standard.
You are missing two things here: the copy constructor and NRVO.
The behavior seen with MSVC++ is the "normal" behavior; my is created and the rest of the function is run; then, when returning, a copy of your object is created. The local my object is destroyed, and the copy is returned to the caller, where it is destroyed in turn (at the end of main in the code shown, or at the end of the statement if the result is discarded).
Why does it seem that you are missing a constructor call? Because the compiler automatically generated a copy constructor, which is called but doesn't print anything. If you added your own copy constructor:
My(const My& Right) { cout << "My copy constructor " << endl; }
you'd see
My constructor       <----+
My function               |  this is the local "my" object
My copy constructor  <----|--+
My destructor        <----+  |  this is the return value
My destructor        <-------+
So the point is: it's not that there are more calls to destructors than constructors, it's just that you are not seeing the call to the copy constructor.
In the gcc output, you are also seeing NRVO applied.
NRVO (Named Return Value Optimization) is one of the few cases where the compiler is allowed to perform an optimization that alters the visible behavior of your program. In fact, the compiler is allowed to elide the copy to the temporary return value, and construct the returned object directly, thus eliding temporary copies.
So, no copy is created, and my is actually the same object that is returned.
My constructor <-- called at the beginning of function
My function
My destructor <-- called after function has returned, when the caller is done with the returned object
To date, I have been designing my classes on the assumption that for each constructor call there will be at most one destructor call [...]
You can still assume that, since it is true: each constructor call goes hand in hand with exactly one destructor call. (Keep this in mind if you manage free-store (heap) memory yourself.)
[..] and can be compiler-dependent [...]
In this case it isn't; it is optimization-dependent. Both MSVC and GCC behave identically if optimization is applied.
Why don't you see identical behaviour?
1. You don't track everything that happens with your object. Compiler-generated functions bypass your output.
If you want to "follow-up" on the things your compiler does with your objects, you should define all of the special members so you can really track everything and do not get bypassed by any implicit function.
struct My
{
    My() { cout << "My constructor " << endl; }
    My(My const&) { cout << "My copy-constructor " << endl; }
    My(My &&) { cout << "My move-constructor " << endl; }
    My& operator=(My const&) { cout << "My copy-assignment " << endl; return *this; }
    My& operator=(My &&) { cout << "My move-assignment " << endl; return *this; }
    ~My() { cout << "My destructor " << endl; }
};
[Note: The move constructor and move assignment operator are not implicitly declared once you declare the copy operations, but it is still useful to see which one the compiler uses when.]
2. You don't compile with optimization on both MSVC and GCC.
If compiled with MSVC++11 /O2 option the output is:
My constructor
My function
My destructor
If compiled in debug mode / without optimization:
My constructor
My function
My move-constructor
My destructor
My destructor
I can't do a test on gcc to verify if there's an option that enforces all of these steps but -O0 should do the trick I guess.
What's the difference between optimized and non-optimized compilation here?
The case without any copy elision:
The completely "non-optimized" behaviour of the line My my_in_main = function(); (I changed the name to make things clear) would be:
1. Call function().
2. In function, construct My my;.
3. Output stuff.
4. Copy-construct my into the return value instance.
5. Return and destroy the my instance.
6. Copy-construct (or move-construct, in my example) the return value instance into my_in_main.
7. Destroy the return value instance.
As you can see: we have at most two copies (or one copy and one move) here but the compilers may possibly omit them.
To my understanding, the first copy is omitted even without optimization turned on (in this case), leaving the process as follows:
Call function().
In function, construct My my;. (First constructor output.)
Output stuff. (Function output.)
Copy-construct (or move-construct, in my example) the return value instance into my_in_main. (Move output.)
Destroy the return value instance. (Destructor output.)
my_in_main is destroyed at the end of main, giving the last destructor output. So now we know what happens in the non-optimized case.
Copy elision
The copy (or move if the class has a move constructor as in my example) can be elided.
§ 12.8 [class.copy] / 31
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the copy/move constructor and/or destructor for the object have side effects.
So now the question is: when does this happen in this example? The reason for the elision of the first copy is given in the very same paragraph:
[...] in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value.
Return type matches type in the return statement: function will construct My my; directly into the return value.
The reason for the elision of the second copy/move:
[...] when a temporary class object that has not been bound to a reference (12.2) would be copied/moved to a class object with the same cv-unqualified type, the copy/move operation can be omitted by constructing the temporary object directly into the target of the omitted copy/move.
Target type matches the type returned by the function: The return value of the function will be constructed into my_in_main.
So you have a cascade here:
My my; in your function is constructed directly into the return value, which in turn is constructed directly into my_in_main. So you in fact have only one object here, and function() (whatever it does) operates on the object my_in_main.
Call function().
In function, construct the My instance into my_in_main. (Constructor output.)
Output stuff. (Function output.)
my_in_main is still destroyed at the end of main, giving a destructor output.
That makes three outputs in total: Those you observe if optimization is turned on.
An example where elision is not possible.
In the following example both copies mentioned above cannot be omitted because the class types do not match:
The return statement does not match the return type
The target type does not match the return type of the function
I just created two additional types:
#include <iostream>
using namespace std;
struct A
{
A(void) { cout << "A constructor " << endl; }
~A(void) { cout << "A destructor " << endl; }
};
struct B
{
B(A const&) { cout << "B copy from A" << endl; }
~B(void) { cout << "B destructor " << endl; }
};
struct C
{
C(B const &) { cout << "C copy from B" << endl; }
~C(void) { cout << "C destructor " << endl; }
};
B function() { A my; cout << "function" << endl; return my; }
int main()
{
C my_in_main(function());
return 0;
}
Here we have the "completely non-optimized behaviour" I mentioned above. I'll refer to the points I've drawn there.
A constructor (see 2.)
function (see 3.)
B copy from A (see 4.)
A destructor (see 5.)
C copy from B (see 6.)
B destructor (see 7.)
C destructor (instance in main, destroy at end of main)

C++ redefinition of 'name' with a different type compiler warning

Why does the compiler complain about redefinition of 'reader' with a different type when I try to pass an fstream object from main() into another class's constructor for it to be read? I am aware that this may be a dumb way of doing it; I should really just take a string parameter for the filename and pass that to an fstream that I create in the class's constructor. But anyway, I am wondering why this doesn't work; the compiler warning is cryptic.
My main function:
fstream reader;
reader.open("read.txt");
Markov(reader);
Constructor in Markov.h class:
class Markov {
public:
/** Constructor */
Markov(fstream inStream) {
Map<string, Vector<string> > array0;
char ch;
while (inStream.good())
{
ch = inStream.get();
cout << ch << endl;
}
cout << "End: " << ch;
order0(array0);
}
The line Markov(reader); is creating a variable called reader of type Markov. It is equivalent to the following: Markov reader;. Of course, since the compiler thinks you're declaring another variable called reader, it throws up this error. To create an instance of Markov, do this:
Markov m(reader);
This is an ambiguity in the grammar of C++ that is always taken as a declaration of a variable, rather than the construction of a temporary. In fact, you can have as many parentheses as you like around the variable name in your declaration: Markov (((((reader))))).
Markov(reader) is of course perfectly fine syntax for creating a temporary of type Markov, as long as it isn't in a statement that could be parsed as a declaration. For example, if it's in the middle of an expression, you'll be okay. In the contrived expression something += Markov(reader) - 6, it can't be interpreted as a declaration.
Likewise, if there is more than one argument being passed to the constructor, Markov(reader, writer), or if the single argument is not an identifier, Markov("foo"), it is not ambiguous.
If you're using a C++11 compiler, you can indeed create a temporary (although I see no reason to do it) that takes a single argument identifier using the new initialization syntax:
Markov{reader};
You may want to pass that fstream by reference.
Markov(fstream& inStream)
And while you're at it, if you're only using it for input services, use an ifstream& instead.

Understanding C++ reconstruct syntax

Can we call an object's constructor again after it is created?
#include <iostream>
struct A
{
A ( ) { std::cout << "A::A" << std::endl; }
~A ( ) { std::cout << "A::~A" << std::endl; }
};
int main( )
{
A a;
a.~A(); // OK
a.A::A(); // OK in Visual Studio 2005, 2008, 2010
return 0;
}
You shouldn't be able to call the constructor like this, as a member function call. The reason is (n3242, 12.1/2):
A constructor is used to initialize objects of its class type. Because constructors do not have names, they are never found during name lookup; however an explicit type conversion using the functional notation (5.2.3) will cause a constructor to be called to initialize an object.
If you really want to run a constructor on storage that is supposed to hold an object (and you shouldn't, except in very special cases), you can use placement new, which calls the constructor:
new (&a) A();
Well, a.A() fails to compile because you simply cannot call a constructor in C++. (You can invoke it indirectly, however through several means.) For the same reason, I think a.A::A() should not compile.