I am trying to initialize a rvalue reference member in the following situations: struct A is a aggregate and class B has a user defined constructor so it is not an aggregate. According to cppreference here,
struct A {
int&& r;
};
A a1{7}; // OK, lifetime is extended
A a2(7); // well-formed, but dangling reference
The reference in my struct A should be correctly initialized and the temporary string should be extended, but it's not.
For my class B, in the same cppreference page:
a temporary bound to a reference member in a constructor initializer list persists only until the constructor exits, not as long as the object exists. (note: such initialization is ill-formed as of DR 1696).
(until C++14)
But I am still getting problem with MSVC with /std:c++latest. Am I missing anything?
struct A
{
std::string&& ref;
};
class B
{
std::string&& ref;
public:
B(std::string&& rref):ref(std::move(rref)){}
void print() {std::cout<<ref<<'\n';}
};
int main()
{
A a{ std::string{"hello world"} };
std::cout<<a.ref<<'\n'; //garbage in MSVC with c++latest, correct in GCC9.2
B b{ std::string{"hello world"} };
b.print(); //same
}
EDIT: Please tell me if I am getting dangling reference in these two cases in the first place. And for MSVC, the version is the latest update, v19.24.28315
EDIT2: OK I am actually confused by cppreference's explanation. I am not asking specificly for C++20. Please tell me under which C++ version, the code is well-formed or not.
EDIT3: Is there a proper way to initialize a rvalue reference member of a non-aggregate class? Since bind it to a temporary in a member initializer list is ill-formed (until C++14, does it mean it is good since C++14?) and passing a temporary to a constructor expecting an rvalue cannot extend its lifetime twice.
Your class A is an aggregate type, but B isn't, because it has a user-provided constructor and a private non-static member.
Therefore A a{ std::string{"hello world"} }; is aggregate initialization which does extend the lifetime of the temporary through reference binding to that of a.
On the other hand B is not aggregate initialization. It calls the user-provided constructor, which passes on the reference. Passing on references does not extend the lifetime of the temporary. The std::string object will be destroyed when the constructor of B exits.
Therefore the later use of a has well-defined behavior, while that of b will have undefined behavior.
This holds for all C++ versions since C++11 (before that the program would be obviously ill-formed).
If your compiler is printing garbage for a (after removing b from the program so that it doesn't have undefined behavior anymore), then this is a bug in the compiler.
Regarding edit of question:
There is no way to extend the lifetime of a temporary through binding to a reference member of a non-aggregate class.
Relying on this lifetime extension at all is very dangerous, since you would likely not get any error or warning if you happen to make the class non-aggregate in the future.
If you want the class to always retain the object until its lifetime ends, then just store it by-value instead of by-reference:
class B
{
std::string str;
public:
B(std::string str) : str(std::move(str)) {}
void print() { std::cout << str << '\n'; }
};
Related
I am surprised to accidentally discover that the following works:
#include <iostream>
int main(int argc, char** argv)
{
struct Foo {
Foo(Foo& bar) {
std::cout << &bar << std::endl;
}
};
Foo foo(foo); // I can't believe this works...
std::cout << &foo << std::endl; // but it does...
}
I am passing the address of the constructed object into its own constructor. This looks like a circular definition at the source level. Do the standards really allow you to pass an object into a function before the object is even constructed or is this undefined behavior?
I suppose it's not that odd given that all class member functions already have a pointer to the data for their class instance as an implicit parameter. And the layout of the data members is fixed at compile time.
Note, I'm NOT asking if this is useful or a good idea; I'm just tinkering around to learn more about classes.
This is not undefined behavior. Although foo is uninitialized, you are using it a way that is allowed by the standard. After space is allocated for an object but before it is fully initialized, you are allowed to use it limited ways. Both binding a reference to that variable and taking its address are allowed.
This is covered by defect report 363: Initialization of class from self which says:
And if so, what is the semantics of the self-initialization of UDT?
For example
#include <stdio.h>
struct A {
A() { printf("A::A() %p\n", this); }
A(const A& a) { printf("A::A(const A&) %p %p\n", this, &a); }
~A() { printf("A::~A() %p\n", this); }
};
int main()
{
A a=a;
}
can be compiled and prints:
A::A(const A&) 0253FDD8 0253FDD8
A::~A() 0253FDD8
and the resolution was:
3.8 [basic.life] paragraph 6 indicates that the references here are valid. It's permitted to take the address of a class object before it is fully initialized, and it's permitted to pass it as an argument to a reference parameter as long as the reference can bind directly. Except for the failure to cast the pointers to void * for the %p in the printfs, these examples are standard-conforming.
The full quote of section 3.8 [basic.life] from the draft C++14 standard is as follows:
Similarly, before the lifetime of an object has started but after the
storage which the object will occupy has been allocated or, after the
lifetime of an object has ended and before the storage which the
object occupied is reused or released, any glvalue that refers to the
original object may be used but only in limited ways. For an object
under construction or destruction, see 12.7. Otherwise, such a glvalue
refers to allocated storage (3.7.4.2), and using the properties of the
glvalue that do not depend on its value is well-defined. The program
has undefined behavior if:
an lvalue-to-rvalue conversion (4.1) is applied to such a glvalue,
the glvalue is used to access a non-static data member or call a non-static member function of the
object, or
the glvalue is bound to a reference to a virtual base class (8.5.3), or
the glvalue is used as the operand of a dynamic_cast (5.2.7) or as the operand of typeid.
We are not doing anything with foo that falls under undefined behavior as defined by the bullets above.
If we try this with Clang, we see an ominous warning (see it live):
warning: variable 'foo' is uninitialized when used within its own initialization [-Wuninitialized]
It is a valid warning since producing an indeterminate value from an uninitialized automatic variable is undefined behavior. However, in this case you are just binding a reference and taking the address of the variable within the constructor, which does not produce an indeterminate value and is valid. On the other hand, the following self-initialization example from the draft C++11 standard:
int x = x ;
does invoke undefined behavior.
Active issue 453: References may only bind to “valid” objects also seems relevant but is still open. The initial proposed language is consistent with Defect Report 363.
The constructor is called at a point where memory is allocated for the object-to-be. At that point, no object exists at that location (or possibly an object with a trivial destructor). Furthermore, the this pointer refers to that memory and the memory is properly aligned.
Since it's allocated and aligned memory, we may refer to it using lvalue expressions of Foo type (i.e. Foo&). What we may not yet do is have an lvalue-to-rvalue conversion. That's only allowed after the constructor body is entered.
In this case, the code just tries to print &bar inside the constructor body. It would even be legal to print bar.member here. Since the constructor body has been entered, a Foo object exists and its members may be read.
This leaves us with one small detail, and that's name lookup. In Foo foo(foo), the first foo introduces the name in scope and the second foo therefore refers back to the just-declared name. That's why int x = x is invalid, but int x = sizeof(x) is valid.
As said in other answers, an object can be initialized with itself as long as you do not use its values before they are initialized. You can still bind the object to a reference or take its address.
But beyond the fact that it is valid, let's explore a usage example.
The example below might be controversial, you can surely propose many other ideas for implementing it. And yet, it presents a valid usage of this strange C++ property, that you can pass an object into its own constructor.
class Employee {
string name;
// manager may change so we don't hold it as a reference
const Employee* pManager;
public:
// we prefer to get the manager as a reference and not as a pointer
Employee(std::string name, const Employee& manager)
: name(std::move(name)), pManager(&manager) {}
void modifyManager(const Employee& manager) {
// TODO: check for recursive connection and throw an exception
pManager = &manager;
}
friend std::ostream& operator<<(std::ostream& out, const Employee& e) {
out << e.name << " reporting to: ";
if(e.pManager == &e)
out << "self";
else
out << *e.pManager;
return out;
}
};
Now comes the usage of initializing an object with itself:
// it is valid to create an employee who manages itself
Employee jane("Jane", jane);
In fact, with the given implementation of class Employee, the user has no other choice but to initialize the first Employee ever created, with itself as its own manager, as there is no other Employee yet that can be passed. And in a way that makes sense, as the first employee created should manage itself.
Code: http://coliru.stacked-crooked.com/a/9c397bce622eeacd
I have the following questions related to the same situation (not in general):
Why does the compiler not produce a warning when a temporary is bound
to a reference?
How does lifetime extension of a temporary work (when it is bound to
a reference)?
How to interpret / understand the C++ standard (C++11)?
Is this a bug (in the compiler)? Should it be?
So this is what we are talking about:
struct TestRefInt {
TestRefInt(const int& a) : a_(a) {};
void DoSomething() { cout << "int:" << a_ << endl; }
protected:
const int& a_;
};
Should TestRefInt tfi(55); and tfi.DoSomething(); work? How and where?
So here is a code
TestRefInt tfi(55);
int main() {
TestRefInt ltfi(8);
tfi.DoSomething();
ltfi.DoSomething();
return 0;
}
What should this do?
Here I will elaborate some more.
Consider this real world (simplified) example. How does it look like to a novice/beginner C++ programmer? Does this make sense?
(you can skip the code the issue is the same as above)
#include <iostream>
using namespace std;
class TestPin //: private NonCopyable
{
public:
constexpr TestPin(int pin) : pin_(pin) {}
inline void Flip() const {
cout << " I'm flipping out man : " << pin_ << endl;
}
protected:
const int pin_;
};
class TestRef {
public:
TestRef(const TestPin& a) : a_(a) {};
void DoSomething() { a_.Flip(); }
protected:
const TestPin& a_;
};
TestRef g_tf(1);
int main() {
TestRef tf(2);
g_tf.DoSomething();
tf.DoSomething();
return 0;
}
Command line:
/** Compile:
Info: Internal Builder is used for build
g++ -std=c++11 -O0 -g3 -Wall -Wextra -Wconversion -c -fmessage-length=0 -o "src\\Scoping.o" "..\\src\\Scoping.cpp"
g++ -o Scoping.exe "src\\Scoping.o"
13:21:39 Build Finished (took 346ms)
*/
Output:
/**
I'm flipping out man : 2293248
I'm flipping out man : 2
*/
The issue:
TestRef g_tf(1); /// shouldn't the reference to temporary extend it's life in global scope too?
What does the standard says?
From How would auto&& extend the life-time of the temporary object?
Normally, a temporary object lasts only until the end of the full
expression in which it appears. However, C++ deliberately specifies
that binding a temporary object to a reference to const on the stack
lengthens the lifetime of the temporary to the lifetime of the
reference itself
^^^^^
Globals are not allocated on stack, so lifetime is not extended. However, the compiler does not produce any warning message whatsoever! Shouldn’t it at least do that?
But my main point is: from a usability standpoint (meaning as the user of C++, GCC as a programmer) it would be useful if the same principle would be extended to not just stack, but to global scope, extending the lifetime (making global temporaries “permanent”).
Sidenote:
The problem is further complicated by the fact that the TestRef g_tf(1); is really TestRef g_tf{ TestPin{1} }; but the temporary object is not visible in the code, and without looking at the constructor declaration it looks like the constructor is called by an integer, and that rarely produces this kind of error.
As far as I know the only way to fix the problem is to disallow temporaries in initialization by deleting TestRefInt(const int&& a) =delete; constructor. But this also disallows it on stack, where lifetime extension worked.
But the previous quote is not exactly what the C++11 standard say. The relevant parts of the C++11 Standard are 12.2 p4 and p5:
4 - There are two contexts in which temporaries are destroyed at a
different point than the end of the full expression. The first context
is [...]
5 - The second context is when a reference is bound to a
temporary. The temporary to which the reference is bound or the
temporary that is the complete object of a subobject to which the
reference is bound persists for the lifetime of the reference except:
— A temporary bound to a reference member in a constructor’s
ctor-initializer (12.6.2) persists until the constructor exits.
— A temporary bound to a reference parameter in a function call (5.2.2)
persists until the completion of the full-expression containing the
call. § 12.2 245 c ISO/IEC N3337
— The lifetime of a temporary bound
to the returned value in a function return statement (6.6.3) is not
extended; the temporary is destroyed at the end of the full-expression
in the return statement.
— A temporary bound to a reference in a
new-initializer (5.3.4) persists until the completion of the
full-expression containing the new-initializer. [Example: struct S {
int mi; const std::pair& mp; }; S a { 1, {2,3} }; S* p = new S{ 1,
{2,3} }; // Creates dangling reference — end example ] [ Note: This
may introduce a dangling reference, and implementations are encouraged
to issue a warning in such a case. — end note ]
My English and understanding of the standard is not good enough, so what exception is this?
The “— A temporary bound to a reference member in a constructor’s ctor-initializer (12.6.2) persists until the constructor exits.” or “reference parameter in a function call” does not mention anything about allocation on stack or in global scope. The other exceptions do not seem to apply. Am I wrong?
Which one is this, or none of them?
We have a temporary in a function call (the constructor) and then a reference to
the temporary is bound to a member reference in the initializer. Is this undefined behavior? Or the exception still applies, and the temporary should be destroyed? (or both)
What about this?
struct TestRefIntDirect {
TestRefIntDirect(int a) : a_(a) {};
void DoSomething() { cout << "int:" << a_ << endl; }
protected:
const int& a_;
};
One less reference, same behavior.
Why does it work in one case (instantiation inside a function) versus in other case (in global scope)?
Does GCC not “destroy” one of them by “accident”?
My understanding is that none of them should work, the temporary should not persist as the standard says. It seems GCC just "lets" you access not persisting objects (sometimes). I guess that the standard does not specify what the compiler should warn about, but can we agree that it should? (in other cases it does warn about ‘returning reference to temporary’) I think it should here too.
Is this a bug or maybe there should be a feature request somewhere?
It seems to me like GCC says “Well, you shouldn’t touch this cookie, but I will leave it here for you and not warn anyone about it.” ...which I don’t like.
This thing others reference about stack I did not find in the standard, where is it coming from? Is this misinformation about the standard? Or is this a consequence of something in the standard? (Or is this just implementation coincidence, that the temporary is not overwritten and can be referenced, because it happened to be on the stack? Or is this compiler defined behavior, extension?)
The more I know C++ the more it seems that every day it is handing me a gun to shoot myself in the foot with it…
I would like to be warned by the compiler if I’m doing something wrong, so I can fix it. Is that too much to ask?
Other related posts:
Temporary lifetime extension
C++: constant reference to temporary
Does a const reference prolong the life of a temporary?
Returning temporary object and binding to const reference
I didn’t want to write this much. If you read it all, thanks.
This is a follow-up of these questions.
Consider the following code:
struct A {
private:
A* const& this_ref{this};
};
int main() {
A a{};
(void)a;
}
If compiled with the -Wextra, both GCC v6.2 and clang v3.9 show a warning.
Anyway, with the slightly modified version shown below they behave differently:
struct A {
A* const& this_ref{this};
};
int main() {
A a{};
(void)a;
}
In this case GCC doesn't give any warning, clang gives the same warning as returned in the previous example.
The warnings are almost the identical.
It follows the one from clang:
3 : warning: binding reference member 'this_ref' to a temporary value [-Wdangling-field]
Which compiler is right?
I would say that GCC is wrong in this case and I were opening an issue, but maybe it's the opposite because of an arcane corner case of the language.
The member declaration
A* const& this_ref{this};
binds a reference to a temporary that only exists during constructor execution (note: this is an rvalue expression).
I'm not sure if this is formally available in that context, but if it is then with any use of that pointer you have serious case of UB.
Re
” Which compiler is right?
… a compiler can issue as many diagnostics as it wants. It's not wrong to issue a diagnostic. So per your description, that both accept the code, then either both compilers are right (which I think is most likely), or both are wrong.
The reason for this warning is IMO this excerpt from standard (12.2.5):
A temporary bound to a reference member in a constructor’s ctor-initializer (12.6.2) persists until the
constructor exits.
and since the keyword this is a prvalue expression, during this_ref initialization a temporary will be created and this_ref is bound to that temporary.
But I have doubt whether your reference is actually initialized in ctor-initializer.
If you write:
struct A {
private:
const int& rr = 1+1;
};
then you will reproduce the exact same problem with gcc, removing private will also remove this warning.
From what I know this pointer might be used in the body of the non-static member function, I have never read that it could be used as argument during default member initialization.
this is prvalue, and temporary object will be created when binding reference to a prvalue, so you're binding reference member to a temporary in default member initializer.
And binding reference member to temporary in default member initializer is ill-formed, which is stated by the standard explicitly.
$12.6.2/11 Initializing bases and members
[class.base.init]:
A temporary expression bound to a reference member from a default
member initializer is ill-formed. [ Example:
struct A {
A() = default; // OK
A(int v) : v(v) { } // OK
const int& v = 42; // OK
};
A a1; // error: ill-formed binding of temporary to reference
A a2(1); // OK, unfortunately
— end example ]
And see CWG 1696, this is applied to C++14.
I am surprised to accidentally discover that the following works:
#include <iostream>
int main(int argc, char** argv)
{
struct Foo {
Foo(Foo& bar) {
std::cout << &bar << std::endl;
}
};
Foo foo(foo); // I can't believe this works...
std::cout << &foo << std::endl; // but it does...
}
I am passing the address of the constructed object into its own constructor. This looks like a circular definition at the source level. Do the standards really allow you to pass an object into a function before the object is even constructed or is this undefined behavior?
I suppose it's not that odd given that all class member functions already have a pointer to the data for their class instance as an implicit parameter. And the layout of the data members is fixed at compile time.
Note, I'm NOT asking if this is useful or a good idea; I'm just tinkering around to learn more about classes.
This is not undefined behavior. Although foo is uninitialized, you are using it a way that is allowed by the standard. After space is allocated for an object but before it is fully initialized, you are allowed to use it limited ways. Both binding a reference to that variable and taking its address are allowed.
This is covered by defect report 363: Initialization of class from self which says:
And if so, what is the semantics of the self-initialization of UDT?
For example
#include <stdio.h>
struct A {
A() { printf("A::A() %p\n", this); }
A(const A& a) { printf("A::A(const A&) %p %p\n", this, &a); }
~A() { printf("A::~A() %p\n", this); }
};
int main()
{
A a=a;
}
can be compiled and prints:
A::A(const A&) 0253FDD8 0253FDD8
A::~A() 0253FDD8
and the resolution was:
3.8 [basic.life] paragraph 6 indicates that the references here are valid. It's permitted to take the address of a class object before it is fully initialized, and it's permitted to pass it as an argument to a reference parameter as long as the reference can bind directly. Except for the failure to cast the pointers to void * for the %p in the printfs, these examples are standard-conforming.
The full quote of section 3.8 [basic.life] from the draft C++14 standard is as follows:
Similarly, before the lifetime of an object has started but after the
storage which the object will occupy has been allocated or, after the
lifetime of an object has ended and before the storage which the
object occupied is reused or released, any glvalue that refers to the
original object may be used but only in limited ways. For an object
under construction or destruction, see 12.7. Otherwise, such a glvalue
refers to allocated storage (3.7.4.2), and using the properties of the
glvalue that do not depend on its value is well-defined. The program
has undefined behavior if:
an lvalue-to-rvalue conversion (4.1) is applied to such a glvalue,
the glvalue is used to access a non-static data member or call a non-static member function of the
object, or
the glvalue is bound to a reference to a virtual base class (8.5.3), or
the glvalue is used as the operand of a dynamic_cast (5.2.7) or as the operand of typeid.
We are not doing anything with foo that falls under undefined behavior as defined by the bullets above.
If we try this with Clang, we see an ominous warning (see it live):
warning: variable 'foo' is uninitialized when used within its own initialization [-Wuninitialized]
It is a valid warning since producing an indeterminate value from an uninitialized automatic variable is undefined behavior. However, in this case you are just binding a reference and taking the address of the variable within the constructor, which does not produce an indeterminate value and is valid. On the other hand, the following self-initialization example from the draft C++11 standard:
int x = x ;
does invoke undefined behavior.
Active issue 453: References may only bind to “valid” objects also seems relevant but is still open. The initial proposed language is consistent with Defect Report 363.
The constructor is called at a point where memory is allocated for the object-to-be. At that point, no object exists at that location (or possibly an object with a trivial destructor). Furthermore, the this pointer refers to that memory and the memory is properly aligned.
Since it's allocated and aligned memory, we may refer to it using lvalue expressions of Foo type (i.e. Foo&). What we may not yet do is have an lvalue-to-rvalue conversion. That's only allowed after the constructor body is entered.
In this case, the code just tries to print &bar inside the constructor body. It would even be legal to print bar.member here. Since the constructor body has been entered, a Foo object exists and its members may be read.
This leaves us with one small detail, and that's name lookup. In Foo foo(foo), the first foo introduces the name in scope and the second foo therefore refers back to the just-declared name. That's why int x = x is invalid, but int x = sizeof(x) is valid.
As said in other answers, an object can be initialized with itself as long as you do not use its values before they are initialized. You can still bind the object to a reference or take its address.
But beyond the fact that it is valid, let's explore a usage example.
The example below might be controversial, you can surely propose many other ideas for implementing it. And yet, it presents a valid usage of this strange C++ property, that you can pass an object into its own constructor.
class Employee {
string name;
// manager may change so we don't hold it as a reference
const Employee* pManager;
public:
// we prefer to get the manager as a reference and not as a pointer
Employee(std::string name, const Employee& manager)
: name(std::move(name)), pManager(&manager) {}
void modifyManager(const Employee& manager) {
// TODO: check for recursive connection and throw an exception
pManager = &manager;
}
friend std::ostream& operator<<(std::ostream& out, const Employee& e) {
out << e.name << " reporting to: ";
if(e.pManager == &e)
out << "self";
else
out << *e.pManager;
return out;
}
};
Now comes the usage of initializing an object with itself:
// it is valid to create an employee who manages itself
Employee jane("Jane", jane);
In fact, with the given implementation of class Employee, the user has no other choice but to initialize the first Employee ever created, with itself as its own manager, as there is no other Employee yet that can be passed. And in a way that makes sense, as the first employee created should manage itself.
Code: http://coliru.stacked-crooked.com/a/9c397bce622eeacd
Class A
{
A(B& b) : mb(b)
{
// I will not access anything from B here
}
B& mb;
};
Class B
{
B(): a(*this)
{}
A a;
}
I run into such a situation may times, the contained object needs to use the containers functionality. Having a reference to the container object in the contained object seems to be the best way to do this. Of course, I could do this with a pointer, that way I could have a setter setB(B* b) {mb = b;} which I could call later after I am sure B is initialized but I would much prefer to do this with a reference which means I need to initialize it in the constructor, hence the problem.
Since you're only initializing the reference to B, this should be just fine -- by the time B's constructor runs the memory location for it has already been setup.
Do keep in mind that you can't call any methods in B safely from A's constructor because B has not finished constructing yet.
The appropriate quote from the standard is:
§3.8 [basic.life]/6
Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any lvalue which refers to the original object may be used but only in limited ways. Such an lvalue refers to allocated storage (3.7.3.2), and using the properties of the lvalue which do not depend on its value is well-defined. If an lvalue-to-rvalue conversion (4.1) is applied to such an lvalue, the program has undefined behavior; if the original object will be or was of a non-POD class type, the program has undefined behavior if:
— the lvalue is used to access a non-static data member or call a non-static member function of the object, or
— the lvalue is implicitly converted (4.10) to a reference to a base class type, or
— the lvalue is used as the operand of a static_cast(5.2.9) (except when the conversion is ultimately to char& or unsigned char&), or
— the lvalue is used as the operand of a dynamic_cast(5.2.7) or as the operand oftypeid.
It depends what you are doing in the A constructor.
The B object is not fully constructed until the constructor returns. Furthermore, before you enter the body of the B constructor, the objects within the B object may not be fully constructed. For example:
class B
{
A a;
std::string str;
public:
B() : a(*this)
{
}
};
At the time that A::A is called, str is not yet constructed. If you try to use str within A::A (either directly or indirectly), you will have undefined behavior.