Detecting unsafe const reference binding in C++

I've just spent quite a lot of time debugging an obscure memory corruption problem in one of my programs. It essentially boils down to a function which returns a structure by value being called in a way which passes it into an object constructor. Pseudocode follows.
extern SomeStructure someStructureGenerator();
class ObjectWhichUsesStructure {
public:
    ObjectWhichUsesStructure(const SomeStructure& ref): s(ref) {}
private:
    const SomeStructure& s; // dangles if ref was bound to a temporary
};
ObjectWhichUsesStructure obj(someStructureGenerator());
My reasoning was: someStructureGenerator() is returning a temporary; this is being bound to a const reference, which means the compiler is extending the lifetime of the temporary to match the place of use; I'm using it to construct an object, so the temporary lifetime is being extended to match that of the object.
That last bit turns out not to be the case. Once the constructor exits, the compiler deletes the temporary and now obj contains a reference to hyperspace, with hilarious results when I try to use it. I need to explicitly bind the const reference to the scope, like this:
const auto& ref = someStructureGenerator();
ObjectWhichUsesStructure obj(ref);
That's not the bit I'm asking about.
What I'm asking about is this: my compiler is gcc 8, I build with -Wall, and it was perfectly happy to compile the code above --- cleanly, with no warnings. My program ran happily (but incorrectly) under valgrind, likewise with no warnings.
I have no idea how many other places in my code I'm using the same idiom. What compiler tooling will detect and flag these places so that I can fix my code, and alert me if I make the same mistake in the future?

First, reference binding does “extend lifetime” here—but only to the lifetime of the constructor parameter (which is no longer than that of the temporary materialized anyway). s(ref) isn’t binding an object (since ref is, well, already a reference), so no further extension occurs.
It’s therefore possible to perform the extension you expected via aggregate initialization:
struct ObjectWhichUsesStructure {
const SomeStructure &s;
};
ObjectWhichUsesStructure obj{someStructureGenerator()};
Here there is no constructor parameter (because there is no constructor at all!) and so only the one, desired binding occurs.
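One way to observe the difference is to count live objects instead of touching the possibly dangling reference. The following is only a sketch: Tracked, makeTracked, ViaConstructor and ViaAggregate are hypothetical stand-ins for the types in the question, and it assumes a compiler that performs lifetime extension for brace-initialized aggregates as the standard describes.

```cpp
#include <cassert>

// Hypothetical stand-in for SomeStructure that counts live instances.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

Tracked makeTracked() { return Tracked(); }

// Same shape as the class in the question: the temporary binds to the
// constructor parameter, and the member is initialized from that reference.
struct ViaConstructor {
    ViaConstructor(const Tracked& ref) : s(ref) {}
    const Tracked& s;
};

// Aggregate: the member reference is bound directly to the temporary.
struct ViaAggregate {
    const Tracked& s;
};

int liveAfterConstructor() {
    ViaConstructor obj(makeTracked());
    (void)obj;              // obj.s is never read here: it already dangles
    return Tracked::live;   // the temporary died at the end of the declaration
}

int liveAfterAggregate() {
    ViaAggregate obj{makeTracked()};
    (void)obj;
    return Tracked::live;   // the temporary lives as long as obj
}

void checkLifetimes() {
    assert(liveAfterConstructor() == 0);
    assert(liveAfterAggregate() == 1);
    assert(Tracked::live == 0);  // everything is cleaned up afterwards
}
```

Counting instances keeps the demonstration free of undefined behavior, since the dangling member is never dereferenced.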
It’s worth seeing why the compiler doesn’t warn about this: even if a constructor does retain a reference to a temporary argument, there are legitimate situations where that works:
void useWrapper(ObjectWhichUsesStructure);
void f() {useWrapper(someStructureGenerator());}
Here the SomeStructure lives until the end of the statement, during which time useWrapper can make profitable use of the reference in the ObjectWhichUsesStructure.
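A minimal sketch of that legitimate case follows. The member value, the body of someStructureGenerator and the return type of useWrapper are assumptions made up for illustration; only the shape of the call comes from the example above.

```cpp
#include <cassert>

struct SomeStructure { int value; };

// Assumed body, for illustration only.
SomeStructure someStructureGenerator() {
    SomeStructure s;
    s.value = 7;
    return s;
}

struct ObjectWhichUsesStructure {
    ObjectWhichUsesStructure(const SomeStructure& ref) : s(ref) {}
    const SomeStructure& s;
};

// Reads through the stored reference while the temporary is still alive.
int useWrapper(ObjectWhichUsesStructure w) { return w.s.value; }

int f() {
    // The SomeStructure temporary and the wrapper built from it both live
    // until the end of this full-expression, so the read inside useWrapper
    // is well defined.
    return useWrapper(someStructureGenerator());
}

void checkValidUse() {
    assert(f() == 7);
}
```

The wrapper never outlives the statement that created the temporary, which is exactly what separates this use from the one in the question.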
At the expense of forbidding the valid use cases above, you can have the compiler trap the problematic case by providing a deleted constructor taking an rvalue reference:
struct ObjectWhichUsesStructure {
ObjectWhichUsesStructure(const SomeStructure& ref): s(ref) {}
ObjectWhichUsesStructure(SomeStructure&&)=delete;
const SomeStructure& s;
};
This might be worth doing temporarily as a diagnostic measure without having it be a permanent restriction.
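The effect of the deleted overload can be checked at compile time with type traits. This is a sketch under assumptions: Guarded and the value member are hypothetical names, and the static_asserts state what overload resolution should select rather than anything a particular compiler warns about.

```cpp
#include <cassert>
#include <type_traits>

struct SomeStructure { int value; };

struct Guarded {
    Guarded(const SomeStructure& ref) : s(ref) {}
    Guarded(SomeStructure&&) = delete;   // rejects non-const temporaries
    const SomeStructure& s;
};

// Named objects (lvalues) are still accepted.
static_assert(std::is_constructible<Guarded, SomeStructure&>::value,
              "lvalues must be accepted");

// A non-const temporary selects the deleted overload, so construction fails.
static_assert(!std::is_constructible<Guarded, SomeStructure>::value,
              "temporaries must be rejected");

// Caveat: a const rvalue cannot bind to SomeStructure&&, so it still reaches
// the const& overload. Deleting Guarded(const SomeStructure&&) instead would
// close that gap as well.
static_assert(std::is_constructible<Guarded, const SomeStructure>::value,
              "const rvalues slip past a non-const && overload");

int readThroughGuarded() {
    SomeStructure named;
    named.value = 42;
    Guarded g(named);          // fine: named outlives g
    assert(&g.s == &named);    // the member refers to the named object
    return g.s.value;
}
```

Temporarily adding the deleted overload and rebuilding turns every call site that passes a temporary into a compile error, which gives a list of places to review.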

Related

Passing an argument value from constructor to a function that takes a reference [duplicate]

Why is it not allowed to get non-const reference to a temporary object,
which function getx() returns? Clearly, this is prohibited by C++ Standard
but I am interested in the purpose of such restriction, not a reference to the standard.
struct X
{
X& ref() { return *this; }
};
X getx() { return X();}
void g(X & x) {}
int f()
{
const X& x = getx(); // OK
X& x = getx(); // error
X& x = getx().ref(); // OK
g(getx()); //error
g(getx().ref()); //OK
return 0;
}
It is clear that the lifetime of the object cannot be the cause, because
constant reference to an object is not prohibited by C++ Standard.
It is clear that the temporary object is not constant in the sample above, because calls to non-constant functions are permitted. For instance, ref() could modify the temporary object.
In addition, ref() allows you to fool the compiler and get a link to this temporary object and that solves our problem.
In addition:
They say "assigning a temporary object to the const reference extends the lifetime of this object" and " Nothing is said about non-const references though".
My additional question. Does following assignment extend the lifetime of temporary object?
X& x = getx().ref(); // OK
From this Visual C++ blog article about rvalue references:
... C++ doesn't want you to accidentally
modify temporaries, but directly
calling a non-const member function on
a modifiable rvalue is explicit, so
it's allowed ...
Basically, you shouldn't try to modify temporaries for the very reason that they are temporary objects and will die any moment now. The reason you are allowed to call non-const methods is that, well, you are welcome to do some "stupid" things as long as you know what you are doing and you are explicit about it (like, using reinterpret_cast). But if you bind a temporary to a non-const reference, you can keep passing it around "forever" just to have your manipulation of the object disappear, because somewhere along the way you completely forgot this was a temporary.
If I were you, I would rethink the design of my functions. Why is g() accepting reference, does it modify the parameter? If no, make it const reference, if yes, why do you try to pass temporary to it, don't you care it's a temporary you are modifying? Why is getx() returning temporary anyway? If you share with us your real scenario and what you are trying to accomplish, you may get some good suggestions on how to do it.
Going against the language and fooling the compiler rarely solves problems - usually it creates problems.
Edit: Addressing questions in comment:
1) `X& x = getx().ref(); // OK when will x die?` - I don't know and I don't care, because this is exactly what I mean by "going against the language". The language says "temporaries die at the end of the statement, unless they are bound to const reference, in which case they die when the reference goes out of scope". Applying that rule, it seems x is already dead at the beginning of the next statement, since it's not bound to const reference (the compiler doesn't know what ref() returns). This is just a guess however.
I stated the purpose clearly: you are not allowed to modify temporaries, because it just does not make sense (ignoring C++0x rvalue references). The question "then why am I allowed to call non-const members?" is a good one, but I don't have better answer than the one I already stated above.
Well, if I'm right about x in X& x = getx().ref(); dying at the end of the statement, the problems are obvious.
Anyway, based on your question and comments I don't think even these extra answers will satisfy you. Here is a final attempt/summary: The C++ committee decided it doesn't make sense to modify temporaries, therefore, they disallowed binding to non-const references. Maybe some compiler implementation or historic issues were also involved, I don't know. Then, some specific case emerged, and it was decided that against all odds, they will still allow direct modification through calling non-const method. But that's an exception - you are generally not allowed to modify temporaries. Yes, C++ is often that weird.
In your code getx() returns a temporary object, a so-called "rvalue". You can copy rvalues into objects (aka. variables) or bind them to const references (which will extend their life-time until the end of the reference's life). You cannot bind rvalues to non-const references.
This was a deliberate design decision in order to prevent users from accidentally modifying an object that is going to die at the end of the expression:
g(getx()); // g() would modify an object without anyone being able to observe
If you want to do this, you will have to either make a local copy of the object first or bind it to a const reference:
X x1 = getx();
const X& x2 = getx(); // extend lifetime of temporary to lifetime of const reference
g(x1); // fine
g(x2); // can't bind a const reference to a non-const reference
Note that the next C++ standard will include rvalue references. What you know as references will therefore come to be called "lvalue references". You will be allowed to bind rvalues to rvalue references and you can overload functions on "rvalue-ness":
void g(X&); // #1, takes an ordinary (lvalue) reference
void g(X&&); // #2, takes an rvalue reference
X x;
g(x); // calls #1
g(getx()); // calls #2
g(X()); // calls #2, too
The idea behind rvalue references is that, since these objects are going to die anyway, you can take advantage of that knowledge and implement what's called "move semantics", a certain kind of optimization:
class X {
public:
    X(X&& rhs)
    : pimpl( rhs.pimpl ) // steal rhs' data...
    {
        rhs.pimpl = nullptr; // ...and leave it empty, but destructible
    }
private:
    data* pimpl; // you would use a smart ptr, of course
};
X x(getx()); // x will steal the rvalue's data, leaving the temporary object empty
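For a version that compiles as written, here is a small sketch using std::unique_ptr, as the comment above suggests. Buffer, makeBuffer and the int payload are hypothetical names chosen for illustration; they are not part of the original example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

class Buffer {
public:
    explicit Buffer(int v) : pimpl(new int(v)) {}

    // Move constructor: takes over rhs' allocation. A moved-from
    // unique_ptr is guaranteed to be null, so rhs stays destructible.
    Buffer(Buffer&& rhs) noexcept : pimpl(std::move(rhs.pimpl)) {}

    bool empty() const { return pimpl == nullptr; }
    int value() const { return *pimpl; }

private:
    std::unique_ptr<int> pimpl;
};

Buffer makeBuffer() { return Buffer(5); }

int valueFromTemporary() {
    Buffer b(makeBuffer());   // initialized from an rvalue, no deep copy
    return b.value();
}

bool sourceIsEmptyAfterMove() {
    Buffer a(1);
    Buffer b(std::move(a));   // explicitly treat a as an rvalue
    return a.empty() && b.value() == 1;
}

void checkMoves() {
    assert(valueFromTemporary() == 5);
    assert(sourceIsEmptyAfterMove());
}
```

With unique_ptr the manual "set the source to null" step disappears, because the smart pointer's own move constructor does it.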
What you are showing is that operator chaining is allowed.
X& x = getx().ref(); // OK
The expression is 'getx().ref();' and this is executed to completion before assignment to 'x'.
Note that getx() does not return a reference but a fully formed object into the local context. The object is temporary but it is not const, thus allowing you to call other methods to compute a value or have other side effects happen.
// It would allow things like this.
getPipeline().procInstr(1).procInstr(2).procInstr(3);
// or more commonly
std::cout << getManipulator() << 5;
Look at the end of this answer for a better example of this
You can not bind a temporary to a reference because doing so will generate a reference to an object that will be destroyed at the end of the expression thus leaving you with a dangling reference (which is untidy and the standard does not like untidy).
The value returned by ref() is a valid reference but the method does not pay any attention to the lifespan of the object it is returning (because it can not have that information within its context). You have basically just done the equivalent of:
X& x = const_cast<X&>(static_cast<const X&>(getx()));
The reason it is OK to do this with a const reference to a temporary object is that the standard extends the lifespan of the temporary to the lifespan of the reference so the temporary objects lifespan is extended beyond the end of the statement.
So the only remaining question is why does the standard not want to allow reference to temporaries to extend the life of the object beyond the end of the statement?
I believe it is because doing so would make the compiler very hard to get correct for temporary objects. It was done for const references to temporaries as this has limited usage and thus forced you to make a copy of the object to do anything useful but does provide some limited functionality.
Think of this situation:
int getI() { return 5;}
int& x = getI(); // not allowed; suppose it were
x++; // Note x is an alias to a variable. What variable are you updating?
Extending the lifespan of this temporary object is going to be very confusing.
While the following:
int const& y = getI();
Will give you code that it is intuitive to use and understand.
If you want to modify the value, you should store the returned value in a variable. If you are trying to avoid the cost of copying the object back from the function (it looks as though the object is copy-constructed back, and technically it is), then don't bother: the compiler is very good at 'Return Value Optimization'.
Why is discussed in the C++ FAQ (boldfacing mine):
In C++, non-const references can bind to lvalues and const references can bind to lvalues or rvalues, but there is nothing that can bind to a non-const rvalue. That's to protect people from changing the values of temporaries that are destroyed before their new value can be used. For example:
void incr(int& a) { ++a; }
int i = 0;
incr(i); // i becomes 1
incr(0); // error: 0 is not an lvalue
If that incr(0) were allowed either some temporary that nobody ever saw would be incremented or - far worse - the value of 0 would become 1. The latter sounds silly, but there was actually a bug like that in early Fortran compilers that set aside a memory location to hold the value 0.
The main issue is that
g(getx()); //error
is a logical error: g is modifying the result of getx() but you don't have any chance to examine the modified object. If g didn't need to modify its parameter then it wouldn't have required an lvalue reference, it could have taken the parameter by value or by const reference.
const X& x = getx(); // OK
is valid because you sometimes need to reuse the result of an expression, and it's pretty clear that you're dealing with a temporary object.
However it is not possible to make
X& x = getx(); // error
valid without making g(getx()) valid, which is what the language designers were trying to avoid in the first place.
g(getx().ref()); //OK
is valid because methods only know about the const-ness of the this, they don't know if they are called on an lvalue or on an rvalue.
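That statement describes the language as it stood when this answer was written. Since C++11, member functions can carry ref-qualifiers, which let them distinguish lvalue and rvalue object expressions. The sketch below uses a hypothetical member function kind() purely to show which overload is selected.

```cpp
#include <cassert>

struct X {
    int kind() &  { return 1; }   // chosen when called on an lvalue
    int kind() && { return 2; }   // chosen when called on an rvalue
};

X getx() { return X(); }

int kindOfNamedObject() {
    X x;
    return x.kind();
}

int kindOfTemporary() {
    return getx().kind();
}

void checkRefQualifiers() {
    assert(kindOfNamedObject() == 1);
    assert(kindOfTemporary() == 2);
}
```

Declaring ref() as `X& ref() &` in the same way would make getx().ref() fail to compile, closing the loophole discussed in the question.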
As always in C++, you have a workaround for this rule but you have to signal the compiler that you know what you're doing by being explicit:
g(const_cast<X&>(static_cast<const X&>(getx())));
Seems like the original question as to why this is not allowed has been answered clearly: "because it is most likely an error".
FWIW, I thought I'd show how it could be done, even though I don't think it's a good technique.
The reason I sometimes want to pass a temporary to a method taking a non-const reference is to intentionally throw away a value returned by-reference that the calling method doesn't care about. Something like this:
// Assuming: void Person::GetNameAndAddr(std::string &name, std::string &addr);
string name;
person.GetNameAndAddr(name, string()); // don't care about addr
As explained in previous answers, that doesn't compile. But this compiles and works correctly (with my compiler):
person.GetNameAndAddr(name,
const_cast<string &>(static_cast<const string &>(string())));
This just shows that you can use casting to lie to the compiler. Obviously, it would be much cleaner to declare and pass an unused automatic variable:
string name;
string unused;
person.GetNameAndAddr(name, unused); // don't care about addr
This technique does introduce an unneeded local variable into the method's scope. If for some reason you want to prevent it from being used later in the method, e.g., to avoid confusion or error, you can hide it in a local block:
string name;
{
string unused;
person.GetNameAndAddr(name, unused); // don't care about addr
}
-- Chris
Why would you ever want X& x = getx();? Just use X x = getx(); and rely on RVO.
The evil workaround involves the 'mutable' keyword. Actually being evil is left as an exercise for the reader. Or see here: http://www.ddj.com/cpp/184403758
Excellent question, and here's my attempt at a more concise answer (since a lot of useful info is in comments and hard to dig out in the noise.)
Any reference bound directly to a temporary will extend its life [12.2.5]. On the other hand, a reference initialized with another reference will not (even if it's ultimately the same temporary). That makes sense (the compiler doesn't know what that reference ultimately refers to).
But this whole idea is extremely confusing. E.g. const X &x = X(); will make the temporary last as long as the x reference, but const X &x = X().ref(); will NOT (who knows what ref() actually returned). In the latter case, the destructor for X gets called at the end of this line. (This is observable with a non-trivial destructor.)
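The destructor timing can be made visible with an instance counter. This is a sketch only: the live counter and the two helper functions are assumptions added for illustration, while X and ref() follow the question.

```cpp
#include <cassert>

struct X {
    static int live;              // number of X objects currently alive
    X() { ++live; }
    X(const X&) { ++live; }
    ~X() { --live; }
    X& ref() { return *this; }
};
int X::live = 0;

int liveWithDirectBinding() {
    const X& x = X();             // bound directly: lifetime is extended
    (void)x;
    return X::live;               // the temporary is still alive here
}

int liveWithRefCall() {
    const X& x = X().ref();       // initialized from another reference
    (void)x;                      // x dangles; it is deliberately never read
    return X::live;               // the temporary is already gone
}

void checkExtension() {
    assert(liveWithDirectBinding() == 1);
    assert(liveWithRefCall() == 0);
    assert(X::live == 0);
}
```

Some compilers may warn about the second binding, but the code is well-formed, which is part of why this mistake is easy to make.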
So it seems generally confusing and dangerous (why complicate the rules about object lifetimes?), but presumably there was a need at least for const references, so the standard does set this behavior for them.
[From sbi comment]: Note that the fact that binding it to a const reference enhances a
temporary's lifetimes is an exception that's been added deliberately
(TTBOMK in order to allow manual optimizations). There wasn't an
exception added for non-const references, because binding a temporary
to a non-const reference was seen to most likely be a programmer
error.
All temporaries do persist until the end of the full-expression. To make use of them, however, you need a trick like you have with ref(). That's legal. There doesn't seem to be a good reason for the extra hoop to jump through, except to remind the programmer that something unusual is going on (namely, a reference parameter whose modifications will be quickly lost).
[Another sbi comment] The reason Stroustrup gives (in D&E) for disallowing the binding of
rvalues to non-const references is that, if Alexey's g() would modify
the object (which you'd expect from a function taking a non-const
reference), it would modify an object that's going to die, so nobody
could get at the modified value anyway. He says that this, most
likely, is an error.
"It is clear that the temporary object is not constant in the sample above, because calls
to non-constant functions are permitted. For instance, ref() could modify the temporary
object."
In your example getX() does not return a const X so you are able to call ref() in much the same way as you could call X().ref(). You are returning a non const ref and so can call non const methods, what you can't do is assign the ref to a non const reference.
Along with SadSidos comment this makes your three points incorrect.
I have a scenario I would like to share where I wish I could do what Alexey is asking. In a Maya C++ plugin, I have to do the following shenanigan in order to get a value into a node attribute:
MFnDoubleArrayData myArrayData;
MObject myArrayObj = myArrayData.create(myArray);
MPlug myPlug = myNode.findPlug(attributeName);
myPlug.setValue(myArrayObj);
This is tedious to write, so I wrote the following helper functions:
MPlug operator | (MFnDependencyNode& node, MObject& attribute){
MStatus status;
MPlug returnValue = node.findPlug(attribute, &status);
return returnValue;
}
void operator << (MPlug& plug, MDoubleArray& doubleArray){
MStatus status;
MFnDoubleArrayData doubleArrayData;
MObject doubleArrayObject = doubleArrayData.create(doubleArray, &status);
status = plug.setValue(doubleArrayObject);
}
And now I can write the code from the beginning of the post as:
(myNode | attributeName) << myArray;
The problem is it doesn't compile outside of Visual C++, because it's trying to bind the temporary variable returned from the | operator to the MPlug reference of the << operator. I would like it to be a reference because this code is called many times and I'd rather not have MPlug being copied so much. I only need the temporary object to live until the end of the second function.
Well, this is my scenario. Just thought I'd show an example where one would like to do what Alexey describes. I welcome all critiques and suggestions!
Thanks.

What is the difference between these two versions of the same Template Class [duplicate]

Why is it not allowed to get non-const reference to a temporary object,
which function getx() returns? Clearly, this is prohibited by C++ Standard
but I am interested in the purpose of such restriction, not a reference to the standard.
struct X
{
X& ref() { return *this; }
};
X getx() { return X();}
void g(X & x) {}
int f()
{
const X& x = getx(); // OK
X& x = getx(); // error
X& x = getx().ref(); // OK
g(getx()); //error
g(getx().ref()); //OK
return 0;
}
It is clear that the lifetime of the object cannot be the cause, because
constant reference to an object is not prohibited by C++ Standard.
It is clear that the temporary object is not constant in the sample above, because calls to non-constant functions are permitted. For instance, ref() could modify the temporary object.
In addition, ref() allows you to fool the compiler and get a link to this temporary object and that solves our problem.
In addition:
They say "assigning a temporary object to the const reference extends the lifetime of this object" and " Nothing is said about non-const references though".
My additional question. Does following assignment extend the lifetime of temporary object?
X& x = getx().ref(); // OK
From this Visual C++ blog article about rvalue references:
... C++ doesn't want you to accidentally
modify temporaries, but directly
calling a non-const member function on
a modifiable rvalue is explicit, so
it's allowed ...
Basically, you shouldn't try to modify temporaries for the very reason that they are temporary objects and will die any moment now. The reason you are allowed to call non-const methods is that, well, you are welcome to do some "stupid" things as long as you know what you are doing and you are explicit about it (like, using reinterpret_cast). But if you bind a temporary to a non-const reference, you can keep passing it around "forever" just to have your manipulation of the object disappear, because somewhere along the way you completely forgot this was a temporary.
If I were you, I would rethink the design of my functions. Why is g() accepting reference, does it modify the parameter? If no, make it const reference, if yes, why do you try to pass temporary to it, don't you care it's a temporary you are modifying? Why is getx() returning temporary anyway? If you share with us your real scenario and what you are trying to accomplish, you may get some good suggestions on how to do it.
Going against the language and fooling the compiler rarely solves problems - usually it creates problems.
Edit: Addressing questions in comment:
1) `X& x = getx().ref(); // OK when will x die?` - I don't know and I don't care, because this is exactly what I mean by "going against the language". The language says "temporaries die at the end of the statement, unless they are bound to const reference, in which case they die when the reference goes out of scope". Applying that rule, it seems x is already dead at the beginning of the next statement, since it's not bound to const reference (the compiler doesn't know what ref() returns). This is just a guess however.
I stated the purpose clearly: you are not allowed to modify temporaries, because it just does not make sense (ignoring C++0x rvalue references). The question "then why am I allowed to call non-const members?" is a good one, but I don't have better answer than the one I already stated above.
Well, if I'm right about x in X& x = getx().ref(); dying at the end of the statement, the problems are obvious.
Anyway, based on your question and comments I don't think even these extra answers will satisfy you. Here is a final attempt/summary: The C++ committee decided it doesn't make sense to modify temporaries, therefore, they disallowed binding to non-const references. May be some compiler implementation or historic issues were also involved, I don't know. Then, some specific case emerged, and it was decided that against all odds, they will still allow direct modification through calling non-const method. But that's an exception - you are generally not allowed to modify temporaries. Yes, C++ is often that weird.
In your code getx() returns a temporary object, a so-called "rvalue". You can copy rvalues into objects (aka. variables) or bind them to to const references (which will extend their life-time until the end of the reference's life). You cannot bind rvalues to non-const references.
This was a deliberate design decision in order to prevent users from accidentally modifying an object that is going to die at the end of the expression:
g(getx()); // g() would modify an object without anyone being able to observe
If you want to do this, you will have to either make a local copy or of the object first or bind it to a const reference:
X x1 = getx();
const X& x2 = getx(); // extend lifetime of temporary to lifetime of const reference
g(x1); // fine
g(x2); // can't bind a const reference to a non-const reference
Note that the next C++ standard will include rvalue references. What you know as references is therefore becoming to be called "lvalue references". You will be allowed to bind rvalues to rvalue references and you can overload functions on "rvalue-ness":
void g(X&); // #1, takes an ordinary (lvalue) reference
void g(X&&); // #2, takes an rvalue reference
X x;
g(x); // calls #1
g(getx()); // calls #2
g(X()); // calls #2, too
The idea behind rvalue references is that, since these objects are going to die anyway, you can take advantage of that knowledge and implement what's called "move semantics", a certain kind of optimization:
class X {
X(X&& rhs)
: pimpl( rhs.pimpl ) // steal rhs' data...
{
rhs.pimpl = NULL; // ...and leave it empty, but deconstructible
}
data* pimpl; // you would use a smart ptr, of course
};
X x(getx()); // x will steal the rvalue's data, leaving the temporary object empty
What you are showing is that operator chaining is allowed.
X& x = getx().ref(); // OK
The expression is 'getx().ref();' and this is executed to completion before assignment to 'x'.
Note that getx() does not return a reference but a fully formed object into the local context. The object is temporary but it is not const, thus allowing you to call other methods to compute a value or have other side effects happen.
// It would allow things like this.
getPipeline().procInstr(1).procInstr(2).procInstr(3);
// or more commonly
std::cout << getManiplator() << 5;
Look at the end of this answer for a better example of this
You can not bind a temporary to a reference because doing so will generate a reference to an object that will be destroyed at the end of the expression thus leaving you with a dangling reference (which is untidy and the standard does not like untidy).
The value returned by ref() is a valid reference but the method does not pay any attention to the lifespan of the object it is returning (because it can not have that information within its context). You have basically just done the equivalent of:
x& = const_cast<x&>(getX());
The reason it is OK to do this with a const reference to a temporary object is that the standard extends the lifespan of the temporary to the lifespan of the reference so the temporary objects lifespan is extended beyond the end of the statement.
So the only remaining question is why does the standard not want to allow reference to temporaries to extend the life of the object beyond the end of the statement?
I believe it is because doing so would make the compiler very hard to get correct for temporary objects. It was done for const references to temporaries as this has limited usage and thus forced you to make a copy of the object to do anything useful but does provide some limited functionality.
Think of this situation:
int getI() { return 5;}
int x& = getI();
x++; // Note x is an alias to a variable. What variable are you updating.
Extending the lifespan of this temporary object is going to be very confusing.
While the following:
int const& y = getI();
Will give you code that it is intuitive to use and understand.
If you want to modify the value you should be returning the value to a variable. If you are trying to avoid the cost of copying the obejct back from the function (as it seems that the object is copy constructed back (technically it is)). Then don't bother the compiler is very good at 'Return Value Optimization'
Why is discussed in the C++ FAQ (boldfacing mine):
In C++, non-const references can bind to lvalues and const references can bind to lvalues or rvalues, but there is nothing that can bind to a non-const rvalue. That's to protect people from changing the values of temporaries that are destroyed before their new value can be used. For example:
void incr(int& a) { ++a; }
int i = 0;
incr(i); // i becomes 1
incr(0); // error: 0 is not an lvalue
If that incr(0) were allowed either some temporary that nobody ever saw would be incremented or - far worse - the value of 0 would become 1. The latter sounds silly, but there was actually a bug like that in early Fortran compilers that set aside a memory location to hold the value 0.
The main issue is that
g(getx()); //error
is a logical error: g is modifying the result of getx() but you don't have any chance to examine the modified object. If g didn't need to modify its parameter then it wouldn't have required an lvalue reference, it could have taken the parameter by value or by const reference.
const X& x = getx(); // OK
is valid because you sometimes need to reuse the result of an expression, and it's pretty clear that you're dealing with a temporary object.
However it is not possible to make
X& x = getx(); // error
valid without making g(getx()) valid, which is what the language designers were trying to avoid in the first place.
g(getx().ref()); //OK
is valid because methods only know about the const-ness of the this, they don't know if they are called on an lvalue or on an rvalue.
As always in C++, you have a workaround for this rule but you have to signal the compiler that you know what you're doing by being explicit:
g(const_cast<x&>(getX()));
Seems like the original question as to why this is not allowed has been answered clearly: "because it is most likely an error".
FWIW, I thought I'd show how to it could be done, even though I don't think it's a good technique.
The reason I sometimes want to pass a temporary to a method taking a non-const reference is to intentionally throw away a value returned by-reference that the calling method doesn't care about. Something like this:
// Assuming: void Person::GetNameAndAddr(std::string &name, std::string &addr);
string name;
person.GetNameAndAddr(name, string()); // don't care about addr
As explained in previous answers, that doesn't compile. But this compiles and works correctly (with my compiler):
person.GetNameAndAddr(name,
const_cast<string &>(static_cast<const string &>(string())));
This just shows that you can use casting to lie to the compiler. Obviously, it would be much cleaner to declare and pass an unused automatic variable:
string name;
string unused;
person.GetNameAndAddr(name, unused); // don't care about addr
This technique does introduce an unneeded local variable into the method's scope. If for some reason you want to prevent it from being used later in the method, e.g., to avoid confusion or error, you can hide it in a local block:
string name;
{
string unused;
person.GetNameAndAddr(name, unused); // don't care about addr
}
-- Chris
Why would you ever want X& x = getx();? Just use X x = getx(); and rely on RVO.
The evil workaround involves the 'mutable' keyword. Actually being evil is left as an exercise for the reader. Or see here: http://www.ddj.com/cpp/184403758
Excellent question, and here's my attempt at a more concise answer (since a lot of useful info is in comments and hard to dig out in the noise.)
Any reference bound directly to a temporary will extend its life [12.2.5]. On the other hand, a reference initialized with another reference will not (even if it's ultimately the same temporary). That makes sense (the compiler doesn't know what that reference ultimately refers to).
But this whole idea is extremely confusing. E.g. const X &x = X(); will make the temporary last as long as the x reference, but const X &x = X().ref(); will NOT (who knows what ref() actually returned). In the latter case, the destructor for X gets called at the end of this line. (This is observable with a non-trivial destructor.)
So it seems generally confusing and dangerous (why complicate the rules about object lifetimes?), but presumably there was a need at least for const references, so the standard does set this behavior for them.
[From sbi comment]: Note that the fact that binding it to a const reference enhances a
temporary's lifetimes is an exception that's been added deliberately
(TTBOMK in order to allow manual optimizations). There wasn't an
exception added for non-const references, because binding a temporary
to a non-const reference was seen to most likely be a programmer
error.
All temporaries do persist until the end of the full-expression. To make use of them, however, you need a trick like you have with ref(). That's legal. There doesn't seem to be a good reason for the extra hoop to jump through, except to remind the programmer that something unusual is going on (namely, a reference parameter whose modifications will be quickly lost).
[Another sbi comment] The reason Stroustrup gives (in D&E) for disallowing the binding of
rvalues to non-const references is that, if Alexey's g() would modify
the object (which you'd expect from a function taking a non-const
reference), it would modify an object that's going to die, so nobody
could get at the modified value anyway. He says that this, most
likely, is an error.
"It is clear that the temporary object is not constant in the sample above, because calls
to non-constant functions are permitted. For instance, ref() could modify the temporary
object."
In your example getx() does not return a const X, so you are able to call ref() in much the same way as you could call X().ref(). ref() returns a non-const reference, so you can call non-const methods through it; what you can't do is bind the temporary itself to a non-const reference.
Along with SadSido's comment, this makes your three points incorrect.
I have a scenario I would like to share where I wish I could do what Alexey is asking. In a Maya C++ plugin, I have to do the following shenanigan in order to get a value into a node attribute:
MFnDoubleArrayData myArrayData;
MObject myArrayObj = myArrayData.create(myArray);
MPlug myPlug = myNode.findPlug(attributeName);
myPlug.setValue(myArrayObj);
This is tedious to write, so I wrote the following helper functions:
MPlug operator | (MFnDependencyNode& node, MObject& attribute){
MStatus status;
MPlug returnValue = node.findPlug(attribute, &status);
return returnValue;
}
void operator << (MPlug& plug, MDoubleArray& doubleArray){
MStatus status;
MFnDoubleArrayData doubleArrayData;
MObject doubleArrayObject = doubleArrayData.create(doubleArray, &status);
status = plug.setValue(doubleArrayObject);
}
And now I can write the code from the beginning of the post as:
(myNode | attributeName) << myArray;
The problem is it doesn't compile outside of Visual C++, because it's trying to bind the temporary variable returned from the | operator to the MPlug reference of the << operator. I would like it to be a reference because this code is called many times and I'd rather not have MPlug being copied so much. I only need the temporary object to live until the end of the second function.
Well, this is my scenario. Just thought I'd show an example where one would like to do what Alexey describes. I welcome all critiques and suggestions!
Thanks.

How to change const std::shared_ptr after first assignment?

If I define a shared_ptr and a const shared_ptr of the same type, like this:
std::shared_ptr<int> first = std::shared_ptr<int>(new int);
const std::shared_ptr<int> second = std::shared_ptr<int>();
And later try to change the value of the const shared_ptr like this:
second = first;
It causes a compile error (as it should). But even if I try to cast away the const part:
(std::shared_ptr<int>)second = first;
The result of the code above is that second ends up empty, while first is untouched (e.g. its ref count is still 1).
How can I change the value of a const shared_ptr after it was originally set? Is this even possible with std's pointer?
Thanks!
It is undefined behavior to modify in any way a variable declared as const outside of its construction or destruction.
const std::shared_ptr<int> second
this is a variable declared as const.
There is no standard compliant way to change what it refers to after construction and before destruction.
That being said, manually calling the destructor and constructing a new shared_ptr in the same spot might be legal, I am uncertain. You definitely cannot refer to said shared_ptr by its original name, and possibly leaving the scope where the original shared_ptr existed is illegal (as the destructor tries to destroy the original object, which the compiler can prove is an empty shared pointer (or a non-empty one) based on how the const object was constructed).
This is a bad idea even if you could make an argument the standard permits it.
const objects cannot be changed.
...
Your cast to a shared_ptr<int> simply creates a temporary copy. The assignment then changes that temporary copy, which is discarded at the end of the statement, so the const shared_ptr<int> not being modified is expected behavior. Assigning to a temporary is legal here because shared_ptr, like most of the std library, was designed before we had the ability to overload operator= based on the r/lvalue-ness of the left-hand side.
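A minimal sketch of that behaviour, using only the standard library (the function name is made up for illustration). The cast produces a temporary copy, the assignment changes the copy, and both original pointers end up exactly as they started:

```cpp
#include <cassert>
#include <memory>

// Returns true if the cast-and-assign statement left `second` empty
// and `first` as the sole owner of its int.
bool cast_assign_leaves_const_untouched() {
    std::shared_ptr<int> first = std::make_shared<int>(42);
    const std::shared_ptr<int> second;   // empty, and really const
    assert(first.use_count() == 1);

    // The cast copies `second` into a temporary; operator= then modifies
    // that temporary, which is destroyed at the end of the statement.
    (std::shared_ptr<int>)second = first;

    return !second && first.use_count() == 1;
}
```

While the temporary is alive, first briefly has a use count of 2; once the statement ends, it drops back to 1.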
...
Now, why is this the case? Actual constness is used by the compiler as an optimization hint.
{
const std::shared_ptr<int> bob = std::make_shared<int>();
}
in the above case, the compiler can know for certain that bob is non-empty at the end of the scope. Nothing can be done to bob that could make it empty and still leave you with defined behavior.
So the compiler can eliminate the branch at the end of the scope when destroying bob that checks if the pointer is null.
Similar optimizations could occur if you pass bob to an inline function that checks for bob's null state; the compiler can omit the check.
Suppose you pass bob to
void secret_code( std::shared_ptr<int> const& );
where the compiler cannot see into the implementation of secret_code. It can assume that secret code will not edit bob.
If bob wasn't declared const, secret_code could legally do a const_cast<std::shared_ptr<int>&> on the parameter and set it to null; but if the argument to secret_code is actually const, this is undefined behavior. (Any code casting away const is responsible for guaranteeing that no actual modification of an actual const value occurs by doing so.)
Without const on bob, the compiler could not guarantee:
{
const std::shared_ptr<int> bob = std::make_shared<int>();
secret_code(bob);
if (bob) {
std::cout << "guaranteed to run";
}
}
that the guaranteed to run string would be printed.
With const on bob, the compiler is free to eliminate the if check above.
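For contrast, here is a sketch of the non-const case, where the cast is well-defined and the check really can fail. The body given to secret_code is an assumption made up for the example; in the answer its implementation is deliberately invisible to the compiler.

```cpp
#include <cassert>
#include <memory>

// Hypothetical implementation of the opaque function from the answer.
void secret_code(const std::shared_ptr<int>& p) {
    // Well-defined only because the caller below passes a non-const object.
    const_cast<std::shared_ptr<int>&>(p).reset();
}

bool non_const_bob_can_be_emptied() {
    std::shared_ptr<int> bob = std::make_shared<int>(0);  // note: not const
    assert(bob);
    secret_code(bob);
    return !bob;  // an `if (bob)` check after the call could not be removed
}
```

Declaring bob const and calling the same secret_code would be undefined behavior, which is exactly what lets the compiler assume it never happens.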
...
Now, do not confuse my explanation as to why the standard states you cannot edit const stack variables with "if this doesn't happen there is no problem". The standard states you shall not do it; the consequences if you do it are unbounded and can grow with new versions of your compiler.
...
From comments:
For deserialize process, which is actually a type of constructor that deserialize object from file. C++ is nice, but it got its imperfections and sometimes its OK to search for less orthodox methods.
If it is a constructor, make it a constructor.
In C++17 a function returning a T has basically equal standing to a real constructor in many ways (due to guaranteed elision). In C++14, this isn't quite true (you also need a move constructor, and the compiler needs to elide it).
So a deserialization constructor for a type T in C++ needs to return a T; it cannot take a T by reference and be a real constructor.
Composing this is a bit of a pain, but it can be done. Using the same code for serialization and deserialization is even more of a pain (I cannot off hand figure out how).
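A sketch of what "make it a constructor" could look like for the shared_ptr in the question. The function names and the one-int "file format" are assumptions for illustration only:

```cpp
#include <cassert>
#include <memory>
#include <sstream>

// Hypothetical deserializer: reads one int and returns the finished object
// instead of filling in an existing one.
std::shared_ptr<int> deserialize_int(std::istream& in) {
    int value = 0;
    in >> value;
    return std::make_shared<int>(value);
}

int load_const_pointer() {
    std::istringstream file("17");  // stands in for a real file
    // The const pointer is initialized once, from the function's result,
    // so nothing ever has to modify it afterwards.
    const std::shared_ptr<int> second = deserialize_int(file);
    assert(second);
    return *second;
}
```

Because the object is produced by the return value, second can stay const for its whole lifetime.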

C++ Invalid Initialization Error [duplicate]

Why is it not allowed to get non-const reference to a temporary object,
which function getx() returns? Clearly, this is prohibited by C++ Standard
but I am interested in the purpose of such restriction, not a reference to the standard.
struct X
{
X& ref() { return *this; }
};
X getx() { return X();}
void g(X & x) {}
int f()
{
const X& x = getx(); // OK
X& x = getx(); // error
X& x = getx().ref(); // OK
g(getx()); //error
g(getx().ref()); //OK
return 0;
}
It is clear that the lifetime of the object cannot be the cause, because
constant reference to an object is not prohibited by C++ Standard.
It is clear that the temporary object is not constant in the sample above, because calls to non-constant functions are permitted. For instance, ref() could modify the temporary object.
In addition, ref() allows you to fool the compiler and get a link to this temporary object and that solves our problem.
In addition:
They say "assigning a temporary object to the const reference extends the lifetime of this object" and " Nothing is said about non-const references though".
My additional question. Does following assignment extend the lifetime of temporary object?
X& x = getx().ref(); // OK
From this Visual C++ blog article about rvalue references:
... C++ doesn't want you to accidentally
modify temporaries, but directly
calling a non-const member function on
a modifiable rvalue is explicit, so
it's allowed ...
Basically, you shouldn't try to modify temporaries for the very reason that they are temporary objects and will die any moment now. The reason you are allowed to call non-const methods is that, well, you are welcome to do some "stupid" things as long as you know what you are doing and you are explicit about it (like, using reinterpret_cast). But if you bind a temporary to a non-const reference, you can keep passing it around "forever" just to have your manipulation of the object disappear, because somewhere along the way you completely forgot this was a temporary.
If I were you, I would rethink the design of my functions. Why is g() accepting reference, does it modify the parameter? If no, make it const reference, if yes, why do you try to pass temporary to it, don't you care it's a temporary you are modifying? Why is getx() returning temporary anyway? If you share with us your real scenario and what you are trying to accomplish, you may get some good suggestions on how to do it.
Going against the language and fooling the compiler rarely solves problems - usually it creates problems.
Edit: Addressing questions in comment:
1) `X& x = getx().ref(); // OK when will x die?` - I don't know and I don't care, because this is exactly what I mean by "going against the language". The language says "temporaries die at the end of the statement, unless they are bound to const reference, in which case they die when the reference goes out of scope". Applying that rule, it seems x is already dead at the beginning of the next statement, since it's not bound to const reference (the compiler doesn't know what ref() returns). This is just a guess however.
I stated the purpose clearly: you are not allowed to modify temporaries, because it just does not make sense (ignoring C++0x rvalue references). The question "then why am I allowed to call non-const members?" is a good one, but I don't have better answer than the one I already stated above.
Well, if I'm right about x in X& x = getx().ref(); dying at the end of the statement, the problems are obvious.
Anyway, based on your question and comments I don't think even these extra answers will satisfy you. Here is a final attempt/summary: The C++ committee decided it doesn't make sense to modify temporaries, therefore, they disallowed binding to non-const references. Maybe some compiler implementation or historic issues were also involved, I don't know. Then, some specific case emerged, and it was decided that against all odds, they will still allow direct modification through calling non-const method. But that's an exception - you are generally not allowed to modify temporaries. Yes, C++ is often that weird.
In your code getx() returns a temporary object, a so-called "rvalue". You can copy rvalues into objects (aka. variables) or bind them to const references (which will extend their lifetime until the end of the reference's life). You cannot bind rvalues to non-const references.
This was a deliberate design decision in order to prevent users from accidentally modifying an object that is going to die at the end of the expression:
g(getx()); // g() would modify an object without anyone being able to observe
If you want to do this, you will have to either make a local copy of the object first or bind it to a const reference:
X x1 = getx();
const X& x2 = getx(); // extend lifetime of temporary to lifetime of const reference
g(x1); // fine
g(x2); // error: can't bind a const object to a non-const reference
Note that the next C++ standard (C++11) will include rvalue references. What you know as references are therefore coming to be called "lvalue references". You will be allowed to bind rvalues to rvalue references, and you can overload functions on "rvalue-ness":
void g(X&); // #1, takes an ordinary (lvalue) reference
void g(X&&); // #2, takes an rvalue reference
X x;
g(x); // calls #1
g(getx()); // calls #2
g(X()); // calls #2, too
The idea behind rvalue references is that, since these objects are going to die anyway, you can take advantage of that knowledge and implement what's called "move semantics", a certain kind of optimization:
struct data; // some implementation type
class X {
public:
X(X&& rhs)
: pimpl( rhs.pimpl ) // steal rhs' data...
{
rhs.pimpl = nullptr; // ...and leave it empty, but destructible
}
private:
data* pimpl; // you would use a smart ptr, of course
};
X x(getx()); // x will steal the rvalue's data, leaving the temporary object empty
What you are showing is that operator chaining is allowed.
X& x = getx().ref(); // OK
The expression is 'getx().ref();' and this is executed to completion before assignment to 'x'.
Note that getx() does not return a reference but a fully formed object into the local context. The object is temporary but it is not const, thus allowing you to call other methods to compute a value or have other side effects happen.
// It would allow things like this.
getPipeline().procInstr(1).procInstr(2).procInstr(3);
// or more commonly
std::cout << getManipulator() << 5;
Look at the end of this answer for a better example of this
You can not bind a temporary to a reference because doing so will generate a reference to an object that will be destroyed at the end of the expression thus leaving you with a dangling reference (which is untidy and the standard does not like untidy).
The value returned by ref() is a valid reference but the method does not pay any attention to the lifespan of the object it is returning (because it can not have that information within its context). You have basically just done the equivalent of:
X& x = const_cast<X&>(static_cast<const X&>(getx()));
The reason it is OK to do this with a const reference to a temporary object is that the standard extends the lifespan of the temporary to the lifespan of the reference so the temporary objects lifespan is extended beyond the end of the statement.
So the only remaining question is why does the standard not want to allow reference to temporaries to extend the life of the object beyond the end of the statement?
I believe it is because doing so would make the compiler very hard to get correct for temporary objects. It was done for const references to temporaries as this has limited usage and thus forced you to make a copy of the object to do anything useful but does provide some limited functionality.
Think of this situation:
int getI() { return 5;}
int& x = getI(); // not allowed; suppose it were
x++; // Note x is an alias to a variable. What variable are you updating.
Extending the lifespan of this temporary object is going to be very confusing.
While the following:
int const& y = getI();
Will give you code that it is intuitive to use and understand.
If you want to modify the value, you should store the returned value in a variable. If you are trying to avoid the cost of copying the object back from the function (technically it is copy-constructed back), don't bother: the compiler is very good at 'Return Value Optimization'.
Why is discussed in the C++ FAQ (boldfacing mine):
In C++, non-const references can bind to lvalues and const references can bind to lvalues or rvalues, but there is nothing that can bind to a non-const rvalue. That's to protect people from changing the values of temporaries that are destroyed before their new value can be used. For example:
void incr(int& a) { ++a; }
int i = 0;
incr(i); // i becomes 1
incr(0); // error: 0 is not an lvalue
If that incr(0) were allowed either some temporary that nobody ever saw would be incremented or - far worse - the value of 0 would become 1. The latter sounds silly, but there was actually a bug like that in early Fortran compilers that set aside a memory location to hold the value 0.
The main issue is that
g(getx()); //error
is a logical error: g is modifying the result of getx() but you don't have any chance to examine the modified object. If g didn't need to modify its parameter then it wouldn't have required an lvalue reference, it could have taken the parameter by value or by const reference.
const X& x = getx(); // OK
is valid because you sometimes need to reuse the result of an expression, and it's pretty clear that you're dealing with a temporary object.
However it is not possible to make
X& x = getx(); // error
valid without making g(getx()) valid, which is what the language designers were trying to avoid in the first place.
g(getx().ref()); //OK
is valid because methods only know about the const-ness of the this, they don't know if they are called on an lvalue or on an rvalue.
As always in C++, you have a workaround for this rule but you have to signal the compiler that you know what you're doing by being explicit:
g(const_cast<X&>(static_cast<const X&>(getx())));
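The remark above that methods only know the const-ness of this describes C++03. Since C++11 a member function can also carry a ref-qualifier, so a class author who wants to close the getx().ref() loophole can do so. A sketch (can_call_ref is a detection helper invented for this example, not a standard facility):

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct X {
    X& ref() & { return *this; }  // '&' qualifier: callable on lvalues only
};

// Hypothetical detection helper: true if `std::declval<T>().ref()` compiles.
template <class T, class = void>
struct can_call_ref : std::false_type {};

template <class T>
struct can_call_ref<T, decltype(void(std::declval<T>().ref()))>
    : std::true_type {};

static_assert(can_call_ref<X&>::value, "lvalue: x.ref() compiles");
static_assert(!can_call_ref<X>::value, "rvalue: getx().ref() is rejected");

void lvalue_call_still_works() {
    X x;
    assert(&x.ref() == &x);  // ordinary use on a named object is unaffected
}
```

With this declaration, g(getx().ref()) from the question no longer compiles, while calls on named objects are unchanged.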
Seems like the original question as to why this is not allowed has been answered clearly: "because it is most likely an error".
FWIW, I thought I'd show how it could be done, even though I don't think it's a good technique.
The reason I sometimes want to pass a temporary to a method taking a non-const reference is to intentionally throw away a value returned by-reference that the calling method doesn't care about. Something like this:
// Assuming: void Person::GetNameAndAddr(std::string &name, std::string &addr);
string name;
person.GetNameAndAddr(name, string()); // don't care about addr
As explained in previous answers, that doesn't compile. But this compiles and works correctly (with my compiler):
person.GetNameAndAddr(name,
const_cast<string &>(static_cast<const string &>(string())));
This just shows that you can use casting to lie to the compiler. Obviously, it would be much cleaner to declare and pass an unused automatic variable:
string name;
string unused;
person.GetNameAndAddr(name, unused); // don't care about addr
This technique does introduce an unneeded local variable into the method's scope. If for some reason you want to prevent it from being used later in the method, e.g., to avoid confusion or error, you can hide it in a local block:
string name;
{
string unused;
person.GetNameAndAddr(name, unused); // don't care about addr
}
-- Chris
Why would you ever want X& x = getx();? Just use X x = getx(); and rely on RVO.
The evil workaround involves the 'mutable' keyword. Actually being evil is left as an exercise for the reader. Or see here: http://www.ddj.com/cpp/184403758
Excellent question, and here's my attempt at a more concise answer (since a lot of useful info is in comments and hard to dig out in the noise.)
Any reference bound directly to a temporary will extend its life [12.2.5]. On the other hand, a reference initialized with another reference will not (even if it's ultimately the same temporary). That makes sense (the compiler doesn't know what that reference ultimately refers to).
But this whole idea is extremely confusing. E.g. const X &x = X(); will make the temporary last as long as the x reference, but const X &x = X().ref(); will NOT (who knows what ref() actually returned). In the latter case, the destructor for X gets called at the end of this line. (This is observable with a non-trivial destructor.)
So it seems generally confusing and dangerous (why complicate the rules about object lifetimes?), but presumably there was a need at least for const references, so the standard does set this behavior for them.
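The destructor timing described above can be observed with a small sketch. The events log and the function names are invented for the illustration; ref() is made const here so that it can be called in both cases.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> events;  // hypothetical log, for illustration only

struct X {
    X()  { events.push_back("ctor"); }
    ~X() { events.push_back("dtor"); }
    const X& ref() const { return *this; }
};

void bound_directly() {
    const X& x = X();               // lifetime extended to that of x
    events.push_back("next line");
    assert(&x == &x.ref());         // x is still alive and usable here
}                                   // "dtor" is logged when x goes out of scope

void bound_through_ref() {
    const X& x = X().ref();         // no extension: "dtor" is logged at the ';'
    events.push_back("next line");  // x dangles from here on; never touch it
}
```

Running bound_directly logs ctor, next line, dtor; running bound_through_ref logs ctor, dtor, next line. A recent compiler may warn about the second binding, which is the point of the example.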
[From sbi comment]: Note that the fact that binding it to a const reference enhances a
temporary's lifetimes is an exception that's been added deliberately
(TTBOMK in order to allow manual optimizations). There wasn't an
exception added for non-const references, because binding a temporary
to a non-const reference was seen to most likely be a programmer
error.
All temporaries do persist until the end of the full-expression. To make use of them, however, you need a trick like you have with ref(). That's legal. There doesn't seem to be a good reason for the extra hoop to jump through, except to remind the programmer that something unusual is going on (namely, a reference parameter whose modifications will be quickly lost).
[Another sbi comment] The reason Stroustrup gives (in D&E) for disallowing the binding of
rvalues to non-const references is that, if Alexey's g() would modify
the object (which you'd expect from a function taking a non-const
reference), it would modify an object that's going to die, so nobody
could get at the modified value anyway. He says that this, most
likely, is an error.
"It is clear that the temporary object is not constant in the sample above, because calls
to non-constant functions are permitted. For instance, ref() could modify the temporary
object."
In your example getx() does not return a const X, so you are able to call ref() in much the same way as you could call X().ref(). ref() returns a non-const reference, so you can call non-const methods through it; what you can't do is bind the temporary itself to a non-const reference.
Along with SadSido's comment, this makes your three points incorrect.
I have a scenario I would like to share where I wish I could do what Alexey is asking. In a Maya C++ plugin, I have to do the following shenanigan in order to get a value into a node attribute:
MFnDoubleArrayData myArrayData;
MObject myArrayObj = myArrayData.create(myArray);
MPlug myPlug = myNode.findPlug(attributeName);
myPlug.setValue(myArrayObj);
This is tedious to write, so I wrote the following helper functions:
MPlug operator | (MFnDependencyNode& node, MObject& attribute){
MStatus status;
MPlug returnValue = node.findPlug(attribute, &status);
return returnValue;
}
void operator << (MPlug& plug, MDoubleArray& doubleArray){
MStatus status;
MFnDoubleArrayData doubleArrayData;
MObject doubleArrayObject = doubleArrayData.create(doubleArray, &status);
status = plug.setValue(doubleArrayObject);
}
And now I can write the code from the beginning of the post as:
(myNode | attributeName) << myArray;
The problem is it doesn't compile outside of Visual C++, because it's trying to bind the temporary variable returned from the | operator to the MPlug reference of the << operator. I would like it to be a reference because this code is called many times and I'd rather not have MPlug being copied so much. I only need the temporary object to live until the end of the second function.
Well, this is my scenario. Just thought I'd show an example where one would like to do what Alexey describes. I welcome all critiques and suggestions!
Thanks.
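One possible suggestion, sketched with stand-in types rather than the real Maya classes (Plug, Node and the int attribute id below are all invented for the example and say nothing about the actual MPlug API): with C++11, the left operand of << can be taken by rvalue reference, which binds to the temporary returned by | without copying it.

```cpp
#include <cassert>
#include <vector>

// Stand-ins for the Maya types; the real MPlug / MDoubleArray are NOT used.
struct Plug {
    std::vector<double>* target;
    void setValue(const std::vector<double>& v) { *target = v; }
};
struct Node {
    std::vector<double> attribute;
};

// Returns a temporary Plug, like the operator| in the post.
Plug operator|(Node& node, int /*attribute id, unused in this sketch*/) {
    return Plug{&node.attribute};
}

// Plug&& binds directly to that temporary; the array is only read,
// so it is taken by const reference.
void operator<<(Plug&& plug, const std::vector<double>& values) {
    plug.setValue(values);
}

bool chained_call_sets_attribute() {
    Node myNode;
    std::vector<double> myArray{1.0, 2.0};
    (myNode | 0) << myArray;  // the temporary lives until the end of this statement
    return myNode.attribute == myArray;
}
```

A named plug would not bind to Plug&&, so this overload would sit next to the existing lvalue one rather than replace it.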

Returning a vector by value into a reference

I have the following code:
std::vector<Info*> filter(int direction)
{
std::vector<Info*> new_buffer;
for(std::vector<Info*>::iterator it=m_Buffer.begin();it!=m_Buffer.end();it++)
{
if((*it)->direction == direction)
{
new_buffer.push_back(*it);
}
}
return new_buffer;
}
std::vector<Info*> &filteredInfo= filter(m_Direction);
Can someone explain what is happening here? Would the filter method's return by value create a temporary, and would filteredInfo never get destroyed because it's a reference?
Not sure if I understand correctly. What is the difference between filteredInfo being a reference and not being one in this case?
Your compiler should complain of that code.
This statement:
std::vector<Info*> &filteredInfo= filter(m_Direction);
is a bad idea where filter is:
std::vector<Info*> filter(int direction);
You are trying to bind a non-const reference to a temporary object. Even if your compiler accepts it, it's not valid C++.
You should use:
std::vector<Info*> filteredInfo= filter(m_Direction);
It's as efficient as you want. Either a move operation (C++11) will happen there or Return Value Optimization will kick in. For your implementation of filter, it should be RVO on optimized builds (it depends on your compiler quality, though).
However, you should note that you are copying raw pointers into your vector; I hope you have a correct ownership model. If not, I advise you to use a smart pointer.
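As a rough check of that claim, here is a sketch using a vector of ints instead of Info* (the global probe and the function names are invented for the example). It records the address of the buffer inside the function and compares it with the buffer the caller receives; equal addresses mean the elements were handed over rather than copied.

```cpp
#include <cassert>
#include <vector>

const int* g_buffer_inside = nullptr;  // hypothetical probe for this sketch

std::vector<int> filter_even(const std::vector<int>& in) {
    std::vector<int> out;
    for (int v : in) {
        if (v % 2 == 0) out.push_back(v);
    }
    g_buffer_inside = out.data();  // remember where the elements live
    return out;                    // elided (NRVO) or moved, never deep-copied
}

bool returned_without_copying_elements() {
    std::vector<int> filtered = filter_even({1, 2, 3, 4});
    assert(filtered.size() == 2);
    return filtered.data() == g_buffer_inside;
}
```

Whether the compiler elides the return or falls back to the move constructor, the heap buffer is the same one that was filled inside the function.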
Here is what happens:
std::vector<Info*> new_buffer; creates an object locally.
return new_buffer; moves new_buffer to a temporary object when filter(m_Direction) is called.
Now if you call std::vector<Info*> filteredInfo= filter(m_Direction); the temporary object will be moved to filteredInfo, so there are no unnecessary copies and it's the most efficient way.
But if you call std::vector<Info*> &filteredInfo= filter(m_Direction); then filteredInfo would be bound to a temporary object, which is not allowed for a non-const reference; a conforming compiler rejects it.
Here you're correctly puzzled because there are two independent weird facts mixing in:
Your compiler allows a non-const reference to be bound to a temporary. This historically was a mistake in Microsoft compilers and is not permitted by the standard. That code should not compile.
The standard however, strangely enough, actually allows binding const references to temporaries and has a special rule for that: the temporary object will not be destroyed immediately (like it would happen normally) but its life will be extended to the life of the reference.
In code:
#include <iostream>
#include <vector>
std::vector<int> foo() {
std::vector<int> x{1,2,3};
return x;
}
int main() {
const std::vector<int>& x = foo(); // legal
for (auto& item : x) {
std::cout << item << std::endl; // print each element, not the vector
}
}
The reason for this apparently absurd rule about binding const references to temporaries is that in C++ there is a very common "pattern"(1) of passing const references instead of values for parameters, even when identity is irrelevant. If you combine this (anti)-pattern with implicit conversion what happens is that for example:
void foo(const std::string& x) { ... }
wouldn't be callable with
foo("Hey, you");
without the special rule, because the const char * (literal) is implicitly converted to a temporary std::string and passed as parameter bound to a const reference.
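A compilable version of that example, with a function name chosen for the sketch. The literal is converted to a temporary std::string that lives until the call returns:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Accepts lvalues and temporaries alike, thanks to the const reference.
std::size_t length_by_const_ref(const std::string& s) { return s.size(); }

// A non-const reference parameter could not be called with a literal:
// std::size_t length_by_ref(std::string& s);  // length_by_ref("Hey, you") fails

std::size_t literal_length() {
    std::size_t n = length_by_const_ref("Hey, you");  // temporary std::string
    assert(n > 0);
    return n;
}
```

The temporary is destroyed at the end of the full-expression containing the call, so the function must not keep the reference around.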
(1) The pattern is indeed quite bad from a philosophical point of view because a value is a value and a reference is a reference: the two are logically distinct concepts. A const reference is not a value and confusing the two can be the source of very subtle bugs. C++ however is performance-obsessed and, especially before move semantics, passing const references was considered a "smart" way of passing values, despite being a problem because of lifetime and aliasing issues and for making things harder for the optimizer. With a modern compiler passing a reference should be used only for "big" objects, especially ones that are not constructed on the fly to be passed or when you're actually interested in object identity and not in just object value.