reference to a temporary and warnings - c++

I've lost one hour to find this problem in my code:
vector<string> & input_variables = parse_xml(xml_path)["variables"];
where parse_xml is a function returning a std::map<std::string, std::vector<std::string> >. Why is gcc not warning me (with -Wall)? Am I missing some flags?

You have taken a reference to an object that is destroyed at the end of the statement. C++11 adds language features (ref-qualified member functions) that let a class author make such code ill-formed, although std::map::operator[] does not use them. You must make a copy or swap the data into a local variable if you wish to use it. GCC is not warning you because C++03 does not provide the necessary features to prevent this.
Technically, the return value of operator[] is an lvalue. Unfortunately, the object it refers to is about to be destroyed by its owner, the std::map.
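The fixes mentioned above (copy, swap, or keep the map alive) might look roughly like the sketch below. parse_xml_stub is a hypothetical stand-in for the question's parse_xml, whose real implementation is not shown, and the function names are made up for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using VarMap = std::map<std::string, std::vector<std::string>>;

// Hypothetical stand-in for parse_xml(); it ignores the path and returns fixed data.
VarMap parse_xml_stub(const std::string& /*xml_path*/) {
    VarMap result;
    result["variables"] = {"x", "y", "z"};
    return result;
}

// Option 1: copy the element out of the temporary map before the map is destroyed.
std::vector<std::string> load_by_copy(const std::string& path) {
    std::vector<std::string> input_variables = parse_xml_stub(path)["variables"];
    return input_variables;
}

// Option 2: swap the element's contents into a local vector (no element copies).
std::vector<std::string> load_by_swap(const std::string& path) {
    std::vector<std::string> input_variables;
    input_variables.swap(parse_xml_stub(path)["variables"]);
    return input_variables;
}

// Option 3: give the map a name so that a reference into it stays valid.
std::size_t count_via_named_map(const std::string& path) {
    VarMap parsed = parse_xml_stub(path);
    std::vector<std::string>& input_variables = parsed["variables"];
    return input_variables.size();
}
```

In all three variants nothing refers to the temporary map after the statement that created it has finished.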

GCC doesn't warn you because technically there's nothing to warn about.
parse_xml() returns a std::map by value, which is a temporary. Calling operator[] returns a reference. The compiler cannot know locally that this reference is actually part of the std::map temporary. For all the compiler knows, operator[] could be returning a reference to a global or something.
A member variable of a temporary is considered a temporary, which is linked to the lifetime of the outer temporary. But the return value of a function (like operator[]) is not so linked.

The reason why it doesn't warn you is because you are getting an invalid operation through a sequence of valid steps:
struct X
{
int c;
int & operator [] (int) { return c; } /* this is perfectly OK */
};
X f()
{
return X(); /* this is perfectly OK */
}
int main()
{
int &x = f()[1]; /* can apply [] on temporary, therefore OK */
}
You can prevent this from occurring by explicitly declaring f() to return const X: the non-const operator[] can then no longer be called on the result.
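As a rough compile-time check of that claim, the sketch below uses a small detection trait (can_index, a hypothetical helper, not a standard facility) to ask whether indexing a given kind of object compiles. Guarded is an assumed variant of X that uses a C++11 ref-qualifier instead of a const return type.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper: true if `std::declval<T>()[0]` is a valid expression.
template <typename T, typename = void>
struct can_index : std::false_type {};

template <typename T>
struct can_index<T, decltype(void(std::declval<T>()[0]))> : std::true_type {};

struct X {                                  // same shape as the struct above
    int c;
    int& operator[](int) { return c; }
};

struct Guarded {                            // assumed variant with a ref-qualifier
    int c;
    int& operator[](int) & { return c; }    // callable on lvalues only
};

// A non-const temporary X can be indexed, which is how the dangling reference arises.
static_assert(can_index<X>::value, "X temporary: [] compiles");
// A const temporary X (the result of `const X f()`) cannot call the non-const operator.
static_assert(!can_index<const X>::value, "const X temporary: [] is rejected");
// With the ref-qualifier, named objects work but temporaries are rejected.
static_assert(can_index<Guarded&>::value, "Guarded lvalue: [] compiles");
static_assert(!can_index<Guarded>::value, "Guarded temporary: [] is rejected");
```

Returning const values has its own cost (it blocks moving from the result), so the ref-qualified form is often the less intrusive choice when C++11 is available.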

Related

Returning std::vector with std::move

I have a very basic question: is it a good idea to return a std::vector<A> using std::move? For example:
class A {};
std::vector<A> && func() {
std::vector<A> v;
/* fill v */
return std::move(v);
}
Should I return std::map, std::list.. etc... in this way?
You declare a function to return by r-value reference - this should almost never be done (if you return the local object by reference, you will end up with a dangling reference). Instead declare the function to return by value. This way the caller's value will be move constructed from the r-value returned by the function. The returned value will also bind to a const l-value reference or an r-value reference.
Secondly, no, you should not return using an explicit std::move, as this will prevent the compiler from using RVO. There's no need, as the compiler will automatically treat a returned local variable as an r-value where possible.
std::vector<A> func() {
std::vector<A> v;
/* fill v */
return v; // 'v' is treated as an r-value: the return value is move constructed, or the move is elided (NRVO).
}
More info:
Using std::move() when returning a value from a function to avoid to copy
Is there any case where a return of a RValue Reference (&&) is useful?
No, it's not. This will in fact prevent copy elision in some cases. There is even a warning in some compilers about it, called -Wpessimizing-move.
In agreement with other answers, just return it by value, changing the return type to simply be std::vector<A>, and the compiler will take care of calling the move constructor when needed.
You could take a look at this post I just found, which seems to explain it in much detail (although I haven't read it through myself):
https://vmpstr.blogspot.hu/2015/12/redundant-stdmove.html
With optimization enabled, both gcc and clang generate the same binary code whether you return the local variable directly or wrap it in std::move().
Just return a value.
Using && and the noexcept specifier is useful, however, when you write a move constructor and a move assignment operator for your own class.
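To see what the answers above describe, one option is to count constructor calls. The sketch below is only an illustration: Counted, Counts and measure are hypothetical names, and the exact move count for the plain return depends on whether the compiler performs NRVO, so only the guaranteed properties are stated in the comments.

```cpp
#include <utility>

// Hypothetical instrumented type that counts its copies and moves.
struct Counted {
    static int copies;
    static int moves;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

Counted by_value() {
    Counted c;
    return c;             // treated as an r-value; the move may be elided entirely
}

Counted with_std_move() {
    Counted c;
    return std::move(c);  // always runs the move constructor; NRVO is not allowed here
}

struct Counts { int copies; int moves; };

// Resets the counters, calls the factory once, and reports what happened.
template <typename F>
Counts measure(F make) {
    Counted::copies = 0;
    Counted::moves = 0;
    Counted result = make();
    (void)result;
    return {Counted::copies, Counted::moves};
}
```

Neither version ever copies, but measure(with_std_move).moves is at least 1, whereas measure(by_value).moves is typically 0 on compilers that apply NRVO. Some compilers flag the std::move version with a pessimizing-move warning.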

Returning a vector by value into a reference

I have the following code:
std::vector<Info*> filter(int direction)
{
std::vector<Info*> new_buffer;
for(std::vector<Info*>::iterator it=m_Buffer.begin();it<m_Buffer.end();it++)
{
if((*it)->direction == direction)
{
new_buffer.push_back(*it);
}
}
return new_buffer;
}
std::vector<Info*> &filteredInfo= filter(m_Direction);
Can someone explain what is happening here? Would the filter method's return by value create a temporary, and does filteredInfo never get destroyed because it's a reference?
Not sure if I understand correctly. What is the difference between filteredInfo being a reference and not being one in this case?
Your compiler should complain of that code.
This statement:
std::vector<Info*> &filteredInfo= filter(m_Direction);
is a bad idea where filter is:
std::vector<Info*> filter(int direction);
You are trying to bind a non-const reference to a temporary object. Even if your compiler accepts it, it's ill-formed.
You should use:
std::vector<Info*> filteredInfo= filter(m_Direction);
It's as efficient as you want. Either a move operation (C++11) will happen there or Return Value Optimization will kick in. For your implementation of filter, it should be RVO on optimized builds (it depends on your compiler quality though).
However, you should note that you are copying raw pointers into your vector. I hope you have a correct ownership model? If not, I advise you to use a smart pointer.
Here is what happens:
std::vector<Info*> new_buffer; creates an object locally.
return new_buffer; moves new_buffer to a temporary object when filter(m_Direction) is called.
Now if you call std::vector<Info*> filteredInfo= filter(m_Direction); the temporary object will be moved to filteredInfo, so there are no unnecessary copies and it's the most efficient way.
But, if you call std::vector<Info*> &filteredInfo= filter(m_Direction); then filteredInfo is bound to a temporary object, which is a terrible idea and most compilers will complain about this.
Here you're correctly puzzled because there are two independent weird facts mixing in:
Your compiler allows a non-const reference to be bound to a temporary. This historically was a mistake in Microsoft compilers and is not permitted by the standard. That code should not compile.
The standard however, strangely enough, actually allows binding const references to temporaries and has a special rule for that: the temporary object will not be destroyed immediately (like it would happen normally) but its life will be extended to the life of the reference.
In code:
#include <iostream>
#include <vector>
std::vector<int> foo() {
std::vector<int> x{1,2,3};
return x;
}
int main() {
const std::vector<int>& x = foo(); // legal: the temporary lives as long as x
for (auto& item : x) {
std::cout << item << std::endl; // print each element
}
}
The reason for this apparently absurd rule about binding const references to temporaries is that in C++ there is a very common "pattern"(1) of passing const references instead of values for parameters, even when identity is irrelevant. If you combine this (anti)-pattern with implicit conversion what happens is that for example:
void foo(const std::string& x) { ... }
wouldn't be callable with
foo("Hey, you");
without the special rule, because the const char * (literal) is implicitly converted to a temporary std::string and passed as parameter bound to a const reference.
(1) The pattern is indeed quite bad from a philosophical point of view because a value is a value and a reference is a reference: the two are logically distinct concepts. A const reference is not a value and confusing the two can be the source of very subtle bugs. C++ however is performance-obsessed and, especially before move semantics, passing const references was considered a "smart" way of passing values, despite being a problem because of lifetime and aliasing issues and for making things harder for the optimizer. With a modern compiler passing a reference should be used only for "big" objects, especially ones that are not constructed on the fly to be passed or when you're actually interested in object identity and not in just object value.
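The lifetime-extension rule for const references described above can be observed with a small instrumented type. This is a sketch only; Probe, make_probe and lifetime_is_extended are hypothetical names introduced for the illustration.

```cpp
// Hypothetical type that tracks how many instances are currently alive.
struct Probe {
    static int alive;
    Probe() { ++alive; }
    Probe(const Probe&) { ++alive; }
    ~Probe() { --alive; }
};
int Probe::alive = 0;

Probe make_probe() { return Probe(); }

// Returns true if the temporary stayed alive while the const reference was in
// scope and was destroyed when that scope ended.
bool lifetime_is_extended() {
    bool alive_inside = false;
    {
        const Probe& p = make_probe();  // temporary bound to a const reference
        (void)p;
        alive_inside = (Probe::alive == 1);
    }                                   // the reference's scope ends here
    return alive_inside && Probe::alive == 0;
}
```

The extension applies only when the temporary is bound directly to the reference; a reference obtained through a function call on the temporary (such as operator[]) does not extend anything.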

Is it valid C++ to cast an rvalue to a const pointer?

In a moment of haste, needing a pointer to an object to pass to a function, I took the address of an unnamed temporary object and to my surprise it compiled (the original code had warnings turned further down and lacked the const correctness present in the example below). Curious, I set up a controlled environment with warnings all the way up and treating warnings as errors in Visual Studio 2013.
Consider the following code:
class Contrived {
int something;
};
int main() {
const Contrived &r = Contrived(); // this is well defined even in C++03, the object lives until r goes out of scope
const Contrived *p1 = &r; // compiles fine, given the type of r this should be fine. But is it considering r was initialized with an rvalue?
const Contrived *p2 = &(const Contrived&)Contrived(); // this is handy when calling functions, is it valid? It also compiles
const int *p3 = &(const int&)27; // it works with PODs too, is it valid C++?
return 0;
}
The three pointer initializations are all more or less the same thing. The question is, are these initializations valid C++ under C++03, C++11, or both? I ask about C++11 separately in case something changed, considering that a lot of work was put in around rvalue references. It may not seem worthwhile to assign these values such as in the above example, but it's worth noting this could save some typing if such values are being passed to a function taking constant pointers and you don't have an appropriate object lying around or feel like making a temporary object on a line above.
EDIT:
Based on the answers the above is valid C++03 and C++11. I'd like to call out some additional points of clarification with regard to the resulting objects' lifetimes.
Consider the following code:
class Contrived {
int something;
} globalClass;
int globalPOD = 0;
template <typename T>
void SetGlobal(const T *p, T &global) {
global = *p;
}
int main() {
const int *p1 = &(const int&)27;
SetGlobal<int>(p1, globalPOD); // does *p still exist at the point of this call?
SetGlobal<int>(&(const int&)27, globalPOD); // since the rvalue expression is cast to a reference at the call site does *p exist within SetGlobal
// or similarly with a class
const Contrived *p2 = &(const Contrived&)Contrived();
SetGlobal<Contrived>(p2, globalClass);
SetGlobal<Contrived>(&(const Contrived&)Contrived(), globalClass);
return 0;
}
The question is are either or both of the calls to SetGlobal valid, in that they are passing a pointer to an object that will exist for the duration of the call under the C++03 or C++11 standard?
An rvalue is a type of expression, not a type of object. We're talking about the temporary object created by Contrived(), it doesn't make sense to say "this object is an rvalue". The expression that created the object is an rvalue expression, but that's different.
Even though the object in question is a temporary object, its lifetime has been extended. It's perfectly fine to perform operations on the object using the identifier r which denotes it. The expression r is an lvalue.
p1 is OK. On the p2 and p3 lines, the lifetime of the reference ends at the end of that full-expression, so the temporary object's lifetime also ends at that point. So it would be undefined behaviour to use p2 or p3 on subsequent lines. The initializing expression could be used as an argument to a function call though, if that's what you meant.
The first one is good: the expression r is not in fact an rvalue.
The other two are technically valid, too, but be aware that pointers become dangling at the end of the full expression (at the semicolon), and any attempt to use them would exhibit undefined behavior.
While it is perfectly legal to pass an rvalue by const&, you have to be aware that your code ends up with invalidated pointers in p2 and p3, since the lifetime of the objects that they point to is over.
To exemplify this, consider the following code that is often used to pass a temporary by reference:
template<typename T>
void pass_by_ref(T const&);
A function like this can be called with an lvalue or rvalue as its argument (and often is). Inside that function you can obviously take the address of your argument - it is just a reference to a const object after all... You are basically doing the exact same thing without the help of a function.
In fact, in C++11, you can go one step further and obtain a non-const pointer to a temporary:
template<typename T>
typename std::remove_reference<T>::type* example(T&& t)
{
return &t;
}
Note that the object the return value points to will only still exist if this function is called with an lvalue (since its argument will turn out to be typename remove_reference<T>::type& && which is typename remove_reference<T>::type&).
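The distinction the answers draw (the pointer is usable within the same full-expression but dangles afterwards) can be sketched as follows. set_global and global_pod are simplified, hypothetical versions of the question's SetGlobal and globalPOD, and static_cast is used in place of the C-style cast.

```cpp
int global_pod = 0;  // hypothetical global, standing in for globalPOD

// Copies the pointed-to value into the given object.
void set_global(const int* p, int& global) { global = *p; }

// Fine: the temporary int lives until the end of the full-expression, that is,
// until set_global has returned, so *p is valid inside the call.
int copy_from_call_site() {
    set_global(&static_cast<const int&>(27), global_pod);
    return global_pod;
}

// Also fine: binding to a named const reference extends the temporary's
// lifetime to the scope of r, so p stays valid on the following lines.
int copy_via_named_reference() {
    const int& r = 42;
    const int* p = &r;
    set_global(p, global_pod);
    return global_pod;
}

// Not shown as running code: storing &static_cast<const int&>(27) in a pointer
// variable and dereferencing it in a later statement would be undefined behaviour.
```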

Why Can you return a function by reference for a local variable and not for temporary variable? c++

for example this function f defined like this :
int f(int x){return x;}
as you know, you can't bind a non-const reference to this temporary int:
int& rf=f(2);// this will give an error
but if I redefined my function f like this:
int& f(int x){return x;}
f(2);// so now f(2) is a reference of x, which has been destroyed
So my question is: why does the compiler refuse to let you create a reference to a temporary that will be destroyed after the statement (in the first case), yet let you create a reference f(2) to x even though it knows that x will be destroyed on return?
Returning a reference to a local is something that can be difficult or impossible for the compiler to detect. For example:
int & f()
{
int x;
extern int & g(int & x);
// Does this return a reference to "x"?
// The compiler has no way to tell.
return g(x);
}
Even without calling external functions, it can still be difficult to analyse a complex program flow to tell whether the returned reference is to a local; rather than trying to define what counts as "simple enough" to diagnose, the standard doesn't require a diagnostic - it just states that it gives undefined behaviour. A good compiler should give a warning, at least in simple cases.
Binding a temporary to a non-const reference is something that the compiler can easily detect, and so the standard does require a diagnostic for that.
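The binding rules the compiler enforces here can be checked with standard type traits. This is a minimal sketch: std::is_convertible<From, To> reports whether a value of type From could be returned from a function whose return type is To, which mirrors the reference-binding cases in the question.

```cpp
#include <type_traits>

// An int temporary cannot bind to a non-const lvalue reference...
static_assert(!std::is_convertible<int, int&>::value,
              "int& rf = f(2); is rejected");
// ...but it can bind to a const lvalue reference or to an rvalue reference.
static_assert(std::is_convertible<int, const int&>::value,
              "const int& rf = f(2); is accepted");
static_assert(std::is_convertible<int, int&&>::value,
              "int&& rf = f(2); is accepted (C++11)");
// A named int (an lvalue) binds to int& without complaint, which is why
// `return x;` in a function returning int& compiles even when x is a local.
static_assert(std::is_convertible<int&, int&>::value,
              "int& r = x; is accepted");
```

The last case shows why the type system alone cannot catch the dangling return: the binding is well-typed, and only the lifetime of x makes it wrong.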
Because, as specified by the standard, using a reference to a local variable after the function has returned is undefined behavior.
What's wrong is actually the function definition:
int& f(int x)
{
return x;
}
With the original by-value f, you can bind the temporary to a const reference to prolong its lifetime.
const int& rf=f(2);
To reject your first code snippet, the compiler applies the simple rule that you cannot directly bind a temporary to a non-const reference.
To reject the second, the compiler could perhaps apply a rule that the return statement of a function that returns by reference cannot be the name of an automatic variable (including a by-value function parameter). That seems to me a fairly easy rule too.
I don't know why the standard doesn't specify that doing so is ill-formed. I can't think of any valid use for it, but perhaps at the time of the first standard it would have created an excessive burden on some implementation or other. Or perhaps it was felt that it's only half a fix and not worth bothering with (there are plenty of other ways to create a dangling reference, this just blocks one of them).
The reason why the standard does not in general stop you creating a non-const reference bound to a temporary is that there are occasions when it's OK. For example:
struct Foo {
static void set(Foo &f) { f.val = 0; }
int val;
int bar() {
set(*this);
return val;
}
};
std::cout << Foo().bar() << "\n";
Here Foo() is a temporary, and the line set(*this) binds it to a non-const reference (but not directly, it uses an lvalue expression *this that refers to a temporary some times but not others). There's no problem here, the temporary outlives the reference. So it would be unnecessarily restrictive for the language to somehow prevent any temporary from ever being bound to any non-const reference.
It's not a good idea to return by reference unless you're sure the reference you're returning will still be pointing at something valid.
Otherwise, undefined behavior is to be expected.
Since the variable is local, you can be sure that the reference is invalid.
The problem is that semantically there is no difference between a variable on a stack, or a variable on the heap etc. - So the language has no choice but to allow this, even if it is undefined behavior.
In the first example you get an easy compile error, because you're trying to bind a reference on a temporary variable, which can be forbidden by the language.

Example of code which incorrectly tries to re-seat a reference

As per my understanding, C++ does not allow you to re-seat a reference. In other words, you cannot change the object that a reference "refers" to. It's like a constant pointer in that regard (e.g. int* const a = &x;).
In some code I looked at today, I saw the following:
CMyObject& object = ObjectA().myObject();
// ...
object = ObjectB().myObject();
Immediately my alarm bells went off on the last line of code above. Wasn't the code trying to re-seat a reference? Yet the code compiled.
Then I realised that what the code was doing was simply invoking the assignment operator (i.e. operator=) to reassign ObjectA's internal object to ObjectB's internal object. The object reference still referred to ObjectA, it's just that the contents of ObjectA now matched that of ObjectB.
My understanding is that the compiler will always generate a default assignment operator if you don't provide one, which does a shallow copy (similar to the default copy constructor).
Since a reference is typed (just like the underlying object that it refers to), doesn't that mean that we will always invoke the assignment operator when attempting to re-seat a reference, thus preventing the compiler from complaining about this?
I've been racking my brains out trying to come up with an illegal line of code which will incorrectly try to re-seat a reference, to get the compiler to complain.
Can anyone point me to an example of such code?
You can't "reseat" a reference, because it's syntactically impossible. The reference variable you use which refers to the object uses the same semantics as if it was an object (non-reference) variable.
"I've been racking my brains out trying to come up with an illegal line of code which will incorrectly try to re-seat a reference, to get the compiler to complain."
const int i = 42;
const int j = 1337;
const int& r = i;
r = j;
The uninitiated might expect the last line to re-seat r to j, but instead it is an assignment to i through r, which fails to compile because i is const.
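For the non-const case, where the assignment does compile, a short sketch shows that it writes through the reference rather than rebinding it. The function name is hypothetical.

```cpp
// Returns true if `r = b` modified a and left r bound to a.
bool assignment_writes_through() {
    int a = 1;
    int b = 2;
    int& r = a;  // r is bound to a for its whole lifetime
    r = b;       // copies b's value into a; does not rebind r
    b = 3;       // if r referred to b, r would now read 3
    return &r == &a && a == 2 && r == 2;
}
```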
You can't write portable C++ code to reseat a reference... the compiler tracks where the reference refers to and doesn't allow it to be changed. It's a kind of alias for whatever it refers to, and in some cases the reference value may be incorporated directly into the code at compile time. On some implementations where a particular reference happens to be stored in the form of a pointer, and happens to be looked up at run time, you may be able to use a reinterpret cast to overwrite it with a pointer to another object, but the behaviour is totally undefined and unreliable. For what little it's worth (nothing practically, but perhaps a smidge in assisting understanding of likely implementation), that might look something like:
struct X
{
Y& y_;
X(Y& y) : y_(y) { }
};
...
X x(y1);
*reinterpret_cast<Y**>(&x) = &y2;
"My understanding is that the compiler will always generate a default assignment operator if you don't provide one, which does a shallow copy (similar to the default copy constructor).
Since a reference is typed (just like the underlying object that it refers to), doesn't that mean that we will always invoke the assignment operator when attempting to re-seat a reference, thus preventing the compiler from complaining about this?"
It's not quite like that. Implicit copy (assignment) performs memberwise copying (not necessarily shallow), and the compiler won't let bad things happen implicitly to reference members.
class X
{
int& ref;
public:
X(int& r): ref(r) {}
};
int main()
{
int i;
X a(i), b(i);
a = b; // error: no implicit copy assignment is generated for a class with a reference member
}
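The same point can be checked without triggering the compile error, using std::is_copy_assignable. The sketch also shows std::reference_wrapper as one possible way to get a member that can be rebound; WithRef and WithWrapper are hypothetical names, and the wrapper variant is an alternative not mentioned in the answers above.

```cpp
#include <functional>
#include <type_traits>

class WithRef {                 // same shape as class X above
    int& ref;
public:
    explicit WithRef(int& r) : ref(r) {}
};

class WithWrapper {             // assumed alternative using std::reference_wrapper
    std::reference_wrapper<int> ref;
public:
    explicit WithWrapper(int& r) : ref(r) {}
    int& get() const { return ref.get(); }
};

// Copy construction still works, but assignment is unavailable for WithRef.
static_assert(std::is_copy_constructible<WithRef>::value, "copying is fine");
static_assert(!std::is_copy_assignable<WithRef>::value, "assignment is not");
static_assert(std::is_copy_assignable<WithWrapper>::value, "wrapper is assignable");

// Returns true if assigning wrappers rebinds the target and leaves i untouched.
bool wrapper_rebinds() {
    int i = 1;
    int j = 2;
    WithWrapper a(i), b(j);
    a = b;                      // a now refers to j; the value of i is unchanged
    return &a.get() == &j && i == 1;
}
```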