I'm a little confused about reference types in C++. Here is my code snippet.
class data
{
public:
std::vector<int> Get() const
{
return vec;
}
private:
std::vector<int> vec = {1,2,3,4};
};
int main()
{
data dat;
auto const& a = dat.Get()[1];
auto const& b = dat.Get()[2];
std::cout << "a = " << a << ", b = " << b << std::endl;
return 0;
}
The output a = 0, b = 1433763856 doesn't make any sense. After I remove the & before a and b, everything works fine. Here are my questions:
Since a reference to a reference is not allowed, but vector::operator[] does return a reference to an element inside the container, why is no error thrown?
I know the data::Get() function makes a deep copy, but why do I get the wrong values for a and b? Is the return value destroyed right after the function call?
You return a copy of the vector, as the signature
std::vector<int> Get() const
implies, as opposed to
std::vector<int> /*const*/& Get() const
which would return a reference. This is true, but it doesn't really explain why returning a copy is a mistake in this situation.
After all, if the call was
auto const& v = dat.Get(); // *your* version here, the one returning by copy
v would not be dangling.
The point is that you're not keeping that copy alive by binding it to a reference (as I've done in the last snippet).
Instead, you're calling operator[] on that temporary, and that call results in a reference to an entry (int&) in that vector. When the temporary vector returned by dat.Get() is destroyed, that's the reference which dangles.
If operator[] returned by value, then not even the a and b in your example would dangle.
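A minimal sketch of that fix, assuming a renamed copy of the question's class (Data) and a hypothetical helper read_two that is not part of the original code. Binding the returned copy to a const reference keeps it alive for the rest of the scope, so indexing into it is fine:

```cpp
#include <utility>
#include <vector>

class Data {
public:
    std::vector<int> Get() const { return vec; }  // returns a copy, as in the question
private:
    std::vector<int> vec = {1, 2, 3, 4};
};

// Hypothetical helper: reads elements 1 and 2 without leaving anything dangling.
std::pair<int, int> read_two(const Data& dat) {
    auto const& v = dat.Get();  // the temporary vector now lives as long as v
    auto const& a = v[1];       // refers into v, which is still alive
    auto const& b = v[2];
    return {a, b};              // the ints are copied out before v is destroyed
}
```

With this, read_two(Data{}) yields the pair (2, 3) instead of garbage.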
Today I saw my boss's code which uses a const reference as a map's value type.
Here's the code:
class ConfigManager{
public:
map<PB::Point, const PB::WorldPoint&, compare_point> world_point;
//the rest are omitted
};
(PB is Google Protobuf, we are using the Protobuf library. I don't know much about it or if it's relevant to the question. )
What this class does is read some config files and put their contents into some maps for searching.
At first I was surprised because I haven't seen a map with a reference as its value, which is e.g. map<int, classA&> aMap.
So then I searched on SO and these 2 questions tell me that I can't do that.
C++: Is it possible to use a reference as the value in a map?
STL map containing references does not compile
Then I tried this code, indeed it doesn't compile:
Code Example1
struct A {
int x = 3;
int y = 4;
};
map<int, A&> myMap;
int main() {
A a;
myMap.insert(make_pair(1, a));
}
But if I change map<int, A&> myMap; to map<int, const A&> myMap;, it compiles.
Yet another problem occurred. With map<int, const A&> myMap;, I can't use [] to get the value, but I can use map.find().
(My boss told me to use map.find() after I told him that using [] doesn't compile.)
Code Example2
struct A {
int x = 3;
int y = 4;
};
map<int, const A&> myMap;
int main() {
A a;
myMap.insert(make_pair(1, a));
//can't compile
cout << myMap[1].x << " " << myMap[1].y << endl;
//can work
//auto it = myMap.find(1);
//cout << it->second.x << " " << it->second.y << endl;
}
So till here I was thinking my boss was correct. His code was correct.
The last story is that I showed the code to some online friends. And they noticed a problem.
Code Example3
#include <map>
#include <iostream>
#include <string>
using namespace std;
struct A {
int x = 3;
int y = 4;
~A(){
cout << "~A():" << x << endl;
x = 0;
y = 0;
}
};
map<string, const A&> myMap;
int main() {
A a;
cout << a.x << " " << a.y << endl;
myMap.insert(make_pair("love", a));
a.x = 999;
cout << "hello" << endl;
auto s = myMap.find("love");
cout << s->second.x << " " << s->second.y << endl;
}
The output is:
3 4
~A():3
hello
0 0
~A():999
If I understand the output correctly (correct me if I get it wrong), it indicates that:
make_pair("love", a) creates an object pair<"love", temporary copy of a>. And the pair gets inserted into myMap.
Somehow, I don't know how it happens, the temporary copy of a gets destructed immediately. To me, it means the memory of the temporary copy of a is now not owned by anyone and it is now a free space of memory that can be filled with any values, if I understand correctly.
So now I am getting confused again.
My questions are:
What happens to the Code Example3? Is my understanding correct? Why does temporary copy of a get destructed right after the statement? Can't using a const reference extend a temporary's lifetime? I mean, I think it should not get destructed till main finishes.
Is my boss's code incorrect and very dangerous?
Why does temporary copy of a get destructed right after the statement?
Because (in most cases) that's how temporaries work. They live until the end of the statement in which they are created. The extension of a temporary's lifetime doesn't apply in this case, see here. The TLDR version is
In general, the lifetime of a temporary cannot be further extended by
"passing it on": a second reference, initialized from the reference to
which the temporary was bound, does not affect its lifetime.
can I use const reference as a map's value type?
Yes, as long as you realise that adding a const reference to a map has no effect on the lifetime of the object being referred to. Your boss's code is also incorrect because the temporary returned by make_pair is destroyed at the end of the statement.
You may use std::unique_ptr<A> instead, and emplace instead of insert:
using value_t = std::unique_ptr<A>;
std::map<int, value_t> myMap;
myMap.emplace(1, std::make_unique<A>());
// a raw pointer cannot be assigned to a unique_ptr, so build a new one (std::make_unique needs C++14)
myMap[1] = std::make_unique<A>(A{5, 6});
myMap[1]->x = 7;
More on std::unique_ptr<A>:
https://en.cppreference.com/w/cpp/memory/unique_ptr
What happens to the Code Example3? Is my understanding correct?
Your explanation is close. The std::pair that is returned by std::make_pair is the temporary object. The temporary std::pair contains the copy of a. At the end of the expression the pair is destroyed, which also destroys its elements including the copy of a.
Why does temporary copy of a get destructed right after the statement? Can't using a const reference extend a temporary's lifetime? I mean, I think it should not get destructed till main finishes.
The temporary here is the result of std::make_pair which is being used as an argument to the member function insert. The relevant rules that apply here are :
Whenever a reference is bound to a temporary or to a subobject thereof, the lifetime of the temporary is extended to match the lifetime of the reference, with the following exceptions:
[...]
a temporary bound to a reference parameter in a function call exists until the end of the full expression containing that function call [...]
[...]
source
The full expression containing the function call is the expression myMap.insert(make_pair(1, a)); This means that the lifetime of the result of std::make_pair, including the A it contains, ends after the function returns. The new std::map element refers to the A in the temporary std::pair, so the reference becomes dangling once the statement containing insert finishes.
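The quoted rule can be observed directly. The following is only a sketch with hypothetical names (Noisy, alive_inside_call, observe_temporary), none of which come from the question: a temporary bound to a reference parameter is alive during the call and gone once the full expression ends.

```cpp
// Hypothetical type that records its own destruction in an external flag.
struct Noisy {
    bool* destroyed;
    explicit Noisy(bool* flag) : destroyed(flag) {}
    ~Noisy() { *destroyed = true; }
};

// The temporary bound to the parameter n is still alive inside the call.
bool alive_inside_call(const Noisy& n) { return !*n.destroyed; }

struct Report {
    bool alive_during_call;
    bool destroyed_afterwards;
};

Report observe_temporary() {
    bool destroyed = false;
    // The Noisy temporary lives until the end of this full expression (the ';').
    bool alive = alive_inside_call(Noisy(&destroyed));
    return Report{alive, destroyed};
}
```

Both fields of the returned Report come out true, which is the same thing that happens to the pair passed to insert: anything that kept a reference to it is left dangling on the next line.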
Is my boss's code incorrect and very dangerous?
Yes, myMap contains dangling references.
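Two ways to avoid the dangling reference, as a rough sketch rather than a drop-in replacement for the ConfigManager code. The function names are hypothetical, and the second variant assumes the referenced object really does outlive the map:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

struct A {
    int x = 3;
    int y = 4;
};

// Option 1: the map owns its own copy of the value.
int lookup_by_value() {
    A a;
    std::map<std::string, A> m;
    m.emplace("love", a);  // copies a into the map
    a.x = 999;             // does not affect the stored copy
    auto it = m.find("love");
    assert(it != m.end());
    return it->second.x;
}

// Option 2: the map stores a non-owning reference to an object that outlives it.
int lookup_by_reference() {
    A a;  // declared before m, so it is destroyed after m
    std::map<std::string, std::reference_wrapper<const A>> m;
    m.emplace("love", std::cref(a));  // refers to a itself, not to a temporary
    a.x = 999;                        // visible through the map
    auto it = m.find("love");
    assert(it != m.end());
    return it->second.get().x;
}
```

lookup_by_value() returns 3 and lookup_by_reference() returns 999. With std::reference_wrapper the reference semantics are at least explicit, and operator[] still won't compile because a reference_wrapper cannot be default-constructed.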
I have two classes, class A and class B. Class A has a map of type map<int,int>.
In class A, I have the following definition,
typedef std::map<int, int> mymap;
mymap MYMAP;
A A_OBJ;
// did some insert operations on A_OBJ's MYMAP
I also have the following function in class A that when invoked by B will return A's MYMAP as a copy to class B.
A::mymap A::get_port(){
// Returns A's map
return this -> MYMAP;
}
In class B,
void B::initialize_table(){
A::mymap port_table = A_OBJ.get_port();
cout << "size of ports table at A is " << port_table.size();
}
The code compiles without any issue. The only problem is that even if I insert some data into A's map, B always shows that A's map has 0 elements.
I have a timer in B which calls initialize_table() every 2s, and it is supposed to get the latest copy of the map from A. I'm not sure where it went wrong.
Any help is appreciated. Thanks.
You're correct that you're creating a copy of the std::map. The way to fix this is by initializing a reference to the map.
Consider the following stub:
#include <iostream>
#include <map>
using my_map = std::map<int, int>;
struct A {
A() : m() {}
my_map& get_my_map() { return m; }
my_map m;
};
struct B {
B() : a() {}
void initialize_map_ref();
void initialize_map_val();
void print_index_42() { std::cout << a.get_my_map()[42] << '\n'; }
A a;
};
void B::initialize_map_ref() {
// notice this is a reference
my_map& m = a.get_my_map();
m[42] = 43;
}
void B::initialize_map_val() {
// notice this is a copy
my_map m = a.get_my_map();
m[42] = 43;
}
int main() {
B b;
b.initialize_map_ref();
b.print_index_42();
return 0;
}
B::initialize_map_ref binds a reference to the map inside a, whereas B::initialize_map_val creates a copy and modifies only the copy. The copy dies at the end of the call, so after calling initialize_map_val alone, a's map has no entry at 42 (and a.get_my_map()[42] would yield 0). The change made through the reference, on the other hand, persists because it modifies the underlying object.
Ideone: With reference and with value.
I suggest returning by an R-value reference for the sake of efficiency (i.e. A::mymap&& A::get_port() { return A::mymap(this->MYMAP); }), though see the EDIT below about the lifetime problem this introduces.
I can only see your code failing in one of two circumstances:
you are updating A_OBJ.MYMAP after you call A_OBJ.get_port().
This will cause the copy to be out of date.
or you are modifying the value returned by A_OBJ.get_port() and then calling A_OBJ.get_port() again. This modifies the copy but leaves the original map unmodified, so the second call to A_OBJ.get_port() does not see any changes made to the value returned by the first.
Depending on what you want you may want to return a (const) reference to the map.
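A small sketch of the difference, using hypothetical accessor names (ports, ports_copy, add_port) rather than the ones from the question: a const reference keeps tracking the original map, while a copy is a snapshot taken at the time of the call.

```cpp
#include <cstddef>
#include <map>

class A {
public:
    using mymap = std::map<int, int>;
    const mymap& ports() const { return ports_; }  // no copy, read-only view
    mymap ports_copy() const { return ports_; }    // independent snapshot
    void add_port(int key, int value) { ports_[key] = value; }
private:
    mymap ports_;
};

std::size_t size_seen_through_reference() {
    A a;
    const A::mymap& view = a.ports();
    a.add_port(1, 80);   // inserted after the reference was taken
    return view.size();  // the reference sees the insertion
}

std::size_t size_seen_through_copy() {
    A a;
    A::mymap snapshot = a.ports_copy();
    a.add_port(1, 80);       // inserted after the copy was taken
    return snapshot.size();  // the snapshot does not see it
}
```

The first function returns 1 and the second returns 0. The reference is only valid while the A object is alive, so this fits the timer case only if A_OBJ outlives B's use of it.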
EDIT: I originally thought that A::mymap port_table = A_OBJ.get_port(); would cause two copies, but now I realize it will cause a copy and then a move. It would also do that in the rvalue case; however, that is likely to introduce undefined behaviour (I think) because it returns a reference to a temporary. (Originally I had it return this -> MYMAP, but that would be an error, as it would try to bind an lvalue (this -> MYMAP) to an rvalue reference.)
I'm trying to understand rvalue references and move semantics of C++11.
What is the difference between these examples, and which of them is going to do no vector copy?
First example
std::vector<int> return_vector(void)
{
std::vector<int> tmp {1,2,3,4,5};
return tmp;
}
std::vector<int> &&rval_ref = return_vector();
Second example
std::vector<int>&& return_vector(void)
{
std::vector<int> tmp {1,2,3,4,5};
return std::move(tmp);
}
std::vector<int> &&rval_ref = return_vector();
Third example
std::vector<int> return_vector(void)
{
std::vector<int> tmp {1,2,3,4,5};
return std::move(tmp);
}
std::vector<int> &&rval_ref = return_vector();
First example
std::vector<int> return_vector(void)
{
std::vector<int> tmp {1,2,3,4,5};
return tmp;
}
std::vector<int> &&rval_ref = return_vector();
The first example returns a temporary which is caught by rval_ref. That temporary will have its life extended beyond the rval_ref definition and you can use it as if you had caught it by value. This is very similar to the following:
const std::vector<int>& rval_ref = return_vector();
except that in my rewrite you obviously can't use rval_ref in a non-const manner.
Second example
std::vector<int>&& return_vector(void)
{
std::vector<int> tmp {1,2,3,4,5};
return std::move(tmp);
}
std::vector<int> &&rval_ref = return_vector();
In the second example you have created undefined behavior that will likely show up as a run-time error. rval_ref now holds a reference to the destructed tmp inside the function. With any luck, this code would immediately crash.
Third example
std::vector<int> return_vector(void)
{
std::vector<int> tmp {1,2,3,4,5};
return std::move(tmp);
}
std::vector<int> &&rval_ref = return_vector();
Your third example is roughly equivalent to your first. The std::move on tmp is unnecessary and can actually be a performance pessimization as it will inhibit return value optimization.
The best way to code what you're doing is:
Best practice
std::vector<int> return_vector(void)
{
std::vector<int> tmp {1,2,3,4,5};
return tmp;
}
std::vector<int> rval_ref = return_vector();
I.e. just as you would in C++03. tmp is implicitly treated as an rvalue in the return statement. It will either be returned via return-value-optimization (no copy, no move), or if the compiler decides it can not perform RVO, then it will use vector's move constructor to do the return. Only if RVO is not performed, and if the returned type did not have a move constructor would the copy constructor be used for the return.
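To check the "no copy" claim yourself, here is a sketch with a hypothetical Tracker type that counts how often its copy and move constructors run (the names are made up for illustration):

```cpp
// Hypothetical type that counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// Same shape as the best-practice return_vector: return the local by name.
Tracker return_local() {
    Tracker tmp;
    return tmp;  // elided, or treated as an rvalue and moved; never copied
}

struct Counts {
    int copies;
    int moves;
};

Counts counts_for_plain_return() {
    Tracker::copies = 0;
    Tracker::moves = 0;
    Tracker t = return_local();
    (void)t;  // silence unused-variable warnings
    return Counts{Tracker::copies, Tracker::moves};
}
```

The copies field is always 0. The moves field depends on whether the compiler elides the return, so it is usually 0 as well but is not guaranteed to be.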
None of them will copy, but the second will refer to a destroyed vector. Named rvalue references almost never exist in regular code. You write it just how you would have written a copy in C++03.
std::vector<int> return_vector()
{
std::vector<int> tmp {1,2,3,4,5};
return tmp;
}
std::vector<int> rval_ref = return_vector();
Except now, the vector is moved. The user of a class doesn't deal with its rvalue references in the vast majority of cases.
The simple answer is you should write code for rvalue references like you would regular references code, and you should treat them the same mentally 99% of the time. This includes all the old rules about returning references (i.e. never return a reference to a local variable).
Unless you are writing a template container class that needs to take advantage of std::forward and be able to write a generic function that takes either lvalue or rvalue references, this is more or less true.
One of the big advantages of the move constructor and move assignment is that if you define them, the compiler can use them in cases where RVO (return value optimization) and NRVO (named return value optimization) fail to be invoked. This is pretty huge for returning expensive objects like containers and strings by value efficiently from methods.
Now where things get interesting with rvalue references, is that you can also use them as arguments to normal functions. This allows you to write containers that have overloads for both const reference (const foo& other) and rvalue reference (foo&& other). Even if the argument is too unwieldy to pass with a mere constructor call it can still be done:
std::vector<MyCheapType> vec;
for(int x=0; x<10; ++x)
{
// automatically uses rvalue reference constructor if available
// because MyCheapType is an unnamed temporary variable
vec.push_back(MyCheapType(0.f));
}
std::vector<MyExpensiveType> vec;
for(int x=0; x<10; ++x)
{
MyExpensiveType temp(1.0, 3.0);
temp.initSomeOtherFields(malloc(5000));
// old way, passed via const reference, expensive copy
vec.push_back(temp);
// new way, passed via rvalue reference, cheap move
// just don't use temp again, not difficult in a loop like this though . . .
vec.push_back(std::move(temp));
}
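How the compiler picks between the const-reference and rvalue-reference overloads can be seen with a tiny sketch. chosen_overload and the three callers are hypothetical names used only for illustration:

```cpp
#include <string>
#include <utility>

// Two overloads, like push_back(const T&) and push_back(T&&).
std::string chosen_overload(const std::string&) { return "copy"; }
std::string chosen_overload(std::string&&) { return "move"; }

std::string with_lvalue() {
    std::string s = "x";
    return chosen_overload(s);  // named variable: binds to const&
}

std::string with_temporary() {
    return chosen_overload(std::string("x"));  // unnamed temporary: binds to &&
}

std::string with_std_move() {
    std::string s = "x";
    return chosen_overload(std::move(s));  // explicit cast to rvalue: binds to &&
}
```

with_lvalue() returns "copy", while with_temporary() and with_std_move() both return "move".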
The STL containers have been updated to have move overloads for nearly anything (hash key and values, vector insertion, etc), and that is where you will see them the most.
You can also use them in normal functions, and if you only provide an rvalue reference argument you can force the caller to create the object and let the function do the move. This is more of an example than a really good use, but in my rendering library, I have assigned a string to all the loaded resources, so that it is easier to see what each object represents in the debugger. The interface is something like this:
TextureHandle CreateTexture(int width, int height, ETextureFormat fmt, string&& friendlyName)
{
std::unique_ptr<TextureObject> tex = D3DCreateTexture(width, height, fmt);
tex->friendlyName = std::move(friendlyName);
return tex;
}
It is a form of a 'leaky abstraction' but allows me to take advantage of the fact I had to create the string already most of the time, and avoid making yet another copy of it. This isn't exactly high-performance code but is a good example of the possibilities as people get the hang of this feature. This code actually requires that the variable either be a temporary to the call, or std::move invoked:
// move from temporary
TextureHandle htex = CreateTexture(128, 128, A8R8G8B8, string("Checkerboard"));
or
// explicit move (not going to use the variable 'str' after the create call)
string str("Checkerboard");
TextureHandle htex = CreateTexture(128, 128, A8R8G8B8, std::move(str));
or
// explicitly make a copy and pass the temporary of the copy down
// since we need to use str again for some reason
string str("Checkerboard");
TextureHandle htex = CreateTexture(128, 128, A8R8G8B8, string(str));
but this won't compile!
string str("Checkerboard");
TextureHandle htex = CreateTexture(128, 128, A8R8G8B8, str);
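For anyone who wants to try the three accepted call forms without the rendering library, here is a self-contained sketch. Texture, create_texture and the from_* helpers are stand-ins invented for illustration, not the real interface:

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for the texture object.
struct Texture {
    int width;
    int height;
    std::string friendlyName;
};

// Accepts only rvalues for the name, so the caller must hand over a string.
Texture create_texture(int width, int height, std::string&& friendlyName) {
    return Texture{width, height, std::move(friendlyName)};
}

Texture from_temporary() {
    return create_texture(128, 128, std::string("Checkerboard"));
}

Texture from_named_variable() {
    std::string str("Checkerboard");
    return create_texture(128, 128, std::move(str));  // str is not used again
}

Texture from_explicit_copy(const std::string& str) {
    return create_texture(128, 128, std::string(str));  // caller keeps str intact
}
```

Passing a plain named std::string to create_texture fails to compile, exactly as in the last snippet above.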
Not an answer per se, but a guideline. Most of the time there is not much sense in declaring a local T&& variable (as you did with std::vector<int>&& rval_ref). You will still have to std::move() it to pass it to foo(T&&)-type methods. There is also the problem, already mentioned, that when you try to return such an rval_ref from a function you get the standard reference-to-destroyed-temporary fiasco.
Most of the time I would go with following pattern:
// Declarations
A a(B&&, C&&);
B b();
C c();
auto ret = a(b(), c());
You don't hold any refs to returned temporary objects, thus you avoid the (inexperienced) programmer's error of using a moved-from object:
auto bRet = b();
auto cRet = c();
auto aRet = a(std::move(bRet), std::move(cRet));
// Either these just fail (assert/exception), or you won't get
// your expected results due to their clean state.
bRet.foo();
cRet.bar();
Obviously there are (although rather rare) cases where a function truly returns a T&& which is a reference to a non-temporary object that you can move into your object.
Regarding RVO: these mechanisms generally work and the compiler can nicely avoid copying, but in cases where the return path is not obvious (exceptions, if conditionals determining which named object you return, and probably a couple of others) rrefs are your saviors (even if potentially more expensive).
None of those will do any extra copying. Even if RVO isn't used, the new standard says that move construction is preferred to copy when doing returns I believe.
I do believe that your second example causes undefined behavior though because you're returning a reference to a local variable.
As already mentioned in comments to the first answer, the return std::move(...); construct can make a difference in cases other than returning of local variables. Here's a runnable example that documents what happens when you return a member object with and without std::move():
#include <iostream>
#include <utility>
struct A {
A() = default;
A(const A&) { std::cout << "A copied\n"; }
A(A&&) { std::cout << "A moved\n"; }
};
class B {
A a;
public:
operator A() const & { std::cout << "B C-value: "; return a; }
operator A() & { std::cout << "B L-value: "; return a; }
operator A() && { std::cout << "B R-value: "; return a; }
};
class C {
A a;
public:
operator A() const & { std::cout << "C C-value: "; return std::move(a); }
operator A() & { std::cout << "C L-value: "; return std::move(a); }
operator A() && { std::cout << "C R-value: "; return std::move(a); }
};
int main() {
// Non-constant L-values
B b;
C c;
A{b}; // B L-value: A copied
A{c}; // C L-value: A moved
// R-values
A{B{}}; // B R-value: A copied
A{C{}}; // C R-value: A moved
// Constant L-values
const B bc;
const C cc;
A{bc}; // B C-value: A copied
A{cc}; // C C-value: A copied
return 0;
}
Presumably, return std::move(some_member); only makes sense if you actually want to move the particular class member, e.g. in a case where class C represents short-lived adapter objects with the sole purpose of creating instances of struct A.
Notice how struct A always gets copied out of class B, even when the class B object is an R-value. This is because the compiler has no way to tell that class B's instance of struct A won't be used any more. In class C, the compiler does have this information from std::move(), which is why struct A gets moved, unless the instance of class C is constant.
If I use a const reference to another member,
is it possible that this reference gets invalidated?
class Class {
public:
const int &x{y};
private:
int y;
};
For example when I use instances of this class in a vector
which increases its capacity after a push_back.
According to the standard all iterators and references are invalidated if
a vector has to increase its capacity. Is the reference still valid after that?
This is currently not safe: when you copy an instance of Class, the copy's x will reference the y of the object it was copied from, not its own y. You can see this by running the following code:
int main()
{
Class a{};
std::vector<Class> vec;
vec.push_back(a);
//these lines print the same address
std::cout << &(a.x) << std::endl;
std::cout << &(vec[0].x) << std::endl;
}
You can fix this by writing your own copy constructor and assignment functions to correctly initialize x:
Class (const Class& rhs) : x{y}, y{rhs.y} {}
This is safe, because x and y will only be destroyed along with your object. Invalidation of references for std::vector means references to the vector elements:
Class c;
std::vector<Class> vec;
vec.push_back(c);
Class& cr = vec[0];
//other operations on vec
std::cout << c.x; //fine, that reference is internal to the class
std::cout << cr.x; //cr could have been invalidated
Assuming this is a reference to another member of the same instance, you need to define a copy constructor to initialize it. The default copy constructor will copy the reference to 'y' from the old instance, which might be invalidated.
Why you need a reference to a member is another question.
P.S. You also need to define an assignment operator for the same reason.
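Putting both pieces together, a sketch of the class with a user-defined copy constructor and assignment operator. The helper refers_to_own_member and the test function are hypothetical additions, and y is declared before x here purely so that it is initialized first:

```cpp
#include <vector>

class Class {
    int y = 0;        // declared before x so it exists when x is bound
public:
    const int& x{y};  // always refers to this object's own y

    Class() = default;
    // Copy only the value; x keeps its default initializer and binds to our own y.
    Class(const Class& rhs) : y{rhs.y} {}
    // The implicit copy assignment is deleted because of the reference member.
    Class& operator=(const Class& rhs) {
        y = rhs.y;
        return *this;
    }

    bool refers_to_own_member() const { return &x == &y; }
};

bool all_elements_self_consistent() {
    std::vector<Class> vec;
    for (int i = 0; i < 100; ++i) {
        vec.push_back(Class{});  // growth forces several reallocations
    }
    for (const Class& c : vec) {
        if (!c.refers_to_own_member()) return false;
    }
    return true;
}
```

all_elements_self_consistent() returns true, because every reallocation goes through the copy constructor, which rebinds x to the new element's own y.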
In C++, operator-> has special semantics, in that if the returned type isn't a pointer, it will call operator-> again on that type. But, the intermediate value is kept as a temporary by the calling expression. This allows code to detect changes in the returned value:
template<class T>
class wrapper
{
// ...
T val;
struct arrow_helper
{
arrow_helper(const T& temp)
: temp(temp){}
T temp;
T* operator->() { return &temp; }
~arrow_helper() { std::cout << "modified to " << temp << '\n'; }
};
arrow_helper operator->() { return{ val }; }
//return const value to prevent mistakes
const T operator*() const { return val; }
};
and then T's members can be accessed transparently:
wrapper<Foo> f(/*...*/);
f->bar = 6;
Is there anything that could go wrong from doing this? Also, is there a way to get this effect with functions other than operator->?
EDIT: Another issue I've come across is in expressions like
f->bar = f->bar + 6;
since when the arrow_helper from the second operator-> is destructed it re-overwrites the value back to the original. My semi-elegant solution is for arrow_helper to have a T orig that is hidden, and assert(orig == *owner) in the destructor.
There is no guarantee that all changes will be caught:
auto &x = f->bar;
x = 6; /* undetected change */
If there is no way to grab a reference to any data within T through T's interface or otherwise, I think this should be safe. If there's any way to grab such a pointer or reference, you're in undefined-behavior territory as soon as someone saves off such a reference and uses it later.
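For reference, a compilable sketch of the write-back variant that the question's EDIT hints at. It assumes the helper keeps a reference to its owner and commits its working copy in the destructor; Foo and set_and_read_back are hypothetical, and this is not the asker's actual code:

```cpp
struct Foo {
    int bar = 0;  // hypothetical payload type
};

template <class T>
class wrapper {
    T val{};

    struct arrow_helper {
        wrapper& owner;  // where to write the result back
        T temp;          // working copy handed out through operator->
        explicit arrow_helper(wrapper& w) : owner(w), temp(w.val) {}
        T* operator->() { return &temp; }
        // Runs at the end of the full expression that called wrapper::operator->.
        ~arrow_helper() { owner.val = temp; }
    };

public:
    arrow_helper operator->() { return arrow_helper{*this}; }
    T operator*() const { return val; }
};

int set_and_read_back() {
    wrapper<Foo> f;
    f->bar = 6;  // the helper is destroyed at the ';' and commits the change
    return (*f).bar;
}
```

set_and_read_back() returns 6. A reference saved from f->bar would point into the helper's temp, which no longer exists after the statement ends, so writing through it is exactly the undetected (and undefined) case described above.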