How to avoid unnecessary instances using rvalue references in C++

I would like to create a custom container Container that stores data in individual arrays. However, to facilitate easy iterations over the container, I provide a 'view' on the container by overloading operator[] and return a single struct Value that holds all container variables as references to the actual container. This is what I got so far:
#include <iostream>
using namespace std;
struct Value {
Value(int& data) : data_(data) { }
int& data() { return data_; }
int& data_;
};
struct Container {
Value makeValue(int i) { return Value(data_[i]); } // EDIT 1
Value&& operator[](int i) {
// return std::forward<Value>(Value(data_[i]));
return std::forward<Value>(makeValue(i)); // EDIT 1
}
int data_[5] = {1, 2, 3, 4, 5};
};
int main(int, char**)
{
// Create and output temporary
Container c;
cout << c[2].data() << endl; // Output: 3 - OK!
// Create, modify and output copy
Value v = c[2];
cout << v.data() << endl; // Output: 3 - OK!
v.data() = 8;
cout << v.data() << endl; // Output: 8 - OK!
// Create and output reference
Value&& vv = c[2];
cout << vv.data() << endl; // Output: 8 - OK, but weird:
// shouldn't this be a dangling reference?
cout << vv.data() << endl; // Output: 468319288 - Bad, but that's expected...
}
The code above is working as far as I can tell, but I'm wondering if I use the best approach here:
Is it correct to return the Value as an rvalue reference if I want to avoid unnecessary copying?
Is the use of std::forward correct? Should I use std::move (both will work in this example) or something else?
The output of the compiled program is stated in the comments. Is there any way I can avoid the dangling reference when I declare Value&& vv... (or even forbid it syntactically)?
EDIT 1
I made a small change to the source code so that the Value instance is not directly created in the operator[] method but in another helper function. Would that change anything? Should I use the makeValue(int i) method as shown or do I need to use std::move/std::forward in here?

Is it correct to return the Value as an rvalue reference if I want to avoid unnecessary copying?
No. Returning rvalue references from something that isn't a helper like std::move or std::forward is flat-out wrong. Rvalue references are still references. Returning a reference to a temporary or a local variable has always been wrong and it still is wrong. These are the same C++ rules of old.
Is the use of std::forward correct? Should I use std::move (both will work in this example) or something else?
The answer to the previous question kinda makes this one moot.
The output of the compiled program is stated in the comments. Is there any way I can avoid the dangling reference when I declare Value&& vv... (or even forbid it syntactically)?
It's not the Value&& vv = c[2]; part that creates a dangling reference. It's operator[] itself: see answer to the first question.
Rvalue references change pretty much nothing in this case. Just do things as you would have always done:
Value operator[](int i) {
return Value(data_[i]);
}
Any compiler worth using will optimise this into a direct initialisation of the return value without any copies or moves or anything (since C++17 the language guarantees this elision). With dumb/worthless/weird/experimental pre-C++17 compilers it will at worst involve a move (but why would anyone use such a thing for serious stuff?).
So, the line Value v = c[2]; will initialise v directly. The line Value&& vv = c[2]; will initialise a temporary and bind it to the rvalue reference variable. Rvalue references have the same property that const& always had: they extend the lifetime of the temporary to the lifetime of the reference, so it wouldn't be dangling.
In sum, the same old C++ of always still works, and still gives results that are both correct and performant. Do not forget it.

Returning a reference to a temporary object, even if it is an rvalue reference, is always wrong! By the time you access the object it will be gone. In this case it also doesn't do what you want it to do, anyway: if you want to avoid unnecessary copies, have one return statement returning a temporary! Copy/move elision will take care of the object not being copied:
Value operator[](int i) {
return Value(data_[i]);
}
Passing the temporary object through a function such as std::forward or std::move inhibits copy/move elision, and not copying or moving at all is even less work than moving.

Related

Reference initialization - temporary bound to return value

In an article about reference initialization at cppreference.com (Lifetime of a temporary), it says:
a temporary bound to a return value of a function in a return statement is not extended: it is destroyed immediately at the end of the return expression. Such function always returns a dangling reference.
This excerpt addresses the exceptions of extending the lifetime of a temporary by binding a reference to it. What do they actually mean by that? I've thought about something like
#include <iostream>
int&& func()
{
return 42;
}
int main()
{
int&& foo = func();
std::cout << foo << std::endl;
return 0;
}
So foo should be referencing the temporary 42. According to the excerpt, this should be a dangling reference - but this prints 42 instead of some random value, so it works perfectly fine.
I'm sure I'm getting something wrong here, and would appreciate if somebody could resolve my confusion.

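Your example is very good: func() really does return a dangling reference. Reading through it is undefined behavior, and printing 42 is just one of the possible outcomes, so the output is no proof that the code works.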
A temporary is often a literal value, a function return value, but also an object passed to a function using the syntax "class_name(constructor_arguments)". For example, before lambda expressions were introduced to C++, to sort things one would define some struct X with an overloaded operator() and then make a call like this:
std::sort(v.begin(), v.end(), X());
In this case you expect that the lifetime of the temporary constructed with X() will end on the semicolon that ends the instruction.
If you call a function that expects a const reference, say, void f(const int & n), with a temporary, e.g. f(2), the compiler creates a temporary int, initialises it with 2, and passes a reference to this temporary to the function. You expect this temporary to end its life with the semicolon in f(2);.
Now consider this:
int && ref = 2;
std::cout << ref;
This code is perfectly valid. Notice, however, that here the compiler also creates a temporary object of type int and initialises it with 2. It is this temporary that ref binds to. However, if the temporary's lifetime were limited to the statement it is created in, and ended on the semicolon that marks the end of that statement, the next statement would be a disaster, as cout would be using a dangling reference. Thus, references to temporaries like the one above would be rather impractical. This is what the "extension of the lifetime of a temporary" is needed for. I suspect that the compiler, upon seeing something like int && ref = 2, is allowed to transform it to something like this
int tmp = 2;
int && ref = std::move(tmp);
std::cout << ref; // equivalent to std::cout << tmp;
Without lifetime extension, this could look rather like this:
{
int tmp = 2;
int && ref = std::move(tmp);
}
std::cout << ref; // what is ref?
Doing such a trick in a return statement would be pointless. There's no reasonable, safe way to extend the lifetime of any object local to a function.
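BTW, some compilers (GCC, for example) issue a warning for this and may compile your function
int&& func()
{
return 42;
}
as if it had been written (pseudocode, since a reference cannot really be initialised from nullptr)
int&& func()
{
return nullptr;
}
so that any attempt to read through the returned reference typically crashes at once.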

Pass by Reference an vector inline

I'm trying to pass a vector by reference to a function that modifies it (let's say, something like sorting the vector):
void dummy(vector<int> &v) { sort(v.begin(), v.end()); }
The above works only when creating the vector like this and passing the reference, which is expected.
int main() {
vector<int> v = {1, 2, 3};
dummy(v);
}
I'm trying to figure out whether there is an inline way of doing this. Usually, if the vector is not modified, we can do something like this:
int main() {
dummy({1,2,3});
}
But when the vector gets modified, this fails with a compilation error: cannot bind non-const lvalue reference of type 'std::vector<int>&' to an rvalue of type 'std::vector<int>'. So, is there a way to pass the vector's reference inline?

In that case you should write an overload for an rvalue reference, namely:
void dummy(vector<int>&&);
This will work with the temporary object passed to the function.

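If the vector is not modified, you can take it by reference to const:
void dummy(const vector<int> &v) { .. }
dummy(vector<int>{1, 2, 3});
If it is modified, you can take it by rvalue reference (&&), which binds to temporaries. Note that this is an rvalue reference and not a universal reference, because vector<int> is not a deduced template parameter:
void dummy(vector<int> &&v) { .. }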

To elaborate on masoud's answer: if you pass a named vector (an lvalue) to a function that takes an rvalue reference, you need std::move, because an rvalue reference parameter does not bind to an lvalue and the call fails to compile. It's best practice to write std::move explicitly at the call site whenever the argument is not already a temporary.
The compiler error is intended to prevent a mistake by the developer: using the object after its contents have been moved out, which may give unexpected results depending on the object type. So std::move states your intention to give up the object's contents. Note that std::move itself moves nothing; it only casts its argument to an rvalue so that the call can bind, and no copy of the vector is made.
void dummy(vector<int> &&v) { ... }
vector<int> myVector = {1, 2, 3};
dummy(std::move(myVector));

optional<reference_wrapper<T>> vs. optional<T>& - practical examples?

I have read about std::optional<std::reference_wrapper<T>> as a way to pass around optional references.
However, I'm unable to think of a practical example where I'd do that, instead of just using an optional<T>&.
For example, suppose I'm writing a function which needs to take an optional vector<int> by reference.
I could do this:
void f(optional<reference_wrapper<vector<int>>> v) {
// ...
}
int main() {
vector<int> v = {1, 2, 3, 4};
f(make_optional<reference_wrapper<vector<int>>>(std::ref(v)));
return 0;
}
But why not just do this?
void f(optional<vector<int>>& v) {
// ...
}
int main() {
f(make_optional<vector<int>>(std::initializer_list{1, 2, 3, 4}));
return 0;
}
Please give an example where optional<reference_wrapper<T>> is preferable to optional<T>&. The semantic differences, and especially the ways they can be leveraged in practice aren't clear to me.

std::optional<T> & is a reference to an optional object that can own a T object. You can mutate a T (if one is contained) or you can clear the optional object that was passed in by reference, destroying the contained T.
std::optional<std::reference_wrapper<T>> is an optional object that can own a reference to a T, but it doesn't actually own the T itself. The T lives outside of the std::optional object. You can mutate the T (if a reference is contained) or you can clear the optional object, which does not destroy the T. You can also make the optional object point at a different T, but this would be kind of pointless since the caller is passing you an optional by value.
Note that we already have a type built-in to the language that means "optional reference to a T": T*. Both a raw pointer and an optional reference have basically the same semantics: you either get nothing or you get a handle to an object you don't own. In modern C++, a raw pointer is the way to express an optional value not owned by the receiver.
I can't think of a single reason I'd ever explicitly use std::optional<std::reference_wrapper<T>> instead of T*.

But why not just do this?
Because your code doesn't compile. make_optional returns a prvalue, and you cannot pass a prvalue to a function that takes a non-const lvalue reference.
That's important because it shows the fundamental difference between these two cases. If you already have a T or a reference to a T from somewhere else, then you cannot pass that to a function that takes an optional<T>&. You'd have to copy the T into an optional<T> variable, then pass a reference to the optional variable to the function.
You wouldn't be able to modify the outside world's T. And that's the difference: with reference_wrapper<T>, you could.
Or if you have a function that can work with or without a modifiable T, you could just pass a T* like most people would.

As the other answers state, the two types serve different purposes, as the reference is to different things (a reference to an optional in one case and a reference to a vector in the other). Rather than repeating the explanation, here is some code you can play with to see the functional differences.
#include <iostream>
#include <vector>
#include <functional>
#include <optional>
// For better readability:
using optional_reference_vector = std::optional<std::reference_wrapper<std::vector<int>>>;
using optional_vector = std::optional<std::vector<int>>;
void f(optional_reference_vector v) {
v->get().push_back(5);
}
void g(optional_vector & w) {
w->push_back(5);
}
int main() {
// Two identical vectors with which to work:
std::vector<int> v = {1, 2, 3, 4};
std::vector<int> w = {1, 2, 3, 4};
// Demonstrate an optional reference to a vector
// ---------------------------------------------
// Create a reference to `v` in `opt_v`.
// Changes to `opt_v` will be reflected in `v` (and vice versa).
optional_reference_vector opt_v {std::ref(v)};
v.clear();
// A copy of `opt_v` will be made in f(). Since we are copying a reference to
// a vector and not the vector itself, the vector in main() is changed by f().
f(opt_v);
// Both `v` and `opt_v` refer to the same vector, so the size is the same.
std::cout << "Using a reference to the vector:\n"
<< "Original vector size: " << v.size() << '\n'
<< "Optional vector size: " << opt_v->get().size() << "\n\n";
// Demonstrate a reference to an optional vector
// ---------------------------------------------
// Copy `w` into `opt_w`.
// Changes to `opt_w` have no effect on `w` (and vice versa).
optional_vector opt_w {w};
w.clear();
// A reference to `opt_w` will be used in g(), so `opt_w` is updated.
g(opt_w);
// There are two vectors that now have different sizes.
std::cout << "Using a copy of the vector:\n"
<< "Original vector size: " << w.size() << '\n'
<< "Optional vector size: " << opt_w->size() << '\n';
return 0;
}
The output from this code:
Using a reference to the vector:
Original vector size: 1
Optional vector size: 1
Using a copy of the vector:
Original vector size: 0
Optional vector size: 5

Reference to element of vector returned by a function in C++

Can someone verify that the following is a BUG, and explain why? I think I know, but am unclear about the details. (My actual problem involved a vector of enums, not ints, but I don't think it should matter.) Suppose I have the following code:
std::vector<int> f (void) {
std::vector<int> v;
v.push_back(5);
return v;
}
void g (void) {
const int &myIntRef = f()[0];
std::cout << myIntRef << std::endl;
}
Am I correct that myIntRef is immediately a dangling reference, because the return value of f is saved nowhere on the stack?
Also, is the following a valid fix, or is it still a bug?
const int myIntCopy = f()[0]; // copy, not a reference
In other words, is the return result of f() thrown away before the 0th element can be copied?

That is a bug. At the end of the complete expression const int &myIntRef = f()[0]; the temporary vector will be destroyed and the memory released. Any later use of myIntRef is undefined behavior.
Under some circumstances, binding a reference to a temporary can extend the lifetime of the temporary. This is not one of those cases: the compiler does not know whether the reference returned by std::vector<int>::operator[] refers to part of the temporary, to an int with static storage duration, or to anything else, so it won't extend the lifetime.

Yes, it is indeed the wrong thing to do. When f() executes:
return v;
a temporary vector holding the contents of v is handed back to the caller, and
const int &myIntRef = f()[0];
initializes your reference with the first element of this temporary. After this line the temporary no longer exists, so myIntRef is an invalid reference, and using it produces undefined behavior.
What you should do is:
std::vector<int> myVector = f();
const int &myIntRef = myVector[0];
std::cout << myIntRef << std::endl;
which initializes myVector directly from the value returned by f(); thanks to copy elision (or, at worst, a move), no copy of v is made. In this case the reference refers into myVector, which outlives it, making this perfectly valid code.
And to your second question:
"Also, is the following a valid fix, or is it still a bug?"
const int myIntCopy = f()[0]; // copy, not a reference
Yes, this is another possible solution. f()[0] will access the first element of the temporary copy and use its value to initialize myIntCopy variable. It is guaranteed that the copy of v returned by f() exists at least until the whole expression is executed, see C++03 Standard 12.2 Temporary objects §3:
Temporary objects are destroyed as the last step in evaluating the full-expression (1.9) that (lexically) contains the point where they were created.

Intuitive understanding of functions taking references of references [duplicate]

Possible Duplicate:
What does T&& mean in C++11?
For some reason, this is eluding my intuition, and I cannot find any explanation on the internet. What does it mean for a C++ function to take a reference of a reference? For example:
void myFunction(int&& val); //what does this mean?!
I understand the idea of passing-by-reference, so
void addTwo(int& a)
{
a += 2;
}
int main()
{
int x = 5;
addTwo(x);
return 0;
}
works and is intuitive to me.

This is not a reference of a reference, but rather a new language feature called an rvalue reference that represents (informally) a reference to an object in memory that isn't referenced elsewhere in the program and can be destructively modified. For example, the return value of a function can be captured by an rvalue reference, as can temporary values introduced into expressions.
Rvalue references can be used for a variety of purposes. From the perspective of most C++ programmers, they can be used to implement move semantics, whereby a new object can be initialized by "moving" the contents of an old object out of the old object and into a new object. You can use this to return huge objects from functions in C++11 without paying a huge cost to copy the object, since the object used to capture the return value can be initialized using the move constructor by just stealing the internals from the temporary object created by the return statement.
Move semantics are orthogonal to copy semantics, so objects can be movable without being copyable. For example, stream objects such as std::ifstream are not copyable, but they are movable, so you can return them from functions using the move behavior. This cannot be done in C++03. For example, this code is illegal in C++03 but perfectly fine (and encouraged!) in C++11:
std::ifstream GetUserFile() {
while (true) {
std::cout << "Enter filename: ";
std::string filename;
std::getline(std::cin, filename);
std::ifstream input(filename); // Note: No .c_str() either!
if (input) return input;
std::cout << "Sorry, I couldn't open that file." << std::endl;
}
}
std::ifstream file = GetUserFile(); // Okay, move stream out of the function.
Intuitively, a function that takes an rvalue reference is a function that (probably) is trying to avoid an expensive copy by moving the contents of an old object into a new object. For example, you could define a move constructor for a vector-like object by having that constructor take in an rvalue reference. If we represent the vector as a triple of a pointer to an array, the capacity of the array, and the used space, we might implement its move constructor as follows:
vector::vector(vector&& rhs) {
/* Steal resources from rhs. */
elems = rhs.elems;
size = rhs.size;
capacity = rhs.capacity;
/* Destructively modify rhs to avoid having two objects sharing
* an underlying array.
*/
rhs.elems = nullptr; // Note use of nullptr instead of NULL
rhs.size = 0;
rhs.capacity = 0;
}
It's important to notice that when we clear out rhs at the end of the constructor that we end up putting rhs into such a state that
1. Will not cause a crash when its destructor invokes (notice that we set its element pointer to nullptr, since freeing nullptr is safe), and
2. Still lets the object be assigned a new value. This latter point is tricky, but it's important to ensure that you can still give the cleared-out object a new value at some point. This is because it is possible to obtain an rvalue reference to an object that can still be referenced later in the program.
To shed some light on (2), one interesting use case for rvalue references is the ability to explicitly move values around between objects. For example, consider this idiomatic implementation of swap:
template <typename T> void swap(T& lhs, T& rhs) {
T temp = lhs;
lhs = rhs;
rhs = temp;
}
This code is legal, but it does more work than necessary. In particular, it ends up making three copies: first when setting temp equal to a copy of lhs, once setting lhs to be a copy of rhs, and once setting rhs to be a copy of temp. But we don't really want to be making any copies at all here; instead, we just want to shuffle the values around. Consequently, in C++11, you can explicitly get rvalue references to objects by using the std::move function:
template <typename T> void swap(T& lhs, T& rhs) {
T temp = std::move(lhs);
lhs = std::move(rhs);
rhs = std::move(temp);
}
Now, no copies are made at all. We move the contents of lhs into temp, then move the contents of rhs into lhs, then move the contents of temp into rhs. In doing so, we left both lhs and rhs in an "emptied" state temporarily before putting new values into them. It's important that when writing the code to move the contents out of an object that we leave the object in a somewhat well-formed state so that this code works correctly.

It's not a reference to a reference. It's a new syntax introduced in C++0x for so-called Rvalue references.
