How does wrapping a pointer work with rvalue references? - c++

std::shared_ptr::operator* returns by lvalue reference, and the answer given on overloading pointer like operations here says that the convention is to return by lvalue reference. However, when I'm using the following code, I get error C2664: 'AdjacencyList::addVertex' : cannot convert parameter 1 from 'AdjacencyList::vertex_type' to 'AdjacencyList::vertex_type &&': You cannot bind an lvalue to an rvalue reference:
std::shared_ptr<vertex_type> AdjacencyList::addVertex(vertex_type&& v)
{
auto existingVertex(findVertex(v));
if (!existingVertex.isValid())
{
existingVertex = std::make_shared<vertex_type>(std::forward<vertex_type>(v));
m_vertices.push_back(existingVertex);
}
return existingVertex;
}
AdjacencyList minimumSpanningTree;
// startVertex is a shared_ptr to a vertex returned from a previous call of addVertex
// on another AdjacencyList object
const auto mstStartVertex(minimumSpanningTree.addVertex(*startVertex));
Should I provide AdjacencyList::addVertex(const vertex_type& v) or change the code at the bottom of the above block to make a copy of the vertex before passing to addVertex?
AdjacencyList minimumSpanningTree;
Vertex s(*startVertex);
const auto mstStartVertex(minimumSpanningTree.addVertex(std::move(s)));

I would think that you should return a copy from your operator*, as the semantics of std::weak_ptr suggest that you cannot guarantee that a returned reference would stay valid. Since the returned copy is then given to a function which can move it somewhere else, it should also be efficient enough, since addVertex looks like it would require a copy anyway: if you created a const reference overload of addVertex, it would copy the passed object internally, wouldn't it?

The most efficient approach in terms of redundant copies is to provide rvalue and const reference overloads:
std::shared_ptr<vertex_type> AdjacencyList::addVertex(vertex_type&&);
std::shared_ptr<vertex_type> AdjacencyList::addVertex(const vertex_type&);
To eliminate the redundant code, you can forward to a template method or to a concrete method taking a bool flag and performing const_cast as appropriate.
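One way to read the "forward to a template method" suggestion is sketched below. Vertex, Graph, findVertex and addVertexImpl are hypothetical stand-ins invented for this illustration, not the asker's actual AdjacencyList; the lookup by name is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the asker's vertex_type.
struct Vertex {
    std::string name;
};

// Hypothetical stand-in for AdjacencyList; only the parts needed here.
class Graph {
public:
    // Lvalue overload: the vertex is copied only if it is actually inserted.
    std::shared_ptr<Vertex> addVertex(const Vertex& v) { return addVertexImpl(v); }
    // Rvalue overload: the vertex is moved only if it is actually inserted.
    std::shared_ptr<Vertex> addVertex(Vertex&& v) { return addVertexImpl(std::move(v)); }

    std::size_t size() const { return m_vertices.size(); }

private:
    // Assumed lookup: two vertices are "the same" if their names match.
    std::shared_ptr<Vertex> findVertex(const Vertex& v) const {
        for (const auto& p : m_vertices)
            if (p->name == v.name) return p;
        return nullptr;
    }

    // Shared body: V deduces to `const Vertex&` or `Vertex`, and
    // std::forward preserves the caller's value category.
    template <typename V>
    std::shared_ptr<Vertex> addVertexImpl(V&& v) {
        auto existing = findVertex(v);
        if (!existing) {
            existing = std::make_shared<Vertex>(std::forward<V>(v));
            m_vertices.push_back(existing);
        }
        return existing;
    }

    std::vector<std::shared_ptr<Vertex>> m_vertices;
};
```

With this shape, g.addVertex(*startVertex) picks the const reference overload and copies only on insertion, while g.addVertex(Vertex{"b"}) picks the rvalue overload and moves.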
If the overhead of copying the Vertex object is minimal compared to the cost of increased code, and if the if block is usually or often entered, then the redundant copy will make your code clearer. Your second suggested call will work better if you just create a prvalue temporary that doesn't need to be moved:
const auto mstStartVertex(minimumSpanningTree.addVertex(Vertex{*startVertex}));
However in that case you might as well create the temporary in the call itself, by providing a single value overload (How to reduce redundant code when adding new c++0x rvalue reference operator overloads):
std::shared_ptr<vertex_type> AdjacencyList::addVertex(vertex_type);

Related

Is moving objects upon call by value a bad practice compared to passing by const reference?

I came across a C++17 code base where functions always accept parameters by value, even if a const reference would work, and no semantic reason for passing by value is apparent. The code then explicitly uses a std::move when calling functions. For instance:
A retrieveData(DataReader reader) // reader could be passed by const reference.
{
A a = { };
a.someField = reader.retrieveField(); // retrieveField is a const function.
return a;
}
auto someReader = constructDataReader();
auto data = retrieveData(std::move(someReader)); // Calls explicitly use move all the time.
Defining functions with value parameters by default and counting on move semantics like this seems like a bad practice, but is it? Is this really faster/better than simply passing lvalues by const reference, or perhaps creating a && overload for rvalues if needed?
I'm not sure how many copies modern compilers would do in the above example in case of a call without an explicit move on an lvalue, i.e. retrieveData(r).
I know a lot has been written on the subject of moving, but would really appreciate some clarification here.

Pass by value to avoid interface duplication [duplicate]

I'm learning C++ at the moment and try avoid picking up bad habits.
From what I understand, clang-tidy contains many "best practices" and I try to stick to them as best as possible (even though I don't necessarily understand why they are considered good yet), but I'm not sure if I understand what's recommended here.
I used this class from the tutorial:
class Creature
{
private:
std::string m_name;
public:
Creature(const std::string &name)
: m_name{name}
{
}
};
This leads to a suggestion from clang-tidy that I should pass by value instead of reference and use std::move.
If I do, I get the suggestion to make name a reference (to ensure it does not get copied every time) and the warning that std::move won't have any effect because name is a const so I should remove it.
The only way I don't get a warning is by removing const altogether:
Creature(std::string name)
: m_name{std::move(name)}
{
}
Which seems logical, as the only benefit of const was to prevent messing with the original string (which doesn't happen because I passed by value).
But I read on CPlusPlus.com:
Although note that -in the standard library- moving implies that the moved-from object is left in a valid but unspecified state. Which means that, after such an operation, the value of the moved-from object should only be destroyed or assigned a new value; accessing it otherwise yields an unspecified value.
Now imagine this code:
std::string nameString("Alex");
Creature c(nameString);
Because nameString gets passed by value, std::move will only invalidate name inside the constructor and not touch the original string. But what are the advantages of this? It seems like the content gets copied only once anyhow - if I pass by reference when I call m_name{name}, if I pass by value when I pass it (and then it gets moved). I understand that this is better than passing by value and not using std::move (because it gets copied twice).
So two questions:
Did I understand correctly what is happening here?
Is there any upside of using std::move over passing by reference and just calling m_name{name}?
/* (0) */
Creature(const std::string &name) : m_name{name} { }
A passed lvalue binds to name, then is copied into m_name.
A passed rvalue binds to name, then is copied into m_name.
/* (1) */
Creature(std::string name) : m_name{std::move(name)} { }
A passed lvalue is copied into name, then is moved into m_name.
A passed rvalue is moved into name, then is moved into m_name.
/* (2) */
Creature(const std::string &name) : m_name{name} { }
Creature(std::string &&rname) : m_name{std::move(rname)} { }
A passed lvalue binds to name, then is copied into m_name.
A passed rvalue binds to rname, then is moved into m_name.
As move operations are usually faster than copies, (1) is better than (0) if you pass a lot of temporaries. (2) is optimal in terms of copies/moves, but requires code repetition.
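The counts for (0), (1) and (2) can be checked with an instrumented type. This is only a sketch: Tracer and the three wrapper structs are hypothetical names made up for the experiment, and the rvalue case passes std::move(src) rather than a prvalue so that copy elision does not change the numbers.

```cpp
#include <cassert>
#include <utility>

// Hypothetical payload that counts how it gets constructed.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// (0) const reference only.
struct ByConstRef {
    Tracer m;
    explicit ByConstRef(const Tracer& t) : m{t} {}
};

// (1) by value, then move into the member.
struct ByValue {
    Tracer m;
    explicit ByValue(Tracer t) : m{std::move(t)} {}
};

// (2) const reference / rvalue reference overload pair.
struct ByOverloads {
    Tracer m;
    explicit ByOverloads(const Tracer& t) : m{t} {}
    explicit ByOverloads(Tracer&& t) : m{std::move(t)} {}
};

struct Counts { int copies; int moves; };

// Constructs Target from an lvalue and reports the copies/moves it cost.
template <typename Target>
Counts countLvalue() {
    Tracer src;
    Tracer::copies = Tracer::moves = 0;
    Target target(src);
    (void)target;
    return {Tracer::copies, Tracer::moves};
}

// Constructs Target from std::move(src) and reports the copies/moves it cost.
template <typename Target>
Counts countXvalue() {
    Tracer src;
    Tracer::copies = Tracer::moves = 0;
    Target target(std::move(src));
    (void)target;
    return {Tracer::copies, Tracer::moves};
}
```

For example, countLvalue<ByValue>() reports one copy and one move, while countXvalue<ByOverloads>() reports a single move.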
The code repetition can be avoided with perfect forwarding:
/* (3) */
template <typename T,
std::enable_if_t<
std::is_convertible_v<std::remove_cvref_t<T>, std::string>,
int> = 0
>
Creature(T&& name) : m_name{std::forward<T>(name)} { }
You might optionally want to constrain T in order to restrict the domain of types that this constructor can be instantiated with (as shown above). C++20 aims to simplify this with Concepts.
In C++17, prvalues are affected by guaranteed copy elision, which - when applicable - will reduce the number of copies/moves when passing arguments to functions.
Did I understand correctly what is happening here?
Yes.
Is there any upside of using std::move over passing by reference and just calling m_name{name}?
An easy to grasp function signature without any additional overloads. The signature immediately reveals that the argument will be copied - this saves callers from wondering whether a const std::string& reference might be stored as a data member, possibly becoming a dangling reference later on. And there is no need to overload on std::string&& name and const std::string& arguments to avoid unnecessary copies when rvalues are passed to the function. Passing an lvalue
std::string nameString("Alex");
Creature c(nameString);
to the function that takes its argument by value causes one copy and one move construction. Passing an rvalue to the same function
std::string nameString("Alex");
Creature c(std::move(nameString));
causes two move constructions. In contrast, when the function parameter is const std::string&, there will always be a copy, even when passing an rvalue argument. Passing by value is therefore clearly an advantage as long as the argument type is cheap to move-construct (this is the case for std::string).
But there is a downside to consider: the reasoning doesn't work for functions that assign the function argument to another variable (instead of initializing it):
void setName(std::string name)
{
m_name = std::move(name);
}
will cause a deallocation of the resource that m_name refers to before it's reassigned. I recommend reading Item 41 in Effective Modern C++ and also this question.
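To make the setter case concrete, here is a reduced sketch with both signatures side by side. The class is a hypothetical cut-down Creature and the method names are invented for the comparison; whether copy assignment really reuses the existing buffer depends on the standard library implementation and on the capacities involved.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, reduced version of the Creature class from the question.
class Creature {
public:
    // By value + move assignment: m_name typically gives up its old
    // buffer and takes over the one owned by `name`.
    void setNameByValue(std::string name) { m_name = std::move(name); }

    // By const reference: copy assignment may reuse m_name's existing
    // buffer if it is already large enough (implementation-dependent).
    void setNameByRef(const std::string& name) { m_name = name; }

    const std::string& name() const { return m_name; }

private:
    std::string m_name;
};
```

Both setters leave the same value in m_name; the difference is only in how many allocations and deallocations may happen when setters are called repeatedly with lvalues.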
How you pass is not the only variable here; what you pass makes the big difference between the two.
In C++, we have all kinds of value categories and this "idiom" exists for cases where you pass in an rvalue (such as "Alex-string-literal-that-constructs-temporary-std::string" or std::move(nameString)), which results in 0 copies of std::string being made (the type does not even have to be copy-constructible for rvalue arguments), and only uses std::string's move constructor.
Somewhat related Q&A.
The pass-by-value-and-move approach has several disadvantages compared with pass-by-(rvalue-)reference:
it causes 3 objects to be spawned instead of 2;
passing an object by value may lead to extra stack overhead, because even a regular string class is typically at least 3 or 4 times larger than a pointer;
argument objects are constructed on the caller's side, causing code bloat.

passing as rvalue reference vs double-move value

Suppose I have the following class:
class foo {
std::unique_ptr<blah> ptr;
};
What's the difference between these two:
foo::foo(std::unique_ptr<blah> p)
: ptr(std::move(p))
{ }
and
foo::foo(std::unique_ptr<blah>&& p)
: ptr(std::move(p))
{ }
When called like
auto p = make_unique<blah>();
foo f(std::move(p));
Both compile, and I guess both must use unique_ptr's move constructors? I guess the first one it'll get moved twice, but the second one it'll only get moved once?
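Both versions call std::move twice, but std::move is only a cast. The by-value version move-constructs the pointer twice (once into the parameter p, once into ptr); the rvalue-reference version move-constructs it only once, because p merely binds to the caller's object.
The other practical difference is the place where the code would break if you were to pass a std::unique_ptr lvalue without std::move.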
Having foo::foo(unique_ptr p) the compiler would complain about the copy constructor being deleted.
Having foo::foo(unique_ptr&& p) it would say that there is no matching function for the set of arguments provided.
std::unique_ptr is a non-copyable class. This means that you either pass a const reference (read-only access) or move it (rvalue) to express and transfer ownership.
The second case (getting explicit rvalue) is the proper implementation, as your intent is passing the ownership and std::unique_ptr has deleted copy assignment/constructor.
In the first case, copying is prohibited and the argument is an rvalue (you moved it before passing), so the compiler uses the move constructor. Ownership then belongs to the parameter, and you move it again into the member, transferring ownership a second time, which is not a very nice solution. (Note that this declaration isn't restrictive; it accepts any value category. If std::unique_ptr were copyable, the copy constructor would be used for lvalue arguments, hiding the actual problem.)
In terms of resulting code, std::move is a simple cast to rvalue, so the two differ only in the extra move construction of the by-value parameter. In terms of expressing intent, I would use the second one.
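To check how many move constructions each signature actually performs, one can instrument a move-only type. This is a sketch under assumptions: MoveOnly is a hypothetical stand-in for std::unique_ptr<blah>, and TakesValue / TakesRvalueRef are invented names for the two foo constructors.

```cpp
#include <cassert>
#include <utility>

// Hypothetical move-only type that counts move constructions.
struct MoveOnly {
    static int moves;
    MoveOnly() = default;
    MoveOnly(const MoveOnly&) = delete;
    MoveOnly(MoveOnly&&) noexcept { ++moves; }
};
int MoveOnly::moves = 0;

// Mirrors foo::foo(std::unique_ptr<blah> p).
struct TakesValue {
    MoveOnly member;
    explicit TakesValue(MoveOnly p) : member(std::move(p)) {}
};

// Mirrors foo::foo(std::unique_ptr<blah>&& p).
struct TakesRvalueRef {
    MoveOnly member;
    explicit TakesRvalueRef(MoveOnly&& p) : member(std::move(p)) {}
};

// Constructs Target from std::move(source) and returns the number of
// move constructions that took place.
template <typename Target>
int movesFor() {
    MoveOnly source;
    MoveOnly::moves = 0;
    Target target(std::move(source));
    (void)target;
    return MoveOnly::moves;
}
```

Here movesFor<TakesValue>() yields 2 and movesFor<TakesRvalueRef>() yields 1. For std::unique_ptr a move is just a pointer copy plus a null store, so the difference is small in practice.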

Why do we pass a reference to the object as an argument to the overloaded output operator

Why is it so important to be the reference and not just a copy of the object? For instance:
ostream& operator<<(ostream& out, const X & _class);
ostream& operator<<(ostream& out, const X _class);
What do we lose/win if we don't pass it as a reference?
In general, const& is preferred because copying is usually (though not always) expensive for anything other than cheap-to-copy types such as built-in types. But note that with pass by value, the parameter inside the function has nothing to do with the object passed to the function. That allows the compiler to make some assumptions and perform better optimizations in some cases. So in some cases, passing by value is better.
One of such cases is when you need a copy of the passed parameter:
void f( T param )
{
/* do something mutable with param */
}
In that case, passing by value is preferable to passing by const reference plus a manual copy, because the compiler can make assumptions based on value semantics and optimize the code. The rule is: let the compiler decide how to pass by value.
In the case of streams, C++ streams are not copyable, which is why they are passed by reference. It is a non-const reference because I/O operations change the internal state of the stream.
"reasonable pessimism" would summarise why we do this, and indeed why we prefer to pass a reference for any non-trivial object for which we don't need a copy.
We can be reasonably pessimistic that for anything other than a native type, making a copy is inefficient when compared to accessing the object via a reference.
We can also expect that not all objects are copyable, so writing a function that demands that our arguments are copyable is not only a possible inefficiency, it may well also lead to a program that cannot be compiled.
We can also expect that some objects' copy constructors will have side-effects (such as the deprecated auto_ptr). If we just want to query the state of an object, these side-effects would be undesirable. In the case of the auto_ptr, they would result in the deletion of the object controlled by the auto_ptr at the end of your function. Catastrophic.
The general rule would be:
If you are just going to read the object, pass a const reference
If you are going to modify the object, pass a reference (or pointer).
If you are definitely going to make a copy of the object, pass it by
value.
If you might make a copy, then either pass by const reference
(optimistic that we won't need to make a copy) or by value
(reasonably confident that the copy is required).
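The first three rules could look like this in practice. This is a small sketch; countSpaces, appendBang and sortedCopy are hypothetical helpers invented for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Only reads the object: const reference, no copy is made.
std::size_t countSpaces(const std::string& s) {
    return static_cast<std::size_t>(std::count(s.begin(), s.end(), ' '));
}

// Modifies the caller's object: non-const reference.
void appendBang(std::string& s) { s += '!'; }

// Always needs its own copy: take the parameter by value and work on it.
std::string sortedCopy(std::string s) {
    std::sort(s.begin(), s.end());
    return s;
}
```

In sortedCopy the caller's string is left untouched, and a caller that no longer needs its string can pass std::move(str) to avoid the copy.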
In the general case, passing a const& to a function is more efficient since it avoids making a copy.
Well, the answer is obvious: if you do not, then the actual object will not be modified. Instead a copy will be made and the copy will be modified, then later the copy will get destroyed.
If it is not a const reference, then you need to define a copy constructor to get it right, and calling copy constructors costs more memory and CPU time, which is really unnecessary.
Let me put it straight. When you pass a reference of the object and modify the object contents in the definition of the overloaded operator, the same will get reflected on the object. For example: (Though a + operator is never overloaded this way for complex numbers, the example is just to prove a point). Say for an overloaded + operator
complex1& operator+(complex1 a)
{
a.real = a.real+1;
real=real+a.real;
img=img+a.img;
return *(this);
}
int main()
{
complex1 c1(1,2),c2(2.4,6.3);
complex1 c3 = c1+c2;
cout<<c2;
return 0;
}
Here, the changes made in real part of c2(i.e. addition of 1) will not be reflected when it is printed and will still be 2.4 if a reference is not passed. Thus passing a reference will increase the value of its real part by 1.
Secondly, passing a reference is more efficient, as you pass only a reference to that object, unlike passing by value, where all the members of the object get copied.

Working around the C++ limitation on non-const references to temporaries

I've got a C++ data-structure that is a required "scratchpad" for other computations. It's not long-lived, and it's not frequently used so not performance critical. However, it includes a random number generator amongst other updatable tracking fields, and while the actual value of the generator isn't important, it is important that the value is updated rather than copied and reused. This means that in general, objects of this class are passed by reference.
If an instance is only needed once, the most natural approach is to construct it wherever it is needed (perhaps using a factory method or a constructor) and then pass the scratchpad to the consuming method. Consumers' method signatures use pass by reference since they don't know this is the only use, but factory methods and constructors return by value, and you can't pass unnamed temporaries by non-const reference.
Is there a way to avoid clogging the code with nasty temporary variables? I'd like to avoid things like the following:
scratchpad_t<typeX<typeY,potentially::messy>, typename T> useless_temp = factory(rng_parm);
xyz.initialize_computation(useless_temp);
I could make the scratchpad intrinsically mutable and just label all parameters const &, but that doesn't strike me as best-practice since it's misleading, and I can't do this for classes I don't fully control. Passing by rvalue reference would require adding overloads to all consumers of scratchpad, which kind of defeats the purpose - having clear and concise code.
Given the fact that performance is not critical (but code size and readability are), what's the best-practice approach to passing in such a scratchpad? Using C++0x features is OK if required but preferably C++03-only features should suffice.
Edit: To be clear, using a temporary is doable, it's just unfortunate clutter in code I'd like to avoid. If you never give the temporary a name, it's clearly only used once, and the fewer lines of code to read, the better. Also, in constructors' initializers, it's impossible to declare temporaries.
While it is not okay to pass rvalues to functions accepting non-const references, it is okay to call member functions on rvalues, but the member function does not know how it was called. If you return a reference to the current object, you can convert rvalues to lvalues:
class scratchpad_t
{
// ...
public:
scratchpad_t& self()
{
return *this;
}
};
void foo(scratchpad_t& r)
{
}
int main()
{
foo(scratchpad_t().self());
}
Note how the call to self() yields an lvalue expression even though scratchpad_t() is an rvalue.
Please correct me if I'm wrong, but Rvalue reference parameters don't accept lvalue references so using them would require adding overloads to all consumers of scratchpad, which is also unfortunate.
Well, you could use templates...
template <typename Scratch> void foo(Scratch&& scratchpad)
{
// ...
}
If you call foo with an rvalue parameter, Scratch will be deduced to scratchpad_t, and thus Scratch&& will be scratchpad_t&&.
And if you call foo with an lvalue parameter, Scratch will be deduced to scratchpad_t&, and because of reference collapsing rules, Scratch&& will also be scratchpad_t&.
Note that the formal parameter scratchpad is a name and thus an lvalue, no matter if its type is an lvalue reference or an rvalue reference. If you want to pass scratchpad on to other functions, you don't need the template trick for those functions anymore, just use an lvalue reference parameter.
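The deduction rules described above can be observed directly. In this sketch scratchpad_t is a hypothetical one-field stand-in, and foo returns whether Scratch was deduced as an lvalue reference purely so the result can be inspected.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical minimal scratchpad with one updatable tracking field.
struct scratchpad_t {
    int uses = 0;
};

// Accepts lvalues and rvalues alike. Returns true when Scratch was
// deduced as an lvalue reference, i.e. when an lvalue was passed.
template <typename Scratch>
bool foo(Scratch&& scratchpad) {
    ++scratchpad.uses;  // the named parameter is an lvalue in both cases
    return std::is_lvalue_reference<Scratch>::value;
}
```

Calling foo(pad) with a named scratchpad_t pad returns true and updates pad itself, while foo(scratchpad_t{}) returns false and updates the temporary.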
By the way, you do realize that the temporary scratchpad involved in xyz.initialize_computation(scratchpad_t(1, 2, 3)); will be destroyed as soon as initialize_computation is done, right? Storing the reference inside the xyz object for later use would be an extremely bad idea.
self() doesn't need to be a member method, it can be a templated function
Yes, that is also possible, although I would rename it to make the intention clearer:
template <typename T>
T& as_lvalue(T&& x)
{
return x;
}
Is the problem just that this:
scratchpad_t<typeX<typeY,potentially::messy>, typename T> useless_temp = factory(rng_parm);
is ugly? If so, then why not change it to this?:
auto useless_temp = factory(rng_parm);
Personally, I would rather see const_cast than mutable. When I see mutable, I'm assuming someone's doing logical const-ness, and don't think much of it. const_cast however raises red flags, as code like this should.
One option would be to use something like shared_ptr (auto_ptr would work too depending on what factory is doing) and pass it by value, which avoids the copy cost and maintains only a single instance, yet can be passed in from your factory method.
If you allocate the object in the heap you might be able to convert the code to something like:
std::auto_ptr<scratch_t> create_scratch();
foo( *create_scratch() );
The factory creates and returns an auto_ptr instead of an object on the stack. The returned auto_ptr temporary will take ownership of the object, but you are allowed to call non-const methods on a temporary and you can dereference the pointer to get a real reference. At the end of the full expression the smart pointer will be destroyed and the memory freed. If you need to pass the same scratch_t to different functions in a row you can just capture the smart pointer:
std::auto_ptr<scratch_t> s( create_scratch() );
foo( *s );
bar( *s );
This can be replaced with std::unique_ptr in the upcoming standard.
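A rough C++11 equivalent of the same idea with std::unique_ptr might look like this. It is a sketch only: scratch_t is reduced to a hypothetical one-field struct, and foo returns the updated counter just so the effect is visible.

```cpp
#include <cassert>
#include <memory>

// Hypothetical minimal scratchpad.
struct scratch_t {
    int uses = 0;
};

// Factory returning a heap-allocated scratchpad owned by a unique_ptr.
std::unique_ptr<scratch_t> create_scratch() {
    return std::unique_ptr<scratch_t>(new scratch_t());
}

// Consumer with the usual non-const reference signature.
int foo(scratch_t& s) { return ++s.uses; }
```

foo(*create_scratch()) then works without a named variable, and auto s = create_scratch(); followed by foo(*s); foo(*s); reuses one scratchpad across calls.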
I marked FredOverflow's response as the answer for his suggestion to use a method to simply return a non-const reference; this works in C++03. That solution requires a member method per scratchpad-like type, but in C++0x we can also write that method more generally for any type:
template <typename T> T & temp(T && temporary_value) {return temporary_value;}
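A usage sketch of such a helper follows. scratchpad_t, consume and consumeOnce are hypothetical names; the static_cast in temp is an addition of this sketch, made because (to my understanding) C++23's implicit-move rules reject a plain return of an rvalue reference parameter as a T&.

```cpp
#include <cassert>

// Generic helper in the spirit of the one above: hands any argument
// back as an lvalue reference.
template <typename T>
T& temp(T&& temporary_value) {
    return static_cast<T&>(temporary_value);
}

// Hypothetical minimal scratchpad.
struct scratchpad_t {
    int draws = 0;
};

// Consumer with the usual non-const reference signature.
int consume(scratchpad_t& pad) { return ++pad.draws; }

// The temporary lives until the end of the full expression, so it is
// still alive while consume() runs.
inline int consumeOnce() {
    return consume(temp(scratchpad_t{}));
}
```

Named scratchpads pass through unchanged as well: consume(temp(pad)) updates pad itself.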
This function simply forwards normal lvalue references, and converts rvalue references into lvalue references. Of course, doing this returns a modifiable value whose result is ignored - which happens to be exactly what I want, but may seem odd in some contexts.