C++ move semantics and code duplication as a result of method overloading - c++

Before I begin: I didn't find anything that completely answers my question in another Stack Overflow post, so I decided to create my own. Apologies if it is already answered somewhere else (kindly point me to the existing post if it exists).
Let's say we have the following two methods inside a class:
int do_stuff(Thing& thing){ /* Do stuff... */} // L-value version
int do_stuff(Thing&& thing){ /* Do stuff... */} // R-value version
From what I have read, modern C++ has pretty much abandoned this kind of logic and recommends just passing the Thing by value and letting the compiler do its magic. My question is: if I want to have two separate methods that explicitly handle lvalues and rvalues while avoiding code duplication, which of the following is best (performance-wise and as a best practice)?
int do_stuff(Thing& thing){ return do_stuff(std::move(thing)); } // The L-value version uses the R-value one
Or
int do_stuff(Thing&& thing){ return do_stuff(thing); } // The R-value version uses the L-value one since thing is an L-value inside the scope of do_stuff(Thing&&)
Edit: The purpose of the question is for me to understand this simple case of move semantics and not to create a valid C++ API.
Edit #2: The print and std::string parts of the question are used as an example. They can be anything.
Edit #3: Renamed the example code. The methods do modify the Thing object.

If print doesn't change anything and only prints the string, it's best to take a const std::string & as const std::string & is able to bind to both lvalues and rvalues.
int print(const std::string& text) {}

Taking a parameter by value does not mean the argument can't be an rvalue. The && just means that the argument has to be an rvalue. Not having && doesn't mean the argument can't be an rvalue.
When the argument is only used by the function, and with that, I mean if it's not modified, the best way to declare your function is:
int do_stuff(const Thing& thing);
That way, it is very clear to the reader that thing won't be modified. For most of the other cases, you should simply declare your function as:
int do_stuff(Thing thing);
passing the parameter by value, and not by reference or rvalue reference.
It used to be common to write code like this:
int do_stuff(Thing& thing)
{
/* change thing so that the caller can use the changed thing */
return success; // where success is an int
}
However, nowadays, it is often preferred to return the modified thing:
Thing do_stuff(Thing thing) { /* return modified thing */ }
In the example above:
int do_stuff(Thing thing);
the caller decides whether or not thing should be a copy:
do_stuff(my_thing); // copy - I need the original my_thing
do_stuff(std::move(my_thing)); // no copy - I don't need the original my_thing
Note that this declaration of do_stuff covers both of your versions:
int do_stuff(Thing&);
int do_stuff(Thing&&);
That said, you almost never need functions like:
int do_stuff(Thing&&);
except for objects which cannot be copied, such as stream objects.
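As a minimal sketch of the by-value approach, the following uses a hypothetical Thing with a copy counter (the counter, the data member, and demo() are assumptions added for illustration) to show that the caller, not the function, chooses between a copy and a move:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical Thing that counts how often it is copied.
struct Thing {
    std::string data;
    static int copies;

    explicit Thing(std::string d) : data(std::move(d)) {}
    Thing(const Thing& other) : data(other.data) { ++copies; }
    Thing(Thing&& other) noexcept : data(std::move(other.data)) {}
};
int Thing::copies = 0;

// Takes its argument by value and returns the modified object.
Thing do_stuff(Thing thing) {
    thing.data += " (modified)";
    return thing;  // a by-value parameter is moved, not copied, on return
}

void demo() {
    Thing my_thing("original");

    Thing kept = do_stuff(my_thing);              // copy: my_thing is still needed
    assert(Thing::copies == 1);
    assert(my_thing.data == "original");
    assert(kept.data == "original (modified)");

    Thing moved = do_stuff(std::move(my_thing));  // no copy: my_thing is given up
    assert(Thing::copies == 1);
    assert(moved.data == "original (modified)");
}
```

After the second call my_thing is in a moved-from state, so the sketch deliberately does not read it again.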

There is no difference in performance. std::move does nothing but cast its argument to an rvalue reference, so an optimizing compiler will omit the call to std::move and even the redundant call to do_stuff. Under -O2, in either case, GCC compiles the do_stuff that calls the other do_stuff to a single jmp instruction to the other do_stuff.
So which way is better is a matter of opinion. I personally like the second way because it is shorter.
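For illustration, here is a small sketch of the second variant, where the rvalue overload delegates to the lvalue one. The members of Thing and the demo() function are assumptions made for this example only:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical Thing; the members are assumptions made for this sketch.
struct Thing {
    std::string name;
    int uses;
};

// The lvalue overload holds the single implementation.
int do_stuff(Thing& thing) {
    thing.name += "!";
    return ++thing.uses;
}

// Inside this overload `thing` is a named variable and therefore an lvalue,
// so the call below selects do_stuff(Thing&) and does not recurse.
int do_stuff(Thing&& thing) {
    return do_stuff(thing);
}

void demo() {
    Thing t{"a", 0};
    assert(do_stuff(t) == 1);              // lvalue overload
    assert(t.name == "a!");
    assert(do_stuff(Thing{"b", 0}) == 1);  // rvalue overload, forwards to the lvalue one
    assert(do_stuff(std::move(t)) == 2);   // rvalue overload; t is modified, not moved from
    assert(t.name == "a!!");
}
```

One design consideration beyond performance: if the delegation goes the other way (the lvalue overload calling the rvalue one through std::move) and the rvalue overload ever moves from its argument, the caller's lvalue is silently left in a moved-from state. Delegating from the rvalue overload to the lvalue one avoids that surprise.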

Related

Write overloads for const reference and rvalue reference

Recently I find myself often in the situation of having a single function that takes some object as a parameter. The function will have to copy that object.
However, the parameter for that function may also quite frequently be a temporary, and thus I also want to provide an overload of that function that takes an rvalue reference instead of a const reference.
Both overloads tend to only differ in that they have different types of references as argument types. Other than that they are functionally equivalent.
For instance consider this toy example:
void foo(const MyObject &obj) {
globalVec.push_back(obj); // Makes copy
}
void foo(MyObject &&obj) {
globalVec.push_back(std::move(obj)); // Moves
}
Now I was wondering whether there is a way to avoid this code-duplication by e.g. implementing one function in terms of the other.
For instance I was thinking of implementing the copy-version in terms of the move-one like this:
void foo(const MyObject &obj) {
MyObject copy = obj;
foo(std::move(copy));
}
void foo(MyObject &&obj) {
globalVec.push_back(std::move(obj)); // Moves
}
However this still does not seem ideal since now there is a copy AND a move operation happening when calling the const ref overload instead of a single copy operation that was required before.
Furthermore, if the object does not provide a move-constructor, then this would effectively copy the object twice (afaik) which defeats the whole purpose of providing these overloads in the first place (avoiding copies where possible).
I'm sure one could hack something together using macros and the preprocessor but I would very much like to avoid involving the preprocessor in this (for readability purposes).
Therefore my question reads: Is there a possibility to achieve what I want (effectively only implementing the functionality once and then implement the second overload in terms of the first one)?
If possible, I would like to avoid using templates.
My opinion is that understanding (truly) how std::move and std::forward work, together with what their similarities and their differences are is the key point to solve your doubts, so I suggest that you read my answer to What's the difference between std::move and std::forward, where I give a very good explanation of the two.
In
void foo(MyObject &&obj) {
globalVec.push_back(obj); // Moves (no, it doesn't!)
}
there's no move. obj is the name of a variable, so it is an lvalue, and the overload of push_back which will be called is not the one which will steal resources out of its argument.
You would have to write
void foo(MyObject&& obj) {
globalVec.push_back(std::move(obj)); // Moves
}
if you want to make the move possible, because std::move(obj) says: "Look, I know this obj here is a local variable, but I guarantee you that I don't need it later, so you can treat it as a temporary: steal its guts if you need."
As regards the code duplication you see in
void foo(const MyObject &obj) {
globalVec.push_back(obj); // Makes copy
}
void foo(MyObject&& /*rvalue reference -> std::move it */ obj) {
globalVec.push_back(std::move(obj)); // Moves (corrected)
}
what allows you to avoid it is std::forward, which you would use like this:
template<typename T>
void foo(T&& /* universal/forwarding reference -> std::forward it */ obj) {
globalVec.push_back(std::forward<T>(obj)); // moves conditionally
}
As regards the error messages of templates, be aware that there are ways to make things easier. For instance, you could use static_asserts at the beginning of the function to enforce that T is a specific type. That would certainly make the errors more understandable. For instance:
#include <type_traits>
#include <vector>
std::vector<int> globalVec{1,2,3};
template<typename T>
void foo(T&& obj) {
static_assert(std::is_same_v<int, std::decay_t<T>>,
"\n\n*****\nNot an int, aaarg\n*****\n\n");
globalVec.push_back(std::forward<T>(obj));
}
int main() {
int x = 0; // initialized so that foo(x) does not copy an indeterminate value
foo(x);
foo(3);
foo('c'); // errors at compile time with nice message
}
Then there's SFINAE, which is harder and I guess beyond the scope of this question and answer.
My suggestion
Don't be scared of templates and SFINAE! They do pay off :)
There's a beautiful library that leverages template metaprogramming and SFINAE heavily and successfully, but this is really off-topic :D
A simple solution is:
void foo(MyObject obj) {
globalVec.push_back(std::move(obj));
}
If caller passes an lvalue, then there is a copy (into the parameter) and a move (into the vector). If caller passes an rvalue, then there are two moves (one into parameter and another into vector). This can potentially be slightly less optimal compared to the two overloads because of the extra move (slightly compensated by the lack of indirection) but in cases where moves are cheap, this is often a decent compromise.
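The copy and move counts described above can be checked with a small sketch. The counting MyObject, the reserve call, and demo() are assumptions added for this example:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical MyObject that counts copies and moves.
struct MyObject {
    static int copies;
    static int moves;
    MyObject() {}
    MyObject(const MyObject&) { ++copies; }
    MyObject(MyObject&&) noexcept { ++moves; }
};
int MyObject::copies = 0;
int MyObject::moves = 0;

std::vector<MyObject> globalVec;

// Single implementation: take by value, then move into the vector.
void foo(MyObject obj) {
    globalVec.push_back(std::move(obj));
}

void demo() {
    globalVec.reserve(8);  // avoid reallocation, which would add extra moves

    MyObject x;
    foo(x);             // lvalue: 1 copy into the parameter, 1 move into the vector
    assert(MyObject::copies == 1 && MyObject::moves == 1);

    foo(std::move(x));  // xvalue: 1 move into the parameter, 1 move into the vector
    assert(MyObject::copies == 1 && MyObject::moves == 3);
}
```

A call such as foo(MyObject{}) passes a prvalue, which is typically constructed directly in the parameter (this is guaranteed since C++17), so in that case only the move into the vector remains.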
Another solution for templates is std::forward explored in depth in Enlico's answer.
If you cannot have a template and the potential cost of a move is too expensive, then you just have to be satisfied with some extra boilerplate of having two overloads.

How to write a getter method so that it returns an rvalue

If we have this class definition
class BusDetails
{
private:
string m_busName;
public:
string BusName() const { return m_busName; }
};
how could the getter method be changed or used, so that using the return value as an lvalue would give a compiler error?
For example, if I use it like
int main(void)
{
BusDetails bus;
bus.BusName() = "abc"; // <--- This should give a compiler error
cout << bus.BusName() << endl;
return 0;
}
I get no compiler error, so apparently the assignment works, but the result is not as expected.
Update: this behavior is as expected with built-in types (i.e. the compiler gives an error at the above line of code if I have an int as a return type instead of string).
The BusName() was declared as a const function. So it can't change members.
Your function should return string& and not be const.
string& BusName() { return m_busName; }
In addition you can add for const object (this is const):
const string& BusName() const { return m_busName; }
It's not clear what behavior you want.
If you want the assignment to be an error, and keep all of the
flexibility of value return (e.g. you can modify the code to
return a calculated value), you can return std::string const.
(This will inhibit move semantics, but that's generally not
a big issue.)
If you want to be able to modify the "member", but still want
to retain flexibility with regards to how it is implemented in
the class, then you should provide a setter method. One
convention (not the only one) is to provide a getter function
like you have now (but returning std::string const), and
provide a function with the same name void
BusName( std::string const& newValue ) to set the value.
(Other conventions would use a name like SetBusName, or return
the old value, so client code could save and restore it, or
return *this, so client code could chain the operations:
obj.BusName( "toto" ).SomethingElse( /*...*/ ).)
You may also provide a non-const member returning a reference
to a non-const. If you do this, however, you might as well make
the data member public.
Finally, you might provide a non-const member which returns
some sort of proxy class, so that assigning to it would in fact
call a setter function, and converting it to std::string would
call the getter. This is by far the most flexible, if you
want to support modifications by the client, but it's also by
far the most complex, so you might not want to use it unless you
need to.
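A minimal sketch of the first two options (a getter returning a const value plus a setter overload with the same name) might look like this. The static_assert and demo() are additions for illustration, not part of the original class:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

class BusDetails {
    std::string m_busName;
public:
    // Returning a const value makes `bus.BusName() = "abc"` ill-formed.
    const std::string BusName() const { return m_busName; }
    // Setter overload with the same name, following the convention described above.
    void BusName(const std::string& newValue) { m_busName = newValue; }
};

// Compile-time check: the getter's result cannot be assigned to.
static_assert(
    !std::is_assignable<decltype(std::declval<BusDetails&>().BusName()),
                        const char*>::value,
    "assignment to the getter's return value should not compile");

void demo() {
    BusDetails bus;
    bus.BusName("abc");              // goes through the setter
    assert(bus.BusName() == "abc");  // getter returns a copy
}
```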
Well it is kind of expected behavior what you have written.
You return a copy of m_busName, because you do not return a reference. Therefore a temporary copy of the member is made, and then the assignment takes place: operator= is called with "abc" on that copy.
So the way to go would be string& BusName() const { return m_busName; }. But that shall give a compiler error.
You kind of want contradictory things. You say string BusName() const, yet you want to return a reference that will allow the state of the object to be changed.
However if you don't promise the object will not change you can drop the const and go with
string& BusName() { return m_busName; };
Or if you want to keep the const
const string& BusName() const { return m_busName; };
however this should give an error on the assignment, naturally.
The same goes for functions. If you do pass argument by reference it is a reference. If you see that you modify a copy, you must have not passed it by reference but by value.
The function does return an rvalue.
The problem is that std::string::operator= works with an rvalue on the left. Prior to C++11 it was difficult or impossible to prevent it from working: in C++11 they added (relatively late) what is colloquially known as rvalue references to this (ref-qualifiers): the ability to overload methods and in-class operators based on the rvalue state of the object.
However, std::string was not modified, probably due to a mixture of not much time and dislike of breaking existing code without good reason.
You could patch around this problem a few ways. You could write your own string class that obeys rvalue reference to this. You could descend from std::string and block the operator= specifically. You could write an accessor object that has-a std::string that can cast-to std::string&& and std::string& (based on rvalue status of this) implicitly, but blocks assignment with a deleted method.
All three have issues. The last has the fewest issues, the second the most hidden pitfalls, the first is just drudgery.
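To make the ref-qualifier idea concrete, here is a rough sketch of the first option: a tiny hypothetical wrapper class, Name, whose assignment operators are restricted to lvalues. The class, its members, and demo() are assumptions for illustration, and a real string replacement would need far more interface than this:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical string wrapper whose assignment operators are lvalue-only.
class Name {
    std::string value_;
public:
    Name() = default;
    Name(const char* s) : value_(s) {}
    Name(const Name&) = default;
    Name(Name&&) = default;
    // The trailing & restricts assignment to lvalue objects.
    Name& operator=(const Name&) & = default;
    Name& operator=(Name&&) & = default;
    const std::string& str() const { return value_; }
};

class BusDetails {
    Name m_busName{"unnamed"};
public:
    Name BusName() const { return m_busName; }  // returns an rvalue
};

static_assert(std::is_assignable<Name&, const char*>::value,
              "lvalues can be assigned to");
static_assert(!std::is_assignable<Name, const char*>::value,
              "rvalues cannot be assigned to");

void demo() {
    BusDetails bus;
    // bus.BusName() = "abc";  // would not compile: left side is an rvalue
    assert(bus.BusName().str() == "unnamed");

    Name n;
    n = "abc";  // fine: n is an lvalue
    assert(n.str() == "abc");
}
```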

Is it possible to take a parameter by const reference, while banning conversions so that temporaries aren't passed instead?

Sometimes we like to take a large parameter by reference, and also to make the reference const if possible to advertise that it is an input parameter. But by making the reference const, the compiler then allows itself to convert data if it's of the wrong type. This means it's not as efficient, but more worrying is the fact that I think I am referring to the original data; perhaps I will take its address, not realizing that I am, in effect, taking the address of a temporary.
The call to bar in this code fails. This is desirable, because the reference is not of the correct type. The call to bar_const is also of the wrong type, but it silently compiles. This is undesirable for me.
#include<vector>
using namespace std;
int vi;
void foo(int &) { }
void bar(long &) { }
void bar_const(const long &) { }
int main() {
foo(vi);
// bar(vi); // compiler error, as expected/desired
bar_const(vi);
}
What's the safest way to pass a lightweight, read-only reference? I'm tempted to create a new reference-like template.
(Obviously, int and long are very small types. But I have been caught out with larger structures which can be converted to each other. I don't want this to silently happen when I'm taking a const reference. Sometimes, marking the constructors as explicit helps, but that is not ideal)
Update: I imagine a system like the following: Imagine having two functions X byVal(); and X& byRef(); and the following block of code:
X x;
const_lvalue_ref<X> a = x; // I want this to compile
const_lvalue_ref<X> b = byVal(); // I want this to fail at compile time
const_lvalue_ref<X> c = byRef(); // I want this to compile
That example is based on local variables, but I want it to also work with parameters. I want to get some sort of error message if I'm accidentally passing a ref-to-temporary or a ref-to-a-copy when I think I'm passing something lightweight such as a ref-to-lvalue. This is just a 'coding standard' thing - if I actually want to allow passing a ref to a temporary, then I'll use a straightforward const X&. (I'm finding this piece on Boost's FOREACH to be quite useful.)
Well, if your "large parameter" is a class, the first thing to do is ensure that you mark any single parameter constructors explicit (apart from the copy constructor):
class BigType
{
public:
explicit BigType(int);
};
This applies to constructors which have default parameters which could potentially be called with a single argument, also.
Then it won't be automatically converted to since there are no implicit constructors for the compiler to use to do the conversion. You probably don't have any global conversion operators which make that type, but if you do, then
If that doesn't work for you, you could use some template magic, like:
template <typename T>
void func(const T &); // causes an undefined reference at link time.
template <>
void func(const BigType &v)
{
// use v.
}
If you can use C++11 (or parts thereof), this is easy:
void f(BigObject const& bo){
// ...
}
void f(BigObject&&) = delete; // or just undefined
This will work, because binding to an rvalue ref is preferred over binding to a reference-to-const for a temporary object.
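A self-contained sketch of this technique follows. BigObject's contents and the accepts detection helper are assumptions added here so that the behaviour can be checked at compile time:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct BigObject { int payload[256]; };  // stand-in for a large type

int f(const BigObject& bo) { return bo.payload[0]; }
int f(BigObject&&) = delete;  // selected for non-const rvalues, so those calls fail

// Detection helper (an assumption of this sketch): true if f(Arg) is well-formed.
template <typename Arg>
auto accepts(int) -> decltype(f(std::declval<Arg>()), std::true_type{});
template <typename Arg>
std::false_type accepts(...);

static_assert(decltype(accepts<BigObject&>(0))::value, "lvalues are accepted");
static_assert(decltype(accepts<const BigObject&>(0))::value, "const lvalues are accepted");
static_assert(!decltype(accepts<BigObject>(0))::value, "temporaries are rejected");

void demo() {
    BigObject big = {};
    big.payload[0] = 42;
    assert(f(big) == 42);  // binds to the const& overload
    // f(BigObject{});     // would not compile: deleted overload is chosen
}
```

Note that a const rvalue cannot bind to BigObject&& and would therefore still reach the const& overload; deleting an overload taking const BigObject&& instead would also catch that case.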
You can also exploit the fact that only a single user-defined conversion is allowed in an implicit conversion sequence:
struct BigObjWrapper{
BigObjWrapper(BigObject const& o)
: object(o) {}
BigObject const& object;
};
void f(BigObjWrapper wrap){
BigObject const& bo = wrap.object;
// ...
}
This is pretty simple to solve: stop taking values by reference. If you want to ensure that a parameter is addressable, then make it an address:
void bar_const(const long *) { }
That way, the user must pass a pointer. And you can't get a pointer to a temporary (unless the user is being terribly malicious).
That being said, I think your thinking on this matter is... wrongheaded. It comes down to this point.
"perhaps I will take its address, not realizing that I am, in effect, taking the address of a temporary."
Taking the address of a const& that happens to be a temporary is actually fine. The problem is that you cannot store it long-term. Nor can you transfer ownership of it. After all, you got a const reference.
And that's part of the problem. If you take a const&, your interface is saying, "I'm allowed to use this object, but I do not own it, nor can I give ownership to someone else." Since you do not own the object, you cannot store it long-term. This is what const& means.
Taking a const* instead can be problematic. Why? Because you don't know where that pointer came from. Who owns this pointer? const& has a number of syntactic safeguards to prevent you from doing bad things (so long as you don't take its address). const* has nothing; you can copy that pointer to your heart's content. Your interface says nothing about whether you are allowed to own the object or transfer ownership to others.
This ambiguity is why C++11 has smart pointers like unique_ptr and shared_ptr. These pointers can describe real memory ownership relations.
If your function takes a unique_ptr by value, then you now own that object. If it takes a shared_ptr, then you now share ownership of that object. There are syntactic guarantees in place that ensure ownership (again, unless you take unpleasant steps).
In the event of your not using C++11, you should use Boost smart pointers to achieve similar effects.
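As a brief illustration of how smart pointer parameters express ownership, consider the sketch below. Widget, consume, observers, and demo are hypothetical names introduced only for this example:

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Widget { int id; };  // hypothetical payload type

// Taking unique_ptr by value: the function becomes the sole owner.
int consume(std::unique_ptr<Widget> w) {
    return w->id;  // the Widget is destroyed when w goes out of scope
}

// Taking shared_ptr by value: the function shares ownership during the call.
long observers(std::shared_ptr<Widget> w) {
    return w.use_count();
}

void demo() {
    std::unique_ptr<Widget> u(new Widget{7});
    assert(consume(std::move(u)) == 7);
    assert(u == nullptr);  // ownership was transferred out of u

    std::shared_ptr<Widget> s = std::make_shared<Widget>();
    s->id = 1;
    assert(observers(s) == 2);   // the caller plus the parameter
    assert(s.use_count() == 1);  // the parameter's share is released on return
}
```

The caller has to write std::move explicitly to hand over a unique_ptr, which is the kind of syntactic guarantee mentioned above.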
You can't, and even if you could, it probably wouldn't help much.
Consider:
void another(long const& l)
{
bar_const(l);
}
Even if you could somehow prevent the binding to a temporary as input to
bar_const, functions like another could be called with the reference
bound to a temporary, and you'd end up in the same situation.
If you can't accept a temporary, you'll need to use a reference to a
non-const, or a pointer:
void bar_const(long const* l);
requires an lvalue to initialize it. Of course, a function like
void another(long const& l)
{
bar_const(&l);
}
will still cause problems. But if you globally adopt the convention to
use a pointer if object lifetime must extend beyond the end of the call,
then hopefully the author of another will think about why he's taking
the address, and avoid it.
I think your example with int and long is a bit of a red herring as in canonical C++ you will never pass builtin types by const reference anyway: You pass them by value or by non-const reference.
So let's assume instead that you have a large user defined class. In this case, if it's creating temporaries for you then that means you created implicit conversions for that class. All you have to do is mark all converting constructors (those that can be called with a single parameter) as explicit and the compiler will prevent those temporaries from being created automatically. For example:
class Foo
{
public:
explicit Foo(int bar) { }
};
(Answering my own question thanks to this great answer on another question I asked. Thanks @hvd.)
In short, marking a function parameter as volatile means that it cannot be bound to an rvalue. (Can anybody nail down a standard quote for that? Temporaries can be bound to const&, but not to const volatile & apparently. This is what I get on g++-4.6.1. (Extra: see this extended comment stream for some gory details that are way over my head :-) ))
void foo( const volatile Input & input, Output & output) {
}
foo(input, output); // compiles. good
foo(get_input_as_value(), output); // compile failure, as desired.
But, you don't actually want the parameters to be volatile. So I've written a small wrapper to const_cast the volatile away. So the signature of foo becomes this instead:
void foo( const_lvalue<Input> input, Output & output) {
}
where the wrapper is:
template<typename T>
struct const_lvalue {
const T * t;
const_lvalue(const volatile T & t_) : t(const_cast<const T*>(&t_)) {}
const T* operator-> () const { return t; }
};
This can be created from an lvalue only.
Any downsides? It might mean that I accidentally misuse an object that is truly volatile, but then again I've never used volatile before in my life. So this is the right solution for me, I think.
I hope to get in the habit of doing this with all suitable parameters by default.

What does const& do?

I do not understand what is going on. I am just learning C++ and I see something like this a lot:
double some_function(const Struct_Name& s) {
...
}
Why the const if we are passing by reference?
You pass by const reference when you don't want to (or can't) modify the argument being passed in, and you don't want the performance hit that you might get from copying the object.
A const reference prevents the object from being modified, as const would anywhere else, but also avoids the potential cost of copying.
You are telling the compiler you're not going to change s through this reference.
This lets the compiler reject accidental modifications and, in some cases, optimize more aggressively. Basically, it gives you much the same semantics as passing by value, but doesn't incur the performance penalty of calling the copy constructor.
Call by const-reference avoids a copy of the Struct_Name while promising not to modify it.
There is both a performance reason for this, and a semantics reason.
If Struct_Name is large, copying it is expensive in both time and memory.
If Struct_Name is uncopyable (or becomes invalid when copied) calling by value is impossible or introduces undesirable complexity. For example: std::unique_ptr and std::auto_ptr.
By using const we can signal to both the user of the function and the compiler that the object passed as the argument s will not be changed inside the function (which would otherwise be possible, because we pass it by reference!). The compiler can then give us an error if we modify the object by accident, and it can do some optimizations it couldn't do otherwise.
An additional advantage is, that if the caller of the function only owns a const pointer to an object, it can still provide that object as an argument without casting.
const here promises that some_function will not modify the s parameter,
double some_function(const Struct_Name& s) {
...
}
You can try modifying it, but the compiler will report errors. In fact, constness requires you to write Struct_Name's member functions carefully: inside some_function you will not be able to call non-const member functions on s. You can try, but you will get an error, e.g.:
struct Struct_Name {
void myfun() const { } // can be called from some_function
void myfun2() { } // will show error if called from some_function
};
Using a const parameter is good from a design point of view: if you know a function is not supposed to change your object, you add const. This means that no other programmer can make changes, in code hidden deep in a class hierarchy, that will modify your object. It really makes debugging easier.
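A compact, complete version of this idea is sketched below. The data members and the sum and scale functions are made-up names for illustration:

```cpp
#include <cassert>

struct Struct_Name {
    double a;
    double b;
    double sum() const { return a + b; }      // callable through a const reference
    void scale(double k) { a *= k; b *= k; }  // not callable through a const reference
};

double some_function(const Struct_Name& s) {
    // s.scale(2.0);  // error if uncommented: s is const in this function
    return s.sum();
}

void demo() {
    Struct_Name s{1.0, 2.0};
    assert(some_function(s) == 3.0);
    s.scale(2.0);  // fine here: s itself is not const
    assert(some_function(s) == 6.0);
}
```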
Another reason that no one has mentioned yet: passing by const reference allows the compiler to create and pass a temporary object, without generating a warning, if the input value is not the exact type declared in the parameter but the type has a constructor that accepts the type being passed in.
For example:
void foo(const std::string &s)
{
...
}
foo("hello"); // OK
foo() is expecting a std::string but receives a const char* instead. Since std::string has a constructor that accepts a const char*, the compiler generates code that is effectively doing this:
std::string temp("hello");
foo(temp);
The compiler knows the parameter is const, the temporary will not be altered by foo(), and the temporary will be discarded after foo() exits, so it does not complain about having to create a temporary.
The same thing happens if the parameter is passed by value (const or non-const, it does not matter) instead of by reference:
void foo(const std::string s)
{
...
}
void bar(std::string s)
{
...
}
foo("hello"); // OK
bar("world"); // OK
This is effectively the same as this:
{
std::string temp1("hello");
foo(temp1);
}
{
std::string temp2("world");
bar(temp2);
}
Again, the compiler does not complain, as it knows the temporary does not affect the calling code, and any alterations made to the temporary in bar() will be safely discarded.
If the parameter were a non-const reference instead, passing a const char* would be an error in standard C++, because a temporary cannot bind to a non-const lvalue reference (some compilers have accepted it as a non-standard extension and only issue a warning). The restriction exists because any changes the function makes to the temporary (since it is not const) would be lost when the function exits, which may or may not have an effect on the calling code. This is usually an indication that a temporary should not be used in that situation:
void foo(std::string &s)
{
...
}
foo("hello"); // error in standard C++ (a warning with some compiler extensions)
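The difference between the two kinds of reference parameter can be checked with a short sketch. The function names by_const_ref and by_ref and the can_call_by_ref detection helper are assumptions introduced for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

std::size_t by_const_ref(const std::string& s) { return s.size(); }
std::size_t by_ref(std::string& s) { s += "!"; return s.size(); }

// Detection helper (an assumption of this sketch): true if by_ref(Arg) is well-formed.
template <typename Arg>
auto can_call_by_ref(int) -> decltype(by_ref(std::declval<Arg>()), std::true_type{});
template <typename Arg>
std::false_type can_call_by_ref(...);

// A const char* would need a temporary std::string, which cannot bind to std::string&.
static_assert(!decltype(can_call_by_ref<const char*>(0))::value,
              "a temporary cannot bind to a non-const lvalue reference");
static_assert(decltype(can_call_by_ref<std::string&>(0))::value,
              "a named std::string can");

void demo() {
    assert(by_const_ref("hello") == 5);  // temporary std::string bound to const&

    std::string named = "hello";
    assert(by_ref(named) == 6);          // lvalue required; the change is visible
    assert(named == "hello!");
}
```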

Copy constructor, why in return by value functions

Suppose I have:
class A
{
A(A& foo){ ..... }
A& operator=(const A& p) { }
}
...
A lol;
...
A wow(...)
{
return lol;
}
...
...
A stick;
stick = wow(...);
Then I get a compile error on the last line. But if I add 'const' before 'A&', it's OK.
I want to know why. Where exactly is the problem?
I don't get why it should be const.
Language: C++
I edited the code; I think that change is relevant. That gives the error.
I believe the problem you are mentioning is similar to:
c++, object life-time of anonymous (unnamed) variables
where the essential point is that in C++ anonymous temporaries cannot be passed by non-const reference, only by const reference.
The following code compiles perfectly fine with both Comeau and VC9:
class A
{
public:
A() {}
A(A&){}
};
A lol;
A wow()
{
return lol;
}
int main()
{
A stick;
stick = wow();
return 0;
}
If this doesn't compile with your compiler, then I suspect your compiler to be broken. If it does, then that means you should have pasted the real code, instead of supplying a snippet that doesn't resemble the problem you see.
The call to wow results in a temporary object, an r-value. R-values cannot be bound to non-const references. Since your copy constructor accepts non-const references, you cannot pass the result of the call to wow directly to it. This is why adding the const fixes the problem. Now the copy constructor accepts const references, which r-values bind to just fine.
Chances are, your copy constructor does not change the object it is copying, so the parameter should be passed by const reference. This is how copy constructors are expected to work, except in specific, documented circumstances.
But as sbi points out in his answer, this copy constructor shouldn't be getting called at all. So while this is all true, it likely has nothing to do with your problem. Unless there is a compiler bug. Perhaps your compiler sees the two-step construction, and decided it'll cut out the middle man by converting A stick; stick = wow(); into A stick = wow(); But this would be a bug, as evidenced by the fact that it produces a compile error out of perfectly legal code. But without actual code, it's impossible to say what's really happening. There should be several other errors before any issues with your copy constructor come up.
Not reproducible. Are you missing the default constructor, or forgot to make the constructors public?
See http://www.ideone.com/nPsHj.
(Note that a copy constructor can take a cv A& argument with any const-volatile combination plus some default arguments. See §[class.copy]/2 in the C++ standard.)
Edit: Interesting, g++-4.3 (ideone) and 4.5 (with the -pedantic flag) don't give the compile error, but g++-4.2 does complain:
x.cpp: In function ‘int main()’:
x.cpp:19: error: no matching function for call to ‘A::A(A)’
x.cpp:7: note: candidates are: A::A(A&)
This function:
A wow(...)
{ ... }
returns an object by value.
This means it is copied back to the point where the function was called.
This line:
stick = wow(...);
Assigns the returned value to stick (stick already exists, so this is a copy assignment rather than a copy construction).
The value assigned to stick is the value copied back from the function wow().
But remember that the result of the call to wow() is a temporary object (it was copied back from wow() but is not in a variable yet).
So now we look at the copy constructor for A:
A(A& foo){ ..... }
You are trying to pass a temporary object to a reference parameter. This is not allowed. A temporary object can only be bound to a const reference. There are two solutions to the problem:
1) Use a const reference.
2) Pass by value into the copy constructor.
Unfortunately solution (2) is not actually available: passing by value itself requires the copy constructor, so a copy constructor that takes its argument by value would recurse infinitely, and the language does not allow declaring one. So your solution is to pass by const reference.
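A minimal sketch of solution (1) is shown below, alongside a hypothetical class B that keeps the non-const parameter for comparison. The value member, class B, and demo() are assumptions added for this example:

```cpp
#include <cassert>
#include <type_traits>

class A {
public:
    int value;
    A() : value(0) {}
    A(const A& foo) : value(foo.value) {}  // const& also accepts temporaries
    A& operator=(const A& p) { value = p.value; return *this; }
};

// For comparison: a copy constructor taking a non-const reference.
class B {
public:
    B() {}
    B(B&) {}
};

static_assert(std::is_constructible<A, A>::value,
              "A can be copied from an rvalue");
static_assert(!std::is_constructible<B, B>::value,
              "B cannot be copied from an rvalue");

A lol;

A wow() { return lol; }

void demo() {
    lol.value = 5;
    A stick;
    stick = wow();  // the returned temporary binds to operator='s const A& parameter
    assert(stick.value == 5);
}
```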