Should I always move on `sink` constructor or setter arguments? - c++

struct TestConstRef {
std::string str;
TestConstRef(const std::string& mStr) : str{mStr} { }
};
struct TestMove {
std::string str;
TestMove(std::string mStr) : str{std::move(mStr)} { }
};
After watching GoingNative 2013, I understood that sink arguments should always be passed by value and moved with std::move. Is TestMove::ctor the correct way of applying this idiom? Is there any case where TestConstRef::ctor is better/more efficient?
What about trivial setters? Should I use the following idiom or pass a const std::string&?
struct TestSetter {
std::string str;
void setStr(std::string mStr) { str = std::move(mStr); }
};

The simple answer is: yes.
The reason is quite simple as well: if you store by value, you might need either to move (from a temporary) or to make a copy (from an l-value). Let us examine what happens in both situations, with both approaches.
From a temporary
if you take the argument by const-ref, the temporary is bound to the const-ref and cannot be moved from, thus you end up making a (useless) copy.
if you take the argument by value, the value is initialized from the temporary (moving), and then you yourself move from the argument, thus no copy is made.
One limitation: a class without an efficient move constructor (such as std::array<T, N>), because then you make two copies instead of one.
From an l-value (or const temporary, but who would do that...)
if you take the argument by const-ref, nothing happens there, and then you copy it (cannot move from it), thus a single copy is made.
if you take the argument by value, you copy it in the argument and then move from it, thus a single copy is made.
One limitation: the same... classes for which moving is akin to copying.
So, the simple answer is that in most cases, by using a sink you avoid unnecessary copies (replacing them by moves).
The single limitation is classes for which the move constructor is as expensive (or nearly as expensive) as the copy constructor, in which case having two moves instead of one copy is worse. Thankfully, such classes are rare (arrays are one case).
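To make the counts above concrete, here is a minimal sketch that tallies copies and moves for both constructor styles. Tracked, ByConstRef, ByValue and check_sink_costs are hypothetical names introduced only for this illustration; the rvalue case uses std::move on a named object so that the numbers do not depend on copy elision.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper type: counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    std::string payload;
    explicit Tracked(std::string p) : payload(std::move(p)) {}
    Tracked(const Tracked& o) : payload(o.payload) { ++copies; }
    Tracked(Tracked&& o) noexcept : payload(std::move(o.payload)) { ++moves; }
    static void reset() { copies = 0; moves = 0; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Constructor takes the sink argument by const reference and copies it.
struct ByConstRef {
    Tracked member;
    explicit ByConstRef(const Tracked& t) : member(t) {}
};

// Constructor takes the sink argument by value and moves it into the member.
struct ByValue {
    Tracked member;
    explicit ByValue(Tracked t) : member(std::move(t)) {}
};

// Runs the four combinations discussed above and checks the counts.
void check_sink_costs() {
    Tracked source("a string long enough to live on the heap");

    Tracked::reset();
    ByConstRef a(source);             // l-value: one copy
    assert(Tracked::copies == 1 && Tracked::moves == 0);

    Tracked::reset();
    ByConstRef b(std::move(source));  // r-value: still one copy, source is untouched
    assert(Tracked::copies == 1 && Tracked::moves == 0);

    Tracked::reset();
    ByValue c(source);                // l-value: one copy (parameter) + one move (member)
    assert(Tracked::copies == 1 && Tracked::moves == 1);

    Tracked::reset();
    ByValue d(std::move(source));     // r-value: two moves, no copy
    assert(Tracked::copies == 0 && Tracked::moves == 2);
}
```

Calling check_sink_costs() from main exercises all four cases. When the argument is a true temporary (a prvalue) rather than std::move(x), the by-value version typically needs only one move, because the parameter is constructed in place.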

A bit late, as this question already has an accepted answer, but anyways... here's an alternative:
struct Test {
std::string str;
Test(std::string&& mStr) : str{std::move(mStr)} { } // 1
Test(const std::string& mStr) : str{mStr} { } // 2
};
Why would that be better? Consider the two cases:
From a temporary (case // 1)
Only one move-constructor is called for str.
From an l-value (case // 2)
Only one copy-constructor is called for str.
It probably can't get any better than that.
But wait, there is more:
No additional code is generated on the caller's side! The calling of the copy- or move-constructor (which might be inlined or not) can now live in the implementation of the called function (here: Test::Test) and therefore only a single copy of that code is required. If you use by-value parameter passing, the caller is responsible for producing the object that is passed to the function. This might add up in large projects and I try to avoid it if possible.

Related

Moving parameter to data member: Take parameter by copy or rvalue-ref?

I have a class object entity which gobbles up a string and shoves it into its member on construction (for the sake of this argument this could be any old member function).
Now I can do this in at least two ways: I might accept a std::string copy and move this into a member (inefficient_entity). However I could also directly take by rvalue-ref and just continue the move into the data member (efficient_entity).
Is there a performance difference between the two? I'm asking because it's way more convenient to take by copy and let the call site decide if it wants to move or copy the string. The other way I would probably need to create an overload set, which can grow to a huge size given that my constructor could also accept multiple arguments in the same manner.
Will this be optimized out anyway or do I have to worry?
Demo
#include <string>
#include <cstdio>
struct efficient_entity
{
efficient_entity(std::string&& str)
: str_ { std::move(str) }
{ }
std::string str_;
};
struct inefficient_entity
{
inefficient_entity(std::string str)
: str_ { std::move(str) }
{ }
std::string str_;
};
int main()
{
std::string create_here = "I guarantee you this string is absolutely huge!";
efficient_entity(std::move(create_here));
inefficient_entity(std::move(create_here));
}
Note: Afaik there was a talk on cppcon which covered exactly that but I can't find it anymore (appreciated if someone could drop a hint).
Passing by value costs you one extra move.
Passing by reference, on the other hand, requires you to write two functions: const T & or T &&. Or 2N functions for N parameters, or a template.
I prefer the first one by default, and use the second one in low-level utilities that should be fast.
Having just a single T && overload is viable, but highly unorthodox. This gives you one extra move only for copies, which have to be written as func(std::string(value)) (or func(auto(value)) in C++23).
The supposed benefit here is that all copies are explicit. I would only do this for heavy types, and only if you like this style.
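As a rough sketch of that last style (a single T&& overload with explicit copies at the call site), something like the following could be used. The names rvalue_only_entity, keep_original and give_away are made up for this example and do not come from the question.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical entity that only accepts rvalues, so every copy is visible at the call site.
struct rvalue_only_entity {
    explicit rvalue_only_entity(std::string&& str) : str_{std::move(str)} {}
    std::string str_;
};

// Passing a plain lvalue does not compile; the caller has to choose copy or move.
static_assert(!std::is_constructible<rvalue_only_entity, std::string&>::value,
              "an lvalue std::string must not be accepted");

// Caller wants to keep its string: the copy is spelled out, then moved into the member.
rvalue_only_entity keep_original(const std::string& value) {
    return rvalue_only_entity(std::string(value));
}

// Caller hands the string over: one move, and value is left in a valid but unspecified state.
rvalue_only_entity give_away(std::string& value) {
    return rvalue_only_entity(std::move(value));
}
```

In C++23 the explicit copy inside keep_original could be written as auto(value) instead of std::string(value).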

What's happening in this return statement?

I'm reading on copy elision (and how it's supposed to be guaranteed in C++17) and this got me a bit confused (I'm not sure I know things I thought I knew before). So here's a minimal test case:
std::string nameof(int param)
{
switch (param)
{
case 1:
return "1"; // A
case 2:
return "2"; // B
}
return std::string(); // C
}
The way I see it, cases A and B perform a direct construction on the return value so copy elision has no meaning here, while case C cannot perform copy elision because there are multiple return paths. Are these assumptions correct?
Also, I'd like to know if
there's a better way of writing the above (e.g. have a std::string retval; and always return that one or write cases A and B as return string("1") etc)
there's any move happening, for example "1" is a temporary but I'm assuming it's being used as a parameter for the constructor of std::string
there are optimization concerns I omitted (e.g. I believe C could be written as return {}, would that be a better choice?)
To make it NRVO-friendly, you should always return the same object. The value of the object might be different, but the object should be the same.
However, following the above rule makes the program harder to read, and often one should opt for readability over an unnoticeable performance improvement. Since std::string has a move constructor, the difference between moving a pointer and a length and not doing so is so tiny that I see no way of actually noticing it in the application.
As for your last question, return std::string() and return {} would be exactly the same.
There are also some incorrect statements in your question. For example, "1" is not a temporary. It's a string literal. A temporary is created from this literal.
Last, but not least, mandatory C++17 copy elision is not the same thing as NRVO. It covers initialization from a prvalue of the same type, in cases like
std::string x = "X";
which before the mandatory requirement could generate code to create a temporary std::string and initialize x with copy (or move) constructor.
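Before C++17, the copy might or might not be elided in any of these cases. Consider:
std::string j = nameof(whatever);
This could be implemented in one of two ways:
Only one std::string object is ever constructed, j. (The copy is elided.)
A temporary std::string object is constructed, its value is copied to j, then the temporary is destroyed. (The function returns a temporary that is copied.)
Since C++17, only the first is allowed here, because every return statement in nameof initializes the result either from a prvalue or directly from the string literal.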
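The single-object outcome can be observed with a small counting type. This is only a sketch: Counted and make are hypothetical stand-ins for std::string and nameof. Since C++17 the zero copy/move result is guaranteed for returned prvalues like these; with older standards it relies on the compiler performing the (very common) elision.

```cpp
// Hypothetical stand-in for std::string that counts its constructor calls.
struct Counted {
    static int constructed;
    static int copied;
    static int moved;
    int value;
    explicit Counted(int v) : value(v) { ++constructed; }
    Counted(const Counted& o) : value(o.value) { ++copied; }
    Counted(Counted&& o) noexcept : value(o.value) { ++moved; }
};
int Counted::constructed = 0;
int Counted::copied = 0;
int Counted::moved = 0;

// Same shape as nameof: several return statements, each returning a prvalue.
Counted make(int param) {
    switch (param) {
    case 1:
        return Counted(1);
    case 2:
        return Counted(2);
    }
    return Counted(0);
}
```

After Counted j = make(2); the counters should show one construction and no copies or moves, even though the function has three return paths. Multiple return paths only matter for NRVO, i.e. when named local objects are returned.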

why use a move constructor?

I'm a little confused as to why you would use/need a move constructor.
If I have the following:
vector fill(istream& is)
{
vector res;
for(double x; is >> x; res.push_back(x));
return res;
}
void bar()
{
vector vec = fill(cin);
// ... use vec ...
}
I can remove the need to return res, hence not calling the copy constructor, by adding vector fill(istream& is, vector& res).
So what is the point of having a move constructor?
Assume you next put your std::vector<T> into a std::vector<std::vector<T>> (if you think vectors shouldn't be nested, assume the inner type to be std::string and assume we are discussing std::string's move constructor): even though you can add an empty object and fill it in-place, eventually the vector will need to be relocated upon resizing, at which point moving the elements comes in handy.
Note that returning from a function isn't the main motivator of move construction, at least, not with respect to efficiency: where efficiency matters structuring the code to enable copy-elision further improves performance by even avoiding the move.
The move constructor may still be relevant semantically, though, because returning requires that a type is either copyable or movable. Some types, e.g., streams, are not copyable but they are movable and can be returned that way.
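The relocation point can be illustrated with a counting element type. This is a sketch under assumptions: Element and build are hypothetical names, and the move constructor is marked noexcept because std::vector otherwise falls back to copying during reallocation when a copy constructor is available.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type that counts copies and moves.
struct Element {
    static int copies;
    static int moves;
    std::string text;
    explicit Element(std::string t) : text(std::move(t)) {}
    Element(const Element& o) : text(o.text) { ++copies; }
    Element(Element&& o) noexcept : text(std::move(o.text)) { ++moves; }
};
int Element::copies = 0;
int Element::moves = 0;

// Fills a vector without reserve(), so it has to reallocate as it grows.
std::vector<Element> build(int n) {
    std::vector<Element> v;
    for (int i = 0; i < n; ++i)
        v.emplace_back("element " + std::to_string(i));  // constructed in place
    return v;  // returning the vector itself does not touch the elements
}
```

After build(100), Element::copies is expected to stay at zero while Element::moves grows with every reallocation; without a move constructor each of those relocations would have been a deep copy of the string.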
In your example the compiler might apply RVO (Return Value Optimization): res is then constructed directly in the storage of vec, so no copy or move takes place on return. Only if it cannot apply RVO will the move constructor be used (if available).
Before move semantics were introduced people were using various techniques to simulate them. One of them is actually returning values by references.
One reason is that using assignment operators makes it easier to grasp what each line is doing. If you have a function call somefunction(var1, var2, var3), it is not clear whether any of the arguments get modified. To find that out, you have to actually read the other function.
Additionally, if you have to pass a vector as an argument to fill(), it means every place that calls your function will require two lines instead of one: First to create an empty vector, and then to call fill().
Another reason is that a move constructor allows the function to return an instance of a class that does not have a default constructor. Consider the following example:
struct something{
something(int i) : value(i) {}
something(const something& a) : value(a.value) {}
int value;
};
something function1(){
return something(1);
}
void function2(something& other){
other.value = 2;
}
int main(void){
// valid usage
something var1(18);
// (do something with var1 and then re-use the variable)
var1 = function1();
// compile error
something var2;
function2(var2);
}
In case you are concerned about efficiency, it should not matter whether you write your fill() to return a value or to take an output variable as a parameter. Your compiler should optimize it to the most efficient of those two alternatives. If you suspect it doesn't, you had better measure it.

Pass by value or const reference? [duplicate]

There is a good set of rules for deciding whether to pass by value or by const reference:
If the function intends to change the argument as a side effect, take it by non-const reference.
If the function doesn't modify its argument and the argument is of primitive type, take it by value.
Otherwise take it by const reference, except in the following case: if the function would then need to make a copy of the const reference anyway, take it by value.
For a constructor like the following, how do I decide?
class A
{
public:
A(string str) : mStr(str) {} // here which is better,
// pass by value or const reference?
void setString(string str) { mStr = str; } // how about here?
private:
string mStr;
};
In this particular case, and assuming C++11 and move construction/assignment for strings, you should take the argument by value and move it to the member for the constructor.
A::A(string str) : mStr(std::move(str)) {}
The case of the setter is a bit trickier and I am not sure whether you really want/need to optimize every bit of it... If you want to optimize the most you could provide two overloads, one taking an rvalue reference and another taking a const lvalue reference. At any rate, the const lvalue reference is probably a good enough approach:
void A::setString(string const& str) { mStr = str; }
Why the difference?
In the case of the constructor, the member is not yet built, so it is going to need to allocate memory. You can move that memory allocation (and the actual copying of the data, but that is the lesser cost) to the interface, so that if the caller has a temporary it can be forwarded without an additional memory allocation.
In the case of assignment things are a bit more complicated. If the current capacity of the string is large enough to hold the new value, then no allocation is required, but if the string is not large enough, then it will need to reallocate. If the allocation is moved to the interface (by-value argument), it will always be executed, even when it is unnecessary. If the allocation is done inside the function (const reference argument), then for a small set of cases (those where the argument is a temporary that is larger than the current buffer) an allocation that could otherwise have been avoided would be done.
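The buffer-reuse argument for the const-reference setter can be checked with a small experiment. This is a sketch, not a guarantee: the standard does not promise that copy assignment keeps the existing buffer, but common standard library implementations do so when the capacity is sufficient. The get() accessor and setter_reuses_buffer are hypothetical additions for the test.

```cpp
#include <string>
#include <utility>

class A {
public:
    // Constructor sink: take by value and move into the member.
    explicit A(std::string str) : mStr(std::move(str)) {}
    // Setter: copy-assign from a const reference so mStr can reuse its capacity.
    void setString(const std::string& str) { mStr = str; }
    // Hypothetical accessor, added only so the buffer can be inspected.
    const std::string& get() const { return mStr; }
private:
    std::string mStr;
};

// Returns true if assigning a shorter string kept the existing heap buffer.
bool setter_reuses_buffer() {
    A a(std::string(200, 'x'));          // long enough to be heap-allocated
    const char* before = a.get().data();
    const std::string shorter(100, 'y'); // fits into the existing capacity
    a.setString(shorter);
    return a.get().data() == before && a.get() == shorter;
}
```

With a by-value setter the same call would first allocate a new 100-character string for the parameter and then swap or move it in, which is the extra allocation described above.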
The article you cite is not a good reference for software engineering. (It is also likely out of date, given that it talks about move semantics and is dated from 2003.)
The general rule is simple: pass class types by const reference, and other types by value. There are explicit exceptions: in keeping with the conventions of the standard library, it is also usual to pass iterators and functional objects by value.
Anything else is optimization, and shouldn't be undertaken until the profiler says you have to.
In this case it is better to pass the argument by const reference. Reason: string is a class type, you don't modify it, and it can be arbitrarily big.
It is always better to use the member initialization list to initialize your values as it provides with the following advantages:
1) The assignment version first default-constructs mStr and then assigns a new value on top of the default-constructed one. Using the MIL avoids this wasted construction because the arguments in the list are used as constructor arguments.
2) It's the only place to initialize const members, unless these are just integers, for which you can use an enum in the class: enum T{v=2};
3) It's the place to initialize references.
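These three points can be seen in a small sketch. Probe, WithAssignment and WithInitList are hypothetical types made up for the illustration; Probe simply counts which of its special members run.

```cpp
// Hypothetical member type that records how it was initialized.
struct Probe {
    static int default_ctors;
    static int value_ctors;
    static int assignments;
    int v;
    Probe() : v(0) { ++default_ctors; }
    explicit Probe(int x) : v(x) { ++value_ctors; }
    Probe(const Probe&) = default;
    Probe& operator=(const Probe& o) { v = o.v; ++assignments; return *this; }
};
int Probe::default_ctors = 0;
int Probe::value_ctors = 0;
int Probe::assignments = 0;

// Assigns in the constructor body: p is default-constructed first, then overwritten.
struct WithAssignment {
    Probe p;
    const int limit;  // a const member still has to go through the initializer list
    WithAssignment(int x, int lim) : limit(lim) { p = Probe(x); }
};

// Uses the member initialization list for everything, including the reference.
struct WithInitList {
    Probe p;
    const int limit;
    int& ref;         // a reference member can only be bound in the initializer list
    WithInitList(int x, int lim, int& r) : p(x), limit(lim), ref(r) {}
};
```

Constructing a WithAssignment costs one default construction, one value construction and one assignment of Probe, while WithInitList needs a single value construction.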
This is what I would suggest:
#include <iostream>
#include <string>
#include <utility>
class A
{
private:
    std::string mStr;
public:
    // Take by value and move into the member (C++11 and later).
    // Before C++11, take const std::string& str and copy instead.
    A(std::string str):mStr(std::move(str)){}
    void setString(std::string str)
    {
        mStr = std::move(str);
    }
    // Return a const reference to avoid copying on every read.
    const std::string& getString() const
    {
        return mStr;
    }
};
int main()
{
A a("this is a 1st test.");
a.setString("this is a 2nd test.");
std::cout<<a.getString()<<std::endl;
}
Have a look at this:
http://www.cplusplus.com/forum/articles/17820/

c++ copy construct parameter passed by value

I want freeFunct to do non const stuff on its own copy of object a.
Let's say that freeFunct is required to be a free function
because in real code cases it takes many different parameters,
calls several public functions from all of them and there is no
point in making it a non-static member function of any class.
Three different ways of declaring it come to my mind.
I have the feeling that the third solution is the worst.
Is there any difference between the first two?
Is there something better?
void freeFunct1(A a){
a.doStuff();
}
void freeFunct2(const A& a){
A b = a;
b.doStuff();
}
/**users of freeFunct3 are expected
*to give a copy of their variable:
*{
* A b = a;
* freeFunct3(b);
*}
*/
void freeFunct3(A& a){
a.doStuff();
}
The first is best: it allows the caller to choose whether to copy or move his object, so can be more efficient if the caller doesn't need to keep a copy.
freeFunct1(a); // "a" is copied and not changed
freeFunct1(std::move(a)); // "a" is moved and perhaps changed
The second is similar, but forces a copy.
The third, as you say, is more error-prone, since the caller has to be aware that it will modify the argument.
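A quick way to see the difference between the first two is to count copies and moves. In this sketch A is a hypothetical stand-in with counters, and both functions return the modified state (unlike the void originals) purely so the effect can be checked.

```cpp
#include <utility>

// Hypothetical stand-in for A that counts copies and moves.
struct A {
    static int copies;
    static int moves;
    int state = 0;
    A() = default;
    A(const A& o) : state(o.state) { ++copies; }
    A(A&& o) noexcept : state(o.state) { ++moves; }
    void doStuff() { ++state; }  // the "non const stuff"
};
int A::copies = 0;
int A::moves = 0;

// By value: the caller decides whether the parameter is copied or moved.
int freeFunct1(A a) {
    a.doStuff();
    return a.state;
}

// By const reference: the function always makes its own copy.
int freeFunct2(const A& a) {
    A b = a;
    b.doStuff();
    return b.state;
}
```

With these definitions, freeFunct1(a) costs one copy and leaves a unchanged, freeFunct1(std::move(a)) costs one move and no copy, and freeFunct2(std::move(a)) still costs one copy because the rvalue just binds to the const reference.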
First, as already said, don't do freeFunct3 if the semantics of the free function is to only modify its "own" object.
Second, there are differences between freeFunct1 and freeFunct2, relating to move optimization [C++11], exception safety, and potentially code size.
With freeFunct2 (taking by reference-to-const):
It will always construct a new copy of the argument, never move it [C++11].
If the copy construction of A throws an exception, it will throw inside the body of the function.
If A's copy constructor is inlined (and the function is not), it will be expanded once, inside the body of the function (even if the function is called from multiple different places).
With freeFunct1 (taking by value):
[C++11] You can avoid a copy if A has a move constructor and you pass an rvalue (e.g. call freeFunct1(A(args))).
If the copy (or move) construction of A throws an exception, it will throw at the call site.
If A's copy (or move) constructor is inlined, it will be expanded multiple times, at each call site.
Alternatively, you can overload on lvalue/rvalue reference to avoid unnecessarily copying rvalues:
void freeFunct4(const A& a){
A b = a;
b.doStuff();
}
void freeFunct4(A&& a){
a.doStuff();
}
IMO, the first is the best and the last is the worst.
Quite a few people, however, have gotten so accustomed to passing by const reference that they'll write #2 by default, even though in this case they need the copy that it's trying to avoid.
The first changes only the local copy. The second is the same as the first, but with extra code. The third will make changes to a visible to the caller of freeFunct3 as it's a non-const reference. If called as in the comment above the function, then it's no different than the second version really.
So if you just want to modify the local copy, without those changes being passed to the caller, then the first version is what I recommend.