C++ using move semantics to pass argument to another function - c++

I have a function that takes two arguments, but one of them (a map container) is passed on to another function:
void myFunc(const std::map<std::string, int> &myMap, int num) {
int x = internalFunc(myMap);
// do some stuff...
}
int internalFunc(const std::map<std::string, int> &myMap) {
// return some map statistics
}
somewhere in main.cpp :
std::map<std::string, int> map1{ {"Hello", 10}, {"Hello2", 20}, {"Hello3", 30} };
myFunc(map1, 20);
My question is:
Are move semantics a good way to optimise this piece of code (passing one argument on to another function using move), like this:
int internalFunc(std::map<std::string, int> &&myMap) {
// now gets rvalue reference
// return some map statistics
}
void myFunc(std::map<std::string, int> myMap, int num) {
int x = internalFunc(std::move(myMap));
// do some stuff...
}
I prefer not to use universal reference (using template) and std::forward in this case because this function is always called with this type of map and I prefer to keep the code as simple as possible.
internalFunc is always called by this one specific function myFunc.
Are move semantics a good way to optimise this kind of function? I understand that optimising with move semantics depends on the type of the moved object, but let's stick to the example above with the standard map container.
Thank you

Move semantics are useful if you need to modify the map or if the function needs to take ownership of the map for some reason (maybe the latter is more understandable if the functions were class members, i.e. setters or constructors for instance).
You should use const std::map<std::string, int>& for 3 main reasons:
You only want read access.
Readability: the user will understand quickly that the map won't be modified.
You won't get better results using move semantics.
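As a minimal sketch of the const-reference version: the body of internalFunc below (summing the values) is an assumption, since the question only says it returns "some map statistics", and myFunc is given an int return type purely so the result can be checked. The same reference is handed from myFunc to internalFunc, so the map is neither copied nor moved.

```cpp
#include <cassert>
#include <map>
#include <string>

using Map = std::map<std::string, int>;

// Hypothetical "statistics" function: sums the values. It only reads the map.
int internalFunc(const Map& myMap) {
    int sum = 0;
    for (const auto& entry : myMap) {
        sum += entry.second;
    }
    return sum;
}

// Forwards the same reference; no copy or move of the map takes place.
int myFunc(const Map& myMap, int num) {
    return internalFunc(myMap) + num;
}

// Usage: the caller's map is read, never copied or modified.
void demoConstRef() {
    const Map map1{{"Hello", 10}, {"Hello2", 20}, {"Hello3", 30}};
    assert(myFunc(map1, 20) == 80);  // 10 + 20 + 30 + 20
}
```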
Note on move semantics
If you use move semantics, the arguments of your functions don't necessarily need the double &&. Generally, it is better to omit them (except for perfect forwarding and non-copyable objects like stream objects). The && requires that the passed arguments be rvalues. However, omitting the && does not mean you cannot pass the arguments as rvalues. Let's see this with an example:
int internalFunc(std::map<std::string, int> myMap)
{
/* ... */
return 1; // or whatever
}
void myFunc(std::map<std::string, int> myMap, int num)
{
int x = internalFunc(std::move(myMap));
// do some stuff...
}
The arguments myMap in the above code don't get copied if you pass them as rvalues:
int main()
{
myFunc({ {"Hello", 10}, {"Hello2", 20}, {"Hello3", 30} }, 20);
return 0;
}
Moreover, you can now use the same code by passing lvalues, like you do in the main function of your question (myFunc(map1, 20);). The argument myMap in myFunc is then a copy, of course. By contrast, a myFunc that took std::map<std::string, int>&& would not compile for this call, because an lvalue cannot bind to an rvalue reference.
If you really want to make sure the maps don't get copied, you can use the &&, but this is rather rare, and, in my opinion, should be used only for objects which cannot be copied (e.g. stream objects).
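To see what the by-value version costs, here is a small sketch with a hypothetical Tracked type that counts its own copy and move constructions (Tracked, inner and outer are illustrative names, not from the question). An lvalue argument pays for one copy and one move, while an rvalue argument is never copied.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper type that counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

void inner(Tracked t) { (void)t; }              // sink: takes ownership by value
void outer(Tracked t) { inner(std::move(t)); }  // hands its own parameter on

void demoByValueSink() {
    Tracked named;
    outer(named);  // lvalue: one copy into outer, then one move into inner
    assert(Tracked::copies == 1);
    assert(Tracked::moves == 1);

    Tracked::copies = Tracked::moves = 0;
    outer(Tracked{});  // rvalue: no copy at all
    assert(Tracked::copies == 0);
}
```

The exact number of moves in the rvalue case depends on the language version and on copy elision, which is why only the copy count is checked there.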

Related

Do const and reference in function parameters cause unnecessary casts? [duplicate]

I have some pre-C++11 code in which I use const references to pass large parameters like vector's a lot. An example is as follows:
int hd(const vector<int>& a) {
return a[0];
}
I heard that with new C++11 features, you can pass the vector by value as follows without performance hits.
int hd(vector<int> a) {
return a[0];
}
For example, this answer says
C++11's move semantics make passing and returning by value much more attractive even for complex objects.
Is it true that the above two options are the same performance-wise?
If so, when is using const reference as in option 1 better than option 2? (i.e. why do we still need to use const references in C++11).
One reason I ask is that const references complicate deduction of template parameters, and it would be a lot easier to use pass-by-value only, if it is the same with const reference performance-wise.
The general rule of thumb for passing by value is when you would end up making a copy anyway. That is to say that rather than doing this:
void f(const std::vector<int>& x) {
std::vector<int> y(x);
// stuff
}
where you first pass a const-ref and then copy it, you should do this instead:
void f(std::vector<int> x) {
// work with x instead
}
This was already partially true in C++03, and has become more useful with move semantics, as the copy may be replaced by a move in the pass-by-value case when the function is called with an rvalue.
Otherwise, when all you want to do is read the data, passing by const reference is still the preferred, efficient way.
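A short sketch of that rule of thumb, using hypothetical function names (sortedCopy and first are not from the question): the function that needs its own mutable copy takes the vector by value, and the one that only reads takes a const reference.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// The function needs its own mutable copy, so it takes the vector by value.
std::vector<int> sortedCopy(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;  // a by-value parameter is moved out on return, not copied
}

// Read-only access: const reference.
int first(const std::vector<int>& values) { return values[0]; }

void demoSortedCopy() {
    const std::vector<int> original{3, 1, 2};
    const std::vector<int> sorted = sortedCopy(original);  // copies once, at the call
    assert(first(sorted) == 1);
    assert(first(original) == 3);  // the caller's vector is untouched
}
```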
There is a big difference. You will get a copy of a vector's internal array unless it was about to die.
int hd(vector<int> a) {
//...
}
hd(func_returning_vector()); // internal array is "stolen" (move constructor is called)
vector<int> v = {1, 2, 3, 4, 5, 6, 7, 8};
hd(v); // internal array is copied (copy constructor is called)
C++11 and the introduction of rvalue references changed the rules about returning objects like vectors - now you can do that (without worrying about a guaranteed copy). No basic rules about taking them as argument changed, though - you should still take them by const reference unless you actually need a real copy - take by value then.
C++11's move semantics make passing and returning by value much more attractive even for complex objects.
The sample you give, however, is a sample of pass by value
int hd(vector<int> a) {
So C++11 has no impact on this.
Even if you had correctly declared 'hd' to take an rvalue
int hd(vector<int>&& a) {
it may be cheaper than pass-by-value but performing a successful move (as opposed to a simple std::move which may have no effect at all) may be more expensive than a simple pass-by-reference. A new vector<int> must be constructed and it must take ownership of the contents of a. We don't have the old overhead of having to allocate a new array of elements and copy the values over, but we still need to transfer the data fields of vector.
More importantly, if hd actually moves from its argument, the caller's vector is left in a valid but unspecified state (typically empty):
std::vector<int> x;
x.push_back(1);
int n = hd(std::move(x));
std::cout << x.size() << '\n'; // may no longer be 1 if hd moved from x
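The difference between merely binding an rvalue reference and actually moving can be sketched as follows. The names peek and consume are hypothetical; the point is that std::move by itself changes nothing, and the source is only emptied out when something move-constructs or move-assigns from it.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Only binds a reference to the caller's vector; nothing is moved.
std::size_t peek(std::vector<int>&& a) { return a.size(); }

// Move-constructs a local vector, which takes over the caller's buffer.
std::size_t consume(std::vector<int>&& a) {
    std::vector<int> local(std::move(a));
    return local.size();
}

void demoMovedFrom() {
    std::vector<int> x{1, 2, 3};
    assert(peek(std::move(x)) == 3);
    assert(x.size() == 3);  // std::move alone changed nothing

    assert(consume(std::move(x)) == 3);
    // x is now moved-from: valid, but its contents should not be relied on.
    x = {4, 5};  // assigning a new value is fine
    assert(x.size() == 2);
}
```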
Consider the following full example:
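#include <cstdio>   // printf
#include <cstdlib>  // free
#include <cstring>  // strdup (POSIX)
#include <utility>  // std::move
struct Str {
char* m_ptr;
Str() : m_ptr(nullptr) {}
Str(const char* ptr) : m_ptr(strdup(ptr)) {}
// Guard against copying an empty Str: strdup(nullptr) is undefined.
Str(const Str& rhs) : m_ptr(rhs.m_ptr ? strdup(rhs.m_ptr) : nullptr) {}
// Steal the buffer and leave the source empty.
Str(Str&& rhs) : m_ptr(rhs.m_ptr) {
rhs.m_ptr = nullptr;
}
~Str() {
if (m_ptr) {
printf("dtor: freeing %p\n", static_cast<void*>(m_ptr));
free(m_ptr);
m_ptr = nullptr;
}
}
};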
void hd(Str&& str) {
printf("str.m_ptr = %p\n", str.m_ptr);
}
int main() {
Str a("hello world"); // duplicates 'hello world'.
Str b(a); // creates another copy
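hd(std::move(b)); // lets hd bind b as an rvalue; hd could move from it, but this hd does not.
//hd(b); // compile error: an lvalue cannot bind to Str&&
printf("after hd, b.m_ptr = %p\n", b.m_ptr); // unchanged here, because hd never constructed a Str from str.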
}
As a general rule:
Pass by value for trivial objects,
Pass by value if the destination needs a mutable copy,
Pass by value if you always need to make a copy,
Pass by const reference for non-trivial objects where the viewer only needs to see the content/state but doesn't need it to be modifiable,
Move when the destination needs a mutable copy of a temporary/constructed value (e.g. std::move(std::string("a") + std::string("b"))).
Move when you require locality of the object state but want to retain existing values/data and release the current holder.
Remember that if you are not passing in an r-value, then passing by value would result in a full blown copy. So generally speaking, passing by value could lead to a performance hit.
Your example is flawed. C++11 does not give you a move with the code that you have, and a copy would be made.
However, you can get a move by declaring the function to take an rvalue reference, and then passing one:
int hd(vector<int>&& a) {
return a[0];
}
// ...
std::vector<int> a = ...
int x = hd(std::move(a));
That's assuming that you won't be using the variable a in your function again except to destroy it or to assign to it a new value. Here, std::move casts the value to an rvalue reference, allowing the move.
Const references allow temporaries to be silently created. You can pass in something that is appropriate for an implicit constructor, and a temporary will be created. The classic example is a char array being converted to const std::string& but with std::vector, a std::initializer_list can be converted.
So:
int hd(const std::vector<int>&); // Declaration of const reference function
int x = hd({1,2,3,4});
And of course, you can move the temporary in as well:
int hd(std::vector<int>&&); // Declaration of rvalue reference function
int x = hd({1,2,3,4});
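Both cases can be checked with a small sketch. The names length, hdConstRef and hdRvalueRef are hypothetical stand-ins for the hd declarations above, with trivial bodies added so the calls can run.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

std::size_t length(const std::string& s) { return s.size(); }
int hdConstRef(const std::vector<int>& a) { return a[0]; }
int hdRvalueRef(std::vector<int>&& a) { return a[0]; }

void demoTemporaries() {
    assert(length("hello") == 5);            // temporary std::string from a char array
    assert(hdConstRef({1, 2, 3, 4}) == 1);   // temporary vector from a braced list
    assert(hdRvalueRef({1, 2, 3, 4}) == 1);  // the same kind of temporary binds to &&

    std::vector<int> named{9, 8};
    assert(hdConstRef(named) == 9);          // const& also accepts lvalues
    // hdRvalueRef(named);                   // would not compile: lvalue to &&
}
```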

C++ std::vector difference between creating object then adding it vs creating it inside the vector?

Since std::vector::push_back(obj) creates a copy of the object, would it be more efficient to create it within the push_back() call than beforehand?
struct foo {
int val;
std::string str;
foo(int _val, std::string _str) :
val(_val), str(_str) {}
};
int main() {
std::vector<foo> list;
std::string str("hi");
int val = 2;
list.push_back(foo(val,str));
return 0;
}
// or
int main() {
std::vector<foo> list;
std::string str("hi");
int val = 2;
foo f(val,str);
list.push_back(f);
return 0;
}
list.push_back(foo(val,str));
asks for a foo object to be constructed, and then passed into the vector. So both approaches are similar in that regard.
However, with this approach a C++11 compiler will treat the foo object as a temporary (rvalue) and will use void vector::push_back(T&&) instead of void vector::push_back(const T&), which is likely to be faster in most situations. You could also get this behavior with a previously declared object with:
foo f(val,str);
list.push_back(std::move(f));
Also, note that (in c++11) you can do directly:
list.emplace_back(val, str);
It's actually somewhat involved. For starters, we should note that std::vector::push_back is overloaded on the two reference types:
void push_back( const T& value );
void push_back( T&& value );
The first overload is invoked when we pass an lvalue to push_back, like f in your second version, because an rvalue reference cannot bind to an lvalue. When we pass an rvalue, like in your first version, both overloads are viable, but the rvalue reference overload is the better match and is the one chosen.
Does it make a difference? Only if your type benefits from move semantics. You didn't provide any copy or move operation, so the compiler is going to implicitly define them for you. And they are going to copy/move each member respectively. Because std::string (of which you have a member) actually does benefit from being moved if the string is very long, you might see better performance if you choose not to create a named object and instead pass an rvalue.
But if your type doesn't benefit from move semantics, you'll see no difference whatsoever. So on the whole, it's safe to say that you lose nothing, and can gain plenty by "creating the object at the call".
Having said all that, we mustn't forget that a vector supports another insertion method. You can forward the arguments for foo's constructor directly into the vector via a call to std::vector::emplace_back. That one will avoid any intermediate foo objects, even the temporary in the call to push_back, and will create the target foo directly at the storage the vector intends to provide for it. So emplace_back may often be the best choice.
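The three insertion styles can be compared with a sketch like the one below. Foo is a hypothetical variant of the question's foo with counters added to its copy and move constructors; the reserve call is there only so that reallocation does not disturb the counts.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Variant of foo that counts copy and move constructions (counters are illustrative).
struct Foo {
    static int copies;
    static int moves;
    int val;
    std::string str;
    Foo(int v, std::string s) : val(v), str(std::move(s)) {}
    Foo(const Foo& other) : val(other.val), str(other.str) { ++copies; }
    Foo(Foo&& other) noexcept : val(other.val), str(std::move(other.str)) { ++moves; }
};
int Foo::copies = 0;
int Foo::moves = 0;

void demoInsertion() {
    std::vector<Foo> list;
    list.reserve(3);  // no reallocation, so the counters reflect the insertions only

    Foo f(1, "named");
    list.push_back(f);                 // lvalue: push_back(const T&), one copy
    assert(Foo::copies == 1 && Foo::moves == 0);

    list.push_back(Foo(2, "temp"));    // rvalue: push_back(T&&), one move
    assert(Foo::copies == 1 && Foo::moves == 1);

    list.emplace_back(3, "in place");  // constructed inside the vector's storage
    assert(Foo::copies == 1 && Foo::moves == 1);
    assert(list.size() == 3);
}
```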
You'd better use
list.emplace_back(val, str)
if you are creating a new element and pushing it into your vector. That way the element is constructed in place. (Writing emplace_back(foo(val,str)) would still create a temporary foo and move it in.)
If you’ve already created your object and you are sure you will never use it alone for another instruction, then you can do
push_back(std::move(f))
In that case f is left in a moved-from state (valid but unspecified) and its contents are now owned by the element in your vector.

why use a move constructor?

I'm a little confused as to why you would use/need a move constructor.
If I have the following:
vector fill(istream& is)
{
vector res;
for(double x; is >> x; res.push_back(x));
return res;
}
void bar()
{
vector vec = fill(cin);
// ... use vec ...
}
I can remove the need to return res, hence not calling the copy constructor, by adding vector fill(istream& is, vector& res).
So what is the point of having a move constructor?
Assume you next put your std::vector<T> into a std::vector<std::vector<T>> (if you think vectors shouldn't be nested, assume the inner type to be std::string and assume we are discussing std::string's move constructor): even though you can add an empty object and fill it in place, eventually the outer vector will need to relocate its elements when it grows, at which point moving the elements comes in handy.
Note that returning from a function isn't the main motivator of move construction, at least, not with respect to efficiency: where efficiency matters structuring the code to enable copy-elision further improves performance by even avoiding the move.
The move constructor may still be relevant semantically, though, because returning requires that a type is either copyable or movable. Some types, e.g., streams, are not copyable but they are movable and can be returned that way.
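As a small illustration of the semantic point, here is a sketch with std::unique_ptr, which is movable but not copyable (the function name makeValue is hypothetical). Returning it by value works only because a move constructor exists.

```cpp
#include <cassert>
#include <memory>

// std::unique_ptr cannot be copied, yet it can be returned by value because it is movable.
std::unique_ptr<int> makeValue(int v) {
    std::unique_ptr<int> p(new int(v));
    return p;  // moved (or elided), never copied
}

void demoMoveOnlyReturn() {
    std::unique_ptr<int> owner = makeValue(7);
    assert(owner && *owner == 7);
    // std::unique_ptr<int> copy = owner;  // would not compile: copy constructor is deleted
}
```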
In your example the compiler might apply RVO (return value optimization): the returned object is constructed directly in the caller's storage, so no copy or move takes place. The move constructor is used (if available) only when RVO cannot be applied.
Before move semantics were introduced people were using various techniques to simulate them. One of them is actually returning values by references.
One reason is that using assignment makes it easier to grasp what each line is doing. If you have a function call somefunction(var1, var2, var3), it is not clear whether any of the arguments get modified. To find out, you have to actually read the other function.
Additionally, if you have to pass a vector as an argument to fill(), it means every place that calls your function will require two lines instead of one: First to create an empty vector, and then to call fill().
Another reason is that a move constructor allows the function to return an instance of a class that does not have a default constructor. Consider the following example:
struct something{
something(int i) : value(i) {}
something(const something& a) : value(a.value) {}
int value;
};
something function1(){
return something(1);
}
void function2(something& other){
other.value = 2;
}
int main(void){
// valid usage
something var1(18);
// (do something with var1 and then re-use the variable)
var1 = function1();
// compile error
something var2;
function2(var2);
}
If you are concerned about efficiency, it should not matter whether you write your fill() to return a value or to take an output variable as a parameter. Your compiler should optimize it to the most efficient of those two alternatives. If you suspect it doesn't, you had better measure it.

How should I write function parameters to enforce a move rather than a copy?

I want to move a large container from a return value into another class using that class' constructor. How do I formulate the parameter to ensure that it doesn't end up being copied?
/* for the sake of simplicity, imagine this typedef to be global */
typedef std::unordered_map<std::string, unsigned int> umap;
umap foo()
{
umap m; /* fill with lots of data */
return m;
}
class Bar
{
public:
Bar(umap m) : bm(m) { }
private:
umap bm;
};
Bar myBar(foo()); // run foo and pass return value directly to Bar constructor
Will above formulation trigger the appropriate behavior, or do I need to specify the constructor's parameters as rvalue-references, the way containers do for their own move-semantics?
public:
Bar(umap&& m) : bm(m) { }
or
public:
Bar(umap&& m) : bm(std::move(m)) { }
...?
If you want to support moving and copying, the easiest way would be to pass by value:
Bar(umap m) : bm(std::move(m)) { }
Now you can construct Bar from both an lvalue and an rvalue:
umap m;
Bar b1(m); // copies
Bar b2(std::move(m)); // moves
Bar b3(make_umap()); // moves
If you only want to support rvalues, then use an explicit rvalue reference:
Bar(umap && m) : bm(std::move(m)) { }
The std::move is always necessary, since (m) is always an lvalue.
Enforce passing of rvalue into the constructor and also make sure to move the map, like this:
Bar(umap &&m) : bm(std::move(m)) {}
Starting with the first example, to the above solution:
Let's start with the first example of Bar(umap m) : bm(m) {}. This will perform at least 1 copy, maybe 2.
bm(m) will always copy. m here is lvalue even though it's bound to an rvalue. To make it move, you must wrap it in std::move to turn it back into an rvalue. We get bm(std::move(m)).
Bar(umap m) might copy. The value will be moved in if it's an rvalue, but will be copied if it's an lvalue. Since we want to prevent all copies, we need to only let rvalue bind. We get Bar(umap && m).
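The two points above can be checked with a sketch that replaces umap with a hypothetical Payload type counting its copies and moves (Payload, BarCopying and BarMoving are illustrative names). The first constructor copies even from a temporary; the second accepts only rvalues and moves exactly once.

```cpp
#include <cassert>
#include <utility>

// Stand-in for umap that counts copies and moves (hypothetical helper).
struct Payload {
    static int copies;
    static int moves;
    Payload() = default;
    Payload(const Payload&) { ++copies; }
    Payload(Payload&&) noexcept { ++moves; }
};
int Payload::copies = 0;
int Payload::moves = 0;

struct BarCopying {
    explicit BarCopying(Payload m) : bm(m) {}              // m is an lvalue: copies
    Payload bm;
};
struct BarMoving {
    explicit BarMoving(Payload&& m) : bm(std::move(m)) {}  // rvalues only: moves
    Payload bm;
};

void demoBar() {
    BarCopying a(Payload{});
    assert(Payload::copies == 1);  // copied even though the argument was a temporary

    Payload::copies = Payload::moves = 0;
    BarMoving b(Payload{});
    assert(Payload::copies == 0 && Payload::moves == 1);

    // Payload named; BarMoving c(named);  // would not compile: lvalue to &&
}
```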
In addition to the answers above:
What I see in your example is a classical case for "copy elision" optimization.
What are copy elision and return value optimization?
A modern compiler needs no help from you to optimize these copies away, and there is no need to deal with rvalue references yourself in this particular case of return value optimization.
In C++11 you use std::move. If you need C++03 compatibility you can default-construct bm via :bm() and then use using std::swap; swap(bm, m);.
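A rough sketch of the swap idiom follows. It uses std::vector<std::string> instead of umap, since std::unordered_map is itself a C++11 type, and the size() accessor is added only so the result can be checked; both are assumptions for the sake of the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>  // std::swap (it lived in <algorithm> before C++11)
#include <vector>

// std::vector stands in for umap here, since std::unordered_map is itself C++11.
class Bar {
public:
    explicit Bar(std::vector<std::string> m) : bm() {
        using std::swap;
        swap(bm, m);  // exchanges the internal buffers in constant time; no element copies
    }
    std::size_t size() const { return bm.size(); }

private:
    std::vector<std::string> bm;
};

void demoSwapInit() {
    std::vector<std::string> names(3, "x");
    Bar bar(names);              // one copy at the call, then a cheap swap
    assert(bar.size() == 3);
    assert(names.size() == 3);   // the caller's vector is unchanged
}
```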

Does Passing STL Containers Make A Copy?

I can't remember whether passing an STL container makes a copy of the container, or just another alias. If I have a couple containers:
std::unordered_map<int,std::string> _hashStuff;
std::vector<char> _characterStuff;
And I want to pass those variables to a function, can I make the function as so:
void SomeClass::someFunction(std::vector<char> characterStuff);
Or would this make a copy of the unordered_map / vector? I'm thinking I might need to use shared_ptr.
void SomeClass::someFunction(std::shared_ptr<std::vector<char>> characterStuff);
It depends. If you are passing an lvalue in input to your function (in practice, if you are passing something that has a name, to which the address-of operator & can be applied) then the copy constructor of your class will be invoked.
void foo(vector<char> v)
{
...
}
int bar()
{
vector<char> myChars = { 'a', 'b', 'c' };
foo(myChars); // myChars gets COPIED
}
If you are passing an rvalue (roughly, something that doesn't have a name and to which the address-of operator & cannot be applied) and the class has a move constructor, then the object will be moved (which is not, beware, the same as creating an "alias", but rather transferring the guts of the object into a new skeleton, making the previous skeleton useless).
In the invocation of foo() below, the result of make_vector() is an rvalue. Therefore, the object it returns is being moved when given in input to foo() (i.e. vector's move constructor will be invoked):
void foo(vector<char> v)
{
...
}
vector<char> make_vector()
{
...
}
int bar()
{
foo(make_vector()); // the temporary is MOVED into v (or constructed there directly)
}
Some STL classes have a move constructor but do not have a copy constructor, because they are inherently meant to be non-copyable (for instance, unique_ptr). You won't get a copy of a unique_ptr when you pass it to a function.
Even for those classes that do have a copy constructor, you can still force move semantics by using the std::move function to change your argument from an lvalue into an rvalue, but again that doesn't create an alias, it just transfers the ownership of the object to the function you are invoking. This means that you won't be able to do anything else with the original object other than reassigning to it another value or having it destroyed.
For instance:
void foo(vector<char> v)
{
...
}
vector<char> make_vector()
{
...
};
int bar()
{
vector<char> myChars = { 'a', 'b', 'c' };
foo(move(myChars)); // myChars gets MOVED
cout << myChars.size(); // compiles, but myChars is in a moved-from state (typically empty), so do not rely on it
myChars = make_vector(); // OK, you can assign another vector to myChars
}
If you find this whole subject of lvalue and rvalue references and move semantics obscure, that's very understandable. I personally found this tutorial quite helpful:
http://thbecker.net/articles/rvalue_references/section_01.html
You should be able to find some info also on http://www.isocpp.org or on YouTube (look for seminars by Scott Meyers).
Yes, it'll copy the vector because you're passing by value. Passing by value always makes a copy or move (which may be elided under certain conditions, but not in your case). If you want to refer to the same vector inside the function as outside, you can just pass it by reference instead. Change your function to:
void SomeClass::someFunction(std::vector<char>& characterStuff);
The type std::vector<char>& is a reference type, "reference to std::vector<char>". The name characterStuff will act as an alias for the object referred to by _characterStuff.
C++ is based on values: When passing object by value you get independent copies. If you don't want to get a copy, you can use a reference or a const reference, instead:
void SomeClass::someFunction(std::vector<char>& changable) { ... }
void SomeClass::otherFunction(std::vector<char> const& immutable) { ... }
When the called function shouldn't be able to change the argument but you don't want to create a copy of the object, you'd want to pass by const&. Normally, I wouldn't use something like a std::shared_ptr<T> instead. There are uses for this type, but certainly not to prevent copying when calling a function.
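The three parameter kinds can be compared side by side in a short sketch (the free functions below are hypothetical stand-ins for the SomeClass members in the question):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

void appendByValue(std::vector<char> v) { v.push_back('!'); }  // changes a private copy
void appendByRef(std::vector<char>& v) { v.push_back('!'); }   // changes the caller's vector
std::size_t countByConstRef(const std::vector<char>& v) {      // read-only alias, no copy
    return v.size();
}

void demoParameterKinds() {
    std::vector<char> characterStuff{'a', 'b', 'c'};

    appendByValue(characterStuff);
    assert(characterStuff.size() == 3);  // unchanged

    appendByRef(characterStuff);
    assert(characterStuff.size() == 4);  // modified through the alias

    assert(countByConstRef(characterStuff) == 4);
}
```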