Please pardon my lack of clarity on this topic. I am trying to create functions for inserting a big class into a vector. In this example, I use a vector of ints as the big class.
#include <vector>
#include <iostream>
using namespace std;
vector<vector<int>> vectorOfVectors;
void fn(const vector<int> &v) {
vectorOfVectors.push_back(v);
}
void fn(vector<int> &&v) {
vectorOfVectors.push_back(std::move(v));
}
int main() {
vector<int> a({1});
const vector<int> b({2});
fn(std::move(a));
fn(b);
cout<<b[0];
}
Obviously, I want no copying to be done when possible. My questions:
1. Does this code do the right thing?
2. Is there a better way of doing this?
3. For the same approach to work with custom classes, do I need to define move constructors?
Does this code do the right thing?
Yes. C++11 added std::vector::push_back(T&&) for this very reason.
Is there a better way of doing this?
Your fn(const vector<int> &v) and fn(vector<int> &&v) are both doing the same thing, pushing the argument v onto the end of vectorOfVectors. In lieu of your two fn functions you could have used one function template that uses perfect forwarding.
template<typename T>
void fn(T &&v) {
vectorOfVectors.push_back(std::forward<T>(v));
}
This works thanks to the C++11 reference collapsing rules and std::forward. When v is an lvalue, T is deduced as vector<int>& (or const vector<int>& for a const lvalue), and reference collapsing turns T&& into vector<int>&. When v is an rvalue, T is deduced as plain vector<int>, so T&& is vector<int>&&. std::forward<T>(v) then yields an lvalue in the first case and an rvalue in the second. This does exactly what you want: it calls the version of push_back that copies in the case of an lvalue and the version that moves in the case of an rvalue.
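To make the copy-versus-move behaviour visible, here is a minimal sketch of the forwarding template. It assumes a hypothetical element type, Tracked, that counts its own copies and moves. The names Tracked, sink and forward_demo are mine, not from the question.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical element type that counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

std::vector<Tracked> sink;

template <typename T>
void fn(T&& v) {
    // T is (const) Tracked& for an lvalue argument and plain Tracked for an
    // rvalue, so std::forward<T> hands push_back an lvalue or an rvalue.
    sink.push_back(std::forward<T>(v));
}

// Forwards one lvalue and one rvalue; returns {copies, moves} observed.
std::pair<int, int> forward_demo() {
    sink.clear();
    sink.reserve(2);  // no reallocation, so only the two insertions count
    Tracked::copies = 0;
    Tracked::moves = 0;
    Tracked a;
    const Tracked b;
    fn(b);             // lvalue: selects push_back(const Tracked&), one copy
    fn(std::move(a));  // rvalue: selects push_back(Tracked&&), one move
    assert(sink.size() == 2);
    return std::make_pair(Tracked::copies, Tracked::moves);
}
```

Calling forward_demo() from main should report one copy and one move. The reserve call matters: without it a reallocation could add extra moves and muddy the count.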
One downside is that this sometimes can lead to interesting diagnostics when you get things wrong. ("Interesting" here means hundreds of lines of inscrutable diagnostic text from either g++ or clang++). Another downside is that templates can result in a case of "converters gone wild".
For the same approach to work with custom classes, do I need to define move constructors?
Not necessarily. You'll get an implicitly declared move constructor if the class doesn't have a user-declared destructor, copy constructor, copy assignment operator, or move assignment operator. The implicitly declared move constructor will be defined as deleted if the class has non-movable data members or derives from a class that can't be moved or destroyed.
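A small sketch of how these rules play out in practice. The types Tracked, Plain, WithDtor and Defaulted are hypothetical and exist only for illustration. The point is that a user-declared destructor quietly turns a "move" into a copy, and that the type trait alone won't tell you.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical member type that counts copies and moves.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// No special members declared: the compiler generates a move constructor.
struct Plain { Tracked t; };

// A user-declared destructor suppresses the implicit move constructor,
// so "moving" a WithDtor falls back to the implicit copy constructor.
struct WithDtor { Tracked t; ~WithDtor() {} };

// Explicitly defaulting the move constructor brings it back.
struct Defaulted {
    Tracked t;
    Defaulted() = default;
    Defaulted(const Defaulted&) = default;
    Defaulted(Defaulted&&) = default;
    ~Defaulted() {}
};

// The trait does not reveal the fallback: a copy constructor can bind to an
// rvalue, so WithDtor still counts as "move constructible".
static_assert(std::is_move_constructible<WithDtor>::value,
              "copy constructor accepts rvalues");

// True if constructing a T from std::move(src) moves the Tracked member.
template <typename T>
bool move_really_moves() {
    Tracked::copies = 0;
    Tracked::moves = 0;
    T src;
    T dst(std::move(src));
    (void)dst;
    return Tracked::moves == 1 && Tracked::copies == 0;
}

void implicit_move_demo() {
    assert(move_really_moves<Plain>());
    assert(!move_really_moves<WithDtor>());  // silently copies instead
    assert(move_really_moves<Defaulted>());
}
```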
To me, that's a bit too much to remember. I don't know if this is a good practice or a bad one, but I've started using Foo(const Foo&)=default, with similar declarations for the other rule of five functions. I'll also qualify the constructors as explicit in many cases to avoid the "converters gone wild" problem.
1. It does avoid copying a.
2. Yes. Using push_back means you're forced to construct at least two objects while emplace_back and perfect forwarding could do less work.
template<typename... Ts>
auto fn(Ts &&...ts)
-> decltype(vectorOfVectors.emplace_back(std::forward<Ts>(ts)...), void())
{
vectorOfVectors.emplace_back(std::forward<Ts>(ts)...);
}
http://melpon.org/wandbox/permlink/sT65g3sDxHI0ZZhZ
3. As long as you're using push_back, you need the classes to be move constructible in order to avoid copying. You don't necessarily need to define the move constructor yourself, though, if you can get the default definition.
I'm a bit confused regarding the difference between push_back and emplace_back.
void emplace_back(Type&& _Val);
void push_back(const Type& _Val);
void push_back(Type&& _Val);
As there is a push_back overload taking a rvalue reference I don't quite see what the purpose of emplace_back becomes?
In addition to what visitor said :
The function void emplace_back(Type&& _Val) provided by MSVC10 is non-conforming and redundant, because as you noted it is strictly equivalent to push_back(Type&& _Val).
But the real C++0x form of emplace_back is really useful: void emplace_back(Args&&...);
Instead of taking a value_type it takes a variadic list of arguments, so that means that you can now perfectly forward the arguments and construct directly an object into a container without a temporary at all.
That's useful because no matter how much cleverness RVO and move semantics bring to the table, there are still complicated cases where a push_back is likely to make unnecessary copies (or moves). For example, with the traditional insert() function of a std::map, you have to create a temporary, which will then be copied into a std::pair<Key, Value>, which will then be copied into the map:
std::map<int, Complicated> m;
int anInt = 4;
double aDouble = 5.0;
std::string aString = "C++";
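// cross your fingers so that the optimizer is really good
m.insert(std::make_pair(4, Complicated(anInt, aDouble, aString)));
// should be easier for the optimizer; std::map::emplace forwards to the
// pair constructor, so the mapped value's arguments go in their own tuple
m.emplace(std::piecewise_construct,
          std::forward_as_tuple(4),
          std::forward_as_tuple(anInt, aDouble, aString));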
So why didn't they implement the right version of emplace_back in MSVC? Actually, it bugged me too a while ago, so I asked the same question on the Visual C++ blog. Here is the answer from Stephan T Lavavej, the official maintainer of the Visual C++ standard library implementation at Microsoft.
Q: Are beta 2 emplace functions just some kind of placeholder right now?
A: As you may know, variadic templates aren't implemented in VC10. We simulate them with preprocessor machinery for things like make_shared<T>(), tuple, and the new things in <functional>. This preprocessor machinery is relatively difficult to use and maintain. Also, it significantly affects compilation speed, as we have to repeatedly include subheaders. Due to a combination of our time constraints and compilation speed concerns, we haven't simulated variadic templates in our emplace functions.
When variadic templates are implemented in the compiler, you can expect that we'll take advantage of them in the libraries, including in our emplace functions. We take conformance very seriously, but unfortunately, we can't do everything all at once.
It's an understandable decision. Anyone who has tried even once to emulate variadic templates with horrible preprocessor tricks knows how disgusting this stuff gets.
emplace_back shouldn't take an argument of type vector::value_type, but instead variadic arguments that are forwarded to the constructor of the appended item.
template <class... Args> void emplace_back(Args&&... args);
It is possible to pass a value_type which will be forwarded to the copy constructor.
Because it forwards the arguments, if you don't pass an rvalue, the container will still store a copied element, not a moved one.
std::vector<std::string> vec;
vec.emplace_back(std::string("Hello")); // moves
std::string s;
vec.emplace_back(s); //copies
But the above should be identical to what push_back does. It is probably rather meant for use cases like:
std::vector<std::pair<std::string, std::string> > vec;
vec.emplace_back(std::string("Hello"), std::string("world"));
// should end up invoking this constructor:
//template<class U, class V> pair(U&& x, V&& y);
//without making any copies of the strings
The optimization for emplace_back can be demonstrated with the next example.
For emplace_back, only the constructor A (int x_arg) is called. For push_back, A (int x_arg) is called first and the move constructor A (A &&rhs) is called afterwards.
Normally a single-argument constructor like this should be marked explicit, but for this example it is convenient to leave it implicit so that push_back (1) compiles.
#include <iostream>
#include <vector>
class A
{
public:
A (int x_arg) : x (x_arg) { std::cout << "A (x_arg)\n"; }
A () { x = 0; std::cout << "A ()\n"; }
A (const A &rhs) noexcept { x = rhs.x; std::cout << "A (A &)\n"; }
A (A &&rhs) noexcept { x = rhs.x; std::cout << "A (A &&)\n"; }
private:
int x;
};
int main ()
{
{
std::vector<A> a;
std::cout << "call emplace_back:\n";
a.emplace_back (0);
}
{
std::vector<A> a;
std::cout << "call push_back:\n";
a.push_back (1);
}
return 0;
}
output:
call emplace_back:
A (x_arg)
call push_back:
A (x_arg)
A (A &&)
One more example for lists:
// constructs the elements in place.
emplace_back("element");
// creates a new object and then copies (or moves) that object.
push_back(ExplicitDataType{"element"});
Specific use case for emplace_back: If you need to create a temporary object which will then be pushed into a container, use emplace_back instead of push_back. It will create the object in-place within the container.
Notes:
push_back in the above case will create a temporary object and move it into the container. However, the in-place construction used by emplace_back is usually more performant than constructing and then moving the object (a move can still involve copying some members).
In general, you can use emplace_back instead of push_back in all the cases without much issue. (See exceptions)
A nice code sample comparing push_back and emplace_back is shown here:
http://en.cppreference.com/w/cpp/container/vector/emplace_back
You can see the move operation on push_back and not on emplace_back.
A conforming implementation of emplace_back forwards its arguments to the vector<Object>::value_type constructor when the element is added to the vector. I recall that Visual Studio didn't support variadic templates, but they will be supported in Visual Studio 2013 RC, so I guess a conforming signature will be added.
With emplace_back, if you forward the arguments directly to vector<Object>::value_type constructor, you don't need a type to be movable or copyable for emplace_back function, strictly speaking. In the vector<NonCopyableNonMovableObject> case, this is not useful, since vector<Object>::value_type needs a copyable or movable type to grow.
But note that this could be useful for std::map<Key, NonCopyableNonMovableObject>, since once you allocate an entry in the map, it doesn't need to be moved or copied ever anymore, unlike with vector, meaning that you can use std::map effectively with a mapped type that is neither copyable nor movable.
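As a rough illustration of the std::map case, here is a sketch with a hypothetical type, Pinned, that is neither copyable nor movable. The mapped value is built in place with std::piecewise_construct, and the names Pinned and pinned_demo are made up for this example.

```cpp
#include <cassert>
#include <map>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical type that can be neither copied nor moved: declaring the copy
// operations as deleted also suppresses the implicit move operations.
struct Pinned {
    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned& operator=(const Pinned&) = delete;
    int value;
};

static_assert(!std::is_copy_constructible<Pinned>::value &&
              !std::is_move_constructible<Pinned>::value,
              "Pinned must stay where it is constructed");

// Builds two map entries in place and returns the sum of their values.
int pinned_demo() {
    std::map<int, Pinned> m;
    // The key and the mapped value are each constructed directly inside the
    // map node from their own argument tuple; no Pinned temporary exists.
    m.emplace(std::piecewise_construct,
              std::forward_as_tuple(1),
              std::forward_as_tuple(42));
    m.emplace(std::piecewise_construct,
              std::forward_as_tuple(2),
              std::forward_as_tuple(7));
    assert(m.size() == 2);
    return m.at(1).value + m.at(2).value;
}
```

The same emplace_back call on a std::vector<Pinned> would be rejected at compile time, because vector has to be able to relocate its elements when it grows.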
Recently I find myself often in the situation of having a single function that takes some object as a parameter. The function will have to copy that object.
However the parameter for that function may also quite frequently be a temporary and thus I want to also provide an overload of that function that takes an rvalue reference instead of a const reference.
Both overloads tend to only differ in that they have different types of references as argument types. Other than that they are functionally equivalent.
For instance consider this toy example:
void foo(const MyObject &obj) {
globalVec.push_back(obj); // Makes copy
}
void foo(MyObject &&obj) {
globalVec.push_back(std::move(obj)); // Moves
}
Now I was wondering whether there is a way to avoid this code-duplication by e.g. implementing one function in terms of the other.
For instance I was thinking of implementing the copy-version in terms of the move-one like this:
void foo(const MyObject &obj) {
MyObject copy = obj;
foo(std::move(copy));
}
void foo(MyObject &&obj) {
globalVec.push_back(std::move(obj)); // Moves
}
However this still does not seem ideal since now there is a copy AND a move operation happening when calling the const ref overload instead of a single copy operation that was required before.
Furthermore, if the object does not provide a move-constructor, then this would effectively copy the object twice (afaik) which defeats the whole purpose of providing these overloads in the first place (avoiding copies where possible).
I'm sure one could hack something together using macros and the preprocessor but I would very much like to avoid involving the preprocessor in this (for readability purposes).
Therefore my question reads: Is there a possibility to achieve what I want (effectively only implementing the functionality once and then implement the second overload in terms of the first one)?
If possible I would like to avoid using templates instead.
My opinion is that understanding (truly) how std::move and std::forward work, together with what their similarities and their differences are is the key point to solve your doubts, so I suggest that you read my answer to What's the difference between std::move and std::forward, where I give a very good explanation of the two.
In
void foo(MyObject &&obj) {
globalVec.push_back(obj); // Moves (no, it doesn't!)
}
there's no move. obj is the name of a variable, so it is an lvalue, and the overload of push_back which will be called is not the one which will steal resources out of its argument.
You would have to write
void foo(MyObject&& obj) {
globalVec.push_back(std::move(obj)); // Moves
}
if you want to make the move possible, because std::move(obj) says look, I know this obj here is a local variable, but I guarantee you that I don't need it later, so you can treat it as a temporary: steal its guts if you need.
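If you want to see this for yourself, here is a minimal sketch that counts the operations. Tracked is a hypothetical stand-in for MyObject, and foo_without_move, foo_with_move and count_ops are names I made up for the comparison.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for MyObject that counts copies and moves.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

std::vector<Tracked> globalVec;

// obj is a named variable, hence an lvalue: push_back(const Tracked&) runs.
void foo_without_move(Tracked&& obj) { globalVec.push_back(obj); }

// std::move casts obj back to an rvalue: push_back(Tracked&&) runs.
void foo_with_move(Tracked&& obj) { globalVec.push_back(std::move(obj)); }

// Calls foo with a temporary and returns {copies, moves} observed.
std::pair<int, int> count_ops(void (*foo)(Tracked&&)) {
    globalVec.clear();
    globalVec.reserve(1);  // rule out moves caused by reallocation
    Tracked::copies = 0;
    Tracked::moves = 0;
    foo(Tracked());        // binding the temporary to Tracked&& costs nothing
    assert(globalVec.size() == 1);
    return std::make_pair(Tracked::copies, Tracked::moves);
}
```

count_ops(foo_without_move) should come back as one copy and no moves, and count_ops(foo_with_move) as no copies and one move.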
As regards the code duplication you see in
void foo(const MyObject &obj) {
globalVec.push_back(obj); // Makes copy
}
void foo(MyObject&& /*rvalue reference -> std::move it */ obj) {
globalVec.push_back(std::move(obj)); // Moves (corrected)
}
what allows you to avoid it is std::forward, which you would use like this:
template<typename T>
void foo(T&& /* universal/forwarding reference -> std::forward it */ obj) {
globalVec.push_back(std::forward<T>(obj)); // moves conditionally
}
As regards the error messages of templates, be aware that there are ways to make things easier. For instance, you could use static_asserts at the beginning of the function to enforce that T is a specific type. That would certainly make the errors more understandable. For instance:
#include <type_traits>
#include <vector>
std::vector<int> globalVec{1,2,3};
template<typename T>
void foo(T&& obj) {
static_assert(std::is_same_v<int, std::decay_t<T>>,
"\n\n*****\nNot an int, aaarg\n*****\n\n");
globalVec.push_back(std::forward<T>(obj));
}
int main() {
int x = 0; // initialized so that foo(x) does not read an indeterminate value
foo(x);
foo(3);
foo('c'); // errors at compile time with nice message
}
Then there's SFINAE, which is harder and I guess beyond the scope of this question and answer.
My suggestion
Don't be scared of templates and SFINAE! They do pay off :)
There's a beautiful library that leverages template metaprogramming and SFINAE heavily and successfully, but this is really off-topic :D
A simple solution is:
void foo(MyObject obj) {
globalVec.push_back(std::move(obj));
}
If caller passes an lvalue, then there is a copy (into the parameter) and a move (into the vector). If caller passes an rvalue, then there are two moves (one into parameter and another into vector). This can potentially be slightly less optimal compared to the two overloads because of the extra move (slightly compensated by the lack of indirection) but in cases where moves are cheap, this is often a decent compromise.
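A quick sketch to check those counts, again with a hypothetical counting type, Tracked, in place of MyObject. The rvalue case deliberately uses std::move on a named object. With a plain temporary such as foo(Tracked()), the compiler is allowed (and since C++17 required) to construct the parameter directly, so you would likely see only one move.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for MyObject that counts copies and moves.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

std::vector<Tracked> globalVec;

// Single by-value overload: the parameter is copied or moved into, then
// moved on into the vector.
void foo(Tracked obj) { globalVec.push_back(std::move(obj)); }

void reset() {
    globalVec.clear();
    globalVec.reserve(1);  // rule out moves caused by reallocation
    Tracked::copies = 0;
    Tracked::moves = 0;
}

// Returns {copies, moves} when the caller passes an lvalue.
std::pair<int, int> pass_lvalue() {
    reset();
    Tracked x;
    foo(x);  // copy into the parameter, move into the vector
    assert(globalVec.size() == 1);
    return std::make_pair(Tracked::copies, Tracked::moves);
}

// Returns {copies, moves} when the caller passes std::move(x).
std::pair<int, int> pass_xvalue() {
    reset();
    Tracked x;
    foo(std::move(x));  // move into the parameter, move into the vector
    assert(globalVec.size() == 1);
    return std::make_pair(Tracked::copies, Tracked::moves);
}
```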
Another solution for templates is std::forward explored in depth in Enlico's answer.
If you cannot have a template and the potential cost of a move is too expensive, then you just have to be satisfied with some extra boilerplate of having two overloads.
I see a lot of code at work where people use emplace and emplace_back with a temporary object, like this:
struct A {
A(int, int);
};
vector<A> v;
v.emplace_back(A(1, 2));
I know that the whole point of emplace_back is to be able to pass the parameters directly, like this:
v.emplace_back(1, 2);
But unfortunately this is not clear to a few people. But let's not dwell on that....
My question is: is the compiler able to optimize this and skip the create and copy? Or should I really try to fix these occurrences?
For your reference... we're working with C++14.
My question is: is the compiler able to optimize this and skip the create and copy? Or should I really try to fix these occurrences?
It can't avoid a copy, in the general case. Since emplace_back accepts by forwarding references, it must create temporaries from a pure standardese perspective. Those references must bind to objects, after all.
Copy elision is a set of rules that allows a copy(or move) constructor to be avoided, and a copy elided, even if the constructor and corresponding destructor have side-effects. It applies in only specific circumstances. And passing arguments by reference is not one of those. So for non-trivial types, where the object copies can't be inlined by the as-if rule, the compiler's hands are bound if it aims to be standard conformant.
The easy answer is no; elision doesn't work with perfect forwarding. But this is c++ so the answer is actually yes.
It requires a touch of boilerplate:
struct A {
A(int, int){std::cout << "A(int,int)\n"; }
A(A&&){std::cout<<"A(A&&)\n";}
};
template<class F>
struct maker_t {
F f;
template<class T>
operator T()&&{ return f(); }
};
template<class F>
maker_t<std::decay_t<F>> maker( F&& f ) { return {std::forward<F>(f)}; }
vector<A> v;
v.emplace_back(maker([]{ return A(1,2); }));
Output is one call to A(int,int). No move occurs. In c++17 the making doesn't even require that a move constructor exist (but the vector does, as it thinks it may have to move the elements in an already allocated buffer). In c++14 the moves are simply elided.
is the compiler able to optimize this and skip the create and copy?
There is not necessarily a copy involved. If a move constructor is available, there will be a move. This cannot be optimized away, as the direct initialization case will just call the init constructor, while in the other case, the move constructor will be called additionally (including its side-effects).
Therefore, if possible, you should refactor that code.
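For what it's worth, the difference is easy to measure. Below is a minimal sketch, assuming an A whose constructors bump some counters. The Counts struct and the two helper functions are hypothetical and only there for the comparison.

```cpp
#include <cassert>
#include <vector>

// Hypothetical bookkeeping for which constructors of A have run.
struct Counts { int ctor; int copy; int move; };
Counts counts = {0, 0, 0};

struct A {
    A(int, int) { ++counts.ctor; }
    A(const A&) { ++counts.copy; }
    A(A&&) noexcept { ++counts.move; }
};

// emplace_back with a temporary: A(int, int) runs, then the move constructor.
Counts emplace_temporary() {
    counts = Counts{0, 0, 0};
    std::vector<A> v;
    v.reserve(1);  // rule out moves caused by reallocation
    v.emplace_back(A(1, 2));
    assert(v.size() == 1);
    return counts;
}

// emplace_back with the constructor arguments: only A(int, int) runs.
Counts emplace_arguments() {
    counts = Counts{0, 0, 0};
    std::vector<A> v;
    v.reserve(1);
    v.emplace_back(1, 2);
    assert(v.size() == 1);
    return counts;
}
```

Whether the extra move is worth a refactoring pass depends on how expensive A is to move. For a type holding a few ints it hardly matters; for something with a costly or missing move constructor it does.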
Assume that we have two types, T1 and T2.
T1 isn't important except the following facts:
it isn't copy constructible
it has a move constructor
we have an excellent function with the signature T1 copy(T1 const& orig), which creates a copy.
T2 can be simplified to the following class:
// T2.h
class T2 {
public:
T2() { /* initializes the vector with something */ }
T2(T2 const& other);
private:
std::vector<T1> v;
};
// T2.cpp
T2::T2(T2 const& other) : ... {}
How would you implement this method, if you could only write to the ellipsis part, or to the global scope?
A simple real world use case - assuming the "you can't write anything between the curly braces" part is a real world restriction:
T1 is std::unique_ptr<anything>
copy is std::make_unique
anything has a copy constructor
I also have two additional requirements for the implementation:
performance. It shouldn't be (considerably) slower than the naive implementation with a for loop in the copy constructor's body.
readability. The entire point behind the question is to do something which is more clear/clean than the trivial for loop (e.g. imagine T2 with two or more member vectors).
And optional, but nice to have features:
something that's easily generalized to other containers
something that works with just iterators
something that's generic
A clarification: I know the question is trivially solvable with a std::vector<T1> copy_vec(std::vector<T1> const& orig) global function. Placing that function into an anonymous namespace within T2.cpp would also make it local, but I would argue against its readability, I think it wouldn't be better than the for loop at all. And it's clearly a bad solution if the copy constructor isn't in an implementation file but inlined in the header.
So a rephrasing of my question is:
Is there already something similar implemented, which I can just include?
If there is none, then why? I'm not saying I thought about every corner case, but I think this is something that possibly can be implemented in a nice generic way, and thanks to unique_ptr, it's a common enough case.
Nothing wrong with a naive loop:
v.reserve(other.v.size());
for (auto& elem : other.v) {
v.push_back(copy(elem));
}
That's plenty readable and optimal.
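For completeness, here is the loop as a self-contained sketch, assuming T1 is std::unique_ptr<int>. The copy function, the T2(const std::vector<int>&) constructor, the accessors and deep_copy_demo are hypothetical stand-ins so that the example can be compiled and checked.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the question's copy(): deep-copies the pointee.
std::unique_ptr<int> copy(const std::unique_ptr<int>& orig) {
    if (!orig) return nullptr;
    return std::unique_ptr<int>(new int(*orig));
}

class T2 {
public:
    // Illustrative constructor that fills the vector with owned ints.
    explicit T2(const std::vector<int>& values) {
        for (int x : values) v.push_back(std::unique_ptr<int>(new int(x)));
    }
    T2(const T2& other) {
        v.reserve(other.v.size());    // one allocation up front
        for (const auto& elem : other.v) {
            v.push_back(copy(elem));  // the fresh unique_ptr is moved in
        }
    }
    std::size_t size() const { return v.size(); }
    const int* get(std::size_t i) const { return v[i].get(); }
private:
    std::vector<std::unique_ptr<int>> v;
};

// True if copying a T2 yields equal values stored in distinct objects.
bool deep_copy_demo() {
    T2 a(std::vector<int>{1, 2, 3});
    T2 b(a);
    assert(b.size() == 3);
    return *b.get(2) == 3 && a.get(2) != b.get(2);
}
```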
Though I guess the modern, clever solution would be to use range-v3:
T2(T2 const& other)
: v(other.v | view::transform(copy))
{ }
I'm not sure that's enough better than the loop to justify the additional complexity but YMMV.