std::move a const std::vector in a lambda capture - c++

Motivation:
I'm trying to transfer a std::vector<std::unique_ptr<some_type>> to a different thread, via a lambda capture.
Since I need the vector to not be cleaned up when the function goes out of scope, I need to take it by value (and not by reference).
Since it's a vector of unique_ptrs, I need to move (and not copy) it into the capture.
I'm using a generalized lambda capture to move the vector while capturing.
Minimal program to illustrate the concept:
auto create_vector(){
std::vector<std::unique_ptr<int>> new_vector{};
new_vector.push_back(std::make_unique<int>(5));
return std::move(new_vector);
}
int main() {
const auto vec_const = create_vector();
[vec=std::move(vec_const)](){
std::cout << "lambda, vec size: " << vec.size() << std::endl;
}();
}
Issue:
If I'm using a const local vector, compilation fails due to attempting to copy the unique_ptrs.
However if I remove the const qualifier, the code compiles and runs well.
auto vec_const = create_vector();
Questions:
What's the reason for this? Does being const disable the "movability" of the vector? Why?
How would I ensure the constness of a vector in such a scenario?
Follow-up:
The comments and answers mention that a const type can't be moved from. Sounds reasonable, however the compiler errors fail to make this clear. In this case I would expect one of two things:
The std::move(vec_const) should throw an error regarding moving from const (casting it to rvalue) being impossible.
The vector move-constructor telling me that it refuses to accept const rvalues.
Why don't those happen? Why does the capture instead seem to just try to copy the unique_ptrs inside the vector (which is what I'd expect from the vector's copy constructor)?

Moving is a disruptive operation: you conceptually change the content of the thing you move from.
So yes: a const object can (and should) not be moved from. That would change the original object, which makes its constness void.
In this case, vector has no vector(const vector&&), only vector(vector &&) (move constructor) and vector(const vector &) (copy constructor).
Overload resolution will only bind a call with const vector argument to the latter (lest const-correctness would be violated), so this will result in copying the contents.
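A minimal sketch of that overload-resolution rule, assuming C++11 or later. Probe is a hypothetical type (not from the question) that records which constructor created it, standing in for std::vector:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical probe type: remembers which constructor created it.
struct Probe {
    bool copied = false;
    bool moved = false;
    Probe() = default;
    Probe(const Probe&) : copied(true) {}
    Probe(Probe&&) noexcept : moved(true) {}
};

// std::move on a const object yields const Probe&&, not Probe&&.
static_assert(
    std::is_same<decltype(std::move(std::declval<const Probe&>())),
                 const Probe&&>::value,
    "std::move keeps the const qualifier");

// True if constructing from std::move(const object) used the copy constructor.
inline bool const_source_is_copied() {
    const Probe source{};
    Probe target(std::move(source));  // const Probe&& cannot bind to Probe&&
    return target.copied && !target.moved;
}

// True if constructing from std::move(non-const object) used the move constructor.
inline bool mutable_source_is_moved() {
    Probe source{};
    Probe target(std::move(source));
    return target.moved && !target.copied;
}

// Usage: both checks hold, so the const case silently falls back to copying.
inline void check_probe() {
    assert(const_source_is_copied());
    assert(mutable_source_is_moved());
}
```

With a vector of unique_ptr the same fallback happens, except that the selected copy constructor cannot be instantiated, which is where the long error trace comes from.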
I agree: error reporting sucks. It's hard to engineer an error report about vector when you hit a problem with unique_ptr. That's why the whole tail of required from ...., required from ... obliterates the view.
From your question, and your code, I can tell that you don't fully grasp the move semantics stuff:
you shouldn't std::move a local variable in a return statement; a returned local is already treated as an rvalue (and the explicit std::move can prevent copy elision), so there's no point.
std::move does not really move anything, it only changes the qualifier of the variable you want to 'move from', so that the right receiver can be selected (using 'binding' rules). It is the receiving function that actually changes the contents of the original object.

When you are moving something from A to B, then the act of moving must necessarily mean that A gets modified, since after the move A may no longer have whatever was in A originally. This is the whole purpose of move semantics: to provide an optimal implementation, since the moved-from object is allowed to be modified: its contents get transferred in some fast and mysterious way into B, leaving A in some valid, but unspecified, state.
Consequently, by definition, A cannot be const.
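One possible way to keep the vector read-only where it matters, sketched under the assumption that C++14 is available: leave the local non-const so it can be moved from, and rely on the lambda not being mutable. The names IntPtrVector, size_seen_by_lambda and check_capture are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

using IntPtrVector = std::vector<std::unique_ptr<int>>;

inline IntPtrVector create_vector() {
    IntPtrVector new_vector;
    new_vector.push_back(std::make_unique<int>(5));
    return new_vector;  // a returned local is moved (or elided) automatically
}

// Moves a non-const vector into a lambda and returns the size the lambda sees.
inline std::size_t size_seen_by_lambda() {
    auto vec = create_vector();  // deliberately not const, so it can be moved from
    auto task = [captured = std::move(vec)]() {
        // The lambda is not `mutable`, so `captured` is const inside the body:
        // captured.clear() would not compile here.
        return captured.size();
    };
    return task();
}

// Usage: the single element created above arrives inside the lambda.
inline void check_capture() {
    assert(size_seen_by_lambda() == 1);
}
```

The original local is left in a moved-from state, so the constness is only enforced on the copy that the lambda owns.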

Related

Is it costly to pass an initializer_list as a list by value?

I want to pass a std::list as a parameter to fn(std::list<int>), so I do fn({10, 21, 30}) and everybody is happy.
However, I've come to learn that one shouldn't pass list by value, cause it's costly. So, I redefine my fn as fn(std::list<int> &). Now, when I do the call fn({10, 21, 30}), I get an error: candidate function not viable: cannot convert initializer list argument to 'std::list<int> &'.
QUESTION TIME
Is the "you shall not pass a costly object by value" rule valid here? We aren't passing a list after all, but an initializer_list, no?
If the rule still applies, what's the easy fix here?
I guess my doubt comes from the fact that I don't know clearly what happens when one passes an initializer_list argument to a function that accepts a list.
Is list generated on the spot and then passed by value? If not, what is it that actually happens?
However, I've come to learn that one shouldn't pass list by value, cause it's costly.
That's not entirely accurate. If you need to pass in a list that the function can modify, where the modifications shouldn't be externally visible, you do want to pass a list by value. This gives the caller the ability to choose whether to copy or move from an existing list, so gives you the most reasonable flexibility.
If the modifications should be externally visible, you should prevent temporary list objects from being passed in, since passing in a temporary list object would prevent the caller from being able to see the changes made to the list. The flexibility to silently pass in temporary objects is the flexibility to shoot yourself in the foot. Don't make it too flexible.
If you need to pass in a list that the function will not modify, then const std::list<T> & is the type to use. This allows either lvalues or rvalues to be passed in. Since there won't be any update to the list, there is no need for the caller to see any update to the list, and there is no problem passing in temporary list objects. This again gives the caller the most reasonable flexibility.
Is the "you shall not pass a costly object by value" rule valid here? We aren't passing a list after all, but an initializer_list, no?
You're constructing a std::list from an initializer list. You're not copying that std::list object, but you are copying the list items from the initializer list to the std::list. If the copying of the list items is cheap, you don't need to worry about it. If the copying of the list items is expensive, then it should be up to the caller to construct the list in some other way, it still doesn't need to be something to worry about inside your function.
If the rule still applies, what's the easy fix here?
Both passing std::list by value or by const & allow the caller to avoid pointless copies. Which of those you should use depends on the results you want to achieve, as explained above.
Is list generated on the spot and then passed by value? If not, what is it that actually happens?
Passing the list by value constructs a new std::list object in the location of the function parameter, using the function argument to specify how to construct it. This may or may not involve a copy or a move of an existing std::list object, depending on what the caller specifies as the function argument.
The expression {10, 21, 30} will construct an initializer_list<int>.
This in turn will be used to create a list<int>.
That list will be a temporary, and a temporary will not bind to a non-const reference.
One fix would be to change the prototype for your function to
fn(const std::list<int>&)
This means that you can't edit it inside the function, and you probably don't need to.
However, if you must edit the parameter inside the function, taking it by value would be appropriate.
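A small sketch of both signatures, assuming C++11. The function names total and total_of_doubled are invented for the illustration:

```cpp
#include <cassert>
#include <list>
#include <numeric>

// Read-only access: binds to lvalues and to temporaries such as {10, 21, 30}.
inline int total(const std::list<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}

// Needs a private, modifiable list: take it by value.
inline int total_of_doubled(std::list<int> values) {
    for (int& v : values) v *= 2;  // changes only the local parameter
    return std::accumulate(values.begin(), values.end(), 0);
}

// Usage: both accept a braced list; neither changes the caller's list.
inline void check_list_parameters() {
    std::list<int> existing{1, 2, 3};
    assert(total(existing) == 6);
    assert(total({10, 21, 30}) == 61);
    assert(total_of_doubled(existing) == 12);
    assert(existing.front() == 1);  // caller's list is untouched
}
```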
Also note: don't optimize prematurely. You should always use idiomatic
constructs that clearly represent what you want to do, and for functions,
that almost always means parameters by const& and return by value.
This is easy to use right, hard to use wrong, and almost always fast enough.
Optimization should only be done after profiling, and only for the parts of the program that you have measured to need it.
Quoting the C++14 standard draft (emphasis mine):
18.9 Initializer lists [support.initlist]
2: An object of type initializer_list provides access to an array of
objects of type const E. [ Note: A pair of pointers or a pointer plus
a length would be obvious representations for initializer_list.
initializer_list is used to implement initializer lists as specified
in 8.5.4. Copying an initializer list does not copy the underlying
elements. —end note ]
std::list has a constructor which is used to construct from std::initializer_list. As you can see, it takes it by value.
list(initializer_list<T>, const Allocator& = Allocator());
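To make the note concrete, here is a short sketch (C++11 assumed, function names invented) that checks that a copied initializer_list refers to the same underlying array, while a std::list built from it holds its own elements:

```cpp
#include <cassert>
#include <initializer_list>
#include <list>

// Copying the initializer_list copies only the handle, not the elements.
inline bool copy_shares_elements() {
    std::initializer_list<int> original = {10, 21, 30};
    std::initializer_list<int> copy = original;
    return copy.begin() == original.begin() && copy.size() == original.size();
}

// Constructing a std::list from it does copy the elements into list nodes.
inline bool list_owns_its_elements() {
    std::initializer_list<int> source = {10, 21, 30};
    std::list<int> values(source);
    return &values.front() != source.begin() && values.size() == 3;
}

// Usage: the cheap part is the initializer_list, the element copy is the list.
inline void check_initializer_list() {
    assert(copy_shares_elements());
    assert(list_owns_its_elements());
}
```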
If you are never going to modify your parameter, then fn(const std::list<int>&) will do just fine. Otherwise, fn(std::list<int>) will suffice.
To answer your questions:
Is the "you shall not pass a costly object by value" rule valid here?
We aren't passing a list after all, but an initializer_list, no?
std::initializer_list is not a costly object. But std::list<int> surely sounds like a costly object
If the rule still applies, what's the easy fix here?
Again, it's not costly
Is list generated on the spot and then passed by value? If not, what is it that actually happens?
Yes, it is... your list object is created on the spot at run-time right before the program enters your function scope
However, I've come to learn that one shouldn't pass list by value, cause it's costly. So, I redefine my fn as fn(std::list<int> &). Now, when I do the call fn({10, 21, 30}), I get an error: candidate function not viable: cannot convert initializer list argument to 'std::list<int> &'.
A way to fix the problem would be:
void fn(std::list<int>& v) {
std::cout << v.size();
}
void fn(std::list<int>&& v) {
fn(v); // v has a name, so it is an lvalue here: this calls the overload above
}
Now fn({1, 2, 3}); works as well: it will call the second overload, which accepts a list by rvalue reference, and then fn(v); calls the first one, which accepts lvalue references.
void fn(std::list<int> v)
{
}
The problem with this function is that it can be called like:
list<int> biglist;
fn(biglist);
And it will make a copy. And it will be slow. That's why you want to avoid it.
I would give you the following solutions:
Overload your fn function to accept both rvalues and lvalues
properly, as shown before.
Only use the second function (the one that accepts only rvalue
references). The problem with this approach is that it will fail to compile when called with an lvalue, which is something you want to allow.
Like the other answers and comments you can use a const reference to the list.
void fn(const std::list<int>& l)
{
for (auto it = l.begin(); it != l.end(); ++it)
{
*it; //do something
}
}
If this fn function is heavily used and you are worried about the overhead of constructing and destructing the temporary list object, you can create a second function that receives the initializer_list directly that doesn't involve any copying whatsoever. Using a profiler to catch such a performance hot spot is not trivial in many cases.
void fn(const std::initializer_list<int>& l)
{
for (auto it = l.begin(); it != l.end(); ++it)
{
*it; //do something
}
}
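If both overloads exist side by side, a braced list goes to the initializer_list one, because that needs no user-defined conversion. A sketch assuming C++11; the Chosen enum is invented so that the selected overload can be observed, and the initializer_list is taken by value here, as std::list's own constructor does:

```cpp
#include <cassert>
#include <initializer_list>
#include <list>

// Hypothetical tag that reports which overload was selected.
enum class Chosen { ListRef, InitList };

inline Chosen fn(const std::list<int>&) { return Chosen::ListRef; }

// No std::list is constructed when this overload is selected.
inline Chosen fn(std::initializer_list<int>) { return Chosen::InitList; }

// Usage: braced lists and existing lists end up in different overloads.
inline void check_overloads() {
    std::list<int> existing{1, 2, 3};
    assert(fn(existing) == Chosen::ListRef);
    assert(fn({1, 2, 3}) == Chosen::InitList);
}
```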
You can take std::list<int> by value here, because you are in fact creating a temporary list, and passing the initializer_list by value is cheap. Accessing that list later can also be faster than going through a reference, because you avoid a dereference.
You could also work around it by taking a const std::list<int>& parameter, or by creating the list first, like this:
void foo( std::list<int> &list ) {}
int main() {
std::list<int> list{1,2,3};
foo( list );
}
The list is created in the calling scope, and this constructor is called:
list (initializer_list<value_type> il,
const allocator_type& alloc = allocator_type())
So there's no passing list by value. But if you'll use that function and pass list as parameter it'll be passed by value.

what happens to lvalue passed in function as rvalue (c++)?

I have been wondering about that all day long and I can't find an answer to that specific case.
Main :
std::vector<MyObject*> myVector;
myVector.reserve(5);
myFunction(std::move(myVector));
myFunction :
void myFunction(std::vector<MyObject*> && givenVector){
std::vector<MyObject*> otherVector = givenVector;
std::cout << givenVector[0];
// do some stuff
}
My questions are :
In main, is myVector destroyed by myFunction() because it is treated as an rvalue, or does the compiler know that it is also an lvalue and therefore perform a copy before passing it to myFunction()? What happens if I try to use the vector after the call to myFunction()?
Inside myFunction(), is givenVector destroyed when it is assigned to otherVector? If so, what happens when I try to print it? If not, is it useful to take an rvalue in this function?
Looks like duplicate.
myVector is not destroyed by myFunction(). In the general case, the state of an object whose resources have been stolen is unspecified.
givenVector is not destroyed when it is assigned to otherVector. In the general case, the state of an object whose resources have been stolen is unspecified.
For this to compile, you have to apply std::move to your vector before you pass it to the function (at least if no further overloads exist):
myFunction(std::move(myVector));
Then, inside the function, by
std::vector<MyObject*> otherVector = std::move(givenVector);
the move constructor of std::vector is called which basically moves all the content out of the vector (note however again the std::move on the right-hand side -- otherwise you'll get a copy). By this, the vector is not "destroyed". Even after the move it is still alive, yet in an unspecified state.
That means that member functions which place no precondition on the state of the vector can still be called, such as the destructor, size() and so on. A pop_back() or a dereference of a vector element, however, will likely fail.
See here for a more detailed explanation what you still can do with a moved-from object.
The code won't compile, since you try to bind an lvalue to an rvalue reference. You'll need to deliberately convert it to an rvalue:
myFunction(std::move(myVector));
Simply doing this won't "destroy" the object; what happens to it depends on what the function does. Generally, functions which take rvalue references do so in order to move from the argument, in which case they might leave it in some valid but "empty" state, but won't destroy it.
Your code moves the vector to the local otherVector, leaving it empty. Then you try to print the first element of an empty vector, giving undefined behaviour.
No copy is performed. What happens to myVector depends on what myFunction does with it. You should consider objects that have been moved from as either being the same or being empty. You can assign new values and keep using it or destroy it.
myVector is fine. It is an lvalue and otherVector makes a copy of it. You most likely wanted to write otherVector = std::move(myVector);, in which case myVector should be empty. If you have an old implementation of the STL (one that does not know about move semantics), a copy is performed and myVector is not changed. Whether that makes sense is for you to decide. You moved a given vector to a new vector, which can be useful. Printing an empty vector is not so useful.
If a function gets an argument by rvalue-reference, that does not mean it will be destructively used, only that it can be.
Destructive use means that the passed object is thereafter in some unspecified but valid state, fit only for re-initializing, moving, copying or destruction.
In the function, the argument has a name and thus is an lvalue.
To mark the place(s) where you want to take advantage of the licence to ruthlessly plunder it, you have to convert it to an rvalue-reference on passing it on, for example with std::move or std::forward, the latter mostly for templates.
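The difference between the two spellings can be sketched like this (C++11 assumed; copy_out and move_out are illustrative names, and int elements replace MyObject* to keep the example self-contained):

```cpp
#include <cassert>
#include <utility>
#include <vector>

// `given` has a name, so inside the function it is an lvalue: this copies.
inline std::vector<int> copy_out(std::vector<int>&& given) {
    std::vector<int> other = given;
    return other;
}

// std::move turns `given` back into an rvalue: this moves.
inline std::vector<int> move_out(std::vector<int>&& given) {
    std::vector<int> other = std::move(given);
    return other;
}

// Usage: only move_out is allowed to leave the caller's vector changed.
inline void check_rvalue_reference_parameter() {
    std::vector<int> first{1, 2, 3};
    std::vector<int> copied = copy_out(std::move(first));
    assert(first.size() == 3);  // nothing was taken from `first`
    assert(copied.size() == 3);

    std::vector<int> second{1, 2, 3};
    std::vector<int> moved = move_out(std::move(second));
    assert(moved.size() == 3);
    // `second` is now valid but unspecified: assign to it before reading it.
    second = {7};
    assert(second.size() == 1);
}
```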

Why pass by value and not by const reference?

A const reference is pretty much the same as passing by value but without creating a copy (to my understanding). So is there a case where a copy of the variables is actually needed (so that we would have to pass by value)?
There are situations where you don't modify the input, but you still need an internal copy of the input, and then you may as well take the arguments by value. For example, suppose you have a function that returns a sorted copy of a vector:
template <typename V> V sorted_copy_1(V const & v)
{
V v_copy = v;
std::sort(v_copy.begin(), v_copy.end());
return v_copy; // return the sorted copy, not the untouched input
}
This is fine, but if the user has a vector that they never need for any other purpose, then you have to make a mandatory copy here that may be unnecessary. So just take the argument by value:
template <typename V> V sorted_copy_2(V v)
{
std::sort(v.begin(), v.end());
return v;
}
Now the entire process of producing, sorting and returning a vector can be done essentially "in-place".
Less expensive examples are algorithms which consume counters or iterators which need to be modified in the process of the algorithm. Again, taking those by value allows you to use the function parameter directly, rather than requiring a local copy.
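A usage sketch for the by-value version, assuming C++11. It shows that the caller decides whether the single copy happens at all; sorted_copy and check_sorted_copy are illustrative names:

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// By-value parameter: the caller chooses between copying and moving into it.
template <typename V>
V sorted_copy(V v) {
    std::sort(v.begin(), v.end());
    return v;  // a by-value parameter is moved out, not copied
}

inline void check_sorted_copy() {
    std::vector<int> input{3, 1, 2};

    // Lvalue argument: one copy is made, the original stays as it was.
    std::vector<int> sorted = sorted_copy(input);
    assert(input == (std::vector<int>{3, 1, 2}));
    assert(sorted == (std::vector<int>{1, 2, 3}));

    // Rvalue argument: the elements are moved in, no element is copied.
    std::vector<int> consumed = sorted_copy(std::move(input));
    assert(consumed == (std::vector<int>{1, 2, 3}));
}
```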
It's usually faster to pass basic data types such as ints, floats and pointers by value.
Your function may want to modify the parameter locally, without altering the state of the variable passed in.
C++11 introduces move semantics. To move an object into a function parameter, its type cannot be const reference.
Like so many things, it's a balance.
We pass by const reference to avoid making a copy of the object.
When you pass a const reference, you pass a pointer (references are pointers with extra sugar to make them taste less bitter). And assuming the object is trivial to copy, of course.
To access a reference, the compiler will have to dereference the pointer to get to the content [assuming it can't be inlined and the compiler optimises away the dereference, but in that case, it will also optimise away the extra copy, so there's no loss from passing by value either].
So, if your copy is "cheaper" than the sum of dereferencing and passing the pointer, then you "win" when you pass by value.
And of course, if you are going to make a copy ANYWAY, then you may just as well make the copy when constructing the argument, rather than copying explicitly later.
The best example is probably the Copy and Swap idiom:
C& operator=(C other)
{
swap(*this, other);
return *this;
}
Taking other by value instead of by const reference makes it much easier to write a correct assignment operator that avoids code duplication and provides a strong exception guarantee!
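For completeness, a self-contained sketch of a class using the idiom, assuming C++11. Buffer is a hypothetical resource-owning type invented for this illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical resource-owning class used only to illustrate the idiom.
class Buffer {
public:
    Buffer(std::size_t size, int fill)
        : size_(size), data_(size ? new int[size] : nullptr) {
        std::fill_n(data_, size_, fill);
    }
    Buffer(const Buffer& other)
        : size_(other.size_),
          data_(other.size_ ? new int[other.size_] : nullptr) {
        std::copy(other.data_, other.data_ + other.size_, data_);
    }
    Buffer(Buffer&& other) noexcept : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;
    }
    ~Buffer() { delete[] data_; }

    // By-value parameter: the copy (or move) happens before the body runs,
    // so if copying throws, *this has not been touched.
    Buffer& operator=(Buffer other) noexcept {
        swap(*this, other);
        return *this;  // the old contents are destroyed together with `other`
    }

    friend void swap(Buffer& a, Buffer& b) noexcept {
        std::swap(a.size_, b.size_);
        std::swap(a.data_, b.data_);
    }

    std::size_t size() const { return size_; }
    int operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    std::size_t size_;
    int* data_;
};
```

The single operator= serves as both copy assignment and move assignment: `b = a;` copies into the parameter, while `b = Buffer(2, 9);` moves (or constructs in place) into it.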
Also passing iterators and pointers is done by value since it makes those algorithms much more reasonable to code, since they can modify their parameters locally. Otherwise something like std::partition would have to immediately copy its input anyway, which is both inefficient and looks silly. And we all know that avoiding silly-looking code is the number one priority:
template<class BidirIt, class UnaryPredicate>
BidirIt partition(BidirIt first, BidirIt last, UnaryPredicate p)
{
while (1) {
while ((first != last) && p(*first)) {
++first;
}
if (first == last--) break;
while ((first != last) && !p(*last)) {
--last;
}
if (first == last) break;
std::iter_swap(first++, last);
}
return first;
}
A const& cannot be changed without a const_cast through the reference, but it can be changed. At any point where code leaves the "analysis range" of your compiler (maybe a function call to a different compilation unit, or through a function pointer it cannot determine the value of at compilation time) it must assume that the value referred to may have changed.
This costs optimization. And it can make it harder to reason about possible bugs or quirks in your code: a reference is non-local state, and functions that operate only on local state and produce no side effects are really easy to reason about. Making your code easy to reason about is a large boon: more time is spent maintaining and fixing code than writing it, and effort spent on performance is fungible (you can spend it where it matters, instead of wasting time on micro optimizations everywhere).
On the other hand, a value requires that the value be copied into local automatic storage, which has costs.
But if your object is cheap to copy, and you don't want the above effect to occur, always take by value as it makes the compiler's job of understanding the function easier.
Naturally only when the value is cheap to copy. If expensive to copy, or even if the copy cost is unknown, that cost should be enough to take by const&.
The short version of the above: taking by value makes it easier for you and the compiler to reason about the state of the parameter.
There is another reason. If your object is cheap to move, and you are going to store a local copy anyhow, taking by value opens up efficiencies. If you take a std::string by const&, then make a local copy, one std::string may be created in order to pass the parameter, and another created for the local copy.
If you took the std::string by value, only one copy will be created (and possibly moved).
For a concrete example:
std::string some_external_state;
void foo( std::string const& str ) {
some_external_state = str;
}
void bar( std::string str ) {
some_external_state = std::move(str);
}
then we can compare:
int main() {
foo("Hello world!");
bar("Goodbye cruel world.");
}
the call to foo creates a std::string containing "Hello world!". It is then copied again into the some_external_state. 2 copies are made, 1 string discarded.
The call to bar directly creates the std::string parameter. Its state is then moved into some_external_state. 1 copy created, 1 move, 1 (empty) string discarded.
There are also certain exception safety improvements caused by this technique, as any allocation happens outside of bar, while foo could throw a resource exhausted exception.
This only applies when perfect forwarding would be annoying or fail, when moving is known to be cheap, when copying could be expensive, and when you know you are almost certainly going to make a local copy of the parameter.
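The copy and move counts can be observed with an instrumented type. This is only a sketch, assuming C++11: Tracked, Counters and the measure_ functions are invented stand-ins for std::string and some_external_state, and the argument is an explicitly moved object rather than a string literal so that the counts do not depend on copy elision.

```cpp
#include <cassert>
#include <utility>

// Hypothetical bookkeeping for the illustration.
struct Counters {
    int copies = 0;
    int moves = 0;
};
inline Counters& counters() {
    static Counters c;
    return c;
}

// Stand-in for std::string that counts its copy and move operations.
struct Tracked {
    Tracked() = default;
    Tracked(const Tracked&) { ++counters().copies; }
    Tracked(Tracked&&) noexcept { ++counters().moves; }
    Tracked& operator=(const Tracked&) { ++counters().copies; return *this; }
    Tracked& operator=(Tracked&&) noexcept { ++counters().moves; return *this; }
};

inline Tracked& external_state() {
    static Tracked state;
    return state;
}

inline void foo(const Tracked& value) { external_state() = value; }
inline void bar(Tracked value) { external_state() = std::move(value); }

// Passes an rvalue to foo and reports what it cost.
inline Counters measure_foo() {
    counters() = Counters{};
    Tracked t;
    foo(std::move(t));  // binds to const&, then must copy-assign
    return counters();
}

// Passes an rvalue to bar and reports what it cost.
inline Counters measure_bar() {
    counters() = Counters{};
    Tracked t;
    bar(std::move(t));  // move-constructs the parameter, then move-assigns
    return counters();
}

// Usage: foo pays for one copy, bar pays for two moves and no copy.
inline void check_sink_parameter() {
    Counters f = measure_foo();
    assert(f.copies == 1 && f.moves == 0);
    Counters b = measure_bar();
    assert(b.copies == 0 && b.moves == 2);
}
```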
Finally, there are some small types (like int) for which the non-optimized ABI for direct copies is faster than the non-optimized ABI for const& parameters. This mainly matters when coding interfaces that cannot or will not be optimized, and is usually a micro optimization.

Transforming pass-by reference into pass-by return

I have the following function:
void read_int(std::vector<int> &myVector)
Which allows me to fill myVector through its reference. It is used like this:
std::vector<int> myVector;
read_int(myVector);
I want to refactor the code a bit (keeping the original function) so that in the end I have this:
auto myVector = read_int(); // auto is std::vector<int>
What would be the best intermediate function to achieve this?
It seems to me that the following straight-forward answer is suboptimal:
std::vector<int> read_int() {
std::vector<int> myVector_temp;
read_int(myVector_temp);
return myVector_temp;
}
The obvious answer is correct, and basically optimal.
void do_stuff(std::vector<int>& on_this); // (1)
std::vector<int> do_stuff_better() { // (2)
std::vector<int> myVector_temp; // (3)
do_stuff(myVector_temp); // (4)
return myVector_temp; // (5)
}
At (3) we create a named return value in automatic storage (on the stack).
At (5) we only ever return the named return value from the function, and we never return anything else but that named return value anywhere else in the function.
Because of (3) and (5), the compiler is allowed to (and most likely will) elide the existence of the myVector_temp object. It will directly construct the return value of the function, and call it myVector_temp. It still needs there to be an existing move or copy constructor, but it does not call it.
On the other end, when calling do_stuff_better, some compilers can also elide the assignment at call:
std::vector<int> bob = do_stuff_better(); // (6)
The compiler is allowed to effectively pass a "pointer to bob" and tell do_stuff_better() to construct its return value in bob's location, eliding this copy construction as well (well, it can arrange how the call occurs such that the location that do_stuff_better() is asked to construct its return value in is the same as the location of bob).
And in C++11, even if the requirements for both elisions are not met, or the compiler chooses not to use them, in both cases a move must be done instead of a copy.
At line (5) we are returning a locally declared automatic storage duration variable in a plain and simple return statement. This makes the return an implicit move if not elided.
At line (6), the function returns an unnamed object, which is an rvalue. When bob is constructed from it, it move-constructs.
Moving a std::vector consists of copying the value of ~3 pointers and then zeroing the source, regardless of how big the vector is. No elements need to be copied or moved.
Both of the above elisions, where we remove the named local variable within do_stuff_better(), and we remove the return value of do_stuff_better() and instead directly construct bob, are somewhat fragile. Learning the rules under which your compiler is allowed to do those elisions, and also the situations where your compiler actually does the elisions, is worthwhile.
As an example of how it is fragile, if you had a branch where you did a return std::vector<int>() in your do_stuff_better() after checking an error state, the in-function elision would probably be blocked.
Even if elision is blocked or your compiler doesn't implement it for a case, the fact that the container is move'd means that the run time costs are going to be minimal.
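Put together with the names from the question, the pair of overloads might look like the sketch below (C++11 assumed). The body of the out-parameter version is a hypothetical stand-in, since the original function's body is not shown:

```cpp
#include <cassert>
#include <vector>

// Original out-parameter version, kept as is.
inline void read_int(std::vector<int>& myVector) {
    // Hypothetical stand-in for the real body.
    myVector.assign({1, 2, 3});
}

// Return-by-value wrapper around it.
inline std::vector<int> read_int() {
    std::vector<int> result;
    read_int(result);
    return result;  // plain `return name;`: elided, or moved if not elided
}

// Usage: the call site from the question.
inline void check_read_int() {
    auto myVector = read_int();  // auto is std::vector<int>
    assert(myVector.size() == 3);
}
```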
I think, you have to read more about move semantics (link to Google query, there are a lot of papers on this - just choose one).
In short, since C++11 all standard library containers are written in such a way that returning them from a function causes their contents to be moved from the returned value (bound to a so-called rvalue reference) into the variable you are initializing. In effect you only copy a few fields of the std::vector instead of its data. That's a lot faster than copying its contents.
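A quick way to see that only the bookkeeping fields travel is to compare the address of the element buffer before and after the move. This is a sketch assuming C++11 and the default allocator; move_keeps_buffer is an invented name:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// True if the moved-to vector owns the very buffer the source allocated.
inline bool move_keeps_buffer() {
    std::vector<int> source(1000, 42);
    const int* before = source.data();
    std::vector<int> target = std::move(source);  // steals the buffer
    return target.data() == before && target.size() == 1000;
}

// Usage: no element was copied, the allocation simply changed owner.
inline void check_move_cost() {
    assert(move_keeps_buffer());
}
```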

boost::optional not letting me reassign const value types

It seems to me there should be four variants of boost::optional
optional<Foo> => holds a mutable Foo and can be reassigned after initialization
optional<Foo const> const => holds a const Foo and can't be reassigned after initialization
optional<Foo> const => (should?) hold a mutable Foo but can't be reassigned after initialization
optional<Foo const> => (should?) hold a const Foo and can be reassigned after initialization
The first 2 cases work as expected. But the optional<Foo> const dereferences to a const Foo, and the optional<Foo const> doesn't allow reassignment after initialization (as touched upon in this question).
The reassignment of the const value types is specifically what I ran into, and the error is:
/usr/include/boost/optional/optional.hpp:486: error: passing ‘const Foo’ as ‘this’ argument of ‘Foo& Foo::operator=(const Foo&)’ discards qualifiers [-fpermissive]
And it happens here:
void assign_value(argument_type val,is_not_reference_tag) { get_impl() = val; }
After construction, the implementation uses the assignment operator for the type you parameterized the optional with. It obviously doesn't want a left-hand operand which is a const value. But why shouldn't you be able to reset a non-const optional to a new const value, such as in this case:
optional<Foo const> theFoo (maybeGetFoo());
while (someCondition) {
// do some work involving calling some methods on theFoo
// ...but they should only be const ones
theFoo = maybeGetFoo();
}
Some Questions:
Am I right that wanting this is conceptually fine, and not being able to do it is just a fluke in the implementation?
If I don't edit the boost sources, what would be a clean way to implement logic like in the loop above without scrapping boost::optional altogether?
If this does make sense and I were to edit the boost::optional source (which I've already had to do to make it support movable types, though I suspect they'll be doing that themselves soon) then what minimally invasive changes might do the trick?
So basically the problem seems to be related to this note in the documentation for optional& optional<T (not a ref)>::operator= ( T const& rhs ):
Notes: If *this was initialized, T's assignment operator is used, otherwise, its copy-constructor is used.
That is, suppose you have boost::optional<const Foo> theFoo;. Since a default-constructed boost::optional<> is empty, the statement:
theFoo=defaultFoo;
should mean "copy construct defaultFoo into theFoo's internal storage." Since there's nothing already in that internal storage, this makes sense, even if the internal storage is supposed to house a const Foo. Once finished, theFoo will not be empty.
Once theFoo contains a value, the statement
theFoo=defaultFoo;
should mean "assign defaultFoo into the object in theFoo's internal storage." But theFoo's internal storage isn't assignable (as it is const), and so this should raise a (compile time?) error.
Unfortunately, you'll notice the last two statements are identical, but conceptually require different compile time behavior. There's nothing to let the compiler tell the difference between the two, though.
Particularly in the scenario you're describing, it might make more sense to define boost::optional<...>'s assignment operator to instead have the semantics:
If *this was initialized, its current contents are first destroyed. Then T's copy-constructor is used.
After all, it's entirely possible to invoke T's assignment operator if that's what you really want to do, by saying *theFoo = rhs.
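For comparison only: the "destroy, then construct" semantics described above are what emplace does in std::optional. The sketch below uses std::optional from C++17, not boost::optional, so it assumes a C++17 compiler; Foo and reseat_const_value are hypothetical names.

```cpp
#include <cassert>
#include <optional>

struct Foo { int id; };  // hypothetical value type

// Replaces the contained const Foo without ever assigning to it.
inline int reseat_const_value() {
    std::optional<const Foo> theFoo(Foo{1});
    assert(theFoo->id == 1);
    // emplace destroys the current value (if any) and constructs a new one
    // in place, so Foo's assignment operator is never needed.
    theFoo.emplace(Foo{2});
    return theFoo->id;
}
```

Whether the boost version in use offers an equivalent member is something to check against its documentation.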
(1) One's opinion on what the behavior "should" be depends on whether optionals are "a container for zero or one objects of an arbitrary type" or "a thin proxy for a type, with an added feature". The existing code uses the latter idea, and by doing so, it removes half of the "four different behaviors" in the list. This reduces the complexity, and keeps you from unintentionally introducing inefficient usages.
(2) For any Foo type whose values are copyable, one can easily switch between mutable and immutable optionals by making a new one. So in the given case, you'd get it as mutable briefly and then copy it into an immutable value.
optional<Foo> theMutableFoo (maybeGetFoo());
while (someCondition) {
optional<Foo const> theFoo (theMutableFoo);
// do some work involving calling some methods on theFoo
// ...but they should only be const ones
// ...therefore, just don't use theMutableFoo in here!
theMutableFoo = maybeGetFoo();
}
Given the model that it's a "thin proxy" for a type, this is exactly the same kind of thing you would have to do if the type were not wrapped in an optional. An ordinary const value type needs the same treatment in such situations.
(3) One would have to follow up on the information given by @Andrzej to find out. But such an implementation change would probably not perform better than creating a new optional every time (as in the loop above). It's probably best to accept the existing design.
Sources: @R.MartinhoFernandez, @KerrekSB
To answer your three questions:
The implementation follows the design philosophy that optional's assignment uses T's assignment; and in this sense, the implementation is fine.
optional was designed with possible extensions in mind. Optional is only an interface for the underlying class optional_base. Instead of using optional you can derive your own class from optional_base. optional_base has a protected member construct, which does almost what you need. You will need a new member, say reset(T), that first clears the optional_base and then calls construct().
Alternatively, you can add member reset(T) to optional. This would be the least intrusive change.
You could also try the reference implementation of optional from this proposal.