I have the following code in my project at the moment :
std::vector<int> vectorOfFirsts;
std::set<double> setOfSeconds;
std::list<std::pair<int,double>> cachedList;
// do something to fill the list
for (const auto& pair : cachedList)
{
vectorOfFirsts.push_back(pair.first);
setOfSeconds.insert(pair.second);
}
This list will be very big, and is only necessary for filling the vector and the set (i.e. its content can be invalidated). My question now is, if the following optimization is a good idea:
for (const auto& pair : cachedList)
{
vectorOfFirsts.push_back(std::move(pair.first));
setOfSeconds.insert(std::move(pair.second));
}
Will calling std::move on pair.first somehow invalidate pair.second? And will this code provide any speedup for the loop? I know that it would probably be a good idea to fill the vector/set instead of the list, but the list is filled via some legacy code I have no control over / no time to dig in.
Will calling std::move on pair.first somehow invalidate pair.second?
No. first and second are completely different variables happening to reside in some class object. Moving one does not affect the other.
And will this code provide any speedup for the loop?
It depends on the types. The point of move-ing something is to transfer resources, basically. Since the pairs here are of ints and doubles, there are no resources involved, so there is nothing to transfer. Had it been a pair of some matrix type and some tensor type, each with some internal dynamically-allocated buffer, then it might have improved performance.
Stop.
Take a moment to think about this code. Comments inline
// step one - iterate through cachedList, binding the dereferenced
// iterator to a CONST reference
for (const auto& pair : cachedList)
{
// step 2 - use std::move to cast the l-value reference pair to an
// r-value. This will have the type const <pairtype> &&. A const
// r-value reference.
// vector::push_back does not have an overload for const T&& (rightly)
// so const T&& will decay to const T&. You will copy the object.
vectorOfFirsts.push_back(std::move(pair.first));
// ditto
setOfSeconds.insert(std::move(pair.second));
}
It needs to be:
for (auto& pair : cachedList)
{
vectorOfFirsts.push_back(std::move(pair.first));
setOfSeconds.insert(std::move(pair.second));
}
And yes, this then becomes a valid and legitimate use of move.
Related
With regard to the following code I would like to have some clarification. We have an array of pointers to a class. Next we loop over the array using a range based loop. For this range based loop auto& is used. But next when we use the element a we can use the arrow operator to call a function.
This code is compiled using C++ 11.
// Definition of an array of pointers to some class.
some_class* array[10];
// The array of pointers is set.
// Loop over the array.
for(auto& a : array)
{
// Call some function using the arrow operator.
a->some_func();
}
Is my understanding correct that auto& a is a reference to a pointer? Is this not a bit over kill. Would using auto a not create a copy of the pointer and take up the same amount of memory?
Your code compiles fine.
Nevertheless, there is not really a point using a reference here, if you don't like to change it.
Best practise here is
Use const auto &T if the content shall not be changed. The reference is important, if the type T of auto is large. Otherwise you will copy the object.
Use auto & T if you like to change the content of the container you are iterating.
Is my understanding correct that auto& a is a reference to a pointer?
Yes that's correct
Would using auto a not create a copy of the pointer and take up the same amount of memory
Think of references as an alias for the variable, that is, think of it as a different name.
as for this -> not create a copy of the pointer`
A pointer is very light weight and copying a pointer is relatively cheap (that's how views are implemented, pointers to sequences). If the object underline the container you are iterating is a fundamental type or a pointer to some type, auto is enough. In cases where the underline object of a container is a heavy weight object, then auto& is a better alternative (and ofc you can add const qualifier if you don't want to modify it).
Motivation:
I'm trying to transfer a std::vector<std::unique_ptr<some_type>> to a different thread, via a lambda capture.
Since I need the vector to not be cleaned up when the function goes out of scope, I need to take it by value (and not by reference).
Since it's a vector of unique_ptrs, I need to move (and not copy) it into the capture.
I'm using a generalized lambda capture to move the vector while capturing.
Minimal program to illustrate the concept:
auto create_vector(){
std::vector<std::unique_ptr<int>> new_vector{};
new_vector.push_back(std::make_unique<int>(5));
return std::move(new_vector);
}
int main() {
const auto vec_const = create_vector();
[vec=std::move(vec_const)](){
std::cout << "lambda, vec size: " << vec.size() << std::endl;
}();
}
Issue:
If I'm using a const local vector, compilation fails due to attempting to copy the unique_ptrs.
However if I remove the const qualifier, the code compiles and runs well.
auto vec_const = create_vector();
Questions:
What's the reason for this? Does being const disable the "movability" of the vector? Why?
How would I ensure the constness of a vector in such a scenario?
Follow-up:
The comments and answers mention that a const type can't be moved from. Sounds reasonable, however the the compiler errors fail to make it clear. In this case I would expect one of two things:
The std::move(vec_const) should throw an error regarding moving from const (casting it to rvalue) being impossible.
The vector move-constructor telling me that it refuses to accept const rvalues.
Why don't those happen? Why does instead the assignment seems to just try to copy the unique_ptrs inside the vector (which is what I'd expect from the vectors copy-constructor)?
Moving is a disruptive operation: you conceptually change the content of the thing you move from.
So yes: a const object can (and should) not be moved from. That would change the original object, which makes its constness void.
In this case, vector has no vector(const vector&&), only vector(vector &&) (move constructor) and vector(const vector &) (copy constructor).
Overload resolution will only bind a call with const vector argument to the latter (lest const-correctness would be violated), so this will result in copying the contents.
I agree: error reporting sucks. It's hard to engineer an error report about vector when you hit a problem with unique_ptr. That's why the whole tail of required from ...., required from ... obliterates the view.
From your question, and your code, I can tell that you don't fully grasp the move semantics stuff:
you shouldn't move into a return value; a return value is already an rvalue, so there's no point.
std::move does not really move anything, it only changes the qualifier of the variable you want to 'move from', so that the right receiver can be selected (using 'binding' rules). It is the receiving function that actually changes the contents of the original object.
When you are moving something from A to B, then act of moving must necessarily mean that A gets modified, since after the move A may no longer have whatever was in A, originally. This is the whole purpose of move semantics: to provide an optimal implementation since the moved-from object is allowed to be modified: its contents getting transferred in some fast and mysterious way into B, leaving A in some valid, but unspecified, state.
Consequently, by definition, A cannot be const.
http://www.cplusplus.com/reference/vector/vector/push_back/ (C++11 Version)
What is the difference and/or advantages of void push_back (const value_type& val); & void push_back (value_type&& val) and which do you suggest I use?;
I don't understand how to fill in the arguments (const value_type& val) & (value_type&& val)
I don't understand the second sentence under the parameter section. (It's a bit too wordy for me to get). I do understand what val is though
It doesn't give an example I can understand real well. Can I get other examples using vectors or some video links that explain the use of the function in practice better?
http://www.cplusplus.com/reference/vector/vector/pop_back/
It doesn't give an example I can understand real well. Can I get other examples using vectors or some video links that explain the use of the function in practice better?
If you are a beginner, just read over the additional qualifiers like const, & and &&. The methods in the STL are implemented in a way, that they behave consistent over all overloads:
I will give you a small example here:
std::vector<int> myvector;
myvector.push_back(5);
int five = 5;
myvector.push_back(five);
Now the more in depth part of the answer:
First (const value_type& val). The & character signals, that we take the argument by reference, that means we don't copy the argument, but get a fancy pointer, that will behave like the object itself.
You may not want, that your variable is changed, if you push it back to a vector. To get a promise, by the programmer of the STL, that he will not change your variable while pushing it back to the vector, he can add the const before the type.
The reason it is implemented that way, is that it may prevent an unneeded copy. (First copy the argument onto the stack to call push_back and the second time copy it at the position in the vector. The first copy is unnecessary and saved by the const reference.)
This is all nice and simple, but there are cases, where the compiler is not allowed to take a reference of a value and pass it to a function. In case of temporary values, there is no reference to take, because there is no variable in memory. Take the following line for example.
myvector.push_back(5);
Since the 5 has no address, it can't be passed as a reference. The compiler can not use the first overload of the function. But the programmer also does not want to waste the time for the copy onto the stack. That is why C++11 added new semantic. A so called rvalue for such temporary objects. If you want to write a function to take such an rvalue, you can do so by using type&& rvalue_variable. The value in this case the 5 is moved onto the stack by using the move constructor of the type. For trivial types like int, this will be the same as the copy constructor. For complex types like std::vector there are shortcuts one can take if one is allowed to rip the temporary object apart. In case of the vector, it does not need to copy all the data in the vector to a new location, but can use the pointer of the old vector in the new object.
Now we can look at the example again:
std::vector<int> myvector;
myvector.push_back(5); // push_back(const int&) can't be applied. The compiler chooses push_back(int&&) for us
int five = 5;
myvector.push_back(five); // push_back(const int&) can be applied and is used by the compiler
// The resulting vector after this has the two values [5, 5]
// and we see, that we don't need to care about it.
This should show you how you can use both of them.
push_back():
std::vector<int> vec = { 0, 1, 2 };
vec.push_back(3);
pop_back():
vec.pop_back();
vec.pop_back();
If you need more clarification:
push_back(const T& val) adds its parameter to the end of the vector, effectively increasing the size by 1 iff the vector capacity will be exceeded by its size.
pop_back() doesn't take any parameters and removes the last element of the vector, effectively reducing the size by 1.
Update:
I'm trying to tackle your questions one by one, if there is anything unclear, let me know.
What is the difference and/or advantages of void push_back (const value_type& val); & void push_back (value_type&& val) and which do you suggest I use?;
Prior to C++11, rvalue-references didn't exist. That's why push_back was implemented as vector.push_back(const value_type& val). If you have a compiler that supports C++11 or later, std::vector.push_back() will be overloaded for const lvalue references and rvalue references.
I don't understand how to fill in the arguments (const value_type& val) & (value_type&& val)
You as a programmer do NOT choose how you pass arguments to push_back(), the compiler does it for you automagically, in most cases.
I don't understand the second sentence under the parameter section. (It's a bit too wordy for me to get). I do understand what val is though
value_type is equal to the type of vector that you declared. If a vector is declared with std::string, then it can only hold std::string.
std::vector<std::string> vec;
vec.push_back("str"); // Ok. "str" is allowed.
vec.push_back(12); // Compile-time error. 12 is not allowed.
What is the difference and/or advantages of void push_back (const value_type& val); & void push_back (value_type&& val) and which do you suggest I use?
void push_back(const value_type&) takes an argument, that is then copied into the vector. This means that a new element is initialized as a copy of the passed argument, as defined by an appropriate allocator.
void push_back(value_type&&) takes an argument, that is then moved into the container (this type of expressions are called rvalue expressions).
The usage of either of two depends on the results you want to achieve.
I don't understand how to fill in the arguments (const value_type& val) & (value_type&& val)
In most cases you shouldn't think about which version to use, as compiler will take care of this for you. Second version will be called for any rvalue argument and the first one for the rest. In a rare case when you want to ensure the second overload is called you can use std::move to explicitly convert the argument expression into xvalue (which is a kind of rvalues).
I don't understand the second sentence under the parameter section. (It's a bit too wordy for me to get). I do understand what val is though
The sentence in question is:
Member type value_type is the type of the elements in the container, defined in vector as an alias of its first template parameter (T).
This means that value_type is the same type as the type of vector's elements. E.g., if you have vector<int> then value_type is the same as int and for vector<string> the value_type is string.
Because vector is not an ordinary type, but a template, you must specify a type parameters (which goes into angle brackets <> after vector) when defining a variable. Inside the vector template specification this type parameter T is then aliased with value_type:
typedef T value_type;
It doesn't give an example I can understand real well. Can I get other examples using vectors or some video links that explain the use of the function in practice better?
The main thing you need to remember is that vector behaves like a simple array, but with dynamicly changeable size and some additional information, like its length. push_back is simply a function that adds a new element at the end of this pseudo-array. There is, of course, a lot of subtle details, but they are inconsequential in most of the cases.
The basic usage is like this:
vector<int> v; // v is empty
v.push_back(1); // v now contains one element
vector<float> v2 { 1.0, 2.0 }; // v2 is now a vector with two elements
float f = v2.pop_back(); // v2 now has one element, and f is now equals 2.0
The best way to understand how it works is to try using it yourself.
Since const reference is pretty much the same as passing by value but without creating a copy (to my understanding). So is there a case where it is needed to create a copy of the variables (so we would need to use pass by value).
There are situations where you don't modify the input, but you still need an internal copy of the input, and then you may as well take the arguments by value. For example, suppose you have a function that returns a sorted copy of a vector:
template <typename V> V sorted_copy_1(V const & v)
{
V v_copy = v;
std::sort(v_copy.begin(), v_copy.end());
return v;
}
This is fine, but if the user has a vector that they never need for any other purpose, then you have to make a mandatory copy here that may be unnecessary. So just take the argument by value:
template <typename V> V sorted_copy_2(V v)
{
std::sort(v.begin(), v.end());
return v;
}
Now the entire process of producing, sorting and returning a vector can be done essentially "in-place".
Less expensive examples are algorithms which consume counters or iterators which need to be modified in the process of the algorithm. Again, taking those by value allows you to use the function parameter directly, rather than requiring a local copy.
It's usually faster to pass basic data types such as ints, floats and pointers by value.
Your function may want to modify the parameter locally, without altering the state of the variable passed in.
C++11 introduces move semantics. To move an object into a function parameter, its type cannot be const reference.
Like so many things, it's a balance.
We pass by const reference to avoid making a copy of the object.
When you pass a const reference, you pass a pointer (references are pointers with extra sugar to make them taste less bitter). And assuming the object is trivial to copy, of course.
To access a reference, the compiler will have to dereference the pointer to get to the content [assuming it can't be inlined and the compiler optimises away the dereference, but in that case, it will also optimise away the extra copy, so there's no loss from passing by value either].
So, if your copy is "cheaper" than the sum of dereferencing and passing the pointer, then you "win" when you pass by value.
And of course, if you are going to make a copy ANYWAY, then you may just as well make the copy when constructing the argument, rather than copying explicitly later.
The best example is probably the Copy and Swap idiom:
C& operator=(C other)
{
swap(*this, other);
return *this;
}
Taking other by value instead of by const reference makes it much easier to write a correct assignment operator that avoids code duplication and provides a strong exception guarantee!
Also passing iterators and pointers is done by value since it makes those algorithms much more reasonable to code, since they can modify their parameters locally. Otherwise something like std::partition would have to immediately copy its input anyway, which is both inefficient and looks silly. And we all know that avoiding silly-looking code is the number one priority:
template<class BidirIt, class UnaryPredicate>
BidirIt partition(BidirIt first, BidirIt last, UnaryPredicate p)
{
while (1) {
while ((first != last) && p(*first)) {
++first;
}
if (first == last--) break;
while ((first != last) && !p(*last)) {
--last;
}
if (first == last) break;
std::iter_swap(first++, last);
}
return first;
}
A const& cannot be changed without a const_cast through the reference, but it can be changed. At any point where code leaves the "analysis range" of your compiler (maybe a function call to a different compilation unit, or through a function pointer it cannot determine the value of at compilation time) it must assume that the value referred to may have changed.
This costs optimization. And it can make it harder to reason about possible bugs or quirks in your code: a reference is non-local state, and functions that operate only on local state and produce no side effects are really easy to reason about. Making your code easy to reason about is a large boon: more time is spent maintaining and fixing code than writing it, and effort spent on performance is fungible (you can spent it where it matters, instead of wasting time on micro optimizations everywhere).
On the other hand, a value requires that the value be copied into local automatic storage, which has costs.
But if your object is cheap to copy, and you don't want the above effect to occur, always take by value as it makes the compilers job of understanding the function easier.
Naturally only when the value is cheap to copy. If expensive to copy, or even if the copy cost is unknown, that cost should be enough to take by const&.
The short version of the above: taking by value makes it easier for you and the compiler to reason about the state of the parameter.
There is another reason. If your object is cheap to move, and you are going to store a local copy anyhow, taking by value opens up efficiencies. If you take a std::string by const&, then make a local copy, one std::string may be created in order to pass thes parameter, and another created for the local copy.
If you took the std::string by value, only one copy will be created (and possibly moved).
For a concrete example:
std::string some_external_state;
void foo( std::string const& str ) {
some_external_state = str;
}
void bar( std::string str ) {
some_external_state = std::move(str);
}
then we can compare:
int main() {
foo("Hello world!");
bar("Goodbye cruel world.");
}
the call to foo creates a std::string containing "Hello world!". It is then copied again into the some_external_state. 2 copies are made, 1 string discarded.
The call to bar directly creates the std::string parameter. Its state is then moved into some_external_state. 1 copy created, 1 move, 1 (empty) string discarded.
There are also certain exception safety improvements caused by this technique, as any allocation happens outside of bar, while foo could throw a resource exhausted exception.
This only applies when perfect forwarding would be annoying or fail, when moving is known to be cheap, when copying could be expensive, and when you know you are almost certainly going to make a local copy of the parameter.
Finally, there are some small types (like int) which the non-optimized ABI for direct copies is faster than the non-optimized ABI for const& parameters. This mainly matters when coding interfaces that cannot or will not be optimized, and is usually a micro optimization.
A const int * and an int *const are very different. Similarly with const std::auto_ptr<int> vs. std::auto_ptr<const int>. However, there appears to be no such distinction with const std::vector<int> vs. std::vector<const int> (actually I'm not sure the second is even allowed). Why is this?
Sometimes I have a function which I want to pass a reference to a vector. The function shouldn't modify the vector itself (eg. no push_back()), but it wants to modify each of the contained values (say, increment them). Similarly, I might want a function to only change the vector structure but not modify any of its existing contents (though this would be odd). This kind of thing is possible with std::auto_ptr (for example), but because std::vector::front() (for example) is defined as
const T &front() const;
T &front();
rather than just
T &front() const;
There's no way to express this.
Examples of what I want to do:
//create a (non-modifiable) auto_ptr containing a (modifiable) int
const std::auto_ptr<int> a(new int(3));
//this works and makes sense - changing the value pointed to, not the pointer itself
*a = 4;
//this is an error, as it should be
a.reset();
//create a (non-modifiable) vector containing a (modifiable) int
const std::vector<int> v(1, 3);
//this makes sense to me but doesn't work - trying to change the value in the vector, not the vector itself
v.front() = 4;
//this is an error, as it should be
v.clear();
It's a design decision.
If you have a const container, it usually stands to reason that you don't want anybody to modify the elements that it contains, which are an intrinsic part of it. That the container completely "owns" these elements "solidifies the bond", if you will.
This is in contrast to the historic, more lower-level "container" implementations (i.e. raw arrays) which are more hands-off. As you quite rightly say, there is a big difference between int const* and int * const. But standard containers simply choose to pass the constness on.
The difference is that pointers to int do not own the ints that they point to, whereas a vector<int> does own the contained ints. A vector<int> can be conceptualised as a struct with int members, where the number of members just happens to be variable.
If you want to create a function that can modify the values contained in the vector but not the vector itself then you should design the function to accept iterator arguments.
Example:
void setAllToOne(std::vector<int>::iterator begin, std::vector<int>::iterator end)
{
std::for_each(begin, end, [](int& elem) { elem = 1; });
}
If you can afford to put the desired functionality in a header, then it can be made generic as:
template<typename OutputIterator>
void setAllToOne(OutputIterator begin, OutputIterator end)
{
typedef typename iterator_traits<OutputIterator>::reference ref;
std::for_each(begin, end, [](ref elem) { elem = 1; });
}
One big problem syntactically with what you suggest is this: a std::vector<const T> is not the same type as a std::vector<T>. Therefore, you could not pass a vector<T> to a function that expects a vector<const T> without some kind of conversion. Not a simple cast, but the creation of a new vector<const T>. And that new one could not simply share data with the old; it would have to either copy or move the data from the old one to the new one.
You can get away with this with std::shared_ptr, but that's because those are shared pointers. You can have two objects that reference the same pointer, so the conversion from a std::shared_ptr<T> to shared_ptr<const T> doesn't hurt (beyond bumping the reference count). There is no such thing as a shared_vector.
std::unique_ptr works too because they can only be moved from, not copied. Therefore, only one of them will ever have the pointer.
So what you're asking for is simply not possible.
You are correct, it is not possible to have a vector of const int primarily because the elements will not assignable (requirements for the type of the element contained in the vector).
If you want a function that only modifies the elements of a vector but not add elements to the vector itself, this is primarily what STL does for you -- have functions that are agnostic about which container a sequence of elements is contained in. The function simply takes a pair of iterators and does its thing for that sequence, completely oblivious to the fact that they are contained in a vector.
Look up "insert iterators" for getting to know about how to insert something into a container without needing to know what the elements are. E.g., back_inserter takes a container and all that it cares for is to know that the container has a member function called "push_back".