Capacity of the vector from which data was moved [duplicate] - c++

This question already has answers here:
Is a moved-from vector always empty?
(4 answers)
Closed 4 years ago.
Is it mandatory, that the capacity of the std::vector is zero, after moving data from it? Assume that the memory allocators of source and destination vectors are always matching.
std::vector< int > v{1, 2, 3};
assert(0 < v.capacity());
std::vector< int > w;
w = std::move(v);
assert(0 == v.capacity());
Here said that move assignment operator leaves stealed RHS vector in a valid, but unspecified state. But nowhere pointed, that vector should perform additional memory allocations during move assignment operation. Another note is: vector have continuous memory region as underlying storage.

If your question had been about move construction of a vector, the answer would be easy, the source vector is left empty after the move. This is because of the requirement in
Table 99 — Allocator-aware container requirements
Expression:
X(rv)
X u(rv)
Requires: move construction of A shall not exit via an exception.
post: u shall have the same elements as rv had before this construction; the value of u.get_allocator() shall be the same as the value of rv.get_allocator() before this construction.
Complexity: constant
(the A in the requirements clause is the allocator type)
The constant complexity leaves no option but to steal resources from the source vector, which means for it to be in a valid, but unspecified state you'd need to leave it empty, and capacity() will equal zero.
The answer is considerably more complicated in case of a move assignment. The same Table 99 lists the requirement for move assignment as
Expression:
a = rv
Return type:
X&
Requires: If allocator_traits<allocator_type>::propagate_on_container_move_assignment::value is
false, T is MoveInsertable into X and MoveAssignable. All existing elements of a are either move assigned to or destroyed.
post: a shall be equal to the value that rv had before this assignment.
Complexity: linear
There are different cases to evaluate here.
First, say allocator_traits<allocator_type>::propagate_on_container_move_assignment::value == true, then the allocator can also be move assigned. This is mentioned in §23.2.1/8
... The allocator may be replaced only via assignment or swap(). Allocator replacement is performed by copy assignment, move assignment, or swapping of the allocator
only if allocator_traits<allocator_type>::propagate_on_container_copy_assignment::value,
allocator_traits<allocator_type>::propagate_on_container_move_assignment::value, or allocator_traits<allocator_type>::propagate_on_container_swap::value is true within the implementation of the corresponding container operation.
So the destination vector will destroy its elements, the allocator from the source is moved and the destination vector takes ownership of the memory buffer from the source. This will leave the source vector empty, and capacity() will equal zero.
Now let's consider the case where allocator_traits<allocator_type>::propagate_on_container_move_assignment::value == false. This means the allocator from the source cannot be move assigned to the destination vector. So you need to check the two allocators for equality before determining what to do.
If dest.get_allocator() == src.get_allocator(), then the destination vector is free to take ownership of the memory buffer from the source because it can use its own allocator to deallocate the storage.
Table 28 — Allocator requirements
Expression:
a1 == a2
Return type:
bool
returns true only if storage allocated from each can be deallocated via the other. ...
The sequence of operations performed is the same as the first case, except the source allocator is not move assigned. This will leave the source vector empty, and capacity() will equal zero.
In the last case, if allocator_traits<allocator_type>::propagate_on_container_move_assignment::value == false and dest.get_allocator() != src.get_allocator(), then the source allocator cannot be moved, and the destination allocator is unable to deallocate the storage allocated by the source allocator, so it cannot steal the memory buffer from source.
Each element from the source vector must be either move inserted or move assigned to the destination vector. Which operation gets done depends on the existing size and capacity of the destination vector.
The source vector retains ownership of its memory buffer after the move assignment, and it is up to the implementation to decide whether to deallocate the buffer or not, and the vector will most likely have capacity() greater than 0.
To ensure you do not run into undefined behavior when trying to resuse a vector that has been move assigned from, you should first call the clear() member function. This can be safely done since vector::clear has no pre-conditions, and will return the vector to a valid and specified state.
Also, vector::capacity has no pre-conditions either, so you can always query the capacity() of a moved from vector.

The state of the moved-from vector is unspecified but valid after the move, as you found out.
This means that it could really be in any valid state, in particular you can't assume that its capacity will be 0. It will probably be zero, and that would make a lot of sense, but that's not at all guaranteed.
But again, in practice if you don't care all that much about the standard, I suppose that you could rely on the capacity being 0. Due to the constraints on move operations, the vector move constructor pretty much has to steal memory from the moved-from one, leaving it empty. That's what will happen in almost all cases/implementations, but that's just not required.
A particularly twisted implementation could decide to reserve some elements in the moved-from vector just to mess with you. That technically wouldn't break any requirement.

Related

What happens to the left hand side std::vector resources on move assignment?

I was trying to find out if it's granted by the C++ 11 standard that the resources of a std::vector receiving a move assignment are freed.
I'll give you an example:
std::vector<int> a {1, 2};
std::vector<int> b {3, 4};
a = std::move(b);
Now what I'd be expecting is that both a and b are wrapping a regular array (usually) and they have a pointer to it, namely a_ptr and b_ptr. Now I think the move assignment among other things does a_ptr = b_ptr. Does the standard guarantee that the memory in a_ptr is freed before or by this?
By taking a look at Table 71: Container requirements it is stated that container's internal structure will be destroyed.
a = rv => All existing elements of a are either move assigned to or destroyed
As Howard Hinnant has mentioned in the comments, there is a special case which has also been stated in above sentence:
All existing elements of a are either move assigned to or
destroyed.
This spacial case is related to move-assignment operator and simply has 3 different states:
propagate_on_container_move_assignment is true in lhs:
In this case allocated memory in lhs is freed and the move operation is done. The allocator is also move assigned.
propagate_on_container_move_assignment is false in lhs and the allocators from both sides are equal:
In this case same sequence of operations is done except that there is no need to assign to the lhs allocator.
propagate_on_container_move_assignment is false in lhs and the allocators from both sides are not equal:
In this case because two allocators are not equal the allocator of lhs container cannot be used to manage the memory of the rhs container so the only option is to move assign rhs container items to the lhs container.
The last case explains why items in lhs container can be move assigned to. In any case the memory will be freed, reused or reallocated.

reserve() - data() trick on empty vector - is it correct?

As we know, std::vector when initialized like std::vector vect(n) or empty_vect.resize(n) not only allocates required amount of memory but also initializes it with default value (i.e. calls default constructor). This leads to unnecessary initialization especially if I have an array of integers and I'd like to fill it with some specific values that cannot be provided via any vector constructor.
Capacity on the other hand allocates the memory in call like empty_vect.reserve(n), but in this case vector still is empty. So size() returns 0, empty() returns true, operator[] generates exceptions.
Now, please look into the code:
{ // My scope starts here...
std::vector<int> vect;
vect.reserve(n);
int *data = vect.data();
// Here I know the size 'n' and I also have data pointer so I can use it as a regular array.
// ...
} // Here ends my scope, so vector is destroyed, memory is released.
The question is if "so I can use it as array" is a safe assumption?
No matter for arguments, I am just curious of above question. Anyway, as for arguments:
It allocates memory and automatically frees it on any return from function
Code does not performs unnecessary data initialization (which may affect performance in some cases)
No, you cannot use it.
The standard (current draft, equivalent wording in C++11) says in [vector.data]:
constexpr T* data() noexcept;
constexpr const T* data() const noexcept;
Returns: A pointer such that [data(), data() + size()) is a valid range.
For a non-empty vector, data() == addressof(front()).
You don't have any guarantee that you can access through the pointer beyond the vector's size. In particular, for an empty vector, the last sentence doesn't apply and so you cannot even be sure that you are getting a valid pointer to the underlying array.
There is currently no way to use std::vector with default-initialized elements.
As mentioned in the comments, you can use std::unique_ptr instead (requires #inclue<memory>):
auto data = std::unique_ptr<int[]>{new int[n]};
which will give you a std::unique_ptr<int[]> smart pointer to a dynamically sized array of int's, which will be destroyed automatically when the lifetime of data ends and that can transfer it's ownership via move operations.
It can be dereferenced and indexed directly with the usual pointer syntax, but does not allow direct pointer arithmetic. A raw pointer can be obtained from it via data.get().
It does not offer you the std::vector interface, though. In particular it does not provide access to its allocation size and cannot be copied.
Note: I made a mistake in a previous version of this answer. I used std::make_unique<int[]> without realizing that it actually also performs value-initialization (initialize to zero for ints). In C++20 there will be std::make_unique_default_init<int[]> which will default-initialize (and therefore leave ints with indeterminate value).

C++ std::vector clear all elements

Consider a std::vector:
std::vector<int> vec;
vec.push_back(1);
vec.push_back(2);
Would vec.clear() and vec = std::vector<int>() do the same job? What about the deallocation in second case?
vec.clear() clears all elements from the vector, leaving you with a guarantee of vec.size() == 0.
vec = std::vector<int>() calls the copy/move(Since C++11) assignment operator , this replaces the contents of vec with that of other. other in this case is a newly constructed empty vector<int> which means that it's the same effect as vec.clear();. The only difference is that clear() doesn't affect the vector's capacity while re-assigning does, it resets it.
The old elements are deallocated properly just as they would with clear().
Note that vec.clear() is always as fast and without the optimizer doing it's work most likely faster than constructing a new vector and assigning it to vec.
They are different:
clear is guaranteed to not change capacity.
Move assignment is not guaranteed to change capacity to zero, but it may and will in a typical implementation.
The clear guarantee is by this rule:
No reallocation shall take place during insertions that happen after a call to reserve() until the time when an insertion would make the size of the vector greater than the value of capacity()
Post conditions of clear:
Erases all elements in the
container. Post: a.empty()
returns true
Post condition of assignment:
a = rv;
a shall be equal to
the value that rv
had before this
assignment
a = il;
Assigns the range
[il.begin(),il.end()) into a. All existing
elements of a are either assigned to or
destroyed.

Is it safe to push_back an element from the same vector?

vector<int> v;
v.push_back(1);
v.push_back(v[0]);
If the second push_back causes a reallocation, the reference to the first integer in the vector will no longer be valid. So this isn't safe?
vector<int> v;
v.push_back(1);
v.reserve(v.size() + 1);
v.push_back(v[0]);
This makes it safe?
It looks like http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-closed.html#526 addressed this problem (or something very similar to it) as a potential defect in the standard:
1) Parameters taken by const reference can be changed during execution
of the function
Examples:
Given std::vector v:
v.insert(v.begin(), v[2]);
v[2] can be changed by moving elements of vector
The proposed resolution was that this was not a defect:
vector::insert(iter, value) is required to work because the standard
doesn't give permission for it not to work.
Yes, it's safe, and standard library implementations jump through hoops to make it so.
I believe implementers trace this requirement back to 23.2/11 somehow, but I can't figure out how, and I can't find something more concrete either. The best I can find is this article:
http://www.drdobbs.com/cpp/copying-container-elements-from-the-c-li/240155771
Inspection of libc++'s and libstdc++'s implementations shows that they are also safe.
The standard guarantees even your first example to be safe. Quoting C++11
[sequence.reqmts]
3 In Tables 100 and 101 ... X denotes a sequence container class, a denotes a value of X containing elements of type T, ... t denotes an lvalue or a const rvalue of X::value_type
16 Table 101 ...
Expression a.push_back(t) Return type void Operational semantics Appends a copy of t. Requires: T shall be CopyInsertable into X. Container basic_string, deque, list, vector
So even though it's not exactly trivial, the implementation must guarantee it will not invalidate the reference when doing the push_back.
It is not obvious that the first example is safe, because the simplest implementation of push_back would be to first reallocate the vector, if needed, and then copy the reference.
But at least it seems to be safe with Visual Studio 2010. Its implementation of push_back does special handling of the case when you push back an element in the vector.
The code is structured as follows:
void push_back(const _Ty& _Val)
{ // insert element at end
if (_Inside(_STD addressof(_Val)))
{ // push back an element
...
}
else
{ // push back a non-element
...
}
}
This isn't a guarantee from the standard, but as another data point, v.push_back(v[0]) is safe for LLVM's libc++.
libc++'s std::vector::push_back calls __push_back_slow_path when it needs to reallocate memory:
void __push_back_slow_path(_Up& __x) {
allocator_type& __a = this->__alloc();
__split_buffer<value_type, allocator_type&> __v(__recommend(size() + 1),
size(),
__a);
// Note that we construct a copy of __x before deallocating
// the existing storage or moving existing elements.
__alloc_traits::construct(__a,
_VSTD::__to_raw_pointer(__v.__end_),
_VSTD::forward<_Up>(__x));
__v.__end_++;
// Moving existing elements happens here:
__swap_out_circular_buffer(__v);
// When __v goes out of scope, __x will be invalid.
}
The first version is definitely NOT safe:
Operations on iterators obtained by calling a standard library container or string member function may access the underlying container, but shall not modify it. [ Note: In particular, container operations that invalidate iterators conflict with operations on iterators associated with that container. — end note ]
from section 17.6.5.9
Note that this is the section on data races, which people normally think of in conjunction with threading... but the actual definition involves "happens before" relationships, and I don't see any ordering relationship between the multiple side-effects of push_back in play here, namely the reference invalidation seems not to be defined as ordered with respect to copy-constructing the new tail element.
It is completely safe.
In your second example you have
v.reserve(v.size() + 1);
which is not needed because if vector goes out of its size, it will imply the reserve.
Vector is responsible for this stuff, not you.
Both are safe since push_back will copy the value, not the reference. If you are storing pointers, that is still safe as far as the vector is concerned, but just know that you'll have two elements of your vector pointing to the same data.
Section 23.2.1 General Container Requirements
16
a.push_back(t) Appends a copy of t. Requires: T shall be CopyInsertable into X.
a.push_back(rv) Appends a copy of rv. Requires: T shall be MoveInsertable into X.
Implementations of push_back must therefore ensure that a copy of v[0] is inserted. By counter example, assuming an implementation that would reallocate before copying, it would not assuredly append a copy of v[0] and as such violate the specs.
From 23.3.6.5/1: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens, all the iterators and references before the insertion point remain valid.
Since we're inserting at the end, no references will be invalidated if the vector isn't resized. So if the vector's capacity() > size() then it's guaranteed to work, otherwise it's guaranteed to be undefined behavior.

Is a moved-from vector always empty?

I know that generally the standard places few requirements on the values which have been moved from:
N3485 17.6.5.15 [lib.types.movedfrom]/1:
Objects of types defined in the C++ standard library may be moved from (12.8). Move operations may
be explicitly specified or implicitly generated. Unless otherwise specified, such moved-from objects shall be placed in a valid but unspecified state.
I can't find anything about vector that explicitly excludes it from this paragraph. However, I can't come up with a sane implementation that would result in the vector being not empty.
Is there some standardese that entails this that I'm missing or is this similar to treating basic_string as a contiguous buffer in C++03?
I'm coming to this party late, and offering an additional answer because I do not believe any other answer at this time is completely correct.
Question:
Is a moved-from vector always empty?
Answer:
Usually, but no, not always.
The gory details:
vector has no standard-defined moved-from state like some types do (e.g. unique_ptr is specified to be equal to nullptr after being moved from). However the requirements for vector are such that there are not too many options.
The answer depends on whether we're talking about vector's move constructor or move assignment operator. In the latter case, the answer also depends on the vector's allocator.
vector<T, A>::vector(vector&& v)
This operation must have constant complexity. That means that there are no options but to steal resources from v to construct *this, leaving v in an empty state. This is true no matter what the allocator A is, nor what the type T is.
So for the move constructor, yes, the moved-from vector will always be empty. This is not directly specified, but falls out of the complexity requirement, and the fact that there is no other way to implement it.
vector<T, A>&
vector<T, A>::operator=(vector&& v)
This is considerably more complicated. There are 3 major cases:
One:
allocator_traits<A>::propagate_on_container_move_assignment::value == true
(propagate_on_container_move_assignment evaluates to true_type)
In this case the move assignment operator will destruct all elements in *this, deallocate capacity using the allocator from *this, move assign the allocators, and then transfer ownership of the memory buffer from v to *this. Except for the destruction of elements in *this, this is an O(1) complexity operation. And typically (e.g. in most but not all std::algorithms), the lhs of a move assignment has empty() == true prior to the move assignment.
Note: In C++11 the propagate_on_container_move_assignment for std::allocator is false_type, but this has been changed to true_type for C++1y (y == 4 we hope).
In case One, the moved-from vector will always be empty.
Two:
allocator_traits<A>::propagate_on_container_move_assignment::value == false
&& get_allocator() == v.get_allocator()
(propagate_on_container_move_assignment evaluates to false_type, and the two allocators compare equal)
In this case, the move assignment operator behaves just like case One, with the following exceptions:
The allocators are not move assigned.
The decision between this case and case Three happens at run time, and case Three requires more of T, and thus so does case Two, even though case Two doesn't actually execute those extra requirements on T.
In case Two, the moved-from vector will always be empty.
Three:
allocator_traits<A>::propagate_on_container_move_assignment::value == false
&& get_allocator() != v.get_allocator()
(propagate_on_container_move_assignment evaluates to false_type, and the two allocators do not compare equal)
In this case the implementation can not move assign the allocators, nor can it transfer any resources from v to *this (resources being the memory buffer). In this case, the only way to implement the move assignment operator is to effectively:
typedef move_iterator<iterator> Ip;
assign(Ip(v.begin()), Ip(v.end()));
That is, move each individual T from v to *this. The assign can reuse both capacity and size in *this if available. For example if *this has the same size as v the implementation can move assign each T from v to *this. This requires T to be MoveAssignable. Note that MoveAssignable does not require T to have a move assignment operator. A copy assignment operator will also suffice. MoveAssignable just means T has to be assignable from an rvalue T.
If the size of *this is not sufficient, then new T will have to be constructed in *this. This requires T to be MoveInsertable. For any sane allocator I can think of, MoveInsertable boils down to the same thing as MoveConstructible, which means constructible from an rvalue T (does not imply the existence of a move constructor for T).
In case Three, the moved-from vector will in general not be empty. It could be full of moved-from elements. If the elements don't have a move constructor, this could be equivalent to a copy assignment. However, there is nothing that mandates this. The implementor is free to do some extra work and execute v.clear() if he so desires, leaving v empty. I am not aware of any implementation doing so, nor am I aware of any motivation for an implementation to do so. But I don't see anything forbidding it.
David Rodríguez reports that GCC 4.8.1 calls v.clear() in this case, leaving v empty. libc++ does not, leaving v not empty. Both implementations are conforming.
While it might not be a sane implementation in the general case, a valid implementation of the move constructor/assignment is just copying the data from the source, leaving the source untouched. Additionally, for the case of assignment, move can be implemented as swap, and the moved-from container might contain the old value of the moved-to container.
Implementing move as copy can actually happen if you use polymorphic allocators, as we do, and the allocator is not deemed to be part of the value of the object (and thus, assignment never changes the actual allocator being used). In this context, a move operation can detect whether both the source and the destination use the same allocator. If they use the same allocator the move operation can just move the data from the source. If they use different allocators then the destination must copy the source container.
In a lot of situations, move-construction and move-assignment can be implemented by delegating to swap - especially if no allocators are involved. There are several reasons for doing that:
swap has to be implemented anyway
developer efficiency because less code has to be written
runtime efficiency because fewer operations are executed in total
Here is an example for move-assignment. In this case, the move-from vector will not be empty, if the moved-to vector was not empty.
auto operator=(vector&& rhs) -> vector&
{
if (/* allocator is neither move- nor swap-aware */) {
swap(rhs);
} else {
...
}
return *this;
}
I left comments to this effect on other answers, but had to rush off before fully explaining. The result of a moved-from vector must always be empty, or in the case of move assignment, must be either empty or the previous object's state (i.e. a swap), because otherwise the iterator invalidation rules cannot be met, namely that a move does not invalidate them. Consider:
std::vector<int> move;
std::vector<int>::iterator it;
{
std::vector<int> x(some_size);
it = x.begin();
move = std::move(x);
}
std::cout << *it;
Here you can see that iterator invalidation does expose the implementation of the move. The requirement for this code to be legal, specifically that the iterator remains valid, prevents the implementation from performing a copy, or small-object-storage or any similar thing. If a copy was made, then it would be invalidated when the optional is emptied, and the same is true if the vector uses some kind of SSO-based storage. Essentially, the only reasonable possible implementation is to swap pointers, or simply move them.
Kindly view the Standard quotes on requirements for all containers:
X u(rv)
X u = rv
post: u shall be equal to the value that rv had before this construction
a = rv
a shall be equal to the value that rv had before this assignment
Iterator validity is part of the value of a container. Although the Standard does not unambiguously state this directly, we can see in, for example,
begin() returns an iterator referring to the first element in the
container. end() returns an iterator which is the past-the-end value
for the container. If the container is empty, then begin() == end();
Any implementation which actually did move from the elements of the source instead of swapping the memory would be defective, so I suggest that any Standard wordings saying otherwise is a defect- not least of which because the Standard is not in fact very clear on this point. These quotes are from N3691.