C++ std::vector clear all elements - c++

Consider a std::vector:
std::vector<int> vec;
vec.push_back(1);
vec.push_back(2);
Would vec.clear() and vec = std::vector<int>() do the same job? What about the deallocation in the second case?

vec.clear() clears all elements from the vector, leaving you with a guarantee of vec.size() == 0.
vec = std::vector<int>() calls the copy assignment operator (move assignment since C++11), which replaces the contents of vec with those of the right-hand operand. Here that operand is a newly constructed empty vector<int>, so the effect on the elements is the same as vec.clear(). The only difference is in the capacity: clear() leaves it unchanged, while reassigning typically resets it.
The old elements are deallocated properly just as they would with clear().
Note that vec.clear() is always at least as fast and, without the optimizer doing its work, most likely faster than constructing a new vector and assigning it to vec.

They are different:
clear is guaranteed to not change capacity.
Move assignment is not guaranteed to change capacity to zero, but it may and will in a typical implementation.
The clear guarantee is by this rule:
No reallocation shall take place during insertions that happen after a call to reserve() until the time when an insertion would make the size of the vector greater than the value of capacity()
Postconditions of clear:
Erases all elements in the container. Post: a.empty() returns true
Postconditions of assignment:
a = rv; a shall be equal to the value that rv had before this assignment
a = il; Assigns the range [il.begin(),il.end()) into a. All existing elements of a are either assigned to or destroyed.
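The capacity difference described above can be checked with a short sketch. The helper names are made up for illustration. The capacity after reassignment depends on the implementation, so the second helper only reports it and does not assume it is zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: clear() empties the vector but keeps its capacity.
bool clear_keeps_capacity() {
    std::vector<int> vec;
    vec.push_back(1);
    vec.push_back(2);
    const std::size_t before = vec.capacity();
    vec.clear();
    return vec.empty() && vec.capacity() == before;
}

// Hypothetical helper: reports the capacity after assigning an empty temporary.
// Typical implementations report 0 here, but that is not promised.
std::size_t capacity_after_reassign() {
    std::vector<int> vec;
    vec.push_back(1);
    vec.push_back(2);
    vec = std::vector<int>();
    assert(vec.empty());
    return vec.capacity();
}
```

Both helpers leave the vector empty; only the retained storage may differ.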

Related

Is it safe to compare to pointer of std::vector to check equality?

At some point I created a pointer to a std::vector, then performed some push_back, reserve, and resize operations on that vector. After such operations, is it safe to compare the pointer to the address of that vector to check whether the pointer still points to it, given that some memory might have been reallocated?
For example:
std::vector<int> vec;
vector<int>* pVec = &vec;
vec.reserve(10000);
assert(pVec == &vec);
vec = anotherVec;
assert(pVec == &vec);
What is more, is it safe to compare a pointer to the first element of the vector?
For example:
std::vector<int> vec(1,0);
int* p = &vec[0];
// some operation here
assert(p == &vec[0]);
From my own tests, it seems that the first situation is safe, while the second is not, but I can't be sure.
std::vector<int> vec;
vector<int>* pVec = &vec;
vec.reserve(10000);
assert(pVec == &vec);
is safe.
std::vector<int> vec(1,0);
int* p = &vec[0];
// some operation here
assert(p == &vec[0]);
is not safe.
The first block is safe since the address of vec will not change even when its contents change.
The second block is not safe since the address of vec[0] may change, for example when vec reallocates its storage because you push_back elements to it.
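A minimal sketch of the two situations, with illustrative function names. The old element address is stored as an integer so that no dangling pointer is ever compared. The second function assumes the default allocator, which obtains a fresh block whenever the capacity has to grow.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// The vector object itself never moves, whatever happens to its contents.
bool vector_object_address_is_stable() {
    std::vector<int> vec;
    std::vector<int>* pVec = &vec;
    vec.reserve(10000);
    std::vector<int> anotherVec(5, 7);
    vec = anotherVec;
    vec.push_back(1);
    return pVec == &vec;
}

// The element storage is expected to move when the capacity has to grow.
bool element_storage_moves_on_growth() {
    std::vector<int> vec(1, 0);
    const std::uintptr_t before = reinterpret_cast<std::uintptr_t>(&vec[0]);
    vec.reserve(vec.capacity() + 1000);  // more than the current capacity
    const std::uintptr_t after = reinterpret_cast<std::uintptr_t>(&vec[0]);
    return before != after;
}
```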
it seems that the first situation is safe, while the second is not
That's right. In the first "situation", the vec object itself stays wherever it is in memory regardless of the reserve call, which might move the managed elements to another area of dynamic memory. It's because elements can be moved that the pointers may not compare equal in the second scenario.
The second situation is safe as long as no relocation takes place. If you know the size you will need in advance and use reserve() before you get the pointer it is perfectly safe and you save a little bit of performance (one less level of indirection).
However, any addition, with push_back() for example, might go beyond the allocated space and invalidate your pointer. Do not count on the vector growing in place: when the capacity is exceeded, the elements are generally moved to a newly allocated block.
For that matter, instead of taking a pointer you could take an iterator, because an iterator on a vector behaves much like a pointer (with no performance impact) and offers more type safety. Note that a reallocation invalidates iterators just as it invalidates pointers.
vector<int>* pVec = &vec; takes the address of the std::vector<int> object, which stays valid until vec goes out of scope.
vec = anotherVec; does not change the address of vec, because it only calls the operator= of std::vector.
So both assert(pVec == &vec); checks pass.
In the case of int* p = &vec[0]; it depends: see Iterator invalidation.
The first case is indeed safe as there is no danger of the vector object's address changing. The second case is safe as long as no reallocation happens (reallocations can be traced using the std::vector::capacity member function), otherwise it's either undefined or implementation-defined depending on the version of the language. For more information consult this answer, since the same restrictions apply in that case.
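Tracing reallocations through capacity() could look like the sketch below. The helper push_back_without_reallocation is a hypothetical name, not a library function. It relies on the rule that push_back reallocates only when the new size would exceed the current capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends value and reports whether the storage stayed put.
bool push_back_without_reallocation(std::vector<int>& vec, int value) {
    const std::size_t before = vec.capacity();
    vec.push_back(value);
    return vec.capacity() == before;
}

bool pointer_survives_reserved_growth() {
    std::vector<int> vec(1, 0);
    vec.reserve(4);    // capacity() >= 4 from here on
    int* p = &vec[0];  // taken after reserve()
    bool stable = true;
    for (int i = 1; i < 4; ++i) {
        stable = push_back_without_reallocation(vec, i) && stable;
    }
    // size() is now 4, still within the reserved capacity.
    return stable && p == &vec[0];
}
```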

Capacity of the vector from which data was moved [duplicate]

Is it mandatory that the capacity of the std::vector is zero after moving data from it? Assume that the memory allocators of source and destination vectors are always matching.
std::vector< int > v{1, 2, 3};
assert(0 < v.capacity());
std::vector< int > w;
w = std::move(v);
assert(0 == v.capacity());
It is said here that the move assignment operator leaves the moved-from right-hand-side vector in a valid but unspecified state. But nowhere is it stated that a vector should perform additional memory allocations during a move assignment. Another note: a vector has a contiguous memory region as its underlying storage.
If your question had been about move construction of a vector, the answer would be easy: the source vector is left empty after the move. This is because of the requirement in
Table 99 - Allocator-aware container requirements
Expression:
X(rv)
X u(rv)
Requires: move construction of A shall not exit via an exception.
post: u shall have the same elements as rv had before this construction; the value of u.get_allocator() shall be the same as the value of rv.get_allocator() before this construction.
Complexity: constant
(the A in the requirements clause is the allocator type)
The constant complexity leaves no option but to steal resources from the source vector, which means for it to be in a valid, but unspecified state you'd need to leave it empty, and capacity() will equal zero.
The answer is considerably more complicated in case of a move assignment. The same Table 99 lists the requirement for move assignment as
Expression:
a = rv
Return type:
X&
Requires: If allocator_traits<allocator_type>::propagate_on_container_move_assignment::value is false, T is MoveInsertable into X and MoveAssignable. All existing elements of a are either move assigned to or destroyed.
post: a shall be equal to the value that rv had before this assignment.
Complexity: linear
There are different cases to evaluate here.
First, say allocator_traits<allocator_type>::propagate_on_container_move_assignment::value == true, then the allocator can also be move assigned. This is mentioned in §23.2.1/8
... The allocator may be replaced only via assignment or swap(). Allocator replacement is performed by copy assignment, move assignment, or swapping of the allocator only if allocator_traits<allocator_type>::propagate_on_container_copy_assignment::value, allocator_traits<allocator_type>::propagate_on_container_move_assignment::value, or allocator_traits<allocator_type>::propagate_on_container_swap::value is true within the implementation of the corresponding container operation.
So the destination vector will destroy its elements, the allocator from the source is moved and the destination vector takes ownership of the memory buffer from the source. This will leave the source vector empty, and capacity() will equal zero.
Now let's consider the case where allocator_traits<allocator_type>::propagate_on_container_move_assignment::value == false. This means the allocator from the source cannot be move assigned to the destination vector. So you need to check the two allocators for equality before determining what to do.
If dest.get_allocator() == src.get_allocator(), then the destination vector is free to take ownership of the memory buffer from the source because it can use its own allocator to deallocate the storage.
Table 28 - Allocator requirements
Expression:
a1 == a2
Return type:
bool
returns true only if storage allocated from each can be deallocated via the other. ...
The sequence of operations performed is the same as the first case, except the source allocator is not move assigned. This will leave the source vector empty, and capacity() will equal zero.
In the last case, if allocator_traits<allocator_type>::propagate_on_container_move_assignment::value == false and dest.get_allocator() != src.get_allocator(), then the source allocator cannot be moved, and the destination allocator is unable to deallocate the storage allocated by the source allocator, so it cannot steal the memory buffer from source.
Each element from the source vector must be either move inserted or move assigned to the destination vector. Which operation gets done depends on the existing size and capacity of the destination vector.
The source vector retains ownership of its memory buffer after the move assignment, and it is up to the implementation to decide whether to deallocate the buffer or not, and the vector will most likely have capacity() greater than 0.
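The non-propagating cases can be sketched with a small stateful allocator. TaggedAlloc and move_assignment_steals_buffer are hypothetical names invented for this illustration. The sketch assumes a typical implementation that takes over the buffer when the allocators compare equal; with unequal allocators the buffer cannot be taken over, so the elements have to be moved one by one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stateful allocator: two instances are interchangeable only if
// their tags match, and it is not propagated on move assignment.
template <class T>
struct TaggedAlloc {
    using value_type = T;
    using propagate_on_container_move_assignment = std::false_type;
    using is_always_equal = std::false_type;

    int tag;

    explicit TaggedAlloc(int t = 0) : tag(t) {}
    template <class U>
    TaggedAlloc(const TaggedAlloc<U>& other) : tag(other.tag) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) {
    return a.tag == b.tag;
}
template <class T, class U>
bool operator!=(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) {
    return !(a == b);
}

using TaggedVec = std::vector<int, TaggedAlloc<int>>;

// Returns true if move assignment handed the source buffer to the destination.
bool move_assignment_steals_buffer(int src_tag, int dst_tag) {
    TaggedAlloc<int> src_alloc(src_tag);
    TaggedAlloc<int> dst_alloc(dst_tag);
    TaggedVec src({1, 2, 3}, src_alloc);
    TaggedVec dst(dst_alloc);
    const std::uintptr_t src_buffer =
        reinterpret_cast<std::uintptr_t>(src.data());
    dst = std::move(src);
    assert(dst.size() == 3 && dst[0] == 1 && dst[2] == 3);
    return reinterpret_cast<std::uintptr_t>(dst.data()) == src_buffer;
}
```

Calling move_assignment_steals_buffer(1, 1) exercises the equal-allocator case and move_assignment_steals_buffer(1, 2) the unequal one.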
To ensure you do not run into undefined behavior when trying to reuse a vector that has been move assigned from, you should first call the clear() member function. This can be safely done since vector::clear has no pre-conditions, and will return the vector to a valid and specified state.
Also, vector::capacity has no pre-conditions either, so you can always query the capacity() of a moved from vector.
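A minimal sketch of reusing a moved-from vector along those lines (the function name is illustrative):

```cpp
#include <cassert>
#include <utility>
#include <vector>

std::vector<int> reuse_after_move() {
    std::vector<int> v{1, 2, 3};
    std::vector<int> w;
    w = std::move(v);
    assert(w.size() == 3);
    v.clear();        // no preconditions: v is now known to be empty
    v.push_back(42);  // fine, whatever capacity() happened to be
    return v;
}
```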
The state of the moved-from vector is unspecified but valid after the move, as you found out.
This means that it could really be in any valid state, in particular you can't assume that its capacity will be 0. It will probably be zero, and that would make a lot of sense, but that's not at all guaranteed.
But again, in practice if you don't care all that much about the standard, I suppose that you could rely on the capacity being 0. Due to the constraints on move operations, the vector move constructor pretty much has to steal memory from the moved-from one, leaving it empty. That's what will happen in almost all cases/implementations, but that's just not required.
A particularly twisted implementation could decide to reserve some elements in the moved-from vector just to mess with you. That technically wouldn't break any requirement.

Vector Resize vs Reserve for nested vectors

I'm trying to use another project's code and they have structs of this form:
struct sparse_array {
    std::vector<unsigned int> idxs;
    std::vector<double> values;
    void add(unsigned int idx, double value) {
        idxs.push_back(idx);
        values.push_back(value);
    }
};
struct data {
    std::vector<sparse_array> cols, rows;
};
For my code, I tried using the following lines:
data prob;
prob.cols.reserve(num_cols);
prob.rows.reserve(num_rows);
// Some loop that calls
prob.cols[i].add(idx, value);
prob.rows[i].add(idx, value);
And when I output the values prob.rows[i].values[j] to a file, I get all zeros. But when I use resize instead of reserve, I get the actual values that I read in. Can someone explain this?
Function reserve() simply allocates a contiguous region of memory big enough to hold the number of items you specify and moves the vector's old content into this new block, which makes sure no more reallocations for the vectors' storage will be done upon insertions as long as the specified capacity is not exceeded. This function is used to reduce the number of reallocations (which also invalidate iterators), but does not insert any new items at the end of your vector.
From the C++11 Standard, Paragraph 23.3.6.3/1 about reserve():
A directive that informs a vector of a planned change in size, so that it can manage the storage
allocation accordingly. After reserve(), capacity() is greater or equal to the argument of reserve if reallocation happens; and equal to the previous value of capacity() otherwise. Reallocation happens at this point if and only if the current capacity is less than the argument of reserve(). If an exception is thrown other than by the move constructor of a non-CopyInsertable type, there are no effects.
Notice that by doing prob.cols[i].add(idx, value); you get undefined behavior, since the vector is still empty after reserve() and i is therefore an out-of-bounds index.
On the other hand, function resize() does insert items at the end of your vector, so that the final size of the vector will be the one you specified (this means it can even erase elements, if you specify a size smaller than the current one). If you specify no second argument to a call to resize(), the newly inserted items will be value-initialized. Otherwise, they will be copy-initialized from the value you provide.
From the C++11 Standard, Paragraph 23.3.6.3/9 about resize():
If sz <= size(), equivalent to erase(begin() + sz, end());. If size() < sz, appends
sz - size() value-initialized elements to the sequence.
So to sum it up, the reason why accessing your vector after invoking resize() gives the expected result is that items are actually being added to the vector. On the other hand, since the call to reserve() does not add any item, subsequent accesses to non-existing elements will give you undefined behavior.
If the vector is empty, then std::vector::resize(n) expands the content of this vector by inserting n new elements at the end. std::vector::reserve(n) only reallocates the memory block that your vector uses for storing its elements so that it's big enough to hold n elements.
Then when you call prob.cols[i], you are trying to access the element at index i. In case you used reserve before, this results in accessing the memory where no element resides yet, which produces the undefined behavior.
So just use resize in this case :)
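To make the difference concrete, here is a small sketch built on the sparse_array struct from the question. The two function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct sparse_array {
    std::vector<unsigned int> idxs;
    std::vector<double> values;
    void add(unsigned int idx, double value) {
        idxs.push_back(idx);
        values.push_back(value);
    }
};

// reserve() only sets storage aside: there are still no elements to index.
std::size_t size_after_reserve(std::size_t n) {
    std::vector<sparse_array> cols;
    cols.reserve(n);
    assert(cols.capacity() >= n);
    return cols.size();  // still 0, so cols[i] would be out of bounds
}

// resize() creates n value-initialized elements that can be indexed.
double value_after_resize(std::size_t n, std::size_t i, double value) {
    std::vector<sparse_array> cols;
    cols.resize(n);
    cols.at(i).add(0u, value);  // at() would throw if i >= n
    return cols[i].values[0];
}
```

Using at() instead of operator[] while debugging turns an out-of-bounds access into an exception rather than undefined behavior.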

std::vector elements initializing

std::vector<int> v1(1000);
std::vector<std::vector<int>> v2(1000);
std::vector<std::vector<int>::const_iterator> v3(1000);
How are the elements of these three vectors initialized?
About int, I test it and I saw that all elements become 0. Is this standard? I believed that primitives remain undefined. I created a vector with 300000000 elements, gave them non-zero values, deleted it and recreated it, to rule out the OS clearing memory for data safety. The elements of the recreated vector were 0 too.
What about the iterator? Is there an initial value (0) for the default constructor, or does the initial value remain undefined? When I check this, the iterators point to 0, but this could be the OS.
When I created a special object to track constructors, I saw that for the first object the vector ran the default constructor and for all the others it ran the copy constructor. Is this standard?
Is there a way to completely avoid initialization of elements? Or must I create my own vector? (Oh my God, I always say NOT ANOTHER VECTOR IMPLEMENTATION)
I ask because I use ultra huge sparse matrices with parallel processing, so I cannot use push_back(), and of course I don't want useless initialization when I will change the value later anyway.
You are using this constructor (for std::vector<>):
explicit vector (size_type n, const T& value= T(), const Allocator& = Allocator());
Which has the following documentation:
Repetitive sequence constructor: Initializes the vector with its content set to a repetition, n times, of copies of value.
Since you do not specify the value, it takes the default value of the parameter, T(), which is int() in your case, so all elements will be 0.
They are value-initialized.
About int, I test it and I saw that all elements become 0. Is this standard? I believed that primitives remain undefined.
No, an uninitialized int has an indeterminate value. These are value-initialized, i.e.,
int i; // default-initialized: indeterminate value
int k = int(); // value-initialized: value == 0
In C++11 the specification for the constructor vector::vector(size_type n) says that n elements are default-inserted. This is being defined as an element initialized by the expression allocator_traits<Allocator>::construct(m, p) (where m is of the allocator type and p a pointer to the type stored in the container). For the default allocator this expression is ::new (static_cast<void*>(p)) T() (see 20.6.8.2). This value-initializes each element.
The elements of a vector are value-initialized, which in the case of POD types means zero-initialized. There's no way to avoid it with a standard vector and the default allocator.
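The constructor-tracking experiment from the question can be repeated with a sketch like the following. Tracked is a hypothetical type, and the counts assume C++11 or later, where vector(n) default-inserts each element instead of copying one prototype.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tracking type, similar in spirit to the one the question mentions.
struct Tracked {
    static int default_constructed;
    static int copy_constructed;
    int value;
    Tracked() : value(0) { ++default_constructed; }
    Tracked(const Tracked& other) : value(other.value) { ++copy_constructed; }
};
int Tracked::default_constructed = 0;
int Tracked::copy_constructed = 0;

// vector<int>(n) value-initializes its elements, so every int is 0.
bool ints_are_zeroed(std::size_t n) {
    std::vector<int> v(n);
    for (int x : v) {
        if (x != 0) return false;
    }
    return true;
}

// Since C++11: n default constructions and no copies.
bool sized_constructor_default_constructs_each(std::size_t n) {
    Tracked::default_constructed = 0;
    Tracked::copy_constructed = 0;
    std::vector<Tracked> v(n);
    return Tracked::default_constructed == static_cast<int>(n) &&
           Tracked::copy_constructed == 0;
}
```

Under a pre-C++11 library, the same experiment would instead show one default construction followed by copies, which matches what the question observed.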

Does push_back() always increase a vector's size?

I have a piece of code which creates a std::vector<T> with a known size:
std::vector<T> vectorOfTs(n);
Does calling push_back increase the size to n+1?
vectorOfTs.push_back(T());
Yes; note that vector<T>.capacity() is different from vector<T>.size(). The latter denotes the number of elements currently in the vector while the former represents the number of items that fit in the space currently allocated for the vector's internal buffer.
Almost. If there are no exceptions, then size() will increment.
push_back(T()) could also throw an exception at various stages: see here, or in summary:
T() construction, in which case no call to push_back takes place, and size() is unaffected
if the vector needs to increase the capacity, that may throw, in which case size() is unaffected
the vector element will be copy or move constructed using std::allocator_traits<A>::construct(m, p, v);, if A is std::allocator<T>, then this will call placement-new, as by ::new((void*)p) T(v): if any of this throws the vector's size() is unaffected, unless the move constructor isn't noexcept and does throw, in which case the effects are unspecified
the vector update is then complete: size() will have incremented and the value will be in the vector (even if T::~T())
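The "size() is unaffected" case for a copyable element type can be sketched as follows. Fragile is a hypothetical type whose copy constructor throws on demand. It declares no move constructor, so the vector always copies it.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical element type whose copy constructor can be told to throw.
struct Fragile {
    static bool fail_copies;
    int value;
    explicit Fragile(int v) : value(v) {}
    Fragile(const Fragile& other) : value(other.value) {
        if (fail_copies) throw std::runtime_error("copy failed");
    }
};
bool Fragile::fail_copies = false;

// Returns the size after a push_back that throws while copying the new element.
std::size_t size_after_failed_push_back() {
    std::vector<Fragile> v;
    v.push_back(Fragile(1));
    v.push_back(Fragile(2));
    Fragile extra(3);
    Fragile::fail_copies = true;
    try {
        v.push_back(extra);  // the copy throws
    } catch (const std::runtime_error&) {
        // push_back had no effect: v still holds the first two elements
    }
    Fragile::fail_copies = false;
    return v.size();
}
```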
Yes. If you instead want to reserve space, call reserve(), e.g.:
std::vector<T> vectorOfTs;
vectorOfTs.reserve(n);
// now size() == 0, capacity() >= n
vectorOfTs.push_back(T());
// now size() == 1
Yes.
std::vector<T> vectorOfTs(n);
In the above statement, you are actually constructing n new instances of type T, so the vector vectorOfTs now contains n elements. With the pre-C++11 constructor signature shown below, a single default-constructed T() is copied n times; since C++11, vector(size_type n) instead value-initializes each element in place.
explicit vector ( size_type n, const T& value= T(), const Allocator& = Allocator() );
So, when you push back another element into the vector, its size becomes n+1.
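Both variants discussed in these answers, as a short sketch with int standing in for T (the function names are illustrative):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sized construction followed by push_back: the size goes from n to n + 1.
std::size_t size_after_sized_construct_and_push(std::size_t n) {
    std::vector<int> vectorOfTs(n);  // size() == n
    vectorOfTs.push_back(int());
    return vectorOfTs.size();
}

// reserve() followed by push_back: the size goes from 0 to 1.
std::size_t size_after_reserve_and_push(std::size_t n) {
    std::vector<int> vectorOfTs;
    vectorOfTs.reserve(n);  // size() == 0, capacity() >= n
    vectorOfTs.push_back(int());
    return vectorOfTs.size();
}
```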