Guarantees of reordering of a vector - c++

Say I have this code:
#include <iostream>
#include <vector>
int main()
{
std::vector<int> vec {10, 15, 20};
auto itr = vec.begin();
vec.erase(itr);
for(const auto& element : vec)
{
std::cout << element << " ";
}
return 0;
}
This gives me 15 20 as expected. Now, cppreference says this about erase():
Invalidates iterators and references at or after the point of the
erase, including the end() iterator
Fair enough, but is that the only guarantee the standard gives about vector::erase()?
Is a vector allowed to reorder its elements after the erased iterator?
For example, are these conditions guaranteed to hold after the erase, which would mean that all elements after the erased position shifted one place to the left:
vec[0] == 15
vec[1] == 20
Or are implementations allowed to move values around as they see fit, and thus, create scenarios where vec[0] == 20 etc?
I would like a quote of the relevant part of the standard.

Let's start at the beginning:
23.2.3 Sequence containers
A sequence container organizes a finite set of objects, all of the same type, into a strictly linear arrangement.
The library provides four basic kinds of sequence containers: vector,
forward_list, list, and deque.
Emphasis on "a strictly linear arrangement". This is unambiguous.
This definition is followed by a table called "sequence container requirements", which describes erase() thusly:
a.erase(q) [ ... ]
Effects: Erases the element pointed to by q
Combined, this leaves no wiggle room for interpretation. The elements in a vector are always in "a strictly linear arrangement", so when one of them is erase()d, there's only one possible outcome: the remaining elements keep their relative order.
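As a minimal sketch of that outcome (the helper name erase_front_returns_next is hypothetical and only used for illustration), the following erases the first element and checks that the iterator returned by erase() refers to the new first element:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: erases the first element of vec.
// Returns true if the iterator returned by erase() refers to the
// element that is now first, i.e. the one that followed the erased one.
bool erase_front_returns_next(std::vector<int>& vec)
{
    if (vec.empty())
        return false;
    auto next = vec.erase(vec.begin());
    return next == vec.begin();
}
```

Called on a vector holding 10, 15, 20, this should leave 15 at index 0 and 20 at index 1.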

Technically, no, the standard doesn't write out a promise that it won't re-order elements when you least expect.
Practically, obviously it's not going to do that. That would be ridiculous.
Legally, you can probably take the "Effects" clause:
Erases the element pointed to by q
as having no other effects unless stated elsewhere (e.g. iterator invalidation, which follows from the erasure effect).

The two statements I found that I think guarantee it would be:
C++11 Standard
23.2.1
11 Unless otherwise specified (either explicitly or by defining a function in terms of other functions), invoking a container member function or passing a container as an argument to a library function shall not invalidate iterators to, or change the values of, objects within that container.
If you can't "change the values of" then you can't arbitrarily re-order elements (like swapping the end value with the erased one).
23.2.3
12 The iterator returned from a.erase(q) points to the element immediately following q prior to the element being erased. If no such element exists, a.end() is returned.
This implies that the conceptual erasure of an element is implemented by closing the physical gap from the right. Given the previous rule, conceptually closing the gap cannot be seen as changing the values of the remaining elements. This means the only conforming implementation is to shift the values down in order.
By way of explanation:
The Standard is dealing with the abstract concept not the actual implementation, although its statements do impact the implementation.
Conceptually erasing an element simply removes it and nothing more. So given the sequence:
3 5 7 4 2 9 (6 values)
If we erase the 3rd element what does that conceptually give us?
3 5 4 2 9 (5 values)
This must be true because of the first statement above:
23.2.1
11 Unless otherwise specified (either explicitly or by defining a function in terms of other functions), invoking a container member function or passing a container as an argument to a library function shall not invalidate iterators to, or change the values of, objects within that container.
If the implementation reordered the elements, say by swapping the erased element with the last element, that rule would be broken because we would end up with this:
3 5 9 4 2
Conceptually the resulting value to the right of the erased element has changed from a 4 to a 9, thus breaking the rule.
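To make the contrast concrete, here is a small sketch with two hypothetical helpers: erase_in_order uses std::vector::erase, while erase_swap_pop is the swap-with-last idiom that erase is not allowed to use, because it changes which value sits at the erased position. Both assume index is less than the vector's size.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Order-preserving removal, as performed by std::vector::erase.
// Assumes index < v.size().
std::vector<int> erase_in_order(std::vector<int> v, std::size_t index)
{
    v.erase(v.begin() + static_cast<std::vector<int>::difference_type>(index));
    return v;
}

// Swap-and-pop removal: constant time, but it reorders the elements.
// This is the behaviour the quoted rule forbids for erase().
// Assumes index < v.size().
std::vector<int> erase_swap_pop(std::vector<int> v, std::size_t index)
{
    std::swap(v[index], v.back());
    v.pop_back();
    return v;
}
```

With the sequence 3 5 7 4 2 9 and index 2, the first helper gives 3 5 4 2 9 and the second gives 3 5 9 4 2.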

Related

Is using erase(it) in a loop always safe for all containers and platforms?

I want to remove elements from a container (for now it is an unordered_set) that satisfy a certain condition:
for (auto it = windows.begin(); it != windows.end(); ) {
if ((*it)->closed() == 0)
it = windows.erase(it);
else
++it;
}
I know that erase(it) returns the position immediately following the erased element, but:
Does the standard guarantee that calling erase will not rearrange the remaining elements during the iteration? Is this always safe for all containers and all platforms? Say, there may be some magic implementation for a certain type of container on a certain platform.
The C++ standard requires that unordered_set::erase preserve the order of remaining elements, and return an iterator immediately following those being erased. Therefore, the loop you show is well-defined.
[unord.req]/14 ... The erase members shall invalidate only iterators and references to the erased elements, and preserve the relative order of the elements that are not erased.
[unord.req]/11, Table 91 a.erase(q) Erases the element pointed to by q. Returns the iterator immediately following q prior to the erasure.
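A minimal sketch of the same loop pattern on an std::unordered_set<int>, assuming a stand-in predicate (even values) in place of the closed() check from the question; the function name erase_even is hypothetical:

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical example: removes every even value using the erase(it) loop.
void erase_even(std::unordered_set<int>& values)
{
    for (auto it = values.begin(); it != values.end(); ) {
        if (*it % 2 == 0)
            it = values.erase(it);  // continue from the iterator erase() returns
        else
            ++it;
    }
}
```

Because erase preserves the relative order of the elements that are not erased, every remaining element should be visited exactly once.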

C++ formal requirement of behaviour of iterators over a container

I'm sure that I'm not alone in expecting that I could add several elements in some order to a vector or list, and then could use an iterator to retrieve those elements in the same order. For example, in:
#include <vector>
#include <cassert>
int main(int argc, char **argv)
{
using namespace std;
vector<int> v;
v.push_back(4);
v.push_back(10);
v.push_back(100);
auto i = v.begin();
assert(*i++ == 4);
assert(*i++ == 10);
assert(*i++ == 100);
return 0;
}
... all assertions should pass and the program should terminate normally (assuming that no std::bad_alloc exception is thrown during construction of the vector or adding the elements to it).
However, I'm having trouble reconciling this with any requirement in the C++ standard (I'm looking at C++11, but would like answers for other standards also if they are markedly different).
The requirement for begin() is just (23.2.1 para 6):
begin() returns an iterator referring to the first element in the container.
What I'm looking for is the requirement, or combination of requirements that in turn logically requires, that if i = v.begin(), then ++i shall refer to the second element in the vector (assuming that such an element exists) - or indeed, even the requirement that successive increments of an iterator will return each of the elements in the vector.
Edit:
A more general question is, what (if any) text in the standard requires that successfully incrementing an iterator obtained by calling begin() on a sequence (ordered or unordered) actually visits every element of the sequence?
There isn't anything straightforward in the standard stating that
if i = v.begin(), then ++i shall refer to the second element in the
vector.
However, for vector's iterators we can infer it from the following wording in the draft standard N4527 24.2.1/p5 In general [iterator.requirements.general]:
Iterators that further satisfy the requirement that, for integral
values n and dereferenceable iterator values a and (a + n), *(a + n) is equivalent to *(addressof(*a) + n), are called contiguous
iterators.
Now, std::vector's iterators satisfy this requirement; consequently we can infer that ++i is equivalent to i + 1, and thus *(i + 1) is equivalent to *(addressof(*i) + 1), which is indeed the second element in the vector due to its contiguous nature.
Edit:
There was indeed some ambiguity about random access iterators and contiguous storage containers in the C++11 and C++14 standards. The committee therefore decided to refine them by adding an extra category of iterators named contiguous iterators. You can find more information in the related proposal N3884.
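The contiguity property quoted above can be checked directly. This is only a sketch, and the function name is_laid_out_contiguously is hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical check: for every valid offset n, *(a + n) must be the same
// object as *(addressof(*a) + n), where a is v.begin().
bool is_laid_out_contiguously(const std::vector<int>& v)
{
    if (v.empty())
        return true;
    auto a = v.begin();
    const int* p = std::addressof(*a);
    const std::ptrdiff_t count = static_cast<std::ptrdiff_t>(v.size());
    for (std::ptrdiff_t n = 0; n < count; ++n) {
        if (std::addressof(*(a + n)) != p + n)
            return false;
    }
    return true;
}
```

For a std::vector<int> this is expected to return true for any contents.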
It looks to me like we need to put two separate parts of the standard together to get a solid requirement here. We can start with table 101, which requires that a[n] be equivalent to *(a.begin() + n) for sequence containers (specifically, basic_string, array, deque and vector) (and the same requirement for a.at(n), for the same containers).
Then we look at table 111 in [random.access.iterators], where it requires that the expression r += n be equivalent to:
{
difference_type m = n;
if (m >= 0)
while (m--)
++r;
else
while (m++)
--r;
return r;
}
[indentation added]
Between the two, these imply that for any valid n, *(begin() + n) refers to the same element as a[n] in the vector. Just in case you want to cover the last base I see open, let's cover the requirement that push_back actually appends to the collection. That's also in table 101: a.push_back(t) "Appends a copy of t" (again for basic_string, deque, list, and vector).
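Putting the two tables together, a sketch like the following (the helper nth_by_increment is hypothetical and assumes n is less than v.size()) reaches element n purely by repeated ++, which should agree with both v[n] and *(v.begin() + n):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: reaches element n by applying ++ to begin() n times,
// mirroring how r += n is specified for non-negative n.
// Assumes n < v.size().
int nth_by_increment(const std::vector<int>& v, std::size_t n)
{
    auto it = v.begin();
    while (n--)
        ++it;
    return *it;
}
```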
[C++14: 23.2.3/1]: A sequence container organizes a finite set of objects, all of the same type, into a strictly linear arrangement. [..]
I don't know how else you'd interpret it.
The specification isn't just in the iterators. It is also in the specification of the containers, and the operations that modify those containers.
The thing is, you are not going to find a single clause that says "incrementing begin() repeatedly will access all elements of a vector in order". You need to look at the specification of every operation on every container (since these define an order of elements in the container) and the specification of iterators (and operations on them) which is essentially that "incrementing moves to the next element in the order that operations on the container defined, until we pass the end". It is the combination of numerous clauses in the standard that give the end effect.
The general concepts, however, are ....
All containers maintain some range of zero or more elements. That range has three key properties: a beginning (corresponding to the first element in an order that is meaningful to the container), and an end (corresponding to the last element), and an order (which determines the sequence in which elements will be retrieved one after the other - i.e. defines the meaning of "next").
An iterator is an object that either references an element in a range, or has a "past the end" value. An iterator that references an element in the range other than the end (last), when incremented, will reference the next element. An iterator that references the end (last) element in the range, when incremented, will be an end (past the end) iterator.
The begin() method returns an iterator that references (or points to) the first element in the range (or an end iterator if the range has zero elements). The end() method returns an end iterator - one that corresponds to "one past the end of the range". That means, if an iterator is initialised using begin(), incrementing it repeatedly will move sequentially through the range until the end iterator is reached.
Then it is necessary to look at the specification for the various modifiers of the container - the member functions that add or remove elements. For example, push_back() is specified as adding an element to the end of the existing range for that container. It extends the range by adding an element to the end.
It is that combination of specifications - of iterators and of operations that modify containers - that guarantees the order. The net effect is that, if elements are added to a container in some order, then an iterator initialised using begin() will - when incremented repeatedly - reference the elements in the order in which they were placed in the container.
Obviously, some container modifiers are a bit more complicated - for example, std::vector's insert() is given an iterator, and adds elements there, shuffling subsequent elements to make room. However, the key point is that the modifiers place elements into the container in a defined order (or remove, in the case of operations like std::vector::erase()) and iterators will access elements in that defined order.
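As a rough illustration of modifiers defining the order that iterators then follow, this sketch (the function name iteration_order_after_edits is hypothetical) mixes push_back, insert and erase and records what a begin()-to-end() walk visits:

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: applies a few modifiers, then records the order
// in which incrementing an iterator from begin() visits the elements.
std::vector<int> iteration_order_after_edits()
{
    std::vector<int> v;
    v.push_back(4);               // 4
    v.push_back(100);             // 4 100
    v.insert(v.begin() + 1, 10);  // 4 10 100
    v.erase(v.begin());           // 10 100

    std::vector<int> seen;
    for (auto it = v.begin(); it != v.end(); ++it)
        seen.push_back(*it);
    return seen;
}
```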

STL list::splice - iterator validity

I'm reading how "list::splice" works and I don't understand something:
mylist1.splice (it, mylist2); // mylist1: 1 10 20 30 2 3 4
// mylist2 (empty)
// "it" still points to 2 (the 5th element)
mylist2.splice (mylist2.begin(),mylist1, it);
// mylist1: 1 10 20 30 3 4
// mylist2: 2
// "it" is now invalid.
it = mylist1.begin();
std::advance(it,3); // "it" points now to 30
mylist1.splice ( mylist1.begin(), mylist1, it, mylist1.end());
// mylist1: 30 3 4 1 10 20
In the first and third splice the iterator it is still valid, but why isn't it valid after the second splice?
According to the documentation:
Iterator validity
No changes on the iterators, pointers and references
related to the container before the call. The iterators, pointers and
references that referred to transferred elements keep referring to
those same elements, but iterators now iterate into the container the
elements have been transferred to.
thus it should still be valid
It's only a guess, but they might have written that to imply that it is now "invalid" in the sense that it is no longer a valid iterator of mylist1, but instead becomes a valid iterator of mylist2.
But still, and I guess you already knew that, it is a valid iterator, so the wording is misleading. You need to be careful, though, as it means that after the second splice-operation, for example, you can no longer do:
std::distance( mylist1.begin(), it );
but need to use
std::distance( mylist2.begin(), it );
as the first would be illegal.
The standard clearly defines it that way in:
23.3.5.5 list operations [list.ops]
void splice(const_iterator position, list& x, const_iterator i);
void splice(const_iterator position, list&& x, const_iterator i);
7 Effects: Inserts an element pointed to by i from list x before position and removes the element from x. The result is unchanged if position == i or position == ++i. Pointers and references to *i continue to refer to this same element but as a member of *this. Iterators to *i (including i itself) continue to refer to the same element, but now behave as iterators into *this, not into x.
So, if your compiler/STL invalidates the iterator, this is clearly a bug.
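A small sketch of what that paragraph promises, using a hypothetical helper splice_keeps_iterator: after a single-element splice, the iterator should still refer to the same object, but as a position in the destination list.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical helper: moves the element referred to by it from `from` to the
// front of `to`. Returns true if it still refers to the same object and is
// now the first position of `to`.
bool splice_keeps_iterator(std::list<int>& to, std::list<int>& from,
                           std::list<int>::iterator it)
{
    const int* before = &*it;
    to.splice(to.begin(), from, it);
    return &*it == before && std::distance(to.begin(), it) == 0;
}
```

On a conforming implementation, calling this with mylist1 holding 1 10 20 30 2 3 4, an empty mylist2, and it referring to 2 should return true.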
Apparently (since I'm using MSVC2012) the behavior is different:
http://msdn.microsoft.com/en-us/library/72fb8wzd.aspx
In all cases, only iterators or references that point at spliced
elements become invalid.
Thus when I have iterators to elements that get moved from one container to another, these iterators become invalid.
I'd be interested in knowing if this behavior is the standard one, though.

container.erase(first,last) where first == last in STL containers

Is there a defined behavior for container.erase(first,last) when first == last in the STL, or is it undefined?
Example:
std::vector<int> v(1,1);
v.erase(v.begin(),v.begin());
std::cout << v.size(); // 1 or 0?
If there is a Standard Library specification document that has this information I would appreciate a reference to it.
The behavior is well defined.
It is a no-op (no operation). It does not erase anything from the container, because the end of the range is the same as its beginning.
The relevant quotes from the Standard are as follows:
C++03 Standard: 24.1 Iterator requirements and
C++11 Standard: 24.2.1 Iterator requirements
Para 6 & 7 for both:
An iterator j is called reachable from an iterator i if and only if there is a finite sequence of applications of the expression ++i that makes i == j. If j is reachable from i, they refer to the same container.
Most of the library’s algorithmic templates that operate on data structures have interfaces that use ranges. A range is a pair of iterators that designate the beginning and end of the computation. A range [i, i) is an empty range; in general, a range [i, j) refers to the elements in the data structure starting with the one pointed to by i and up to but not including the one pointed to by j. Range [i, j) is valid if and only if j is reachable from i. The result of the application of functions in the library to invalid ranges is undefined.
That would erase nothing at all, just like other algorithms that operate on half-open [first, last) ranges.
Even if the container is empty I think that would still work because begin() == end().
Conceptually, there is an ordinary loop from begin to end, with a simple loop condition that checks if the iterator is end already, like this:
void erase (iterator from, iterator to) {
...
while (from != to) erase (from++);
...
}
(Implementations may vary, however.) As you can see, if from == to, the loop body never executes.
It is perfectly defined. It removes all elements from first to last, including first and excluding last. If there are no elements in this range (when first == last), then how much are removed? You guessed it, none.
Though I'm not sure what happens if first comes after last, I suppose this will invoke undefined behaviour.
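The empty-range case from the question can be checked with a short sketch; the helper name size_after_empty_erase is hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: erases the empty range [begin, begin) from a copy
// of the vector and reports the resulting size.
std::size_t size_after_empty_erase(std::vector<int> v)
{
    v.erase(v.begin(), v.begin());  // [first, first) contains no elements
    return v.size();
}
```

For the vector from the question, std::vector<int> v(1,1), the size should stay 1.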

Does std::vector::swap invalidate iterators?

If I swap two vectors, will their iterators remain valid, now just pointing to the "other" container, or will the iterator be invalidated?
That is, given:
using namespace std;
vector<int> x(42, 42);
vector<int> y;
vector<int>::iterator a = x.begin();
vector<int>::iterator b = x.end();
x.swap(y);
// a and b still valid? Pointing to x or y?
It seems the standard mentions nothing about this:
[n3092 - 23.3.6.2]
void swap(vector<T,Allocator>& x);
Effects:
Exchanges the contents and capacity()
of *this with that of x.
Note that since I'm on VS 2005 I'm also interested in the effects of iterator debug checks etc. (_SECURE_SCL)
The behavior of swap has been clarified considerably in C++11, in large part to permit the Standard Library algorithms to use argument dependent lookup (ADL) to find swap functions for user-defined types. C++11 adds a swappable concept (C++11 §17.6.3.2[swappable.requirements]) to make this legal (and required).
The text in the C++11 language standard that addresses your question is the following text from the container requirements (§23.2.1[container.requirements.general]/8), which defines the behavior of the swap member function of a container:
Every iterator referring to an element in one container before the swap shall refer to the same element in the other container after the swap.
It is unspecified whether an iterator with value a.end() before the swap will have value b.end() after the swap.
In your example, a is guaranteed to be valid after the swap, but b is not because it is an end iterator. The reason end iterators are not guaranteed to be valid is explained in a note at §23.2.1/10:
[Note: the end() iterator does not refer to any element, so it may be
invalidated. --end note]
This is the same behavior that is defined in C++03, just substantially clarified. The original language from C++03 is at C++03 §23.1/10:
no swap() function invalidates any references, pointers, or iterators referring to the elements of the containers being swapped.
It's not immediately obvious in the original text, but the phrase "to the elements of the containers" is extremely important, because end() iterators do not point to elements.
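A minimal sketch of the guaranteed part, using a hypothetical helper iterator_follows_elements: an iterator to the first element of x, taken before the swap, should compare equal to y.begin() afterward. The end iterator is deliberately not tested, since its fate is unspecified.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: takes an iterator to the first element of x, swaps
// x and y, and reports whether that iterator now designates y's first element.
bool iterator_follows_elements(std::vector<int>& x, std::vector<int>& y)
{
    if (x.empty())
        return true;  // no element iterator to track
    std::vector<int>::iterator a = x.begin();
    x.swap(y);
    return a == y.begin();
}
```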
Swapping two vectors does not invalidate the iterators, pointers, and references to their elements (C++03, 23.1.11).
Typically the iterator would contain knowledge of its container, and the swap operation maintains this for a given iterator.
In VC++ 10 the vector container is managed using this structure in <xutility>, for example:
struct _Container_proxy
{ // store head of iterator chain and back pointer
_Container_proxy()
: _Mycont(0), _Myfirstiter(0)
{ // construct from pointers
}
const _Container_base12 *_Mycont;
_Iterator_base12 *_Myfirstiter;
};
All iterators that refer to the elements of the containers remain valid
As for Visual Studio 2005, I have just tested it.
I think it should always work, as the vector::swap function even contains an explicit step to swap everything:
// vector-header
void swap(_Myt& _Right)
{ // exchange contents with _Right
if (this->_Alval == _Right._Alval)
{ // same allocator, swap control information
#if _HAS_ITERATOR_DEBUGGING
this->_Swap_all(_Right);
#endif /* _HAS_ITERATOR_DEBUGGING */
...
The iterators point to their original elements in the now-swapped vector object. (That is, with regard to the OP's example, they first pointed to elements in x; after the swap they point to elements in y.)
Note that in the n3092 draft the requirement is laid out in §23.2.1/9 :
Every iterator referring to an
element in one container before the
swap shall refer to the same element
in the other container after the swap.