std::vector<T>::assign using a subrange valid? - c++

I want to convert a vector into a sub-range of that vector, for example by removing the first and last values. Is the use of the assign member function valid in this context?
std::vector<int> data = {1, 2, 3, 4};
data.assign(data.begin() + 1, data.end() - 1);
// data is hopefully {2, 3}
Cppreference states that
All iterators, pointers and references to the elements of the container are invalidated. The past-the-end iterator is also invalidated.
This invalidation, however, doesn't appear to happen until the end of assign.
To be safe, I could just go with the following but it seems more verbose:
std::vector<int> data = {1, 2, 3, 4};
data = std::vector<int>{data.begin() + 1, data.end() - 1};
// data is now {2, 3}

The __invalidate_all_iterators function that your link refers to is merely a debugging tool. It doesn't "cause" the iterators to be invalidated; it effectively reports that the iterators have been invalidated by the previous actions. This debugging tool might not catch a bug caused by this assignment.
It is a precondition of assign that the iterators do not point into the container being assigned to. A precondition violation results in undefined behaviour.
Standard quote (latest draft):
[sequence.reqmts] a.assign(i,j) Expects: T is Cpp17EmplaceConstructible into X from *i and assignable from *i.
For vector, if the iterator does not meet the forward iterator requirements ([forward.iterators]), T is also Cpp17MoveInsertable into X.
Neither i nor j are iterators into a.
Your safe alternative is correct.
If you want to avoid reallocation (keeping in mind that there will be unused space left), and if you want to avoid copying (which is important for complex types but doesn't matter for int), then the following should be efficient:
int begin_offset = 1;
int end_offset = 1;
if (begin_offset)
    std::move(data.begin() + begin_offset, data.end() - end_offset, data.begin());
data.erase(data.end() - end_offset - begin_offset, data.end());
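The move-then-erase approach can be wrapped in a small helper. This is only a sketch: the name keep_subrange and its offset parameters are hypothetical (nothing like it exists in the standard library), and it assumes the two offsets together do not exceed the vector's size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: keep only the elements in
// [begin_offset, size - end_offset), reusing the existing storage.
template <typename T>
void keep_subrange(std::vector<T>& data, std::size_t begin_offset, std::size_t end_offset) {
    // Assumed precondition: the two offsets do not overlap.
    assert(begin_offset + end_offset <= data.size());

    const auto first = data.begin() + static_cast<std::ptrdiff_t>(begin_offset);
    const auto last = data.end() - static_cast<std::ptrdiff_t>(end_offset);

    // The std::move algorithm accepts overlapping ranges as long as the
    // destination starts before the source, which holds when begin_offset > 0.
    // With begin_offset == 0 nothing needs to move.
    const auto new_end = (begin_offset != 0) ? std::move(first, last, data.begin()) : last;

    // Drop the moved-from tail; erase() does not reallocate.
    data.erase(new_end, data.end());
}

// Example usage (not called automatically).
void keep_subrange_example() {
    std::vector<int> data = {1, 2, 3, 4};
    keep_subrange(data, 1, 1);
    assert((data == std::vector<int>{2, 3}));
}
```

Taking offsets rather than iterators keeps the call site from passing iterators into the vector, which is what made the assign() version invalid in the first place.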

Related

Does the STL offer something to find the last element for which a predicate is true in a range defined by two non-reverse iterators? [duplicate]

I am working on an exercise where I have a vector and I am writing my own reverse algorithm by using a reverse and a normal (forward) iterator to reverse the content of the vector. However, I am not able to compare the iterators.
int vals[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 };
vector<int> numbers(vals, vals + 10);
vector<int>::iterator start = numbers.begin();
vector<int>::reverse_iterator end = numbers.rend();
I have a previous algorithm for reversing the vector by using two iterators; however, in this task I am not able to compare them using the != operator. My guess would be to compare the underlying pointers or indexes in the vector with each other, but how do I get the pointers/indexes?
Do a comparison using the iterator returned by base(): it == rit.base() - 1.
You can convert a reverse_iterator to iterator by calling base().
Be careful, however, as there are some caveats. @Matthieu M.'s comment is particularly helpful:
Note: base() actually returns an iterator to the element following the
element that the reverse_iterator was pointing to.
Check out http://en.cppreference.com/w/cpp/iterator/reverse_iterator/base
rit.base()
returns a 'normal' iterator.
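For the exercise in the question, the off-by-one relationship between a reverse_iterator and its base() can be turned into a loop condition. The following is a minimal sketch, not the only way to do it; the function name reverse_with_mixed_iterators is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: reverse a vector in place using one forward
// iterator and one reverse iterator.
void reverse_with_mixed_iterators(std::vector<int>& numbers) {
    std::vector<int>::iterator fwd = numbers.begin();
    std::vector<int>::reverse_iterator rev = numbers.rbegin();

    // rev refers to the element just before rev.base(), so the two iterators
    // have met or crossed once rev.base() - fwd is 1 or less.
    while (rev.base() - fwd > 1) {
        std::iter_swap(fwd, rev);  // swaps the two referenced elements
        ++fwd;
        ++rev;
    }
}

// Example usage (not called automatically).
void reverse_with_mixed_iterators_example() {
    std::vector<int> numbers = {1, 2, 3, 4, 5};
    reverse_with_mixed_iterators(numbers);
    assert((numbers == std::vector<int>{5, 4, 3, 2, 1}));
}
```

Comparing distances rather than computing rev.base() - 1 avoids forming an iterator before begin() when the vector is empty.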
You can use (&*start == &*(end - 1)) to directly compare the address that the iterator is pointing to.
The two types cannot be compared (which is a very good idea) and calling .base() is not very elegant (or generic) in my opinion.
You can convert the types and compare the result, taking into account the off-by-one rule involving reverse_iterators.
Conversion from iterator to reverse_iterator needs to be explicit (fortunately); however, there is no converting constructor from reverse_iterator to iterator (unfortunately).
So there is only one direction in which to convert before making the comparison.
std::vector<double> vv = {1.,2.,3.};
auto it = vv.begin();
auto rit = vv.rend();
// assert( it == rit ); // error: does not compile
assert(std::vector<double>::reverse_iterator{it} == rit);

Idiomatic way to cheaply remove an element from an arbitrary container?

All C++ standard library containers have an insert() method; yet they don't all have a remove() method which does not take any arguments, but performs the cheapest possible removal at arbitrary order. Now, of course this would act differently for different containers: in a vector we would remove from the back, in a singly-linked list we'd remove from the front (unless we kept a pointer to the tail), and so forth according to implementation details.
So, is there a more idiomatic way to do this other than rolling my own template specialization for every container?
Part of the design of the standard containers is that they ONLY provide operations as member functions if it is possible to provide one that is optimal (by chosen measures) for that container.
If a container provides a member function, that is because there is some way of implementing that function in a way that is optimal for that container.
If it is not possible to provide an optimal implementation of an operation (like remove()) then it is not provided.
Only std::list and (C++11 and later) std::forward_list are designed for efficient removal of elements, which is why they are the only containers with a remove() member function.
The other containers are not designed for efficient removal of arbitrary elements;
std::array cannot be resized, so it makes no sense to have either an insert() OR a remove() member function.
std::deque is only optimised for removal at the beginning or end.
Removal of an element from std::vector is less efficient than other containers, except (possibly) from the end.
Implementing a remove() member function for these containers therefore goes against the design philosophy.
So if you want to be able to efficiently remove elements from a container, you need to pick the right container for the job.
Rolling your own wrapper for the standard containers, to emulate operations that some containers don't support is simply misleading - in the sense of encouraging a user of your wrapper classes to believe they don't need to be careful with their choice of container if they have particular requirements of performance or memory usage.
So to answer your question
"So, is there a more idiomatic way to do this other than rolling my own template specialization for every container?"
There are a lot of ways to remove elements.
For sequence containers and unordered containers, erase() returns the iterator
following the erased item.
Since C++11, erase(iterator) on associative containers does the same; in C++03 it returned nothing.
/*
* Remove from Vector or Deque
*/
vector<int> vec = {1, 4, 1, 1, 1, 12, 18, 16}; // To remove all '1'
for (vector<int>::iterator itr = vec.begin(); itr != vec.end(); ) {
    if ( *itr == 1 ) {
        itr = vec.erase(itr); // erase() invalidates itr; continue from the iterator it returns
    } else {
        ++itr;
    }
} // vec: { 4, 12, 18, 16}
// Complexity: O(n*m)
remove(vec.begin(), vec.end(), 1); // O(n)
// vec: {4, 12, 18, 16, ?, ?, ?, ?}
vector<int>::iterator newEnd = remove(vec.begin(), vec.end(), 1); // O(n)
vec.erase(newEnd, vec.end());
// Similarly for algorithm: remove_if() and unique()
// vec still occupies space for 8 ints: vec.capacity() == 8
vec.shrink_to_fit(); // C++ 11
// Now vec.capacity() is typically 4 (shrink_to_fit() is a non-binding request)
// For C++ 03:
vector<int>(vec).swap(vec); // Release the vacant memory
/*
* Remove from List
*/
list<int> mylist = {1, 4, 1, 1, 1, 12, 18, 16};
list<int>::iterator newEnd = remove(mylist.begin(), mylist.end(), 1);
mylist.erase(newEnd, mylist.end());
mylist.remove(1); // faster
/*
* Remove from associative containers or unordered containers
*/
multiset<int> myset = {1, 4, 1, 1, 1, 12, 18, 16};
// std::remove() cannot be used here: the elements of associative containers are const,
// so the algorithm cannot reassign them and the call does not compile.
myset.erase(1); // O(log(n)) or O(1), plus the number of elements removed
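The two patterns can be packaged as small generic helpers. This is a sketch under stated assumptions: the names erase_all and erase_matching are hypothetical, and the second helper relies on erase() returning the next iterator, which holds for the standard associative containers from C++11 onward.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <set>
#include <vector>

// Hypothetical helper for sequence containers (vector, deque, list):
// shift the kept elements forward, then erase the leftover tail.
template <typename Container, typename T>
void erase_all(Container& c, const T& value) {
    c.erase(std::remove(c.begin(), c.end(), value), c.end());
}

// Hypothetical helper for associative containers: erase while iterating,
// continuing from the iterator that erase() returns.
template <typename Set, typename Pred>
void erase_matching(Set& s, Pred pred) {
    for (auto it = s.begin(); it != s.end(); ) {
        if (pred(*it)) {
            it = s.erase(it);
        } else {
            ++it;
        }
    }
}

// Example usage (not called automatically).
void erase_helpers_example() {
    std::vector<int> vec = {1, 4, 1, 1, 1, 12, 18, 16};
    erase_all(vec, 1);
    assert((vec == std::vector<int>{4, 12, 18, 16}));

    std::multiset<int> myset = {1, 4, 1, 1, 1, 12, 18, 16};
    erase_matching(myset, [](int x) { return x % 2 == 0; });
    assert(myset.size() == 4 && myset.count(1) == 4);
}
```

C++20 adds the non-member functions std::erase and std::erase_if for the standard containers, which make helpers like these unnecessary when that standard is available.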

Is it allowed to increment an end iterator?

Is it allowed to increment an iterator variable it that already is at end(), i.e. auto it = v.end()?
Is it allowed in general?
If not, is it not allowed for vector?
If yes, is ++it maybe idempotent if it==v.end()?
I ask, because I stumbled upon code like this:
std::vector<int> v{ 1, 2, 3, 4, 5, 6, 7 };
// delete every other element
for(auto it=v.begin(); it<v.end(); ++it) { // it<end ok? ++it ok on end?
it = v.erase(it);
}
It works fine with g++-6, but that is no proof.
For one, it<v.end() may only work with vectors; I suppose it should read it!=v.end() in general. But in this example that would not recognize the end of v if ++it is applied when it already is on the end.
No, the behaviour is undefined. You are allowed to set an iterator to end(), but you must not increment it or dereference it.
You are allowed to decrement it so long as the backing container is not empty.
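The loop from the question can be rewritten so that it never increments an iterator that is already at end(). This is a minimal sketch; the function name erase_every_other is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: delete every other element, starting with the first.
void erase_every_other(std::vector<int>& v) {
    for (auto it = v.begin(); it != v.end(); ) {
        it = v.erase(it);  // it now refers to the element after the erased one
        if (it != v.end()) {
            ++it;          // skip the surviving element, but never step past end()
        }
    }
}

// Example usage (not called automatically).
void erase_every_other_example() {
    std::vector<int> v{1, 2, 3, 4, 5, 6, 7};
    erase_every_other(v);
    assert((v == std::vector<int>{2, 4, 6}));
}
```

With an odd number of elements the last erase() returns end(), which is exactly the case where the unconditional ++it in the original loop has undefined behaviour.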

How does the Standard deal with self-referencing iterators in container insert functions?

When dealing with C++ Standard Library containers like std::vector, how do their range-based insertion methods handle the user using iterators that reference the vector's own contents?
Presumably if they are say, vector::iterator already, then the implementation can special-case this scenario, but if they are a user-defined type which eventually results in accessing the vector, how does the vector deal with keeping those iterators valid whilst evaluating the range? Does the Standard simply ban referencing the vector in the range?
For a simple example, consider an iterator whose value_type is size_t, and the result of de-referencing it is the size of the vector being inserted into.
struct silly_iterator {
vector<std::size_t>* v;
unsigned number;
std::size_t operator*() { return v->size(); }
silly_iterator& operator++() { --number; return *this; }
bool operator==(silly_iterator other) const { return number == 0; }
// other methods
};
std::vector<std::size_t> vec = { 3, 4, 5, 6, 7 };
vec.insert(vec.begin() + 2, silly_iterator(&vec, 10), silly_iterator());
What are the contents of vec after this code has executed?
For another example,
struct silly_iterator {
std::vector<std::size_t>* v;
unsigned number;
std::size_t operator*() { return 0; }
silly_iterator& operator++() { --number; v->push_back((*v)[4]); return *this; }
bool operator==(silly_iterator other) const { return number == 0; }
// other methods
};
std::vector<std::size_t> vec = { 3, 4, 5, 6, 7 };
vec.insert(vec.begin() + 2, silly_iterator(&vec, 10), silly_iterator());
In C++11 n3242 (which is a slightly old draft), in 23.2.3 table 100, we learn that for the iterator pair insert function, pre: i and j are not iterators into a. Based on that wording, I would choose to interpret it broadly as "i and j shall not access a", which makes both of your iterators undefined behavior.
But let's say that my broad interpretation is not what was intended by the standard. Then to answer your question, for input iterators the result is almost certainly going to be: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 5, 6, 7 while for forward iterators or better one of 3, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 7 or 3, 4, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 5, 6, 7 depending on whether the size is updated before or after the elements are copied into the open space. I don't see anywhere that the results for forward iterators in such a case would be specified.
Basically, the only reason to invalidate iterators/references for vector is a reallocation, since otherwise you are still pointing at some part of your vector.
C++11 23.3.6.3/5:
Remarks: Reallocation invalidates all the references, pointers, and iterators referring to the elements
in the sequence. It is guaranteed that no reallocation takes place during insertions that happen after
a call to reserve() until the time when an insertion would make the size of the vector greater than
the value of capacity().
This is again reiterated in the remarks to the insert functions C++11 23.3.6.5/1:
Remarks: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens,
all the iterators and references before the insertion point remain valid. [...]
With vector, you can consider your iterators to behave very much like pointers (which shows why exactly reallocation will cause problems). In fact, the reference type is value_type&, as defined by the standard, showing that references are indeed not even wrapped.
Note that the element an iterator refers to may change as a result of the insertion, since the underlying data changes. Also, to be standards compliant, you would need to ensure that reallocation does not happen (e.g. with a call to reserve).
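One way to stay clear of the "i and j are not iterators into a" precondition is to copy the source elements out before inserting them. The sketch below assumes that approach; the helper name insert_own_range and its index parameters are hypothetical, and the extra copy is the price paid for avoiding the self-referencing range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: insert a copy of vec[first, last) at index pos of the
// same vector. Indices are used instead of iterators so that nothing passed
// in can be invalidated by the insertion.
template <typename T>
void insert_own_range(std::vector<T>& vec, std::size_t pos, std::size_t first, std::size_t last) {
    assert(first <= last && last <= vec.size() && pos <= vec.size());

    // Copy the source elements first, so the range handed to insert()
    // refers to tmp rather than to vec.
    const std::vector<T> tmp(vec.begin() + static_cast<std::ptrdiff_t>(first),
                             vec.begin() + static_cast<std::ptrdiff_t>(last));
    vec.insert(vec.begin() + static_cast<std::ptrdiff_t>(pos), tmp.begin(), tmp.end());
}

// Example usage (not called automatically).
void insert_own_range_example() {
    std::vector<std::size_t> vec = {3, 4, 5, 6, 7};
    insert_own_range(vec, 2, 0, 2);  // insert a copy of {3, 4} before the 5
    assert((vec == std::vector<std::size_t>{3, 4, 3, 4, 5, 6, 7}));
}
```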

stl insertion iterators

Maybe I am missing something completely obvious, but I can't figure out why one would use back_inserter/front_inserter/inserter,
instead of just providing the appropriate iterator from the container interface.
And that's my question.
Because those call push_back, push_front, and insert which the container's "normal" iterators can't (or, at least, don't).
Example:
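#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
int main() {
    using namespace std;
    vector<int> a (3, 42), b;
    copy(a.begin(), a.end(), back_inserter(b)); // calls b.push_back() for each element
    copy(b.rbegin(), b.rend(), ostream_iterator<int>(cout, ", "));
    return 0;
}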
The main reason is that regular iterators iterate over existing elements in the container, while the *inserter family of iterators actually inserts new elements in the container.
std::vector<int> v(3); // { 0, 0, 0 }
int array[] = { 1, 2, 3 };
std::copy( array, array+3, std::back_inserter(v) ); // adds 3 elements
// v = { 0, 0, 0, 1, 2, 3 }
std::copy( array, array+3, v.begin() ); // overwrites 3 elements
// v = { 1, 2, 3, 1, 2, 3 }
int array2[] = { 4, 5, 6 };
std::copy( array2, array2+3, std::inserter(v, v.begin()) );
// v = { 4, 5, 6, 1, 2, 3, 1, 2, 3 }
The iterator points to an element, and doesn't in general know what container it's attached to. (Iterators were based on pointers, and you can't tell from a pointer what data structure it's associated with.)
Adding an element to an STL container changes the description of the container. For example, STL containers have a .size() function, which has to change. Since some metadata has to change, whatever inserts the new element has to know what container it's adding to.
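To make that concrete, here is a simplified sketch of what a back inserter has to carry around. The class simple_back_inserter is a hypothetical stand-in, not the library's actual std::back_insert_iterator; it only illustrates that the iterator stores a pointer to its container and calls push_back on assignment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Simplified, hypothetical stand-in for std::back_insert_iterator.
template <typename Container>
class simple_back_inserter {
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    explicit simple_back_inserter(Container& c) : container_(&c) {}

    // Writing through the iterator grows the container.
    simple_back_inserter& operator=(const typename Container::value_type& value) {
        container_->push_back(value);
        return *this;
    }

    // Dereference and increment do nothing; assignment does all the work.
    simple_back_inserter& operator*() { return *this; }
    simple_back_inserter& operator++() { return *this; }
    simple_back_inserter operator++(int) { return *this; }

private:
    Container* container_;  // the iterator must remember which container to grow
};

// Example usage (not called automatically).
void simple_back_inserter_example() {
    std::vector<int> src = {1, 2, 3};
    std::vector<int> dst(2);  // { 0, 0 }
    std::copy(src.begin(), src.end(), simple_back_inserter<std::vector<int>>(dst));
    assert((dst == std::vector<int>{0, 0, 1, 2, 3}));
}
```

An ordinary vector iterator has no such pointer back to its container, which is why it can only overwrite elements that already exist.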
It is all a matter of what you really need. Both kinds of iterators can be used, depending on your intent.
When you use an "ordinary" iterator, it does not create new elements in the container. It simply writes the data into the existing consecutive elements of the container, one after another. It overwrites any data that is already in the container. And if it is allowed to reach the end of the sequence, any further writes make it "fall off the end" and cause undefined behavior, which often (though not necessarily) shows up as a crash.
Inserter iterators, on the other hand, create a new element and insert it at the current position (front, back, somewhere in the middle) every time something is written through them. They never overwrite existing elements, they add new ones.