What is the difference between begin() and rend()? - c++

I have a question in iterators about the difference between begin() and rend().
#include <iostream>
#include <array>
#include <vector>
#include <algorithm>
using namespace std;

int main()
{
    vector<int> v1;
    v1 = {9, 2, 6, 4, 5};
    cout << *v1.begin();
    cout << *v1.rend();
    return 0;
}
cout << *v1.begin();
prints 9, but
cout << *v1.rend();
prints a number that is not 9.
Why are there such different results?

In C++, ranges are delimited by a pair of iterators: one marking the beginning of the range and one marking the position one past the end of the range. For containers, the begin() and end() member functions provide you with a pair of iterators to the first and past-the-end positions. It’s not safe to read from end(), since it doesn’t point to an actual element.
Similarly, the rbegin() and rend() member functions return reverse iterators that point, respectively, to the last and just-before-the-first positions. For the same reason that it’s not safe to dereference the end() iterator (it’s past the end of the range), you shouldn’t dereference the rend() iterator, since it doesn’t point to an element within the container.
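A minimal sketch of the safe uses of these four positions, reusing the vector from the question:
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v1 = {9, 2, 6, 4, 5};

    std::cout << *v1.begin()  << '\n';   // first element: 9
    std::cout << *v1.rbegin() << '\n';   // last element: 5

    // *v1.end() and *v1.rend() would be undefined behaviour:
    // neither iterator refers to an element of the container.
    return 0;
}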

Dereferencing end() or rend() has undefined behaviour.
begin() points to the first element, rbegin() points to the last element. end() (in most cases) points to one after the last element and rend() effectively points to one before the first element (though it isn't implemented like that).

What is the difference between begin() and rend()?
begin returns an iterator to the first element of the container.
rend returns a reverse iterator to one before the first element of the container (which is one past the last element in the reverse iterator range).
*v1.rend()
The behaviour of indirecting through the rend() iterator is undefined (the same goes for indirecting through the end() iterator).
Why are there such different results?
Besides the behaviour being undefined in one case and not in the other, since they refer to different elements (one of them being an element that doesn't exist), there is little reason to expect the results to be the same.
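If the intent was to reach the first element through the reverse range, the last dereferenceable reverse iterator is the one just before rend(); a sketch using std::prev:
#include <iostream>
#include <iterator>
#include <vector>

int main()
{
    std::vector<int> v1 = {9, 2, 6, 4, 5};

    // std::prev(v1.rend()) is the last valid reverse iterator;
    // it refers to the first element of the vector.
    std::cout << *std::prev(v1.rend()) << '\n';   // prints 9
    return 0;
}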

vector::rend() is a member function in the C++ standard library which returns a reverse iterator pointing to the theoretical element right before the first element of the vector, whereas vector::begin() returns an iterator pointing to the first element of the vector.
See this code:
for (auto it = v1.rbegin(); it != v1.rend(); it++)
    cout << *it << " ";
The vector elements in reverse order are:
5 4 6 2 9
To iterate over the vector, always pair one of these together (see the sketch after this list):
vector::begin() and vector::end()
vector::cbegin() and vector::cend()
vector::crbegin() and vector::crend()
vector::rbegin() and vector::rend()
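A self-contained sketch that exercises each of those pairs on the vector from the question:
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v1 = {9, 2, 6, 4, 5};

    // forward, read/write
    for (auto it = v1.begin(); it != v1.end(); ++it)
        std::cout << *it << ' ';
    std::cout << '\n';

    // forward, read-only
    for (auto it = v1.cbegin(); it != v1.cend(); ++it)
        std::cout << *it << ' ';
    std::cout << '\n';

    // reverse, read/write
    for (auto it = v1.rbegin(); it != v1.rend(); ++it)
        std::cout << *it << ' ';
    std::cout << '\n';

    // reverse, read-only
    for (auto it = v1.crbegin(); it != v1.crend(); ++it)
        std::cout << *it << ' ';
    std::cout << '\n';
    return 0;
}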
For more information see "C++ Vector Tutorial With Example" by Ankit Lathiya.

Related

What does std::find return if element not found

As per the documentation, std::find returns last if no element is found. What does that mean? Does it return an iterator pointing to the last element in the container? Or does it return an iterator pointing to .end(), i.e. pointing outside the container?
The following code prints 0, which is not an element of the container. So, I guess std::find returns an iterator outside the container. Could you please confirm?
#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<int> vec = {1, 2, 3, 1000, 4, 5};
    auto itr = std::find(vec.begin(), vec.end(), 456);
    cout << *itr;
}
last is the name of the second parameter to find. It doesn't know what kind of container you're using, just the iterators that you give it.
In your example, last is vec.end(), which is (by definition) not dereferenceable, since it's one past the last element. So by dereferencing it, you invoke undefined behaviour, which in this case manifests as printing out 0.
Algorithms apply to ranges, which are defined by a pair of iterators. Those iterators are passed as arguments to the algorithm. The first iterator points at the first element in the range, and the second argument points at one past the end of the range. Algorithms that can fail return a copy of the past-the-end iterator when they fail. That's what std::find does: if there is no matching element it returns its second argument.
Note that the preceding paragraph does not use the word "container". Containers have member functions that give you a range that you can use to get at the elements of the container, but there are also ways of creating iterators that have no connection to any container.
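The usual pattern, then, is to compare the returned iterator against the same end iterator that was passed in before dereferencing it. A minimal sketch based on the code in the question:
#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> vec = {1, 2, 3, 1000, 4, 5};

    auto itr = std::find(vec.begin(), vec.end(), 456);
    if (itr != vec.end())
        std::cout << "found: " << *itr << '\n';
    else
        std::cout << "not found\n";   // taken here: 456 is not in the vector
}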
This is what the documentation literally says:
"Return value:
Iterator to the first element satisfying the condition or last if no such element is found."
In your case, that is one past the end of the vector, i.e. .end().

Is passing vector's end() to std::vector::erase valid?

A very simple question, but I couldn't find an answer. It would make an awful lot of sense for it to be allowed, but I want to double-check.
std::vector<int> v(10, 0);
v.erase(v.end()); // allowed or not?
An invalid position or range causes undefined behavior. From the reference documentation:
The iterator pos must be valid and dereferenceable. Thus the end() iterator (which is valid, but is not dereferenceable) cannot be used as a value for pos.
For the single argument overload it is NOT valid to pass end() to std::vector::erase, because the single argument overload will erase the element AT that position. There is no element at the end() position, since end() is one past the last element.
However, end() can be passed to the overload of erase that takes an iterator range:
vec.erase(vec.begin(), vec.end())
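A sketch of both overloads; the single-argument call on end() is shown only as a comment, since executing it would be undefined behaviour:
#include <vector>

int main()
{
    std::vector<int> v(10, 0);

    // v.erase(v.end());            // NOT valid: there is no element at end()

    v.erase(v.begin());             // valid: erases the first element
    v.erase(v.begin(), v.end());    // valid: erases the remaining elements,
                                    // leaving the vector empty
    return 0;
}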

Erase by iterator on a C++ STL map

I'm curious about the rationale behind the following code. For a given map, I can delete a range up to, but not including, end() (obviously) using the following code:
map<string, int> myMap;
myMap["one"] = 1;
myMap["two"] = 2;
myMap["three"] = 3;
map<string, int>::iterator it = myMap.find("two");
myMap.erase( it, myMap.end() );
This erases the last two items using the range. However, if I used the single iterator version of erase, I half expected passing myMap.end() to result in no action as the iterator was clearly at the end of the collection. This is as distinct from a corrupt or invalid iterator which would clearly lead to undefined behaviour.
However, when I do this:
myMap.erase( myMap.end() );
I simply get a segmentation fault. I wouldn't have thought it difficult for map to check whether the iterator equalled end() and not take action in that case. Is there some subtle reason for this that I'm missing? I noticed that even this works:
myMap.erase( myMap.end(), myMap.end() );
(i.e. does nothing)
The reason I ask is that I have some code which receives a valid iterator to the collection (but which could be end()) and I wanted to simply pass this into erase rather than having to check first like this:
if ( it != myMap.end() )
myMap.erase( it );
which seems a bit clunky to me. The alternative is to recode so I can use the by-key-type erase overload, but I'd rather not rewrite too much if I can help it.
The key is that in the standard library, ranges determined by two iterators are half-open ranges: in math notation, [a, b). They include the first but not the last iterator (if both are the same, the range is empty). At the same time, end() returns an iterator that is one beyond the last element, which matches the half-open range notation perfectly.
When you use the range version of erase it will never try to delete the element referenced by the last iterator. Consider a modified example:
map<int,int> m;
for (int i = 0; i < 5; ++i)
    m[i] = i;
m.erase( m.find(1), m.find(4) );
At the end of the execution the map will hold two keys 0 and 4. Note that the element referred by the second iterator was not erased from the container.
On the other hand, the single iterator operation will erase the element referenced by the iterator. If the code above was changed to:
for (int i = 1; i <= 4; ++i)
    m.erase( m.find(i) );
The element with key 4 will be deleted. In your case you will attempt to delete the end iterator that does not refer to a valid object.
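A self-contained sketch combining both points: the range erase leaves keys 0 and 4, and a single-iterator erase then removes exactly the element it refers to:
#include <iostream>
#include <map>

int main()
{
    std::map<int, int> m;
    for (int i = 0; i < 5; ++i)
        m[i] = i;

    // Range erase: removes keys 1, 2 and 3, but not the key 4 that the
    // second iterator refers to.
    m.erase(m.find(1), m.find(4));

    for (const auto& kv : m)
        std::cout << kv.first << ' ';   // prints: 0 4
    std::cout << '\n';

    // Single-iterator erase: removes exactly the referenced element.
    m.erase(m.find(4));

    for (const auto& kv : m)
        std::cout << kv.first << ' ';   // prints: 0
    std::cout << '\n';
    return 0;
}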
I wouldn't have thought it difficult for map to check whether the iterator equalled end() and not take action in that case.
No, it is not hard to do, but the function was designed with a different contract in mind: the caller must pass in an iterator to an element in the container. Part of the reason for this is that in C++ most of the features are designed so that they incur the minimum cost possible, allowing the user to balance safety against performance on their side. The user can test the iterator before calling erase, but if that test were inside the library then the user would not be able to opt out of testing when she knows that the iterator is valid.
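If you want the check at the call site, it is easy to wrap yourself. erase_if_valid below is a hypothetical helper, not part of the standard library:
#include <map>
#include <string>

// Hypothetical convenience wrapper: erases only if the iterator is not end().
// The caller must still guarantee the iterator belongs to this map and is
// otherwise valid -- that part cannot be checked cheaply.
template <class Map>
bool erase_if_valid(Map& m, typename Map::iterator it)
{
    if (it == m.end())
        return false;
    m.erase(it);
    return true;
}

int main()
{
    std::map<std::string, int> myMap{{"one", 1}, {"two", 2}, {"three", 3}};
    erase_if_valid(myMap, myMap.find("two"));    // erases "two"
    erase_if_valid(myMap, myMap.find("seven"));  // find() returned end(); no-op
    return 0;
}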
n3337 23.2.4 Table 102
a.erase( q1, q2)
erases all the elements in the range [q1,q2). Returns q2.
So the iterator returned from map::end() is not part of the erased range in the case of myMap.erase(myMap.end(), myMap.end());, which is why that call does nothing.
a.erase(q)
erases the element pointed to by q. Returns an iterator pointing to the element immediately following q prior to the element being erased. If no such element exists, returns a.end().
I wouldn't have thought it difficult for map to check whether the
iterator equalled end() and not take action in that case. Is there
some subtle reason for this that I'm missing?
The reason is the same as why std::vector::operator[] doesn't check that the index is in range, of course.
When you use two iterators to specify a range, the range consists of the elements from the element that the first iterator points to up to but not including the element that the second iterator points to. So erase(it, myMap.end()) says to erase everything from it up to but not including end(). You could equally well pass an iterator that points to a "real" element as the second one, and the element that that iterator points to would not be erased.
When you use erase(it) it says to erase the element that it points to. The end() iterator does not point to a valid element, so erase(end()) doesn't do anything sensible. It would be possible for the library to diagnose this situation, and a debugging library will do that, but it imposes a cost on every call to erase to check what the iterator points to. The standard library doesn't impose that cost on users. You're on your own. <g>

vector::erase and reverse_iterator

I have a collection of elements in a std::vector that are sorted in a descending order starting from the first element. I have to use a vector because I need to have the elements in a contiguous chunk of memory. And I have a collection holding many instances of vectors with the described characteristics (always sorted in a descending order).
Now, sometimes, when I find out that I have too many elements in the greater collection (the one that holds these vectors), I discard the smallest elements from these vectors in a way similar to this pseudo-code:
grand_collection: collection that holds these vectors
T: type argument of my vector
C: the type that is a member of T, that participates in the < comparison (this is what sorts data before they hit any of the vectors).
std::map<C, std::pair<T::const_reverse_iterator, std::vector<T>&>> what_to_delete;

iterate(it = grand_collection.begin() -> grand_collection.end())
{
    iterate(vect_rit = it->rbegin() -> it->rend())
    {
        // ...
        what_to_delete <- (vect_rit->C, pair(vect_rit, *it))
        if (what_to_delete.size() > threshold)
            what_to_delete.erase(what_to_delete.begin());
        // ...
    }
}
Now, after running this code, in what_to_delete I have a collection of iterators pointing to the original vectors that I want to remove from these vectors (overall smallest values). Remember, the original vectors are sorted before they hit this code, which means that for any what_to_delete[0 - n] there is no way that an iterator on position n - m would point to an element further from the beginning of the same vector than n, where m > 0.
When erasing elements from the original vectors, I have to convert a reverse_iterator to iterator. To do this, I rely on C++11's §24.4.1/1:
The relationship between reverse_iterator and iterator is
&*(reverse_iterator(i)) == &*(i - 1)
Which means that to delete a vect_rit, I use:
vector.erase(--vect_rit.base());
Now, according to C++11 standard §23.3.6.5/3:
iterator erase(const_iterator position); Effects: Invalidates
iterators and references at or after the point of the erase.
How does this work with reverse_iterators? Are reverse_iterators internally implemented with a reference to a vector's real beginning (vector[0]) and transforming that vect_rit to a classic iterator so then erasing would be safe? Or does reverse_iterator use rbegin() (which is vector[vector.size()]) as a reference point and deleting anything that is further from vector's 0-index would still invalidate my reverse iterator?
Edit:
Looks like reverse_iterator uses rbegin() as its reference point. Erasing elements the way I described was giving me errors about a non-dereferenceable iterator after the first element was deleted, whereas storing classic iterators (converting to const_iterator) while inserting into what_to_delete worked correctly.
Now, for future reference, does The Standard specify what should be treated as a reference point in case of a random-access reverse_iterator? Or this is an implementation detail?
Thanks!
In the question you have already quoted exactly what the standard says a reverse_iterator is:
The relationship between reverse_iterator and iterator is &*(reverse_iterator(i)) == &*(i- 1)
Remember that a reverse_iterator is just an 'adaptor' on top of the underlying iterator (reverse_iterator::current). The 'reference point', as you put it, for a reverse_iterator is that wrapped iterator, current. All operations on the reverse_iterator really occur on that underlying iterator. You can obtain that iterator using the reverse_iterator::base() function.
If you erase --vect_rit.base(), you are in effect erasing --current, so current will be invalidated.
As a side note, the expression --vect_rit.base() might not always compile. If the iterator is actually just a raw pointer (as might be the case for a vector), then vect_rit.base() returns an rvalue (a prvalue in C++11 terms), so the pre-decrement operator won't work on it since that operator needs a modifiable lvalue. See "Item 28: Understand how to use a reverse_iterator's base iterator" in "Effective STL" by Scott Meyers. (an early version of the item can be found online in "Guideline 3" of http://www.drdobbs.com/three-guidelines-for-effective-iterator/184401406).
You can use the even uglier expression, (++vect_rit).base(), to avoid that problem. Or since you're dealing with a vector and random access iterators: vect_rit.base() - 1
Either way, vect_rit is invalidated by the erase because vect_rit.current is invalidated.
However, remember that vector::erase() returns a valid iterator to the new location of the element that followed the one that was just erased. You can use that to 're-synchronize' vect_rit:
vect_rit = vector_type::reverse_iterator( vector.erase(vect_rit.base() - 1));
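Putting those pieces together, here is a sketch of erasing through a reverse iterator while keeping it usable; the predicate (erase odd values) is only for illustration:
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v = {9, 2, 6, 4, 5};

    for (auto rit = v.rbegin(); rit != v.rend(); )
    {
        if (*rit % 2 != 0)
        {
            // erase the element *rit refers to, then rebuild the reverse
            // iterator from the iterator that erase() returns
            rit = std::vector<int>::reverse_iterator(v.erase(rit.base() - 1));
        }
        else
        {
            ++rit;
        }
    }

    for (int x : v)
        std::cout << x << ' ';   // prints: 2 6 4
    std::cout << '\n';
    return 0;
}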
From a standardese point of view (and I'll admit, I'm not an expert on the standard): From §24.5.1.1:
namespace std {
    template <class Iterator>
    class reverse_iterator ...
    {
        ...
        Iterator base() const; // explicit
        ...
    protected:
        Iterator current;
        ...
    };
}
And from §24.5.1.3.3:
Iterator base() const; // explicit
Returns: current.
Thus it seems to me that so long as you don't erase anything in the vector before what one of your reverse_iterators points to, said reverse_iterator should remain valid.
Of course, given your description, there is one catch: if you have two contiguous elements in your vector that you end up wanting to delete, the fact that you call vector.erase(--vector_rit.base()) means that you've invalidated the reverse_iterator "pointing" to the immediately preceding element, and so your next vector.erase(...) is undefined behavior.
Just in case that's clear as mud, let me say that differently:
std::vector<T> v=...;
...
// it_1 and it_2 are contiguous; it_2 refers to the element
// immediately before it_1's element in the vector
std::vector<T>::reverse_iterator it_1=v.rbegin();
std::vector<T>::reverse_iterator it_2=it_1;
++it_2;
// Erase it_1's pointee:
// convert from reverse_iterator to iterator
std::vector<T>::iterator tmp_it=it_1.base();
// but that points one too far in, so decrement;
--tmp_it;
// of course, now tmp_it points at it_2's base:
assert(tmp_it == it_2.base());
// perform erasure
v.erase(tmp_it); // invalidates all iterators pointing at or past *tmp_it
                 // (like, say it_2.base()...)
// now delete it_2's pointee:
std::vector<T>::iterator tmp_it_2=it_2.base(); // note, invalid iterator!
// undefined behavior:
--tmp_it_2;
v.erase(tmp_it_2);
In practice, I suspect that you'll run into two possible implementations: more commonly, the underlying iterator will be little more than a (suitably wrapped) raw pointer, and so everything will work perfectly happily. Less commonly, the iterator might actually try to track invalidations/perform bounds checking (didn't Dinkumware STL do such things when compiled in debug mode at one point?), and just might yell at you.
The reverse_iterator, just like the normal iterator, points to a certain position in the vector. Implementation details are irrelevant, but if you must know, they both are (in a typical implementation) just plain old pointers inside. The difference is the direction. The reverse iterator has its + and - reversed w.r.t. the regular iterator (and also ++ and --, > and < etc).
This is interesting to know, but doesn't really imply an answer to the main question.
If you read the language carefully, it says:
Invalidates iterators and references at or after the point of the erase.
References do not have a built-in sense of direction. Hence, the language clearly refers to the container's own sense of direction. Positions after the point of the erase are those with higher indices. Hence, the iterator's direction is irrelevant here.

Iterator value differs from reverse iterator value after conversion

I have the following code:
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<int> v;
    for (int i = 0; i < 10; ++i)
        v.push_back(i);

    auto it = v.begin() + 3;
    cout << "Iterator: " << *it << endl;

    vector<int>::reverse_iterator revIt(it);
    cout << "Reverse iterator: " << *revIt << endl;
}
After running this code I get the following output:
Iterator: 3
Reverse iterator: 2
Could someone explain why the 2 values differ ?
Reverse iterators 'correspond' to a base iterator with an offset of one element because of how rbegin() and rend() have to be represented using base iterators that are valid (end() and begin() respectively). For example, rend() cannot be represented by an interator that 'points' before the container's begin() iterator, although that's what it logically represents. So rend()'s 'base iterator' is begin(). Therefore, rbegin()'s base iterator becomes end().
A reverse iterator automatically adjusts for that offset when it is dereferenced (using the * or -> operators).
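A small sketch of that one-element offset, reusing the setup from the question:
#include <cassert>
#include <vector>

int main()
{
    std::vector<int> v = {0, 1, 2, 3, 4};

    auto it = v.begin() + 3;                       // points to 3
    std::vector<int>::reverse_iterator revIt(it);  // wraps it unchanged

    assert(revIt.base() == it);      // the underlying iterator is the same
    assert(&*revIt == &*(it - 1));   // dereferencing adjusts by one: *revIt == 2
    return 0;
}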
An old article by Scott Meyers explains the relationship in detail along with a nice picture:
Guideline 3: Understand How to Use a reverse_iterator’s Base iterator
Invoking the base member function on a reverse_iterator yields the
“corresponding” iterator, but it’s not really clear what that means.
As an example, take a look at this code, which puts the numbers 1-5 in
a vector, sets a reverse_iterator to point to the 3, and sets an
iterator to the reverse_iterator’s base:
vector<int> v;

// put 1-5 in the vector
for (int i = 1; i <= 5; ++i) {
    v.push_back(i);
}

// make ri point to the 3
vector<int>::reverse_iterator ri =
    find(v.rbegin(), v.rend(), 3);

// make i the same as ri's base
vector<int>::iterator i(ri.base());
After executing this code, things can be thought of as looking like
this:
This picture is nice, displaying the characteristic offset of a
reverse_iterator and its corresponding base iterator that mimics the
offset of rbegin() and rend() with respect to begin() and end(), but
it doesn’t tell you everything you need to know. In particular, it
doesn’t explain how to use i to perform operations you’d like to
perform on ri.
Looks like the documentation says they do that to handle past-the-end elements, i.e. if you reverse an iterator that is past the end, the new reverse iterator points to the last element.
The first paragraph of 24.5.1 Reverse iterators says:
Class template reverse_iterator is an iterator adaptor that iterates from the end of the sequence defined by its underlying iterator to the beginning of that sequence. The fundamental relation between a reverse iterator and its corresponding iterator i is established by the identity:
&*(reverse_iterator(i)) == &*(i - 1).
The value returned by rend() cannot point before begin(), because that is not valid. So it was decided that rend() should contain the value of begin() and all other reverse iterators be shifted one position further. The operator* compensates for this and accesses the correct element anyway.
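That shift is easy to observe directly; a short sketch:
#include <cassert>
#include <vector>

int main()
{
    std::vector<int> v = {0, 1, 2, 3, 4};

    // rend() holds the value of begin(), rbegin() holds end() ...
    assert(v.rend().base() == v.begin());
    assert(v.rbegin().base() == v.end());

    // ... and operator* compensates by reading one element back.
    assert(*v.rbegin() == 4);        // last element
    assert(*(v.rend() - 1) == 0);    // first element
    return 0;
}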
A reverse iterator always looks "one before" the forward iterator it wraps, since its range is shifted by one:
The forward iterator goes from begin() (the first element) to end() (past the last; [begin, end) is open at the end side).
The reverse iterator goes from rbegin() { return reverse_iterator(end()); } to rend() { return reverse_iterator(begin()); } by definition, but it also has to walk the half-open range [rbegin, rend), where rbegin refers to the last element (not "past the last") and rend to "before the first" (not "the first"); hence an offset of one has to be accommodated.