Should a bidirectional iterator having reached end remain decrementable in C++? - c++

I'm writing a special iterator class that works like std::istream_iterator and many others by using a default-constructed instance to mark end of iteration. I want to give it the bidirectional iterator category. After running the following code:
MyIterType i_cur(get_some_iter()), i_end;
while(i_cur != i_end) ++i_cur;
do the standard requirements for bidirectional iterators impose the following to be valid?
--i_cur
++i_cur
--i_end or ++i_end
Thank you for quoting the standard if possible. I compile in C++03, but if C++11 introduces changes, I'm also interested to know them.

The Standard says that the answer to your question is yes:
[C++03] : 24.1.4 Bidirectional iterators
Table 75—Bidirectional iterator requirements (in addition to forward iterator)
expression return type operational assertion/note
semantics pre/post-coindition
======== ============ ============ ====================
--r X& pre: there exists s such
that r == ++s.
post: s is dereferenceable.
--(++r) == r.
--r == --s implies r
== s.
&r == &--r

In 24.2.6 in C++11, Table 110 gives the additional requirements for a bidirectional iterator.
In particular, if there exists s such that r == ++s, then --r must be valid and the resulting r must be dereferenceable. In addition:
--(++r) == r.
--r == --s implies r == s.
&r == &--r.
So unless the initial i_cur returned by get_some_iter is a past-the-end iterator, the final i_cur has a predecessor and so must be decrementable. The same holds for i_end, since it is the successor of the predecessor of the final i_cur.

Related

std::list sentinel node from standard point of view

Code example:
list<int> mylist{10, 20, 30, 40};
auto p = mylist.end();
while (true)
{
p++;
if (p == mylist.end()) // skip sentinel
continue;
cout << *p << endl;
}
I wonder, how much this code is legal from standard (C++17, n4810) point of view?
I looking for bidirectional iterators requirements related to example above, but no luck.
My question is:
Ability to pass through end(), it is implementation details or it is standard requirements?
Quoting from the latest draft available online.
[iterator.requirements.general]/7
Just as a regular pointer to an array guarantees that there is a pointer value pointing past the last element of the array, so for any iterator type there is an iterator value that points past the last element of a corresponding sequence. These values are called past-the-end values. Values of an iterator i for which the expression *i is defined are called dereferenceable. The library never assumes that past-the-end values are dereferenceable.
I believe that this applies not just to the end() but what comes after that as well. Note that the standard does not clearly state that end() should never be dereferenced.
And Cpp17Iterator requirements table states that for expression *r, r should be dereferenceable:
past-the-end iterator is considered a non-incrementable iterator and incrementing it (as you are doing at the beginning of the while loop) results in undefined behavior.
Something like what you are trying to do can also happen when using std::advance.
The book "The C++ Standard Library: A Tutorial and Reference" by Nicolai Josuttis has this quote:
Note that advance() does not check whether it crosses the end() of a sequence (it can't check because iterators in general do not know the containers on which they operate). Thus, calling this function might result in undefined behavior because calling operator ++ for the end of a sequence is not defined.
You code is illegal. You first initialized p to be the past-the-end iterator.
auto p = mylist.end();
Now you p++. Per Table 76,
the operational semantics of r++ is:
{ X tmp = r;
++r;
return tmp; }
And per [Table 74],
++r
Expects: r is dereferenceable.
And per [iterator.requirements.general]/7,
The library never assumes that past-the-end values are
dereferenceable.
In other words, incrementing a past-the-end iterator as you did is undefined behavior.

Iterating past the front of doubly linked list

std::list<my_type> my_list;
std::list<my_type>::iterator my_iter = my_list.begin();
std::list<my_type>::iterator my_prev = std::prev(my_iter);
What is the value of my_prev?
Is it my_list.rend(), even though it is technically a different type?
How to check for such condition besides my_iter == my_list.begin()?
According to C++11 standard:
24.4.4 Iterator operations [iterator.operations]
§ 7
template <class BidirectionalIterator>
BidirectionalIterator prev(BidirectionalIterator x,
typename std::iterator_traits<BidirectionalIterator>::difference_type n = 1);
Effects: Equivalent to advance(x, -n); return x;
In the same section, but §2 and §3
template <class InputIterator, class Distance>
void advance(InputIterator& i, Distance n);
Requires: n shall be negative only for bidirectional and random access iterators.
Effects: Increments (or decrements for negative n) iterator reference i by n.
Then, for decrementing bidirectional iterators:
24.2.6 Bidirectional iterators
Expression Return type Operational Assertion/note
semantics pre-/post-condition
pre: there exists s such that
--r X& r == ++s.
post: r is dereferenceable.
--(++r) == r.
--r == --s implies r == s.
&r == &--r.
So, it looks like (or at least I couldn't find more relevant part from the standard), that this behavior is not defined.
I'd suggest you to maintain this situation and be careful with my_iter's value, before passing it to std::prev.

Is ->second defined for iterator my_map.end()?

I'm working with a std::map<std::string, MyClass* >.
I want to test if my_map.find(key) returned a specific pointer.
Right now I'm doing;
auto iter = my_map.find(key);
if ((iter != my_map.end()) && (iter->second == expected)) {
// Something wonderful has happened
}
However, the operator * of the iterator is required to return a reference. Intuitively I'm assuming it to be valid and fully initialized? If so, my_map.end()->second would be NULL, and (since NULL is never expected), I could reduce my if statement to:
if (iter->second == expected)
Is this valid according to specification? Does anyone have practical experience with the implementations of this? IMHO, the code becomes clearer, and possibly a tiny performance improvement could be achieved.
Intuitively I'm assuming it to be valid and fully initialized?
You cannot assume an iterator to an element past-the-end of a container to be dereferenceable. Per paragraph 24.2.1/5 of the C++11 Standard:
Just as a regular pointer to an array guarantees that there is a pointer value pointing past the last element
of the array, so for any iterator type there is an iterator value that points past the last element of a
corresponding sequence. These values are called past-the-end values. Values of an iterator i for which the
expression *i is defined are called dereferenceable. The library never assumes that past-the-end values are
dereferenceable. [...]
However, the operator *of the iterator is required to return a reference. Intuitively I'm assuming it to be valid and fully initialized?
Your assumption is wrong, dereferencing iterator that points outside of container will lead to UB.
24.2 Iterator requirements [iterator.requirements]
24.2.1 In general [iterator.requirements.general]
7 Most of the library’s algorithmic templates that operate on data structures have interfaces that use ranges.
A range is a pair of iterators that designate the beginning and end of the computation. A range [i,i) is an
empty range; in general, a range [i,j) refers to the elements in the data structure starting with the element
pointed to by i and up to but not including the element pointed to by j. Range [i,j) is valid if and only if
j is reachable from i. The result of the application of functions in the library to invalid ranges is undefined.
Even without checking the specs, you can easily see that dereferencing an iterator at end has to be invalid.
A perfectly natural implementation (the de-factor standard implementation for vector<>) is for end() to be literally a memory pointer that has a value of ptr_last_element + 1, that is, the pointer value that would point to the next element - if there was a next element.
You cannot possibly be allowed to dereference the end iterator because it could be a pointer that would end up pointing to either the next object in the heap, or perhaps an overflow guard area (so you would dereference random memory), or past the end of the heap, and possibly outside of the memory space of the process, in which case you might get an Access Violation exception when dereferencing).
If iter == my_map.end (), then dereferencing it is undefined behavior; but you're not doing that here.
auto iter = my_map.find(key);
if ((iter != my_map.end()) && (iter->second == expected)) {
// Something wonderful has happened
}
If iter != my_map.end() is false, then the second half of the expression (iter->second == expected) will not be exectuted.
Read up on "short-circut evaluation".
Analogous valid code for pointers:
if ( p != NULL && *p == 4 ) {}

Does incrementing a mutable input iterator invalidate old iterator values?

Iterators that further satisfy the requirements of output iterators are called mutable iterators. Nonmutable iterators are referred to as constant iterators. [24.2.1:4]
This suggests you could have a mutable input iterator, which meets the requirements of both input and output iterators.
After incrementing an input iterator, copies of its old value need not be dereferenceable [24.2.3]. However, the standard does not say the same for output iterators; in fact, the operational semantics for postfix increment are given as { X tmp = r; ++r; return tmp; }, suggesting that output iterators may not invalidate (copies of) old iterator values.
So, can incrementing a mutable input iterator invalidate old iterator copies?
If so, how would you support code like X a(r++); *a = t or X::reference p(*r++); p = t with (e.g.) a proxy object?
If not, then why does boost::iterator claim it needs a proxy object? (Link is code; scroll down to read the comments on structs writable_postfix_increment_proxy and postfix_increment_result). That is, if you can return a (dereferenceable) copy of the old iterator value, why would you need to wrap this copy in a proxy?
The explanation if found in the next section, [24.2.5] Forward iterators, where it is stated how these differ from input and output iterators:
Two dereferenceable iterators a and b of type X offer the multi-pass guarantee if:
— a == b implies ++a == ++b and
— X is a pointer type or the expression (void)++X(a), *a is equivalent to the expression *a.
[ Note: The requirement that a == b implies ++a == ++b (which is not true for input and output iterators) and the removal of the restrictions on the number of the assignments through a mutable iterator (which applies to output iterators) allows the use of multi-pass one-directional algorithms with forward iterators.
—end note ]
Unfortunately, the standard must be read as a whole, and the explanation is not always where you expect it to be.
Input and output iterators are basically designed to allow single-pass traversal: to describe sequences where each element can only be visited once.
Streams are a great example. If you read from stdin or a socket, or write to a file, then there is only the stream's current position. All other iterators pointing to the same underlying sequence are invalidated when you increment an iterator.
Forward iterators allow multi-pass traversal, the additional guarantee you need: they ensure that you can copy your iterator, increment the original, and the copy will still point to the old position, so you can iterate from there.

comparing iterators from different containers

Is it legal to compare iterators from different containers?
std::vector<int> foo;
std::vector<int> bar;
Does the expression foo.begin() == bar.begin() yield false or undefined behavior?
(I am writing a custom iterator and stumbled upon this question while implementing operator==.)
If you consider the C++11 standard (n3337):
§ 24.2.1 — [iterator.requirements.general#6]
An iterator j is called reachable from an iterator i if and only if there is a finite sequence of applications of the expression ++i that makes i == j. If j is reachable from i, they refer to elements of the same sequence.
§ 24.2.5 — [forward.iterators#2]
The domain of == for forward iterators is that of iterators over the same underlying sequence.
Given that RandomAccessIterator must satisfy all requirements imposed by ForwardIterator, comparing iterators from different containers is undefined.
The LWG issue #446 talks specifically about this question, and the proposal was to add the following text to the standard (thanks to #Lightness Races in Orbit for bringing it to attention):
The result of directly or indirectly evaluating any comparison function or the binary - operator with two iterator values as arguments that were obtained from two different ranges r1 and r2 (including their past-the-end values) which are not subranges of one common range is undefined, unless explicitly described otherwise.
Undefined behavior as far as I know. In VS 2010 with
/*
* to disable iterator checking that complains that the iterators are incompatible (come from * different containers :-)
*/
#define _HAS_ITERATOR_DEBUGGING 0
std::vector<int> vec1, vec2;
std::vector<int>::iterator it1 = vec1.begin();
std::vector<int>::iterator it2 = vec2.begin();
if (it1 == it2)
{
std::cout << "they are equal!!!";
}
The equality test returns in this case true :-), since the containers are empty and the _Ptr member of the iterators are both nullptr.
Who knows maybe your implementation does things differently and the test would return false :-).
EDIT:
See C++ Standard library Active Issues list "446. Iterator equality between different containers". Maybe someone can check the standard to see if the change was adopted?
Probably not since it is on the active issues list so Charles Bailey who also answered this is right it's unspecified behavior.
So I guess the behavior could differ (at least theoretically) between different implementations and this is only one problem.
The fact that with iterator debugging enabled in the STL implementation that comes with VS checks are in place for this exact case (iterators coming from different containers) singnals at least to me once more that doing such comparisons should be avoided whenever possible.
You cannot directly compare iterators from different containers. An iterator is an object that uses the internal state of a container to traverse it; comparing the internals of one container to another simply does not make sense.
However, if the iterators resulting from container.begin() are available, it may make sense to compare iterators by the count of objects traversed from begin() to the current iterator value. This is done using std::distance:
int a = std::distance(containerA.begin(), iteratorA);
int b = std::distance(containerB.begin(), iteratorB);
if (a <comparison> b)
{ /* ... */ }
Without more context, it's difficult to judge whether this would solve your problem or not. YMMV.
No. If it were legal, this would imply that pointers would not be iterators.
I believe that it is unspecified behaviour (C++03). std::vector iterators are random access iterators and the behaviour of == is defined in the requirements for forward iterators.
== is an equivalence relation
Note that this is a requirement on a type, so must be applicable (in this case) to any pair of valid (dereferencable or otherwise) std::vector::iterators. I believe that this means == must give you a true/false answer and can't cause UB.
— If a and b are equal, then either a and b are both dereferenceable or else neither is dereferenceable.
Conversely, a dereferenceable iterator cannot compare equal to an iterator that is not dereferenceable.
— If a and b are both dereferenceable, then a == b if and only if *a and *b are the same object.
Note the lack of requirement on whether a == b for two iterators that aren't dereferenceable. So long as == is transitive (if a.end() == b.end() and b.end() == c.end() then a.end() == c.end()), reflexive (a.end() == a.end()) and symmetric (if a.end() == b.end() then b.end() == a.end()) it doesn't matter if some, all or no end() iterators to different containers compare equal.
Note, also, that this is in contrast to <. < is defined in terms of b - a, where a and b are both random access iterators. A pre-condition of performing b - a is that there must be a Distance value n such that a + n == b which requires a and b to be iterators into the same range.
ISO/IEC 14882:2003(E) 5.10.1
The == (equal to) and the != (not equal to) operators have the same semantic restrictions, conversions, and result type as the relational operators except for their lower precedence and truth-value result. [ .. ] Pointers to objects or functions of the same type (after pointer conversions) can be compared for equality. Two pointers of the same type compare equal if and only if they are both null, both point to the same function, or both represent the same address (3.9.2).
Simulation Results on XCode (3.2.3):
#include <iostream>
#include <vector>
int main()
{
std::vector <int> a,aa;
std::vector <float> b;
if( a.begin() == aa.begin() )
std::cout << "\n a.begin() == aa.begin() \n" ;
a.push_back(10) ;
if( a.begin() != aa.begin() )
std::cout << "\n After push back a.begin() != aa.begin() \n" ;
// Error if( a.begin() == b.begin() )
return 0;
}
Output :
a.begin() == aa.begin()
After push back a.begin() != aa.begin()
I don't get the requirements on input iterators from the standard 100%, but from there on (forward/bidirectional/random access iterators) there are no requirements on the domain of ==, so it must return false result in an equivalence relation. You can't do < or > or subtraction on iterators from different containers though.
Edit: It does not have to return false, it has to result in an equivalence relation, this allows .begin() of two empty containers to compare equal (as shown in another answer). If the iterators are dereferencable, a == b => *a == *b has to hold. It's still not undefined behaviour.