This question already has answers here:
Is comparison of const_iterator with iterator well-defined?
(3 answers)
Closed 4 years ago.
I have two iterators into a container, one const and one non-const. Is there an issue with comparing them to see if they both refer to the same object in the container? This is a general C++11 iterator question:
Can a const and non-const iterator be legitimately compared to see if
they both refer to the same object, independent of the type of
container (i.e., they are both iterators that are guaranteed to refer
to objects in the same container or that container's end(), but one is
const and the other is not)?
For example, consider the following code:
some_c++11_container container;
// Populate container
...
some_c++11_container::iterator iObject1=container.begin();
some_c++11_container::const_iterator ciObject2=container.cbegin();
// Some operations that move iObject1 and ciObject2 around the container
...
if (ciObject2==iObject1) // Is this comparison allowed by the C++11 standard?
...; //Perform some action contingent on the equality of the two iterators
Yes, this will work like you expect.
The Standard guarantees that for any container type, some_container::iterator can be implicitly converted to some_container::const_iterator.
The first table in 23.2.1 [container.requirements.general], after defining X as a container type which contains objects of type T, has:
Expression: X::iterator
Return type: iterator type whose value type is T
Note: any iterator category that meets the forward iterator requirements. convertible to X::const_iterator.
Expression: X::const_iterator
Return type: constant iterator type whose value type is T
Note: any iterator category that meets the forward iterator requirements.
(These aren't really expressions, and are types, rather than having "return types", but that's how they're squeezed into the table that is mostly expressions.)
So when you have ciObject2==iObject1, the compiler notices that the best operator== is ciObject2==some_container::const_iterator(iObject1). And operator== on two const_iterator tells you if they refer to the same element.
(I don't see anything explicitly saying that the result of this conversion refers to the same object as the original iterator. I guess that's just understood.)
From §24.2.3/1
A class or pointer type X satisfies the requirements of an input iterator for the value type T if X satisfies the Iterator (24.2.2) and EqualityComparable (Table 17) requirements ...
Thus input iterators are required to be EqualityComparable.
All standard library container iterators must satisfy forward iterator requirements (§23.2.1 - Table 96). Since those requirements are a superset of input iterator requirements, it follows these iterators must satisfy the EqualityComparable concept.
Also, from §23.2.1 - Table 96, X::iterator is required to be convertible to
X::const_iterator.
Adding the two together answers your question that it is indeed required by the standard that comparing a Container::const_iterator to a Container::iterator is well-defined (as long as both are valid iterators pointing to the same container).
I don't think it is possible for there to be an issue comparing them. If you have a const iterator when you check for it being the end the iterator end() returns is not const.
IIRC iterator implicitly converts to const_iterator. And the result must point to the same position.
If so the mixed compare will do the conversion then compare the now compatible const_iterators.
Related
I am a bit confused about the C++17 standard named requirements regarding iterators; it seems to me that std::map::iterator violates the requirements. From the standard, I gather the following:
All container iterators are supposed to satisfy ForwardIterator.
For an iterator It satisfying ForwardIterator, std::iterator_traits<It>::reference has to be either value_type& – if It is mutable – or const value_type& – if It is constant.
The notion “It is mutable” is defined as “It also satisfies OutputIterator”, while “constant” means non-mutable.
In order to satisfy OutputIterator, the expression *it = o should be valid for o being of a type writable to the iterator; there has to be a nonempty set of such types.
Now, std::iterator_traits<std::map::iterator>::reference is value_type& (without const), but the iterator does not satisfy the OutputIterator requirements – as value_type of map::iterator is std::pair<const Key, Value>, we can never assign to *it.
So, the question is, am I overlooking some part of the standard that deals with this inconsistency? Or am I interpreting the standard requirements wrong?
C++20 introduces std::common_reference. What is its purpose? Can someone give an example of using it?
common_reference came out of my efforts to come up with a conceptualization of STL's iterators that accommodates proxy iterators.
In the STL, iterators have two associated types of particular interest: reference and value_type. The former is the return type of the iterator's operator*, and the value_type is the (non-const, non-reference) type of the elements of the sequence.
Generic algorithms often have a need to do things like this:
value_type tmp = *it;
... so we know that there must be some relationship between these two types. For non-proxy iterators the relationship is simple: reference is always value_type, optionally const and reference qualified. Early attempts at defining the InputIterator concept required that the expression *it was convertible to const value_type &, and for most interesting iterators that is sufficient.
I wanted iterators in C++20 to be more powerful than this. For example, consider the needs of a zip_iterator that iterates two sequences in lock-step. When you dereference a zip_iterator, you get a temporary pair of the two iterators' reference types. So, zip'ing a vector<int> and a vector<double> would have these associated types:
zip iterator's reference : pair<int &, double &>
zip iterator's value_type: pair<int, double>
As you can see, these two types are not related to each other simply by adding top-level cv- and ref qualification. And yet letting the two types be arbitrarily different feels wrong. Clearly there is some relationship here. But what is the relationship, and what can generic algorithms that operate on iterators safely assume about the two types?
The answer in C++20 is that for any valid iterator type, proxy or not, the types reference && and value_type & share a common reference. In other words, for some iterator it there is some type CR which makes the following well-formed:
void foo(CR) // CR is the common reference for iterator I
{}
void algo( I it, iter_value_t<I> val )
{
foo(val); // OK, lvalue to value_type convertible to CR
foo(*it); // OK, reference convertible to CR
}
CR is the common reference. All algorithms can rely on the fact that this type exists, and can use std::common_reference to compute it.
So, that is the role that common_reference plays in the STL in C++20. Generally, unless you are writing generic algorithms or proxy iterators, you can safely ignore it. It's there under the covers ensuring that your iterators are meeting their contractual obligations.
EDIT: The OP also asked for an example. This is a little contrived, but imagine it's C++20 and you are given a random-access range r of type R about which you know nothing, and you want to sort the range.
Further imagine that for some reason, you want to use a monomorphic comparison function, like std::less<T>. (Maybe you've type-erased the range, and you need to also type-erase the comparison function and pass it through a virtual? Again, a stretch.) What should T be in std::less<T>? For that you would use common_reference, or the helper iter_common_reference_t which is implemented in terms of it.
using CR = std::iter_common_reference_t<std::ranges::iterator_t<R>>;
std::ranges::sort(r, std::less<CR>{});
That is guaranteed to work, even if range r has proxy iterators.
cppreference says that the iterators for the vector<bool> specialization are implementation defined and many not support traits like ForwardIterator (and therefore RandomAccessIterator).
cplusplus adds a mysterious "most":
The pointer and iterator types used by the container are not
necessarily neither pointers nor conforming iterators, although they
shall simulate most of their expected behavior.
I don't have access to the official specification. Are there any iterator behaviors guaranteed for the vector<bool> iterators?
More concretely, how would one write standards-compliant code to insert an item in the middle of a vector<bool>? The following works on several compilers that I tried:
std::vector<bool> v(4);
int k = 2;
v.insert(v.begin() + k, true);
Will it always?
The fundamental problem with vector<bool>'s iterators is that they are not ForwardIterators. C++14 [forward.iterators]/1 requires that ForwardIterators' reference type be T& or const T&, as appropriate.
Any function which takes a forward iterator over a range of Ts is allowed to do this:
T &t = *it;
t = //Some value.
However, vector<bool>'s reference types are not bool&; they're a proxy object that is convertible to and assignable from a bool. They act like a bool, but they are not a bool. As such, this code is illegal:
bool &b = *it;
It would be attempting to get an lvalue reference to a temporary created from the proxy object. That's not allowed.
Therefore, you cannot use vector<bool>'s iterators in any function that takes ForwardIterators or higher.
However, your code doesn't necessarily have to care about that. As long as you control what code you pass those vector<bool> iterators to, and you don't do anything that violates how they behave, then you're fine.
As far as their interface is concerned, they act like RandomAccessIterators, except for when they don't (see above). So you can offset them with integers with constant time complexity and so forth.
vector<bool> is fine, so long as you don't treat it like a vector that contains bools. Your code will work because it uses vector<bool>'s own interface, which it obviously accepts.
It would not work if you passed a pair of vector<bool> iterators to std::sort.
C++14 [vector.bool]/2:
Unless described below, all operations have the same requirements and
semantics as the primary vector template, except that operations
dealing with the bool value type map to bit values in the container
storage and allocator_traits::construct (20.7.8.2) is not used to
construct these values.
The following code compiles just fine, overwriting the values in v2 with those from v1:
std::vector<int> v1 = {1, 2, 3, 4, 5};
std::vector<int> v2 = {6, 7, 8, 9, 10};
std::copy(v1.begin(), v1.end(), v2.begin());
The third argument of std::copy is an OutputIterator. However, the Container requirements specify that a.begin(), where a is a Container object, should have a return type of iterator which is defined as:
any iterator category that meets the forward iterator requirements.
Forward iterator requirements do not include the requirements of output iterators, so is the example above undefined? I'm using the iterator as an output iterator even though there's no obvious guarantee that it will be one.
I'm fairly certain the above code is valid, however, so my guess is that you can infer from the details about containers that the forward iterator returned by begin() will in fact also support the output iterator requirements. In that case, when does begin() not return an output iterator? Only when the container is const or are there other situations?
Forward iterators can conform the the specifications of an output iterator if they're mutable, depending on the type of the sequence. It's not explicitly spelled out (unlike the fact that they to input iterator requirements), but if we take a look at the requirements table
we can go and check if a given forward iterator conforms to them:
*r = o
(§24.2.5/1): if X is a mutable iterator, reference is a reference to T
A mutable reference is assignable (unless you have a non-assignable type, obviously).
++r, r++, *r++ = o
(§24.2.5 Table 109)
The first line in Table 109 is the same requirement as for output iterators, except that forward iterators don't have the remark. The second line is more restrictive than for output iterators, since it specifies that a reference must be returned.
Bottom line, if you have a mutable forward iterator into a sequence of copy-assignable types, you have a valid output iterator.
(Technically, a constant iterator into a sequence of types that have a operator=(...) const and mutable members would also qualify, but let's hope nobody does something like that.)
Forward iterator requirements do not include the requirements of output iterators
This sounds backwards. OutputIterators need to satisfy fewer criteria than ForwardIterators.
(Forward iterators should be reusable after increment, i.e. incrementing them twice should yield the same result).
Therefore, it is ok provided that the output iterator stays valid until the algorithm completes. IOW:
auto outit = std::begin(v2);
std::advance(outit, v1.size()); // or: std::distance(std::begin(v1), std::end(v2))
// outit should still be valid here
Edit To the comment:
§ 24.2.1
Iterators that further satisfy the requirements of output iterators are called mutable iterators. Nonmutable iterators are referred to as constant iterators.
Now, let me find the bit that ties this together saying vector::begin() returns mutable Random Access iterator.
For info
§ 24.2.5 Forward iterators [forward.iterators]
1 A class or pointer type X satisfies the requirements of a forward iterator if
X satisfies the requirements of an input iterator (24.2.3),
X satisfies the DefaultConstructible requirements (17.6.3.1),
if X is a mutable iterator, reference is a reference to T; if X is a const iterator, reference is a reference to const T,
the expressions in Table 109 are valid and have the indicated semantics, and
objects of type X offer the multi-pass guarantee, described below.
According to C++ standard (3.7.3.2/4) using (not only dereferencing, but also copying, casting, whatever else) an invalid pointer is undefined behavior (in case of doubt also see this question). Now the typical code to traverse an STL containter looks like this:
std::vector<int> toTraverse;
//populate the vector
for( std::vector<int>::iterator it = toTraverse.begin(); it != toTraverse.end(); ++it ) {
//process( *it );
}
std::vector::end() is an iterator onto the hypothetic element beyond the last element of the containter. There's no element there, therefore using a pointer through that iterator is undefined behavior.
Now how does the != end() work then? I mean in order to do the comparison an iterator needs to be constructed wrapping an invalid address and then that invalid address will have to be used in a comparison which again is undefined behavior. Is such comparison legal and why?
The only requirement for end() is that ++(--end()) == end(). The end() could simply be a special state the iterator is in. There is no reason the end() iterator has to correspond to a pointer of any kind.
Besides, even if it were a pointer, comparing two pointers doesn't require any sort of dereference anyway. Consider the following:
char[5] a = {'a', 'b', 'c', 'd', 'e'};
char* end = a+5;
for (char* it = a; it != a+5; ++it);
That code will work just fine, and it mirrors your vector code.
You're right that an invalid pointer can't be used, but you're wrong that a pointer to an element one past the last element in an array is an invalid pointer - it's valid.
The C standard, section 6.5.6.8 says that it's well defined and valid:
...if the expression P points to the
last element of an array object, the
expression (P)+1 points one past the
last element of the array object...
but cannot be dereferenced:
...if the result points one past the
last element of the array object, it
shall not be used as the operand of a
unary * operator that is evaluated...
One past the end is not an invalid value (neither with regular arrays or iterators). You can't dereference it but it can be used for comparisons.
std::vector<X>::iterator it;
This is a singular iterator. You can only assign a valid iterator to it.
std::vector<X>::iterator it = vec.end();
This is a perfectly valid iterator. You can't dereference it but you can use it for comparisons and decrement it (assuming the container has a sufficient size).
Huh? There's no rule that says that iterators need to be implemented using nothing but a pointer.
It could have a boolean flag in there, which gets set when the increment operation sees that it passes the end of the valid data, for instance.
The implementation of a standard library's container's end() iterator is, well, implementation-defined, so the implementation can play tricks it knows the platform to support.
If you implemented your own iterators, you can do whatever you want - so long as it is standard-conform. For example, your iterator, if storing a pointer, could store a NULL pointer to indicate an end iterator. Or it could contain a boolean flag or whatnot.
I answer here since other answers are now out-of-date; nevertheless, they were not quite right to the question.
First, C++14 has changed the rules mentioned in the question. Indirection through an invalid pointer value or passing an invalid pointer value to a deallocation function are still undefined, but other operations are now implemenatation-defined, see Documentation of "invalid pointer value" conversion in C++ implementations.
Second, words matter. You can't bypass the definitions while applying the rules. The key point here is the definition of "invalid". For iterators, this is defined in [iterator.requirements]. Though pointers are iterators, meanings of "invalid" to them are subtly different. Rules for pointers render "invalid" as "don't indirect through invalid value", which is a special case of "not dereferenceable" to iterators; however, "not deferenceable" is not implying "invalid" for iterators. "Invalid" is explicitly defined as "may be singular", while "singular" value is defined as "not associated with any sequence" (in the same paragraph of definition of "dereferenceable"). That paragraph even explicitly defined "past-the-end values".
From the text of the standard in [iterator.requirements], it is clear that:
Past-the-end values are not assumed to be dereferenceable (at least by the standard library), as the standard states.
Dereferenceable values are not singular, since they are associated with sequence.
Past-the-end values are not singular, since they are associated with sequence.
An iterator is not invalid if it is definitely not singular (by negation on definition of "invalid iterator"). In other words, if an iterator is associated to a sequence, it is not invalid.
Value of end() is a past-the-end value, which is associated with a sequence before it is invalidated. So it is actually valid by definition. Even with misconception on "invalid" literally, the rules of pointers are not applicable here.
The rules allowing == comparison on such values are in input iterator requirements, which is inherited by some other category of iterators (forward, bidirectional, etc). More specifically, valid iterators are required to be comparable in the domain of the iterator in such way (==). Further, forward iterator requirements specifies the domain is over the underlying sequence. And container requirements specifies the iterator and const_iterator member types in any iterator category meets forward iterator requirements. Thus, == on end() and iterator over same container is required to be well-defined. As a standard container, vector<int> also obey the requirements. That's the whole story.
Third, even when end() is a pointer value (this is likely to happen with optimized implementation of iterator of vector instance), the rules in the question are still not applicable. The reason is mentioned above (and in some other answers): "invalid" is concerned with *(indirect through), not comparison. One-past-end value is explicitly allowed to be compared in specified ways by the standard. Also note ISO C++ is not ISO C, they also subtly mismatches (e.g. for < on pointer values not in the same array, unspecified vs. undefined), though they have similar rules here.
Simple. Iterators aren't (necessarily) pointers.
They have some similarities (i.e. you can dereference them), but that's about it.
Besides what was already said (iterators need not be pointers), I'd like to point out the rule you cite
According to C++ standard (3.7.3.2/4)
using (not only dereferencing, but
also copying, casting, whatever else)
an invalid pointer is undefined
behavior
wouldn't apply to end() iterator anyway. Basically, when you have an array, all the pointers to its elements, plus one pointer past-the-end, plus one pointer before the start of the array, are valid. That means:
int arr[5];
int *p=0;
p==arr+4; // OK
p==arr+5; // past-the-end, but OK
p==arr-1; // also OK
p==arr+123456; // not OK, according to your rule