std::unordered_set::equal_range returns a pair of iterators describing the range of values in the set where the keys for the values compare as equal. Given:
auto iteratorFromEqualRange = someUnorderedSet.equal_range(key).first;
auto iteratorFromFind = someUnorderedSet.find(key);
is it guaranteed by the Standard that:
++iteratorFromEqualRange == ++iteratorFromFind;
as they are both defined in terms of std::unordered_set::iterator? In other words, can a different implementation of std::unordered_set keep "hidden" information about the context of what we're iterating, or is this a not-very-subtle enforcement of the bucket interface (which limits our implementation options)?
I expect that this is indeed a guarantee, given the requirements of LegacyForwardIterator; I'm just asking for confirmation (or better news that includes some kind of escape hatch).
The iterator of unordered_set is a Forward Iterator (now named LegacyForwardIterator).
The C++14 standard (final draft n4140) states this regarding Forward Iterators:
24.2.5 Forward iterators [forward.iterators]
1 A class or pointer type X satisfies the requirements of a forward iterator if
...
(1.5) — objects of type X offer the multi-pass guarantee, described below.
...
3 Two dereferenceable iterators a and b of type X offer the multi-pass guarantee if:
(3.1) — a == b implies ++a == ++b and
(3.2) — X is a pointer type or the expression (void)++X(a), *a is equivalent to the expression *a.
Combining (1.5) and (3.1) in this case would mean that ++iteratorFromEqualRange == ++iteratorFromFind; is guaranteed by the standard, provided both these iterators can be dereferenced.
My question is mainly about terminology and how to interpret the standard.
[expr.rel]#4:
The result of comparing unequal pointers to objects is defined in terms of a partial order consistent with the following rules:
(4.1) If two pointers point to different elements of the same array, or to subobjects thereof, the pointer to the element with the higher subscript is required to compare greater.
(4.2) If two pointers point to different non-static data members of the same object, or to subobjects of such members, recursively, the pointer to the later declared member is required to compare greater provided the two members have the same access control ([class.access]), neither member is a subobject of zero size, and their class is not a union.
(4.3) Otherwise, neither pointer is required to compare greater than the other.
I am a little confused about how to interpret (4.3). Does that mean that this
#include <iostream>
int main() {
    int x;
    int y;
    std::cout << (&x < &y);
    std::cout << (&x < &y);
}
is...
valid C++ code and the output is either 11 or 00.
invalid code, because it has undefined behaviour
?
In other words, I know that (4.3) applies here, but I am not sure about the implications. When the standard says "it can be either A or B", is this the same as saying "it is undefined"?
The wording has changed in various editions of the C++ standard, and in the recent draft cited in the question. (See my comments on the question for the gory details.)
C++11 says:
Other pointer comparisons are unspecified.
C++17 says:
Otherwise, neither pointer compares greater than the other.
The latest draft, cited in the question, says:
Otherwise, neither pointer is required to compare greater than the other.
That change was made in response to an issue saying that the "compares greater" term is needlessly confusing.
If you look at the surrounding context in the draft standard, it's clear that in the remaining cases the result is unspecified. Quoting from [expr.rel] (text in square brackets is my summary):
The result of comparing unequal pointers to objects is defined in
terms of a partial order consistent with the following rules:
[pointers to elements of the same array]
[pointers to members of the same object]
[remaining cases] Otherwise, neither pointer is required to compare greater than the other.
If two operands p and q compare equal, p<=q and p>=q both yield true and p<q and p>q both yield false. Otherwise, if a pointer p compares greater than a pointer q, p>=q, p>q, q<=p, and q<p all yield true and p<=q, p<q, q>=p, and q>p all yield false. Otherwise, the result of each of the operators is unspecified.
So the result of the < operator in such cases is unspecified, but it does not have undefined behavior. It can be either true or false, but I don't believe it's required to be consistent. The program's output could be any of 00, 01, 10, or 11.
For the provided code, this case applies:
(4.3) Otherwise, neither pointer is required to compare greater than the other.
There is no mention of UB, and so a strict reading of "neither is required" suggests that the result of the comparison could be different every time it's evaluated.
This means the program could validly output any of the following results:
00
01
10
11
valid C++ code
Yes.
Nowhere does the standard say that this is UB or ill-formed, and neither is this case lacking a rule describing the behaviour because the quoted 4.3 applies.
and the output is either 11 or 00
I'm not sure that 10 or 01 are technically guaranteed not to be output [1].
Given that neither pointer is required to compare greater than the other, the result of the comparison can be either true or false. There appears to be no explicit requirement for the result to be the same for each invocation on the same operands in this case.
[1] But I consider this unlikely in practice. I also think that leaving such a possibility open is not intentional. Rather, the intention is to allow for a deterministic, but not necessarily total, order.
P.S.
auto comp = std::less<>{};
std::cout << comp(&x, &y);
std::cout << comp(&x, &y);
would be guaranteed to be either 11 or 00 because std::less (like its friends) is guaranteed to impose a strict total order for pointers.
x and y are not part of the same array, per (4.1), and they are not members of the same object, per (4.2). So you fall into (4.3), which means that if you compare them to each other, the result is unspecified: it could be true or false. If it were undefined behavior instead, the standard would likely state that explicitly.
I have found multiple times, while reading some concepts definitions, the use of the term equal, like in Swappable:
Let t1 and t2 be equality-preserving expressions that denote distinct equal objects of type T,
Is equal defined somewhere in the standard? I guess it means that the semantics of two objects, or the values they refer to (the human semantics given to their represented domain value), are the same, even if the objects are not comparable (no operator== overloaded), or something abstract like that: for example, two objects a and b are equal if a == b would yield true, assuming it were a valid expression, even when operator== is not defined because it's not required.
Since templates are going to work on the semantics the user gives, the concept of equality is largely defined by the program. For example, if I am working with case insensitive strings, I can consider the strings FoO and fOo to be equal and FoO and bAr to be unequal. The operations I supply must reflect this semantics.
Equality isn't defined based on operator==; on the contrary, operator== is (in a sense) defined based on equality. [concept.equalitycomparable]/equality_comparable:
template<class T>
concept equality_comparable = weakly-equality-comparable-with<T, T>;
Let a and b be objects of type T. T models
equality_comparable only if bool(a == b) is true when a is
equal to b ([concepts.equality]), and false otherwise.
[ Note: The requirement that the expression a == b is
equality-preserving implies that == is transitive and symmetric.
— end note ]
While checking the references for another question, I noticed an odd clause in C++11, at [expr.rel] ¶3:
Pointers to void (after pointer conversions) can be compared, with a result defined as follows: If both
pointers represent the same address or are both the null pointer value, the result is true if the operator is
<= or >= and false otherwise; otherwise the result is unspecified.
This seems to mean that, once two pointers have been cast to void *, their ordering relation is no longer guaranteed; for example, this:
int foo[] = {1, 2, 3, 4, 5};
void *a = &foo[0];
void *b = &foo[1];
std::cout<<(a < b);
would seem to be unspecified.
Interestingly, this clause wasn't there in C++03 and disappeared in C++14, so if we take the example above and apply the C++14 wording to it, I'd say that ¶3.1
If two pointers point to different elements of the same array, or to subobjects thereof, the pointer to the element with the higher subscript compares greater.
would apply, as a and b point to elements of the same array, even though they have been cast to void *. Notice that the wording of ¶3.1 was pretty much the same in C++11, but seemed to be overridden by the void * clause.
Am I right in my understanding? What was the point of that oddball clause added in C++11 and immediately removed? Or maybe it's still there, but moved to/implied by some other part of the standard?
TL;DR:
in C++98/03 the clause was not present, and the standard did not specify relational operators for void pointers (core issue 879, see end of this post);
the odd clause about comparing void pointers was added in C++11 to resolve it, but this in turn gave rise to two other core issues 583 & 1512 (see below);
the resolution of these issues required the clause to be removed and be replaced with the wording found in C++14 standard, which allows for "normal" void * comparison.
Core Issue 583: Relational pointer comparisons against the null pointer constant
Relational pointer comparisons against the null pointer constant Section: 8.9 [expr.rel]
In C, this is ill-formed (cf C99 6.5.8):
void f(char* s) {
    if (s < 0) { }
}
...but in C++, it's not. Why? Who would ever need to write (s > 0) when they could just as well write (s != 0)?
This has been in the language since the ARM (and possibly earlier);
apparently it's because the pointer conversions (7.11 [conv.ptr]) need
to be performed on both operands whenever one of the operands is of
pointer type. So it looks like the "null-ptr-to-real-pointer-type"
conversion is hitching a ride with the other pointer conversions.
Proposed resolution (April, 2013):
This issue is resolved by the resolution of issue 1512.
Core Issue 1512: Pointer comparison vs qualification conversions
Pointer comparison vs qualification conversions Section: 8.9 [expr.rel]
According to 8.9 [expr.rel] paragraph 2, describing pointer comparisons,
Pointer conversions (7.11 [conv.ptr]) and qualification conversions (7.5 [conv.qual]) are performed on pointer operands (or on a pointer operand and a null pointer constant, or on two null pointer constants, at least one of which is non-integral) to bring them to their composite pointer type.
This would appear to make the following example ill-formed,
bool foo(int** x, const int** y) {
    return x < y; // valid ?
}
because int** cannot be converted to const int**, according to the rules of 7.5 [conv.qual] paragraph 4.
This seems too strict for pointer comparison, and current
implementations accept the example.
Proposed resolution (November, 2012):
Relevant excerpts from resolution of the above issues are found in the paper: Pointer comparison vs qualification conversions (revision 3).
The following also resolves core issue 583.
Change in 5.9 expr.rel paragraphs 1 to 5:
In this section the following statement (the odd clause in C++11) has been expunged:
Pointers to void (after pointer conversions) can be compared, with a result defined as follows: If both pointers represent the same address or are both the null pointer value, the result is true if the operator is <= or >= and false otherwise; otherwise the result is unspecified
And the following statements have been added:
If two pointers point to different elements of the same array, or to subobjects thereof, the pointer to the element with the higher subscript compares greater.
If one pointer points to an element of an array, or to a subobject thereof, and another pointer points one past the last element of the array, the latter pointer compares greater.
So in the final working draft of C++14 (n4140) section [expr.rel]/3, the above statements are found as they were stated at the time of the resolution.
Digging for the reason why this odd clause was added led me to a much earlier issue 879: Missing built-in comparison operators for pointer types.
The proposed resolution of this issue (in July, 2009) led to the addition of this clause which was voted into WP in October, 2009.
And that is how it came to be included in the C++11 standard.
The C++17 standard 27.2.1.8 says:
An iterator j is called reachable from an iterator i if and only if
there is a finite sequence of applications of the expression ++i that
makes i == j.
That is to say, any conforming iterator type must provide operator ==.
However, I find nothing stating that operator != is a requirement for iterator types.
Does the C++ standard require that operator != be provided for a given iterator type?
See C++17 [input.iterators]/2 Table 95 "Input iterator requirements".
Input iterators require that a != b is valid and behaves the same as !(a == b) if the latter is valid. cppreference.com has a summary of these requirements.
Output iterators do not need to support either operation.
Reading Working Draft N3337-1, Standard for Programming Language C++, 24.2.5 Forward iterators, page 806.
From draft:
Two dereferenceable iterators a and b of type X offer the multi-pass guarantee if:
— a == b implies ++a == ++b and
— X is a pointer type or the expression (void)++X(a), *a is equivalent to the expression *a.
[ Note: The requirement that a == b implies ++a == ++b (which is not true for input and output iterators) and the removal of the restrictions on the number of the assignments through a mutable iterator (which applies to output iterators) allows the use of multi-pass one-directional algorithms with forward iterators.
—end note ]
Could someone reinterpret this in simpler terms? I understand that forward iterators are multi-pass, but I don't understand how this is accomplished by the C++ standard's requirements.
The term says it all, I'd think: you can pass through the sequence multiple times and remember positions within the sequence. As long as the sequence doesn't change, starting at a specific position (iterator) you'll traverse the same objects in the same order as often as you want. However, you can only go forward; there is no way to move backwards. The canonical example of a sequence like this is a singly-linked list.
The quoted clause basically says that if you have two iterators comparing equal and you increment each of them, you get to the same position and they compare equal again:
if (it1 == it2) {
++it1;
++it2;
assert(it1 == it2); // has to hold for multi-pass sequences
}
The somewhat weird expression ++X(a), *a is basically intended to advance an iterator that is independent of a, and the requirement that ++X(a), *a be equivalent to *a basically means that iterating over the sequence using an independent iterator doesn't change what a refers to. This is unlike input iterators, where ++InIt(a), *a is not necessarily equivalent to *a, as the first expression can change the position, possibly invalidating a and/or changing the value it refers to.
By contrast, a single-pass sequence (input and output iterators in standard terms) can only be traversed once: trying to traverse the sequence multiple times will not necessarily work. The canonical examples of sequences like this are input from the keyboard and output to the console: once read, you can't get the same characters back again, and once sent, you can't undo the characters.