Stroustrup on comparisons, errata? - c++

In Stroustrup's C++ book (4th Ed., page 891), where the properties of comparisons are described, he explains that the function cmp can be represented by less-than (<) for a strict weak ordering. I'm confused by his explanation of "Transitivity of equivalence", which reads as follows:
Transitivity of equivalence: Define equiv(x,y) to be
!(cmp(x,y)||cmp(y,x)). If equiv(x,y) and equiv(y,z), then equiv(x,z).
The last rule is the one that allows us to define equality (x==y) as !(cmp(x,y)||cmp(y,x)) if we need ==.
Should this instead be defined as follows?
cmp is <= and equiv(x,y) = (cmp(x,y) && cmp(y,x))
Appreciate your guidance.

This is not an erratum. With cmp as <, the definition gives the expected result for equal values:
equiv(x,y) := !(cmp(x,y)||cmp(y,x))
x := x
y := x
substituting in:
!((x < x) || (x < x))
!((false) || (false))
!(false)
true
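The substitution above can also be checked mechanically. This is only a minimal sketch: the helper name `equiv` follows the book's notation, while `by_abs` and `check_equiv` are hypothetical names introduced here. `by_abs` orders ints by absolute value, which shows that equivalent values need not be identical.

```cpp
#include <cassert>
#include <cstdlib>

// equiv(x, y) as in the quoted text: neither argument orders before the other.
template <class T, class Cmp>
bool equiv(const T& x, const T& y, Cmp cmp) {
    return !(cmp(x, y) || cmp(y, x));
}

// Hypothetical comparator: a strict weak ordering on int by absolute value.
struct by_abs {
    bool operator()(int a, int b) const { return std::abs(a) < std::abs(b); }
};

// Hypothetical helper that exercises the definition.
inline void check_equiv() {
    auto less = [](int a, int b) { return a < b; };
    assert(equiv(5, 5, less));        // x against itself: !(false || false)
    assert(!equiv(5, 6, less));       // 5 < 6, so not equivalent
    assert(equiv(3, -3, by_abs{}));   // equivalent under by_abs, yet 3 != -3
    assert(!equiv(3, 4, by_abs{}));   // |3| < |4|
}
```

Calling `check_equiv()` from `main` should pass silently if the definition behaves as described.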

Can this instead be defined as follows?
cmp is <= and equiv(x,y) = (cmp(x,y) && cmp(y,x))
Yes, that also gives you consistent definitions.
Should this instead be defined as follows?
It isn't better than the definitions we use, so I'd suggest no, mostly because there's loads of existing code written for the current definition.
Instead of the current definition of Compare
Compare is a set of requirements expected by some of the standard library facilities from the user-provided function object types.
The return value of the function call operation applied to an object of a type satisfying Compare, when contextually converted to bool, yields true if the first argument of the call appears before the second in the strict weak ordering relation induced by this type, and false otherwise.
For all a, cmp(a,a)==false
If cmp(a,b)==true then cmp(b,a)==false
If cmp(a,b)==true and cmp(b,c)==true then cmp(a,c)==true
It would instead be
The return value of the function call operation applied to an object of a type satisfying Compare, when contextually converted to bool, yields false if the first argument of the call appears after the second in the strict weak ordering relation induced by this type, and true otherwise.
For all a, cmp(a,a)==true
If cmp(a,b)==false then cmp(b,a)==true
If cmp(a,b)==true and cmp(b,c)==true then cmp(a,c)==true
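As a rough check, assuming plain int values, the two conventions can be compared directly. The names `equiv_strict`, `equiv_nonstrict` and `conventions_agree` are made up for this sketch.

```cpp
#include <cassert>

// Convention used by the standard library: cmp is a strict ordering such as <.
inline bool equiv_strict(int x, int y) {
    return !((x < y) || (y < x));
}

// Alternative convention from the question: cmp is a non-strict ordering such as <=.
inline bool equiv_nonstrict(int x, int y) {
    return (x <= y) && (y <= x);
}

// Hypothetical helper: compares both definitions over a small range of ints.
inline bool conventions_agree(int lo, int hi) {
    for (int x = lo; x <= hi; ++x)
        for (int y = lo; y <= hi; ++y)
            if (equiv_strict(x, y) != equiv_nonstrict(x, y))
                return false;
    return true;
}
```

Both definitions pick out the same pairs, but the standard library is specified in terms of the strict convention, so a `<=`-style comparator must still not be passed to `std::sort` or `std::set`.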

Related

Does greater operator ">" satisfy strict weak ordering?

Definition:
Let < be a binary relation where a < b means "a is less than b".
Let > be a binary relation where a > b means "a is greater than b".
So, we assume < and > have the meanings we usually use in daily life. In some programming languages (e.g. C++) we can overload them to give them different definitions, but hereafter we don't consider that.
Context:
As far as I can tell from the mathematical definition of strict weak ordering (e.g. Wikipedia), both < and > satisfy it. However, all the examples I have seen on many websites refer only to <. There is even a website which says
what they roughly mean is that a Strict Weak Ordering has to behave the way that "less than" behaves: if a is less than b then b is not less than a, if a is less than b and b is less than c then a is less than c, and so on.
Also, in N4140 (C++14 International Standard), strict weak ordering is defined as
(§25.4-4) If we define equiv(a, b) as !comp(a, b) && !comp(b, a), then the requirements are that comp and equiv both be transitive relations
where comp is defined as
(§25.4-2) Compare is a function object type (20.9). The return value of the function call operation applied to an object of type Compare, when contextually converted to bool (Clause 4), yields true if the first argument of the call is less than the second, and false otherwise. Compare comp is used throughout for algorithms assuming an ordering relation.
Question:
Does ">" satisfy strict weak ordering? I expect so, but have no confidence.
Does greater operator “>” satisfy strict weak ordering?
The mathematical strict greater than relation is a strict weak ordering.
As for the operator in the C++ language: for all integer types, yes. In general, no, but in most cases yes. The same applies to the strict less-than operator.
As for the confusing quote: "is less than" in that context is meant to convey that the end result of the sort operation is a non-decreasing sequence with respect to the ordering, i.e. each object is "less than" or equivalent to the objects after it. If std::greater is used as the comparison object, then greater values are "lesser" in that order.
This may be confusing, but is not intended to exclude strict greater than operator.
what is the case where > doesn't satisfy strict weak ordering?
Some examples:
Overloaded operators that don't satisfy the properties.
> operator on pointers that do not point to the same array has unspecified result.
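> does not satisfy the transitivity-of-equivalence requirement for floating point types in IEEE-754 representation unless NaNs are excluded from the domain: NaN > NaN is false, so irreflexivity holds, but NaN then counts as "equivalent" to every value.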
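The NaN case can be examined directly. The sketch below assumes IEEE-754 semantics for `double` (and no fast-math style compiler options); `equiv_gt` and `check_nan_breaks_equivalence` are hypothetical helper names.

```cpp
#include <cassert>
#include <limits>

// equiv for doubles under the > comparison: neither value is greater than the other.
inline bool equiv_gt(double x, double y) {
    return !(x > y) && !(y > x);
}

inline void check_nan_breaks_equivalence() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(!(nan > nan));          // NaN > NaN is false
    assert(equiv_gt(1.0, nan));    // every ordered comparison with NaN is false
    assert(equiv_gt(nan, 2.0));
    assert(!equiv_gt(1.0, 2.0));   // so equiv is not transitive: 1 ~ NaN ~ 2, but not 1 ~ 2
}
```

With NaN in the domain, 1.0 and 2.0 are each "equivalent" to NaN but not to each other, which is what rules out a strict weak ordering.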
Even if the standard refers to "less than" for arbitrary Compare functions, that only implies "less than" in the context of the ordering.
If I define an ordering by comparison function [](int a, int b) { return a > b; }, then an element is "less than" another in this ordering if its integer value is greater. That's because the ordering I've created is an ordering of the integers in reverse order. You shouldn't read < as "less than" in orderings. You should read it as "comes before".
Whenever x < y is a strict weak ordering then x > y is also a strict weak ordering, just with the reverse order.
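A short illustration of reading the comparator as "comes before", assuming int elements. `sorted_by` and `check_reverse_order` are hypothetical helpers written for this sketch; `std::sort`, `std::is_sorted`, `std::greater` and `std::less` are the standard facilities.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Sorts a copy of v in the ordering induced by cmp and returns it.
template <class Cmp>
std::vector<int> sorted_by(std::vector<int> v, Cmp cmp) {
    std::sort(v.begin(), v.end(), cmp);
    return v;
}

inline void check_reverse_order() {
    const std::vector<int> v{3, 1, 2};
    // Under std::greater, 3 "comes before" 2, which "comes before" 1.
    assert((sorted_by(v, std::greater<int>{}) == std::vector<int>{3, 2, 1}));
    assert((sorted_by(v, std::less<int>{}) == std::vector<int>{1, 2, 3}));
    // The descending result counts as sorted with respect to the same comparator.
    const std::vector<int> d = sorted_by(v, std::greater<int>{});
    assert(std::is_sorted(d.begin(), d.end(), std::greater<int>{}));
}
```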

Concept of "equal" in C++20 concepts

I have found multiple times, while reading some concepts definitions, the use of the term equal, like in Swappable:
Let t1 and t2 be equality-preserving expressions that denote distinct equal objects of type T,
Is equal defined somewhere in the standard? I guess it means that the two objects have the same meaning, or that the values they refer to (the human semantics given to the domain value they represent) are the same, even if the objects are not comparable (no operator== overloaded). Or perhaps something more abstract: two objects a and b are equal if a == b would yield true, assuming it were a valid expression (it might not be, because operator== is not required and so not defined).
Since templates are going to work on the semantics the user gives, the concept of equality is largely defined by the program. For example, if I am working with case insensitive strings, I can consider the strings FoO and fOo to be equal and FoO and bAr to be unequal. The operations I supply must reflect this semantics.
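The case-insensitive example might look like the following. This is only a sketch: `ci_string` is a hypothetical wrapper type, and the comparison assumes single-byte characters handled by `std::tolower` in the current C locale.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical wrapper type whose notion of "equal" ignores letter case.
struct ci_string {
    std::string text;
};

// operator== is written to agree with the equality the program has chosen.
inline bool operator==(const ci_string& a, const ci_string& b) {
    if (a.text.size() != b.text.size()) return false;
    for (std::size_t i = 0; i < a.text.size(); ++i) {
        // Cast to unsigned char first: std::tolower requires a non-negative value.
        const auto x = static_cast<unsigned char>(a.text[i]);
        const auto y = static_cast<unsigned char>(b.text[i]);
        if (std::tolower(x) != std::tolower(y)) return false;
    }
    return true;
}

inline bool operator!=(const ci_string& a, const ci_string& b) { return !(a == b); }
```

With these operators, `ci_string{"FoO"} == ci_string{"fOo"}` is true and `ci_string{"FoO"} == ci_string{"bAr"}` is false. Any other operation supplied for the type (a hash, for example) would need to treat FoO and fOo interchangeably as well.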
Equality isn't defined based on operator==; on the contrary, operator== is (in a sense) defined based on equality. [concept.equalitycomparable]/equality_comparable:
template<class T>
concept equality_comparable = weakly-equality-comparable-with<T, T>;
Let a and b be objects of type T. T models
equality_comparable only if bool(a == b) is true when a is
equal to b ([concepts.equality]), and false otherwise.
[ Note: The requirement that the expression a == b is
equality-preserving implies that == is transitive and symmetric.
— end note ]

Rigorous proof of the following C++ code's property?

Take the following C++14 code snippet:
unsigned int f(unsigned int a, unsigned int b){
if(a>b)return a;
return b;
}
Statement: the function f returns the maximum of its arguments.
Now, the statement is "obviously" true, yet I failed to prove it rigorously with respect to the ISO/IEC 14882:2014(E) specification.
First: I cannot state the property in a formal way.
A formalized version could be:
For every statement s, when the abstract machine (which is defined in the spec.) is in state P and s looks like "f(expr_a,expr_b)" and 'f' in s is resolved to the function in question, s(P).return=max(expr_a(P).return, expr_b(P).return).
Here for a state P and expression s, s(P) is the state of the machine after evaluation of s.
Question: What would be a correctly formalized version of the statement? How to prove the statement using the properties imposed by the above mentioned specification? For each deductive step please reference the applicable snippet from the standard allowing said step (the number of the segment is enough).
Edit: Maybe formalized in Coq
Please excuse my approximate, ageing knowledge of mathematics.
The maximum over a closed subset of the natural numbers (BN) can be defined as follows:
Max:(BN,BN) -> BN
(x ∊ BN)(a ∊ BN)(b ∊ BN)(x = Max(a,b)) => ( x=a & a>b | x=b )
where the symbols have their usual mathematical meaning.
Your function could be rewritten as follows, where UN is the set of unsigned int values:
f:(UN,UN) -> UN
(x ∊ UN)(a ∊ UN)(b ∊ UN)(x = f(a,b)) => ( x=a && a>b || x=b )
Where symbol = is operator==(unsigned int,unsigned int), etc...
So the question reduces to whether the standard specifies that the mathematical structure formed by the unsigned integers with the C++ arithmetic and comparison operators is isomorphic to the mathematical structure formed by a closed subset of N with the usual arithmetic operations and relations. I think the answer is yes, and this is expressed in plain English:
C++14 standard,[expr.rel]/5 (Relational Operators)
If both operands (after conversions) are of arithmetic or enumeration type, each of the operators shall yield true if the specified relationship is true and false if it is false.
C++14 standard, [basic.fundamental]/4 (Fundamental Types)
Unsigned integers shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.
Then you could also prove that ({true,false},&&,||) is isomorphic to Boolean arithmetic by analysing the text in [expr.log.and] and [expr.log.or].
I do not think that you should go further than showing that there is this isomorphism because further would mean demonstrating axioms.
It appears to me that the easiest solution is to prove this backwards. If the first argument to f is the maximum argument, prove that the first argument is returned (fairly easy - the maximum argument a is by definition bigger than b). If the second argument is the maximum argument, prove that the second argument is returned. If the two are equal, show that there is no unique maximum element, so the second argument is still a maximum argument.
Finally, prove that these three options are exhaustive. If a unique maximum argument is passed, it must be passed either as the first or the second argument, since f is binary.
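The three cases can at least be spot-checked by execution. This is testing, not a proof: the sketch compares the function from the question against `std::max` on a few boundary values, and `f_matches_max_on_samples` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// The function from the question, unchanged in behaviour.
unsigned int f(unsigned int a, unsigned int b) {
    if (a > b) return a;
    return b;
}

// Hypothetical helper: checks f against std::max on boundary values and a few others.
inline bool f_matches_max_on_samples() {
    const unsigned int top = std::numeric_limits<unsigned int>::max();
    const unsigned int samples[] = {0u, 1u, 2u, 1000u, top - 1u, top};
    for (unsigned int a : samples)
        for (unsigned int b : samples)
            if (f(a, b) != std::max(a, b)) return false;
    return true;
}
```

The sample pairs cover all three cases of the argument above: first argument larger, second argument larger, and both equal.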
I am unsure about what you want. Looking at a previous version, N3337, we can easily see that almost everything is specified:
a and b start with the calling values (Function 5.2.2 - 4)
Calling a function executes the compound statement of the function body (Obvious, but where?)
The statements are normally executed in order (Statements 6)
If-statements execute the first sub-statement if condition is true (The If Statement 6.4.1)
Relations actually work as expected (Relation operators 5.9 - 5)
The return-statement returns the value to the caller of the function (The return statement 6.6.3)
However, you attempt to start with f(expr_a, expr_b); and evaluating the arguments to f potentially requires a lot more; especially since they are not sequenced - and could be any function-call.

how is cmp defined in c++? with < or with <=?

I was wondering how the cmp function in std::sort and std::is_sorted is defined.
Here are two pieces of documentation for is_sorted_until which say it should be operator<:
en.cppreference.com
cplusplus.com
But I think there is a problem with equal elements.
The list {1,1,1} should not be sorted because 1<1==false.
But there is an example which says:
...
int *sorted_end = std::is_sorted_until(nums, nums + N);
...
1 1 4 9 5 3 : 4 initial sorted elements
but that should return 1 if < is used like documented.
It would work with <=, but that is not the way it is documented.
I'm really confused.
The comparison is required to define a strict weak ordering. A strict weak ordering defines a set of equivalence classes from the incomparability relation, i.e., if x < y is false, and y < x is false too (i.e. x and y cannot be compared with <), x and y are considered equivalent. These equivalence classes have a total order, and that's the total order resulting from the sort functions.
In the example given, {1,1,1} has only a single equivalence class, the one composed of {1,1,1}.
is_sorted_until finds the first element x[i] for which x[i] < x[i-1] is true.
To be exact, it's neither < nor <=; it defaults to std::less. That in turn calls < for most types, except where it is specialized. For example, < for pointers does not generally give a strict total order, while std::less does.
It does indeed use operator< unless you provide a custom comparison. But the definition of "sorted" is not a[n] < a[n+1] (which we might call "strictly sorted"), but !(a[n+1] < a[n]); so equal elements are considered sorted. This is equivalent to using <=, but (in common with all other standard algorithms) doesn't require that operator to be defined.
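The numbers from the question can be run through `std::is_sorted_until` to confirm this reading. `sorted_prefix_length` is a hypothetical helper written for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Number of leading elements that std::is_sorted_until considers sorted,
// using the default comparison (operator<).
inline std::ptrdiff_t sorted_prefix_length(const std::vector<int>& v) {
    return std::distance(v.begin(), std::is_sorted_until(v.begin(), v.end()));
}
```

For `{1, 1, 4, 9, 5, 3}` the helper returns 4, and for `{1, 1, 1}` it returns 3, because an equal neighbour never satisfies `a[n+1] < a[n]`.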
In general, all ordered comparisons must define a "strict weak ordering". "Strict" means that the comparison must be false for equivalent objects; so < is valid, while <= is not.
If you look at the example implementation, < is used for checking if the next element is less than the previous one:
if (*next < *first)
return next;
If it is, then the order is broken, and the function returns. I.e. the logic is reversed: the algorithm does not terminate if the next element is equal to the previous.
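A complete loop around that test could be sketched as follows. This is not the library's actual source, only an illustration in the spirit of the example implementation; `my_is_sorted_until` is a hypothetical name.

```cpp
#include <cassert>
#include <vector>

// Returns an iterator to the first element that is strictly less than its
// predecessor, or last if there is no such element.
template <class It>
It my_is_sorted_until(It first, It last) {
    if (first == last) return last;
    It next = first;
    while (++next != last) {
        if (*next < *first)   // order is broken only when next is strictly smaller
            return next;
        first = next;         // equal or larger: keep scanning
    }
    return last;
}
```

Because the test is `*next < *first` rather than `!(*first < *next)`, a run of equal elements never stops the scan.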

Are the integer comparison operators short circuited in C++?

Like the title states, are the integer (or any numerical datatypes like float etc.) comparison operators (==, !=, >, >=, <, <=) short circuited in C++?
They can't short circuit. To know if x == y, x != y, etc are true or false you need to evaluate both, x and y. Short circuiting refers to logical boolean operators && and ||. Logical AND is known to be false if the first argument is false and Logical OR is known to be true if the first argument is true. In these cases you don't need to evaluate the second argument, this is called short circuiting.
Edit: this follows the discussion about why x >= y doesn't short circuit when the operands are unsigned ints and x is zero:
For logical operands short circuiting comes for free and is implementation neutral. The machine code for if(f() && g()) stmt; is likely to look similar to this:
call f
test return value of f
jump to next on zero
call g
test return value of g
jump to next on zero
execute stmt
next: ...
To prevent short circuiting you actually need to do the computation of the result of the operator and test it after that. This takes you a register and makes the code less efficient.
For non-logical operators the situation is the opposite. Mandating short circuiting implies:
The compiler can't choose an evaluation of the expression that uses a minimum number of registers.
The semantics may be implementation defined (or even undefined) for many cases, like when comparing with maximum value.
The compiler needs to add an additional test/jump. For if(f() > g()) stmt; the machine code will look like this:
call f
mov return value of f to R1
test return value of f
jump to next on zero
call g
compare R1 with return value of g
jump to next on less-than equal
execute stmt
next: ...
Note how the first test and jump are just unnecessary otherwise.
No. The comparison operators require both operands to evaluate the correct answer. By contrast, the logical operators && and || in some cases don't need to evaluate the right operand to get the right answer, and therefore do "short-circuit".
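The difference can be observed by counting how often each operand is evaluated. In this sketch `f`, `g`, the two counters and `check_short_circuit` are hypothetical names chosen for illustration.

```cpp
#include <cassert>

// Hypothetical counters recording how often each operand function is called.
int f_calls = 0;
int g_calls = 0;

int f() { ++f_calls; return 0; }
int g() { ++g_calls; return 1; }

inline void check_short_circuit() {
    f_calls = g_calls = 0;
    bool both = f() && g();   // f() yields 0 (false), so g() is never called
    assert(!both && f_calls == 1 && g_calls == 0);

    f_calls = g_calls = 0;
    bool ge = f() >= g();     // a comparison evaluates both operands
    assert(!ge && f_calls == 1 && g_calls == 1);
}
```

Note that the order in which the two operands of `>=` are evaluated is not fixed by the language; only the fact that both are evaluated is.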
No, how could they be? In order to check whether 1 == 2 you have to inspect both the 1 and the 2. (Of course, a compiler can do a lot of reordering, static checking, optimizations, etc., but that's not inherent to C++.)
How would that work? Short-circuiting means you can avoid evaluating the RHS based solely on the result of evaluating the LHS.
e.g.
true || false
doesn't need to evaluate the RHS because true || x is true no matter what x turns out to be.
But this won't work for any of the comparisons that you list. For example:
5 == x
How can you ever know the result of the expression without knowing x?