What does mathematically defined result mean?
There is a quote from clause 5, paragraph 4 (5/4) of the standard:
If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable values for
its type, the behavior is undefined.
There's a note right after this statement, which provides some types of examples:
[ Note: most existing implementations of C++ ignore integer overflows. Treatment of division by zero, forming a remainder using a zero divisor, and all floating point exceptions vary among machines, and is usually adjustable by a library function. —end note ]
For example, 0/0 is not mathematically defined.
The case of 1/0 is slightly different, but in practice, for the C++ standard you can be sure that it's not viewed as mathematically defined.
The mathematical way to state this would be that the behavior is undefined iff the inputs are not elements of the natural domain of the function.
(The second part, about results being representable, translates to a restriction on the codomain of the function)
It depends on context. "Mathematically not defined" simply means: it is not defined within mathematics.
Suppose you want to divide by 0, but it is not defined: division is the inverse of multiplication. If a / b = c, then b * c = a.
But if b = 0, then any multiple of b is also 0, and so if a != 0, no such c exists.
On the other hand, if a and b are both zero, then every real number c satisfies b * c = a. Either way, it is impossible to assign a particular real number to the quotient when the divisor is zero. (From Wikipedia.)
In algebra, a function is said to be "undefined" at points not in its domain.
In geometry, in ancient times, geometers attempted to define every term. For example, Euclid defined a point as "that which has no part". In modern times, mathematicians recognized that attempting to define every word inevitably led to circular definitions, and in geometry left some words, "point" for example, as undefined.
So it depends on context.
In a programming context, you can assume that it means dividing by 0 or going out of the defined range, causing overflow.
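As a minimal sketch (these expressions are illustrative, not from the question), here are the two failure modes that quoted sentence names: a result that is not mathematically defined, and a result outside the representable range:

#include <climits>

int main() {
    int zero = 0;
    int q = 1 / zero;     // not mathematically defined: undefined behavior
    int r = 7 % zero;     // remainder with a zero divisor: undefined behavior
    int o = INT_MAX + 1;  // not representable in int: undefined behavior
    (void)q; (void)r; (void)o;
}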
"Mathematically defined result" means the same thing it would mean in other similar contexts. Namely, there are constructs in most programming languages that represent, to that or another degree of accuracy, various mathematical concepts via some kind of natural mapping. This mapping is normally not defined in the documentation but is implied by reader's understanding of the natural language and common sense. In the C++ standard, it is referred in various places as "usual" or "ordinary" mathematical rules.
If you are unsure what this mapping is, I guess you can submit a defect report, but usually users of the standard know what 2+2 maps to, just as they know what words like if or the or appear or requirement mean.
It is true that 2+2 can be conceivably mapped to various mathematical constructs, not necessarily even connected to say Peano arithmetic, but that's not an "ordinary" or "usual" mathematics by the standards of C++ community.
So to answer the question, take an expression and map it to the corresponding mathematical concept. If it is not defined, then it is not defined.
Related
In C and C++, floating-point computations are non-deterministic by default: not even the actual datatype is chosen by the user, since for any intermediate computation of a floating-point subexpression, the compiler may choose to represent the value with higher precision (that is, as another real datatype).
[Some compilers (GCC) were known to do that for any automatic variable, not just the (anonymous) intermediate result of a subexpression.]
The compiler can do that for a few computations in some functions; it can do it in some cases and not in others for exactly the same subexpression.
It can even inline a function and use a different precision at each place the function is called in the source code. It means that any inlinable function can have call-dependent semantics; only separately compiled, ABI-called functions (those functions that are called according to the conventions described by the ABI and which act essentially as black boxes) have the absolute guarantee of a single floating-point behavior, fixed during separate compilation (which means that no global optimization occurs).
[Note that this is similar to how string literals are defined: any two evaluations of the same string literal in the source code can refer to the same or different character arrays.]
That means that even for purely applicative functions, the fundamental equality f(x) == f(x) is only guaranteed if floating-point operations (and string literals) are not used (or if the address of a string literal is used only to access its elements).
So floating-point operations have non-deterministic semantics, with an arbitrary choice made by the compiler for each and every FP operation (which seems a lot more perverse than the very small issue of letting the compiler choose whether to evaluate A or B first in A+B).
It seems that a function that does any computation with intermediate floating-point values cannot be used with any STL container or algorithm that expects a functor satisfying axioms, such as:
sorted containers: set, map, multiset, multimap
hashed containers
sorting algorithms: sort, stable_sort
algorithms operating on sorted ranges: lower_bound, set_union, set_intersection...
as all binary predicates and hash functions must be deterministic before axioms can even be conceived; that is, they must be purely applicative, mathematical functions with a defined value for all possible inputs, which is never the case with C++'s non-deterministic floating-point intermediate values?
In other words, are floating-point operations by default almost unusable based on the standard alone, and only usable in real-world implementations that have some (vague) implicit guarantees of determinism?
A compiler is allowed to use higher precision operations and temporary values when evaluating floating-point arithmetic expressions. So if you do a1 = b+c; a2 = b+c;, then yes, it's possible for a1 and a2 to not be equal to each other (due to double rounding).
That really doesn't matter to the algorithms and containers you mentioned, though; those don't do any arithmetic on their values. The "axioms" that those rely on are those of ordering relationships at most.
The worst you could say is that if you'd previously stored b+c in, say, a set, doing mySet.find(b+c) might fail. So yeah, don't do that. But that's something you don't generally do anyway, the rounding error produced by floating-point arithmetic already making it rare to expect exact equality from derived quantities.
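To make that pitfall concrete, here is a minimal sketch of a comparison the standard does not guarantee (whether it actually fails depends on the target and compiler flags, e.g. x87 excess precision on 32-bit x86):

#include <iostream>

int main() {
    float b = 1.0f / 3.0f;
    float c = 2.0f / 3.0f;
    float a1 = b + c;                    // rounded to float when stored
    // The fresh evaluation of b + c below may be carried out in extended
    // precision, so the standard does not guarantee this prints 1.
    std::cout << (a1 == b + c) << '\n';  // usually 1 in practice
}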
About the only time that you do start having to worry about extended precision is when you're calculating theoretical FP error bounds on a sequence of operations. When you're doing that, you'll know what to look for. Until then, this problem is not a problem.
It is common knowledge that one has to be careful when comparing floating point values. Usually, instead of using ==, we use some epsilon or ULP based equality testing.
However, I wonder, are there any cases, when using == is perfectly fine?
Look at this simple snippet, which cases are guaranteed to succeed?
void fn(float a, float b) {
    float l1 = a/b;
    float l2 = a/b;
    if (l1==l1) { } // case a)
    if (l1==l2) { } // case b)
    if (l1==a/b) { } // case c)
    if (l1==5.0f/3.0f) { } // case d)
}

int main() {
    fn(5.0f, 3.0f);
}
Note: I've checked this and this, but they don't cover (all of) my cases.
Note 2: It seems that I have to add some more information so that the answers can be useful in practice. I'd like to know:
what the C++ standard says
what happens, if a C++ implementation follows IEEE-754
This is the only relevant statement I found in the current draft standard:
The value representation of floating-point types is implementation-defined. [ Note: This document imposes no requirements on the accuracy of floating-point operations; see also [support.limits]. — end note ]
So, does this mean, that even "case a)" is implementation defined? I mean, l1==l1 is definitely a floating-point operation. So, if an implementation is "inaccurate", then could l1==l1 be false?
I think this question is not a duplicate of Is floating-point == ever OK?. That question doesn't address any of the cases I'm asking. Same subject, different question. I'd like to have answers specifically to case a)-d), for which I cannot find answers in the duplicated question.
However, I wonder, are there any cases, when using == is perfectly fine?
Sure there are. One category of examples is usages that involve no computation, e.g. setters that should only execute on changes:
void setRange(float min, float max)
{
    if (min == m_fMin && max == m_fMax)
        return;

    m_fMin = min;
    m_fMax = max;

    // Do something with min and/or max
    emit rangeChanged(min, max);
}
See also Is floating-point == ever OK?.
Contrived cases may "work". Practical cases may still fail. One additional issue is that often optimisation will cause small variations in the way the calculation is done so that symbolically the results should be equal but numerically they are different. The example above could, theoretically, fail in such a case. Some compilers offer an option to produce more consistent results at a cost to performance. I would advise "always" avoiding the equality of floating point numbers.
Equality of physical measurements, as well as digitally stored floats, is often meaningless. So if you're comparing floats for equality in your code, you are probably doing something wrong. You usually want greater than, or less than, or within a tolerance. Often code can be rewritten so these kinds of issues are avoided.
Only a) and b) are guaranteed to succeed in any sane implementation (see the legalese below for details), as they compare two values that have been derived in the same way and rounded to float precision. Consequently, both compared values are guaranteed to be identical to the last bit.
Case c) and d) may fail because the computation and subsequent comparison may be carried out with higher precision than float. The different rounding of double should be enough to fail the test.
Note that cases a) and b) may still fail if infinities or NaNs are involved, though.
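For instance, a minimal sketch of the NaN caveat (assuming IEEE 754 semantics, where 0.0f/0.0f yields a NaN):

#include <cmath>
#include <iostream>

int main() {
    float l1 = 0.0f / 0.0f;               // NaN under IEEE 754
    std::cout << (l1 == l1) << '\n';      // prints 0: NaN compares unequal to itself
    std::cout << std::isnan(l1) << '\n';  // prints 1
}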
Legalese
Using the N3242 C++11 working draft of the standard, I find the following:
In the text describing the assignment expression, it is explicitly stated that type conversion takes place, [expr.ass] 3:
If the left operand is not of class type, the expression is implicitly converted (Clause 4) to the cv-unqualified type of the left operand.
Clause 4 refers to the standard conversions [conv], which contain the following on floating point conversions, [conv.double] 1:
A prvalue of floating point type can be converted to a prvalue of another floating point type. If the
source value can be exactly represented in the destination type, the result of the conversion is that exact
representation. If the source value is between two adjacent destination values, the result of the conversion
is an implementation-defined choice of either of those values. Otherwise, the behavior is undefined.
(Emphasis mine.)
So we have the guarantee that the result of the conversion is actually defined, unless we are dealing with values outside the representable range (like float a = 1e300, which is UB).
When people think about "internal floating point representation may be more precise than visible in code", they think about the following sentence in the standard, [expr] 11:
The values of the floating operands and the results of floating expressions may be represented in greater
precision and range than that required by the type; the types are not changed thereby.
Note that this applies to operands and results, not to variables. This is emphasized by the attached footnote 60:
The cast and assignment operators must still perform their specific conversions as described in 5.4, 5.2.9 and 5.17.
(I guess, this is the footnote that Maciej Piechotka meant in the comments - the numbering seems to have changed in the version of the standard he's been using.)
So, when I say float a = some_double_expression;, I have the guarantee that the result of the expression is actually rounded to be representable by a float (invoking UB only if the value is out-of-bounds), and a will refer to that rounded value afterwards.
An implementation could indeed specify that the result of the rounding is random, and thus break the cases a) and b). Sane implementations won't do that, though.
Assuming IEEE 754 semantics, there are definitely some cases where you can do this. Conventional floating point number computations are exact whenever they can be, which for example includes (but is not limited to) all basic operations where the operands and the results are integers.
So if you know for a fact that you don't do anything that would result in something unrepresentable, you are fine. For example
#include <cassert>

int main() {
    float a = 1.0f;
    float b = 1.0f;
    float c = 2.0f;
    assert(a + b == c); // you can safely expect this to succeed
}
The situation only really gets bad if you have computations with results that aren't exactly representable (or that involve operations which aren't exact) and you change the order of operations.
Note that the C++ standard itself doesn't guarantee IEEE 754 semantics, but that's what you can expect to be dealing with most of the time.
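As a small illustration (assuming IEEE 754 doubles evaluated without excess precision), merely reassociating an inexact computation changes the result:

#include <iostream>

int main() {
    double x = (0.1 + 0.2) + 0.3;  // 0.6000000000000001
    double y = 0.1 + (0.2 + 0.3);  // 0.6
    std::cout << (x == y) << '\n'; // prints 0
}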
Case (a) fails if a and b are both 0.0. In this case, the operation yields NaN, and by definition (IEEE, not C) NaN ≠ NaN.
Cases (b) and (c) can fail in parallel computation when floating-point rounding modes (or other computation modes) are changed in the middle of this thread's execution. Seen this one in practice, unfortunately.
Case (d) can be different because the compiler (on some machine) may choose to constant-fold the computation of 5.0f/3.0f and replace it with the constant result (of unspecified precision), whereas a/b must be computed at runtime on the target machine (which might be radically different). In fact, intermediate calculations may be performed in arbitrary precision. I've seen differences on old Intel architectures when intermediate computation was performed in 80-bit floating-point, a format that the language didn't even directly support.
In my humble opinion, you should not rely on the == operator because it has many corner cases. The biggest problem is rounding and extended precision. In the case of x86, floating-point operations can be done with greater precision than you can store in a variable (if you use the x87 coprocessor; IIRC, SSE operations use the same precision as storage).
This is usually good thing, but this causes problems like:
1./2 != 1./2, because one value comes from a variable and the other from a floating-point register. In the simplest cases it will work, but if you add other floating-point operations, the compiler could decide to spill some variables to the stack, changing their values and thus the result of the comparison.
To have 100% certainty you need to look at the assembly and see what operations were performed on both values beforehand. Even the order can change the result in non-trivial cases.
Overall, what is the point of using ==? You should use algorithms that are stable, meaning they work even if values are not exactly equal and still give the same results. The only place I know of where == could be useful is serializing/deserializing, where you know exactly what result you want and can alter the serialization to achieve your goal.
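For completeness, here is a minimal sketch of the tolerance-based comparison mentioned at the top of the question (the helper name and the default tolerance are made up for illustration; the right tolerance always depends on the computation at hand):

#include <algorithm>
#include <cmath>

// Relative-epsilon comparison: treats a and b as equal when their
// difference is small compared to their magnitudes.
bool nearlyEqual(float a, float b, float relTol = 1e-5f) {
    return std::fabs(a - b) <= relTol * std::max(std::fabs(a), std::fabs(b));
}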
I did a quick test using the following:
float x = std::numeric_limits<float>::max();
x += 0.1;
that resulted in x == std::numeric_limits<float>::max(), so it didn't get any bigger than the limit.
Is this guaranteed behavior across compilers and platforms though? What about HLSL?
Is this guaranteed behavior across compilers and platforms though?
No, the behavior is undefined. The standard says (emphasis mine):
5 Expressions....
If during the evaluation of an expression, the result is not mathematically defined or not in the range of
representable values for its type, the behavior is undefined. [ Note: most existing implementations of C++
ignore integer overflows. Treatment of division by zero, forming a remainder using a zero divisor, and all
floating point exceptions vary among machines, and is usually adjustable by a library function. —end note ]
As @user2079303 mentioned, in practice we can be less restricted:
it is not undefined if std::numeric_limits<float>::has_infinity, which is often true. In that case, the result is merely unspecified.
The value of std::numeric_limits<T>::max() is defined to be the maximum finite value representable by type T (see 18.3.2.4 [numeric.limits.members] paragraph 4). Thus, the question actually becomes multiple subquestions:
Is it possible to create a value bigger than std::numeric_limits<T>::max(), i.e., is there an infinity?
If so, which value needs to be added to std::numeric_limits<T>::max() to get the infinity?
If not, is the behavior defined?
C++ does not specify the floating-point format, and different formats may disagree on what the result is. In particular, I don't think floating-point formats need to define a value for infinity. For example, the IBM hexadecimal floating-point format does not have an infinity; on the other hand, IEEE 754 does have an infinity representation.
Overflow of arithmetic types may be undefined behavior (see 5 [expr] paragraph 4), and I don't see any exclusion for floating-point types. Thus, the behavior would be undefined if there is no infinity. At least it can be tested whether a type has an infinity (see 18.3.2.3 [numeric.limits] paragraph 35), in which case the operation can't overflow.
If there is an infinity, I think adding any value to std::numeric_limits<T>::max() would get you infinity. However, determining whether that is indeed the case would require digging through the respective floating-point specification. I could imagine that IEEE 754 might ignore additions if the value is too small to be relevant, as is the case when adding 0.1 to std::numeric_limits<T>::max(). I could also imagine that it decides that it always overflows to infinity.
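As a hedged sketch of what happens on a typical IEEE 754 target with the default round-to-nearest mode (this is IEEE behavior, not something the C++ standard itself guarantees): 0.1 is far below half a ULP of FLT_MAX, so the sum rounds back down, while a genuinely overflowing operation rounds to infinity:

#include <cassert>
#include <limits>

int main() {
    if (std::numeric_limits<float>::has_infinity) {
        float m = std::numeric_limits<float>::max();
        float a = m + 0.1f;  // far below half a ULP of m: rounds back to m
        float b = m * 2.0f;  // true overflow: rounds to +infinity
        assert(a == m);
        assert(b == std::numeric_limits<float>::infinity());
    }
}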
This is awkward, but the bitwise AND operator is defined in the C++ standard as follows (emphasis mine).
The usual arithmetic conversions are performed; the result is the bitwise AND function of its operands. The operator applies only to integral or unscoped enumeration operands.
This looks kind of meaningless to me. The "bitwise AND function" is not defined anywhere in the standard, as far as I can see.
I get that the AND function is well-understood and thus may not require explanation. The meaning of the word "bitwise" should also be rather clear: the function is applied to corresponding bits of its operands. However, what constitutes the bits of the operands is not clear.
What gives?
This is underspecified. The issue of what the standard means when it refers to bit-wise operations is the subject of a few defect reports.
For example defect report 1857: Additional questions about bits:
The specification of the bitwise operations in 5.11 [expr.bit.and],
5.12 [expr.xor], and 5.13 [expr.or] uses the undefined term “bitwise” in describing the operations, without specifying whether it is the
value or object representation that is in view.
Part of the resolution of this might be to define “bit” (which is
otherwise currently undefined in C++) as a value of a given power of
2.
and the response was:
CWG decided to reformulate the description of the operations
themselves to avoid references to bits, splitting off the larger
questions of defining “bit” and the like to issue 1943 for further
consideration.
and defect report 1943 says:
CWG decided at the 2014-06 (Rapperswil) meeting to address only a
limited subset of the questions raised by issues 1857 and 1861. This
issue is a placeholder for the remaining questions, such as defining a
“bit” in terms of a value of 2^n, specifying whether a bit-field has a
sign bit, etc.
We can see from this defect report 1796: Is all-bits-zero for null characters a meaningful requirement?, that this issue of what the standard means when it refers to bits affected/affects other sections as well:
According to 2.3 [lex.charset] paragraph 3,
The basic execution character set and the basic execution wide-character set shall each contain all the members of the basic
source character set, plus control characters representing alert,
backspace, and carriage return, plus a null character (respectively,
null wide character), whose representation has all zero bits.
It is not clear that a portable program can examine the bits of the
representation; instead, it would appear to be limited to examining
the bits of the numbers corresponding to the value representation
(3.9.1 [basic.fundamental] paragraph 1). It might be more appropriate
to require that the null character value compare equal to 0 or '\0'
rather than specifying the bit pattern of the representation.
There is a similar issue for the definition of shift, bitwise and, and
bitwise or operators: are those specifications constraints on the bit
pattern of the representation or on the values resulting from the
interpretation of those patterns as numbers?
In this case the resolution was to change:
representation has all zero bits
to:
value is 0.
Note that, as mentioned in ecatmur's answer, the draft C++ standard does defer to C standard section 5.2.4.2.1 in section 3.9.1 [basic.fundamental] paragraph 3, but it does not refer to section 6.5/4 of the C standard, which would at least tell us that the results are implementation-defined. I explain in my comment below that the C++ standard can only incorporate text from normative references explicitly.
[basic.fundamental]/3 defers to C 5.2.4.2.1. It seems reasonable that the bitwise operators in C++ being underspecified should similarly defer to C, in this case 6.5.10/4:
The result of the binary & operator is the bitwise AND of the operands (that is, each bit in
the result is set if and only if each of the corresponding bits in the converted operands is
set).
Note that C 6.5/4 has:
Some operators (the unary operator ~, and the binary operators <<, >>, &, ^, and |,
collectively described as bitwise operators) are required to have operands that have
integer type. These operators yield values that depend on the internal representations of
integers, and have implementation-defined and undefined aspects for signed types.
The internal representations of the integers are of course described in 6.2.6.2/1, /2.
The C++ standard defines storage as a certain number of bits. The implementation may decide what meaning to attribute to a particular bit; that being said, binary AND is supposed to work on the conceptual 0s and 1s forming a particular type's representation.
3.9.1.7. (...) The representations of integral types shall define values by use of a pure binary numeration system.49 (...)
3.9.1, footnote 49) A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive
bits are additive, begin with 1, and are multiplied by successive integral power of 2, except perhaps for the bit with the highest
position
That means that, for whatever physical representation is used, binary AND acts according to the truth table of the AND function: for each bit number i, take bits Ai and Bi from the respective operands and produce a 1 only if both are 1, otherwise produce a 0 for bit Ri. The resulting value is left for the implementation to interpret, but whatever is chosen, it has to be in line with other expectations with regard to other binary operations like OR and XOR.
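A tiny sketch of that truth-table behavior on an ordinary unsigned type (arbitrary example values):

#include <iostream>

int main() {
    unsigned a = 0b1100;     // bits: 1 1 0 0
    unsigned b = 0b1010;     // bits: 1 0 1 0
    unsigned r = a & b;      // each result bit is set iff set in both operands
    std::cout << r << '\n';  // prints 8 (0b1000)
}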
Legally, we could consider all bitwise operations to have undefined behaviour as they are not actually defined.
More reasonably, we are expected to apply common sense and refer to the common meanings of these operations, applying them to the bits of the operands (hence the term "bitwise").
But nothing actually states that. Shame my answer can't be considered normative wording.
I've been programming for a while in C++, but suddenly had a doubt and wanted to clarify with the Stackoverflow community.
When an integer is divided by another integer, we all know the result is an integer, and likewise, a float divided by a float is a float.
But who is responsible for providing this result? Is it the compiler or DIV instruction?
That depends on whether or not your architecture has a DIV instruction. If your architecture has both integer and floating-point divide instructions, the compiler will emit the right instruction for the case specified by the code. The language standard specifies the rules for type promotion and whether integer or floating-point division should be used in each possible situation.
If you have only an integer divide instruction, or only a floating-point divide instruction, the compiler will inline some code or generate a call to a math support library to handle the division. Divide instructions are notoriously slow, so most compilers will try to optimize them out if at all possible (e.g., replace them with shift instructions, or precalculate the result for a division of compile-time constants).
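For instance, a compiler will typically strength-reduce an unsigned division by a power of two into a shift (a sketch; the actual instruction selection is of course up to the compiler):

// For unsigned operands, x / 8 equals x >> 3, and compilers routinely
// emit the shift instead of a slow divide instruction.
unsigned div8(unsigned x) { return x / 8; }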
Hardware divide instructions almost never include conversion between integer and floating point. If you get divide instructions at all (they are sometimes left out, because a divide circuit is large and complicated), they're practically certain to be "divide int by int, produce int" and "divide float by float, produce float". And it'll usually be that both inputs and the output are all the same size, too.
The compiler is responsible for building whatever operation was written in the source code, on top of these primitives. For instance, in C, if you divide a float by an int, the compiler will emit an int-to-float conversion and then a float divide.
(Wacky exceptions do exist. I don't know, but I wouldn't put it past the VAX to have had "divide float by int" type instructions. The Itanium didn't really have a divide instruction, but its "divide helper" was only for floating point; you had to fake integer divide on top of float divide!)
The compiler will decide at compile time what form of division is required based on the types of the variables being used - at the end of the day a DIV (or FDIV) instruction of one form or another will get involved.
Your question doesn't really make sense. The DIV instruction doesn't do anything by itself. No matter how loudly you shout at it, even if you try to bribe it, it doesn't take responsibility for anything.
When you program in a programming language [X], it is the sole responsibility of the [X] compiler to make a program that does what you described in the source code.
If a division is requested, the compiler decides how to make that division happen. That might be by generating the opcode for the DIV instruction, if the CPU you're targeting has one. It might be by precomputing the division at compile time and inserting the result directly into the program (assuming both operands are known at compile time), or it might be done by generating a sequence of instructions which together emulate a division.
But it is always up to the compiler. Your C++ program doesn't have any effect unless it is interpreted according to the C++ standard. If you interpret it as a plain text file, it doesn't do anything. If your compiler interprets it as a Java program, it is going to choke and reject it.
And the DIV instruction doesn't know anything about the C++ standard. A C++ compiler, on the other hand, is written with the sole purpose of understanding the C++ standard, and transforming code according to it.
The compiler is always responsible.
One of the most important rules in the C++ standard is the "as if" rule:
The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine. This International Standard places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.
Which in relation to your question means it doesn't matter what component does the division, as long as it gets done. It may be performed by a DIV machine code, it may be performed by more complicated code if there isn't an appropriate instruction for the processor in question.
It can also:
Replace the operation with a bit-shift operation if appropriate and likely to be faster.
Replace the operation with a literal if it is computable at compile time, or with a simple assignment if, e.g., when processing x / y it can be shown at compile time that y will always be 1 (see the sketch after this list).
Replace the operation with an exception throw if it can be shown at compile time that it will always be an integer division by zero.
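A minimal sketch of the compile-time case (hypothetical values):

// The division never happens at run time: the compiler folds 10 / 2
// down to the literal 5 during compilation.
constexpr int r = 10 / 2;
static_assert(r == 5, "folded at compile time");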
Practically
The C99 standard says: "When integers are divided, the result of the / operator is the algebraic quotient with any fractional part discarded." And it adds in a footnote that "this is often called 'truncation toward zero'."
History
Historically, the language specification is responsible.
Pascal defines its operators so that using / for division always returns a real (even if you use it to divide 2 integers), and if you want to divide integers and get an integer result, you use the div operator instead. (Visual Basic has a similar distinction and uses the \ operator for integer division that returns an integer result.)
In C, it was decided that the same distinction should be made by casting one of the integer operands to a float if you wanted a floating point result. It's become convention to treat integer versus floating point types the way you describe in many C-derived languages. I suspect this convention may have originated in Fortran.
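A sketch of that C/C++ convention in practice:

#include <iostream>

int main() {
    int a = 5, b = 3;
    std::cout << (a / b) << '\n';                      // integer division: 1
    std::cout << (static_cast<double>(a) / b) << '\n'; // FP division: 1.66667
}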