FFmpeg compilation warnings ATOMIC_VAR_INIT

FFmpeg compilation warnings ATOMIC_VAR_INIT - c++

While compiling ffmpeg (commit 1368b5a) with emscripten 3.1.17, there are two warnings, I would like the internet to help me better understand (I am not someone with deep c++ experience):
fftools/ffmpeg.c:339:41: warning: macro 'ATOMIC_VAR_INIT' has been marked as deprecated [-Wdeprecated-pragma]
static atomic_int transcode_init_done = ATOMIC_VAR_INIT(0);
^
/home/XYZ/ffmpeg-wasm/modules/emsdk/upstream/lib/clang/15.0.0/include/stdatomic.h:50:41: note: macro marked 'deprecated' here
#pragma clang deprecated(ATOMIC_VAR_INIT)
^
I understand ATOMIC_VAR_INIT in this place is deprecated, but by which tool (emscripten, clang)? And which party to be involved in the fix.
The other one is also interesting:
fftools/ffmpeg_filter.c:898:35: warning: floating-point comparison is always true; constant cannot be represented exactly in type 'float' [-Wliteral-range]
if (audio_drift_threshold != 0.1)
~~~~~~~~~~~~~~~~~~~~~ ^ ~~~
The message and consequences are very clear. Whats not clear to me, is, if this is particular compiler issue, or code issue and which party to take care?

For ATOMIC_VAR_INIT I found this note:
This macro was a part of early draft design for C11 atomic types. It is not needed in C11, and is deprecated in C17 and removed in C23.
So, the deprecation was by the C standard (and the C++ standard, though there it was deprecated in C++20)
As for the floating point issue, the compiler states outright that:
constant cannot be represented exactly in type 'float'
Which is entirely true. In binary every bit of a number represents a power of two. In integers it is fairly easy, the bits represent 1,2,4,8,16,32,64,... matching 2 to the power of 0,1,2,3,4,5,6,...
For example the integer: 0b100101 is (low bit it to the right): 1+4+32=37.
Floating points are roughly the same, but with a few more complications (which I will skip here). Basically the bits now represent negative powers of two: 0.5, 0.25, 0.125, 0.0625, ... But unfortunately this does not allow us to represent all possible decimal numbers (this is an impossible task, there are infinitely many decimal numbers between 0 and 1, and we only have a finite number of combinations with a finite number of bytes).
It turns out that 0.1 is one of the numbers that it is impossible to encode in binary floating point. Which makes sense as 0.1 is 1/2 * 1/5, and 1/5 cannot be represented as a power of 2.
If you do the math, you find that:
2^-4 + 2^-5 + 2^-8 + 2^-9 + 2^-12 + 2^-13 = 0.0999755859375
If you continue this sequence forever (skip two powers, add two powers, repeat), then you will get infinitely close to 0.1. But that is the best we can do.
If you want to do some more reading, then here is some links:
What Every Programmer Should Know About Floating-Point Arithmetic
What Every Computer Scientist Should Know About Floating-Point Arithmetic
IEEE-754 Floating Point Converter
Now, the part I am a bit unsure of is:
warning: floating-point comparison is always true
Technically 0.1 should translate to some bit-pattern, and if you then throw a variable with that exact same bit pattern at the != check, then the result should be false. But perhaps the compiler made an executive decision and just said: "Always true", because getting the exact bit pattern is super unlikely and fragile.
But whatever the case, and more importantly: float != 0.1 is a really bad test. The links I put above elaborate on it much better than I can, but basically you don't ever want to test a floating point for strict equality - you always want to add an epsilon of some value (it depends on the numbers you compare).
For example: abs(float - 0.1) < epsilon.

Related

Rounding error using the floor function in C++

I was asked what will be the output of the following code:
floor((0.7+0.6)*10);
It returns 12.
I know that the floating point representation does not allow to represent all numbers with infinite precision and that I should expect some discrepancies.
My questions are:
How should I know that this piece of code returns 12, not 13? Why is (0.7+0.6)*10 a bit less than 13, not a bit more?
When can I expect the floor function to work incorrectly and when it works correctly for sure?
Note: I'm not asking how floating representation looks like or why the output isn't exactly 13. I'd like to know how should I infer that (0.7+0.6)*10 is a bit less than 13.

How should I know that this piece of code returns 12, not 13? Why is (0.7+0.6)*10 a bit less than 13, not a bit more?
Assume that your compilation platform uses strictly the IEEE 754 standard formats and operations. Then, convert all the constants involved to binary, keeping 53 significant digits, and apply the basic operations, as defined in IEEE 754, by computing the mathematical result and rounding to 53 significant binary digits at each step. A computer does not need to be involved at any stage, but you can make your life easier by using C99's hexadecimal floating-point format for input and output.
When can I expect the floor function to work incorrectly and when it works correctly for sure?
floor() is exact for all positive arguments. It is working correctly in your example. The behavior that surprises you does not originate with floor and has nothing to do with floor. The surprising behavior starts with the fact that 6/10 and 7/10 are not representable exactly as binary floating-point values, and continues with the fact that since these values have long expansions, floating-point operations + and * can produce a slightly rounded result wrt the mathematical result you could expect from the arguments they are actually applied to. floor() is the only place in your code that does not involve approximation.
Example program to see what is happening:
#include <stdio.h>
#include <math.h>
int main(void) {
printf("%a\n%a\n%a\n%a\n%a\n",
0.7,
0.6,
0.7 + 0.6,
(0.7+0.6)*10,
floor((0.7+0.6)*10));
}
Result:
0x1.6666666666666p-1
0x1.3333333333333p-1
0x1.4ccccccccccccp+0
0x1.9ffffffffffffp+3
0x1.8p+3
IEEE 754 double-precision is really defined with respect to binary, but for conciseness the significand is written in hexadecimal. The exponent after p represents a power of two. For instance the last two results are both of the form <number roughly halfway between 1 and 2>*23.
0x1.8p+3 is 12. The next integer, 13, is 0x1.ap+3, but the computation does not quite reach that value, and so the behavior of floor() is to round down to 12.

How should I know that this piece of code returns 12, not 13?
You should know that it can and may be either 12 or 13. You can verify by testing on a given cpu.
You can not know what the value will be, in general, because the C++ standard does not specify the representation of floating point numbers. If you know the format on given architecture (let's say IEEE 754), then you can perform the calculation by hand, but that result would only apply to that particular representation.
Why is (0.7+0.6)*10 a bit less than 13, not a bit more?
It's an implementation detail and not useful knowledge to the programmer. All you need to know that it may be either. Relying on the knowledge that it's one or the other, would make you depend on the implementation detail.
When can I expect the floor function to work incorrectly and when it works correctly for sure?
It always works correctly, that is accroding to how it's specified to work.
Now, speaking of the value that you are expecting to see. If you know that your number is very close to an integer, but might be off a little bit due to representation error, you can add 0.5 before flooring.
double calculated_integer = (0.7+0.6)*10;
floor(calculated_integer + 0.5);
That way, you will always get the expected value, unless the error exceeds 0.5, which would be quite a big error.
If you don't know that the result should be an integer, then you simply have to accept the fact that floor and ceil operations increase the maximum error of your calculation to 1.0.

There are standard like the IEEE floating point standard which try to make floating point calculations at least a little bit predictive
by defining rules how operations like additions and rounding should be implemented.
To know the result, you need to compute the expression
according to the standard rules. Then you can be sure, that
it gives the same result on every machine, that implements the standard.

How should I know that this piece of code returns 12, not 13?
Since that depends on the numbers involved, by trying.
Why is (0.7+0.6)*10 a bit less than 13, not a bit more?
Well, because that's the result of the calculation.
When can I expect the floor function to work incorrectly and when it works correctly for sure?
Correctly for sure: on multiples of powers of two only, iff your floating point number is represented in binary.
To really take all the confusion out of this:
You cannot know the result without calculating it; it depends on both the machine/algorithmics involved and the numbers.

Very short answer: You can not. It depends on the platform and the float iso that is used on this platform.

In general, you can't. The fundamental problem is that the conversion from text representation to floating-point value is often not implemented as accurately as it could be. That's in part momentum, and in part because getting the floating-point value that's closest to the value expressed in text can be expensive, in some cases requiring large integer calculations. So conversions are often off by a few ULPs (i.e., low-end bits) from the ideal value, in ways that you can't predict a priori. So the question of what that code will produce is unanswerable. The question of what it should produce may be a bit more tractable, but it's still an exercise in time-wasting.

Is hardcode float precise if it can be represented by binary format in IEEE 754?

for example, 0 , 0.5, 0.15625 , 1 , 2 , 3... are values converted from IEEE 754. Are their hardcode version precise?
for example:
is
float a=0;
if(a==0){
return true;
}
always return true? other example:
float a=0.5;
float b=0.25;
float c=0.125;
is a * b always equal to 0.125 and a * b==c always true? And one more example:
int a=123;
float b=0.5;
is a * b always be 61.5? or in general, is integer multiply by IEEE 754 binary float precise?
Or a more general question: if the value is hardcode and both the value and result can be represented by binary format in IEEE 754 (e.g.:0.5 - 0.125), is the value precise?

There is no inherent fuzzyness in floating-point numbers. It's just that some, but not all, real numbers can't be exactly represented.
Compare with a fixed-width decimal representation, let's say with three digits. The integer 1 can be represented, using 1.00, and 1/10 can be represented, using 0.10, but 1/3 can only be approximated, using 0.33.
If we instead use binary digits, the integer 1 would be represented as 1.00 (binary digits), 1/2 as 0.10, 1/4 as 0.01, but 1/3 can (again) only be approximated.
There are some things to remember, though:
It's not the same numbers as with decimal digits. 1/10 can be
written exactly as 0.1 using decimal digits, but not using binary
digits, no matter how many you use (short of infinity).
In practice, it is difficult to keep track of which numbers can be
represented and which can't. 0.5 can, but 0.4 can't. So when you need
exact numbers, such as (often) when working with money, you shouldn't
use floating-point numbers.
According to some sources, some processors do strange things
internally when performing floating-point calculations on numbers
that can't be exactly represented, causing results to vary in a way
that is, in practice, unpredictable.
(My opinion is that it's actually a reasonable first approximation to say that yes, floating-point numbers are inherently fuzzy, so unless you are sure your particular application can handle that, stay away from them.)
For more details than you probably need or want, read the famous What Every Computer Scientist Should Know About Floating-Point Arithmetic. Also, this somewhat more accessible website: The Floating-Point Guide.

No, but as Thomas Padron-McCarthy says, some numbers can be exactly represented using binary but not all of them can.
This is the way I explain it to non-developers who I work with (like Mahmut Ali I also work on an very old financial package): Imagine having a very large cake that is cut into 256 slices. Now you can give 1 person the whole cake, 2 people half of the slices but soon as you decide to split it between 3 you can't - it's either 85 or 86 - you can't split the cake any further. The same is with floating point. You can only get exact numbers on some representations - some numbers can only be closely approximated.

C++ does not require binary floating point representation. Built-in integers are required to have a binary representation, commonly two's complement, but one's complement and sign and magnitude are also supported. But floating point can be e.g. decimal.
This leaves open the question of whether C++ floating point can have a radix that does not have 2 as a prime factor, like 2 and 10. Are other radixes permitted? I don't know, and last time I tried to check that, I failed.
However, assuming that the radix must be 2 or 10, then all your examples involve values that are powers of 2 and therefore can be exactly represented.
This means that the single answer to most of your questions is “yes”. The exception is the question “is integer multiply by IEEE 754 binary float [exact]”. If the result exceeds the precision available, then it can't be exact, but otherwise it is.
See the classic “What Every Computer Scientist Should Know About Floating-Point Arithmetic” for background info about floating point representation & properties in general.
If a value can be exactly represented in 32-bit or 64-bit IEEE 754, then that doesn't mean that it can be exactly represented with some other floating point representation. That's because different 32-bit representations and different 64-bit representations use different number of bits to hold the mantissa and have different exponent ranges. So a number that can be exactly represented in one way, can be beyond the precision or range of some other representation.
You can use std::numeric_limits<T>::is_iec559 (where e.g. T is double) to check whether your implementation claims to be IEEE 754 compatible. However, when floating point optimizations are turned on, at least the g++ compiler (1)erroneously claims to be IEEE 754, while not treating e.g. NaN values correctly according to that standard. In effect, the is_iec559 only tells you whether the number representation is IEEE 754, and not whether the semantics conform.
(1) Essentially, instead of providing different types for different semantics, gcc and g++ try to accommodate different semantics via compiler options. And with separate compilation of parts of a program, that can't conform to the C++ standard.

In principle, this should be possible. If you restrict yourself to exactly this class of numbers with a finite 2-power representation.
But it is dangerous: what if someone takes your code and changes your 0.5 to 0.4 or your .0625 to .065 due to whatever reasons? Then your code is broken. And no, even excessive comments won't help about that - someone will always ignore them.

Should I worry about precision when I use C++ mathematical functions with integers?

For example, The code below will give undesirable result due to precision of floating point numbers.
double a = 1 / 3.0;
int b = a * 3; // b will be 0 here
I wonder whether similar problems will show up if I use mathematical functions. For example
int a = sqrt(4); // Do I have guarantee that I will always get 2 here?
int b = log2(8); // Do I have guarantee that I will always get 3 here?
If not, how to solve this problem?
Edit:
Actually, I came across this problem when I was programming for an algorithm task. There I want to get
the largest integer which is power of 2 and is less than or equal to integer N
So round function can not solve my problem. I know I can solve this problem through a loop, but it seems not very elegant.
I want to know if
int a = pow(2, static_cast<int>(log2(N)));
can always give correct result. For example if N==8, is it possible that log2(N) gives me something like 2.9999999999999 and the final result become 4 instead of 8?

Inaccurate operands vs inaccurate results
I wonder whether similar problems will show up if I use mathematical functions.
Actually, the problem that could prevent log2(8) to be 3 does not exist for basic operations (including *). But it exists for the log2 function.
You are confusing two different issues:
double a = 1 / 3.0;
int b = a * 3; // b will be 0 here
In the example above, a is not exactly 1/3, so it is possible that a*3 does not produce 1.0. The product could have happened to round to 1.0, it just doesn't. However, if a somehow had been exactly 1/3, the product of a by 3 would have been exactly 1.0, because this is how IEEE 754 floating-point works: the result of basic operations is the nearest representable value to the mathematical result of the same operation on the same operands. When the exact result is representable as a floating-point number, then that representation is what you get.
Accuracy of sqrt and log2
sqrt is part of the “basic operations”, so sqrt(4) is guaranteed always, with no exception, in an IEEE 754 system, to be 2.0.
log2 is not part of the basic operations. The result of an implementation of this function is not guaranteed by the IEEE 754 standard to be the closest to the mathematical result. It can be another representable number further away. So without more hypotheses on the log2 function that you use, it is impossible to tell what log2(8.0) can be.
However, most implementations of reasonable quality for elementary functions such as log2 guarantee that the result of the implementation is within 1 ULP of the mathematical result. When the mathematical result is not representable, this means either the representable value above or the one below (but not necessarily the closest one of the two). When the mathematical result is exactly representable (such as 3.0), then this representation is still the only one guaranteed to be returned.
So about log2(8), the answer is “if you have a reasonable quality implementation of log2, you can expect the result to be 3.0`”.
Unfortunately, not every implementation of every elementary function is a quality implementation. See this blog post, caused by a widely used implementation of pow being inaccurate by more than 1 ULP when computing pow(10.0, 2.0), and thus returning 99.0 instead of 100.0.
Rounding to the nearest integer
Next, in each case, you assign the floating-point to an int with an implicit conversion. This conversion is defined in the C++ standard as truncating the floating-point values (that is, rounding towards zero). If you expect the result of the floating-point computation to be an integer, you can round the floating-point value to the nearest integer before assigning it. It will help obtain the desired answer in all cases where the error does not accumulate to a value larger than 1/2:
int b = std::nearbyint(log2(8.0));
To conclude with a straightforward answer to the question the the title: yes, you should worry about accuracy when using floating-point functions for the purpose of producing an integral end-result. These functions do not come even with the guarantees that basic operations come with.

Unfortunately the default conversion from a floating point number to integer in C++ is really crazy as it works by dropping the decimal part.
This is bad for two reasons:
a floating point number really really close to a positive integer, but below it will be converted to the previous integer instead (e.g. 3-1×10-10 = 2.9999999999 will be converted to 2)
a floating point number really really close to a negative integer, but above it will be converted to the next integer instead (e.g. -3+1×10-10 = -2.9999999999 will be converted to -2)
The combination of (1) and (2) means also that using int(x + 0.5) will not work reasonably as it will round negative numbers up.
There is a reasonable round function, but unfortunately returns another floating point number, thus you need to write int(round(x)).
When working with C99 or C++11 you can use lround(x).
Note that the only numbers that can be represented correctly in floating point are quotients where the denominator is an integral power of 2.
For example 1/65536 = 0.0000152587890625 can be represented correctly, but even just 0.1 is impossible to represent correctly and thus any computation involving that quantity will be approximated.
Of course when using 0.1 approximations can cancel out leaving a correct result occasionally, but even just adding ten times 0.1 will not give 1.0 as result when doing the computation using IEEE754 double-precision floating point numbers.
Even worse the compilers are allowed to use higher precision for intermediate results. This means that adding 10 times 0.1 may give back 1 when converted to an integer if the compiler decides to use higher accuracy and round to closest double at the end.
This is "worse" because despite being the precision higher the results are compiler and compiler options dependent, making reasoning about the computations harder and making the exact result non portable among different systems (even if they use the same precision and format).
Most compilers have special options to avoid this specific problem.

Warning for inexact floating-point constants

Questions like "Why isn't 0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1 = 0.8?" got me thinking that...
... It would probably be nice to have the compiler warn about the floating-point constants that it rounds to the nearest representable in the binary floating-point type (e.g. 0.1 and 0.8 are rounded in radix-2 floating-point, otherwise they'd need an infinite amount of space to store the infinite number of digits).
I've looked up gcc warnings and so far found none for this purpose (-Wall, -Wextra, -Wfloat-equal, -Wconversion, -Wcoercion (unsupported or C only?), -Wtraditional (C only) don't appear to be doing what I want).
I haven't found such a warning in Microsoft Visual C++ compiler either.
Am I missing a hidden or rarely-used option?
Is there any compiler at all that has this kind of warning?
EDIT: This warning could be useful for educational purposes and serve as a reminder to those new to floating-point.

There is no technical reason the compiler could not issue such warnings. However, they would be useful only for students (who ought to be taught how floating-point arithmetic works before they start doing any serious work with it) and people who do very fine work with floating-point. Unfortunately, most floating-point work is rough; people throw numbers at the computer without much regard for how the computer works, and they accept whatever results they get.
The warning would have to be off by default to support the bulk of existing floating-point code. Were it available, I would turn it on for my code in the Mac OS X math library. Certainly there are points in the library where we depend on every bit of the floating-point value, such as places where we use extended-precision arithmetic, and values are represented across more than one floating-point object (e.g., we would have one object with the high bits of 1/π, another object with 1/π minus the first object, and a third object with 1/π minus the first two objects, giving us about 150 bits of 1/π). Some such values are represented in hexadecimal floating-point in the source text, to avoid any issues with compiler conversion of decimal numerals, and we could readily convert any remaining numerals to avoid the new compiler warning.
However, I doubt we could convince the compiler developers that enough people would use this warning or that it would catch enough bugs to make it worth their time. Consider the case of libm. Suppose we generally wrote exact numerals for all constants but, on one occasion, wrote some other numeral. Would this warning catch a bug? Well, what bug is there? Most likely, the numeral is converted to exactly the value we wanted anyway. When writing code with this warning turned on, we are likely thinking about how the floating-point calculations will be performed, and the value we have written is one that is suitable for our purpose. E.g., it may be a coefficient of some minimax polynomial we calculated, and the coefficient is as good as it is going to get, whether represented approximately in decimal or converted to some exactly-representable hexadecimal floating-point numeral.
So, this warning will rarely catch bugs. Perhaps it would catch an occasion where we mistyped a numeral, accidentally inserting an extra digit into a hexadecimal floating-point numeral, causing it to extend beyond the representable significand. But that is rare. In most cases, the numerals we use are either simple and short or are copied and pasted from software that has calculated them. On some occasions, we will hand-type special values, such as 0x1.fffffffffffffp0. A warning when an extra “f” slips into that numeral might catch a bug during compilation, but that error would almost certainly be caught quickly in testing, since it drastically alters the special value.
So, such a compiler warning has little utility: Very few people will use it, and it will catch very few bugs for the people who do use it.

The warning is in the source: when you write float, double, or long double including any of their respective literals. Obviously, some literals are exact but even this doesn't help much: the sum of two exact values may inexact, e.g., if the have rather different scales. Having the compiler warn about inexact floating point constants would generate a false sense of security. Also, what are you meant to do about rounded constants? Writing the exact closest value explicitly would be error prone and obfuscate the intent. Writing them differently, e.g., writing 1.0 / 10.0 instead of 0.1 also obfuscates the intent and could yield different values.

There will be no such compiler switch and the reason is obvious.
We are writing down the binary components in decimal:
First fractional bit is 0.5
Second fractional bit is 0.25
Third fractional bit is 0.125
....
Do you see it ? Due to the odd endings with the number 5 every bit needs
another decimal to represent it exactly. One bit needs one decimal, two bits
needs two decimals and so on.
So for fractional floating points it would mean that for most decimal numbers
you need 24(!) decimal digits for single precision floats and
53(!!) decimal digits for double precision.
Worse, the exact digits carry no extra information, they are pure artifacts
caused by the base change.
Noone is going to write down 3.141592653589793115997963468544185161590576171875
for pi to avoid a compiler warning.

I don't see how a compiler would know or that the compiler can warn you about something like that. It is only a coincidence that a number can be exactly represented by something that is inherently inaccurate.

Floating-point comparison of constant assignment

When comparing doubles for equality, we need to give a tolerance level, because floating-point computation might introduce errors. For example:
double x;
double y;
x = f();
y = g();
if (fabs(x-y)<epsilon) {
// they are equal!
} else {
// they are not!
}
However, if I simply assign a constant value, without any computation, do I still need to check the epsilon?
double x = 1;
double y = 1;
if (x==y) {
// they are equal!
} else {
// no they are not!
}
Is == comparison good enough? Or I need to do fabs(x-y)<epsilon again? Is it possible to introduce error in assigning? Am I too paranoid?
How about casting (double x = static_cast<double>(100))? Is that gonna introduce floating-point error as well?
I am using C++ on Linux, but if it differs by language, I would like to understand that as well.

Actually, it depends on the value and the implementation. The C++ standard (draft n3126) has this to say in 2.14.4 Floating literals:
If the scaled value is in the range of representable values for its type, the result is the scaled value if representable, else the larger or smaller representable value nearest the scaled value, chosen in an implementation-defined manner.
In other words, if the value is exactly representable (and 1 is, in IEEE754, as is 100 in your static cast), you get the value. Otherwise (such as with 0.1) you get an implementation-defined close match (a). Now I'd be very worried about an implementation that chose a different close match based on the same input token but it is possible.
(a) Actually, that paragraph can be read in two ways, either the implementation is free to choose either the closest higher or closest lower value regardless of which is actually the closest, or it must choose the closest to the desired value.
If the latter, it doesn't change this answer however since all you have to do is hardcode a floating point value exactly at the midpoint of two representable types and the implementation is once again free to choose either.
For example, it might alternate between the next higher and next lower for the same reason banker's rounding is applied - to reduce the cumulative errors.

No if you assign literals they should be the same :)
Also if you start with the same value and do the same operations, they should be the same.
Floating point values are non-exact, but the operations should produce consistent results :)

Both cases are ultimately subject to implementation defined representations.
Storage of floating point values and their representations take on may forms - load by address or constant? optimized out by fast math? what is the register width? is it stored in an SSE register? Many variations exist.
If you need precise behavior and portability, do not rely on this implementation defined behavior.

IEEE-754, which is a standard common implementations of floating point numbers abide to, requires floating-point operations to produce a result that is the nearest representable value to an infinitely-precise result. Thus the only imprecision that you will face is rounding after each operation you perform, as well as propagation of rounding errors from the operations performed earlier in the chain. Floats are not per se inexact. And by the way, epsilon can and should be computed, you can consult any numerics book on that.
Floating point numbers can represent integers precisely up to the length of their mantissa. So for example if you cast from an int to a double, it will always be exact, but for casting into into a float, it will no longer be exact for very large integers.
There is one major example of extensive usage of floating point numbers as a substitute for integers, it's the LUA scripting language, which has no integer built-in type, and floating-point numbers are used extensively for logic and flow control etc. The performance and storage penalty from using floating-point numbers turns out to be smaller than the penalty of resolving multiple types at run time and makes the implementation lighter. LUA has been extensively used not only on PC, but also on game consoles.
Now, many compilers have an optional switch that disables IEEE-754 compatibility. Then compromises are made. Denormalized numbers (very very small numbers where the exponent has reached smallest possible value) are often treated as zero, and approximations in implementation of power, logarithm, sqrt, and 1/(x^2) can be made, but addition/subtraction, comparison and multiplication should retain their properties for numbers which can be exactly represented.

The easy answer: For constants == is ok.
There are two exceptions which you should be aware of:
First exception:
0.0 == -0.0
There is a negative zero which compares equal for the IEEE 754 standard. This means
1/INFINITY == 1/-INFINITY which breaks f(x) == f(y) => x == y
Second exception:
NaN != NaN
This is a special caveat of NotaNumber which allows to find out if a number is a NaN
on systems which do not have a test function available (Yes, that happens).

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js