atan2f precision on Xcode - C++

I have this very simple code:
#include <cstdio>
#include <cmath>
int main(int argc, const char * argv[])
{
printf("%2.21f", atan2f(0.f, -1.f));
return 0;
}
It produces the following output on Intel CPUs:
Visual Studio 2010: 3.141592741012573200000
GCC 4.8.1:          3.141592741012573242188
Xcode 5:            3.141592502593994140625
After reading Apple's manual page for atan2f, I expected the printed value to be near 3.14159265359, since the documentation says it returns +pi for special inputs like the one I'm using. As you can see, the difference between the value returned on Xcode and the expected value is quite big.
Is this a known issue? If so, is there any workaround?

A single-precision floating point number has only about 7 digits of decimal precision. Your test value of 3.14159265359 has 12. If you want better precision, use double or long double and atan2 or atan2l to match.
Likely the reason you're getting "better" results from VS and GCC is that the compiler is noticing your function has constant arguments and is precalculating the result at higher-than-single precision. Check the generated code for proof.
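As a minimal sketch of that advice, compare the single- and double-precision results side by side (digits printed beyond each type's precision are meaningless):

#include <cstdio>
#include <cmath>

int main()
{
    printf("atan2f: %.21f\n", atan2f(0.f, -1.f)); // float: ~7 significant decimal digits
    printf("atan2:  %.21f\n", atan2(0.0, -1.0));  // double: ~16 significant decimal digits
    return 0;
}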

The knee-jerk workaround is to use atan2. Casting that down to float gave me 3.141592741012573242188 just like your GCC 4.8.1 test.
I would assume atan2f gives an answer not quite as precise as a float could hold because of the way it arrives at its result; estimating the achievable output precision, rather than expecting full float accuracy, is the smarter way to go.

Related

Result is one off in VS Studio Code but not on an online IDE [duplicate]

So I was in a computing contest and I noticed a weird bug: pow(26,2) would always return 675, and sometimes 674, even though the correct answer is 676. These sorts of errors also occur with pow(26,3), pow(26,4), etc.
After some debugging after the contest, I believe the answer has to do with the fact that conversion to int rounds down. Interestingly, this kind of error has never occurred to me before. The computer I had was running MinGW on Windows 8. The GCC version was fairly new, 2-3 months old I believe. But what I found was that if I turned the -O1/-O2/-O3 optimization flags on, these errors would miraculously disappear and pow(26,2) would always give 676, the correct answer. Can anyone explain why?
#include <cmath>
#include <iostream>
using namespace std;
int main() {
cout<<pow(26,2)<<endl;
cout<<int(pow(26,2))<<endl;
}
Results with doubles are weird.
double a = 26;
double b = 2;
cout << int(pow(a, b)) << endl;            // outputs 675
cout << int(pow(26.0, 2.0)) << endl;       // outputs 676
cout << int(pow(26*1.00, 2*1.00)) << endl; // outputs 676
The function pow operates on two floating-point values and can raise one to the power of the other. It does this through an approximating algorithm, since it is required to handle values from the smallest to the largest.
As it is an approximating algorithm, it sometimes gets the value a little bit wrong. In most cases this is OK. However, if you are interested in an exact result, don't use it.
I would strongly advise against using it for integers. And if the second operand is known (2, in this case), it is trivial to replace the call with code that is much faster and returns the correct value. For example:
template<typename T>
T square(T x)
{
    return x * x;
}
To answer the actual question: some compilers can replace calls to pow with other code, or eliminate the call altogether, when one or both arguments are known constants. This explains why you get different results.
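If the exponent is not a compile-time constant, a hedged alternative sketch (the helper name ipow is illustrative, not a standard function) is exact exponentiation by squaring, which avoids pow() and its rounding entirely:

long long ipow(long long base, unsigned exp)
{
    long long result = 1;
    while (exp > 0) {
        if (exp & 1)
            result *= base; // multiply in the current binary digit of exp
        base *= base;       // square the base for the next digit
        exp >>= 1;
    }
    return result;
}

Rounding instead of truncating, e.g. int(pow(26, 2) + 0.5) or std::lround, also masks the one-ULP error, but the integer version is both exact and faster.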

C++ exp function different results under x64 on i7-3770 and i7-4790

When I execute a simple x64 application with the following code, I get different results on Windows PCs with an i7-3770 and an i7-4790 CPU.
#include <cmath>
#include <iostream>
#include <limits>
int main()
{
    double val = exp(-10.240990982718174);
    std::cout.precision(std::numeric_limits<double>::max_digits10);
    std::cout << val;
}
Result on i7-3770:
3.5677476354876406e-05
Result on i7-4790:
3.5677476354876413e-05
When I modify the code to call
unsigned int control_word;
_controlfp_s(&control_word, _RC_UP, MCW_RC);
before the exp function call, both CPUs deliver the same results.
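For reference, the modified program would look roughly like this (a sketch assembled from the snippets above; _controlfp_s comes from <float.h> in the MSVC CRT):

#include <cmath>
#include <float.h> // _controlfp_s (MSVC CRT)
#include <iostream>
#include <limits>

int main()
{
    unsigned int control_word;
    _controlfp_s(&control_word, _RC_UP, MCW_RC); // set rounding control before calling exp
    double val = exp(-10.240990982718174);
    std::cout.precision(std::numeric_limits<double>::max_digits10);
    std::cout << val;
}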
My questions:
Does anyone have an idea what causes the differences between the i7-3770 and the i7-4790?
Is there a way to set the floating point precision or consistency in a Visual Studio 2015/2017 C++ project for the whole project and not only for the following function call? The "Floating Point Model" setting (/fp) does not have any influence on the results here.
Assuming that double is encoded using IEEE-754, and using this decimal-to-binary converter, you can see that:
3.5677476354876406e-05 is represented in hex as 0x3F02B48CC0D0ABA8
3.5677476354876413e-05 is represented in hex as 0x3F02B48CC0D0ABA9
which differ only in the last bit, probably due to rounding error.
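As a minimal sketch (again assuming IEEE-754 doubles), the bit pattern can be checked directly without an online converter:

#include <cstdio>
#include <cstring>
#include <cstdint>

int main()
{
    double val = 3.5677476354876406e-05;
    std::uint64_t bits;
    std::memcpy(&bits, &val, sizeof bits); // well-defined way to inspect the representation
    std::printf("0x%016llX\n", (unsigned long long)bits); // prints 0x3F02B48CC0D0ABA8 here
    return 0;
}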
I did some further investigation and found the following:
the problem also occurs on Windows with a different compiler (Intel)
on a Linux system both values are equal
I also posted this question to the Visual Studio Community. There I was told that Haswell and newer CPUs use FMA3. You can disable this feature with _set_FMA3_enable(0) at the beginning of the program. When I do this, the results are the same.
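A sketch of that workaround (assuming the MSVC x64 CRT, which is where _set_FMA3_enable is provided):

#include <cmath>
#include <iostream>
#include <limits>

int main()
{
    _set_FMA3_enable(0); // fall back to non-FMA3 implementations of exp, log, etc.
    std::cout.precision(std::numeric_limits<double>::max_digits10);
    std::cout << exp(-10.240990982718174);
}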

C++11 round off error using pow() and std::complex

Running the following
#include <iostream>
#include <complex>
int main()
{
std::complex<double> i (0,1);
std::complex<double> comp =pow(i, 2 );
std::cout<<comp<<std::endl;
return 0;
}
gives me the expected result (-1,0) without C++11. However, compiling with C++11 gives the highly annoying (-1,1.22461e-016).
What to do, and what is best practice?
Of course this can be fixed manually by flooring etc., but I would appreciate knowing the proper way to address the problem.
SYSTEM: Win8.1, using Desktop Qt 5.1.1 (Qt Creator) with MinGW 4.8 32 bit. Using C++11 by adding the flag QMAKE_CXXFLAGS += -std=c++11 in the Qt Creator .pro file.
In C++11 we have a few new overloads of pow(std::complex). GCC has two nonstandard overloads on top of that, one for raising to an int and one for raising to an unsigned int.
One of the new standard overloads (namely std::complex</*Promoted*/> pow(const std::complex<T> &, const U &)) causes an ambiguity with the non-standard ones when calling pow(i, 2). GCC's solution is to #ifdef the non-standard overloads out in the presence of C++11, so you go from calling the specialized function (which uses successive squaring) to the generic method (which uses pow(double,double) and std::polar).
You need to get into a different mindset when you are using floating-point numbers: they are APPROXIMATIONS of real numbers.
1.22461e-016 is
0.0000000000000000122461
An engineer would say that IS zero. You will always get such variations (unless you stick to operations on sums of powers of 2 within the same general range).
A value as simple as 0.1 cannot be represented exactly with floating-point numbers.
The general problem you present has two parts:
1. Dealing with floating-point numbers in processing.
2. Displaying floating-point numbers.
For the processing, I would wager that doing:
comp = i * i;
would give you what you want. pow(x, y) is generally computed as
exp(log(x) * y)
which is where the tiny error creeps in. For output, switch to using an F format.
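A minimal sketch contrasting the two approaches:

#include <iostream>
#include <complex>

int main()
{
    std::complex<double> i(0, 1);
    std::cout << i * i << std::endl;          // exact: (-1,0)
    std::cout << std::pow(i, 2) << std::endl; // may show a ~1e-16 imaginary residue under C++11
    return 0;
}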

How to trace a NaN in C++

I am going to do some math calculations using C++. The input floating-point number is valid, but after the calculations the resulting value is NaN. I would like to trace the point where the NaN first appears (possibly using GDB), instead of inserting a lot of isnan() checks into the code. But I found that even code like this will not trigger an exception when a NaN value appears:
double dirty = 0.0;
double nanvalue = 0.0/dirty;
Could anyone suggest a method for tracing the NaN or turning a NaN into an exception?
Since you mention using gdb, here's a solution that works with gcc -- you want the functions defined in fenv.h:
#define _GNU_SOURCE
#include <fenv.h>
#include <stdio.h>
int main(int argc, char **argv)
{
    double dirty = 0.0;
    feenableexcept(FE_ALL_EXCEPT & ~FE_INEXACT); // enable all floating-point exceptions except FE_INEXACT
    double nanval = 0.0/dirty;
    printf("Succeeded! dirty=%lf, nanval=%lf\n", dirty, nanval);
}
Running the above program produces the output "Floating point exception". Without the call to feenableexcept, the "Succeeded!" message is printed.
If you were to write a signal handler for SIGFPE, that might be a good place to set a breakpoint and get the traceback you want. (Disclaimer: haven't tried it!)
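A minimal sketch of that idea (equally untried against the original problem; feenableexcept is glibc-specific):

#define _GNU_SOURCE
#include <fenv.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

void fpe_handler(int sig)
{
    // Set a gdb breakpoint here ("break fpe_handler") to get a backtrace
    // at the exact operation that raised the exception.
    abort(); // terminate with a core dump instead of continuing
}

int main(void)
{
    signal(SIGFPE, fpe_handler);
    feenableexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW);
    double dirty = 0.0;
    double nanval = 0.0 / dirty; // 0.0/0.0 raises FE_INVALID -> SIGFPE
    printf("nanval=%lf\n", nanval);
    return 0;
}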
In Visual Studio you can use the _controlfp function to set the behavior of floating-point calculations (see http://msdn.microsoft.com/en-us/library/e9b52ceh(VS.80).aspx). Maybe there is a similar variant for your platform.
Some notes on floating point programming can be found on http://ds9a.nl/fp/ including the difference between 1/0 and 1.0/0 etc, and what a NaN is and how it acts.
One can enable so-called "signaling NaNs". That should make it easy for the debugger to find the position where the NaN first appears.
Via Google, I found this for creating one in C++, though I have no idea if it works:
std::numeric_limits<double>::signaling_NaN();
Usefulness of signaling NaN?
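For completeness, a hedged sketch of the signaling-NaN idea (behavior is implementation-dependent; many platforms quiet the sNaN on the first load, so it may not trap at all):

#include <limits>
#include <iostream>

int main()
{
    double snan = std::numeric_limits<double>::signaling_NaN();
    double result = snan + 1.0; // may raise FE_INVALID, which feenableexcept can turn into SIGFPE
    std::cout << result << std::endl;
    return 0;
}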