C++11 round off error using pow() and std::complex - c++

Running the following
#include <iostream>
#include <complex>
int main()
{
std::complex<double> i (0,1);
std::complex<double> comp =pow(i, 2 );
std::cout<<comp<<std::endl;
return 0;
}
gives me the expected result (-1,0) without c++11. However, compiling with c++11 gives the highly annoying (-1,1.22461e-016).
What to do, and what is best practice?
Of course this can be fixed manually by flooring etc., but I would appreciate to know the proper way of addressing the problem.
SYSTEM: Win8.1, using Desktop Qt 5.1.1 (Qt Creator) with MinGW 4.8 32 bit. Using c++11 by adding the flag QMAKE_CXXFLAGS += -std=c++11 in the Qt Creator .pro file.

In C++11 we have a few new overloads of pow(std::complex). GCC has two nonstandard overloads on top of that, one for raising to an int and one for raising to an unsigned int.
One of the new standard overloads (namely std::complex</*Promoted*/> pow(const std::complex<T> &, const U &)) is ambiguous with those non-standard ones when you call pow(i, 2). GCC's solution is to #ifdef the non-standard overloads out in C++11 mode, so you go from calling the specialized function (which uses successive squaring) to the generic method (which uses pow(double,double) and std::polar).
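If you want the old exact behaviour back, one option is to do the integer power by repeated multiplication yourself. The sketch below mirrors the successive-squaring approach the removed overload used; the helper name ipow is mine, not part of any library.
#include <complex>

// Raise a complex number to a non-negative integer power by successive squaring,
// avoiding the generic exp(log(x) * y) path and its rounding error.
template <typename T>
std::complex<T> ipow(std::complex<T> base, unsigned exp)
{
    std::complex<T> result(1, 0);
    while (exp) {
        if (exp & 1) result *= base; // multiply in the factor for the current bit
        base *= base;                // square the base for the next bit
        exp >>= 1;
    }
    return result;
}
With that, ipow(i, 2) gives (-1,0) exactly, just like the pre-C++11 overload did.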

You need to get into a different mode when you are using floating point numbers. Floating points are APPROXIMATIONS of real numbers.
1.22461e-016 is
0.0000000000000000122461
An engineer would say that IS zero. You will always get such variations (unless you stick to operations on sums of powers of 2 in the same general range).
A value as simple as 0.1 cannot be represented exactly with floating point numbers.
The general problem you present has two parts:
1. Dealing with floating point numbers in processing
2. Displaying floating point numbers.
For the processing, I would wager that doing:
comp = i * i ;
would give you what you want.
pow(x, y) is generally going to do
exp(log(x) * y)
For output, switch to using a fixed (F) format.
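Putting both suggestions together, a minimal sketch (reusing the names from the question; the exact digits of the pow result may vary by platform):
#include <iostream>
#include <iomanip>
#include <complex>

int main()
{
    std::complex<double> i(0, 1);

    std::complex<double> comp   = i * i;            // exact: (-1, 0)
    std::complex<double> viaPow = std::pow(i, 2);   // may carry ~1e-16 rounding error

    std::cout << std::fixed << std::setprecision(2)
              << comp << '\n' << viaPow << '\n';    // both print as (-1.00,0.00)
}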

Related

cout causes compile error when using __float128 (error: ambiguous overload for ‘operator<<’) [duplicate]


Is there a way to prevent developers from using std::min, std::max?

We have an algorithm library doing lots of std::min/std::max operations on numbers that could be NaN. Considering this post: Why does Release/Debug have a different result for std::min?, we realised it's clearly unsafe.
Is there a way to prevent developers from using std::min/std::max?
Our code is compiled both with VS2015 and g++. We have a common header file included by all our source files (through /FI option for VS2015 and -include for g++). Is there any piece of code/pragma that could be put here to make any cpp file using std::min or std::max fail to compile?
By the way, legacy code like STL headers using this function should not be impacted. Only the code we write should be impacted.
I don't think making standard library functions unavailable is the correct approach. First off, NaNs are a fundamental aspect of how floating point values work. You'd need to disable all kinds of other things, e.g., sort(), lower_bound(), etc. Also, programmers are paid for being creative and I doubt that any programmer reaching for std::max() would hesitate to use a < b ? b : a if std::max(a, b) doesn't work.
Also, you clearly don't want to disable std::max() or std::min() for types which don't have NaN, e.g., integers or strings. So, you'd need a somewhat controlled approach.
There is no portable way to disable any of the standard library algorithms in namespace std. You could hack it by providing suitable deleted overloads to locate uses of these algorithms, e.g.:
namespace std {
float max(float, float) = delete; // **NOT** portable
double max(double, double) = delete; // **NOT** portable
long double max(long double, long double) = delete; // **NOT** portable
// likewise and also not portable for min
}
I'm going a bit philosophical here and less code. But I think the best approach would be to educate those developers, and explain why they shouldn't code in a specific way. If you'll be able to give them a good explanation, then not only will they stop using functions that you don't want them to. They will be able to spread the message to other developers in the team.
I believe that forcing them will just make them come up with workarounds.
As modifying std is disallowed, the following is UB, but it may work in your case.
Mark the functions as deprecated.
Since C++14, with the [[deprecated]] attribute:
namespace std
{
template <typename T>
[[deprecated("unsafe with NaN arguments")]] constexpr const T& min(const T&, const T&);
template <typename T>
[[deprecated("unsafe with NaN arguments")]] constexpr const T& max(const T&, const T&);
}
And before C++14, with compiler-specific attributes:
#ifdef __GNUC__
# define DEPRECATED(func) func __attribute__ ((deprecated))
#elif defined(_MSC_VER)
# define DEPRECATED(func) __declspec(deprecated) func
#else
# pragma message("WARNING: You need to implement DEPRECATED for this compiler")
# define DEPRECATED(func) func
#endif
namespace std
{
template <typename T> constexpr const T& DEPRECATED(min(const T&, const T&));
template <typename T> constexpr const T& DEPRECATED(max(const T&, const T&));
}
There's no portable way of doing this since, aside from a couple of exceptions, you are not allowed to change anything in std.
However, one solution is to
#define max foo
before including any of your code. Then both std::max and max will issue compile-time failures.
But really, if I were you, I'd just get used to the behaviour of std::max and std::min on your platform. If they don't do what the standard says they ought to do, then submit a bug report to the compiler vendor.
If you get different results in debug and release, then the problem isn't getting different results. The problem is that one version, or probably both, are wrong. And that isn't fixed by disallowing std::min or std::max or replacing them with different functions that have defined results. You have to figure out which outcome you would actually want for each function call to get the correct result.
I'm not going to answer your question exactly, but instead of disallowing std::min and std::max altogether, you could educate your coworkers and make sure that you are consistently using a total order comparator instead of a raw operator< (implicitly used by many standard library algorithms) whenever you use a function that relies on a given order.
Such a comparator is proposed for standardization in P0100 — Comparison in C++ (as well as partial and weak order comparators), probably targeting C++20. Meanwhile, the C standard committee has been working for quite a while on TS 18661 — Floating-point extensions for C, part 1: Binary floating-point arithmetic, apparently targeting the future C2x (should be ~C23), which updates the <math.h> header with many new functions required to implement the recent ISO/IEC/IEEE 60559:2011 standard. Among the new functions, there is totalorder (section 14.8), which compares floating point numbers according to the IEEE totalOrder:
totalOrder(x, y) imposes a total ordering on canonical members of the format of x and y:
If x < y, totalOrder(x, y) is true.
If x > y, totalOrder(x, y) is false.
If x = y:
    totalOrder(-0, +0) is true.
    totalOrder(+0, -0) is false.
    If x and y represent the same floating point datum:
        If x and y have negative sign, totalOrder(x, y) is true if and only if the exponent of x ≥ the exponent of y.
        Otherwise totalOrder(x, y) is true if and only if the exponent of x ≤ the exponent of y.
If x and y are unordered numerically because x or y is NaN:
    totalOrder(−NaN, y) is true where −NaN represents a NaN with negative sign bit and y is a floating-point number.
    totalOrder(x, +NaN) is true where +NaN represents a NaN with positive sign bit and x is a floating-point number.
    If x and y are both NaNs, then totalOrder reflects a total ordering based on:
        negative sign orders below positive sign
        signaling orders below quiet for +NaN, reverse for −NaN
        lesser payload, when regarded as an integer, orders below greater payload for +NaN, reverse for −NaN.
That's quite a wall of text, so here is a list that helps to see what's greater than what (from greater to lesser):
positive quiet NaNs (ordered by payload regarded as integer)
positive signaling NaNs (ordered by payload regarded as integer)
positive infinity
positive reals
positive zero
negative zero
negative reals
negative infinity
negative signaling NaNs (ordered by payload regarded as integer)
negative quiet NaNs (ordered by payload regarded as integer)
Unfortunately, this total order currently lacks library support, but it is probably possible to hack together a custom total order comparator for floating point numbers and use it whenever you know there will be floating point numbers to compare. Once you get your hands on such a total order comparator, you can safely use it everywhere it is needed instead of simply disallowing std::min and std::max.
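For illustration, here is a minimal sketch of such a comparator for double, built on the usual bit-pattern trick. It assumes IEEE 754 binary64 doubles, and the names to_key and total_less are mine, not from any library or proposal:
#include <cstdint>
#include <cstring>

// Map a double to an unsigned key whose ordering matches IEEE totalOrder:
// flip all bits for negatives (reversing their order and putting them first),
// set the sign bit for positives (putting them after all negatives).
std::uint64_t to_key(double x)
{
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return (bits & 0x8000000000000000ULL) ? ~bits
                                          : bits | 0x8000000000000000ULL;
}

// Strict total order over all doubles, including -0/+0 and NaNs.
bool total_less(double a, double b)
{
    return to_key(a) < to_key(b);
}
Such a comparator can then be passed as the custom comparison to std::min, std::max, std::sort and friends.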
If you compile using GCC or Clang, you can poison these identifiers.
#pragma GCC poison min max atoi /* etc ... */
Using them will issue a compiler error:
error: attempt to use poisoned "min"
The only problem with this in C++ is you can only poison "identifier tokens", not std::min and std::max, so it actually also poisons all functions and local variables named min and max... maybe not quite what you want, but maybe not a problem if you choose Good Descriptive Variable Names™.
If a poisoned identifier appears as part of the expansion of a macro
which was defined before the identifier was poisoned, it will not
cause an error. This lets you poison an identifier without worrying
about system headers defining macros that use it.
For example,
#define strrchr rindex
#pragma GCC poison rindex
strrchr(some_string, 'h');
will not produce an error.
Read the link for more info, of course.
https://gcc.gnu.org/onlinedocs/gcc-3.3/cpp/Pragmas.html
Deprecating std::min / std::max has already been suggested. Alternatively, you can find instances by doing a search with grep. Or you can fiddle about with the headers themselves to break std::min and std::max. Or you can try defining min / max or std::min / std::max to the preprocessor. The latter is a bit dodgy because of C++ namespaces: if you define std::max/min you don't pick up using namespace std, and if you define min/max, you also pick up other uses of those identifiers.
Or if the project has a standard header like "mylibrary.lib" that everyone includes, break std::min / max in that.
The functions should return NaN when passed NaN, of course. But any comparison involving NaN is false, so the natural way of writing them will silently return one of the operands instead of NaN.
IMO the failure of the C++ standard to require min(NaN, x) and min(x, NaN) to return NaN (and similarly for max) is a serious flaw, because it hides the fact that a NaN has been generated and results in surprising behaviour. Very few software developers do sufficient static analysis to ensure that NaNs can never be generated for all possible input values. So we declare our own templates for min and max, with specialisations for float and double to give correct behaviour with NaN arguments. This works for us, but might not work for those who use larger parts of the STL than we do. Our field is high integrity software, so we don't use much of the STL because dynamic memory allocation is usually banned after the startup phase.
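A sketch of what such wrappers might look like; the names safe_min/safe_max and the overload-for-double approach are illustrative, not the poster's actual code:
#include <cmath>
#include <limits>

template <typename T>
const T& safe_min(const T& a, const T& b) { return b < a ? b : a; }

template <typename T>
const T& safe_max(const T& a, const T& b) { return a < b ? b : a; }

// Overloads for double that propagate NaN instead of silently picking an operand
// (float overloads would be analogous).
inline double safe_min(double a, double b)
{
    if (std::isnan(a) || std::isnan(b))
        return std::numeric_limits<double>::quiet_NaN();
    return b < a ? b : a;
}

inline double safe_max(double a, double b)
{
    if (std::isnan(a) || std::isnan(b))
        return std::numeric_limits<double>::quiet_NaN();
    return a < b ? b : a;
}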

Print __float128, without using quadmath_snprintf

In my question about Analysis of float/double precision in 32 decimal digits, one answer said to take a look at __float128.
I used it and the compiler could find it, but I cannot print it, since the compiler cannot find the header quadmath.h.
So my questions are:
__float128 is standard, correct?
How to print it?
Isn't quadmath.h standard?
These answers did not help:
Use extern C
Precision in C++
Printing
The ref also did not help.
Note that I do not want to use any non standard library.
[EDIT]
It would be also useful, if that question had an answer, even if the answer was a negative one.
Work in GNU Fortran! It allows you to run the same program in different precisions: single (32 bit), double (64 bit), extended (80 bit) and quad (128 bit). You don't have to make any changes to the program, you simply write 'real' for all floating points. The size of the floating points is set by the compiler options -freal-4-real-8, -freal-4-real-10 and -freal-4-real-16.
Using the boost library was the best answer for me:
#include <iostream>
#include <boost/multiprecision/float128.hpp>
#include <boost/math/special_functions/gamma.hpp>
using namespace boost::multiprecision;

int main()
{
    float128 su1 = 0.33333333333333333q; // 'q' suffix: quad-precision literal (GCC extension)
    std::cout << "su1=" << su1 << std::endl;
}
Remember to link this library:
-lquadmath
No, it's not standard - neither the type nor the header. That's why the type has a double underscore (reserved name). Apparently, quadmath.h provides a quadmath_snprintf method. In C++ you would have used <<, of course.
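For completeness, this is roughly what printing through libquadmath looks like (non-standard, GCC-specific, and exactly what the question wanted to avoid); compile and link with -lquadmath:
#include <cstdio>
#include <quadmath.h>

int main()
{
    __float128 x = sqrtq(2.0Q);                      // quad-precision sqrt from libquadmath
    char buf[128];
    quadmath_snprintf(buf, sizeof buf, "%.33Qg", x); // 'Q' length modifier for __float128
    std::printf("%s\n", buf);
}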

Basic use of function "contains" in Boost ICL: Are some combinations of interval types and functions not implemented?

I started to use Boost ICL and I stumbled upon very basic stuff. For example, the function contains should return true or false depending on whether a given element is in the interval or not. However, that works for right_open/left_open intervals but not for open/closed intervals (see example below).
This seems too obvious to be an oversight. Am I using the library in the intended way?
For example (using gcc 4.8 or clang 3.3 and Boost 1.54):
#include <boost/concept_check.hpp> // needed to make this MWE work; boost icl should include it internally
#include <boost/icl/right_open_interval.hpp>
#include <boost/icl/closed_interval.hpp>
#include <boost/icl/open_interval.hpp>
#include <cassert>

int main(){
    boost::icl::right_open_interval<double> roi(6.,7.);
    assert(boost::icl::contains(roi, 6.) == true);  //ok
    assert(boost::icl::contains(roi, 7.) == false); //ok
    boost::icl::closed_interval<double> oi(4.,5.); // or open_interval
    assert(boost::icl::contains( oi, 4.) == false); //error: "candidate template ignored"
    assert(boost::icl::contains( oi, 5.) == false); //error: "candidate template ignored"
}
Note: The above are called "static" intervals (because their bound properties are part of the type). Dynamic intervals work as expected.
I would guess it comes down to the relative uselessness of floating point equality testing.
Have you ever tried to do assert(0.1 + 0.2 == 0.3)?
Try it. I'll wait.
If you already know the answer (the assertion fails), it'll be clear why a closed interval is not easy to implement correctly. Backgrounders:
Is floating point math broken?
Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)?
What Every Programmer Should Know About Floating-Point Arithmetic
Also, if you have two consecutive closed intervals [a,b] and [b,c], in which interval does b belong?
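As the question already notes, dynamic intervals behave as expected; a minimal sketch, assuming Boost.ICL as set up in the question:
#include <cassert>
#include <boost/icl/interval.hpp>

int main()
{
    // Bound types are a run-time property of the dynamic interval,
    // so contains() compiles and works for closed bounds too.
    auto ci = boost::icl::interval<double>::closed(4., 5.);
    assert(boost::icl::contains(ci, 4.));
    assert(!boost::icl::contains(ci, 5.5));
}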

atan2f precision on xcode

I have this very simple code:
#include <cstdio>
#include <cmath>
int main(int argc, const char * argv[])
{
printf("%2.21f", atan2f(0.f, -1.f));
return 0;
}
With next output on Intel CPUs:
Visual Studio 2010: 3.141592741012573200000
GCC 4.8.1 : 3.141592741012573242188
Xcode 5 : 3.141592502593994140625
After reading Apple manual pages for atan2f, I expect the printed value to be near 3.14159265359, as they say they will return +pi for special values like the one I'm using now. As you can see the difference is quite big between the value returned on Xcode and the expected value.
Is this a know issue? If yes, is there any workaround to solve this?
A single-precision floating point number has only about 7 digits of decimal precision. Your test value of 3.14159265359 has 12. If you want better precision, use double or long double and atan2 or atan2l to match.
Likely the reason you're getting "better" results from VS and GCC is that the compiler is noticing your function has constant arguments and is precalculating the result at higher-than-single precision. Check the generated code for proof.
The knee-jerk workaround is to use atan2. Casting that down to float gave me 3.141592741012573242188 just like your GCC 4.8.1 test.
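A sketch of that workaround against the original snippet (the extra digits you get will still depend on the platform's atan2 implementation):
#include <cstdio>
#include <cmath>

int main()
{
    double d = std::atan2(0.0, -1.0);      // compute in double precision
    float  f = static_cast<float>(d);      // narrow to float only at the end
    std::printf("%2.21f\n%2.21f\n", d, f);
    return 0;
}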
I would assume atan2f gives an answer not quite as precise as a float could hold because of how it arrives at its result; estimating the output precision you actually need is the smarter way to go.