Is boost::math::sinc_pi unnecessarily complicated? - c++

This is not a question about template hacks or dealing with compiler quirks. I understand why the Boost libraries are the way they are. This is about the actual algorithm used for the sinc_pi function in the Boost math library.
The function sinc(x) is equivalent to sin(x)/x.
In the documentation for the Boost math library's sinc_pi(), it says "Taylor series are used at the origin to ensure accuracy". This seems nonsensical since division of floating point numbers will not cause any more loss of precision than a multiplication would. Unless there's a bug in a particular implementation of sin, the naive approach of
double sinc(double x) {if(x == 0) return 1; else return sin(x)/x;}
seems like it would be fine.
I've tested this, and the maximum relative difference between the naive version and the one in the Boost math toolkit is only about half the epsilon for the type used, for both float and double, which puts it at the same scale as a discretization error. Furthermore, this maximum difference does not occur near 0, but near the end of the interval where the Boost version uses a partial Taylor series (i.e. abs(x) < epsilon**(1/4)). This makes it look like it is actually the Taylor series approximation which is (very slightly) wrong, either through loss of accuracy near the ends of the interval or through the repeated rounding from multiple operations.
Here are the results of the program I wrote to test this, which iterates through every float between 0 and 1 and calculates the relative difference between the Boost result and the naive one:
Test for type float:
Max deviation from Boost result is 5.96081e-08 relative difference
equals 0.500029 * epsilon
at x = 0.0185723
which is epsilon ** 0.25003
And here is the code for the program. It can be used to perform the same test for any floating-point type, and takes about a minute to run.
#include <cmath>
#include <iostream>
#include <limits> // for std::numeric_limits
#include "boost/math/special_functions/sinc.hpp"
template <class T>
T sinc_naive(T x) { using namespace std; if (x == 0) return 1; else return sin(x) / x; }
template <class T>
void run_sinc_test()
{
using namespace std;
T eps = std::numeric_limits<T>::epsilon();
T max_rel_err = 0;
T x_at_max_rel_err = 0;
for (T x = 0; x < 1; x = nextafter(x, static_cast<T>(1))) // step in type T so the test works for any floating-point type
{
T boost_result = boost::math::sinc_pi(x);
T naive_result = sinc_naive(x);
if (boost_result != naive_result)
{
T rel_err = abs(boost_result - naive_result) / boost_result;
if (rel_err > max_rel_err)
{
max_rel_err = rel_err;
x_at_max_rel_err = x;
}
}
}
cout << "Max deviation from Boost result is " << max_rel_err << " relative difference" << endl;
cout << "equals " << max_rel_err / eps << " * epsilon" << endl;
cout << "at x = " << x_at_max_rel_err << endl;
cout << "which is epsilon ** " << log(x_at_max_rel_err) / log(eps) << endl;
cout << endl;
}
int main()
{
using namespace std;
cout << "Test for type float:" << endl << endl;
run_sinc_test<float>();
cout << endl;
cin.ignore();
}

After some sleuthing, I dug up a discussion from the original authors.
[sin(x)] is well behaved at x=0, and so is sinc(x). […] my solution
will have better performance or small argument, i.e.|x| < pow(x, 1/6),
since most processor need much more time to evaluate sin(x) than
1- (1/6) * x *x.
From https://lists.boost.org/Archives/boost/2001/05/12421.php.
The earliest reference I found to using Taylor expansion to ensure accuracy is from much later, and committed by a different person. So it seems like this is about performance, not accuracy. If you want to make sure, you might want to get in touch with the people involved.
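To make the comparison concrete, here is a minimal sketch of the two approaches. It is not Boost's actual code: sinc_taylor_sketch is a hypothetical name, and I'm assuming a two-term series below the epsilon**(1/4) bound mentioned in the question.

```cpp
#include <cmath>
#include <limits>

// Naive version from the question.
template <class T>
T sinc_naive(T x) { return x == 0 ? T(1) : std::sin(x) / x; }

// Hypothetical sketch (not Boost's implementation): two-term Taylor
// series below the eps^(1/4) bound, plain sin(x)/x everywhere else.
template <class T>
T sinc_taylor_sketch(T x)
{
    const T bound = std::sqrt(std::sqrt(std::numeric_limits<T>::epsilon()));
    if (std::fabs(x) >= bound)
        return std::sin(x) / x;
    // The first dropped term is x^4/120, which is below eps/120 here.
    return T(1) - x * x / T(6);
}
```

Inside the interval the series branch avoids the call to sin entirely, which fits the performance argument in the quoted mail.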
Regarding sinc_pi specifically, I found the following exchange. Note that they use sinc_a to refer to the family of functions of the form sin(x*a)/(x*a).
What is the advatage of sinc_a(x) ? To address rounding problems for very
large x ? Then it would be more important to improve sin(x) for very large
arguments.
The main interest of this particular member of the family is that it requires fewer computations, and that, in itself it
is a special function as it is far more common than its brethren.
From https://lists.boost.org/Archives/boost/2001/05/12485.php.

Related

std::abs(std::complex) too slow

Why is running std::abs over a big complex array about 8 times slower than using sqrt and norm?
#include <ctime>
#include <cmath>
#include <cstdlib> // for rand
#include <vector>
#include <complex>
#include <iostream>
using namespace std;
int main()
{
typedef complex<double> compd;
vector<compd> arr(2e7);
for (compd& c : arr)
{
c.real(rand());
c.imag(rand());
}
double sm = 0;
clock_t tm = clock();
for (const compd& c : arr)
{
sm += abs(c);
}
cout << sm << ' ' << clock() - tm << endl; // 5.01554e+011 - 1640 ms
sm = 0;
tm = clock();
for (const compd& c : arr)
{
sm += sqrt(norm(c));
}
cout << sm << ' ' << clock() - tm << endl; // 5.01554e+011 - 154
sm = 0;
tm = clock();
for (const compd& c : arr)
{
sm += hypot(c.real(), c.imag());
}
cout << sm << ' ' << clock() - tm << endl; // 5.01554e+011 - 221
}
I believe the two are not to be taken as identical in the strict sense.
From cppreference on std::abs(std::complex):
Errors and special cases are handled as if the function is implemented as std::hypot(std::real(z), std::imag(z))
Also from cppreference on std::norm(std::complex):
The norm calculated by this function is also known as field norm or absolute square.
The Euclidean norm of a complex number is provided by std::abs, which is more costly to compute. In some situations, it may be replaced by std::norm, for example, if abs(z1) > abs(z2) then norm(z1) > norm(z2).
In short, there are cases where a different result is obtained from each function. Some of these may be found in std::hypot. There the notes also mention the following:
std::hypot(x, y) is equivalent to std::abs(std::complex<double>(x,y))
In general the accuracy of the result may be different (due to the usual floating point mess), and it seems the functions were designed in such a way to be as accurate as possible.
The main reason is that abs handles underflow and overflow during intermediate computations.
So, if norm underflows or overflows, your formula returns an incorrect or inaccurate result, while abs returns the correct one. For example, if your input numbers are around 10^200, the result should be around 10^200 as well, but your formula will give you inf or a floating point exception, because the intermediate norm is around 10^400, which is out of range. (I'm assuming IEEE-754 64-bit floating point here.)
Another reason is that abs may give a little bit more precise result.
If you don't need to handle these cases, because your input numbers are "well-behaved" (and don't need the possible more precise result), feel free to use your formula.
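A small sketch of the overflow case, assuming IEEE-754 doubles. abs_via_norm is a hypothetical name for the shortcut from the question:

```cpp
#include <cmath>
#include <complex>

// The shortcut from the question: the components are squared first,
// so the intermediate value can overflow even if the result would fit.
double abs_via_norm(const std::complex<double>& z)
{
    return std::sqrt(std::norm(z));
}
```

With z = std::complex<double>(1e200, 1e200), abs_via_norm(z) should come out as inf, while std::abs(z) stays finite at roughly 1.414e200.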

c++ half even rounding to x digits

Given a float, I want to round the result to 4 decimal places using half-even rounding, i.e., rounding to the next even number method. For example, when I have the following code snippet:
#include <iostream>
#include <iomanip>
int main(){
float x = 70.04535;
std::cout << std::fixed << std::setprecision(4) << x << std::endl;
}
The output is 70.0453, but I want it to be 70.0454. I could not find anything in the standard library. Is there a function to achieve this? If not, what would a custom function look like?
If you use float, you're kind of screwed here. There is no such value as 70.04535, because it's not representable in IEEE 754 binary floating point.
Easy demonstration with Python's decimal.Decimal class, which converts the actual float value exactly, digit for digit (well, a Python float is a C double, but it's the same principle):
>>> import decimal
>>> decimal.Decimal(70.04535)
Decimal('70.0453499999999991132426657713949680328369140625')
So your actual value doesn't end in a 5, it ends in 49999... (the closest to 70.04535 a C double can get; C float is even less precise); even banker's rounding would round it down. If this is important to your program, you need to use an equivalent C or C++ library that matches "human" (base-10) math expectations, e.g. libmpdec (which is what Python's decimal.Decimal uses under the hood).
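If you only care about values where the tie is exactly representable in binary, the standard library can already do half-even rounding: std::nearbyint rounds according to the current rounding mode, which is round-to-nearest-even by default. A minimal sketch, where round_half_even is a hypothetical helper and not a standard function:

```cpp
#include <cmath>

// Hypothetical helper: rounds x to p decimal places using the current
// floating-point rounding mode (round-half-to-even unless changed).
double round_half_even(double x, int p)
{
    double scale = 1.0;
    for (int i = 0; i < p; ++i)
        scale *= 10.0; // exact for small p
    // Caveat: the product x * scale is itself rounded to a double,
    // so a decimal "tie" may not be a tie in binary, and vice versa.
    return std::nearbyint(x * scale) / scale;
}
```

For example, round_half_even(2.5, 0) gives 2 and round_half_even(3.5, 0) gives 4. For inputs that are not exactly representable, the caveat in the comment applies.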
I'm sure someone can improve this, but it gets the job done.
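#include <cmath>
#include <iostream>
// Rounds x to p decimal places by adding half a unit and truncating.
// Note: this is round-half-up on the binary value, not half-even.
double round_p( double x, int p ){
double d = std::pow(10,p);
return std::floor((x*d)+0.5)/d;
}
int main(){
double x = 70.04535;
std::cout << "value " << x << " rounded " << round_p(x,4) << std::endl;
std::cout << "CHECK " << (round_p(x,4) == 70.0454) << std::endl;
}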

Unexpected result after converting uint64_t to double

In the following code:
#include <iostream>
...
uint64_t t1 = 1510763846;
uint64_t t2 = 1510763847;
double d1 = (double)t1;
double d2 = (double)t2;
// d1 == t2 => evaluates to true somehow?
// t1 == d2 => evaluates to true somehow?
// d1 == d2 => evaluates to true somehow?
// t1 == t2 => evaluates to false, of course.
std::cout << std::fixed <<
"uint64_t: " << t1 << ", " << t2 << ", " <<
"double: " << d1 << ", " << d2 << ", " << (d2+1) << std::endl;
I get this output:
uint64_t: 1510763846, 1510763847, double: 1510763904.000000, 1510763904.000000, 1510763905.000000
And I don't understand why. This answer: biggest integer that can be stored in a double says that an integral number up to 2^53 (9007199254740992) can be stored in a double without losing precision.
I actually get errors when I start doing calculations with the doubles, so it's not only a printing issue. (e.g. 1510763846 and 1510763847 both give 1510763904)
It's also very weird that the double can just be added to and then come out correct (d2+1 == 1510763905.000000)
Rationale: I'm converting these numbers to doubles because I need to work with them in Lua, which only supports floating point numbers. I'm sure I'm compiling the Lua lib with double as the lua_Number type, not float.
std::cout << sizeof(t1) << ", " << sizeof(d2) << std::endl;
Outputs
8, 8
I'm using VS 2012 with target MachineX86, toolkit v110_xp. Floating point model "Precise (/fp:precise)"
Addendum
With the help of people who replied and this article Why are doubles added incorrectly in a specific Visual Studio 2008 project?, I've been able to pinpoint the problem. A library is using a function like _set_controlfp, _control87, _controlfp or __control87_2 to change the precision of my executable to "single". That is why a uint64_t conversion to a double behaves as if it's a float.
When doing a file search for the above function names and "MCW_PC", which is used for Precision Control, I found the following libraries that might have set it:
Android NDK
boost::math
boost::numeric
DirectX (We're using June 2010)
FMod (non-EX)
Pyro particle engine
Now I'd like to rephrase my question:
How do I make sure converting from a uint64_t to a double goes correctly every time, without:
having to call _fpreset() each and every time a possible conversion occurs (think about the function parameters)
having to worry about a library's thread changing the floating point precision in between my _fpreset() and the conversion?
Naive code would be something like this:
double toDouble(uint64_t i)
{
double d;
do {
_fpreset();
d = i;
_fpreset();
} while (d != i);
return d;
}
double toDouble(int64_t i)
{
double d;
do {
_fpreset();
d = i;
_fpreset();
} while (d != i);
return d;
}
This solution assumes the odds of a thread messing with the floating point precision twice are astronomically small. The problem is that the values I'm working with are timers that represent real-world values, so I shouldn't be taking any chances. Is there a silver bullet for this problem?
Judging from an IEEE 754 floating point conversion, it looks like your double is actually being handled as a float. The standard allows the two to coincide, since it only requires double to provide at least as much precision as float.
The closest single-precision representation of 1510763846 is 1.510763904E9, i.e. 1510763904.
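You can reproduce the numbers from the question by forcing the value through a float. This is only a sketch of the rounding effect, not of what the precision-control setting does internally; via_float is a hypothetical name:

```cpp
#include <cstdint>

// Converts through float to show what a 24-bit significand does to the
// value. Near 1.5e9 adjacent floats are 128 apart, so the low bits are lost.
double via_float(std::uint64_t v)
{
    return static_cast<double>(static_cast<float>(v));
}
```

Both 1510763846 and 1510763847 come out as 1510763904 this way, while a direct conversion to double keeps them exact.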

float to int conversion going wrong (even though the float is already an int)

I was writing a little function to calculate the binomial coefficient using the tgamma function provided by C++. tgamma returns float values, but I wanted to return an integer. Please take a look at this example program comparing three ways of converting the float back to an int:
#include <iostream>
#include <cmath>
int BinCoeffnear(int n,int k){
return std::nearbyint( std::tgamma(n+1) / (std::tgamma(k+1)*std::tgamma(n-k+1)) );
}
int BinCoeffcast(int n,int k){
return static_cast<int>( std::tgamma(n+1) / (std::tgamma(k+1)*std::tgamma(n-k+1)) );
}
int BinCoeff(int n,int k){
return (int) std::tgamma(n+1) / (std::tgamma(k+1)*std::tgamma(n-k+1));
}
int main()
{
int n = 7;
int k = 2;
std::cout << "Correct: " << std::tgamma(7+1) / (std::tgamma(2+1)*std::tgamma(7-2+1)); //returns 21
std::cout << " BinCoeff: " << BinCoeff(n,k); //returns 20
std::cout << " StaticCast: " << BinCoeffcast(n,k); //returns 20
std::cout << " nearby int: " << BinCoeffnear(n,k); //returns 21
return 0;
}
Why is it that, even though the calculation returns a float equal to 21, 'normal' conversion fails and only nearbyint returns the correct value? What is the nicest way to implement this?
EDIT: according to c++ documentation here tgamma(int) returns a double.
From this std::tgamma reference:
If arg is a natural number, std::tgamma(arg) is the factorial of arg-1. Many implementations calculate the exact integer-domain factorial if the argument is a sufficiently small integer.
It seems that the compiler you're using is doing that, calculating the factorial of 7 for the expression std::tgamma(7+1).
The result might differ between compilers, and also between optimization levels. As demonstrated by Jonas there is a big difference between optimized and unoptimized builds.
The remark by @nos is on point. Note that the first line
std::cout << "Correct: " <<
std::tgamma(7+1) / (std::tgamma(2+1)*std::tgamma(7-2+1));
Prints a double value and does not perform a floating point to integer conversion.
The result of your calculation in floating point is indeed less than 21, yet this double precision value is printed by cout as 21.
On my machine (x86_64, gnu libc, g++ 4.8, optimization level 0) setting cout.precision(18) makes the results explicit.
Correct: 20.9999999999999964 BinCoeff: 20 StaticCast: 20 nearby int: 21
In this case it is practical to replace integer operations with floating point operations, but one has to keep in mind that the result must be an integer. The function intended for this is std::round.
The problem with std::nearbyint is that depending on the rounding mode it may produce different results.
std::fesetround(FE_DOWNWARD);
std::cout << " nearby int: " << BinCoeffnear(n,k);
would return 20.
So with std::round the BinCoeff function might look like
int BinCoeffRound(int n,int k){
return static_cast<int>(
std::round(
std::tgamma(n+1) /
(std::tgamma(k+1)*std::tgamma(n-k+1))
));
}
Floating-point numbers have rounding errors associated with them. Here is a good article on the subject: What Every Computer Scientist Should Know About Floating-Point Arithmetic.
In your case the floating-point number holds a value very close but less than 21. Rules for implicit floating–integral conversions say:
The fractional part is truncated, that is, the fractional part is
discarded.
Whereas std::nearbyint:
Rounds the floating-point argument arg to an integer value in floating-point format, using the current rounding mode.
In this case the floating-point number will be exactly 21 and the following implicit conversion would return 21.
The first cout outputs 21 because of rounding that happens in cout by default. See std::setprecision.
Here's a live example.
What is the nicest way to implement this?
Use the exact integer factorial function that takes and returns unsigned int instead of tgamma.
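A minimal sketch of what that could look like, using the multiplicative formula instead of full factorials so intermediate values stay smaller. The name binomial and the 64-bit return type are my own choices, not something from the standard library:

```cpp
#include <cstdint>

// Hypothetical helper: C(n, k) via the multiplicative formula.
// After step i the running value equals C(n - k + i, i), so every
// division is exact. Can still overflow for large n.
std::uint64_t binomial(unsigned n, unsigned k)
{
    if (k > n) return 0;
    if (k > n - k) k = n - k; // use symmetry to shorten the loop
    std::uint64_t result = 1;
    for (unsigned i = 1; i <= k; ++i)
        result = result * (n - k + i) / i;
    return result;
}
```

binomial(7, 2) gives 21 with no floating point involved, so there is nothing to round.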
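The problem lies in how the floating point results are handled.
A computation whose exact result is an integer may come out slightly below it, e.g. 20.99999... instead of 21.
Converting to int simply drops the fractional part.
So instead of converting to int immediately, first round the value by calling the ceil function, which is declared in cmath (or math.h). Note that ceil only helps when the error is on the low side.
This code returns 21 in every case: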
#include <iostream>
#include <cmath>
int BinCoeffnear(int n,int k){
return std::nearbyint( std::tgamma(n+1) / (std::tgamma(k+1)*std::tgamma(n-k+1)) );
}
int BinCoeffcast(int n,int k){
return static_cast<int>( ceil(std::tgamma(n+1) / (std::tgamma(k+1)*std::tgamma(n-k+1))) );
}
int BinCoeff(int n,int k){
return (int) ceil(std::tgamma(n+1) / (std::tgamma(k+1)*std::tgamma(n-k+1)));
}
int main()
{
int n = 7;
int k = 2;
std::cout << "Correct: " << (std::tgamma(7+1) / (std::tgamma(2+1)*std::tgamma(7-2+1))); //returns 21
std::cout << " BinCoeff: " << BinCoeff(n,k); //returns 21
std::cout << " StaticCast: " << BinCoeffcast(n,k); //returns 21
std::cout << " nearby int: " << BinCoeffnear(n,k); //returns 21
std::cout << "\n" << (int)(2.9995) << "\n";
}

std::cout << Predicting the automatic field width displayed for an arbitrary double

I'm displaying a large number of doubles on the console, and I would like to know in advance how many decimal places std::cout will decide to display for a given double. This is basically so I can make it look pretty in the console.
e.g. (pseudo-code)
field_width = find_maximum_display_precision_that_cout_will_use( whole_set_of_doubles );
...
// Every cout statement:
std::cout << std::setw( field_width ) << double_from_the_set << std::endl;
I figure cout "guesses"? a good precision to display based on the double. For example, it seems to display
std::cout << sqrt(2) << std::endl;
as 1.41421, but also
std::cout << (sqrt(0.5)*sqrt(0.5) + sqrt(1.5)*sqrt(1.5)) << std::endl;
as 2 (rather than 2.000000000000?????? or 1.99999999?????). Well, maybe this calculates to exactly 2.0, but I don't think that sqrt(2) will calculate to exactly 1.41421, so std::cout has to make some decision about how many decimal places to display at some point, right?
Is there any way to predict this, so that I can write a find_maximum_display_precision...() function?
What you need is the std::fixed manipulator.
http://www.cplusplus.com/reference/iostream/manipulators/fixed/
double d = 10.0/3; // 10/3 would be integer division and yield 3
std::cout << std::setprecision(5) << std::fixed << d << std::endl;
Sometimes C++ I/O bites. Making pretty output is one of those sometimes. The C printf family is easier to control, more understandable, more terse, and isn't plagued with those truly awful ios:: global variables. If you need to use C++ output for other reasons, you can always sprintf/snprintf to a string buffer and then print that using the << to stream operator. IMHO, If you don't need to use C++ output, don't. It is ugly and verbose.
In your question you are mixing precision and width, which are two different things.
Other answers concentrate on precision, but the given precision is the maximum, not the minimum, number of displayed digits. Trailing zeros are not padded unless ios::fixed or ios::scientific is set.
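A quick way to see this is to format a few values through a default-configured stream, which uses a precision of 6 significant digits. default_format is a hypothetical helper name:

```cpp
#include <sstream>
#include <string>

// Formats a double the way a default-configured stream would:
// at most 6 significant digits, no trailing zeros.
std::string default_format(double v)
{
    std::ostringstream out;
    out << v;
    return out.str();
}
```

default_format(1.41421356237) yields "1.41421", while default_format(2.0) yields just "2", which matches the behaviour described in the question.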
Here is a solution to determine the number of characters used for output, including sign and powers of 10:
#include <string>
#include <sstream>
#include <vector>
size_t max_width(const std::vector<double>& v)
{
size_t max = 0;
for (size_t i = 0; i < v.size(); ++i)
{
std::ostringstream out;
// optional: set precision, width, etc. to the same as in std::cout
out << v[i];
size_t length = out.str().size();
if (length > max) max = length;
}
return max;
}
Use std::cout.precision() to query or set the precision.
example :
# include <iostream>
# include <iomanip>
int main (void)
{
double x = 3.1415927;
std::cout << "Pi is " << std::setprecision(4) << x << std::endl;
return 0;
}
This would display:
Pi is 3.142
This link also includes an explanation of std::cout.precision():
http://www.cplusplus.com/reference/iostream/ios_base/precision/