How to get the coefficient from a std::decimal? - c++

Background
I want to write an is_even( decimal::decimal64 d ) function that returns true if the least-significant digit is even.
Unfortunately, I can't seem to find any methods to extract the coefficient from a decimal64.
Code
#include <iostream>
#include <decimal/decimal>
using namespace std;
static bool is_even( decimal::decimal64 d )
{
return true; // fix this - want to: return coefficient(d)%2==0;
}
int main()
{
auto d1 = decimal::make_decimal64( 60817ull, -4 ); // not even
auto d2 = decimal::make_decimal64( 60816ull, -4 ); // is even
cout << decimal64_to_float( d1 ) << " " << is_even( d1 ) << endl;
cout << decimal64_to_float( d2 ) << " " << is_even( d2 ) << endl;
return 0;
}

It's a little odd that there's no provided function to recover the coefficient of a decimal; but you can just multiply by 10 raised to its negative exponent:
bool is_even(decimal::decimal64 d)
{
auto q = quantexpd64(d);
auto coeff = static_cast<long long>(d * decimal::make_decimal64(1, -q));
return coeff % 2 == 0;
}
assert(!is_even(decimal::make_decimal64(60817ull, -4)));
assert(is_even(decimal::make_decimal64(60816ull, -4)));

I would use corresponding fmod function if possible.
static bool is_even( decimal::decimal64 d )
{
auto e = quantexpd64(d);
auto divisor = decimal::make_decimal64(2, e);
return decimal::fmodd64(d, divisor) == decimal::make_decimal64(0,0);
}
It constructs a divisor that is 2*10^e where e is exponent of the tested value. Then it performs fmod and checks whether it is equal to a decimal 0. (NOTE: operator== for decimal is said to be IEEE 754-2008 conformant so we don't need to take care of -0.0).
An alternative would be to multiply the number by 10^-e (to "normalize" it), cast it to an integer type, and check the remainder modulo 2 in the traditional way. I think this is @ecatmur's proposal. The "normalization" might fail, though, if the result goes out of the bounds of the chosen integer type.
I think fmod is better when it comes to overflows. You are guaranteed to be able to hold 2*10^e given that d is a proper decimal (i.e. not a NaN or an inf).
One caveat I see is the definition of least significant digit. The above methods assume that least significant digit is denoted by e, which sometimes might be counterintuitive. I.e. is decimal(21,2) even? Then is decimal(2100,0)?
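To make that caveat concrete, here is a minimal sketch that models a decimal as a coefficient/exponent pair. The `Dec` struct and both function names are hypothetical stand-ins, not part of `<decimal/decimal>`; the point is only that the parity of the stored coefficient and the parity of the numeric value can disagree.

```cpp
#include <cstdint>

// Hypothetical stand-in for a decimal value: value = coeff * 10^exp.
struct Dec {
    std::int64_t coeff;
    int exp;
};

// Parity of the stored coefficient, which is what the answers above test.
bool coeff_is_even(Dec d) { return d.coeff % 2 == 0; }

// Parity of the numeric value. Non-integer values are reported as not even.
bool value_is_even(Dec d) {
    std::int64_t c = d.coeff;
    int e = d.exp;
    // Drop trailing zeros while the exponent is negative (e.g. 20 * 10^-1 is 2).
    while (e < 0 && c % 10 == 0) {
        c /= 10;
        ++e;
    }
    if (e < 0) return false;  // a fractional part remains
    if (e > 0) return true;   // a multiple of 10 is always even
    return c % 2 == 0;
}
```

With these definitions, `Dec{21, 2}` and `Dec{2100, 0}` denote the same number, yet `coeff_is_even` answers false for the first and true for the second, while `value_is_even` answers true for both.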

Related

Exact double division

Consider the following function:
auto f(double a, double b) -> int
{
return std::floor(a/b);
}
So I want to compute the largest integer k such that k * b <= a in a mathematical sense.
As there could be rounding errors, I am unsure whether the above function really computes this k. I do not worry about the case that k could be out of range.
What is the proper way to determine this k for sure?
It depends how strict you are. Take a double b and an integer n, and calculate a = b*n. The product will be rounded. If a is rounded down, then it is less than the mathematical value of n*b, and a/b is mathematically less than n. You will get a result of n instead of n-1.
On the other hand, a == b*n will be true. So the “correct” result could be surprising.
Your condition was that “kb <= a”. If we interpret this as “the result of multiplying kb using double precision is <= a”, then you’re fine. If we interpret it as “the mathematically exact product of k and b is <= a”, then you need to calculate k*b - a using the fma function and check the result. This will tell you the truth, but might return a result of 4 if a was calculated as 5.0 * b and was rounded down.
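A minimal sketch of the fma check described above, under the assumptions that b > 0, both inputs are finite, and the quotient is small enough to be an exactly representable integer. The name `exact_floor_div` is made up for this illustration.

```cpp
#include <cmath>

// Largest integer k with k*b <= a in exact arithmetic.
// Assumes b > 0, finite inputs, and a quotient that fits in long long.
long long exact_floor_div(double a, double b) {
    double k = std::floor(a / b);
    // fma(k, b, -a) rounds the exact value of k*b - a only once,
    // so its sign tells us how k*b really compares to a.
    if (std::fma(k, b, -a) > 0.0) {
        k -= 1.0;  // k*b is mathematically greater than a: k was too large
    } else if (std::fma(k + 1.0, b, -a) <= 0.0) {
        k += 1.0;  // (k+1)*b still fits: k was too small
    }
    return static_cast<long long>(k);
}
```

For example, `exact_floor_div(1.0, 0.1)` gives 9 rather than 10, because the double nearest to 0.1 is slightly larger than 0.1, so ten copies of it do not fit into 1.0 mathematically. This is the "surprising but true" kind of result mentioned above.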
The problem is that float division is not exact.
a/b can give 1.9999 instead of 2, and std::floor can then give 1.
One simple solution is to add a small value prior to calling std::floor:
std::floor (a/b + 1.0e-10);
Result:
result = 10 while 11 was expected
With eps added, result = 11
Test code:
#include <iostream>
#include <cmath>
int main () {
double b = atan (1.0);
int x = 11;
double a = x * b;
int y = std::floor (a/b);
std::cout << "result = " << y << " while " << x << " was expected\n";
double eps = 1.0e-10;
int z = std::floor (a/b + eps);
std::cout << "With eps added, result = " << z << "\n";
return 0;
}

How to print a C++ double with the correct number of significant decimal digits?

When dealing with floating point values in Java, calling the toString() method gives a printed value that has the correct number of floating point significant figures. However, in C++, printing a float via stringstream will round the value after 5 or less digits. Is there a way to "pretty print" a float in C++ to the (assumed) correct number of significant figures?
EDIT: I think I am being misunderstood. I want the output to be of dynamic length, not a fixed precision. I am familiar with setprecision. If you look at the java source for Double, it calculates the number of significant digits somehow, and I would really like to understand how it works and/or how feasible it is to replicate this easily in C++.
/*
* FIRST IMPORTANT CONSTRUCTOR: DOUBLE
*/
public FloatingDecimal( double d )
{
long dBits = Double.doubleToLongBits( d );
long fractBits;
int binExp;
int nSignificantBits;
// discover and delete sign
if ( (dBits&signMask) != 0 ){
isNegative = true;
dBits ^= signMask;
} else {
isNegative = false;
}
// Begin to unpack
// Discover obvious special cases of NaN and Infinity.
binExp = (int)( (dBits&expMask) >> expShift );
fractBits = dBits&fractMask;
if ( binExp == (int)(expMask>>expShift) ) {
isExceptional = true;
if ( fractBits == 0L ){
digits = infinity;
} else {
digits = notANumber;
isNegative = false; // NaN has no sign!
}
nDigits = digits.length;
return;
}
isExceptional = false;
// Finish unpacking
// Normalize denormalized numbers.
// Insert assumed high-order bit for normalized numbers.
// Subtract exponent bias.
if ( binExp == 0 ){
if ( fractBits == 0L ){
// not a denorm, just a 0!
decExponent = 0;
digits = zero;
nDigits = 1;
return;
}
while ( (fractBits&fractHOB) == 0L ){
fractBits <<= 1;
binExp -= 1;
}
nSignificantBits = expShift + binExp +1; // recall binExp is - shift count.
binExp += 1;
} else {
fractBits |= fractHOB;
nSignificantBits = expShift+1;
}
binExp -= expBias;
// call the routine that actually does all the hard work.
dtoa( binExp, fractBits, nSignificantBits );
}
After this function, it calls dtoa( binExp, fractBits, nSignificantBits ); which handles a bunch of cases - this is from OpenJDK6
For more clarity, an example:
Java:
double test1 = 1.2593;
double test2 = 0.004963;
double test3 = 1.55558742563;
System.out.println(test1);
System.out.println(test2);
System.out.println(test3);
Output:
1.2593
0.004963
1.55558742563
C++:
std::cout << test1 << "\n";
std::cout << test2 << "\n";
std::cout << test3 << "\n";
Output:
1.2593
0.004963
1.55559
I think you are talking about how to print the minimum number of floating point digits that allow you to read the exact same floating point number back. This paper is a good introduction to this tricky problem.
http://grouper.ieee.org/groups/754/email/pdfq3pavhBfih.pdf
The dtoa function looks like David Gay's work, you can find the source here http://www.netlib.org/fp/dtoa.c (although this is C not Java).
Gay also wrote a paper about his method. I don't have a link but it's referenced in the above paper so you can probably google it.
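As a rough illustration of the "minimum number of digits that reads back the same" idea (not Gay's algorithm, which is far more efficient), one can simply try increasing precisions until the text converts back to the same double. The function name `shortest_repr` is hypothetical, and the sketch assumes the default "C" locale.

```cpp
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Brute-force search for the smallest precision that round-trips.
std::string shortest_repr(double x) {
    char buf[64];
    // max_digits10 (17 for IEEE double) is always enough to round-trip.
    for (int prec = 1; prec <= std::numeric_limits<double>::max_digits10; ++prec) {
        std::snprintf(buf, sizeof buf, "%.*g", prec, x);
        if (std::strtod(buf, nullptr) == x) {
            break;  // this many significant digits read back as the same value
        }
    }
    return buf;
}
```

With the values from the question, `shortest_repr(1.55558742563)` yields `1.55558742563` instead of the truncated `1.55559`.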
Is there a way to "pretty print" a float in C++ to the (assumed) correct number of significant figures?
Yes, you can do it with C++20 std::format, for example:
double test1 = 1.2593;
double test2 = 0.004963;
double test3 = 1.55558742563;
std::cout << std::format("{}", test1) << "\n";
std::cout << std::format("{}", test2) << "\n";
std::cout << std::format("{}", test3) << "\n";
prints
1.2593
0.004963
1.55558742563
The default format will give you the shortest decimal representation with a round-trip guarantee like in Java.
Since this is a new feature and may not be supported by some standard libraries yet, you can use the {fmt} library that std::format is based on. {fmt} also provides the print function that makes this even easier and more efficient (godbolt):
fmt::print("{}", 1.2593);
Disclaimer: I'm the author of {fmt} and C++20 std::format.
You can use the ios_base::precision technique where you can specify the number of digits you want
For example
#include <iostream>
using namespace std;
int main () {
double f = 3.14159;
cout.unsetf(ios::floatfield); // floatfield not set
cout.precision(5);
cout << f << endl;
cout.precision(10);
cout << f << endl;
cout.setf(ios::fixed,ios::floatfield); // floatfield set to fixed
cout << f << endl;
return 0;
}
The above code produces this output:
3.1416
3.14159
3.1415900000
There is a utility called numeric_limits:
#include <limits>
...
int num10 = std::numeric_limits<double>::digits10;
int max_num10 = std::numeric_limits<double>::max_digits10;
Note that IEEE numbers are not represented exactly by decimal digits. These are binary quantities. A more accurate number is the number of binary bits:
int bits = std::numeric_limits<double>::digits;
To pretty print all the significant digits, set the stream precision with this (streams have a precision() member; setprecision is a manipulator from <iomanip>):
out.precision(std::numeric_limits<double>::digits10);
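A small sketch of the difference between the two constants, assuming an IEEE 754 double (digits10 is 15, max_digits10 is 17). The helper name `with_precision` is made up for the example.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Format x with the given number of significant digits (default floatfield).
std::string with_precision(double x, int digits) {
    std::ostringstream out;
    out.precision(digits);
    out << x;
    return out.str();
}
```

`with_precision(0.1, std::numeric_limits<double>::digits10)` gives `0.1`, while the same call with `max_digits10` gives `0.10000000000000001`. The first is the number of decimal digits a double can hold without change; the second is the number needed to reproduce any double exactly when the text is read back, so it often shows more digits than a human would want.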

Split floating point number into fractional and integral parts

I'm writing a template class designed to work with any floating point type. For some of the methods I need to split a number into its integral and fractional parts. With primitive floating point types I can just cast to an integer to truncate the fractional part, but this wouldn't work with a big number class. Ideally my class would only use the four basic arithmetic operations (addition, subtraction, multiplication, division) in its calculations.
The method below is the solution I came up with. All it does is subtract powers of ten until the original number is less than 1. It works well, but seems like a brute-force approach. Is there a more efficient way to do this?
template< typename T >
class Math
{
public:
static T modf( T const & x, T & intpart )
{
T sub = 1;
T ret = x;
while( x >= sub )
{
sub *= 10;
}
sub /= 10;
while( sub >= 1 )
{
while( ret >= sub )
{
ret -= sub;
}
sub /= 10;
}//while [sub] > 0
intpart = x - ret;
return ret;
}
};
Note that I've removed the sign management code for brevity.
You could perhaps replace the subtraction loop with a binary search, although that's not an improvement in complexity class.
What you have requires a number of subtractions approximately equal to the sum of the decimal digits of x, whereas a binary search requires a number of addition-and-divide-by-two operations approximately equal to 3-and-a-bit times the number of decimal digits of x.
With what you're doing and with the binary search, there's no particular reason to use powers of 10 when looking for the upper bound, you could use any number. Some other number might be a bit quicker on average, although it probably depends on the type T.
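Here is one way the binary-search idea could look, using powers of two instead of powers of ten and nothing beyond the four basic operations and comparisons. It is only a sketch: `split_binary` is a hypothetical name, and it assumes x is non-negative and finite and that T can be constructed from small integers.

```cpp
// Splits x into integral and fractional parts; returns the fractional part.
// Assumes x >= 0 and finite.
template <typename T>
T split_binary(T const& x, T& intpart) {
    T p = 1;
    while (p * 2 <= x) {
        p *= 2;  // grow p to the largest power of two not exceeding x
    }
    intpart = 0;
    while (p >= 1) {
        if (intpart + p <= x) {
            intpart += p;  // this binary digit of the integral part is set
        }
        p /= 2;
    }
    return x - intpart;
}
```

For `double x = 5.75`, this sets the integral part to 5 and returns 0.75, with one comparison per binary digit of the integral part.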
Btw, I would also be tempted to make modf a function template within Math (or a free template function in a namespace), rather than Math a class template. That way you can specialize or overload one function at a time for particular types (especially the built-in types) without having to specialize the whole of Math.
Example:
namespace Math
{
template <typename T>
T modf( T const & x, T & intpart )
{ ... }
}
Call it like this:
float f = 1.5, fint;
std::cout << Math::modf(f, fint) << '\n';
double d = 2.5, dint;
std::cout << Math::modf(d, dint) << '\n';
mpf_class ff(3.5), ffint(0); // GNU multi-precision
std::cout << Math::modf(ff, ffint) << '\n';
Overload it like this:
namespace Math {
double modf(double x, double &intpart) {
return std::modf(x, &intpart);
}
mpf_class modf(const mpf_class &x, mpf_class &intpart) {
intpart = floor(x);
return x - intpart;
}
}
Maybe using std::modf would be better?
For a custom type you can provide a specialization of the Math class.
#include <cmath>
#include <iostream>
template<typename T>
class Math
{
public:
static T modf(const T& x, T& integral_part)
{
return std::modf(x, &integral_part);
}
};
int main()
{
double d_part = 0.;
double res = Math<double>::modf(5.2123, d_part);
std::cout << d_part << " " << res << std::endl;
}
I don't know how strict your "ideally use only mathematical operations" constraint is, but nonetheless, for the fractional part, could you extract it to a string and convert it back to a float?

Fermat's factorisation in C++

For fun, I've been implementing some maths stuff in C++, and I've been attempting to implement Fermat's Factorisation Method. However, I'm not sure I understand what it's supposed to return. The implementation I have returns 105 for the example number 5959 given in the Wikipedia article.
The pseudocode in Wikipedia looks like this:
One tries various values of a, hoping that a*a - N = b2 is a square.
FermatFactor(N): // N should be odd
    a ← ceil(sqrt(N))
    b2 ← a*a - N
    while b2 isn't a square:
        a ← a + 1 // equivalently: b2 ← b2 + 2*a + 1
        b2 ← a*a - N // a ← a + 1
    endwhile
    return a - sqrt(b2) // or a + sqrt(b2)
My C++ implementation, look like this:
int FermatFactor(int oddNumber)
{
double a = ceil(sqrt(static_cast<double>(oddNumber)));
double b2 = a*a - oddNumber;
std::cout << "B2: " << b2 << "a: " << a << std::endl;
double tmp = sqrt(b2);
tmp = round(tmp,1);
while (compare_doubles(tmp*tmp, b2)) //does this line look correct?
{
a = a + 1;
b2 = a*a - oddNumber;
std::cout << "B2: " << b2 << "a: " << a << std::endl;
tmp = sqrt(b2);
tmp = round(tmp,1);
}
return static_cast<int>(a + sqrt(b2));
}
bool compare_doubles(double a, double b)
{
int diff = std::fabs(a - b);
return diff < std::numeric_limits<double>::epsilon();
}
What is it supposed to return? It seems to be just returning a + b, which is not the factors of 5959?
EDIT
double cint(double x){
double tmp = 0.0;
if (modf(x,&tmp)>=.5)
return x>=0?ceil(x):floor(x);
else
return x<0?ceil(x):floor(x);
}
double round(double r,unsigned places){
double off=pow(10,static_cast<double>(places));
return cint(r*off)/off;
}
Do note that you should be doing all those calculations on integer types, not on floating point types. It would be much, much simpler (and possibly more correct).
Your compare_doubles function is wrong. diff should be a double.
And once you fix that, you'll need to fix your test line. compare_doubles will return true if its inputs are "nearly equal". You need to loop while they are "not nearly equal".
So:
bool compare_doubles(double a, double b)
{
double diff = std::fabs(a - b);
return diff < std::numeric_limits<double>::epsilon();
}
And:
while (!compare_doubles(tmp*tmp, b2)) // now it is
{
And you will get the correct result (101) for this input.
You'll also need to call your round function with 0 as "places" as vhallac points out - you shouldn't be rounding to one digit after the decimal point.
The Wikipedia article you link has the equation that allows you to identify b from N and a-b.
There are two problems in your code:
compare_doubles returns true when they are close enough. So, the while loop condition is inverted.
The round function requires number of digits after decimal point. So you should use round(x, 0).
As I've suggested, it is easier to use int for your datatypes. Here's working code implemented using integers.
The two factors are (a+b) and (a-b). It is returning one of those. You can get the other easily.
N = (a+b)*(a-b)
a-b = N/(a+b)
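For reference, an integer-only version might look like the sketch below. It is not the code linked in the answer above; the names `isqrt` and `fermat_factor` are illustrative, and it assumes N is odd, greater than 1, and small enough that a*a does not overflow 64 bits.

```cpp
#include <cstdint>
#include <utility>

// Integer square root (floor) by Newton's iteration, no floating point.
std::uint64_t isqrt(std::uint64_t n) {
    if (n < 2) return n;
    std::uint64_t x = n;
    std::uint64_t y = (x + 1) / 2;
    while (y < x) {
        x = y;
        y = (x + n / x) / 2;
    }
    return x;
}

// Returns the pair (a - b, a + b) with (a - b) * (a + b) == n.
// Assumes n is odd and n > 1; for a prime n the result is (1, n).
std::pair<std::uint64_t, std::uint64_t> fermat_factor(std::uint64_t n) {
    std::uint64_t a = isqrt(n);
    if (a * a < n) ++a;  // a = ceil(sqrt(n))
    for (;;) {
        std::uint64_t b2 = a * a - n;
        std::uint64_t b = isqrt(b2);
        if (b * b == b2) {
            return {a - b, a + b};  // b2 is a perfect square
        }
        ++a;
    }
}
```

For N = 5959 the loop stops at a = 80, b = 21 and returns the pair (59, 101), so both factors come out at once and no tolerance-based comparison is needed.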

Rounding double values in C++ like MS Excel does it

I've searched all over the net, but I could not find a solution to my problem. I simply want a function that rounds double values like MS Excel does. Here is my code:
#include <iostream>
#include "math.h"
using namespace std;
double Round(double value, int precision) {
return floor(((value * pow(10.0, precision)) + 0.5)) / pow(10.0, precision);
}
int main(int argc, char *argv[]) {
/* The way MS Excel does it:
1.27815 1.27840 -> 1.27828
1.27813 1.27840 -> 1.27827
1.27819 1.27843 -> 1.27831
1.27999 1.28024 -> 1.28012
1.27839 1.27866 -> 1.27853
*/
cout << Round((1.27815 + 1.27840)/2, 5) << "\n"; // *
cout << Round((1.27813 + 1.27840)/2, 5) << "\n";
cout << Round((1.27819 + 1.27843)/2, 5) << "\n";
cout << Round((1.27999 + 1.28024)/2, 5) << "\n"; // *
cout << Round((1.27839 + 1.27866)/2, 5) << "\n"; // *
if(Round((1.27815 + 1.27840)/2, 5) == 1.27828) {
cout << "Hurray...\n";
}
system("PAUSE");
return EXIT_SUCCESS;
}
I have found the function here at stackoverflow, the answer states that it works like the built-in excel rounding routine, but it does not. Could you tell me what I'm missing?
In a sense what you are asking for is not possible:
Floating point values on most common platforms do not have a notion of a "number of decimal places". Numbers like 2.3 or 8.71 simply cannot be represented precisely. Therefore, it makes no sense to ask for any function that will return a floating point value with a given number of non-zero decimal places -- such numbers simply do not exist.
The only thing you can do with floating point types is to compute the nearest representable approximation, and then print the result with the desired precision, which will give you the textual form of the number that you desire. To compute the representation, you can do this:
double round(double x, int n)
{
int e;
double d;
std::frexp(x, &e);
if (e >= std::numeric_limits<double>::digits) return x; // |x| >= 2^52, so x is already an integer (needs <limits>)
double const f = std::pow(10.0, n);
std::modf(x * f, &d); // d == integral part of 10^n * x
return d / f;
}
(You can also use modf instead of frexp to determine whether x is already an integer. You should also check that n is non-negative, or otherwise define semantics for negative "precision".)
Alternatively to using floating point types, you could perform fixed point arithmetic. That is, you store everything as integers, but you treat them as units of, say, 1/1000. Then you could print such a number as follows:
std::cout << n / 1000 << "." << std::setw(3) << std::setfill('0') << n % 1000; // zero-pad so that 1005 prints as 1.005 (needs <iomanip>)
Addition works as expected, though you have to write your own multiplication function.
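A minimal sketch of such a fixed-point scheme, with three decimal places. The `Fixed3` type and the function names are hypothetical, and the multiplication assumes the intermediate product fits in 64 bits.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical fixed-point value: the real value is raw / 1000.
struct Fixed3 {
    std::int64_t raw;
};

// Addition is plain integer addition because both operands share the scale.
Fixed3 add(Fixed3 a, Fixed3 b) { return {a.raw + b.raw}; }

// The raw product carries the scale twice, so divide by 1000 once,
// rounding half away from zero.
Fixed3 mul(Fixed3 a, Fixed3 b) {
    std::int64_t p = a.raw * b.raw;
    std::int64_t half = (p >= 0) ? 500 : -500;
    return {(p + half) / 1000};
}

// Prints with exactly three zero-padded fractional digits.
std::string format_fixed(Fixed3 x) {
    std::int64_t r = x.raw;
    const char* sign = (r < 0) ? "-" : "";
    if (r < 0) r = -r;
    char buf[48];
    std::snprintf(buf, sizeof buf, "%s%lld.%03lld", sign,
                  static_cast<long long>(r / 1000),
                  static_cast<long long>(r % 1000));
    return buf;
}
```

For instance, `format_fixed(mul(Fixed3{1500}, Fixed3{2250}))` gives `3.375`, and `format_fixed(Fixed3{1005})` gives `1.005` thanks to the zero padding.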
To compare double values, you must specify a range of comparison, where the result could be considered "safe". You could use a macro for that.
Here is one example of what you could use:
#define COMPARE( A, B, PRECISION ) ( ( A >= B - PRECISION ) && ( A <= B + PRECISION ) )
int main()
{
double a = 12.34567;
bool equal = COMPARE( a, 12.34567F, 0.0002 );
equal = COMPARE( a, 15.34567F, 0.0002 );
return 0;
}
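The same comparison can also be written as an inline function, which avoids the usual macro pitfalls (arguments evaluated more than once, missing parentheses around expressions). This is just an equivalent sketch; the name `nearly_equal` is made up.

```cpp
#include <cmath>

// True when a and b differ by at most tol (an absolute tolerance).
inline bool nearly_equal(double a, double b, double tol) {
    return std::fabs(a - b) <= tol;
}
```

It is called the same way as the macro, e.g. `nearly_equal(a, 12.34567F, 0.0002)`.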
Thank you all for your answers! After considering the possible solutions I changed the original Round() function in my code to adding 0.6 instead of 0.5 to the value.
The value "127827.5" (I do understand that this is not an exact representation!) becomes "127828.1" and finally through floor() and dividing it becomes "1.27828" (or something more like 1.2782800..001). Using COMPARE suggested by Renan Greinert with a correctly chosen precision I can safely compare the values now.
Here is the final version:
#include <iostream>
#include "math.h"
#define COMPARE(A, B, PRECISION) ((A >= B-PRECISION) && (A <= B+PRECISION))
using namespace std;
double Round(double value, int precision) {
return floor(value * pow(10.0, precision) + 0.6) / pow(10.0, precision);
}
int main(int argc, char *argv[]) {
/* The way MS Excel does it:
1.27815 1.27840 // 1.27828
1.27813 1.27840 -> 1.27827
1.27819 1.27843 -> 1.27831
1.27999 1.28024 -> 1.28012
1.27839 1.27866 -> 1.27853
*/
cout << Round((1.27815 + 1.27840)/2, 5) << "\n";
cout << Round((1.27813 + 1.27840)/2, 5) << "\n";
cout << Round((1.27819 + 1.27843)/2, 5) << "\n";
cout << Round((1.27999 + 1.28024)/2, 5) << "\n";
cout << Round((1.27839 + 1.27866)/2, 5) << "\n";
//Comparing the rounded value against a fixed one
if(COMPARE(Round((1.27815 + 1.27840)/2, 5), 1.27828, 0.000001)) {
cout << "Hurray!\n";
}
//Comparing two rounded values
if(COMPARE(Round((1.27815 + 1.27840)/2, 5), Round((1.27814 + 1.27841)/2, 5), 0.000001)) {
cout << "Hurray!\n";
}
system("PAUSE");
return EXIT_SUCCESS;
}
I've tested it by rounding a hundred double values and then comparing the results to what Excel gives. They were all the same.
I'm afraid the answer is that Round cannot perform magic.
Since 1.27828 is not exactly representable as a double, you cannot compare some double with 1.27828 and hope it will match.
You need to do the maths without the decimal part to get those numbers, so something like this:
double dPow = pow(10.0, 5.0);
double a = 1.27815;
double b = 1.27840;
double a2 = 1.27815 * dPow;
double b2 = 1.27840 * dPow;
double c = (a2 + b2) / 2 + 0.5;
Using your function...
double c = (Round(a) + Round(b)) / 2 + 0.5;