double and accuracy - c++

Using long double I get 18/19 = 0.947368421052631578..., and 947368421052631578 is the repeating decimal. Using double I get 0.947368421052631526... However, the former is correct. Why such an incorrect result?
Thanks for help.

A double typically provides 16(±1) decimal digits. Your example shows this:
4 8 12 16
v v v v
0.947368421052631578 long double
0.947368421052631526 double
The answers agree to 16 digits. This is what should be expected. Also note that there's no guarantee in the C Standard that a long double has more precision than a double.

You're trying to represent every decimal number with a finite amount of bits. Some things just aren't expressible exactly in floating point. Expecting exact answers with floats is your first problem. Take a look at What Every Computer Scientist Should Know About Floating-Point Arithmetic
Here's a summary from some lecture notes:
As mentioned earlier, computers cannot represent real numbers precisely since there are only a finite number of bits for storing a real number. Therefore, any number that has infinite number of digits such as 1/3, the square root of 2 and PI cannot be represented completely. Moreover, even a number of finite number of digits cannot be represented precisely because of the way of encoding real numbers.

A double which is usually implemented with IEEE 754 will be accurate to between 15 and 17 decimal digits. Anything past that can't be trusted, even if you can make the compiler display it.

Related

Double value change during assignment

I know double has some precision issues and it can truncate values during conversion to integer.
In my case I am assigning a double 690000000000123455 and it gets changed to 690000000000123392 during assignment.
Why is the number being changed so much drastically? After all there's no fractional part assigned with it. It doesn't seems like a precision issues as value doesn't change by 1 but 63.
Presumably you store 690000000000123455 as a 64 bit integer and assign this to a double.
double d = 690000000000123455;
The closest representable double precision value to 690000000000123455 can be checked here: http://pages.cs.wisc.edu/~rkennedy/exact-float?number=690000000000123455 and is seen to be 690000000000123392.
In other words, everything is as to be expected. Your number cannot be represented exactly as a double precision value and so the closest representable value is chosen.
For more discussion of floating point data types see: Is floating point math broken?
IEEE-754 double precision floats have about 53 bits of precision which equates to about 16 decimal digits (give or take). You'll notice that's about where your two numbers start to diverge.
double storage size is 8 byte. It's value ranges from 2.3E-308 to 1.7E+308. It's precision is upto 15 decimal places. But your number contains 18 digits. That's the reason.
You could use long double as it has precision upto 19 decimal places.
The other answers are already pretty complete, but I want to suggest a website I find very helpful in understanding how float point number works: IEEE 754 Converter (only 32-bit float here, but the interaction is still very good).
As we can see, 690000000000123455 is between 2^59 and 2^60, and the highest precision of the Mantissa is 2^-52 for double precision, which means that the precision step for the given number is 2^7=128. The error 63 you provided, is actually within range.
As a side suggestion, it is better to use long for storing big integers, as it will hold the precision and does not overflow (in your case).

Errors multiplying large doubles

I've made a BOMDAS calculator in C++ that uses doubles. Whenever I input an expression like
1000000000000000000000*1000000000000000000000
I get a result like 1000000000000000000004341624882808674582528.000000. I suspect it has something to do with floating-point numbers.
Floating point number represent values with a fixed size representation. A double can represent 16 decimal digits in form where the decimal digits can be restored (internally, it normally stores the value using base 2 which means that it can accurately represent most fractional decimal values). If the number of digits is exceeded, the value will be rounded appropriately. Of course, the upshot is that you won't necessarily get back the digits you're hoping for: if you ask for more then 16 decimal digits either explicitly or implicitly (e.g. by setting the format to std::ios_base::fixed with numbers which are bigger than 1e16) the formatting will conjure up more digits: it will accurately represent the internally held binary values which may produce up to, I think, 54 non-zero digits.
If you want to compute with large values accurately, you'll need some variable sized representation. Since your values are integers a big integer representation might work. These will typically be a lot slower to compute with than double.
A double stores 53 bits of precision. This is about 15 decimal digits. Your problem is that a double cannot store the number of digits you are trying to store. Digits after the 15th decimal digit will not be accurate.
That's not an error. It's exactly because of how floating-point types are represented, as the result is precise to double precision.
Floating-point types in computers are written in the form (-1)sign * mantissa * 2exp so they only have broader ranges, not infinite precision. They're only accurate to the mantissa precision, and the result after every operation will be rounded as such. The double type is most commonly implemented as IEEE-754 64-bit double precision with 53 bits of mantissa so it can be correct to log(253) ≈ 15.955 decimal digits. Doing 1e21*1e21 produces 1e42 which when rounding to the closest value in double precision gives the value that you saw. If you round that to 16 digits it's exactly the same as 1e42.
If you need more range, use double or long double. If you only works with integer then int64_t (or __int128 with gcc and many other compilers on 64-bit platforms) has a much larger precision (64/128 bits compared to 53 bits). If you need even more precision, use an arbitrary-precision arithmetic library instead such as GMP

Want to calculate 18 digits but will only calculate to 6? Using long double but still won't work

I've written a program to estimate pi using the Gregory Leibniz formula, however, it will not calculate to 18 decimal points. It will only calculate up to 5 decimal points. Any suggestions?
Use
cout.precision(50);
To increase the precision of the printed output. Here 50 is the number of decimal digits in your output.
The default printing precision for printf is 6
Precision specifies the exact number of digits to appear after the decimal point character. The default precision is 6
Similarly when std::cout was introduced into C++ the same default value was used
Manages the precision (i.e. how many digits are generated) of floating point output performed by std::num_put::do_put.
Returns the current precision.
Sets the precision to the given one. Returns the previous precision.
The default precision, as established by std::basic_ios::init, is 6.
https://en.cppreference.com/w/cpp/io/ios_base/precision
Therefore regardless of how precise the type is, only 6 fractional digits will be printed out. To get more digits you'll need to use std::setprecision or std::cout.precision
However calling std::cout.precision only affects the number of decimal digits in the output, not the number's real precision. Any digits over that type's precision would be just garbage
Most modern systems use IEEE-754 where float is single-precision with 23 bits of mantissa and double maps to double-precision with 52 bits of mantissa. As a result they're accurate to ~6-7 digits and ~15-16 decimal digits respectively. That means they can't represent numbers to 18 decimal points as you expected
On some platforms there may be some extended precision types so you'll be able to store numbers more precisely. For example long double on most compilers on x86 has 64 bits of precision and can represent ~18 significant digits, but it's not 18 digits after the decimal point. Higher precision can be obtained with quadruple-precision on some compilers. To achieve even more precision, the only way is to use a big number library or write one for your own.

How to print a float value of 6 digit precision in C++?

By default I'm getting 4 digit precision and when I use setprecision(6) the last digits of the variable come in random like 1/3=0.333369.
float has about 7 decimal digits of precision, due to its use of 24 binary digits to store the digits of the number. As far as output is concerned, setprecision(6) does everything you could ask for.
It's likely you are losing precision, for example by subtracting two numbers with similar values and printing the result. The quick solution is to change the computations to use double or long double. But to make any guarantees about the precision of a floating-point result, you need to understand how FP works and analyze how your formula is getting computed.
See What Every Computer Scientist Should Know About Floating-Point Arithmetic.

Setprecision() for a float number in C++?

In C++,
What are the random digits that are displayed after giving setprecision() for a floating point number?
Note: After setting the fixed flag.
example:
float f1=3.14;
cout < < fixed<<setprecision(10)<<f1<<endl;
we get random numbers for the remaining 7 digits? But it is not the same case in double.
Two things to be aware of:
floats are stored in binary.
float has a maximum of 24 significant bits. This is equivalent to 7.22 significant digits.
So, to your computer, there's no such number as 3.14. The closest you can get using float is 3.1400001049041748046875.
double has 53 significant bits (~15.95 significant digits), so you get a more accurate approximation, 3.140000000000000124344978758017532527446746826171875. The "noise" digits don't show up with setprecision(10), but would with setprecision(17) or higher.
They're not really "random" -- they're the (best available) decimal representation of that binary fraction (will be exact only for fractions whose denominator is a power of two, e.g., 3.125 would display exactly).
Of course that changes depending on the number of bits available to represent the binary fraction that best approaches the decimal one you originally entered as a literal, i.e., single vs double precision floats.
Not really a C++ specific issue (applies to all languages using binary floats, typically to exploit the machine's underlying HW, i.e., most languages). For a very bare-bone tutorial, I recommend reading this.