Multiplying doubles in C++ error - c++

I have a seemingly simple c++ issue that's bothering me. The output of the code
#include <iostream>
using namespace std;
int main() {
// your code goes here
double c = 9.43827 * 0.105952 ;
cout << c << endl ;
return 0;
}
is 1. Just 1. I guess this is due to precision loss based on how doubles are stored in c++ but surely there must be a way in c++ to get some sort of precision (2 or 3 decimal places) in the result.

It's not precision loss in storage, it's precision loss in converting to text. The stream inserter for double defaults to six significant digits. The product here, 1.000003583, rounded to six significant digits, is 1.00000. In addition, if you haven't set showpoint, the trailing zeros and the decimal point will be suppressed, so you'll see a bare 1. To get the decimal point to show, use std::cout << std::showpoint << c << '\n';. To see more significant digits, use std::cout << std::setprecision(whatever) << c << '\n';, where whatever is the number of digits you want the formatter to use.
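As a minimal sketch of the two manipulators mentioned above, the helper below formats a double into a string the way the stream inserter would, with a chosen number of significant digits and optionally with std::showpoint. The name format_double and its parameters are hypothetical and only serve the illustration.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format a double the way std::cout would,
// using the given number of significant digits.
std::string format_double(double value, int digits, bool show_point) {
    std::ostringstream out;
    if (show_point) {
        out << std::showpoint;  // keep the decimal point and trailing zeros
    }
    out << std::setprecision(digits) << value;
    return out.str();
}
```

With the product from the question, format_double(9.43827 * 0.105952, 6, false) yields "1", the same call with show_point set to true yields "1.00000", and asking for 10 digits yields "1.000003583".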

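#include <stdio.h>
#include <stdint.h>
#include <string.h>
int main() {
double c = 9.43827 * 0.105952 ;
// copy the bits into a 64-bit integer; casting the pointer to long* breaks
// aliasing rules and fails where long is only 32 bits wide
uint64_t bits;
memcpy(&bits, &c, sizeof bits);
for(int i = 63; i >= 0; i-- ) {
printf("%d", (int)((bits >> i) & 1));
}
printf("\n");
}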
If you run that, you can clearly see the bit representation of your double is not the integer value 1. You're not losing any data.
0011111111110000000000000000001111000001110100001010001001001001
but it is very close to 1, so that's what gets printed out.

Try using cout<<setprecision(12)<<c<<endl;
setprecision sets the decimal precision to be used to format floating-point values on output operations.

Related

How to set precision of a float?

For a number a = 1.263839, we can do:
float a = 1.263839;
cout << fixed << setprecision(2) << a << endl;
Output: 1.26
But what if I want to set the precision of a number and store it, for example,
convert 1.263839 to 1.26 without printing it?
But what if I want to set the precision of a number and store it
You can store the desired precision in a variable:
int precision = 2;
You can then later use this stored precision when converting the float to a string:
std::cout << std::setprecision(precision) << a;
I think OP wants to convert from 1.263839 to 1.26 without printing the number.
If this is your goal, then you first must realise, that 1.26 is not representable by most commonly used floating point representation. The closest representable 32 bit binary IEEE-754 value is 1.2599999904632568359375.
So, assuming such representation, the best that you can hope for is some value that is very close to 1.26. In best case the one I showed, but since we need to calculate the value, keep in mind that some tiny error may be involved beyond the inability to precisely represent the value (at least in theory; there is no error with your example input using the algorithm below, but the possibility of accuracy loss should always be considered with floating point math).
The calculation is as follows:
Let P be the number of digits after the decimal point that you want to round to (2 in this case).
Let D be 10^P (100 in this case).
Multiply input by D
std::round to nearest integer.
Divide by D.
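The steps above can be sketched as a small function. This is only an illustration under stated assumptions: the name round_to_places is hypothetical, and it works on double rather than the float used in the question.

```cpp
#include <cmath>

// Round `value` to `places` digits after the decimal point.
// The result is the nearest representable double, not an exact decimal.
double round_to_places(double value, int places) {
    const double scale = std::pow(10.0, places);  // D = 10^P
    return std::round(value * scale) / scale;     // multiply, round, divide
}
```

For example, round_to_places(1.263839, 2) returns the double closest to 1.26, which is as close as binary floating point can get.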
P.S. Sometimes you might not want to round to the nearest, but instead want std::floor or std::ceil to the precision. This is slightly trickier. Simply std::floor(val * D) / D is wrong. For example 9.70 floored to two decimals that way would become 9.69, which would be undesirable.
What you can do in this case is multiply with one extra magnitude of precision, round to nearest, then divide out the extra magnitude and proceed:
Let P be the number of digits after the decimal point that you want to round to (2 in this case).
Let D be 10^P (100 in this case).
Multiply input by D * 10
std::round to nearest integer.
Divide by 10
std::floor or std::ceil
Divide by D.
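A rough sketch of the floor variant of these steps follows. The function name floor_to_places is hypothetical and it assumes double inputs; swapping std::floor for std::ceil would give the ceiling version.

```cpp
#include <cmath>

// Floor `value` to `places` digits after the decimal point, using one
// extra guard digit so that 9.70 does not end up as 9.69.
double floor_to_places(double value, int places) {
    const double scale = std::pow(10.0, places);                // D = 10^P
    const double guarded = std::round(value * (scale * 10.0));  // keep one extra digit
    return std::floor(guarded / 10.0) / scale;                  // drop it, floor, divide by D
}
```

With this, floor_to_places(9.70, 2) gives back 9.70. Note that the guard digit is itself rounded, so an input such as 1.2699 comes out as 1.27 rather than 1.26; the sketch only suits inputs that carry at most one meaningful digit beyond the target precision.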
You would need to truncate it. Possibly the easiest way is to multiply it by a factor (in case of 2 decimal places, by a factor of 100), then truncate or round it, and lastly divide by the very same factor.
Now, mind you, that floating-point precision issues might occur, and that even after those operations your float might not be 1.26, but 1.26000000000003 instead.
If your goal is to store a number with a small, fixed number of digits of precision after the decimal point, you can do that by storing it as an integer with an implicit power-of-ten multiplier:
#include <stdio.h>
#include <math.h>
// Given a floating point value and the number of digits
// after the decimal-point that you want to preserve,
// returns an integer encoding of the value.
int ConvertFloatToFixedPrecision(float floatVal, int numDigitsAfterDecimalPoint)
{
return (int) roundf(floatVal*powf(10.0f, numDigitsAfterDecimalPoint));
}
// Given an integer encoding of your value (as returned
// by the above function), converts it back into a floating
// point value again.
float ConvertFixedPrecisionBackToFloat(int fixedPrecision, int numDigitsAfterDecimalPoint)
{
return ((float) fixedPrecision) / powf(10.0f, numDigitsAfterDecimalPoint);
}
int main(int argc, char ** arg)
{
const float val = 1.263839;
int fixedTwoDigits = ConvertFloatToFixedPrecision(val, 2);
printf("fixedTwoDigits=%i\n", fixedTwoDigits);
float backToFloat = ConvertFixedPrecisionBackToFloat(fixedTwoDigits, 2);
printf("backToFloat=%f\n", backToFloat);
return 0;
}
When run, the above program prints this output:
fixedTwoDigits=126
backToFloat=1.260000
If you're talking about storing exactly 1.26 in your variable, chances are you can't (there may be an off chance that exactly 1.26 works, but let's assume it doesn't for a moment) because floating point numbers don't work like that. There are always little inaccuracies because of the way computers handle floating point decimal numbers. Even if you could get 1.26 exactly, the inaccuracies can come back the moment you try to use it in a calculation.
That said, you can use some math and truncation tricks to get very close:
#include <iostream>
using namespace std;
int main()
{
// our float
float a = 1.263839;
// the scale factor for the precision we're trying to accomplish
int precision = 100; // 2 decimal places
// because we're an int, this will keep the 126 but lose everything else
int truncated = a * precision; // multiplying by the precision ensures we keep that many digits
// convert it back to a float
// Of course, we need to ensure we're doing floating point division
float b = static_cast<float>(truncated) / precision;
cout << "a: " << a << "\n";
cout << "b: " << b << "\n";
return 0;
}
Output:
a: 1.26384
b: 1.26
Note that this is not really 1.26 here, but it is very close.
This can be demonstrated by using setprecision():
cout << "a: " << std::setprecision(10) << a << "\n";
cout << "b: " << std::setprecision(10) << b << "\n";
Output:
a: 1.263839006
b: 1.25999999
So again, it's not exactly 1.26, but very close, and slightly closer than you were before.
Using a stringstream would be an easy way to achieve that:
#include <iostream>
#include <iomanip>
#include <sstream>
using namespace std;
int main() {
stringstream s("");
s << fixed << setprecision(2) << 1.263839;
float a;
s >> a;
cout << a; //Outputs 1.26
return 0;
}

Calculate PI up to 42 decimal places

I'm trying to write a program that uses the series to compute the value of PI. The user will input how far it wants the program to compute the series and then the program should output its calculated value of PI. I believe I've successfully written the code for this, however it does not do well with large numbers and only gives me a few decimal places. When I tried to use cout << fixed << setprecision(42); It just gave me "nan" as the value of PI.
int main() {
long long seqNum; // sequence number users will input
long double val; // the series output
cout << "Welcome to the compute PI program." << endl; // welcome message
cout << "Please inter the sequence number in the form of an integer." << endl;
cin >> seqNum; // user input
while ( seqNum < 0) // validation, number must be positive
{
cout << "Please enter a positive number." << endl;
cin >> seqNum;
} // end while
if (seqNum > 0)
{
for ( long int i = 0; i < seqNum; i++ )
{
val = val + 4*(pow(-1.00,i)/(1 + 2*i)); // Gregory-Leibniz sum calculation
}// end for
cout << val;
} // end if
return 0;
}
Any help would be really appreciated. Thank you
Your problem involves a fundamental property of double values: a double, or any floating point type, can hold only a limited number of significant digits. Plain, garden-variety doubles do not offer unlimited precision; there is a hard upper limit. The exact limit is implementation-defined, but on modern C++ implementations it is typically just 16 or 17 significant digits, nowhere near your desired 42.
#include <limits>
#include <iostream>
int main()
{
std::cout << std::numeric_limits<double>::max_digits10 << std::endl;
return 0;
}
This gives you the maximum digits of precision with your platform/C++ compiler. This shows a maximum of 17 digits of precision with g++ 9.2 on Linux (max_digits10 is C++11 or later, use digits10 with old C++ compilers to show a closely-related metric).
Your desired 42 digits of precision likely far exceed what your modest doubles can handle. There are various special-purpose math libraries that can perform calculations with higher levels of precision, you can investigate those, if you wish.
You did not initialize or assign any value to val, but you are reading it when you get to the first iteration of
val = val + 4*(pow(-1.00,i)/(1 + 2*i));
This causes your program to have undefined behavior. Initialize val, probably to zero:
long double val = 0; // the series output
That aside, as mentioned in the answer of @SamVarshavchik, there is a hard limit on the precision you can reach with the built-in floating point types, and 42 significant places is almost certainly outside of that. Similarly, the integer types that you are using are limited in size to probably at most 2^64, which is approximately 10^19.
Even if these limits weren't the problem, the series requires summation of roughly 10^42 terms to get PI to a precision of 42 places. It would take you longer than the universe has been around to calculate to that precision with all of Earth's current computing power combined.

How to format doubles in the following way?

I am using C++ and I would like to format doubles in the following obvious way. I have tried playing with 'fixed' and 'scientific' using stringstream, but I am unable to achieve this desired output.
double d = -5; // print "-5"
double d = 1000000000; // print "1000000000"
double d = 3.14; // print "3.14"
double d = 0.00000000001; // print "0.00000000001"
// Floating point error is acceptable:
double d = 10000000000000001; // print "10000000000000000"
As requested, here are the things I've tried:
#include <iostream>
#include <string>
#include <sstream>
#include <iomanip>
using namespace std;
string obvious_format_attempt1( double d )
{
stringstream ss;
ss.precision(15);
ss << d;
return ss.str();
}
string obvious_format_attempt2( double d )
{
stringstream ss;
ss.precision(15);
ss << fixed;
ss << d;
return ss.str();
}
int main(int argc, char *argv[])
{
cout << "Attempt #1" << endl;
cout << obvious_format_attempt1(-5) << endl;
cout << obvious_format_attempt1(1000000000) << endl;
cout << obvious_format_attempt1(3.14) << endl;
cout << obvious_format_attempt1(0.00000000001) << endl;
cout << obvious_format_attempt1(10000000000000001) << endl;
cout << endl << "Attempt #2" << endl;
cout << obvious_format_attempt2(-5) << endl;
cout << obvious_format_attempt2(1000000000) << endl;
cout << obvious_format_attempt2(3.14) << endl;
cout << obvious_format_attempt2(0.00000000001) << endl;
cout << obvious_format_attempt2(10000000000000001) << endl;
return 0;
}
That prints the following:
Attempt #1
-5
1000000000
3.14
1e-11
1e+16
Attempt #2
-5.000000000000000
1000000000.000000000000000
3.140000000000000
0.000000000010000
10000000000000000.000000000000000
There is no way for a program to KNOW how to format the numbers in the way that you are describing, unless you write some code to analyze the numbers in some way - and even that can be quite hard.
What is required here is knowing the input format in your source code, and that's lost as soon as the compiler converts the decimal input source code into binary form to store in the executable file.
One alternative that may work is to output to a stringstream, and then from that modify the output to strip trailing zeros. Something like this:
string obvious_format_attempt2( double d )
{
stringstream ss;
ss.precision(15);
ss << fixed;
ss << d;
string res = ss.str();
// Do we have a dot?
if (res.find('.') != string::npos)
{
// Walk back from the end over the trailing zeros...
string::size_type pos = res.size();
while(pos > 0 && res[pos - 1] == '0')
{
pos--;
}
// ...and drop the dot too if nothing is left after it
if (pos > 0 && res[pos - 1] == '.')
{
pos--;
}
res = res.substr(0, pos);
}
return res;
}
I haven't actually tried it, but as a rough sketch, it should work. Caveats are that if you have something like 0.1, it may well print as 0.09999999999999285 or some such, because 0.1 cannot be represented exactly in binary.
Formatting binary floating-point numbers accurately is quite tricky and was traditionally wrong. A pair of papers published in 1990 in the same journal settled that decimal values converted to binary floating-point numbers and back can have their values restored assuming they don't use more decimal digits than a specific constraint (in C++ represented using std::numeric_limits<T>::digits10 for the appropriate type T):
Clinger's "How to read floating-point numbers accurately" describes an algorithm to convert from a decimal representation to a binary floating-point.
Steele/White's "How to print floating-point numbers accurately" describes how to convert from a binary floating-point to a decimal value. Interestingly, the algorithm even converts to the shortest such decimal value.
At the time these papers were published the C formatting directives for binary floating points ("%f", "%e", and "%g") were well established and they didn't get changed to take the new results into account. The problem with the specification of these formatting directives is that "%f" counts the digits after the decimal point, and there is no format specifier asking to format numbers with a certain number of digits but not necessarily starting to count at the decimal point (e.g., to format with a decimal point but potentially having many leading zeros).
The format specifiers are still not improved, e.g., to include another one for non-scientific notation possibly involving many zeros, for that matter. Effectively, the power of the Steele/White's algorithm isn't fully exposed. The C++ formatting, sadly, didn't improve over the situation and just delegates the semantics to the C formatting directives.
The approach of not setting std::ios_base::fixed and using a precision of std::numeric_limits<double>::digits10 is the closest approximation of floating-point formatting the C and C++ standard libraries offer. The exact format requested could be obtained by getting the digits using formatting with std::ios_base::scientific, parsing the result, and rewriting the digits afterwards. To give this process a nice stream-like interface it could be encapsulated with a std::num_put<char> facet.
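A rough sketch of that "format as scientific, parse, rewrite" idea is shown below. It is not wrapped in a std::num_put facet, the name plain_decimal is hypothetical, and it assumes a finite input and a '.' decimal point (the default "C" locale).

```cpp
#include <cstdlib>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Print a finite double without an exponent and without trailing zeros,
// keeping std::numeric_limits<double>::digits10 significant digits.
std::string plain_decimal(double value) {
    if (value == 0.0) return "0";
    std::ostringstream out;
    out << std::scientific
        << std::setprecision(std::numeric_limits<double>::digits10 - 1)
        << value;
    std::string text = out.str();                    // e.g. "-3.14000000000000e+00"

    std::string sign;
    if (text[0] == '-') { sign = "-"; text.erase(0, 1); }

    const std::string::size_type epos = text.find('e');
    const int exponent = std::atoi(text.c_str() + epos + 1);
    std::string digits = text.substr(0, epos);       // "3.14000000000000"
    digits.erase(1, 1);                              // drop the decimal point
    digits.erase(digits.find_last_not_of('0') + 1);  // drop trailing zeros

    std::string result;
    if (exponent < 0) {
        // Pure fraction: leading "0." plus the zeros implied by the exponent.
        result = "0." + std::string(-exponent - 1, '0') + digits;
    } else if (static_cast<std::string::size_type>(exponent) + 1 >= digits.size()) {
        // Integer: pad with zeros up to the decimal point.
        result = digits + std::string(exponent + 1 - digits.size(), '0');
    } else {
        // Decimal point falls inside the digit string.
        result = digits.substr(0, exponent + 1) + "." + digits.substr(exponent + 1);
    }
    return sign + result;
}
```

For the five sample values in the question this produces "-5", "1000000000", "3.14", "0.00000000001" and "10000000000000000".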
An alternative could be the use of Double-Conversion. This implementation uses an improved (faster) algorithm for the conversion. It also exposes interfaces to get the digits in some form although not directly as a character sequence if I recall correctly.
You can't do what you want to do, because decimal numbers are not representable exactly in floating point format. In other words, double can't hold 3.14 exactly; it stores everything as fractions of powers of 2, so it stores it as something like 3 + 9175/65536 or thereabouts (do it on your calculator and you'll get 3.1399993896484375). (I realize that 65536 is not the right denominator for IEEE double, but the gist of it is correct.)
This is known as the round trip problem. You can't reliably do
double x = 3.14;
cout << magic << x;
and get "3.14"
If you must solve the round-trip problem, then don't use floating point. Use a custom "decimal" class, or use a string to hold the value.
Here's a decimal class you could use:
https://stackoverflow.com/a/15320495/364818
I am using C++ and I would like to format doubles in the following obvious way.
Based on your samples, I assume you want
Fixed rather than scientific notation,
A reasonable (but not excessive) amount of precision (this is for user display, so a small bit of rounding is okay),
Trailing zeros truncated, and
Decimal point truncated as well if the number looks like an integer.
The following function does just that:
#include <cmath>
#include <iomanip>
#include <sstream>
#include <string>
std::string fixed_precision_string (double num) {
// Magic numbers
static const int prec_limit = 14; // Change to 15 if you wish
static const double log10_fuzz = 1e-15; // In case log10 is slightly off
static const char decimal_pt = '.'; // Better: use std::locale
if (num == 0.0) {
return "0";
}
std::string result;
if (num < 0.0) {
result = '-';
num = -num;
}
int ndigs = int(std::log10(num) + log10_fuzz);
std::stringstream ss;
if (ndigs >= prec_limit) {
ss << std::fixed
<< std::setprecision(0)
<< num;
result += ss.str();
}
else {
ss << std::fixed
<< std::setprecision(prec_limit-ndigs)
<< num;
result += ss.str();
auto last_non_zero = result.find_last_not_of('0');
if (result[last_non_zero] == decimal_pt) {
result.erase(last_non_zero);
}
else if (last_non_zero+1 < result.length()) {
result.erase(last_non_zero+1);
}
}
return result;
}
If you are using a computer that uses IEEE floating point, changing prec_limit to 16 is unadvisable. While this will let you properly print 0.9999999999999999 as such, it also prints 5.1 as 5.0999999999999996 and 9.99999998 as 9.9999999800000001. This is from my computer, your results may vary due to a different library.
Changing prec_limit to 15 is okay, but it still leads to numbers that don't print "correctly". The value specified (14) works nicely so long as you aren't trying to print 1.0-1e-15.
You could do even better, but that might require discarding the standard library (see Dietmar Kühl's answer).

Why is the output different from what I expected?

I ran this code, but the output was different from what I expected.
The output:
c = 1324
v = 1324.99
I expected that the output should be 1324.987 for v. Why is the data in v different from output?
I'm using code lite on Windows 8 32.
#include <iostream>
using namespace std;
int main()
{
double v = 1324.987;
int n;
n = int (v);
cout << "c = " << n << endl;
cout << "v = " << v << endl;
return 0;
}
Floating point types incur rounding errors as a result of their fixed-width representations. For more information, see What Every Computer Scientist Should Know About Floating-Point Arithmetic.
The default precision when printing with cout is 6, so only 6 significant digits will be displayed (not 6 decimal places). The number is rounded to the nearest value, which is why you saw 1324.99. You need to set a higher precision, for example with std::setprecision from <iomanip>, to see the more "correct" value.
However, setting the precision too high may print out a lot of garbage digits behind, because binary floating-point types cannot store all decimal floating-point values exactly.
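A small sketch of the effect, using a hypothetical helper with_precision that formats a value the way cout would with a given number of significant digits:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Format `value` with the given number of significant digits,
// as operator<< does when no other flags are set.
std::string with_precision(double value, int significant_digits) {
    std::ostringstream out;
    out << std::setprecision(significant_digits) << value;
    return out.str();
}
```

Here with_precision(1324.987, 6) gives "1324.99", matching the output in the question, while with_precision(1324.987, 7) gives "1324.987".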

avoid rounding error (floating specifically) c++

http://www.learncpp.com/cpp-tutorial/25-floating-point-numbers/
I have been reading about this lately to review C++.
In general computing class professors tend not to cover these small things, although we knew what rounding errors meant.
Can someone please help me with how to avoid rounding error?
The tutorial shows a sample code
#include <iostream>
#include <iomanip>
int main()
{
using namespace std;
cout << setprecision(17);
double dValue = 0.1;
cout << dValue << endl;
}
This outputs
0.10000000000000001
By default, a float is shown with 6 digits of precision. Therefore, when we override the default and ask for more (in this case, 17!), we may encounter truncation (as explained by the tutorial as well).
For double, the highest is 16.
In general, how do good C++ programmers avoid rounding error?
Do you guys always look at the binary representation of the number?
Thank you.
The canonical advice for this topic is to read "What Every Computer Scientist Should Know About Floating-Point Arithmetic", by David Goldberg.
In other words, to minimize rounding errors, it can be helpful to keep numbers in decimal fixed-point (and actually work with integers).
#include <iostream>
#include <iomanip>
int main() {
using namespace std;
cout << setprecision(17);
double v1=1, v1D=10;
cout << v1/v1D << endl; // 0.10000000000000001
double v2=3, v2D=1000;
cout << v2/v2D << endl; // 0.0030000000000000001
// v1/v1D + v2/v2D = (v1*v2D+v2*v1D)/(v1D*v2D)
cout << (v1*v2D+v2*v1D)/(v1D*v2D) << endl; // 0.10299999999999999
}
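One way to read the advice about decimal fixed-point is sketched below: store the value as a whole number of thousandths, so that additions are exact integer additions. The Thousandths type and both function names are hypothetical and only illustrate the idea.

```cpp
#include <cstdint>
#include <string>

// A decimal fixed-point value stored as a whole number of thousandths.
struct Thousandths {
    std::int64_t units;
};

// Integer addition, so no rounding error is introduced.
Thousandths add(Thousandths a, Thousandths b) {
    return {a.units + b.units};
}

// Render the value as decimal text with exactly three fractional digits.
std::string to_decimal_string(Thousandths v) {
    const std::int64_t magnitude = v.units < 0 ? -v.units : v.units;
    std::string fraction = std::to_string(magnitude % 1000);
    fraction.insert(0, 3 - fraction.size(), '0');  // left-pad to three digits
    return (v.units < 0 ? "-" : "") + std::to_string(magnitude / 1000) + "." + fraction;
}
```

Adding 1/10 and 3/1000 this way, to_decimal_string(add(Thousandths{100}, Thousandths{3})) gives exactly "0.103". The trade-off is that the number of fractional digits is fixed in advance and the range is bounded by the integer type.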
Short version - you can't really avoid rounding and other representation errors when you're trying to represent base 10 numbers in base 2 (ie, using a float or a double to represent a decimal number). You pretty much either have to work out how many significant digits you actually have or you have to switch to a (slower) arbitrary precision library.
Most floating point output routines look to see if the answer is very close to being even when represented in base 10 and round the answer to actually be even on output. By setting the precision in this way you are short-circuiting this process.
This rounding is done because almost no answer that comes out even in base 10 will be even (i.e. end in an infinite string of trailing 0s) in base 2, which is the base in which the number is represented internally. But, of course, the general goal of an output routine is to present the number in a fashion useful for a human being, and most human beings in the world today read numbers in base 10.
When you calculate something simple like a variance, you can run into this kind of problem. Here is my solution:
#include <cstdlib>
#include <sstream>
#include <string>
int getValue(double val, int precision){
std::stringstream ss;
ss << val;
std::string strVal = ss.str();
size_t start = strVal.find(".");
std::string major = strVal.substr(0, start);
// No decimal point means there is no fractional part
std::string minor = (start == std::string::npos) ? "" : strVal.substr(start + 1);
// Fill with zeros...
while(minor.length() < static_cast<size_t>(precision)){
minor += "0";
}
// Trim over precision...
if(minor.length() > static_cast<size_t>(precision)){
minor = minor.substr(0, precision);
}
strVal = major + minor;
int intVal = atoi(strVal.c_str());
return intVal;
}
So you will do your calculations in the integer range.
For example, 2523.49 becomes 252349 with a precision of two digits, and 2523490 with a precision of three digits. If you calculate the mean, for example, you first convert all values to integers, do the summation, and convert the result back to double, so you do not accumulate error. Errors are amplified by operations like the square root and power functions.
You want to use the manipulator called "fixed" to format your digits so that they are not shown in scientific notation. After you use fixed, you can also use setprecision() to set the number of digits shown to the right of the decimal point. An example, based on your original code, would be as follows.
#include <iostream>
#include <iomanip>
int main()
{
using namespace std;
double dValue = 0.19213;
cout << fixed << setprecision(2) << dValue << endl;
}
outputs as:
0.19