I don't understand why. problems with double [duplicate] - c++

Why do doubles round themselves? How can I prevent it?
If I enter 45000.98 I expect to see 45000.98, but the number is rounded.
#include <iostream>
#include <iomanip>
using namespace std;
int main()
{
double a;
cin >> a; //if i insert 45000.98
cout << a; //output is 45001
cout << endl << setprecision(2) << a; //output is 4.5e+04
}

The double type has 11 bits for the exponent and 52 bits for the fractional part, more than enough precision to hold 45000.98 well beyond the digits you want to see. However, in the default floating-point format the setprecision argument is the maximum number of significant digits, not the number of digits after the decimal point. Use setprecision(8) and you should see 45000.98 as you expect.

The double did not round itself; the streaming operation rounded the value — first by default, and later according to your instructions. You requested that your value be rounded to 2 digits of precision, and that's what you got (just the first two digits: 4.5e+04). You are getting scientific notation because you have not requested enough digits to reach the decimal point.
If you want to see all 7 digits of 45000.98 then request at least 7 digits of precision. (You may want to stay under 17 digits though, since that's where you start seeing the artifacts of the floating point representation.)

Why does cout.precision() increase floating-point's precision?

I understand that single floating-point numbers have the precision of about 6 digits, so it's not surprising that the following program will output 2.
#include<iostream>
using namespace std;
int main(void) {
//cout.precision(7);
float f = 1.999998; //this number gets rounded up to the nearest hundred thousandths
cout << f << endl; //so f should equal 2
return 0;
}
But when cout.precision(7) is included, in fact anywhere before cout << f << endl;, the program outputs the whole 1.999998. This could only mean that f stored the whole floating-point number without rounding, right?
I know that cout.precision() should not, in any way, affect floating-point storage. Is there an explanation for this behavior? Or is it just on my machine?
> I understand that single floating-point numbers have the precision of about 6 digits
About six decimal digits, or exactly 24 binary digits for an IEEE 754 single-precision float (23 stored explicitly plus one implicit leading bit).
> this number gets rounded up to the nearest hundred thousandths
No, it doesn't. It gets rounded to the nearest value representable with 24 binary digits. That is not the same thing, and not commensurable with it.
> Why does cout.precision() increase floating-point's precision?
It doesn't. It affects how the value is printed.
As already written in the comments: The number is stored in binary.
cout.precision() (or the std::setprecision manipulator) does not affect the storage of the floating-point value; it affects only the output precision.
The default precision for std::cout is 6, and your number is 7 digits long, counting the digits before and after the decimal point. Therefore, when you set the precision to 7 there are enough digits to display your number, but when you leave the default in place the output is rounded.
Remember that this only affects how the numbers are displayed, not how they are stored. Investigate IEEE floating point if you are interested in learning how floating-point numbers are stored.
Try changing the digits before the decimal point to see how that affects the rounding, e.g. float f = 10.9998 and float f = 10.99998.

How to calculate number of digits before and after decimal point? [closed]

Today a question was asked in my c++ class test. "Write a program that inputs a floating point number and calculates the number of digits before and after decimal point."
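I counted the digits before the decimal point with this code:
float n;
cin >> n;
float temp = n;
int count = 0;
// The original condition used an undeclared variable (temp1); >= is used
// so that exact powers of ten such as 10 are counted correctly.
while (temp >= 1) {
    count++;
    temp = temp / 10;
}
cout << count;
but I'm stuck on the digits after the decimal point. Can anyone tell me how to do this, or provide the whole program?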
Thanks in advance,
> Write a program that inputs a floating point number and calculates the number of digits before and after decimal point.
Well, as stated, that task asks for something not really solvable using a float and standard C++, because the C++ standard does not define the binary representation of a float's exponent and mantissa.
Hence you can't know how many digits will be used to represent the fractional part of the number unless you know exactly how the C++ compiler implements float (or double).
Most probably the implementation is optimized for the target CPU and its floating-point capabilities.
So the only chance you have is to read the number as a std::string in the first place, count the digits that appear before and after the '.' character, and finally convert the std::string to a float value.
Here's a simple illustration of what I meant in the first part of my answer:
#include <cmath>
#include <iomanip>
#include <iostream>
#include <limits>
#include <sstream>
int main() {
    std::istringstream iss("3.1415"); // same as reading from cin
    std::cout << "Input: " << iss.str() << std::endl;
    float temp;
    iss >> temp;
    std::cout << "Internal representation: "
              << std::fixed << std::setprecision(22) << temp << std::endl;
    // std::trunc drops the fractional part explicitly; an unqualified abs()
    // only does so when it happens to resolve to the int overload.
    float fraction = temp - std::trunc(temp);
    int fractiondigits = 0;
    // epsilon is the gap between 1.0f and the next representable float
    while (fraction > std::numeric_limits<float>::epsilon()) {
        fraction *= 10.0f;
        fraction -= std::trunc(fraction);
        ++fractiondigits;
    }
    std::cout << "Number of fraction digits used in the representation: "
              << fractiondigits << std::endl;
}
The output is
Input: 3.1415
Internal representation: 3.1414999961853027343750
Number of fraction digits used in the representation: 21
So you see that this does not match the user's input.
I don't know whether your professor's intent was to make you aware of this mismatch between user input and the internal representation of a float.
But as mentioned, the actual count of digits depends on the compiler implementation and the platform, so there's no definite answer for the number of fraction digits.
The question is fundamentally irrelevant. Most real numbers have infinitely many digits, but computer-represented numbers must have a finite representation. For the common case of a binary representation, the represented number also has a finite decimal representation. However, rounding this decimal representation to fewer digits (as few as std::numeric_limits<float>::max_digits10 significant digits, to be precise) still identifies the same representable number. Thus the relevant number of digits for floating-point numbers is best taken from their binary rather than their decimal representation. This is given by std::numeric_limits<float>::digits (the total, before and after the point).

sum of double numbers in c++ [duplicate]

I want to calculate the sum of three double numbers and I expect to get 1.
double a=0.0132;
double b=0.9581;
double c=0.0287;
cout << "sum= "<< a+b+c <<endl;
if (a+b+c != 1)
cout << "error" << endl;
The sum is equal to 1 but I still get the error! I also tried:
cout<< a+b+c-1
and it gives me -1.11022e-16
I could fix the problem by changing the code to
if (a+b+c-1 > 0.00001)
cout << "error" << endl;
and it works (no error). How can a negative number be greater than a positive number, and why don't the numbers add up to 1?
Maybe it is something basic about summation and underflow/overflow, but I would really appreciate your help.
Thanks
Rational numbers are infinitely precise. Computers are finite.
Precision loss is a well known problem in computer programming.
The real question is, how can you remedy it?
Consider using an approximation function when comparing floats for equality.
#include <algorithm> // std::max
#include <cmath>
#include <iostream>
#include <limits>
using namespace std;
template <typename T>
bool ApproximatelyEqual(const T dX, const T dY)
{
return std::abs(dX - dY) <= std::max(std::abs(dX), std::abs(dY))
* std::numeric_limits<T>::epsilon();
}
int main() {
double a=0.0132;
double b=0.9581;
double c=0.0287;
//Evaluates to true and does not print error.
if (!ApproximatelyEqual(a+b+c,1.0)) cout << "error" << endl;
}
Floating-point numbers in C++ have a binary representation. This means that most numbers that can be exactly represented by a decimal fraction with only a few digits cannot be exactly represented by floating-point numbers. That's where your error comes from.
One example: 0.1 (decimal) is a periodic fraction in binary:
0.000110011001100110011001100...
Therefore it cannot be represented exactly with any finite number of bits in a binary encoding.
In order to avoid this type of error, you can use BCD (binary-coded decimal) numbers, which are supported by some special libraries. The drawbacks are slower calculation speed (not directly supported by the CPU) and slightly higher memory usage.
Another option is to represent the number as a general fraction and store the numerator and denominator as separate integers.

Higher precision when parsing string to float

This is my first post here so sorry if it drags a little.
I'm assisting in some research for my professor, and I'm having some trouble with precision when parsing numbers that need to be precise to the 12th decimal place. For example, here is a number that I'm parsing from a string into a float, before it's parsed:
-82.636097527336
Here is the code I'm using to parse it, which I also found on this site (thanks for that!):
std::basic_string<char> str = prelim[i];
std::stringstream s_str( str );
float val;
s_str >> val;
degrees.push_back(val);
Where 'prelim[i]' is just the current number I'm on, and 'degrees' is my new vector that holds all of the numbers after they've been parsed to a float. My issue is that, after a value is parsed and stored in 'degrees', I print both values side by side with 'std::cout', and they show up like this (old value (string) on the left, new value (float) on the right):
-82.636097527336 -82.6361
Does anyone have any insight into how I could alleviate this issue and make my numbers more precise? I suppose I could go character by character and use a switch case, but I think that there's an easier way to do it with just a few lines of code.
Again, thank you in advance and any pointers would be appreciated!
(Edited for clarity regarding how I was outputting the value)
Change to a double to represent the value more accurately, and use std::setprecision(30) or more to show as much of the internal representation as is available.
Note that the internal storage isn't exact; using an Intel Core i7, I got the following values:
string: -82.636097527336
float: -82.63610076904296875
double: -82.63609752733600544161163270473480224609
So, as you can see, double correctly represents all of the digits of your original input string, but even so it isn't exact, since the stored value has more digits than your string.
There are two problems:
1. A 32-bit float does not have enough precision for 14 decimal digits. From a 32-bit float you get about 7 decimal digits, because its significand has 24 bits (23 stored explicitly). A 64-bit float (double) has a 53-bit significand (52 stored), which gives you about 16 decimal digits, just enough.
2. Printing with cout prints six significant digits by default.
Here is a little program to illustrate the difference:
#include <iomanip>
#include <iostream>
#include <sstream>
int main()
{
float parsed_float;
double parsed_double;
std::stringstream input("-82.636097527336 -82.636097527336");
input >> parsed_float;
input >> parsed_double;
std::cout << "float printed with default precision: "
<< parsed_float << std::endl;
std::cout << "double printed with default precision: "
<< parsed_double << std::endl;
std::cout << "float printed with 14 digits precision: "
<< std::setprecision(14) << parsed_float << std::endl;
std::cout << "double printed with 14 digits precision: "
<< std::setprecision(14) << parsed_double << std::endl;
return 0;
}
Output:
float printed with default precision: -82.6361
double printed with default precision: -82.6361
float printed with 14 digits precision: -82.636100769043
double printed with 14 digits precision: -82.636097527336
So you need to use a 64-bit float to be able to represent the input, but also remember to print with the desired precision with std::setprecision.
You cannot get precision up to the 12th decimal place using a plain float. The intuitive course of action is to use double or long double, which carry enough digits for this input, but even they will not store the value exactly.
The reason is the binary representation of real numbers in memory.
For example, as a float, 0.02 is actually stored as approximately 0.0199999996.
If you need exact decimal values, use a dedicated arbitrary-precision library instead.
Hope this helps.

What is the meaning of numeric_limits<double>::digits10

What is the precise meaning of numeric_limits<double>::digits10?
Some other related questions on Stack Overflow made me think it is the maximum precision of a double, but:
The following prototype starts working (success is true) only when the precision is at least 17 (== 2 + numeric_limits<double>::digits10).
With lower precision, readDouble == infinity at the end with STLPort, and readDouble == 0.0 with Microsoft's STL.
Does this prototype have any kind of meaning? :)
Here is the prototype:
#include <float.h>
#include <limits>
#include <math.h>
#include <iostream>
#include <iomanip>
#include <sstream>
#include <string>
int main(int argc, const char* argv[]) {
std::ostringstream os;
//int digit10=std::numeric_limits<double>::digits10; // ==15
//int digit=std::numeric_limits<double>::digits; // ==53
os << std::setprecision(17);
os << DBL_MAX;
std::cout << os.str();
std::stringbuf sb(os.str());
std::istream is(&sb);
double readDouble=0.0;
is >> readDouble;
bool success = fabs(DBL_MAX-readDouble)<0.1;
}
numeric_limits::digits10 is the number of decimal digits that can be held without loss.
For example numeric_limits<unsigned char>::digits10 is 2. This means that an unsigned char can hold 0..99 without loss. If it were 3 it could hold 0..999, but as we all know it can only hold 0..255.
This manual page has an example for floating point numbers, which (when shortened) shows that
cout << numeric_limits<float>::digits10 <<endl;
float f = (float)99999999; // 8 digits
cout.precision ( 10 );
cout << "The float is; " << f << endl;
prints
6
The float is; 100000000
numeric_limits<T>::digits10 specifies the number of significant decimal digits you can represent without a loss of precision (for integer types, these are all digits to the left of the decimal point). Each type has a different value.
See this very readable paper:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2005.pdf
Although DBL_DIG (= std::numeric_limits<double>::digits10 = 15 digits) is the minimum guaranteed number of digits for a double, the DBL_MAXDIG10 value (= 17 digits) proposed in the paper, standardized in C++11 as std::numeric_limits<double>::max_digits10, has two useful properties:
- It is the minimum number of digits needed to survive a round trip to string form and back and get the same double in the end.
- It is the minimum number of digits needed to convert the double to string form and show different strings every time you get (A != B) in code.
With 16 or fewer digits, you can get doubles that are not equal in code but are identical when converted to string form. The values then differ when compared in code, yet a log file shows them as identical, which is very confusing and hard to debug.
When you compare values (e.g. by manually diffing two log files), remember that digits 1-15 are ALWAYS valid, but differences in the 16th and 17th digits MAY be junk.
The '53' is the bit width of the significand that your type (double) holds. The '15' is the number of decimal digits that can be represented safely with that kind of precision.
digits10 is for conversion: string → double → string
max_digits10 is for conversion: double → string → double
In your program, you are using the conversion (double → string → double). You should use max_digits10 instead of digits10.