Is there any elegant solution using the std C++ or Boost libraries to output a double to std::cout in a way that the following conditions are met:
scientific notation is disabled
the precision for the decimal part is 6
however, trailing 0's (for the decimal part) are not printed out
For example:
double d = 200000779998;
std::cout << [something] << d;
should print out exactly 200000779998. [something] should possibly be a noexcept combination of some existing manipulators.
This is not a solution to the problem:
std::cout << std::setprecision(6) << std::fixed << d;
because it prints out 200000779998.000000 with trailing 0's
Instead of using the fixed manipulator, you can try to use (abuse?) defaultfloat. As far as I understand, it chooses either fixed or scientific based on the ability to put the number within the specified precision. As a result you can set the precision to the number of digits of the integral part + the requested fractional precision (6 in your case).
double d = 200000779998;
std::cout << std::setprecision(integralDigits(d) + 6) << d << std::endl;
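integralDigits is not a standard function, and the snippet above leaves its definition open. A minimal sketch of one possible definition follows, together with a hypothetical formatTrimmed wrapper that applies the computed precision to a string stream. Both names are assumptions made for illustration. The default floating-point format still switches to scientific notation for magnitudes below 1e-4, so this sketch only covers values that are not very close to zero.

```cpp
#include <cmath>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: number of decimal digits in the integral part of d.
// Values with magnitude below 10 (including 0) count as one digit.
int integralDigits(double d) {
    int digits = 1;
    for (double a = std::floor(std::fabs(d)); a >= 10.0; a /= 10.0) {
        ++digits;
    }
    return digits;
}

// Hypothetical wrapper: default float format with just enough significant
// digits for the integral part plus `fractionDigits` decimals. The default
// format drops trailing zeros on its own.
std::string formatTrimmed(double d, int fractionDigits = 6) {
    std::ostringstream os;
    os << std::setprecision(integralDigits(d) + fractionDigits) << d;
    return os.str();
}
```

Calling formatTrimmed(200000779998.0) yields the string 200000779998, and formatTrimmed(3.25) yields 3.25.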
Hard to prove a negative, but I would assume no.
The requirements are inconsistent with any normal use. Space efficiency dictates a binary format. 6 digits (decimal) of precision suggests a format intended for human readers, who can't churn through lots of data. And humans have no issue dealing with a consistent 6 digit format.
So, you're basically targeting a format that has no obvious audience, and that is why I would be surprised if there is support for that.
I have the following piece of code
#include <iostream>
#include <iomanip>
int main()
{
double x = 7033753.49999141693115234375;
double y = 7033753.499991415999829769134521484375;
double z = (x+ y)/2.0;
std::cout << "y is " << std::setprecision(40) << y << "\n";
std::cout << "x is " << std::setprecision(40) << x << "\n";
std::cout << "z is " << std::setprecision(40) << z << "\n";
return 0;
}
When the above code is run I get,
y is 7033753.499991415999829769134521484375
x is 7033753.49999141693115234375
z is 7033753.49999141693115234375
When I do the same in Wolfram Alpha the value of z is completely different
z = 7033753.4999914164654910564422607421875 #Wolfram answer
I am familiar with floating point precision and that large numbers away from zero cannot be exactly represented. Is that what is happening here? Is there any way in C++ to get the same answer as Wolfram without any performance penalty?
large numbers away from zero can not be exactly represented. Is that what is happening here?
Yes.
Note that there are infinitely many rational numbers near zero that cannot be represented either. But the distance between representable values does grow exponentially with magnitude.
Is there anyway in c++ where I can get the same answer as Wolfram ...
You can potentially get the same answer by using long double. My system produces exactly the same result as Wolfram. Note that precision of long double varies between systems even among systems that conform to IEEE 754 standard.
More generally though, if you need results that are accurate to many significant digits, then don't use finite precision math.
... without any performance penalty?
No. Precision comes with a cost.
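To make the long double suggestion concrete, here is a minimal sketch. The function names are hypothetical. It assumes nothing about the width of long double: on platforms where long double is the same format as double (MSVC, for example), both midpoint functions give the same result. The x and y from the question are adjacent doubles, so their exact midpoint needs one more mantissa bit than double has.

```cpp
#include <cfloat>
#include <cmath>

// True when no double lies strictly between a and b.
bool areAdjacent(double a, double b) {
    return std::nextafter(a, b) == b;
}

// True when long double carries more mantissa bits than double.
constexpr bool longDoubleIsWider = LDBL_MANT_DIG > DBL_MANT_DIG;

// Midpoint computed in double: for adjacent inputs the exact midpoint is not
// representable, so the result rounds to one of the two inputs.
double midpointDouble(double x, double y) {
    return (x + y) / 2.0;
}

// Midpoint computed in long double: exact for adjacent doubles whenever
// long double has at least one extra mantissa bit.
long double midpointLongDouble(double x, double y) {
    return (static_cast<long double>(x) + static_cast<long double>(y)) / 2.0L;
}
```

Printing midpointLongDouble(x, y) with std::setprecision(40) is what can reproduce the Wolfram value on systems with a wider long double.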
Just telling IOStreams to print to 40 significant decimal figures of precision, doesn't mean that the value you're outputting actually has that much precision.
A typical double takes you up to 17 significant decimal figures (ish); beyond that, the digits you see just spell out the binary approximation that is stored, not extra precision in your data.
Per eerorika's answer, it looks like the Wolfram Alpha answer is also falling foul of this, albeit possibly with some different precision limit than yours.
You can try a different approach like a "bignum" library, or limit yourself to the precision afforded by the types that you've chosen.
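The "17 significant figures" limit does not have to be hard-coded: std::numeric_limits<double>::max_digits10 reports how many decimal digits are needed to round-trip a double. A minimal sketch, with formatRoundTrip as a hypothetical helper name:

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: prints a double with exactly as many significant
// digits as are needed to read the same double back from the text.
std::string formatRoundTrip(double value) {
    std::ostringstream os;
    os << std::setprecision(std::numeric_limits<double>::max_digits10) << value;
    return os.str();
}
```

Parsing the returned text with std::stod gives back the original double; asking the stream for more digits than this does not add information about the value.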
I'm trying to write some floating point data to a scientific file format that specifies ASCII fields of a fixed width (in this case, 16 characters). I'd like to maximize the precision written to the file, but every number written must fit in the fixed field limit.
Simply calling std::setprecision(12), for example, results in a loss of precision for M_PI and field overflow for M_PI / 10000, even when combined with std::setw(16). Similar arguments can be made against std::fixed.
The best I've been able to come up with is
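#include <iomanip>
#include <sstream>
#include <string>
constexpr int fieldWidth = 16;
std::string formatField( double value ) {
std::stringstream ss;
// Budget: optional sign, one digit, '.', (fieldWidth-7) digits, then "e+XX"
ss << std::setprecision(fieldWidth-7) << std::setw(fieldWidth) << std::scientific << value;
return ss.str();
}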
The assumption here is that no number is greater than 1e100 or too close to zero (e.g. 1e-100). It is also less than optimal for positive numbers less than 100, since it cedes a character to the sign and four more to the exponent part (e+XX).
Can I improve on my current solution? Improvements would include 1) Better precision across a range of numbers and/or 2) A stronger guarantee that the field width won't be violated. Solutions using boost are welcome as well.
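One possible direction, offered as a sketch rather than a definitive answer: try the default floating-point format at decreasing precision and keep the first result that fits the field. The helper name formatFieldAdaptive is hypothetical. The default format drops the exponent when it is not needed and strips trailing zeros, so small and moderate values keep more digits than a fixed scientific layout allows.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: right-aligned text of at most `width` characters,
// using the highest precision (17 down to 1) whose output still fits.
std::string formatFieldAdaptive(double value, std::size_t width = 16) {
    for (int precision = 17; precision >= 1; --precision) {
        std::ostringstream os;
        os << std::setprecision(precision) << value;  // default float format
        const std::string text = os.str();
        if (text.size() <= width) {
            return std::string(width - text.size(), ' ') + text;
        }
    }
    // Only reachable when the field is too narrow even for one digit
    // plus sign and exponent.
    return std::string(width, '*');
}
```

For a 16-character field, pi comes out as 3.14159265358979 (15 significant digits), and values such as 1e-100 or -1e300 still fit because the loop falls back to fewer digits.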
I have some old C code I'm trying to replicate the behavior of in C++. It uses the printf modifiers: "%06.02f".
I naively thought that iomanip was just as capable, and did:
cout << setfill('0') << setw(6) << setprecision(2)
When I try to output the test number 123.456, printf yields:
123.46
But cout yields:
1.2e+02
Is there anything I can do in iomanip to replicate this, or must I go back to using printf?
Try std::fixed:
std::cout << std::fixed;
Sets the floatfield format flag for the str stream to fixed.
When floatfield is set to fixed, floating-point values are written using fixed-point notation: the value is represented with exactly as many digits in the decimal part as specified by the precision field (precision) and with no exponent part.
The three C format specifiers map to corresponding format settings in C++ IOStreams:
%f -> std::ios_base::fixed (fixed point notation) typically set using out << std::fixed.
%e -> std::ios_base::scientific (scientific notation) typically set using out << std::scientific.
%g -> the default setting, typically set using out.setf(std::ios_base::fmtflags(), std::ios_base::floatfield) or with C++11 and later out << std::defaultfloat. The default formatting tries to yield the "best" of the other formats assuming a fixed number of digits to be used.
The precision, the width, and the fill character match the way you already stated.
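Putting the mapping together, a minimal sketch of an IOStreams counterpart to printf("%06.2f", value) could look like this. The helper name formatLikePrintf is hypothetical, and std::internal is added here (it is not in the question's code) so that the zero padding goes after a minus sign, as printf does.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper mirroring printf("%06.2f", value).
std::string formatLikePrintf(double value) {
    std::ostringstream os;
    os << std::setfill('0') << std::internal << std::setw(6)  // "%06": width 6, zero padded
       << std::fixed << std::setprecision(2)                  // "%.2f": fixed, 2 decimals
       << value;
    return os.str();
}
```

With this, 123.456 is rendered as 123.46 and 1.5 as 001.50. Note that std::setw applies only to the next output, while the other manipulators stay in effect on the stream.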
The following code will print value of a and b:
double a = 3.0, b=1231231231233.0123456;
cout.setf(std::ios::fixed);
cout.unsetf(std::ios::scientific);
cout << a << endl << b << endl;
The output is:
3.000000
1231231231233.012451
You can see that a is output with a fixed count of 6 decimals.
But I want the output like this:
3
1231231231233.012451
How can I set flags only once and get the above output?
The stream inserts 0s following the double because std::fixed always prints precision() digits after the decimal point, and the stream's default precision is 6. Unfortunately there is no stream flag that prints only the integral part when the double happens to hold a whole number. What you could do however, when you know the value is whole and fits in the integer type, is cast the value to an integer.
std::cout << static_cast<int>(a);
The default formatting for floating point numbers won't support the formats as requested. There are basically three settings you could use:
std::fixed which will use precision() digits after the decimal point.
std::scientific which will use scientific notation with precision() digits.
std::defaultfloat which picks fixed or scientific notation depending on the value's exponent and precision(), and drops trailing zeros.
(There is also std::hexfloat, but that just formats the number in a form which is conveniently machine readable.)
What you could do is create your own std::num_put<char> facet which formats the value into a local buffer using std::fixed formatting and strips off trailing zero digits before sending the value on.
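A minimal sketch of such a facet follows. The class name TrimZerosNumPut and the helper printTrimmed are hypothetical, and the sketch ignores width and fill for brevity. It assumes the facet is imbued only into the target stream and not installed in the global locale, because the internal buffer formats with the global locale.

```cpp
#include <algorithm>
#include <iomanip>
#include <ios>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: formats doubles in fixed notation, then drops
// trailing zeros (and a dangling decimal point).
class TrimZerosNumPut : public std::num_put<char> {
protected:
    using std::num_put<char>::do_put;  // keep the other overloads visible

    iter_type do_put(iter_type out, std::ios_base& str, char_type /*fill*/,
                     double value) const override {
        std::ostringstream buffer;  // formats with the global locale
        buffer << std::fixed
               << std::setprecision(static_cast<int>(str.precision())) << value;
        std::string text = buffer.str();
        if (text.find('.') != std::string::npos) {
            text.erase(text.find_last_not_of('0') + 1);  // strip trailing zeros
            if (text.back() == '.') text.pop_back();     // "3." becomes "3"
        }
        return std::copy(text.begin(), text.end(), out);
    }
};

// Hypothetical usage helper: the facet is installed once via imbue, after
// which plain `os << value` produces the trimmed form.
std::string printTrimmed(double value) {
    std::ostringstream os;
    os.imbue(std::locale(os.getloc(), new TrimZerosNumPut));  // locale owns the facet
    os << value;
    return os.str();
}
```

The same imbue call works on std::cout, which matches the request to configure the stream only once.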
I am unable to understand why C++ division behaves the way it does. I have a simple program which divides 1 by 10 (using VS 2003)
double dResult = 0.0;
dResult = 1.0/10.0;
I expect dResult to be 0.1. However, I get 0.10000000000000001.
Why do I get this value? What is the problem with the internal representation of double/float?
How can I get the correct value?
Thanks.
Because almost all modern processors use binary floating-point, which cannot exactly represent 0.1 (there is no way to represent 0.1 as m * 2^e with integer m and e).
If you want to see the "correct value", you can print it out with e.g.:
printf("%.1f\n", dResult);
Double and float are not identical to real numbers: there are infinitely many real numbers, but only a finite number of bits to represent them in a double/float.
You can further read: What Every Computer Scientist Should Know About Floating-Point Arithmetic
The ubiquitous IEEE754 floating point format expresses floating point numbers in scientific notation base 2, with a finite mantissa. Since a fraction like 1/5 (and hence 1/10) does not have a representation with finitely many digits in binary scientific notation, you cannot represent the value 0.1 exactly. More generally, the only values that can be represented exactly are those that fit precisely into binary scientific notation with a mantissa of a few (e.g. 24 or 53 or 64) binary digits, and a suitably small exponent.
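To see which m * 2^e a double actually stores, the value can be split with std::frexp. This is a minimal sketch assuming an IEEE 754 double with a 53-bit mantissa and a finite, normal input; the BinaryParts struct and the decompose function are hypothetical names.

```cpp
#include <cmath>
#include <cstdint>

struct BinaryParts {
    std::int64_t mantissa;  // integer m
    int exponent;           // power of two e, so that value == m * 2^e
};

// Hypothetical helper: splits a finite, normal double into m * 2^e
// with a 53-bit integer m.
BinaryParts decompose(double value) {
    int exp = 0;
    // value == fraction * 2^exp with 0.5 <= |fraction| < 1
    const double fraction = std::frexp(value, &exp);
    // Scaling by 2^53 turns the 53-bit fraction into an exact integer.
    const auto m = static_cast<std::int64_t>(std::ldexp(fraction, 53));
    return {m, exp - 53};
}
```

For 0.1 this gives m = 7205759403792794 and e = -56, a value slightly above one tenth, which is why printing 17 significant digits shows 0.10000000000000001.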
Working with integers, floats, and doubles can be tricky; it depends on your purpose. If you only want to display values in a nice format, then you can play with the C++ I/O manipulators: setprecision, showpoint, noshowpoint. If you are trying to do precise computing with numeric methods, you may have to use a library for accurate representation. If you are multiplying lots of small and large numbers, you may have to resort to log transformations. Here is a small test:
float x=1.0000001;
cout << x << endl;
float y=9.9999999999999;
cout << "using default io format " << y/x << endl;
cout << showpoint << "using showpoint " << y/x << endl;
y=9.9999;
cout << "fewer 9 default C++ " << y/x << endl; // showpoint is still in effect here: the flag is sticky
cout << showpoint << "fewer 9 showpoint" << y/x << endl;
1
using default io format 10
using showpoint 10.0000
fewer 9 default C++ 9.99990
fewer 9 showpoint9.99990
In special cases where you want to use a double (which may be the result of some complicated algorithm) to represent integer numbers, you have to figure out the proper conversion method. Once I had a situation where I wanted to use a single double value to store two types of values: -1, +1, or a value between 0 and 1, to make my code more memory efficient (and faster, since large memory use tends to reduce performance). It is a little tricky to distinguish between +1 and val < 1. In this case I knew that the values < 1 had a resolution of, say, only 1/500, so I could safely use floor(val+0.000001) to get back the 1 value that I initially stored.
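The floor trick from the last paragraph can be sketched as follows. The function name isPlusOne is hypothetical, and the sketch relies on the stated assumption that fractional values sit on a grid no finer than 1/500, so the largest of them stays well below 1 - 0.000001.

```cpp
#include <cmath>

// Hypothetical helper: true when the stored value stands for the sentinel +1,
// tolerating a small downward rounding error in the stored double.
bool isPlusOne(double value) {
    return std::floor(value + 0.000001) == 1.0;
}
```

A value such as 499.0/500.0 is classified as a fraction, while 1.0 or a slightly perturbed 0.99999999 is classified as +1.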